From bix at sendu.me.uk  Fri Jun  1 04:06:04 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 01 Jun 2007 09:06:04 +0100
Subject: [Bioperl-l] ClustalW Score?
In-Reply-To: <1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu>
References: <00e201c7a2de$91f60f50$2d01a8c0@PICO><DFEEDFC9-68C4-4821-846F-69AC9559C70B@bioperl.org><465E9B58.1020403@sendu.me.uk>	<49B6333A-18B9-4B63-80EF-81C57A295494@bioperl.org>
	<1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu>
Message-ID: <465FD36C.5060603@sendu.me.uk>

Kevin Brown wrote:
>> you're right --- it is not really my code, I was just 
>> elaborating Kevin's example --- it would probably need to be 
>> more specific or perhaps the last Score seen is sufficient 
>> for what one is trying to capture?
> 
> I took that code from a pairwise clustal alignment script that I wrote
> to deal with aligning a bunch of short sequences against a long one to
> see where they line up at.  When all of them were fed to Clustal the
> short sequences all ended up aligned to each other and not well aligned
> to the longer sequence.  I only saw one score in the output from the
> pairwise, so that is what I used to find a reasonable value.

Ok, well I've hedged my bets and used both. Now committed to CVS.

From jy at genseq.co.uk  Fri Jun  1 22:39:48 2007
From: jy at genseq.co.uk (Jean-Yves Sireau)
Date: Sat, 2 Jun 2007 10:39:48 +0800
Subject: [Bioperl-l] Genseq
Message-ID: <20070602103948.093d713c@jys.my.regentmarkets.com>

Dear List members,

I would like to let you know of the formation of Genseq Ltd., a
bioinformatics company that will (in time!) offer genome sequencing to
high net worth individuals and bioinformatic analysis of the sequence
data to detect predisposition to illness.  The company's website is
www.genseq.co.uk

Genseq would be willing to sponsor bioperl, whether financially or by
providing resources, notably for any bioperl-related activities in the
Asia Pacific region.  Genseq's bioinformatics team will be based in
Cyberjaya (Malaysia), and we are particularly interested in promoting
bioperl in Malaysia.  We are also actively recruiting at the moment
in Malaysia and India.

If there were sufficient demand, we would be willing to organise a
bioperl conference in Cyberjaya at the Cyberview Lodge
(www.cyberview-lodge.com), which would be the ideal place for such a
conference in Malaysia.

Looking forward to your comments, suggestions and proposals.

Best regards
Jean-Yves Sireau

-- 

Jean-Yves Sireau
CEO, Genseq Ltd.
www.genseq.co.uk

From cjfields at uiuc.edu  Sat Jun  2 01:16:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 00:16:05 -0500
Subject: [Bioperl-l] EUtilities overhaul started
Message-ID: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>

To anyone using Bio::DB::EUtilities,

I am in the midst of a major overhaul to the various EUtilities tools
and to Bio::DB::GenericWebDBI (the latter of which I am forming into
more or less a test bed for other database interfaces).  I'm about
80% done at this point, and will likely start committing changes this  
coming week.

The overall interface will change (something I had warned about in  
the Bio::DB::EUtilities POD) but I am hoping it will be more  
intuitive and easier to use in the long run.  I'll describe the  
overall redesign and use in an upcoming HOWTO (as recommended by  
Brian a while back).

If anyone has any suggestions/ideas/flames, please let me know!

Cheers!

chris

From cjfields at uiuc.edu  Sat Jun  2 10:39:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 09:39:25 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <e572b3c70706020628v71b10e7bm34cebfab4954890c@mail.gmail.com>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
	<e572b3c70706020628v71b10e7bm34cebfab4954890c@mail.gmail.com>
Message-ID: <AF243C87-B82E-4C33-939D-2B84B9E41537@uiuc.edu>

Yes, there are a few odd issues, though that's one I've not heard of  
yet.  You might try one of the sub-nucleotide databases (nuccore,  
nucest, nucgss).

I'll try looking into it and (if necessary) pester NCBI about it.   
I'll pass this on to the mail list to see if anyone else knows about  
the problem.

chris

On Jun 2, 2007, at 8:28 AM, Bernd Brandt wrote:

> Hi Chris,
>
> Thanks for your work on EUtilities.
> For a production task, I used EUtilities directly (given your
> announced overhaul). I noticed a recent problem at NCBI (reported two
> weeks ago to NCBI, no reply yet). Possibly you may run into this with
> testing: if you ePOST gi ids to the EU server and then use this set in
> Esearch (using the query key) no results are returned for the
> nucleotide database.
> ESearches like "db=$db%23$QueryKey" typically fail if the $db is
> nucleotide (but work if $db='protein'). The XML output has Count 0 and
> an empty QueryTranslationSet for db=nucleotide only.
> For completeness, I attach a simple test script I used.
>
>
> Best regards,
> Bernd
>
>
>> <EUsearch.pl>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Jun  3 00:51:57 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 23:51:57 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <e572b3c70706020948l708f14c8q706b65c73617c86d@mail.gmail.com>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
	<e572b3c70706020628v71b10e7bm34cebfab4954890c@mail.gmail.com>
	<AF243C87-B82E-4C33-939D-2B84B9E41537@uiuc.edu>
	<e572b3c70706020948l708f14c8q706b65c73617c86d@mail.gmail.com>
Message-ID: <1A2AF5C4-6A58-4FDD-A4CA-6ABCE30F0D1B@uiuc.edu>

I can confirm this; however it only relates to the use of history  
with esearch and nucleotide (use of the history with other eutils  
seems to work fine); retrieving sequences via efetch is not  
affected.  If I find out anything more I'll post something on the  
mail list.
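
For reference, the request pattern under discussion can be sketched
roughly as follows. This is a minimal Python illustration, not Bernd's
attached script: the helper name esearch_history_url, the placeholder
WebEnv value and the use of the current https base URL are assumptions,
and nothing is actually sent to NCBI.

```python
from urllib.parse import urlencode

# Assumed base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_history_url(db, query_key, webenv):
    """Build an ESearch URL that refers to a set posted earlier via EPost.

    The posted set is referenced through the history server: the term
    "#<query_key>" (URL-encoded as %23<query_key>) plus the WebEnv string.
    """
    params = {
        "db": db,
        "term": f"#{query_key}",
        "WebEnv": webenv,
        "usehistory": "y",
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Placeholder only; a real WebEnv string is returned by EPost.
webenv = "WEBENV_FROM_EPOST"

# The combination reported in this thread to come back with Count 0:
print(esearch_history_url("nucleotide", 1, webenv))
# The reported workaround: name the sub-database explicitly.
print(esearch_history_url("nuccore", 1, webenv))
```

Only the db value differs between the two URLs, which matches the
observation that the problem is specific to esearch with db=nucleotide.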

chris

On Jun 2, 2007, at 11:48 AM, Bernd Brandt wrote:

> I can confirm that using the correct sub-nucleotide database works
> (nuccore in my case).
> This seems to be a quite recent change/bug at NCBI. Until recently,
> db=nucleotide worked. Moreover, EInfo still lists nucleotide as a valid
> db.
> It is not optimal to have to choose the sub-database and the searches
> work via the Entrez web-interface. Note that this problem is related
> to the ESearch and db=nucleotide.
>
> bernd
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From basu at pharm.stonybrook.edu  Sun Jun  3 10:44:18 2007
From: basu at pharm.stonybrook.edu (Siddhartha Basu)
Date: Sun, 03 Jun 2007 10:44:18 -0400
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
Message-ID: <web-5961520@pharm.stonybrook.edu>

On Sat, 2 Jun 2007 00:16:05 -0500
  Chris Fields <cjfields at uiuc.edu> wrote:
> To anyone using Bio::DB::EUtilities,
>
> I am in the midst of a major overhaul to the various EUtilities tools
> and to Bio::DB::GenericWebDBI (the latter of which I am forming into
> more or less a test bed for other database interfaces).  I'm about
> 80% done at this point, and will likely start committing changes this
> coming week.
>
> The overall interface will change (something I had warned about in
> the Bio::DB::EUtilities POD) but I am hoping it will be more
> intuitive and easier to use in the long run.  I'll describe the
> overall redesign and use in an upcoming HOWTO (as recommended by
> Brian a while back).

Hi Chris,
As a frequent user of EUtilities, I hope this API facelift and the
upcoming HOWTO will be helpful.
One thing I noticed is that for each eutil call (efetch, epost,
esearch, esummary) a new Bio::DB::EUtilities object has to be
instantiated, and its parameters cannot be changed afterwards at
runtime, e.g. with $eutils->id('ids'). For example:

my $eutils = Bio::DB::EUtilities->new( -id    => $id,
                                       -eutil => 'esummary',
                                       -db    => 'protein',
                                     );
my $ct = $eutils->get_response->content();

## -- now I cannot do this...
$eutils->id($newid);
$ct = $eutils->get_response->content();

Is the new API going to address something along these lines, or is
there currently any way to reuse the object?
Thanks again for this nice toolkit.

-siddhartha




From cjfields at uiuc.edu  Sun Jun  3 19:52:39 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 3 Jun 2007 18:52:39 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <web-5961520@pharm.stonybrook.edu>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
	<web-5961520@pharm.stonybrook.edu>
Message-ID: <5120BD7B-CA89-46E4-8D6B-6B24C1F93A5E@uiuc.edu>

On Jun 3, 2007, at 9:44 AM, Siddhartha Basu wrote:

> ...
> Hi Chris,
> As a frequent user of EUtilities, I hope this API facelift and the
> upcoming HOWTO will be helpful.
> One thing I noticed is that for each eutil call (efetch, epost,
> esearch, esummary) a new Bio::DB::EUtilities object has to be
> instantiated, and its parameters cannot be changed afterwards at
> runtime, e.g. with $eutils->id('ids'). For example:
>
> my $eutils = Bio::DB::EUtilities->new( -id    => $id,
>                                        -eutil => 'esummary',
>                                        -db    => 'protein',
>                                      );
> my $ct = $eutils->get_response->content();
>
> ## -- now I cannot do this...
> $eutils->id($newid);
> $ct = $eutils->get_response->content();

I'll have to check up on that, though changing id() should work with  
the old API.  It won't matter with the new API (it works fine), but  
it is still troubling...

> Is the new API going to address something along these lines, or is
> there currently any way to reuse the object?
> Thanks again for this nice toolkit.
>
> -siddhartha

The old API was based upon the idea of creating discrete user agents  
for each eutil to retrieve data.  The problem with the old interface  
is it attempts to do too much (take care of parameters, set up  
requests, retrieve responses, parse data, etc), and many tasks  
required instantiating a new EUtilities object.  I was never really  
satisfied with it.

The new interface is a composition of three classes: the web user
agent (LWP::UserAgent), a class encapsulating parameter handling, and
a parser class (all of which can be used independently if needed).  When
parameters change a new request is made 'lazily' (i.e. only when  
needed).  Similarly, when data is requested after any parameter  
change a new parser instance is created and the new response is parsed.

With that in mind you can now do the following:
----------------------------------------
use Bio::DB::EUtilities;

my @params = (-eutil  => 'esearch',
              -db     => 'protein',
              -term   => 'BRCA1',
              -retmax => 100);

my $eutil = Bio::DB::EUtilities->new(@params);

# no need to get response first; get_ids() calls that if needed
my @ids = $eutil->get_ids;

# below changes only those parameters, leaves all others set as before
$eutil->set_parameters(-eutil   => 'efetch',
                       -id      => \@ids,
                       -retmode => 'text',
                       -rettype => 'fasta');

# sends streamed content directly to a file
$eutil->get_response(-content_file => 'seqs.fas');

# or to a LWP::UserAgent-supported request callback
$eutil->get_response(-content_cb => \&my_cb);

my @newparams = (-eutil  => 'esearch',
                 -db     => 'protein',
                 -term   => 'BRCA2',
                 -retmax => 100);

# Resets eutility to passed parameters (or undef)
$eutil->reset_parameters(@newparams);

# retrieve new IDs
my @new_ids = $eutil->get_ids;
----------------------------------------

Note the same eutil object is used for all of the above, so to answer  
your last question, yes, you should be able to create data pipelines  
using the same object if necessary.
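
The 'lazy' request behaviour described above can be pictured with a
small toy. The sketch below is in Python rather than Perl and is not
the Bio::DB::EUtilities implementation; LazyClient, fake_fetch and
requests_made are hypothetical names, and the fetch step is a stand-in
that never touches the network. It only illustrates the idea that a
parameter change invalidates the cached response, and that a new
request is made the next time data is asked for.

```python
class LazyClient:
    """Toy parameter-driven client that re-requests only when needed."""

    def __init__(self, fetch, **params):
        self._fetch = fetch        # callable that performs the request
        self._params = dict(params)
        self._response = None      # None means "no valid cached response"
        self.requests_made = 0

    def set_parameters(self, **params):
        """Change the given parameters, keep the others, drop the cache."""
        self._params.update(params)
        self._response = None

    def reset_parameters(self, **params):
        """Replace all parameters with the given ones and drop the cache."""
        self._params = dict(params)
        self._response = None

    def get_response(self):
        """Return the cached response, fetching only if it is stale."""
        if self._response is None:
            self.requests_made += 1
            self._response = self._fetch(dict(self._params))
        return self._response


def fake_fetch(params):
    # Stand-in for a web request: just echoes the parameters it was given.
    return f"response for {sorted(params.items())}"


client = LazyClient(fake_fetch, eutil="esearch", db="protein", term="BRCA1")
client.get_response()
client.get_response()    # served from the cache, no second request
client.set_parameters(eutil="efetch", rettype="fasta")
client.get_response()    # parameters changed, so exactly one new request
print(client.requests_made)
```

The point of the design is that callers never have to decide when to
re-request; changing a parameter is enough.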

chris


From sac at bioperl.org  Mon Jun  4 13:56:57 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Mon, 4 Jun 2007 10:56:57 -0700
Subject: [Bioperl-l] question about Bio::Restriction::Analysis
In-Reply-To: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu>
References: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu>
Message-ID: <8f200b4c0706041056o4dbaadfexddf9f82fc33c6da@mail.gmail.com>

Hi Apurva,

I'm cc:ing the list to let others know you have found performance
issues with Bio::Restriction::Analysis. Ideally, we should focus on
addressing those issues rather than fixing a module that is now
deprecated.

But taking a quick look at my Bio::Tools::RestrictionEnzyme module,
I'm not sure why HpaII would give slower performance relative to other
non-ambiguous cutters. This enzyme has a 4-base recognition sequence
CCGG, and if you're feeding it a large CG-rich input sequence, that
could be a factor. To test, you might try using some other 4-base
cutters that aren't CG-rich (TaqI, TasI) or try some other input
sequences. There is no special flag to indicate that the enzyme is
non-ambiguous. The module handles that automatically.
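
To see how much of the difference comes simply from the number of
sites in the input, the cuts can be counted independently of either
BioPerl module. The following is a minimal Python sketch under stated
assumptions: a linear sequence, unambiguous recognition sites only, and
a hypothetical helper named digest. The cut positions used are C^CGG
for HpaII and T^CGA for TaqI.

```python
def digest(seq, site, cut_offset):
    """Cut a linear sequence at every occurrence of an unambiguous site.

    cut_offset is the number of bases of the site that stay on the
    left-hand fragment (1 for HpaII, C^CGG, and 1 for TaqI, T^CGA).
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


# Made-up, CG-rich test sequence for illustration only.
cg_rich = "CCGGCCGGAACCGGTT"

hpaii = digest(cg_rich, "CCGG", 1)
taqi = digest(cg_rich, "TCGA", 1)

print(len(hpaii), [len(f) for f in hpaii])
print(len(taqi), [len(f) for f in taqi])
```

Comparing the fragment lists (including the very short fragments) with
the output of both modules on the same input should show whether the
slowdown tracks the number of sites or something else.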

Good luck,
Steve

On 6/4/07, Apurva Narechania <apurva at cshl.edu> wrote:
> Hi Rob and Steve,
>
> I was hoping you could answer a quick performance question regarding
> the Bio::Restriction::Analysis module. I have found that though this
> module works well, it is considerably slower than the deprecated
> Bio::Tools::RestrictionEnzyme. I see that there are two algorithms
> available to your module, and since I am using HpaII, a non-ambiguous
> enzyme, I thought I might find similar performance to the older,
> deprecated module, but I do not. Is it possible that I am not setting
> the non-ambiguous flag correctly? Does it need to be set in the first
> place?
>
> As for Bio::Tools::RestrictionEnzyme, though it is faster, I have
> found instances where it is inaccurate, especially in calculating
> fragments of extremely small size (1-5 base pairs), so I would like to
> use your module if possible. It just seems slow to me.
>
> Can you clarify?
>
> I have copied my code below since it is a short, simple script.
>
> Thanks!
> Apurva Narechania
> Ware Lab
> Cold Spring Harbor Labs
>
> ----------
>
> #!/usr/bin/perl
>
> # This program generates a fasta of restriction frags given an
> # input fasta and a restriction cut site
>
> use strict;
> use warnings;
>
> use Getopt::Std;
> use Bio::Seq;
> use Bio::SeqIO;
> use Bio::Tools::RestrictionEnzyme;
>
> my %opts = ();
> getopts('f:', \%opts);
> my $fasta = $opts{'f'};
>
> # read fasta file
> my $seqin = Bio::SeqIO->new(-format => 'Fasta', -file => $fasta);
>
> my $x = 0;
> while (my $sequence_obj = $seqin->next_seq()) {
>      $x++;
>      my $id = $sequence_obj->id();
>
>      print STDERR "$x Working on $id\n";
>
>      # generate the rx object
>      my $ra = Bio::Tools::RestrictionEnzyme->new(-NAME => 'HpaII');
>
>      # the fragments are used as plain strings below
>      my @frags = $ra->cut_seq($sequence_obj);
>
>      my $counter = 0;
>      foreach my $frag (@frags) {
>          $counter++;
>          my $length = length($frag);
>          print ">$id.$counter length=$length\n$frag\n";
>      }
>
> }
>
>

From anhthu.tieu at gsf.de  Tue Jun  5 04:14:09 2007
From: anhthu.tieu at gsf.de (Tieu, Anh-Thu)
Date: Tue, 5 Jun 2007 10:14:09 +0200
Subject: [Bioperl-l] problems with image maps and IE 6 or higher
Message-ID: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de>

Hi, 

 I have a problem using the bioperl image map function with IE 6 or
 higher. It might be a more general problem with IE 6 rather than with
 bioperl, but as I used bioperl to create my image maps, I thought I
 could still post the problem here and ask for people's opinions. I
 wonder whether anyone else has faced the same problem and, if so,
 whether they could share their experiences and solutions.
 
  
<div>
<p><img src="/ggtc/tmp_bilder/19727dab708e1cbf567dd48480febb96.png" usemap="mapnameD064C01" style="border:2px solid #CCCCCC;"/></p>
<map name="mapnameD064C01" id="mapnameD064C01">
<area shape="rect" coords="108,0,608,20" href="javascript:void(0)" onclick="javascript:void(zmenu( 'scale' ));;return false;" title="scale " alt="scale " target="_blank"/>
<area shape="rect" coords="234,44,244,55" href="javascript:void(0)" onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', 'start: ', '', 'stop: 0', '', 'length:  bp', '', 'identity: ', '', 'e-value: ' ));;return false;" title="alignment5 " alt="alignment5 " target="_blank"/>
<area shape="rect" coords="241,57,247,68" href="javascript:void(0)" onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', 'start: ', '', 'stop: 0', '', 'length:  bp', '', 'identity: ', '', 'e-value: ' ));;return false;" title="integration_pt " alt="integration_pt " target="_blank"/>
<area shape="rect" coords="108,70,608,81" href="javascript:void(0)" onclick="javascript:void(zmenu( 'Nphs1                                   ', '', 'ensembl_id: ENSMUSG00000006649', '', 'start: 30168485', '', 'stop: 30195968', '', 'length: 27483 bp' ));;return false;" title="gene " alt="gene " target="_blank"/>
<area shape="rect" coords="108,83,117,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'exon1', '', 'start: 30168485', '', 'stop: 30169003', '', 'length: 518 bp' ));;return false;" title="exon1 " alt="exon1 " target="_blank"/>
<area shape="rect" coords="117,83,119,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'intron1', '', 'start: 30169004', '', 'stop: 30169083', '', 'length: 79 bp ' ));;return false;" title="intron1 " alt="intron1 " target="_blank"/>
<area shape="rect" coords="119,83,123,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'exon2', '', 'start: 30169084', '', 'stop: 30169299', '', 'length: 215 bp' ));;return false;" title="exon2 " alt="exon2 " target="_blank"/>
<area shape="rect" coords="123,83,124,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'intron2', '', 'start: 30169300', '', 'stop: 30169373', '', 'length: 73 bp ' ));;return false;" title="intron2
...
</div>


 This is part of the code I used in my HTML file to display the image map, and it runs beautifully
 with Mozilla 1.7 or the latest Firefox version. However, in IE6 the clickable pop-ups do not appear or work.
 
 I appreciate any help and would like to thank everyone for their help. 
 
 Best regards, 
 
 
 Anh-Thu
________________________________________________________________________
GSF-Forschungszentrum

Ingolstädter Landstr. 1

85764 München-Neuherberg, Germany

Chairman of Supervisory Board: MinDir Dr. Peter Lange

Board of Directors: Prof. Dr. Günther Wess and Dr. Nikolaus Blum

Register of Societies: Amtsgericht München HRB 6466


From lstein at cshl.edu  Tue Jun  5 09:56:57 2007
From: lstein at cshl.edu (Lincoln Stein)
Date: Tue, 5 Jun 2007 09:55:57 -0401
Subject: [Bioperl-l] problems with image maps and IE 6 or higher
In-Reply-To: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de>
References: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de>
Message-ID: <6dce9a0b0706050656n783d27b3u9229f948b2710d90@mail.gmail.com>

Hi Anh-Thu,

Could you send me a snippet of the code that is generating this imagemap? It
looks like you are relying on a javascript library for the zmenu() call, and
it may be that this library is in need of updating.

You might also consider replacing the library with Sheldon McKay's popup
balloon library, located at
http://www.wormbase.org/wiki/index.php/Balloon_Tooltips

Lincoln



-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From cjfields at uiuc.edu  Tue Jun  5 11:28:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 5 Jun 2007 10:28:24 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <46656D64.7010508@ribosome.natur.cuni.cz>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
Message-ID: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>

Martin,

The example file you give in the bioperl bugzilla report has several  
blank annotation lines which may lead to additional problems.  When  
the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM,  
DEFINITION, etc) then it expects there will also be relevant data  
(text descriptions) accompanying it; I assume the BioPython parser  
expects likewise though I may be wrong.

AFAIK the inclusion of field names w/o text isn't GenBank/EMBL-compliant.
In GenBank records, fields lacking text either have a '.' instead or
are left out entirely:

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

We could add a fix but you should probably contact the ApE developers  
and request that field names w/o text be left out or have '.' added.

chris

On Jun 5, 2007, at 9:04 AM, Martin MOKREJŠ wrote:

> Ezequiel Panepucci wrote:
>>>     genbank entry = parser.parse(fhandle)
>>
>> there is a space character between "genbank" and "entry".
>> It is a syntax error.
>> I suppose you meant "genbank_entry" ?
>
> Yes, the next command was right and has shown the error. Sorry, I  
> forgot
> to delete the first attempt. ;-)
>
>>>> genbank_entry = parser.parse(fhandle)
> Traceback (most recent call last):
>  File "<stdin>", line 1, in ?
>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py",  
> line 187, in parse
>    self._scanner.feed(handle, self._consumer)
>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",  
> line 360, in feed
>    self._feed_first_line(consumer, self.line)
>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",  
> line 835, in _feed_first_line
>    assert False, \
> AssertionError: Did not recognise the LOCUS line layout:
> LOCUS               6499 bp ds-DNA     linear       02-AUG-2006
>
>>>>
>
> Martin
> _______________________________________________
> BioPython mailing list  -  BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From stewarta at nmrc.navy.mil  Tue Jun  5 11:34:14 2007
From: stewarta at nmrc.navy.mil (Andrew Stewart)
Date: Tue, 5 Jun 2007 11:34:14 -0400
Subject: [Bioperl-l] Setting attributes on a Bio::DB::GFF::Feature object
Message-ID: <95C9F539-A4C4-4B6A-8DA8-079B957BF909@nmrc.navy.mil>

I see bidirectional mutator methods for source, type, strand, etc. in  
the Bio::DB::GFF::Feature documentation but I see that ->attributes  
is only able to get and not set the feature attributes.  Is there no  
way to modify the attributes of a Bio::DB::GFF::Feature live?


--
Andrew Stewart
Research Assistant, Genomics Team
Navy Medical Research Center (NMRC)
Biological Defense Research Directorate (BDRD)
BDRD Annex
12300 Washington Avenue, 2nd Floor
Rockville, MD 20852

email: stewarta at nmrc.navy.mil
phone: 301-231-6700 Ext 270


From cjfields at uiuc.edu  Tue Jun  5 12:07:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 5 Jun 2007 11:07:41 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
	<24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
Message-ID: <D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>

One thing I missed which explains the biopython error: the LOCUS line
is missing the locus identifier (see the NCBI example record link).
This doesn't choke the bioperl parser but it appears to stop the
biopython parser in its tracks (maybe a feature instead of a bug!).

You should try adding a unique identifier (maybe the name of the file  
or record) to the LOCUS line to see if it works:

LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006

The bioperl parser in CVS writes out the correct alphabet when this  
is added:

LOCUS       testfile                6499 bp    ds-DNA  linear   02-AUG-2006

I'll try adding a warning to the bioperl parser for this.
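
A check of this kind can be sketched in a few lines. The snippet below
is a rough Python illustration, not the bioperl or biopython parser
logic: locus_name is a hypothetical helper, it only looks at
whitespace-separated tokens, and it assumes nucleotide records whose
length is followed by "bp".

```python
def locus_name(line):
    """Return the locus name from a GenBank LOCUS line, or None if absent."""
    tokens = line.split()
    if not tokens or tokens[0] != "LOCUS":
        raise ValueError("not a LOCUS line")
    # Expected layout: LOCUS <name> <length> bp ...
    if len(tokens) >= 4 and tokens[2].isdigit() and tokens[3] == "bp":
        return tokens[1]
    # Name missing: LOCUS <length> bp ...
    if len(tokens) >= 3 and tokens[1].isdigit() and tokens[2] == "bp":
        return None
    raise ValueError("unrecognised LOCUS line layout")


# The two LOCUS lines quoted in this thread.
bad = "LOCUS               6499 bp ds-DNA     linear       02-AUG-2006"
good = "LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006"

for line in (bad, good):
    name = locus_name(line)
    if name is None:
        print("warning: LOCUS line has no locus identifier")
    else:
        print("locus identifier:", name)
```

Warning instead of failing keeps files like the ApE export readable
while still flagging that the record is not well formed.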

chris

On Jun 5, 2007, at 10:28 AM, Chris Fields wrote:

> Martin,
>
> The example file you give in the bioperl bugzilla report has several
> blank annotation lines which may lead to additional problems.  When
> the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM,
> DEFINITION, etc) then it expects there will also be relevant data
> (text descriptions) accompanying it; I assume the BioPython parser
> expects likewise though I may be wrong.
>
> AFAIK the inclusion of field names w/o text isn't GenBank/EMBL-
> compliant.  GenBank records lacking text either have a '.' instead or
> are left out entirely:
>
> http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html
>
> We could add a fix but you should probably contact the ApE developers
> and request that field names w/o text be left out or have '.' added.
>
> chris
>
> On Jun 5, 2007, at 9:04 AM, Martin MOKREJŠ wrote:
>
>> Ezequiel Panepucci wrote:
>>>>     genbank entry = parser.parse(fhandle)
>>>
>>> there is a space character between "genbank" and "entry".
>>> It is a syntax error.
>>> I suppose you meant "genbank_entry" ?
>>
>> Yes, the next command was right and has shown the error. Sorry, I
>> forgot
>> to delete the first attempt. ;-)
>>
>>>>> genbank_entry = parser.parse(fhandle)
>> Traceback (most recent call last):
>>  File "<stdin>", line 1, in ?
>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py",
>> line 187, in parse
>>    self._scanner.feed(handle, self._consumer)
>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>> line 360, in feed
>>    self._feed_first_line(consumer, self.line)
>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>> line 835, in _feed_first_line
>>    assert False, \
>> AssertionError: Did not recognise the LOCUS line layout:
>> LOCUS               6499 bp ds-DNA     linear       02-AUG-2006
>>
>>>>>
>>
>> Martin
>> _______________________________________________
>> BioPython mailing list  -  BioPython at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From staffa at niehs.nih.gov  Tue Jun  5 22:00:34 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Tue, 05 Jun 2007 22:00:34 -0400
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <C170E69F.246E%staffa@niehs.nih.gov>
Message-ID: <C28B8D82.51AE%staffa@niehs.nih.gov>

I am wondering if I knew what this error message exactly meant, if I could
discern my error. 
I don't see much difference in this program and programs that worked.
Can I assume that the new worked because an index file exists?
I don't know how the filehandle UTR_TT_GENES gets involved.
Maybe I should use some other module, but I really would like to have
get_Seq_by_id functionality.

The error message:
Dpse ortholog = Dpse_GA17307
fetching GA17307
Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,
<UTR_TT_GENES> line 4.

Relevant code:
#!/usr/bin/perl
#
#
#
use strict;
use Bio::DB::Fasta;
use Bio::Tools::SeqWords;
use Bio::Seq;
use Bio::SeqIO;
#
my $db = Bio::DB::Fasta->new(
    '/home/staffa/clients/Kari/D_pse_genome/testit/TT_orthologs_Dpse_genes.fa',
    -makeid => \&make_my_id);
...
...
...
my $pse_obj = $db->get_Seq_by_id('GA17307');
my $pse_sequence = $pse_obj->seq;


Nick Staffa 
Telephone: 919-316-4569  (NIEHS: 6-4569)
Scientific Computing Support Group
NIEHS Information Technology Support Services Contract
(Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov)
National Institute of Environmental Health Sciences
National Institutes of Health
Research Triangle Park, North Carolina


From jason at bioperl.org  Tue Jun  5 23:12:40 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 5 Jun 2007 20:12:40 -0700
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <C28B8D82.51AE%staffa@niehs.nih.gov>
References: <C28B8D82.51AE%staffa@niehs.nih.gov>
Message-ID: <EC9E4A2E-2C06-4ADE-8317-9E25DDF1C9C4@bioperl.org>

the file handle is probably not important, Perl just reports this if  
there is a filehandle open.

more importantly what is on line 84....

my guess is you are trying to get a sequence out and it doesn't exist
- some error-checking code around the lines getting the sequence out
would be helpful.


On Jun 5, 2007, at 7:00 PM, Staffa, Nick (NIH/NIEHS) wrote:

> I am wondering if I knew what this error message exactly meant, if  
> I could
> discern my error.
> I don't see much difference in this program and programs that worked.
> Can I assume that the new worked because an index file exists?
> I don't know how the filehandle UTR_TT_GENES gets involved.
> Maybe I should use some other module, but I really would like to have
> get_Seq_by_id functionality.
>
> The error message:
> Dpse ortholog = Dpse_GA17307
> fetching GA17307
> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl  
> line 84,
> <UTR_TT_GENES> line 4.
>
> Relevant code:
> #!/usr/bin/perl
> #
> #
> #
> use strict;
> use Bio::DB::Fasta;
> use Bio::Tools::SeqWords;
> use Bio::Seq;
> use Bio::SeqIO;
> #
> my $db =
> Bio::DB::Fasta->new('/home/staffa/clients/Kari/D_pse_genome/testit/ 
> TT_orthol
> ogs_Dpse_genes.fa',
>                                 -makeid => \&make_my_id);
> ...
> ...
> ...
> my $pse_obj = $db->get_Seq_by_id('GA17307');
> my $pse_sequence = $pse_obj->seq;
>
>
>
>
> Nick Staffa
> Telephone: 919-316-4569  (NIEHS: 6-4569)
> Scientific Computing Support Group
> NIEHS Information Technology Support Services Contract
> (Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov)
> National Institute of Environmental Health Sciences
> National Institutes of Health
> Research Triangle Park, North Carolina
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/



From torsten.seemann at infotech.monash.edu.au  Wed Jun  6 02:06:37 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 6 Jun 2007 16:06:37 +1000
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <C28B8D82.51AE%staffa@niehs.nih.gov>
References: <C170E69F.246E%staffa@niehs.nih.gov>
	<C28B8D82.51AE%staffa@niehs.nih.gov>
Message-ID: <a79f6a4b0706052306r16f7ce61y28448c18349ac3f4@mail.gmail.com>

Nick,

> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,

The error makes it pretty clear. You are calling the ->seq method on
an undefined value, i.e. $pse_obj.

> my $pse_obj = $db->get_Seq_by_id('GA17307');

# check we got something!
die "sequence not in database" unless $pse_obj;

> my $pse_sequence = $pse_obj->seq;


-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010

From shameer at ncbs.res.in  Wed Jun  6 02:27:42 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Wed, 6 Jun 2007 11:57:42 +0530 (IST)
Subject: [Bioperl-l] Validation of files using BioPerl
Message-ID: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>

Dear All,

How can I validate an input file in fasta/PIR/GenPept/PDB format using
Bioperl? (This is to avoid unnecessary files being submitted to servers
by new users.)  Is any module available?

Many thanks in advance,
-- 
Shameer Khadar


From cjfields at uiuc.edu  Wed Jun  6 08:37:28 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 6 Jun 2007 07:37:28 -0500
Subject: [Bioperl-l] Validation of files using BioPerl
In-Reply-To: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>
References: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>
Message-ID: <39F5F622-0C93-4DC5-B969-491F789FC932@uiuc.edu>

It has been discussed but never coded.  I believe if it passes  
through the Bio::SeqIO parser it's generally considered validly  
formatted (spacing, balanced quotes), though it doesn't specifically  
check FT keys and qualifiers for invalid ones, look for missing  
annotation, check taxonomy, etc.

As long as the end sequence mark (//) is present for every file, you
could try parsing the file into chunks (read with 'local $/ = '//';')
and tossing the seq chunks as a filehandle (via IO::String) to a
Bio::SeqIO object wrapped in an eval block (the parser resets $/, so
it should work).  Follow the eval with a check of $@ for caught
errors.  It might get tedious for big sequences...
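
A rough sketch of that chunk-and-eval pattern using only core Perl.
Everything named here is an assumption for illustration: parse_record is
a stand-in for handing the chunk to a Bio::SeqIO object (it only checks
for LOCUS and ORIGIN lines), validate_records is a hypothetical helper,
and the separator is "//" followed by a newline rather than a bare '//'
so that URLs inside a record do not split it.

```perl
use strict;
use warnings;

# Stand-in for the real parse step; it dies on a malformed record, the
# way a parser would throw an exception.
sub parse_record {
    my ($chunk) = @_;
    die "no LOCUS line\n"  unless $chunk =~ /^LOCUS\s+\S+/m;
    die "no ORIGIN line\n" unless $chunk =~ /^ORIGIN/m;
    return 1;
}

# Read "//"-terminated records from a filehandle and collect the ones
# that fail to parse as [record_number, error_message] pairs.
sub validate_records {
    my ($fh) = @_;
    my @errors;
    local $/ = "//\n";                  # one record per read
    my $n = 0;
    while (my $chunk = <$fh>) {
        next unless $chunk =~ /\S/;     # skip trailing whitespace
        $n++;
        my $ok = eval { parse_record($chunk); 1 };
        push @errors, [$n, $@] unless $ok;
    }
    return @errors;
}

# Two records in memory: the first is fine, the second has no LOCUS line.
my $data = "LOCUS  a  10 bp\nORIGIN\n acgtacgtac\n//\nDEFINITION  broken\n//\n";
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
printf "record %d: %s", @$_ for validate_records($fh);
```

Collecting the errors instead of dying on the first one lets a submission
form report every bad record in a single pass.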

chris

On Jun 6, 2007, at 1:27 AM, Shameer Khadar wrote:

> Dear All,
>
> How to validate an input file in fasta/PIR/GenPept/PDB format using
> Bioperl ? (This is to avoid unnecessary files to be submitted to  
> servers
> by new users).   Any module available ?
>
> Many thanks in advance,
> -- 
> Shameer Khadar
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From staffa at niehs.nih.gov  Wed Jun  6 10:40:49 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Wed, 06 Jun 2007 10:40:49 -0400
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <a79f6a4b0706052306r16f7ce61y28448c18349ac3f4@mail.gmail.com>
Message-ID: <C28C3FB1.4B73%staffa@niehs.nih.gov>

Indeed.
One must know what is actually in his header,
AND 
one must write the appropriate make_id subroutine
AND
one must specify the exact ID.
THEN things might work.
And they did!
THANK YOU
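
For the archive, a sketch of what such a make_id subroutine might look
like. The header layout is an assumption (the real file's headers are not
shown in this thread): it supposes lines such as ">Dpse_GA17307 ..." and
strips the species prefix so the sequence can be fetched as 'GA17307'. As
far as I know, Bio::DB::Fasta hands the callback the header line and
indexes the sequence under whatever string it returns.

```perl
use strict;
use warnings;

# Hypothetical -makeid callback.
# Assumed header layout: ">Dpse_GA17307 optional description".
sub make_my_id {
    my ($header) = @_;
    my ($id) = $header =~ /^>?\s*(\S+)/;   # first word, leading ">" dropped
    return unless defined $id;
    $id =~ s/^Dpse_//;                      # drop the assumed species prefix
    return $id;
}

print make_my_id('>Dpse_GA17307 ortholog of a TT gene'), "\n";
```

The ID passed to get_Seq_by_id must then match this return value exactly.
If an index file was built earlier under a different ID scheme, it may
need rebuilding before the new IDs are found; I believe Bio::DB::Fasta
accepts a -reindex option for that.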


On 6/6/07 2:06 AM, "Torsten Seemann"
<torsten.seemann at infotech.monash.edu.au> wrote:

> Nick,
> 
>> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,
> 
> The error makes it pretty clear. You are calling the ->seq method on
> an undefined value, ie. $pse_obj.
> 
>> my $pse_obj = $db->get_Seq_by_id('GA17307');
> 
> # check we got something!
> die "sequence not in database" unless $pse_obj;
> 
>> my $pse_sequence = $pse_obj->seq;
> 


From jaudall at gmail.com  Wed Jun  6 17:51:33 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Wed, 6 Jun 2007 15:51:33 -0600
Subject: [Bioperl-l] blastxml interation
Message-ID: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>

I was searching in the deobfuscator under
Bio::Search::Result::BlastResult but there doesn't seem to be a
method to extract the iteration number from a
blastxml report.  I can see this number being possibly useful to count the
number of queries that didn't hit anything since there are no empty reports in
the blastxml output.  If I'm missing something, I would welcome an example of
how to retrieve the result iteration number.  Thanks in advance for any
suggestions.

Josh

From dmessina at wustl.edu  Wed Jun  6 18:18:26 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 6 Jun 2007 17:18:26 -0500
Subject: [Bioperl-l] blastxml interation
In-Reply-To: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
Message-ID: <CBBAAD1F-563D-4B43-B086-F707989939EA@wustl.edu>

I think you want to look at the hits(), num_hits() and no_hits_found()
methods. There is a private method _next_iteration_index() which
should do what you asked for, but num_hits() looks like the better way.

By the way, hits() and num_hits() are listed on the Deobfuscator as  
having no documentation. This (as the below shows) is incorrect and  
is due to some nonstandard formatting issues which I will correct.  
_next_iteration_index() isn't listed on the Deobfuscator because it's  
a private method.


Hope this helps!
Dave


hits()

This method overrides Bio::Search::Result::GenericResult::hits to take
into account the possibility of multiple iterations, as occurs in
PSI-BLAST reports.
If there are multiple iterations, all 'new' hits for all iterations
are returned.
These are the hits that did not occur in a previous iteration.
See Also: Bio::Search::Result::GenericResult::hits

num_hits()

This method overrides Bio::Search::Result::GenericResult::num_hits to
take into account the possibility of multiple iterations, as occurs in
PSI-BLAST reports.
If there are multiple iterations, calling num_hits() returns the
number of 'new' hits for each iteration. These are the hits that did
not occur in a previous iteration.
See Also: Bio::Search::Result::GenericResult::num_hits

no_hits_found()

  Usage     : $nohits = $blast->no_hits_found( $iteration_number );
  Purpose   : Get boolean indicator indicating whether or not any hits
              were present in the report.
              This is NOT the same as determining the number of hits via
              the hits() method, which will return zero hits if there
              were no hits in the report or if all hits were filtered
              out during the parse.

              Thus, this method can be used to distinguish these
              possibilities for hitless reports generated when filtering.

  Returns   : Boolean
  Argument  : (optional) integer indicating the iteration number
              (PSI-BLAST)
              If iteration number is not specified and this is a
              PSI-BLAST result, then this method will return true only
              if all iterations had no hits found.


From apurva at cshl.edu  Wed Jun  6 19:51:45 2007
From: apurva at cshl.edu (Apurva Narechania)
Date: Wed, 6 Jun 2007 19:51:45 -0400
Subject: [Bioperl-l] non-palindromic issue in Bio::Restriction::Analysis
Message-ID: <3F7C7E33-416A-4141-969A-DDC4716E8A44@cshl.edu>

Hi,

I was hoping you could confirm and give me some feedback on an issue
I think I've found with the Bio::Restriction::Analysis module. I am
using the enzyme AciI, a non-palindromic restriction enzyme with a
5' C | CGC 3' recognition site. The module should search both the
forward and the reverse complement strings in the case of a
non-palindromic enzyme. I have found that this works only
intermittently. For example, the following sequence:

GAAAAAAACAAAGGAAGAAGCTAGCTAGCAGGGCACGCGGTTTGAGGATGGCTGGTGGCCGACCGCAGGGCGCGCGGTTG
GAGGATTGCTGGTGGCCGACCAGATGAAACTCACGCGCGGCTGGGGACAGCTGGAATATTTGGGCGGCGGCGGCTGGTAT
TACGGGAAAGGAGAGATAGGGTTTTGGACGGCAGCAGCTGGTATTTGGGCCACCAATTTTGCGCGCCAGTACAGGACACC
GATGCCGCAAATTGCACAATGCCTTTTATGGCGACTGACAGTGCGATGCTATAGGTATGAATTGTCGACTGACAAAGTGA
CACTATTCACATATAAATATAACGAATAACACTCAGTTGGAATATAGACATATGCCGACTCACCATCTGTGGCAATGTAT
ACCGACTAACAATTCGATGCTAATTCTCTATTTATAGCGACAGTCGTCAGACACTAATTTGGTGTTGTGGTATAATGCTA
GTGCCTCACCGCTGTAGGTGTTGGTCTACTGGTGC

should digest into 10 fragments using this enzyme, but the module
produces only 7. Could you please confirm this behavior and, if you
observe it, suggest some possible fixes? This may be a bug in the
_non_pal_enz method, or it may be me overlooking something pretty obvious.
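
One way to cross-check the expected count independently of the module is
sketched below in core Perl. It is not how Bio::Restriction::Analysis
works internally; the sub names are made up for illustration. The idea is
that a site on the reverse strand shows up on the forward strand as the
reverse complement of the recognition sequence (GCGG for CCGC), so
counting both patterns on the forward strand gives the number of sites,
and a linear molecule with n cuts gives n + 1 fragments.

```perl
use strict;
use warnings;

# Reverse-complement a DNA string.
sub revcom {
    my ($seq) = @_;
    (my $rc = reverse $seq) =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Count occurrences of $site in $seq, overlapping ones included.
sub count_sites {
    my ($seq, $site) = @_;
    my $count = () = $seq =~ /(?=\Q$site\E)/gi;
    return $count;
}

# Expected fragment count for a linear sequence. For a non-palindromic
# site the reverse complement is searched as well.
sub expected_fragments {
    my ($seq, $site) = @_;
    my $rc    = revcom($site);
    my $sites = count_sites($seq, $site);
    $sites   += count_sites($seq, $rc) if uc $rc ne uc $site;
    return $sites + 1;
}

# Toy sequence with one CCGC and one GCGG.
print expected_fragments('AACCGCTTGCGGAA', 'CCGC'), "\n";
```

Running the same count on the full sequence above, once with and once
without the reverse-complement search, would show whether the missing
fragments all come from reverse-strand sites.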

Thanks,
Apurva Narechania.


From cjfields at uiuc.edu  Wed Jun  6 20:51:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 6 Jun 2007 19:51:00 -0500
Subject: [Bioperl-l] blastxml interation
In-Reply-To: <CBBAAD1F-563D-4B43-B086-F707989939EA@wustl.edu>
References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
	<CBBAAD1F-563D-4B43-B086-F707989939EA@wustl.edu>
Message-ID: <B494A9F2-80CE-4761-B67F-127B37358819@uiuc.edu>

Joshua,

Just to make sure there is no confusion, do you mean a  
Bio::Search::Iteration::IterationI-based object?  The iteration tags  
have multiple meanings apparently in BLAST XML output (multiple  
queries, multiple PSI-BLAST iterations).  The current  
SearchIO::blastxml parser returns multiple  
Bio::Search::Result::BlastResult objects based on the iterations, so  
PSI-BLAST output is treated as multiple BLAST reports regardless  
(i.e. no Iteration objects).  This is something I want to rectify but
it may not be an easy fix.

chris

On Jun 6, 2007, at 5:18 PM, David Messina wrote:

> I think you want to look at the hits(), num_hits() and no_hits_found
> () methods. There is a private method _next_iteration_index() which
> should do what you asked for, but num_hits() looks like the better  
> way.
>
> By the way, hits() and num_hits() are listed on the Deobfuscator as
> having no documentation. This (as the below shows) is incorrect and
> is due to some nonstandard formatting issues which I will correct.
> _next_iteration_index() isn't listed on the Deobfuscator because it's
> a private method.
>
>
> Hope this helps!
> Dave
>
>
> hits()
>
> This method overrides Bio::Search::Result::GenericResult::hits to take
> into account the possibility of multiple iterations, as occurs in PSI-
> BLAST reports.
> If there are multiple iterations, all 'new' hits for all iterations
> are returned.
> These are the hits that did not occur in a previous iteration.
> See Also: Bio::Search::Result::GenericResult::hits
>
> num_hits()
>
> This method overrides Bio::Search::Result::GenericResult::num_hits to
> take
> into account the possibility of multiple iterations, as occurs in PSI-
> BLAST reports.
> If there are multiple iterations, calling num_hits() returns the
> number of
> 'new' hits for each iteration. These are the hits that did not occur
> in a previous iteration.
> See Also: Bio::Search::Result::GenericResult::num_hits
>
> no_hits_found()
>
>   Usage     : $nohits = $blast->no_hits_found( $iteration_number );
>   Purpose   : Get boolean indicator indicating whether or not any hits
>               were present in the report.
>               This is NOT the same as determining the number of  
> hits via
>               the hits() method, which will return zero hits if there
> were no
>               hits in the report or if all hits were filtered out
> during the parse.
>
>               Thus, this method can be used to distinguish these
> possibilities
>               for hitless reports generated when filtering.
>
>   Returns   : Boolean
>   Argument  : (optional) integer indicating the iteration number (PSI-
> BLAST)
>               If iteration number is not specified and this is a PSI-
> BLAST result,
>               then this method will return true only if all
> iterations had
>               no hits found.
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Wed Jun  6 20:45:14 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 6 Jun 2007 20:45:14 -0400
Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db
Message-ID: <DC1816E2-68C0-400C-A777-F6D14DE2B870@gmx.net>

I have added support to BioSQL and bioperl-db for schemas in  
PostgreSQL. A schema in PostgreSQL is more or less a namespace for  
database objects (tables, indexes, views, etc) within a database.

(A database in PostgreSQL is similar to the concept of a user in  
Oracle or MySQL, and therefore for the latter two schemas are  
synonymous with a user. [Not sure I'm still up-to-date on this for  
MySQL, but at least that's what I recall.])

When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts,  
you specify the schema in which BioSQL resides using the --schema  
option.

If you are using bioperl-db as a library, the Bio::DB::BioDB->new()  
call also accepts a -schema named parameter, and Bio::DB::DBContextI  
objects have a $dbc->schema() property for getting/setting the  
schema, Bio::DB::SimpleDBContext->new() accepts a -schema parameter,  
and you may also add the property to the .bioperldb connection  
parameter file (-schema => 'yourschemahere').

Thanks to Brian Osborne for being the instigator (and tester, and
for adding the code to load_ncbi_taxonomy.pl - I came too late).

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jaudall at gmail.com  Wed Jun  6 17:41:08 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Wed, 6 Jun 2007 15:41:08 -0600
Subject: [Bioperl-l] blastxml interation number
Message-ID: <52cea20c0706061441n96ce803v9422e8d14461c2bd@mail.gmail.com>

I was searching in the deobfuscator under
Bio::Search::Result::BlastResult but there doesn't seem to be a
method to extract the iteration number from a
blastxml report.  I can see this number being very useful to count the
number of queries that didn't hit anything since there are no empty reports in
the blastxml output.  If I'm missing something, I would welcome an example of
how to retrieve the result iteration number; otherwise I'm suggesting that
an iteration_count feature be added to the Result object.  Thanks in advance
for any suggestions.

Josh

From holland at ebi.ac.uk  Thu Jun  7 03:33:25 2007
From: holland at ebi.ac.uk (Richard Holland)
Date: Thu, 07 Jun 2007 08:33:25 +0100
Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db
In-Reply-To: <DC1816E2-68C0-400C-A777-F6D14DE2B870@gmx.net>
References: <DC1816E2-68C0-400C-A777-F6D14DE2B870@gmx.net>
Message-ID: <4667B4C5.6070107@ebi.ac.uk>

Sounds great.

BioJava users shouldn't need to change anything to get this to work as
PostgreSQL JDBC connection objects already require you to specify a schema.

cheers,
Richard


Hilmar Lapp wrote:
> I have added support to BioSQL and bioperl-db for schemas in PostgreSQL.
> A schema in PostgreSQL is more or less a namespace for database objects
> (tables, indexes, views, etc) within a database.
> 
> (A database in PostgreSQL is similar to the concept of a user in Oracle
> or MySQL, and therefore for the latter two schemas are synonymous with a
> user. [Not sure I'm still up-to-date on this for MySQL, but at least
> that's what I recall.])
> 
> When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts, you
> specify the schema in which BioSQL resides using the --schema option.
> 
> If you are using bioperl-db as a library, the Bio::DB::BioDB->new() call
> also accepts a -schema named parameter, and Bio::DB::DBContextI objects
> have a $dbc->schema() property for getting/setting the schema,
> Bio::DB::SimpleDBContext->new() accepts a -schema parameter, and you may
> also add the property to the .bioperldb connection parameter file
> (-schema => 'yourschemahere').
> 
> Thanks for Brian Osborne for being the instigator (and tester, and for
> adding the code to load_ncbi_taxonomy.pl - I came too late).
> 
>     -hilmar
> --===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
> 
> 
> 
> 

From mmokrejs at ribosome.natur.cuni.cz  Thu Jun  7 10:26:44 2007
From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ)
Date: Thu, 07 Jun 2007 16:26:44 +0200
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
	<24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
	<D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>
Message-ID: <466815A4.9060505@ribosome.natur.cuni.cz>

Hi,

Chris Fields wrote:
> One thing I missed which explains the biopython error: the LOCUS line is 
> missing the locus identifier (see the NCBI example record link).  This 
> doesn't choke the bioperl parser but it appears to stop the biopython 
> parser in it's tracks (maybe a feature instead of a bug!).
> 
> You should try adding a unique identifier (maybe the name of the file or 
> record) to the LOCUS line to see if it works:
> 
> LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006
> 
> The bioperl parser in CVS writes out the correct alphabet when this is 
> added:
> 
> LOCUS       testfile                6499 bp    ds-DNA  linear   02-AUG-2006
> 
> I'll try adding a warning to the bioperl parser for this.

I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305 but let me
emphasize the LOCUS line now contains 

LOCUS                      pRL        5428 bp ds-DNA   linear       07-JUN-2007


which still does not comply with the line you have proposed. But it can be
parsed by bioperl-live from cvs. Is it still wrong? Testcase as pRL.gb-new
in the bugzilla record #2305.

Martin

> 
> chris
> 
> On Jun 5, 2007, at 10:28 AM, Chris Fields wrote:
> 
>> Martin,
>>
>> The example file you give in the bioperl bugzilla report has several
>> blank annotation lines which may lead to additional problems.  When
>> the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM,
>> DEFINITION, etc) then it expects there will also be relevant data
>> (text descriptions) accompanying it; I assume the BioPython parser
>> expects likewise though I may be wrong.
>>
>> AFAIK the inclusion of field names w/o text isn't GenBank/EMBL-
>> compliant.  GenBank records lacking text either have a '.' instead or
>> are left out entirely:
>>
>> http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html
>>
>> We could add a fix but you should probably contact the ApE developers
>> and request that field names w/o text be left out or have '.' added.
>>
>> chris
>>
>> On Jun 5, 2007, at 9:04 AM, Martin MOKREJŠ wrote:
>>
>>> Ezequiel Panepucci wrote:
>>>>>     genbank entry = parser.parse(fhandle)
>>>>
>>>> there is a space character between "genbank" and "entry".
>>>> It is a syntax error.
>>>> I suppose you meant "genbank_entry" ?
>>>
>>> Yes, the next command was right and has shown the error. Sorry, I
>>> forgot
>>> to delete the first attempt. ;-)
>>>
>>>>>> genbank_entry = parser.parse(fhandle)
>>> Traceback (most recent call last):
>>>  File "<stdin>", line 1, in ?
>>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py",
>>> line 187, in parse
>>>    self._scanner.feed(handle, self._consumer)
>>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>>> line 360, in feed
>>>    self._feed_first_line(consumer, self.line)
>>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>>> line 835, in _feed_first_line
>>>    assert False, \
>>> AssertionError: Did not recognise the LOCUS line layout:
>>> LOCUS               6499 bp ds-DNA     linear       02-AUG-2006
>>>
>>>>>>
>>>
>>> Martin
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>>
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> 
> 

-- 
Dr. Martin Mokrejs
Dept. of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs


From cjfields at uiuc.edu  Thu Jun  7 11:31:45 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 7 Jun 2007 10:31:45 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <466815A4.9060505@ribosome.natur.cuni.cz>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
	<24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
	<D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>
	<466815A4.9060505@ribosome.natur.cuni.cz>
Message-ID: <2A403865-F1E8-4D19-8D19-455C22E7C6D9@uiuc.edu>

On Jun 7, 2007, at 9:26 AM, Martin MOKREJŠ wrote:

> Hi,
>
> Chris Fields wrote:
>> One thing I missed which explains the biopython error: the LOCUS  
>> line is missing the locus identifier (see the NCBI example record  
>> link).  This doesn't choke the bioperl parser but it appears to  
>> stop the biopython parser in it's tracks (maybe a feature instead  
>> of a bug!).
>> You should try adding a unique identifier (maybe the name of the  
>> file or record) to the LOCUS line to see if it works:
>> LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006
>> The bioperl parser in CVS writes out the correct alphabet when  
>> this is added:
>> LOCUS       testfile                6499 bp    ds-DNA  linear   02-AUG-2006
>> I'll try adding a warning to the bioperl parser for this.
>
> I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305  
> but let me
> emphasize the LOCUS line now contains
> LOCUS                      pRL        5428 bp ds-DNA   linear       07-JUN-2007
>
>
> which still does not comply with the line you have proposed. But it  
> can be
> parsed by bioperl-live from cvs. Is it still wrong? Testcase as  
> pRL.gb-new
> in the bugzilla record #2305.
>
> Martin

That should work.  There isn't a strict uniqueness test (that would  
require caching and isn't worth the trouble IMHO), though it's  
required you add something unique for the accession/locus if you plan  
on indexing them in the future.

Parsing GenBank data produced by third-party software is
problematic at best; some programs follow no consistent rule in their
GenBank output, even though the specification is plainly
stated in the NCBI release notes.  My take on that is to have a
stricter (read: follows the release notes) GenBank parser which passes
off the data in the record to default handler methods.  A user could
then replace the defined handlers with their own by subclassing the
default handler class and overriding the methods, or by adding their
own code references directly.
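
A toy sketch of that handler idea in core Perl, only to make the two
override routes concrete. The class names, the handle method, and the
return strings are all hypothetical; nothing like this exists in bioperl
as written here.

```perl
use strict;
use warnings;

# Hypothetical default handlers, one method per GenBank line type.
package DefaultHandlers;

sub new {
    my ($class, %code_refs) = @_;
    return bless {%code_refs}, $class;
}

sub LOCUS {
    my ($self, $text) = @_;
    my ($name) = split ' ', $text;
    return "locus=$name";
}

sub DEFINITION {
    my ($self, $text) = @_;
    return "desc=$text";
}

# Dispatch: a user-supplied code reference wins, then a method (possibly
# overridden in a subclass); unknown line types are ignored.
sub handle {
    my ($self, $key, $text) = @_;
    return $self->{$key}->($text) if ref($self->{$key}) eq 'CODE';
    my $method = $self->can($key) or return;
    return $self->$method($text);
}

# Route 1: subclass and override a method.
package LenientHandlers;
our @ISA = ('DefaultHandlers');

sub LOCUS {
    my ($self, $text) = @_;
    my ($name) = split ' ', $text;
    return 'locus=' . (defined $name ? $name : 'unnamed');
}

package main;

# Route 2: pass a code reference for one line type.
my $h = LenientHandlers->new(DEFINITION => sub { 'desc=' . uc shift });
print $h->handle(LOCUS      => 'pRL 5428 bp'), "\n";
print $h->handle(DEFINITION => 'test plasmid'), "\n";
```

The strict parser would only split the record into line types and call
handle; how lenient to be about third-party output then lives entirely in
the handlers.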

chris

...


From rich at thevillas.eclipse.co.uk  Fri Jun  8 07:00:45 2007
From: rich at thevillas.eclipse.co.uk (richard)
Date: Fri, 08 Jun 2007 12:00:45 +0100
Subject: [Bioperl-l] protparam
Message-ID: <466936DD.8080604@thevillas.eclipse.co.uk>


Hi,

I noticed that in April someone asked whether there was a bioperl mod 
for obtaining protein sequence related properties using protparam.
I have a module that could potentially be submitted to bioperl for this 
purpose. Does anybody have any thoughts on whether it should go in?

Example script and the module are at:

http://81.5.159.173/webshare/ 


Cheers
Rich


From cjfields at uiuc.edu  Fri Jun  8 08:37:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 8 Jun 2007 07:37:27 -0500
Subject: [Bioperl-l] protparam
In-Reply-To: <466936DD.8080604@thevillas.eclipse.co.uk>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
Message-ID: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>

Richard,

We'll gladly add this in, though it'll need to be bioperlized  
(inherit Bio::Root::Root).  We also generally ask for tests but it  
should be easy to write up a quick test suite using any protein seq.

If you can, could you add some bioperl-like POD to the module (i.e.  
SYNOPSIS, AUTHOR, DESCRIPTION, etc)?

thanks!

chris

On Jun 8, 2007, at 6:00 AM, richard wrote:

>
> Hi,
>
> I noticed that in April someone asked whether there was a bioperl mod
> for obtaining protein sequence related properties using protparam.
> I have a module that could potentially be submitted to bioperl for  
> this
> purpose. Does anybody have any thoughts on whether it should go in?
>
> Example script and the module are at:
>
> http://81.5.159.173/webshare/
>
>
> Cheers
> Rich

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From mmokrejs at ribosome.natur.cuni.cz  Fri Jun  8 07:09:42 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Fri, 08 Jun 2007 13:09:42 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file?
Message-ID: <466938F6.7050903@ribosome.natur.cuni.cz>

Hi,
  how can I convert a GenBank/EMBL-formatted file to a GFF file? The manpage for
Bio::Graphics::FeatureFile does not help me here. The information is in
the file, so I just want to extract the features into GFF format; the sequence
probably has to be stored somewhere too.
 Is there a tool that can convert it automatically? ;) That would be great. I
can't make the GFF manually for every file. Other programs draw plasmid maps
automatically from GenBank-formatted input, so how can I do it in bioperl?
Thanks for your help,
Martin

From shameer at ncbs.res.in  Fri Jun  8 10:11:00 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Fri, 8 Jun 2007 19:41:00 +0530 (IST)
Subject: [Bioperl-l] protparam
In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
	<4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
Message-ID: <54411.192.168.1.1.1181311860.squirrel@mail.ncbs.res.in>

Richard,

I was the one who asked for a protparam module in bioperl!
Good job.

Cheers,
SK

> Richard,
>
> We'll gladly add this in, though it'll need to be bioperlized
> (inherit Bio::Root::Root).  We also generally ask for tests but it
> should be easy to write up a quick test suite using any protein seq.
>
> If you can could you add some bioperl-like POD to the module (i.e.
> SYNOPSIS, AUTHOR, DESCRIPTION, etc)?
>
> thanks!
>
> chris


-- 
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From dmessina at wustl.edu  Fri Jun  8 10:58:20 2007
From: dmessina at wustl.edu (David Messina)
Date: Fri, 8 Jun 2007 09:58:20 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <466938F6.7050903@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
Message-ID: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>

Hi Martin,

You're in luck -- the BioPerl core distribution includes two scripts  
for doing just that:

	genbank2gff
	genbank2gff3

Look in the scripts directory of the distro.

Also, there is a *huge* amount of documentation and examples on the  
BioPerl website.

	http://www.bioperl.org/wiki/HOWTOs

Reading those, reading the FAQ, and searching the mailing list  
archives are where I look first when I don't know how to do something  
in BioPerl.


Dave

--
Dave Messina
Senior Analyst, Assembly Group
Genome Sequencing Center
Washington University
St. Louis, MO


From rich at thevillas.eclipse.co.uk  Fri Jun  8 11:51:21 2007
From: rich at thevillas.eclipse.co.uk (richard)
Date: Fri, 08 Jun 2007 16:51:21 +0100
Subject: [Bioperl-l] protparam
In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
	<4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
Message-ID: <46697AF9.2090502@thevillas.eclipse.co.uk>


Hi,

ok, great, that's no problem. I'll add the POD and bioperlize it,

thanks
Rich

Chris Fields wrote:
> Richard,
>
> We'll gladly add this in, though it'll need to be bioperlized  
> (inherit Bio::Root::Root).  We also generally ask for tests but it  
> should be easy to write up a quick test suite using any protein seq.
>
> If you can could you add some bioperl-like POD to the module (i.e.  
> SYNOPSIS, AUTHOR, DESCRIPTION, etc)?
>
> thanks!
>
> chris


From cjfields at uiuc.edu  Fri Jun  8 13:45:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 8 Jun 2007 12:45:17 -0500
Subject: [Bioperl-l] protparam
In-Reply-To: <46697AF9.2090502@thevillas.eclipse.co.uk>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
	<4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
	<46697AF9.2090502@thevillas.eclipse.co.uk>
Message-ID: <AA43E9C9-7064-438A-89A9-12E4B21E4F04@uiuc.edu>

Another issue is namespace.  I suggest Bio::Tools::ProtParam, though  
there may be some others out there.

We can add support for direct Bio::Seq/PrimarySeq input and other  
odds and ends once it's committed.  Good work!

chris

On Jun 8, 2007, at 10:51 AM, richard wrote:

>
> Hi,
>
> ok, great, that's no problem. I'll add the POD and bioperlize it,
>
> thanks
> Rich

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Mon Jun 11 07:30:24 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 11 Jun 2007 07:30:24 -0400
Subject: [Bioperl-l] script to load ITIS taxonomy
Message-ID: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>

Hi all -

I added a script to load the ITIS taxonomy (www.itis.gov) into the  
phylodb module. It is called load_itis_taxonomy.pl and is in the  
scripts/ directory.

It is independent of BioPerl right now (the ITIS download is either a  
MS SQL Server or an Informix dump - no kidding), but I'm hoping that  
at some point support for this can be integrated into Bio::TreeIO.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 11 08:24:50 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 11 Jun 2007 07:24:50 -0500
Subject: [Bioperl-l] script to load ITIS taxonomy
In-Reply-To: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>
References: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>
Message-ID: <99AC6C0F-10DD-4587-AFB3-32BC495CD2BD@uiuc.edu>


On Jun 11, 2007, at 6:30 AM, Hilmar Lapp wrote:

> Hi all -
>
> I added a script to load the ITIS taxonomy (www.itis.gov) into the
> phylodb module. It is called load_itis_taxonomy.pl and is in the
> scripts/ directory.
>
> It is independent of BioPerl right now (the ITIS download is either a
> MS SQL Server or an Informix dump - no kidding), but I'm hoping that
> at some point support for this can be integrated into Bio::TreeIO.
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

I second the TreeIO support.  Anyone up for it?

chris

From ryanx07 at hotmail.com  Mon Jun 11 11:24:31 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Mon, 11 Jun 2007 10:24:31 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F372C263F25DA66E3F2B986B41A0@phx.gbl>

I just started to learn BioPerl by reading the BioPerl Tutorial on the 
BioPerl website. I tried the first example on my Windows machine:
use Bio::Perl;
$seq_object = get_sequence('swiss',"ID ROA1_HUMAN");
write_sequence(">roa1.fasta",'fasta',$seq_object);

I got the error as the following:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:153
STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
STACK: t8.pl:7

I cannot figure out what is wrong and cannot find a solution on the web. 
Could someone help me, please?

Also, this leads to my second question: is there a way to search the archive 
of this list?

Thanks so much


R



From dmessina at wustl.edu  Mon Jun 11 12:34:29 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 11:34:29 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F372C263F25DA66E3F2B986B41A0@phx.gbl>
References: <BAY106-F372C263F25DA66E3F2B986B41A0@phx.gbl>
Message-ID: <25517EA3-7BDA-44AC-BDF3-93A6810D9D63@wustl.edu>

The example code works here, but I'm on OS X. Could you tell us which  
version of Perl and BioPerl you are using, and which operating system?

Are you getting anything in the roa1.fasta file?


> is there a way to search in the archive of the current list?

http://www.bioperl.org/wiki/Mailing_lists


Dave


From dmessina at wustl.edu  Mon Jun 11 14:48:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 13:48:23 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F39783926A21896CCB15CD9B41A0@phx.gbl>
References: <BAY106-F39783926A21896CCB15CD9B41A0@phx.gbl>
Message-ID: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu>

Hi,

Please use 'Reply All' so everyone on the list can follow the  
discussion.

Try adding the following line after the line that starts with  
$seq_object:

	print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";

And then run the program again. What do you get? Could you post a  
complete printout of what you're doing?


Dave


On Jun 11, 2007, at 11:45 AM, L Xu wrote:
> I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and  
> activeperl 5.8.8.819 Thank you very much.


From johnsonm at gmail.com  Mon Jun 11 20:45:13 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Mon, 11 Jun 2007 19:45:13 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
Message-ID: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>

    This bit in Bio::SeqFeature::Gene::Exon is causing me some
problems trying to extend Bio::Tools::Glimmer to handle 'wraparound'
genes (circular genomes):

sub location {
   my ($self,$value) = @_;

   if(defined($value) && $value->isa('Bio::Location::SplitLocationI')) {
       $self->throw("split or compound location is not allowed ".
                    "for an object of type " . ref($self));
   }
   return $self->SUPER::location($value);
}

    That seems to be there all the way back to the initial revision
(checked in by Hilmar).  I presume it's there because of code like
this ( from the seq() method in Bio::SeqFeature::Generic):

# assumming our seq object is sensible, it should not have to yank
# the entire sequence out here.

my $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());

    That's not going to work too well with a feature that has a
Bio::Location::Split location.  Fixing it up seems straightforward, if
a bit hackish.  Something like:

my $seq;

if (ref($self->location()) eq 'Bio::Location::Split') {
    my $seqstring = '';
    my @sublocs = $self->location()->sub_Location();

    # concatenate the sequence under each sub-location, in order
    foreach my $subloc (@sublocs) {
        $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(),
                                                 $subloc->end())->seq();
    }

    # assign to the outer $seq (a second 'my' here would shadow it)
    $seq = Bio::Seq->new(
                         -id  => $self->{'_gsf_seq'}->display_id(),
                         -seq => $seqstring
                        );
}
else {
    $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());
}

    I don't see any companion to trunc() in Bio::PrimarySeqI for
joining sequences.  A join() would be handy, and make the above
cleaner.
    Comments, suggestions, rotten fruit?
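
    To show what a join() might look like, here is a rough sketch that
works on plain strings rather than Bio::PrimarySeqI objects.  The helper
name join_subseqs() and its argument layout are hypothetical (nothing like
it is claimed to exist in BioPerl), and it ignores strand, so treat it as
an illustration of the concatenation only.

```perl
use strict;
use warnings;

# Hypothetical helper: concatenate 1-based, inclusive [start, end] ranges
# of a plain sequence string, in the order given.
sub join_subseqs {
    my ($seqstring, @ranges) = @_;
    my $joined = '';
    foreach my $range (@ranges) {
        my ($start, $end) = @$range;
        die "bad range $start..$end\n"
            if $start < 1 || $end < $start || $end > length $seqstring;
        $joined .= substr($seqstring, $start - 1, $end - $start + 1);
    }
    return $joined;
}

# A 10 bp "circular genome" with a gene that runs from position 8 through
# the origin to position 3, expressed as two sub-ranges: 8..10 and 1..3.
my $genome = 'ACGTACGTTG';
my $gene   = join_subseqs($genome, [8, 10], [1, 3]);
print "$gene\n";
```

    A real version would take the sub-locations of a split location and
call trunc() on the sequence object for each one, and would also have to
deal with features on the minus strand.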

From torsten.seemann at infotech.monash.edu.au  Tue Jun 12 02:18:27 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 12 Jun 2007 16:18:27 +1000
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
Message-ID: <a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>

Mark,

> if (ref($self->location()) eq 'Bio::Location::Split') {
>     my $seqstring;
>     my @sublocs = $self->location()->sub_Location();
>
>     foreach my $subloc (@sublocs) {
>         $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(),
>                                                  $subloc->end())->seq();
>     }

Can you use the ->spliced_seq() method to do this?

http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010

From pengchy at yahoo.com.cn  Tue Jun 12 03:00:46 2007
From: pengchy at yahoo.com.cn (Pengcheng Yang)
Date: Tue, 12 Jun 2007 15:00:46 +0800 (CST)
Subject: [Bioperl-l] Can't locate loadable object for module
	TFBS::Ext::pwmsearch
Message-ID: <66745.92089.qm@web15205.mail.cnb.yahoo.com>

hi all,
   
  Today I downloaded the TFBS package from http://forkhead.cgb.ki.se/TFBS/, uncompressed it, and copied all the files contained in the TFBS and Ext directories to "C:\perl\site\lib", then put Ext under the TFBS directory. I ran the example script1.pl, but got an error message: 
   
  Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC (@INC contains: C:/perl/site/lib C:/perl/lib .) at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
Compilation failed in require at script1.pl line 3, <DATA> line 206.
BEGIN failed--compilation aborted at script1.pl line 3, <DATA> line 206.
shell returned 2
   
  When I run the list_matrices.pl script, I get the same message. But when I empty the pwmsearch.pm file, I get the following message instead:
   
  TFBS/Ext/pwmsearch.pm did not return a true value at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
Compilation failed in require at script1.pl line 3, <DATA> line 206.
BEGIN failed--compilation aborted at script1.pl line 3, <DATA> line 206.
   
  Has anyone else met the same problem? Is it a bug in the TFBS package?


Best wishes!

Sincerely, Pengcheng
       

From bix at sendu.me.uk  Tue Jun 12 03:32:02 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 12 Jun 2007 08:32:02 +0100
Subject: [Bioperl-l] Can't locate loadable object for
	module	TFBS::Ext::pwmsearch
In-Reply-To: <66745.92089.qm@web15205.mail.cnb.yahoo.com>
References: <66745.92089.qm@web15205.mail.cnb.yahoo.com>
Message-ID: <466E4BF2.7020504@sendu.me.uk>

Pengcheng Yang wrote:
> hi all,
> 
> Today, I download the TFBS package from
> http://forkhead.cgb.ki.se/TFBS/, and uncompress it and copy all the
> files contained in the TFBS and Ext directories to directory
> "C:\perl\site\lib", then put Ext under the TFBS directory. I run the
> example script1.pl, but a wrong message respond:
> 
> Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC

You have to follow the installation instructions in the README file.
Copying the files out is insufficient - you have to 'make'.

From ryanx07 at hotmail.com  Tue Jun 12 07:30:09 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 06:30:09 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu>
Message-ID: <BAY106-F120C708A32F5077BA4DE68B4190@phx.gbl>

Here is the code:

use Bio::Perl;
$seq_object = get_sequence('swiss',"ROA1_HUMAN");
print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";
write_sequence(">roa1.fasta",'fasta',$seq_object);

The output looks the same as before:

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\~Scripts>perl test.pl

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:153
STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
STACK: test.pl:7
-----------------------------------------------------------

Thanks.


>From: David Messina <dmessina at wustl.edu>
>To: L Xu <ryanx07 at hotmail.com>
>CC: BioPerl list <bioperl-l at lists.open-bio.org>
>Subject: Re: [Bioperl-l] basic questions
>Date: Mon, 11 Jun 2007 13:48:23 -0500
>
>Hi,
>
>Please use 'Reply All' so everyone on the list can follow the  discussion.
>
>Try adding the following line after the line that starts with  $seq_object:
>
>	print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";
>
>And then run the program again. What do you get? Could you post a  complete 
>printout of what you're doing?
>
>
>Dave
>
>
>On Jun 11, 2007, at 11:45 AM, L Xu wrote:
>>I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and  
>>activeperl 5.8.8.819 Thank you very much.
>



From pengchy at yahoo.com.cn  Tue Jun 12 10:33:15 2007
From: pengchy at yahoo.com.cn (Pengcheng Yang)
Date: Tue, 12 Jun 2007 22:33:15 +0800 (CST)
Subject: [Bioperl-l] Re: basic questions
In-Reply-To: <BAY106-F120C708A32F5077BA4DE68B4190@phx.gbl>
Message-ID: <936780.8655.qm@web15215.mail.cnb.yahoo.com>


I have the same problem.

I guess that the SwissProt database has some problems!

code:
use Bio::DB::SwissProt;
$sp = new Bio::DB::SwissProt;
$seq = $sp->get_Seq_by_id('KPY1_ECOLI'); 
print ref($seq),"\t",$seq->display_id,"\n"

the message:

------------- EXCEPTION  -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK Bio::SeqIO::swiss::next_seq C:/perl/site/lib/Bio\SeqIO\swiss.pm:180
STACK Bio::DB::WebDBSeqI::get_Seq_by_id
C:/perl/site/lib/Bio/DB/WebDBSeqI.pm:154

STACK toplevel t.pl:7

--------------------------------------


--- L Xu <ryanx07 at hotmail.com> wrote:

> Here is the code:
> 
> use Bio::Perl;
> $seq_object = get_sequence('swiss',"ROA1_HUMAN");
> print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";
> write_sequence(">roa1.fasta",'fasta',$seq_object);
> 
> The output looks like the same as the previous version:
> 
> Microsoft Windows XP [Version 5.1.2600]
> (C) Copyright 1985-2001 Microsoft Corp.
> 
> C:\~Scripts>perl test.pl
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: swissprot stream with no ID. Not swissprot in my book
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
> STACK: Bio::SeqIO::swiss::next_seq
> C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_id 
> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:15
> 3
> STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
> STACK: test.pl:7
> -----------------------------------------------------------


Best wishes!

Sincerely, Pengcheng



From drummike at gmail.com  Tue Jun 12 11:49:36 2007
From: drummike at gmail.com (Mike Williams)
Date: Tue, 12 Jun 2007 11:49:36 -0400
Subject: [Bioperl-l] Re: basic questions
In-Reply-To: <936780.8655.qm@web15215.mail.cnb.yahoo.com>
References: <BAY106-F120C708A32F5077BA4DE68B4190@phx.gbl>
	<936780.8655.qm@web15215.mail.cnb.yahoo.com>
Message-ID: <bc95ab8d0706120849qc60ee50qf743f4a7342580e1@mail.gmail.com>

On 6/12/07, Pengcheng Yang <pengchy at yahoo.com.cn> wrote:
> I got the same questions.
> I guess that the swissprote database has some problems!
> code:
> use Bio::DB::SwissProt;
> $sp = new Bio::DB::SwissProt;
> $seq = $sp->get_Seq_by_id('KPY1_ECOLI');
> print ref($seq),"\t",$seq->display_id,"\n"
> ------------- EXCEPTION  -------------
> MSG: swissprot stream with no ID. Not swissprot in my book
> STACK toplevel t.pl:7

This is a different problem.  The id was not valid.  If you change
KPY1 to KPYK1 it works fine.

$seq = $sp->get_Seq_by_id('KPYK1_ECOLI');
print ref($seq),"\t",$seq->display_id,"\n"
[mike at Wheatley]$ ./bio_quest2.pl

Bio::Seq::RichSeq       KPYK1_ECOLI

If you got this example from the BioPerl site, would you please post
the URL?  It seems to me this same problem has come up before, but I
could not find it in the archives or on the web site.

Mike

From ryanx07 at hotmail.com  Tue Jun 12 11:42:28 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 10:42:28 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F277321F382D18F01FE6C77B4190@phx.gbl>

I tested another piece of code from the tutorial (my second test on the 
same machine) and got an error again. I don't know what happened; please help.
Thanks so much.

=========================================== Code:
use Bio::Restriction::EnzymeCollection;
my $all_collection = Bio::Restriction::EnzymeCollection;
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection){
   print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
   # prints name, recognition site, overhang
}
=========================================== Results:

C:\~Scripts>perl t9.pl
Can't use string ("Bio::Restriction::EnzymeCollecti") as a HASH ref while "strict refs" in use at C:/Perl/site/lib/Bio/Restriction/EnzymeCollection.pm line 236.


= = = Original message = = =

On Jun 11, 2007, at 11:45 AM, L Xu wrote:

   I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and 
activeperl 5.8.8.819 Thank you very much.



From limericksean at gmail.com  Tue Jun 12 12:04:40 2007
From: limericksean at gmail.com (Sean O'Keeffe)
Date: Tue, 12 Jun 2007 18:04:40 +0200
Subject: [Bioperl-l] gff2xml
Message-ID: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>

Hi all,
I posted this on the gbrowse list earlier. I'm looking to convert gff
data files into xml. Does anyone know of a module written to do this
already?

respect,
sean.

From johnsonm at gmail.com  Tue Jun 12 12:10:45 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 11:10:45 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
Message-ID: <ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>

On 6/12/07, Torsten Seemann <torsten.seemann at infotech.monash.edu.au> wrote:
> Can you use the ->spliced_seq() method to do this?
>
> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11
>
> --
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> --Tel +61 3 9905 9010

    Actually, I'd forgotten about spliced_seq().  That seems like it
will Do The Right Thing.  It's just up to the invoker to call
spliced_seq() instead of seq() as appropriate.
    So, is there any other code that will break if I modify
Bio::SeqFeature::Gene::Exon::location to not throw an exception when
encountering Bio::Location::SplitLocationI?  I'm wondering if it's
just a paranoid check or if it's there to guard against something.  If
the latter, I need to know what code to fix.  I'll dig and look, but
if anybody knows or has an idea, save me some time.  I suppose I can
just change it and see what tests start failing. 8)

From dmessina at wustl.edu  Tue Jun 12 12:11:36 2007
From: dmessina at wustl.edu (David Messina)
Date: Tue, 12 Jun 2007 11:11:36 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F277321F382D18F01FE6C77B4190@phx.gbl>
References: <BAY106-F277321F382D18F01FE6C77B4190@phx.gbl>
Message-ID: <30B8F841-E694-4577-8C15-8703E846CDFE@wustl.edu>

Hmm, it almost looks like you're having an issue with line breaks.

The 'swissprot stream with no ID' error made me think that perhaps  
Perl wasn't seeing the second argument to get_sequence. And then your  
new program has the error 'Can't use string  
("Bio::Restriction::EnzymeCollecti")' where the end of the word is  
cut off.

I don't know how ActivePerl handles Windows vs UNIX line breaks.  Are  
there any example scripts that come with ActivePerl? If there are,  
and they run correctly, perhaps you could look to see how the line  
breaks are done and make sure that your program does it the same way.

Other than that, I'm not seeing an obvious answer to your problem --  
anyone else have a suggestion?

Perhaps the easiest thing for you to do would be to reinstall BioPerl  
and make sure that you run the full test suite and that all of the  
tests pass. My guess is that something in your current setup is not  
quite right.

Dave


From cjfields at uiuc.edu  Tue Jun 12 12:42:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 11:42:29 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
Message-ID: <E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>


On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:

> On 6/12/07, Torsten Seemann  
> <torsten.seemann at infotech.monash.edu.au> wrote:
>> Can you use the ->spliced_seq() method to do this?
>>
>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ 
>> SeqFeatureI.html#POD11
>>
>> --
>> --Torsten Seemann
>> --Victorian Bioinformatics Consortium, Monash University
>> --Tel +61 3 9905 9010
>
>     Actually, I'd forgotten about spliced_seq().  That seems like it
> will Do The Right Thing.  It's just up to the invoker to call
> spliced_seq() instead of seq() as appropriate.
>     So, is there any other code that will break if I modify
> Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
> just a paranoid check or if it's there to guard against something.  If
> the latter, I need to know what code to fix.  I'll dig and look, but
> if anybody knows or has an idea, save me some time.  I suppose I can
> just change it and see what tests start failing. 8)

I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to  
describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs  
state that the Exon class is used to specifically describe exons, as  
the name implies.  Exons are primarily eukaryotic in origin, so you  
shouldn't encounter wraparounds, and should not have split locations  
by definition (which likely explains the exception).

Wouldn't a SeqFeature::Generic work just as well using a split location?

chris

From johnsonm at gmail.com  Tue Jun 12 12:59:54 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 11:59:54 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
Message-ID: <ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>

    That's a good point.  Both Bio::Tools::Glimmer and
Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
prokaryotic sequence (multiple exons for eukaryotic).  There are
eukaryotic and prokaryotic versions of both predictor families.  Maybe
the most elegant solution would be to simply modify both modules to
only emit Bio::SeqFeature::Generic features when operating on
prokaryotic mode output?  Fix the data model and the problem goes
away.  8)

On 6/12/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:
>
> > On 6/12/07, Torsten Seemann
> > <torsten.seemann at infotech.monash.edu.au> wrote:
> >> Can you use the ->spliced_seq() method to do this?
> >>
> >> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/
> >> SeqFeatureI.html#POD11
> >>
> >> --
> >> --Torsten Seemann
> >> --Victorian Bioinformatics Consortium, Monash University
> >> --Tel +61 3 9905 9010
> >
> >     Actually, I'd forgotten about spliced_seq().  That seems like it
> > will Do The Right Thing.  It's just up to the invoker to call
> > spliced_seq() instead of seq() as appropriate.
> >     So, is there any other code that will break if I modify
> > Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> > encountering Bio::Location::SplitLocationI?  I'm wondering if it's
> > just a paranoid check or if it's there to guard against something.  If
> > the latter, I need to know what code to fix.  I'll dig and look, but
> > if anybody knows or has an idea, save me some time.  I suppose I can
> > just change it and see what tests start failing. 8)
>
> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to
> describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs
> state that the Exon class is used to specifically describe exons, as
> the name implies.  Exons are primarily eukaryotic in origin, so you
> shouldn't encounter wraparounds, and should not have split locations
> by definition (which likely explains the exception).
>
> Wouldn't a SeqFeature::Generic work just as well using a split location?
>
> chris
>

From ryanx07 at hotmail.com  Tue Jun 12 13:17:18 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 12:17:18 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F19A3F4E0FD58F28A6CD765B4190@phx.gbl>

I reinstalled ActivePerl and BioPerl; ActivePerl is now 5.8.8 build 820.
However, both scripts still generate the same errors on my computer. I also
tested the code on another WinXP computer with the same versions of
ActivePerl and BioPerl: the swissprot script worked there, but the
restriction enzyme script generated the same error.

= = = Original message = = =

Hmm, it almost looks like you're having an issue with line breaks.

The 'swissprot stream with no ID' error made me think that perhaps Perl
wasn't seeing the second argument to get_sequence. And then your new
program has the error 'Can't use string
("Bio::Restriction::EnzymeCollecti")' where the end of the word is cut off.

I don't know how ActivePerl handles Windows vs UNIX line breaks.  Are there
any example scripts that come with ActivePerl? If there are, and they run
correctly, perhaps you could look to see how the line breaks are done and
make sure that your program does it the same way.

Other than that, I'm not seeing an obvious answer to your problem -- anyone
else have a suggestion?

Perhaps the easiest thing for you to do would be to reinstall BioPerl and
make sure that you run the full test suite and that all of the tests pass.
My guess is that something in your current setup is not quite right.

Dave



From cjfields at uiuc.edu  Tue Jun 12 13:51:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 12:51:47 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F19A3F4E0FD58F28A6CD765B4190@phx.gbl>
References: <BAY106-F19A3F4E0FD58F28A6CD765B4190@phx.gbl>
Message-ID: <D01CF97A-FE62-4E40-A3DD-FAFD97D8BA45@uiuc.edu>

This is an instance where 'use strict' would have shown the problem  
right away.  You left off your constructor call:

my $all_collection = Bio::Restriction::EnzymeCollection;

should be

my $all_collection = Bio::Restriction::EnzymeCollection->new;
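
For anyone who wants to see the difference, here is a minimal sketch
using only core Perl. The helper name strict_error and the hash key
'enzymes' are made up for illustration. The first part shows 'use strict'
rejecting the line without ->new. The second part shows, as far as I can
tell, where the original message came from: without strict the bareword
is just a string, and Perl shortens the string to 32 characters when it
reports that it cannot be used as a reference. That would explain the
cut-off "EnzymeCollecti" without any line-break problem. The exact
wording of the errors may vary between Perl versions.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a snippet under 'use strict' in a string
# eval and return the error text ('' if it compiled and ran).
sub strict_error {
    my ($snippet) = @_;
    my $ok = eval "use strict; $snippet; 1";
    return $ok ? '' : $@;
}

# Without ->new the class name is a bareword, which 'strict subs' rejects
# at compile time.
my $bareword_error =
    strict_error('my $c = Bio::Restriction::EnzymeCollection');
print "bareword: $bareword_error";

# Without strict the bareword silently becomes a plain string. Code that
# later treats that string as an object (a hash reference) fails under
# 'strict refs', and the error message truncates the string.
my $not_an_object = 'Bio::Restriction::EnzymeCollection';
my $deref_error   = '';
eval { my $value = $not_an_object->{enzymes}; 1 } or $deref_error = $@;
print "deref: $deref_error";
```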

chris

On Jun 12, 2007, at 12:17 PM, L Xu wrote:

> I reinstalled activePerl and BioPerl, now the activePerl is 5.8.8  
> build 820.
> However, both scripts generated the same error with my computer. I  
> tested
> the code in another WinXP computer with the same versions of  
> activePerl and
> BioPerl, the one for the swissprot did work but the restriction enzyme
> generated the same error.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From ryanx07 at hotmail.com  Tue Jun 12 14:11:15 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 13:11:15 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F317762269B8D57367D89F3B4190@phx.gbl>

Thank you very much. That got the script a bit further, but now I get the
following error:

C:\~Scripts>perl t9.pl
Can't locate object method "name" via package
"Bio::Restriction::EnzymeCollection" at t9.pl line 5, <DATA> line 532.

I checked the documentation, and there is no "name" method for that package.
Thanks.

= = = Original message = = =

This is an instance where 'use strict' would have shown the problem right
away.  You left off your constructor call:

my $all_collection = Bio::Restriction::EnzymeCollection;

should be

my $all_collection = Bio::Restriction::EnzymeCollection->new;

chris



From cjfields at uiuc.edu  Tue Jun 12 14:35:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 13:35:15 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F317762269B8D57367D89F3B4190@phx.gbl>
References: <BAY106-F317762269B8D57367D89F3B4190@phx.gbl>
Message-ID: <287E93E2-1902-4796-971E-B1DCA805D032@uiuc.edu>

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme  
objects, each with its own name().  Using grouped methods like  
'$collection->cutters(6)' will retrieve a new EnzymeCollection  
containing all six-cutters from the original collection.  You should  
use one of the EnzymeCollection accessor methods to retrieve the  
enzyme that you wanted first or iterate through them all.  This works  
for me:

use Bio::Restriction::EnzymeCollection;
my $all_collection = Bio::Restriction::EnzymeCollection->new();
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection->each_enzyme){
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
}

chris

On Jun 12, 2007, at 1:11 PM, L Xu wrote:

> Thank you very much, it did make the script advanced a bit but I  
> got the following error:
>
> C:\~Scripts>perl t9.pl
> Can't locate object method "name" via package
> "Bio::Restriction::EnzymeCollection" at t9.pl line 5, <DATA> line 532.
>
> I checked the documentation , there is no "name" method for the  
> package. Thanks.


From johnsonm at gmail.com  Tue Jun 12 15:07:57 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 14:07:57 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
	<ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
Message-ID: <ebf5eb170706121207p4ad86a6cr9af85e766168cfbe@mail.gmail.com>

I'll wait a day, and if there is no opinion to the contrary, implement
it this way.

On 6/12/07, Mark Johnson <johnsonm at gmail.com> wrote:
>     That's a good point.  Both Bio::Tools::Glimmer and
> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
> prokaryotic sequence (multiple exons for eukaryotic).  There are
> eukaryotic and prokaryotic versions of both predictor families.  Maybe
> the most elegant solution would be to simply modify both modules to
> only emit Bio::SeqFeature::Generic features when operating on
> prokaryotic mode output?  Fix the data model and the problem goes
> away.  8)

From torsten.seemann at infotech.monash.edu.au  Tue Jun 12 20:18:27 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 13 Jun 2007 10:18:27 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
Message-ID: <a79f6a4b0706121718g4b0ca6a4m97f253b2e2b84059@mail.gmail.com>

Sean

> I posted this on the gbrowse list earlier. I'm looking to convert gff
> data files into xml. Does anyone know of a module written to do this
> already?

What DTD do you want the XML to conform to?
eg. ChadoXML, TinySeq XML, TIGR XML ... ?

What program are you trying to get to load the XML?

BioPerl has some Bio::SeqIO::xxxxx modules for some XML formats that
you could use. There is also a script, bp_seqconvert.pl, which comes
with BioPerl and may be useful (run "bp_seqconvert.pl -h" for help).

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010

From hlapp at gmx.net  Tue Jun 12 20:55:57 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 12 Jun 2007 20:55:57 -0400
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
Message-ID: <0915FAB4-E554-4E65-BA3F-1B916F0F95FC@gmx.net>

I think it was just trying to guard against people trying to do  
stupid things.

I'm actually not sure that representing locations on a circular  
genome using split locations really is the best thing. I'm wondering  
whether one shouldn't rather introduce a CircularLocation object  
(though obviously it isn't the location that's circular...).

Just a thought. In the end, if you have a way to make this work that
you feel comfortable with, then go for it.

	-hilmar

On Jun 12, 2007, at 12:10 PM, Mark Johnson wrote:

> On 6/12/07, Torsten Seemann  
> <torsten.seemann at infotech.monash.edu.au> wrote:
>> Can you use the ->spliced_seq() method to do this?
>>
>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ 
>> SeqFeatureI.html#POD11
>>
>> --
>> --Torsten Seemann
>> --Victorian Bioinformatics Consortium, Monash University
>> --Tel +61 3 9905 9010
>
>     Actually, I'd forgotten about spliced_seq().  That seems like it
> will Do The Right Thing.  It's just up to the invoker to call
> spliced_seq() instead of seq() as appropriate.
>     So, is there any other code that will break if I modify
> Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
> just a paranoid check or if it's there to guard against something.  If
> the latter, I need to know what code to fix.  I'll dig and look, but
> if anybody knows or has an idea, save me some time.  I suppose I can
> just change it and see what tests start failing. 8)

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Tue Jun 12 20:57:06 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 12 Jun 2007 20:57:06 -0400
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
	<ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
Message-ID: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>

I like that. Don't force a model to do what you want if it doesn't  
really apply anyway.

	-hilmar

On Jun 12, 2007, at 12:59 PM, Mark Johnson wrote:

>     That's a good point.  Both Bio::Tools::Glimmer and
> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
> prokaryotic sequence (multiple exons for eukaryotic).  There are
> eukaryotic and prokaryotic versions of both predictor families.  Maybe
> the most elegant solution would be to simply modify both modules to
> only emit Bio::SeqFeature::Generic features when operating on
> prokaryotic mode output?  Fix the data model and the problem goes
> away.  8)
>
> On 6/12/07, Chris Fields <cjfields at uiuc.edu> wrote:
>>
>> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:
>>
>>> On 6/12/07, Torsten Seemann
>>> <torsten.seemann at infotech.monash.edu.au> wrote:
>>>> Can you use the ->spliced_seq() method to do this?
>>>>
>>>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/
>>>> SeqFeatureI.html#POD11
>>>>
>>>> --
>>>> --Torsten Seemann
>>>> --Victorian Bioinformatics Consortium, Monash University
>>>> --Tel +61 3 9905 9010
>>>
>>>     Actually, I'd forgotten about spliced_seq().  That seems like it
>>> will Do The Right Thing.  It's just up to the invoker to call
>>> spliced_seq() instead of seq() as appropriate.
>>>     So, is there any other code that will break if I modify
>>> Bio::SeqFeature::Gene::Exon::location to not throw an exception when
>>> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
>>> just a paranoid check or if it's there to guard against  
>>> something.  If
>>> the latter, I need to know what code to fix.  I'll dig and look, but
>>> if anybody knows or has an idea, save me some time.  I suppose I can
>>> just change it and see what tests start failing. 8)
>>
>> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to
>> describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs
>> state that the Exon class is used to specifically describe exons, as
>> the name implies.  Exons are primarily eukaryotic in origin, so you
>> shouldn't encounter wraparounds, and should not have split locations
>> by definition (which likely explains the exception).
>>
>> Wouldn't a SeqFeature::Generic work just as well using a split  
>> location?
>>
>> chris
>>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Jun 12 21:20:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 20:20:41 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
	<ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
	<80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>
Message-ID: <951EB9CA-2066-4CD1-BCD5-4E00232CA507@uiuc.edu>

It will be interesting to see if bioperl handles wrap-around split  
locations via spliced_seq() and other methods.  I can't see why it  
wouldn't but one never knows.  Might be something to add to location  
tests at some point...
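
As a starting point for such a test, here is a plain-Perl sketch of what
a wrap-around feature should yield on a circular sequence. It does not
call any BioPerl API; the function name join_segments, the toy sequence
and the coordinates are all invented for illustration, and only the
forward strand is handled.

```perl
use strict;
use warnings;

# Join 1-based, inclusive segments of a sequence in the order given.
# For a gene that wraps around the origin of a circular genome the
# segments are (tail of the sequence, head of the sequence).
sub join_segments {
    my ( $seq, @segments ) = @_;
    my $joined = '';
    for my $segment (@segments) {
        my ( $start, $end ) = @$segment;
        die "bad segment $start..$end\n"
            if $start < 1 || $end > length($seq) || $start > $end;
        $joined .= substr( $seq, $start - 1, $end - $start + 1 );
    }
    return $joined;
}

my $genome = 'ATGCCGTTAA';    # toy 10 bp circular sequence

# Feature spanning the origin: bases 8..10 followed by bases 1..3.
print join_segments( $genome, [ 8, 10 ], [ 1, 3 ] ), "\n";
```

With this toy input the segments in feature order give TAAATG, whereas
the same segments sorted by start position would give ATGTAA. A location
test could compare whatever spliced_seq() returns for a wrap-around
split location against the first of these.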

chris

On Jun 12, 2007, at 7:57 PM, Hilmar Lapp wrote:

> I like that. Don't force a model to do what you want if it doesn't
> really apply anyway.
>
> 	-hilmar
>
> On Jun 12, 2007, at 12:59 PM, Mark Johnson wrote:
>
>>     That's a good point.  Both Bio::Tools::Glimmer and
>> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
>> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
>> prokaryotic sequence (multiple exons for eukaryotic).  There are
>> eukaryotic and prokaryotic versions of both predictor families.   
>> Maybe
>> the most elegant solution would be to simply modify both modules to
>> only emit Bio::SeqFeature::Generic features when operating on
>> prokaryotic mode output?  Fix the data model and the problem goes
>> away.  8)
>>
>> On 6/12/07, Chris Fields <cjfields at uiuc.edu> wrote:
>>>
>>> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:
>>>
>>>> On 6/12/07, Torsten Seemann
>>>> <torsten.seemann at infotech.monash.edu.au> wrote:
>>>>> Can you use the ->spliced_seq() method to do this?
>>>>>
>>>>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/
>>>>> SeqFeatureI.html#POD11
>>>>>
>>>>> --
>>>>> --Torsten Seemann
>>>>> --Victorian Bioinformatics Consortium, Monash University
>>>>> --Tel +61 3 9905 9010
>>>>
>>>>     Actually, I'd forgotten about spliced_seq().  That seems  
>>>> like it
>>>> will Do The Right Thing.  It's just up to the invoker to call
>>>> spliced_seq() instead of seq() as appropriate.
>>>>     So, is there any other code that will break if I modify
>>>> Bio::SeqFeature::Gene::Exon::location to not throw an exception  
>>>> when
>>>> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
>>>> just a paranoid check or if it's there to guard against
>>>> something.  If
>>>> the latter, I need to know what code to fix.  I'll dig and look,  
>>>> but
>>>> if anybody knows or has an idea, save me some time.  I suppose I  
>>>> can
>>>> just change it and see what tests start failing. 8)
>>>
>>> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to
>>> describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs
>>> state that the Exon class is used to specifically describe exons, as
>>> the name implies.  Exons are primarily eukaryotic in origin, so you
>>> shouldn't encounter wraparounds, and should not have split locations
>>> by definition (which likely explains the exception).
>>>
>>> Wouldn't a SeqFeature::Generic work just as well using a split
>>> location?
>>>
>>> chris
>>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From ryanx07 at hotmail.com  Wed Jun 13 08:16:15 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 07:16:15 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>

Thanks so much, Chris, it works now.
All of the code I tested was copied from the Bioperl Tutorial. Why does it
have such problems? Is it a platform issue, or a difference between BioPerl
versions? So far I have tested 6 scripts; three work and three don't.

Here is the problem with the third failing script:
=================================
use strict;
use Bio::Tools::Run::RemoteBlast;
my $remote_blast = Bio::Tools::Run::RemoteBlast->new (
         -prog => 'blastn', -data => 'ecoli', -expect => '1e-10' );
my $r = $remote_blast->submit_blast("d1.fa");
my $rc;
while ( my @rids = $remote_blast->each_rid ) {
    for my $rid ( @rids ) {
       $rc = $remote_blast->retrieve_blast($rid);
    }
}
print "$rc\n"; # I just want to print something here before parsing the result
=========================================================d1.fa
>example
CCCTTCAGGTACCCCGAGGTAACACGAGACACTCGGGATCTGGGAAGGGGACTGGGGCTTCTTTAAAAGCGCTCAGTTTAAAAAGCTTCTATGCCTGAATAGGTGACCGGAGGCCGGCACC
=========================================================result
C:\>perl t13.pl

-------------------- WARNING ---------------------
MSG: <HTML>
<HEAD><TITLE>An Error Occurred</TITLE></HEAD>
<BODY>
<H1>An Error Occurred</H1>
500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
</BODY>
</HTML>

---------------------------------------------------

-------------------- WARNING ---------------------
MSG: <HTML>
<HEAD><TITLE>An Error Occurred</TITLE></HEAD>
<BODY>
<H1>An Error Occurred</H1>
500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
</BODY>
</HTML>

---------------------------------------------------
Terminating on signal SIGINT(2)

C:\>


Please help me to correct the problem, thanks.


= = = Original message = = =

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme objects,
each with its own name().  Using grouped methods like
'$collection->cutters(6)' will retrieve a new EnzymeCollection containing
all six-cutters from the original collection.  You should use one of the
EnzymeCollection accessor methods to retrieve the enzyme that you wanted
first or iterate through them all.  This works for me:

use Bio::Restriction::EnzymeCollection;
my $all_collection = Bio::Restriction::EnzymeCollection->new();
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection->each_enzyme){
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
}


chris



From cjfields at uiuc.edu  Wed Jun 13 10:41:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 09:41:55 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
References: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
Message-ID: <4F7BE556-BD8C-4378-BDE7-1F31364F49DA@uiuc.edu>

Judging by the output it looks like you have no network access or  
can't connect to the server (what remoteblast needs).  Make sure you  
don't need proxy settings.
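
One rough way to tell these cases apart is to try a plain TCP connection
to port 80, which is what the failing HTTP request needs. A failed ping
alone does not settle it, since some hosts and firewalls ignore ICMP echo
requests. The sketch below uses only core Perl; the helper name
can_connect and the 10 second timeout are my own choices, and it does not
go through a proxy, so it only tests direct access.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: try to open a TCP connection and report success.
sub can_connect {
    my ( $host, $port, $timeout ) = @_;
    my $socket = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => $timeout,
    );
    return 0 unless $socket;
    close $socket;
    return 1;
}

# Usage: perl check_port.pl www.ncbi.nlm.nih.gov 80
if (@ARGV) {
    my ( $host, $port ) = @ARGV;
    $port ||= 80;
    printf "%s:%d is %s\n", $host, $port,
        can_connect( $host, $port, 10 ) ? 'reachable' : 'not reachable';
}
```

If the direct connection fails but a web browser on the same machine can
reach the site, a proxy is the likely explanation.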

To preempt the next question, no, I'm not going to explain what a  
proxy is.  The RemoteBlast docs show how to set them, and Google is a  
wonderful tool...

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:

> ...
> -------------------- WARNING ---------------------
> MSG: <HTML>
> <HEAD><TITLE>An Error Occurred</TITLE></HEAD>
> <BODY>
> <H1>An Error Occurred</H1>
> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
> </BODY>
> </HTML>
>
> ---------------------------------------------------
> ...


From ryanx07 at hotmail.com  Wed Jun 13 11:01:07 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 10:01:07 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID: <BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>

I do have an internet connection and do not use a proxy server.
I tested the network connection with the ping command (below); the NCBI
website does not respond. Is any special network setting needed to
connect to the NCBI website?
Thank you so much.

C:\>ping www.yahoo.com

Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:

Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
Reply from 69.147.114.210: bytes=32 time=360ms TTL=45

Ping statistics for 69.147.114.210:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 312ms, Maximum = 363ms, Average = 338ms

C:\>ping www.ncbi.nlm.nih.gov

Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:

Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 130.14.29.110:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
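
A ping timeout on its own does not settle the question: a host can drop
ICMP echo requests and still answer HTTP, which is what RemoteBlast needs.
A minimal sketch of an HTTP-level check follows, assuming the core
HTTP::Tiny module (which picks up a proxy from the http_proxy environment
variable if one is set). The helper name check_http is made up for this
example and is not part of BioPerl.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: try an HTTP GET and summarise the outcome.
# $client is optional; it only needs a get($url) method returning a hash
# reference with 'status' and 'reason' keys, as HTTP::Tiny does.
sub check_http {
    my ($url, $client) = @_;
    $client ||= HTTP::Tiny->new(timeout => 10);
    my $res = $client->get($url);
    # HTTP::Tiny reports internal failures (e.g. no connection) as status 599.
    my $verdict = $res->{status} == 599 ? 'no connection' : 'connected';
    return sprintf('%s: %s (%s %s)', $url, $verdict, $res->{status}, $res->{reason});
}

# Check every URL given on the command line; with no arguments nothing runs.
print check_http($_), "\n" for @ARGV;
```

Run it as, for example, perl check_http.pl http://www.ncbi.nlm.nih.gov/
(the script name is arbitrary). For https URLs HTTP::Tiny additionally
needs IO::Socket::SSL to be installed.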


= = = Original message = = =

Judging by the output it looks like you have no network access or can't 
connect to the server (what remoteblast needs). Make sure you don't need 
proxy settings.

To preempt the next question, no, I'm not going to explain what a proxy 
is. The RemoteBlast docs show how to set them, and Google is a wonderful 
tool...

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:


   ...
-------------------- WARNING ---------------------
MSG: <HTML>
<HEAD><TITLE>An Error Occurred</TITLE></HEAD>
<BODY>
<H1>An Error Occurred</H1>
500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
</BODY>
</HTML>

---------------------------------------------------
...



From cjfields at uiuc.edu  Wed Jun 13 12:14:22 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 11:14:22 -0500
Subject: [Bioperl-l] method naming
Message-ID: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>

Some quick questions on method naming.  I couldn't find this on the  
mail list previously and just want some opinions.

1) Is there any preference on how to name a method that returns a  
list of class instances vs. data?  I have seen 'each' (each_Location,  
each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.  
simple (hits, hsps).

2) Do we want methods which return objects to have the object name  
in Title Case (each_Location, get_Seq_by_id, etc) or does it really  
matter?

chris

From dmessina at wustl.edu  Wed Jun 13 12:41:53 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 13 Jun 2007 11:41:53 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>

> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data?  I have seen 'each' (each_Location,
> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
> simple (hits, hsps).

I'd prefer 'get_all' because it's more intuitive to me what the  
method is doing. 'Each' is too programmer-y.


> 2) Do we want have methods which return objects have the object name
> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
> matter?

I like Title Case because it reinforces the notion that what you're  
getting back is a specific object with that name (Seq) rather than  
the generic thing that the name represents (AGTCTGTGATAT, the actual  
sequence as a string).


Dave


From hlapp at gmx.net  Wed Jun 13 13:03:59 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 13:03:59 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>

We set a convention a while back on how to name these. It is  
implemented in the bioperl.lisp file (too bad no one is using emacs  
any more these days - it's a great editor), and in fact we started a  
renaming campaign (not sure when that was) on the SeqI and  
SeqFeatureI classes (you'll still see the old names aliased).

However, we never got to finish the clean up.

The convention was to use get_{ClassName}s, and get_all_{ClassName}s  
if there is a difference to the former (mostly because of  
hierarchical data; for example features can be nested, and  
get_all_SeqFeatures returns them all flattened out, while  
get_SeqFeatures returns only the top objects), and for modifying  
add_{ClassName} and remove_{ClassName}s.

The class name was to be in title case to emphasize the fact that it  
is an array of objects you'd be getting back (and what kind of  
objects). If it is strings or any other scalar type, the name would  
be in lower case.
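
The convention can be sketched with a toy class. Toy::Feature below is
made up for illustration and is not a BioPerl module; only the method
names follow the pattern described above.

```perl
use strict;
use warnings;

# Toy stand-in for a nested feature; NOT a real BioPerl class.
package Toy::Feature;

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{-name}, subfeatures => [] }, $class;
}

# Lower case: returns plain scalar data (a string).
sub name { return $_[0]{name} }

# add_{ClassName}: attach one nested object.
sub add_SeqFeature {
    my ($self, $feature) = @_;
    push @{ $self->{subfeatures} }, $feature;
    return;
}

# get_{ClassName}s: the top-level objects only.
sub get_SeqFeatures { return @{ $_[0]{subfeatures} } }

# get_all_{ClassName}s: every nested object, flattened depth-first.
sub get_all_SeqFeatures {
    my ($self) = @_;
    return map { ($_, $_->get_all_SeqFeatures) } $self->get_SeqFeatures;
}

# remove_{ClassName}s: detach all direct children and return them.
sub remove_SeqFeatures {
    my ($self) = @_;
    my @removed = @{ $self->{subfeatures} };
    $self->{subfeatures} = [];
    return @removed;
}

package main;

my $gene = Toy::Feature->new(-name => 'gene');
my $mrna = Toy::Feature->new(-name => 'mRNA');
$mrna->add_SeqFeature(Toy::Feature->new(-name => 'exon'));
$gene->add_SeqFeature($mrna);

print 'top:  ', join(',', map { $_->name } $gene->get_SeqFeatures), "\n";
print 'flat: ', join(',', map { $_->name } $gene->get_all_SeqFeatures), "\n";
```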

	-hilmar

On Jun 13, 2007, at 12:14 PM, Chris Fields wrote:

> Some quick questions on method naming.  I couldn't find this on the
> mail list previously and just want some opinions.
>
> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data?  I have seen 'each' (each_Location,
> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
> simple (hits, hsps).
>
> 2) Do we want have methods which return objects have the object name
> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
> matter?
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 13 13:19:43 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 12:19:43 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
Message-ID: <B7E2E5CA-3027-4D25-B9EA-998D2BC59DBB@uiuc.edu>

Sounds good.  I also agree with Dave on the use of 'each', as it's a  
bit ambiguous (seems to imply iteration as opposed to returning a  
whole list).

We probably need to post this somewhere on the wiki for future  
reference; maybe in Advanced BioPerl?  I'll add this in shortly.

chris

On Jun 13, 2007, at 12:03 PM, Hilmar Lapp wrote:

> We set a convention a while back on how to name these. It is  
> implemented in the bioperl.lisp file (too bad no one is using emacs  
> any more these days - it's a great editor), and in fact we started  
> a renaming campaign (not sure when that was) on the SeqI and  
> SeqFeatureI classes (you'll still see the old names aliased).
>
> However, we never got to finish the clean up.
>
> The convention was to use get_{ClassName}s, and get_all_{ClassName} 
> s if there is a difference to the former (mostly because of  
> hierarchical data; for example features can be nested, and  
> get_all_SeqFeatures returns them all flattened out, while  
> get_SeqFeatures returns only the top objects), and for modifying  
> add_{ClassName} and remove_{ClassName}s.
>
> The class name was to be in title case to emphasize the fact that  
> it is an array of object you'd be getting back (and what kind of  
> objects). If it is strings or any other scalar type, the name would  
> be in lower case.
>
> 	-hilmar
>
> On Jun 13, 2007, at 12:14 PM, Chris Fields wrote:
>
>> Some quick questions on method naming.  I couldn't find this on the
>> mail list previously and just want some opinions.
>>
>> 1) Is there any preference on how to name a method that returns a
>> list of class instances vs. data?  I have seen 'each' (each_Location,
>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
>> simple (hits, hsps).
>>
>> 2) Do we want have methods which return objects have the object name
>> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
>> matter?
>>
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Wed Jun 13 14:43:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 13:43:41 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <467036FC.8000505@watson.wustl.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
	<467036FC.8000505@watson.wustl.edu>
Message-ID: <286EE81C-0926-4AAE-9110-02948DFADF36@uiuc.edu>


On Jun 13, 2007, at 1:27 PM, Michael Kiwala wrote:

>
> David Messina wrote:
>>> 1) Is there any preference on how to name a method that returns a
>>> list of class instances vs. data?  I have seen  
>>> 'each' (each_Location,
>>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures)  
>>> vs.
>>> simple (hits, hsps).
>>>
>>
>> I'd prefer 'get_all' because it's more intuitive to me what the   
>> method is doing. 'Each' is too programmer-y.
>>
>>
>>
> When I think 'get_all', I think of a method that returns a list of  
> objects at once. When I think of 'each', I think of a method that  
> returns a scalar but can be called multiple times to iterate over a  
> set of objects.

Yep, hence the ambiguity issue (and my confusion).  I think it was so  
you could both iterate and return a list using this:

for my $obj ($seq->each_Class) {...}
my @objs = $seq->each_Class;

I use 'next' and 'get/get_all' as an iterator and get accessor  
(similar to how it's used in Bio::SearchIO):

while (my $obj = $seq->next_Class) {...}
my @objs = $seq->get_Class; # or get_all_Class for flattened lists

which to me is much clearer.
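
To make the distinction concrete, here is a small self-contained sketch.
Toy::Hit and Toy::Result are invented for the example (they are not the
Bio::SearchIO classes); only the next_/get_ naming mirrors the usage
above.

```perl
use strict;
use warnings;

# Toy classes for illustration only; not part of BioPerl.
package Toy::Hit;
sub new  { my ($class, $name) = @_; return bless { name => $name }, $class }
sub name { return $_[0]{name} }

package Toy::Result;
sub new {
    my ($class, @hits) = @_;
    return bless { hits => [@hits], cursor => 0 }, $class;
}

# next_{ClassName}: one object per call, undef once the list is exhausted.
sub next_Hit {
    my $self = shift;
    return undef if $self->{cursor} >= @{ $self->{hits} };
    return $self->{hits}[ $self->{cursor}++ ];
}

# get_{ClassName}s: the whole list at once, independent of the cursor.
sub get_Hits { return @{ $_[0]{hits} } }

package main;

my $result = Toy::Result->new(map { Toy::Hit->new($_) } qw(hitA hitB hitC));

while (my $hit = $result->next_Hit) {
    print 'iterated: ', $hit->name, "\n";
}
my @hits = $result->get_Hits;
print 'list has ', scalar(@hits), " hits\n";
```

The iterator keeps state between calls, so a second while loop over the
same object would see nothing unless the cursor is reset; the list
accessor has no such state.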

chris

From mkiwala at watson.wustl.edu  Wed Jun 13 14:27:08 2007
From: mkiwala at watson.wustl.edu (Michael Kiwala)
Date: Wed, 13 Jun 2007 13:27:08 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
Message-ID: <467036FC.8000505@watson.wustl.edu>


David Messina wrote:
>> 1) Is there any preference on how to name a method that returns a
>> list of class instances vs. data?  I have seen 'each' (each_Location,
>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
>> simple (hits, hsps).
>>     
>
> I'd prefer 'get_all' because it's more intuitive to me what the  
> method is doing. 'Each' is too programmer-y.
>
>
>   
When I think 'get_all', I think of a method that returns a list of 
objects at once. When I think of 'each', I think of a method that 
returns a scalar but can be called multiple times to iterate over a set 
of objects.


From sac at bioperl.org  Wed Jun 13 17:17:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Wed, 13 Jun 2007 14:17:27 -0700
Subject: [Bioperl-l] method naming
In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
Message-ID: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>

On 6/13/07, Hilmar Lapp <hlapp at gmx.net> wrote:
> We set a convention a while back on how to name these. It is
> implemented in the bioperl.lisp file (too bad no one is using emacs
> any more these days - it's a great editor),

As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we
could improve the visibility of bioperl.lisp. In truth, I had
forgotten about it, though it turns out I was loading an old version
of it. (Btw, using the latest version of bioperl.lisp with xemacs
21.4.17, I don't get a bioperl menu item, though I can access bioperl
functions via M-x. Suggestions?)

I see bioperl.lisp is mentioned twice parenthetically in the advanced
bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here
would help. While we're at it, maybe we could add a bioperl.vi file to
the distribution (if you can do such things with vi/vim).

On 6/13/07, Chris Fields <cjfields at uiuc.edu> wrote:
> We probably need to post this somewhere on the wiki for future
> reference; maybe in Advanced BioPerl?  I'll add this in shortly.

Another idea: Add a method naming check to the set of audits we
perform on CVS committed code. It could check for agreement with our
conventions and warn if nothing was found (may not be a problem
though).

Steve

From arareko at campus.iztacala.unam.mx  Wed Jun 13 18:03:34 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 13 Jun 2007 17:03:34 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
Message-ID: <467069B6.7080003@campus.iztacala.unam.mx>

Around the time of the 1.5.2 release, I jumped on the idea of creating a 
BioPerl template for Komodo. Chris F handed me one he had already made, 
but in the end I didn't have enough spare time to get into it. If someone 
wants to give it a try, please let ChrisF/me know.

Regards,
Mauricio.

Steve Chervitz wrote:
> On 6/13/07, Hilmar Lapp <hlapp at gmx.net> wrote:
>> We set a convention a while back on how to name these. It is
>> implemented in the bioperl.lisp file (too bad no one is using emacs
>> any more these days - it's a great editor),
> 
> As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we
> could improve the visibility of bioperl.lisp. In truth, I had
> forgotten about it, though lit turns out I was loading an old version
> of it. (Btw, using the latest version of bioperl.lisp with xemacs
> 21.4.17, I don't get a bioperl menu item, though I can access bioperl
> functions via M-x. Suggestions?)
> 
> I see bioperl.lisp is mentioned twice parenthetically in the advanced
> bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here
> would help. While we're at it, maybe we could add a bioperl.vi file to
> the distribution (if you can do such things with vi/vim).
> 
> On 6/13/07, Chris Fields <cjfields at uiuc.edu> wrote:
>> We probably need to post this somewhere on the wiki for future
>> reference; maybe in Advanced BioPerl?  I'll add this in shortly.
> 
> Another idea: Add a method naming check to the set of audits we
> perform on CVS committed code. It could check for agreement with our
> conventions and warn if nothing was found (may not be a problem
> though).
> 
> Steve
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From hlapp at gmx.net  Wed Jun 13 18:41:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 18:41:45 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
Message-ID: <FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>


On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:

> using the latest version of bioperl.lisp with xemacs 21.4.17, I  
> don't get a bioperl menu item

I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item  
is showing up just beautifully. (BTW it also has very nice icons for  
various functions - though I always feel guilty for using keystrokes  
instead.)

Is GNU Emacs finally winning this? ;)

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jason at bioperl.org  Wed Jun 13 18:58:51 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 13 Jun 2007 15:58:51 -0700
Subject: [Bioperl-l] method naming
In-Reply-To: <FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
Message-ID: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>

Post your dueling screenshots to the wiki!

I had started a couple of IDE pages on the wiki a while ago:
  http://bioperl.org/wiki/Emacs
  http://bioperl.org/wiki/Emacs_template
  http://bioperl.org/wiki/Vi

If anyone is feeling excited enough to write a few more IDE pages and  
link them into a common article that would be great.

-jason
On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote:

>
> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:
>
>> using the latest version of bioperl.lisp with xemacs 21.4.17, I
>> don't get a bioperl menu item
>
> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item
> it showing up just beautifully. (BTW it also have very nice icons for
> various functions - though I always feel guilty for using keystrokes
> instead.)
>
> Is GNU Emacs finally winning this? ;)
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From cjfields at uiuc.edu  Wed Jun 13 19:08:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 18:08:17 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
	<4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
Message-ID: <E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>

Would probably be worth writing one up for Komodo since Mauricio,  
Sendu, and I use it.

I updated the Advanced BioPerl page with Hilmar's methods suggestions/ 
rules (as well as a few I found dating back a number of years on the  
mail list).  It might be worth a glance in case there are any changes  
needed:

http://www.bioperl.org/wiki/Advanced_BioPerl

chris

On Jun 13, 2007, at 5:58 PM, Jason Stajich wrote:

> Post your dualing screenshots to the wiki!
>
> I had started a couple of IDE pages on the wiki a while ago:
>  http://bioperl.org/wiki/Emacs
>  http://bioperl.org/wiki/Emacs_template
>  http://bioperl.org/wiki/Vi
>
> If anyone is feeling excited enough to write a few more IDE pages  
> and link them into a common article that would be great.
>
> -jason
> On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote:
>
>>
>> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:
>>
>>> using the latest version of bioperl.lisp with xemacs 21.4.17, I
>>> don't get a bioperl menu item
>>
>> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item
>> it showing up just beautifully. (BTW it also have very nice icons for
>> various functions - though I always feel guilty for using keystrokes
>> instead.)
>>
>> Is GNU Emacs finally winning this? ;)
>>
>> 	-hilmar
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Wed Jun 13 19:28:17 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 19:28:17 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
	<4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
	<E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>
Message-ID: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net>

Thanks Chris for doing this - looks great. The only comment that I  
have is that method names should never start with a capital letter.  
If the getter/setter is for a single object (as opposed to a list),  
the name should probably be similar (if not identical) to the class  
being expected and returned, but lower-case.

E.g., $feature->location(), $seq->species() etc
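
A minimal sketch of such a lower-case combined getter/setter follows,
using invented Toy:: classes rather than real BioPerl modules:

```perl
use strict;
use warnings;

# Toy classes for illustration only; not real BioPerl modules.
package Toy::Location;
sub new {
    my ($class, %args) = @_;
    return bless { start => $args{-start}, end => $args{-end} }, $class;
}
sub start { return $_[0]{start} }
sub end   { return $_[0]{end} }

package Toy::Feature;
sub new { return bless {}, shift }

# Lower-case name, single object in and out: sets the value when an
# argument is passed and always returns the current value.
sub location {
    my $self = shift;
    $self->{location} = shift if @_;
    return $self->{location};
}

package main;

my $feature = Toy::Feature->new;
$feature->location(Toy::Location->new(-start => 10, -end => 42));
printf "%d..%d\n", $feature->location->start, $feature->location->end;
```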

	-hilmar

On Jun 13, 2007, at 7:08 PM, Chris Fields wrote:

> Would probably be worth writing one up for Komodo since Mauricio,  
> Sendu, and I use it.
>
> I updated the Advanced BioPerl page with Hilmar's methods  
> suggestions/rules (as well as a few I found dating back a number of  
> years on the mail list).  It might be worth a glance in case there  
> are any changes needed:
>
> http://www.bioperl.org/wiki/Advanced_BioPerl
>
> chris
>
> On Jun 13, 2007, at 5:58 PM, Jason Stajich wrote:
>
>> Post your dualing screenshots to the wiki!
>>
>> I had started a couple of IDE pages on the wiki a while ago:
>>  http://bioperl.org/wiki/Emacs
>>  http://bioperl.org/wiki/Emacs_template
>>  http://bioperl.org/wiki/Vi
>>
>> If anyone is feeling excited enough to write a few more IDE pages  
>> and link them into a common article that would be great.
>>
>> -jason
>> On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote:
>>
>>>
>>> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:
>>>
>>>> using the latest version of bioperl.lisp with xemacs 21.4.17, I
>>>> don't get a bioperl menu item
>>>
>>> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu  
>>> item
>>> it showing up just beautifully. (BTW it also have very nice icons  
>>> for
>>> various functions - though I always feel guilty for using keystrokes
>>> instead.)
>>>
>>> Is GNU Emacs finally winning this? ;)
>>>
>>> 	-hilmar
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> --
>> Jason Stajich
>> jason at bioperl.org
>> http://jason.open-bio.org/
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 13 19:44:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 18:44:08 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
	<4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
	<E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>
	<06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net>
Message-ID: <91AF2018-EC27-49FD-A4D1-C31C0E73DEFB@uiuc.edu>

Agreed.  We can definitely add that in.

As we edge towards another release we try another round of cleaning  
up.  I wouldn't mind pushing out another 1.5 point release before  
summer's up if possible; most of the tough work was done for v.1.5.2  
by Sendu.

chris

On Jun 13, 2007, at 6:28 PM, Hilmar Lapp wrote:

> Thanks Chris for doing this - looks great. The only comment that I
> have is that method names should never start with a capital letter.
> If the getter/setter is for a single object (as opposed to a list),
> the name should probably be similar (if not identical) to the class
> being expected and returned, but lower-case.
>
> E.g., $feature->location(), $seq->species() etc
>
> 	-hilmar
>
> On Jun 13, 2007, at 7:08 PM, Chris Fields wrote:
>
>> Would probably be worth writing one up for Komodo since Mauricio,
>> Sendu, and I use it.
>>
>> I updated the Advanced BioPerl page with Hilmar's methods
>> suggestions/rules (as well as a few I found dating back a number of
>> years on the mail list).  It might be worth a glance in case there
>> are any changes needed:
>>
>> http://www.bioperl.org/wiki/Advanced_BioPerl
>>
>> chris
...

From johncumbers at gmail.com  Wed Jun 13 20:20:42 2007
From: johncumbers at gmail.com (John Cumbers)
Date: Wed, 13 Jun 2007 20:20:42 -0400
Subject: [Bioperl-l] How can I pull out all instances of a motif from a
	genome sequence and output them as a BED file?
Message-ID: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>

Hello,

I have a simple problem, I'm trying to search a genome sequence for a motif,
I then want to output a BED file to display all the locations of this motif
on the UCSC Genome Browser.  I could not find a script to do this, so I
started to write my own.   I'm new to perl and my code below was my attempt
to read the sequence string and output the index bp of the start of each
motif.  With this I could build the BED file myself, which requires start
and finish base pairs.

For the first motif I can output the start index, but when I try and read
the next one off the sequence it does not work.  Instead I just get an
output of a list of 1's.  I realise that this is more a request for some
simple perl help, but any help much appreciated.

Best wishes,
John


# turn my FASTA file into a seq object.
$seq_object = read_sequence("Drosophila.Chr3.test.AE014296.fasta");
$sequence_as_a_string = $seq_object->seq();  #turn it into a string
# search $sequence_as_a_string  string for motif AAA as example
# if found, return the index that it is found at

while ($sequence_as_a_string =~ m/AAA/g) {
  print "Found '$&'.  Next attempt at character " .
pos($sequence_as_a_string)+1 . "\n";
}


-- 
John Cumbers,  Graduate Student
Biology and Medicine
Brown University, Box G-W
Providence, Rhode Island, 02912, USA
Tel USA: +1 401 523 8190,  Fax: +1 401 863-2166
UK to USA: 0207 617 7824

From cjfields at uiuc.edu  Wed Jun 13 21:58:37 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 20:58:37 -0500
Subject: [Bioperl-l] How can I pull out all instances of a motif from a
	genome sequence and output them as a BED file?
In-Reply-To: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>
References: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>
Message-ID: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>

This is answered in the FAQ (sorry if the URL wraps, but we don't  
like tinyurls):

http://www.bioperl.org/wiki/FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_.22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F
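
As for the list of 1's in the posted loop: in Perl, '.' and '+' have the
same precedence and associate left to right, so the string is built
first, then treated as a number (0), and then 1 is added. Parenthesising
the addition fixes that. The sketch below also shows one way to collect
BED lines; motif_bed_lines is a made-up helper, the chromosome label is a
placeholder, and the short test sequence stands in for the real genome
string. BED start coordinates are 0-based and the end is exclusive, which
is what the @- and @+ match offsets give directly.

```perl
use strict;
use warnings;

# Hypothetical helper: one BED line (chrom, start, end, name) per
# non-overlapping literal motif hit. \Q...\E matches the motif literally.
sub motif_bed_lines {
    my ($chrom, $sequence, $motif) = @_;
    my @lines;
    while ($sequence =~ /\Q$motif\E/g) {
        # $-[0] is the 0-based match start, $+[0] the offset just past the end.
        push @lines, join("\t", $chrom, $-[0], $+[0], $motif);
    }
    return @lines;
}

my $sequence = 'GAAAATTAAAC';    # stand-in for $seq_object->seq()

# The original loop, with the addition parenthesised.
while ($sequence =~ m/AAA/g) {
    print "Found '$&'.  Next attempt at character "
        . (pos($sequence) + 1) . "\n";
}

# 'chrPlaceholder' is an assumed label; use the name the browser expects.
print "$_\n" for motif_bed_lines('chrPlaceholder', $sequence, 'AAA');
```

Overlapping hits are not reported by this loop; that would need the
search position to be moved back after each match.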

chris

On Jun 13, 2007, at 7:20 PM, John Cumbers wrote:

> Hello,
>
> I have a simple problem, I'm trying to search a genome sequence for  
> a motif,
> I then want to output a BED file to display all the locations of  
> this motif
> on the UCSC Genome Browser.  I could not find a script to do this,  
> so I
> started to write my own.   I'm new to perl and my code below was my  
> attempt
> to read the sequence string and output the index bp of the start of  
> each
> motif.  With this I could build the BED file myself, which requires  
> start
> and finish base pairs.
>
> For the first motif I can output the start index, but when I try  
> and read
> the next one off the sequence it does not work.  Instead I just get an
> output of a list of 1's.  I realise that this is more a request for  
> some
> simple perl help, but any help much appreciated.
>
> Best wishes,
> John
>
>
> $seq_object = read_sequence 
> ("Drosophila.Chr3.test.AE014296.fasta");  #turn
> my FASTA file into a seq object.
> $sequence_as_a_string = $seq_object->seq();  #turn it into a string
> # search $sequence_as_a_string  string for motif AAA as example
> # if found, return the index that it is found at
>
> while ($sequence_as_a_string =~ m/AAA/g) {
>   print "Found '$&'.  Next attempt at character " .
> pos($sequence_as_a_string)+1 . "\n";
> }
>
>
>
> -- 
> John Cumbers,  Graduate Student
> Biology and Medicine
> Brown University, Box G-W
> Providence, Rhode Island, 02912, USA
> Tel USA: +1 401 523 8190,  Fax: +1 401 863-2166
> UK to USA: 0207 617 7824
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Thu Jun 14 00:08:04 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 13 Jun 2007 21:08:04 -0700
Subject: [Bioperl-l] wiki bulk update
Message-ID: <992B2C7A-E944-4C69-BDE0-B0B0F6D1274D@bioperl.org>

I did a bulk update of the Module pages for new modules that have  
been created since we last set up these pages.
I outlined a little of what it requires behind the scenes:

http://bioperl.org/wiki/BioPerl:Module_pages

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From bix at sendu.me.uk  Thu Jun 14 05:35:00 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 10:35:00 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
Message-ID: <46710BC4.3060302@sendu.me.uk>

It is preferable to have ->new syntax over new Object syntax, as 
outlined here: 
http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules

I propose making this syntax change in all Bioperl POD documentation, so 
that the bad syntax is no longer suggested/encouraged. Any objections? 
If not, I'll go ahead and commit the changes.

(affects 907 modules in live)
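
For anyone wanting to apply the same change to their own scripts, a rough
sketch of the substitution follows. The helper name arrow_new is made up,
and this is an illustration only, not the procedure used for the commit.
It only rewrites 'new Bio::Something' when followed by '(' or ';', so
that POD prose such as "a new Bio::Seq object" is left alone.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite "new Bio::Foo(...)" and "new Bio::Foo;"
# to the arrow form "Bio::Foo->new(...)" and "Bio::Foo->new;".
sub arrow_new {
    my ($text) = @_;
    # The lookahead requires '(' or ';' after the class name, so ordinary
    # sentences that merely mention a class are not touched.
    $text =~ s/\bnew\s+(Bio(?:::\w+)+)(?=\s*[(;])/$1->new/g;
    return $text;
}

print arrow_new($_), "\n" for (
    'my $seq = new Bio::Seq(-id => "x");',
    'my $in = new Bio::SeqIO::fasta;',
    'This creates a new Bio::Seq object.',
);
```

Any automated rewrite like this should be reviewed by hand, since a
regex cannot see context such as strings or heredocs.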


Cheers,
Sendu.

From bix at sendu.me.uk  Thu Jun 14 06:01:02 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 11:01:02 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46710BC4.3060302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
Message-ID: <467111DE.6060800@sendu.me.uk>

Sendu Bala wrote:
> It is preferable to have ->new syntax over new Object syntax, as 
> outlined here: 
> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules 
> 
> 
> I propose making this syntax change in all Bioperl POD documentation,

Actually, I propose making the change to code as well.


From hlapp at gmx.net  Thu Jun 14 08:47:47 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 14 Jun 2007 08:47:47 -0400
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <467111DE.6060800@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk>
Message-ID: <0D7CD74F-DCB3-44F8-9AC7-144B1BD58946@gmx.net>

Sounds fine to me. People do go by working examples, and I've seen  
inconsistent examples leading to confusion on the part of newbies.

	-hilmar

On Jun 14, 2007, at 6:01 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- 
>> oriented_programming_and_modules
>>
>>
>> I propose making this syntax change in all Bioperl POD documentation,
>
> Actually, I propose making the change to code as well.
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Jun 14 08:55:18 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 07:55:18 -0500
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <467111DE.6060800@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk>
Message-ID: <EC0DB8AB-F7C8-423B-9566-34B3FD24B3EC@uiuc.edu>

Sounds fine by me.  I may actually start tackling some of the feature/ 
annotation overloading stuff myself to see what happens (I'll drop a  
notice when that occurs).

chris

On Jun 14, 2007, at 5:01 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- 
>> oriented_programming_and_modules
>>
>>
>> I propose making this syntax change in all Bioperl POD documentation,
>
> Actually, I propose making the change to code as well.


From tanzeem.mb at gmail.com  Thu Jun 14 02:27:19 2007
From: tanzeem.mb at gmail.com (tanzeem)
Date: Wed, 13 Jun 2007 23:27:19 -0700 (PDT)
Subject: [Bioperl-l] Problem working with remoteblast submit method in
	webbrowser.
Message-ID: <11114623.post@talk.nabble.com>


 I have a program which uses the BioPerl RemoteBlast module to compare an
amino acid FASTA file against the Swiss-Prot database. The submit_blast() method
works successfully when run from the command line, but when the program is run
from a web browser it returns -1. I was trying to adapt the code from the
RemoteBlast synopsis for my needs.


From bix at sendu.me.uk  Thu Jun 14 11:34:27 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 16:34:27 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46710BC4.3060302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
Message-ID: <46716003.2030302@sendu.me.uk>

Sendu Bala wrote:
> It is preferable to have ->new syntax over new Object syntax, as 
> outlined here: 
> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
> 
> I propose making this syntax change in all Bioperl POD documentation, so 
> that the bad syntax is no longer suggested/encouraged. Any objections? 
> If not, I'll go ahead and commit the changes.
> 
> (affects 907 modules in live)

It was actually 515 modules & test scripts from live, 48 from run, 21
from db and 2 from network.

Now committed. Before and after my changes these were failing:


Failed Test     Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/BioGraphics.t    3   768    38    3  3-5
t/PodSyntax.t      9  2304  2195    9  378 614 660 1023 1197 1512 1558
                                        1932 2106
t/Sopma.t          2   512    16    2  8 15
t/genbank.t        2   512   247    2  122-123


BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
(unintentional?).

Sopma may not be a bug: results from server might have changed.

genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163
-> 1.164 not doing what the new tests expect.

PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
you working on that, or can I fix those errors?

Anyone care to look into those things?

Cheers,
Sendu.


From cjfields at uiuc.edu  Thu Jun 14 12:35:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 11:35:21 -0500
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <AAFC1021-9E3A-4C31-A9B8-4B0046F907A1@uiuc.edu>

The genbank commit was mine so I'll look into it; it may be that I
hadn't finished up the bug work.  If I have time I'll look into
Sopma as well (unless you get to it first).

chris

On Jun 14, 2007, at 10:34 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- 
>> oriented_programming_and_modules
>>
>> I propose making this syntax change in all Bioperl POD  
>> documentation, so
>> that the bad syntax is no longer suggested/encouraged. Any  
>> objections?
>> If not, I'll go ahead and commit the changes.
>>
>> (affects 907 modules in live)
>
> It was actually 515 modules & test scripts from live, 48 from run, 21
> from db and 2 from network.
>
> Now committed. Before and after my changes these were failing:
>
>
> Failed Test     Stat Wstat Total Fail  List of Failed
> ---------------------------------------------------------------------- 
> ---------
> t/BioGraphics.t    3   768    38    3  3-5
> t/PodSyntax.t      9  2304  2195    9  378 614 660 1023 1197 1512 1558
>                                         1932 2106
> t/Sopma.t          2   512    16    2  8 15
> t/genbank.t        2   512   247    2  122-123
>
>
> BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
> (unintentional?).
>
> Sopma may not be a bug: results from server might have changed.
>
> genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm  
> 1.163
> -> 1.164 not doing what the new tests expect.
>
> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan,  
> are
> you working on that, or can I fix those errors?
>
> Anyone care to look into those things?
>
> Cheers,
> Sendu.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Thu Jun 14 12:43:43 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 17:43:43 +0100
Subject: [Bioperl-l] Perltidy
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <4671703F.4010109@sheffield.ac.uk>

I'm just wondering if anyone passes their modules through perltidy in
order for them to have the same look/feel? If so, do you have a
.perltidyrc file? Also, is it worth running the Bioperl modules through it?

Nath

From n.haigh at sheffield.ac.uk  Thu Jun 14 12:36:37 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 17:36:37 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <46716E95.3090604@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as 
>> outlined here: 
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
>>
>> I propose making this syntax change in all Bioperl POD documentation, so 
>> that the bad syntax is no longer suggested/encouraged. Any objections? 
>> If not, I'll go ahead and commit the changes.
>>
>> (affects 907 modules in live)
> 
> It was actually 515 modules & test scripts from live, 48 from run, 21
> from db and 2 from network.
> 
> Now committed. Before and after my changes these were failing:
> 
> 
> Failed Test     Stat Wstat Total Fail  List of Failed
> -------------------------------------------------------------------------------
> t/BioGraphics.t    3   768    38    3  3-5
> t/PodSyntax.t      9  2304  2195    9  378 614 660 1023 1197 1512 1558
>                                         1932 2106
> t/Sopma.t          2   512    16    2  8 15
> t/genbank.t        2   512   247    2  122-123
> 
> 
> BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
> (unintentional?).
> 
> Sopma may not be a bug: results from server might have changed.
> 
> genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163
> -> 1.164 not doing what the new tests expect.
> 
> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
> you working on that, or can I fix those errors?
> 

I can fix these - although I'm still trying to get my new Debian 4.0
system up-to-speed so it might take me a little while! RE the
PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't
installed. However, would it be better to have Test::Pod in t/lib so
that it runs on the user's system during installation or leave it as is?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGcW6VczuW2jkwy2gRAv3dAKCURgd4F881MhbessKxNh/cPrJu2wCeLwnS
7olroF2e6+4I0biz6fWRmu4=
=s3hK
-----END PGP SIGNATURE-----

From bix at sendu.me.uk  Thu Jun 14 13:15:24 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 18:15:24 +0100
Subject: [Bioperl-l] Perltidy
In-Reply-To: <4671703F.4010109@sheffield.ac.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk>
Message-ID: <467177AC.8060104@sendu.me.uk>

Nathan S. Haigh wrote:
> I'm just wondering if anyone passes their modules through perltidy in
> order for them to have the same look/feel? If so, do you have a
> .perltidyrc file? Also, is it worth running the Bioperl modules through it?

I don't use it, but I was contemplating the same thing. Chris uses it 
from time to time and I think we have a similar taste in style.

But we'd have to hammer something out that was agreeable to everyone.

From mmokrejs at ribosome.natur.cuni.cz  Thu Jun 14 13:19:42 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Thu, 14 Jun 2007 19:19:42 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
Message-ID: <467178AE.5040905@ribosome.natur.cuni.cz>


David Messina wrote:
> Hi Martin,
> 
> You're in luck -- the BioPerl core distribution includes two scripts  
> for doing just that:
> 
> 	genbank2gff

Somehow these scripts were not installed for me on Gentoo, but I have them in the
CVS copy. ;-) Anyway, the one above is not for me: I do not need the GFF database,
or rather, I have no intention of installing that unknown thing; it seems like overkill
for my case. I just want to render a plasmid map.

> 	genbank2gff3

This one seems more promising but still with current cvs checkout I get...

$ perl /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl --in stdin --out stdout < ~/99.gb 
# Input: stdin
Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, <FH> line 7.
Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, <FH> line 7.
Can't call method "binomial" on an undefined value at /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl line 675, <FH> line 125.
$
$ bp_seqconvert.pl --from genbank --to embl < ~/IRESite/gb/99.gb 
Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, <STDIN> line 7.
Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, <STDIN> line 7.
ID   unknown; SV 1; circular; unassigned DNA; STD; UNC; 5391 BP.
XX
AC   unknown;
XX
XX
XX
CC   ApEinfo:methylated:0
...

Oh dear, I have just manually edited the files and still they are wrong? Oh no. :(

> 
> Look in the scripts directory of the distro.
> 
> Also, there is a *huge* amount of documentation and examples on the  
> BioPerl website.
> 
> 	http://www.bioperl.org/wiki/HOWTOs

You mean http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File ? ;-)

> 
> Reading those, reading the FAQ, and searching the mailing list  
> archives are where I look first when I don't know how to do something  
> in BioPerl.
> 
> 
> Dave
> 
> --
> Dave Messina
> Senior Analyst, Assembly Group
> Genome Sequencing Center
> Washington University
> St. Louis, MO

-- 
Dr. Martin Mokrejs
Dept. of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 99.gb
Url: http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070614/fc6e601a/attachment.pl 

From mmokrejs at ribosome.natur.cuni.cz  Thu Jun 14 13:23:28 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Thu, 14 Jun 2007 19:23:28 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <467178AE.5040905@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
Message-ID: <46717990.6040509@ribosome.natur.cuni.cz>

Martin MOKREJŠ wrote:

>> Also, there is a *huge* amount of documentation and examples on the  
>> BioPerl website.
>>
>>     http://www.bioperl.org/wiki/HOWTOs
> 
> You mean 
> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File 
> ? ;-)

$ perl embl2picture.pl ~/99.gb | display -
Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature Bio::Location::Simple=HASH(0x893ebac): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature Bio::Location::Simple=HASH(0x893e720): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.
$

The plasmid is circular DNA, so why is the diagram linear? ;-)

Martin


From bix at sendu.me.uk  Thu Jun 14 13:03:34 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 18:03:34 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716E95.3090604@sheffield.ac.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<46716E95.3090604@sheffield.ac.uk>
Message-ID: <467174E6.1090001@sendu.me.uk>

Nathan S. Haigh wrote:
>> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
>> you working on that, or can I fix those errors?
> 
> I can fix these - although I'm still trying to get my new Debian 4.0
> system up-to-speed so it might take me a little while! RE the
> PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't
> installed. However, would it be better to have Test::Pod in t/lib so
> that it runs on the user's system during installation or leave it as is?

Leave it as is. Every-day users don't need to check the syntax of the 
pod. In fact, it really only needs to be done once, prior to packaging 
up a new release.
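
For reference, the "silently skip if Test::Pod isn't installed" behaviour
discussed above can be sketched roughly as follows. This is not the actual
t/PodSyntax.t code; module_available() is a hypothetical helper written just
for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): returns 1 if the named
# module loads cleanly, 0 otherwise.
sub module_available {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Decide whether the POD syntax checks run or are skipped.
my $run_pod_tests = module_available('Test::Pod');

print $run_pod_tests
    ? "Test::Pod found: POD syntax tests will run\n"
    : "Test::Pod not installed: skipping POD syntax tests\n";
```

In a real test script the skip branch would normally be written with
Test::More's "plan skip_all => ..." and the run branch would call
Test::Pod's all_pod_files_ok(), so a user without Test::Pod sees a skipped
test rather than a failure.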

From n.haigh at sheffield.ac.uk  Thu Jun 14 13:32:37 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 18:32:37 +0100
Subject: [Bioperl-l] Perltidy
In-Reply-To: <467177AC.8060104@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
Message-ID: <46717BB5.8000706@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> I'm just wondering if anyone passes their modules through perltidy in
>> order for them to have the same look/feel? If so, do you have a
>> .perltidyrc file? Also, is it worth running the Bioperl modules
>> through it?
> 
> I don't use it, but I was contemplating the same thing. Chris uses it
> from time to time and I think we have a similar taste in style.
> 
> But we'd have to hammer something out that was agreeable to everyone.

A starting place may be Perl Best Practices by Damian Conway:
http://www.oreilly.com/catalog/perlbp/


The perltidyrc file can be found here:
http://www.perlmonks.org/?node_id=485885

I also found this nice thread with some ideas, including some code that causes
emacs to auto-perltidy everything you use cperl-mode with. I don't use
emacs myself, but here's the link if anyone is interested:
http://www.perlmonks.org/?node_id=516501

Nath

From johnsonm at gmail.com  Thu Jun 14 13:38:31 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Thu, 14 Jun 2007 12:38:31 -0500
Subject: [Bioperl-l] Perltidy
In-Reply-To: <467177AC.8060104@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
Message-ID: <ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>

    The nice thing about Perl Tidy is that everybody can have their
own config file.  There could be a bioperl default config that gets
applied at checkin time.  Anybody that didn't like it could script
checkouts to get run through their own config.  Diffs might get a
little hairy, but as long as you tidy before diffing, it shouldn't be
too bad.  Speaking of which....coding style is controversial enough,
but since that's already been opened, what about CVS vs Subversion? 8)
 Some of the scripting for this sort of thing might be easier in
Subversion.  Though maybe something like Git would fit the developer
model better (more support for distributed development).

On 6/14/07, Sendu Bala <bix at sendu.me.uk> wrote:
> Nathan S. Haigh wrote:
> > I'm just wondering if anyone passes their modules through perltidy in
> > order for them to have the same look/feel? If so, do you have a
> > .perltidyrc file? Also, is it worth running the Bioperl modules through it?
>
> I don't use it, but I was contemplating the same thing. Chris uses it
> from time to time and I think we have a similar taste in style.
>
> But we'd have to hammer something out that was agreeable to everyone.

From n.haigh at sheffield.ac.uk  Thu Jun 14 13:39:39 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 18:39:39 +0100
Subject: [Bioperl-l] cvs changes in working copy
Message-ID: <46717D5B.5040108@sheffield.ac.uk>

Not sure if I'm being dense or if it's because I've been working with
svn recently, but - how do I get a list of files that are different in
my working copy compared to the repository?

Cheers
Nath

From cjfields at uiuc.edu  Thu Jun 14 13:46:38 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 12:46:38 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
Message-ID: <CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>

Is 99.gb supposed to be a GenBank file?  And you're loading it into  
embl2picture (which I assume takes EMBL format files)?  Without  
example code we can easily make the wrong assumptions (i.e. that this  
is user error and not a BioPerl problem).

Also, I don't believe the feature plotting scripts plot circular  
chromosomes/plasmids.  If you want this functionality you'll have to  
code it for yourself.

chris

On Jun 14, 2007, at 12:23 PM, Martin MOKREJŠ wrote:

> Martin MOKREJŠ wrote:
>
>>> Also, there is a *huge* amount of documentation and examples on the
>>> BioPerl website.
>>>
>>>     http://www.bioperl.org/wiki/HOWTOs
>>
>> You mean
>> http://www.bioperl.org/wiki/ 
>> HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
>> ? ;-)
>
> $ perl embl2picture.pl ~/99.gb | display -
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature  
> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature  
> Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature  
> Bio::Location::Simple=HASH(0x893ebac): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature  
> Bio::Location::Simple=HASH(0x893e720): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature  
> Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
> $
>
> The plasmid is a circular DNA, why is the diagram in linear? ;-)
>
> Martin

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From arareko at campus.iztacala.unam.mx  Thu Jun 14 13:57:35 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 14 Jun 2007 12:57:35 -0500
Subject: [Bioperl-l] Perltidy
In-Reply-To: <46717BB5.8000706@sheffield.ac.uk>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk> <46717BB5.8000706@sheffield.ac.uk>
Message-ID: <4671818F.5040902@campus.iztacala.unam.mx>

I think a consensus .perltidyrc could be placed in the source distribution.

Mauricio.

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> I'm just wondering if anyone passes their modules through perltidy in
>>> order for them to have the same look/feel? If so, do you have a
>>> .perltidyrc file? Also, is it worth running the Bioperl modules
>>> through it?
>> I don't use it, but I was contemplating the same thing. Chris uses it
>> from time to time and I think we have a similar taste in style.
>>
>> But we'd have to hammer something out that was agreeable to everyone.
> 
> A starting place maybe Perl Best Practices by Damian Conway:
> http://www.oreilly.com/catalog/perlbp/
> 
> 
> The perltidyrc file can e found here:
> http://www.perlmonks.org/?node_id=485885
> 
> I also found this nice thread with some ideas, inc some code that causes
> emacs to auto-perltidy everything you use cperl-mode with. I don't use
> emacs myself, ut here's the link if anyone is interested:
> http://www.perlmonks.org/?node_id=516501
> 
> Nath

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From cjfields at uiuc.edu  Thu Jun 14 14:32:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 13:32:41 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
Message-ID: <BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>

To chip in on this, I only use perltidy when I need to clean bioperl  
code up for debugging (particularly if blocks are hard to see) and  
just use the defaults.  I agree it would be nice to have everything  
tidied up but it'll definitely need to be a consensus config file.

About svn, I like the idea of eventually migrating to using it over  
CVS (I think BioPython and BioJava have plans to but I'm not sure)  
but I don't really know enough to say how feasible/difficult the  
migration path would be.  Anyone know?

chris

On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote:

>     The nice thing about Perl Tidy is that everybody can have their
> own config file.  There could be a bioperl default config that gets
> applied at checkin time.  Anybody that didn't like it could script
> checkouts to get run through their own config.  Diffs might get a
> little hairy, but as long as you tidy before diffing, it shouldn't be
> too bad.  Speaking of which....coding style is controversial enough,
> but since that's already been opened, what about CVS vs Subversion? 8)
>  Some of the scripting for this sort of thing might be easer in
> Subversion.  Though maybe something like Git would fit the developer
> model better (more support for distributed development).
>
> On 6/14/07, Sendu Bala <bix at sendu.me.uk> wrote:
>> Nathan S. Haigh wrote:
>>> I'm just wondering if anyone passes their modules through  
>>> perltidy in
>>> order for them to have the same look/feel? If so, do you have a
>>> .perltidyrc file? Also, is it worth running the Bioperl modules  
>>> through it?
>>
>> I don't use it, but I was contemplating the same thing. Chris uses it
>> from time to time and I think we have a similar taste in style.
>>
>> But we'd have to hammer something out that was agreeable to everyone.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnsonm at gmail.com  Thu Jun 14 14:46:24 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Thu, 14 Jun 2007 13:46:24 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
Message-ID: <ebf5eb170706141146r6e07efffhbb98a6d101c45ccd@mail.gmail.com>

    If there was a default/standard/consensus bioperl perltidy config
file, I would probably use it prior to checkin, on my own, so I could
code in my schizophrenic style without worrying about starting any
format wars.  When I'm fixing or enhancing somebody else's code, I
always try and adapt to whatever style they used, even if it grates on
my nerves.  I'd love to not have to worry about that with Bioperl.  Of
course, nobody will ever agree on a standard, so it's probably a moot
point.  8)

On 6/14/07, Chris Fields <cjfields at uiuc.edu> wrote:
> To chip in on this, I only use perltidy when I need to clean bioperl
> code up for debugging (particularly if blocks are hard to see) and
> just use the defaults.  I agree it would be nice to have everything
> tidied up but it'll definitely need to be a consensus config file.
>
> About svn, I like the idea of eventually migrating to using it over
> CVS (I think BioPython and BioJava have plans to but I'm not sure)
> but I don't really know enough to say how feasible/difficult the
> migration path would be.  Anyone know?
>
> chris

From jason at bioperl.org  Thu Jun 14 15:00:09 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 14 Jun 2007 12:00:09 -0700
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
Message-ID: <CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>


On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:

> To chip in on this, I only use perltidy when I need to clean bioperl
> code up for debugging (particularly if blocks are hard to see) and
> just use the defaults.  I agree it would be nice to have everything
> tidied up but it'll definitely need to be a consensus config file.
>

Can we do any sort of massive conversion at some logical timepoint,
probably after a branch release or something?  I ask because it basically
means we're going to have differences on nearly every line, which is
going to make diff-ing difficult when debugging old/new versions.
Maybe it is not a problem because we aren't introducing any new bugs!

> About svn, I like the idea of eventually migrating to using it over
> CVS (I think BioPython and BioJava have plans to but I'm not sure)
> but I don't really know enough to say how feasible/difficult the
> migration path would be.  Anyone know?
>

It's doable but non-trivial.  cvs2svn (python gah!) script exists to  
help in this.  There are pros and cons to converting.   There is a  
fair amount of documentation and other pointers out there that point  
to the CVS server for getting latest code so we'd need to think about  
whether we'd support some sort of backwards compatible SVN -> CVS for  
read-only or what.

Mostly it will need someone to lead the charge - I made a go at doing  
it in the winter, but I really don't have the SVN-foo to make this  
work.  We'd need someone with SVN experience to step up and help.   
You can always try and we can play with the converted repository for  
a while without making it the new code base.

-j

> chris
>
> On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote:
>
>>     The nice thing about Perl Tidy is that everybody can have their
>> own config file.  There could be a bioperl default config that gets
>> applied at checkin time.  Anybody that didn't like it could script
>> checkouts to get run through their own config.  Diffs might get a
>> little hairy, but as long as you tidy before diffing, it shouldn't be
>> too bad.  Speaking of which....coding style is controversial enough,
>> but since that's already been opened, what about CVS vs Subversion? 8)
>> Some of the scripting for this sort of thing might be easier in
>> Subversion.  Though maybe something like Git would fit the developer
>> model better (more support for distributed development).
>>
>> On 6/14/07, Sendu Bala <bix at sendu.me.uk> wrote:
>>> Nathan S. Haigh wrote:
>>>> I'm just wondering if anyone passes their modules through perltidy
>>>> in order for them to have the same look/feel? If so, do you have a
>>>> .perltidyrc file? Also, is it worth running the Bioperl modules
>>>> through it?
>>>
>>> I don't use it, but I was contemplating the same thing. Chris uses it
>>> from time to time and I think we have a similar taste in style.
>>>
>>> But we'd have to hammer something out that was agreeable to everyone.
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Thu Jun 14 15:01:27 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 14 Jun 2007 12:01:27 -0700
Subject: [Bioperl-l] cvs changes in working copy
In-Reply-To: <46717D5B.5040108@sheffield.ac.uk>
References: <46717D5B.5040108@sheffield.ac.uk>
Message-ID: <EE64F124-7DA2-4FB1-BE9B-C267126FCF6F@bioperl.org>

cvs update | grep '^M'

On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote:

> Not sure if I'm being dense or if it's because I've been working with
> svn recently, but - how do I get a list of files that are different in
> my working copy compared to the repository?
>
> Cheers
> Nath

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From cjfields at uiuc.edu  Thu Jun 14 15:20:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 14:20:46 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
Message-ID: <C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>


On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote:

>
> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:
>
>> To chip in on this, I only use perltidy when I need to clean bioperl
>> code up for debugging (particularly if blocks are hard to see) and
>> just use the defaults.  I agree it would be nice to have everything
>> tidied up but it'll definitely need to be a consensus config file.
>>
>
> Can we do any sort of massive conversion at some logical timepoint.
> Probably after a branch release or something?  Because it basically
> means we're going to have differences on nearly every line which is
> going to make diff-ing difficult when debugging old/new versions.
> Maybe it is not a problem because we aren't introducing any new bugs!

I agree; if we intend on doing this it should be all at once, maybe  
on a branch dedicated to ensure that code changes don't tank tests  
(they shouldn't but one never knows).  We would then need a script up- 
and-running that tidies everything up prior to commits (though what  
happens if perltidy tanks?...).

Sendu, up for it?

>> About svn, I like the idea of eventually migrating to using it over
>> CVS (I think BioPython and BioJava have plans to but I'm not sure)
>> but I don't really know enough to say how feasible/difficult the
>> migration path would be.  Anyone know?
>>
>
> It's doable but non-trivial.  cvs2svn (python gah!) script exists to
> help in this.  There are pros and cons to converting.   There is a
> fair amount of documentation and other pointers out there that point
> to the CVS server for getting latest code so we'd need to think about
> whether we'd support some sort of backwards compatible SVN -> CVS for
> read-only or what.
>
> Mostly it will need someone to lead the charge - I made a go at doing
> it in the winter, but I really don't have the SVN-foo to make this
> work.  We'd need someone with SVN experience to step up and help.
> You can always try and we can play with the converted repository for
> a while without making it the new code base.
>
> -j

Stepped into that one, didn't I!  I'll look into how much effort is  
involved and try getting something going in the next month or two,  
maybe sooner if time permits.  I'm lacking on SVN-foo as well but it  
might be worth looking into.

chris


From arareko at campus.iztacala.unam.mx  Thu Jun 14 15:50:39 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 14 Jun 2007 14:50:39 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
Message-ID: <46719C0F.5010706@campus.iztacala.unam.mx>

Chris Fields wrote:
> On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote:
> 
>> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:
>>
>>> About svn, I like the idea of eventually migrating to using it over
>>> CVS (I think BioPython and BioJava have plans to but I'm not sure)
>>> but I don't really know enough to say how feasible/difficult the
>>> migration path would be.  Anyone know?
>>>
>> It's doable but non-trivial.  cvs2svn (python gah!) script exists to
>> help in this.  There are pros and cons to converting.   There is a
>> fair amount of documentation and other pointers out there that point
>> to the CVS server for getting latest code so we'd need to think about
>> whether we'd support some sort of backwards compatible SVN -> CVS for
>> read-only or what.
>>
>> Mostly it will need someone to lead the charge - I made a go at doing
>> it in the winter, but I really don't have the SVN-foo to make this
>> work.  We'd need someone with SVN experience to step up and help.
>> You can always try and we can play with the converted repository for
>> a while without making it the new code base.
>>
>> -j
> 
> Stepped into that one, didn't I!  I'll look into how much effort is  
> involved and try getting something going in the next month or two,  
> maybe sooner if time permits.  I'm lacking on SVN-foo as well but it  
> might be worth looking into.
> 
> chris
> 

Chris D has worked with CVS-SVN transitioning for other projects, maybe 
he can shed some light on this.

Mauricio.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From sac at bioperl.org  Thu Jun 14 17:33:39 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Thu, 14 Jun 2007 14:33:39 -0700
Subject: [Bioperl-l] How can I pull out all instances of a motif from a
	genome sequence and output them as a BED file?
In-Reply-To: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>
References: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>
	<5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>
Message-ID: <8f200b4c0706141433i37267774u1dc2193d8508c47b@mail.gmail.com>

This issue was discussed recently here. Check out this thread:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15046/focus=15048

Some of the tools listed in the FAQ item Chris mentioned do not
report where the match occurred, only that a match occurred
(String::Approx, agrep), though some do report match
locations (fuzznuc, fuzzprot; not sure about TFBS).

My Bio::Tools::SeqPattern module does not even perform any matches, it
just encapsulates a regular expression for a nuc or protein motif and
knows how to handle ambiguity code expansion and reverse
complementing. The idea is that you can use this to convert a
biological sequence motif into a string suitable for use in a perl
regex. Adding a match() method to this module would be handy.
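
A minimal sketch of that idea in plain Perl, assuming the motif has
already been turned into a regex string (by hand or otherwise).  The
helper name motif_to_bed and the chromosome label chr3L are made up for
illustration.  BED wants a 0-based start and an exclusive end, which is
what @- and @+ hold after a match.  As for the list of 1's in the
original loop: "." and "+" have the same precedence, so the string is
concatenated first and then has 1 added to it; writing
(pos($sequence_as_a_string) + 1) in parentheses avoids that.

```perl
use strict;
use warnings;

# Hypothetical helper: return BED lines (chrom, 0-based start, exclusive
# end) for every occurrence of $motif_regex in $sequence.
sub motif_to_bed {
    my ($chrom, $sequence, $motif_regex) = @_;
    my @bed;
    # The lookahead lets overlapping occurrences be found; the capture
    # group inside it records where the motif itself starts and ends.
    while ($sequence =~ /(?=($motif_regex))/g) {
        push @bed, join("\t", $chrom, $-[1], $+[1]);
    }
    return @bed;
}

# Toy sequence; in practice this would be $seq_object->seq().
print "$_\n" for motif_to_bed('chr3L', 'CAAAATGAAAC', 'AAA');
```

This only scans the forward strand; the reverse complement of the motif
would need a second pass.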

There is an example script for it in examples/tools of the distro (which,
btw, references an obsolete module, so it won't run as is -- I'll fix).

Steve

On 6/13/07, Chris Fields <cjfields at uiuc.edu> wrote:
> This is answered in the FAQ (sorry if the URL wraps, but we don't
> like tinyurls):
>
> http://www.bioperl.org/wiki/FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_.22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F
>
> chris
>
> On Jun 13, 2007, at 7:20 PM, John Cumbers wrote:
>
> > Hello,
> >
> > I have a simple problem, I'm trying to search a genome sequence for a
> > motif, I then want to output a BED file to display all the locations
> > of this motif on the UCSC Genome Browser.  I could not find a script
> > to do this, so I started to write my own.   I'm new to perl and my
> > code below was my attempt to read the sequence string and output the
> > index bp of the start of each motif.  With this I could build the BED
> > file myself, which requires start and finish base pairs.
> >
> > For the first motif I can output the start index, but when I try and
> > read the next one off the sequence it does not work.  Instead I just
> > get an output of a list of 1's.  I realise that this is more a request
> > for some simple perl help, but any help much appreciated.
> >
> > Best wishes,
> > John
> >
> >
> > $seq_object = read_sequence("Drosophila.Chr3.test.AE014296.fasta");  # turn my FASTA file into a seq object.
> > $sequence_as_a_string = $seq_object->seq();  # turn it into a string
> > # search $sequence_as_a_string  string for motif AAA as example
> > # if found, return the index that it is found at
> >
> > while ($sequence_as_a_string =~ m/AAA/g) {
> >   print "Found '$&'.  Next attempt at character " . pos($sequence_as_a_string)+1 . "\n";
> > }
> >
> >
> >
> > --
> > John Cumbers,  Graduate Student
> > Biology and Medicine
> > Brown University, Box G-W
> > Providence, Rhode Island, 02912, USA
> > Tel USA: +1 401 523 8190,  Fax: +1 401 863-2166
> > UK to USA: 0207 617 7824
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>

From hlapp at gmx.net  Thu Jun 14 19:04:11 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 14 Jun 2007 19:04:11 -0400
Subject: [Bioperl-l] cvs changes in working copy
In-Reply-To: <EE64F124-7DA2-4FB1-BE9B-C267126FCF6F@bioperl.org>
References: <46717D5B.5040108@sheffield.ac.uk>
	<EE64F124-7DA2-4FB1-BE9B-C267126FCF6F@bioperl.org>
Message-ID: <3B262E6A-2C90-49FA-BCA1-BF1900C5AC3A@gmx.net>

Actually, that will update your working copy. If you just wanted to
take a peek you would use cvs status:

$ cvs status | grep 'Locally Modified'
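
Another option, assuming a reasonably standard cvs client, is the global
-n flag: "cvs -nq update" prints the same status letters as a real
update but leaves the working copy untouched.  Below is a small sketch
of filtering that output in Perl; locally_modified is a made-up helper
name and the sample lines are invented for illustration, not real
output from the BioPerl repository.

```perl
use strict;
use warnings;

# Hypothetical helper: given lines of "cvs -nq update" output, return
# the paths flagged "M" (locally modified).
sub locally_modified {
    my @lines = @_;
    return map { /^M\s+(.+?)\s*$/ ? $1 : () } @lines;
}

# Invented sample lines.  In real use, read from the command instead:
#   open my $cvs, '-|', 'cvs', '-nq', 'update' or die "cannot run cvs: $!";
#   print "$_\n" for locally_modified(<$cvs>);
my @sample = ("M Bio/SeqIO.pm\n", "? scratch.pl\n", "U Bio/Root/Root.pm\n");
print "$_\n" for locally_modified(@sample);
```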

	-hilmar

On Jun 14, 2007, at 3:01 PM, Jason Stajich wrote:

> cvs update | grep '^M'
>
> On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote:
>
>> Not sure if I'm being dense or if it's because I've been working with
>> svn recently, but - how do I get a list of files that are  
>> different in
>> my working copy compared to the repository?
>>
>> Cheers
>> Nath
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From mmokrejs at ribosome.natur.cuni.cz  Fri Jun 15 03:28:17 2007
From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ)
Date: Fri, 15 Jun 2007 09:28:17 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
	<CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
Message-ID: <46723F91.60501@ribosome.natur.cuni.cz>

Chris Fields wrote:
> Is 99.gb supposed to be a GenBank file?  And you're loading it into 

Yes, it was attached to the email. ;)

> embl2picture (which I assume takes EMBL format files)?  Without example 
> code we can easily make the wrong assumptions (i.e. that this is user 
> error and not a BioPerl problem).

use constant USAGE =><<END;
Usage: $0 <file>
   Render a GenBank/EMBL entry into drawable form.
   Return as a GIF or PNG image on standard output.
 
   File must be in embl, genbank, or another SeqIO-
   recognized format.  Only the first entry will be
   rendered.
 
Example to try:
   embl2picture.pl factor7.embl | display -
 
END

> 
> Also, I don't believe the feature plotting scripts plot circular 
> chromosomes/plasmids.  If you want this functionality you'll have to 
> code it for yourself.

That's a pity it does not, but at least someone could improve the docs. ;)
Unfortunately I don't have the time to rewrite the code myself now,
I need a working, standalone, already available tool. :(
M.

> 
> chris
> 
> On Jun 14, 2007, at 12:23 PM, Martin MOKREJŠ wrote:
> 
>> Martin MOKREJŠ wrote:
>>
>>>> Also, there is a *huge* amount of documentation and examples on the
>>>> BioPerl website.
>>>>
>>>>     http://www.bioperl.org/wiki/HOWTOs
>>>
>>> You mean
>>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File 
>>>
>>> ? ;-)
>>
>> $ perl embl2picture.pl ~/99.gb | display -
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature 
>> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature 
>> Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature 
>> Bio::Location::Simple=HASH(0x893ebac): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature 
>> Bio::Location::Simple=HASH(0x893e720): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature 
>> Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>> $
>>
>> The plasmid is a circular DNA, why is the diagram in linear? ;-)
>>
>> Martin
>>
>>
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> 
> 

-- 
Dr. Martin Mokrejs
Dept. of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs


From dhoworth at mrc-lmb.cam.ac.uk  Fri Jun 15 04:59:09 2007
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Fri, 15 Jun 2007 09:59:09 +0100
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
Message-ID: <467254DD.3010505@mrc-lmb.cam.ac.uk>

Martin MOKREJŠ wrote:
>>> Also, there is a *huge* amount of documentation and examples on
>>> the BioPerl website.
>>> 
>>> http://www.bioperl.org/wiki/HOWTOs
>> You mean 
>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
>>  ? ;-)
> 
> $ perl embl2picture.pl ~/99.gb | display - Error returned while
> evaluating value of 'description' option for glyph
> Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature
> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl
> line 141, <GEN0> line 125.

Hmm an error at line 141 of a 69 line script? Methinks you're not
actually running the script that's presented on the wiki page you
quoted. I cut-and-pasted the script and your file and it worked for me
(at least, it produced an image, along with a bunch of OOPS lines)

HTH, Dave

From n.haigh at sheffield.ac.uk  Fri Jun 15 06:21:38 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 15 Jun 2007 11:21:38 +0100
Subject: [Bioperl-l] Installation using --install_base
Message-ID: <46726832.7080601@sheffield.ac.uk>

I'm setting up a new installation of Debian 4.0 at home and thought I'd
try to install BioPerl as a normal user rather than root. So in CPAN
options I set the --install_base to /home/username/perl and set PERL5LIB
to point to the same place.

Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root
user and ask to install all optional modules, it tries to install them
through CPAN - however it seems to fail because some dependencies don't
seem to want to install in a user directory.

Has anyone else found this or might I be doing something wrong?

Nath

From bix at sendu.me.uk  Fri Jun 15 06:07:04 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 15 Jun 2007 11:07:04 +0100
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
Message-ID: <467264C8.4020202@sendu.me.uk>

Chris Fields wrote:
> On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote:
> 
>> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:
>>
>>> To chip in on this, I only use perltidy when I need to clean bioperl
>>> code up for debugging (particularly if blocks are hard to see) and
>>> just use the defaults.  I agree it would be nice to have everything
>>> tidied up but it'll definitely need to be a consensus config file.
>>>
>> Can we do any sort of massive conversion at some logical timepoint.
>> Probably after a branch release or something?  Because it basically
>> means we're going to have differences on nearly every line which is
>> going to make diff-ing difficult when debugging old/new versions.
>> Maybe it is not a problem because we aren't introducing any new bugs!

Sorry, can you clarify the problem you envisage? And why would making a 
branch release help?


> I agree; if we intend on doing this it should be all at once, maybe  
> on a branch dedicated to ensure that code changes don't tank tests  
> (they shouldn't but one never knows).  We would then need a script up- 
> and-running that tidies everything up prior to commits (though what  
> happens if perltidy tanks?...).
> 
> Sendu, up for it?

If it's going to be difficult and a hassle, for such an unnecessary thing
I'm not sure it's worth it. There are more pressing things to be done for
Bioperl.

If I can just run perltidy on the entire package and commit, I'd do it. 
If that's not appropriate, I won't.


>>> About svn
[snip]
> Stepped into that one, didn't I!  I'll look into how much effort is  
> involved and try getting something going in the next month or two,  
> maybe sooner if time permits.  I'm lacking on SVN-foo as well but it  
> might be worth looking into.

I'd put this in the unnecessary-but-nice category as well. If it will be 
as easy as my ->new change, go ahead. If not, there are more pressing 
matters (POD fixing, test script updating and finishing...).


From n.haigh at sheffield.ac.uk  Fri Jun 15 06:35:40 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 15 Jun 2007 11:35:40 +0100
Subject: [Bioperl-l] Installation using --install_base
Message-ID: <46726B7C.7070902@sheffield.ac.uk>

I'm setting up a new installation of Debian 4.0 at home and thought I'd
try to install BioPerl as a normal user rather than root. So in CPAN
options I set the --install_base to /home/username/perl and set PERL5LIB
to point to the same place.

Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root
user and ask to install all optional modules, it tries to install them
through CPAN - however it seems to fail because some dependencies don't
seem to want to install in a user directory.

Has anyone else found this or might I be doing something wrong?

Nath

From bix at sendu.me.uk  Fri Jun 15 06:45:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 15 Jun 2007 11:45:48 +0100
Subject: [Bioperl-l] Installation using --install_base
In-Reply-To: <46726832.7080601@sheffield.ac.uk>
References: <46726832.7080601@sheffield.ac.uk>
Message-ID: <46726DDC.8090202@sendu.me.uk>

Nathan S. Haigh wrote:
> I'm setting up a new installation of Debian 4.0 at home and thought I'd
> try to install BioPerl as a normal user rather than root. So in CPAN
> options I set the --install_base to /home/username/perl and set PERL5LIB
> to point to the same place.
> 
> Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root
> user and ask to install all optional modules, it tries to install them
> through CPAN - however it seems to fail because some dependencies don't
> seem to want to install in a user directory.
> 
> Has anyone else found this or might I be doing something wrong?

You'll need to configure CPAN to install into your user directory. 
Upgrade to the latest version, then go read the docs on the various 
configurable options. I thought I at least mentioned this in the Bioperl 
INSTALL doc. If not, can someone come up with a concise clarification?
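
One detail that may be relevant, though nothing in the report confirms
it is the cause: with --install_base, Module::Build puts modules under
lib/perl5 inside the base directory, so PERL5LIB would normally need to
name that subdirectory rather than the base itself.  The sketch below
checks whether that directory is visible to perl; install_base_libdir
is a made-up helper and /home/username/perl is the path from the
original message.

```perl
use strict;
use warnings;

# Hypothetical helper: the directory PERL5LIB should contain for a given
# install_base (modules are installed under lib/perl5 inside it).
sub install_base_libdir {
    my ($install_base) = @_;
    return "$install_base/lib/perl5";
}

my $libdir = install_base_libdir('/home/username/perl');
my $seen   = grep { $_ eq $libdir } @INC;
print "$libdir is ", ($seen ? "already" : "not yet"), " on \@INC\n";
```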

From sdavis2 at mail.nih.gov  Fri Jun 15 06:56:08 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 15 Jun 2007 06:56:08 -0400
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <467264C8.4020202@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
Message-ID: <46727048.3080904@mail.nih.gov>

Sendu Bala wrote:
> If it's going to be difficult and a hassle, for such an unnecessary thing
> I'm not sure it's worth it. There are more pressing things to be done for
> Bioperl.
> 
> If I can just run perltidy on the entire package and commit, I'd do it. 
> If that's not appropriate, I won't.

I agree with the sentiment noted above.  I'm a bit of an outsider here,
but bioperl is a collaborative project.  Not everyone has the same
sentiments about what "correct" style means.  As a programmer, I really
wouldn't want significant changes on the style of my code.  And perl
happily puts up with many styles.  I would say leave things as they
are--let the individual programmers choose.  It reduces the amount of
work of questionable importance and allows the coding style freedom that
perl supports.

Just my $.02.

Sean

From cjfields at uiuc.edu  Fri Jun 15 10:05:07 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 09:05:07 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <46723F91.60501@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
	<CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
	<46723F91.60501@ribosome.natur.cuni.cz>
Message-ID: <A2212781-75F3-4BB7-967F-1668B682E84E@uiuc.edu>


On Jun 15, 2007, at 2:28 AM, Martin MOKREJŠ wrote:

> Chris Fields wrote:
>> Is 99.gb supposed to be a GenBank file?  And you're loading it into
>
> Yes, it was attached to the email. ;)

<bring foot to mouth and insert>

Sorry about that.  I notice that '.' was added, but the spacing  
seemed off.  I think bioperl catches that fine but it's something  
Wayne should consider.

>> embl2picture (which I assume takes EMBL format files)?  Without  
>> example
>> code we can easily make the wrong assumptions (i.e. that this is user
>> error and not a BioPerl problem).
>
> use constant USAGE =><<END;
> Usage: $0 <file>
>    Render a GenBank/EMBL entry into drawable form.
>    Return as a GIF or PNG image on standard output.
>
>    File must be in embl, genbank, or another SeqIO-
>    recognized format.  Only the first entry will be
>    rendered.
>
> Example to try:
>    embl2picture.pl factor7.embl | display -
>
> END

Horribly named script (should be seq2picture, since it converts both  
gb/embl).  The use of 'all_tags' makes me think the script version  
you are using is old, as those methods have long since been renamed.   
Dave has it working though, so maybe your version has been updated?   
The 'Use of uninitialized value in' warnings are probably from inclusion
of mandatory fields with no data or '.'.

>> Also, I don't believe the feature plotting scripts plot circular
>> chromosomes/plasmids.  If you want this functionality you'll have to
>> code it for yourself.
>
> That's a pity it does not, but at least someone could improve
> the docs. ;)
> Unfortunately I don't have the time to rewrite the code myself now,
> I need a working, standalone, already available tool. :(
> M.

As I said, unless someone shows interest and codes it, it just won't
get done.  We have had very little interest in this, either b/c there
are tools already out there to do this very thing (multitudes of
plasmid drawing programs, some free like ApE) or b/c nobody's bothered
to write it up.

chris


From cjfields at uiuc.edu  Fri Jun 15 10:22:23 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 09:22:23 -0500
Subject: [Bioperl-l] Perltidy and... SVN and ...Re:  Perltidy
In-Reply-To: <46727048.3080904@mail.nih.gov>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov>
Message-ID: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>


On Jun 15, 2007, at 5:56 AM, Sean Davis wrote:

> Sendu Bala wrote:
>> If it's going to be difficult and a hassle, for such an unnecessary
>> thing
>> I'm not sure it's worth it. There are more pressing things to be
>> done for
>> Bioperl.
>>
>> If I can just run perltidy on the entire package and commit, I'd  
>> do it.
>> If that's not appropriate, I won't.
>
> I agree with the sentiment noted above.  I'm a bit of an outsider  
> here,
> but bioperl is a collaborative project.  Not everyone has the same
> sentiments about what "correct" style means.  As a programmer, I  
> really
> wouldn't want significant changes on the style of my code.  And perl
> happily puts up with many styles.  I would say leave things as they
> are--let the individual programmers choose.  It reduces the amount of
> work of questionable importance and allows the coding style freedom  
> that
> perl supports.
>
> Just my $.02.
>
> Sean

I tend to run it on modules that need some reformatting  
(SearchIO::blast comes to mind).  I believe you're correct when this  
comes down to programming style, but I think this echoes a sentiment  
(frustration, perhaps) that some of us have with long-term  
maintenance of said code.

Maybe a compromise:  include a copy of .perltidyrc with the  
distribution that goes by what a consensus wants or by the general  
rules laid out in Perl Best Practices (spaced settings, use of spaces  
over tabs, etc).  Conversion would be encouraged but voluntary, with  
the caveat that if someone needs to clean up code down the road (bug  
fixes, enhancements, etc) and if the original author isn't able to  
add it in themselves, it could be perltidy'd in order to help the  
developer (locate and fix the issue)|(add relevant enhancement where  
needed).
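
As a starting point for discussion only, a .perltidyrc along the lines
of the settings suggested in Perl Best Practices might look like the
fragment below.  The particular values are a guess at what a consensus
could settle on, not an agreed BioPerl standard.

```
-l=78     # maximum line width of 78 columns
-i=4      # indent each level by 4 columns (spaces)
-ci=4     # continuation indent of 4 columns
-vt=2     # maximal vertical tightness
-cti=0    # no extra indentation for closing brackets
-pt=1     # medium parenthesis tightness
-bt=1     # medium brace tightness
-sbt=1    # medium square bracket tightness
-bbt=1    # medium block brace tightness
-nsfs     # no space before semicolons
-nolq     # do not outdent long quoted strings
```

Anyone who prefers a different layout could still run checkouts through
their own config, as suggested earlier in the thread.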

chris


From cjfields at uiuc.edu  Fri Jun 15 10:56:23 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 09:56:23 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <467264C8.4020202@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
Message-ID: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>


On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:

>>>> ...
>>> Can we do any sort of massive conversion at some logical timepoint.
>>> Probably after a branch release or something?  Because it basically
>>> means we're going to have differences on nearly every line which is
>>> going to make diff-ing difficult when debugging old/new versions.
>>> Maybe it is not a problem because we aren't introducing and new  
>>> bugs!
>
> Sorry, can you clarify the problem you envisage? And why would  
> making a branch release help?

Maybe the worry is that mass conversion in such a large codebase  
could lead to hard-to-locate bugs.  Shouldn't occur but who knows w/o  
trying?

>> I agree; if we intend on doing this it should be all at once,  
>> maybe  on a branch dedicated to ensure that code changes don't  
>> tank tests  (they shouldn't but one never knows).  We would then  
>> need a script up- and-running that tidies everything up prior to  
>> commits (though what  happens if perltidy tanks?...).
>> Sendu, up for it?
>
> If its going to be difficult and a hassle, for such an unnecessary  
> thing I'm not sure its worth it. There are more pressing things to  
> be done for Bioperl.
>
> If I can just run perltidy on the entire package and commit, I'd do  
> it. If that's not appropriate, I won't.

The choices aren't necessarily all or nothing.  What about voluntary,  
recommended use of a perltidy config file included with the  
distribution, with additional 'caveats'?  See my response to Sean.

>>>> About svn
> [snip]
>> Stepped into that one, didn't I!  I'll look into how much effort  
>> is  involved and try getting something going in the next month or  
>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as  
>> well but it  might be worth looking into.
>
> I'd put this in the unnecessary-but-nice category as well. If it  
> will be as easy as my ->new change, go ahead. If not, there are  
> more pressing matters (POD fixing, test script updating and  
> finishing...).

A few other open-bio projects have actively discussed a CVS->SVN
migration (BioRuby and, I think, BioPython, though I could be wrong
about the latter).  As I said, "it might be worth looking into" to weigh
the pros/cons, get opinions from others who have made the transition,
etc.  We could, as Jason suggested, even set up a tester SVN w/o making
it the default codebase (lock it off to a few testers, have CVS commits
automatically/manually carry over to SVN, etc).

I agree with you that it's not feasible to switch over prior to a  
release and that there are more pressing issues, but it doesn't hurt  
having an open discussion about it.

chris


From sdavis2 at mail.nih.gov  Fri Jun 15 11:15:57 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 15 Jun 2007 11:15:57 -0400
Subject: [Bioperl-l] Perltidy and... SVN and ...Re:  Perltidy
In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov>
	<78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
Message-ID: <4672AD2D.2090001@mail.nih.gov>

Chris Fields wrote:
> 
> On Jun 15, 2007, at 5:56 AM, Sean Davis wrote:
> 
>> Sendu Bala wrote:
>>> If its going to be difficult and a hassle, for such an unnecessary thing
>>> I'm not sure its worth it. There are more pressing things to be done for
>>> Bioperl.
>>>
>>> If I can just run perltidy on the entire package and commit, I'd do it.
>>> If that's not appropriate, I won't.
>>
>> I agree with the sentiment noted above.  I'm a bit of an outsider here,
>> but bioperl is a collaborative project.  Not everyone has the same
>> sentiments about what "correct" style means.  As a programmer, I really
>> wouldn't want significant changes on the style of my code.  And perl
>> happily puts up with many styles.  I would say leave things as they
>> are--let the individual programmers choose.  It reduces the amount of
>> work of questionable importance and allows the coding style freedom that
>> perl supports.
>>
>> Just my $.02.
>>
>> Sean
> 
> I tend to run it on modules that need some reformatting (SearchIO::blast
> comes to mind).  I believe you're correct when this comes down to
> programming style, but I think this echoes a sentiment (frustration,
> perhaps) that some of us have with long-term maintenance of said code.
> 
> Maybe a compromise:  include a copy of .perltidyrc with the distribution
> that goes by what a consensus wants or by the general rules laid out in
> Perl Best Practices (spaced settings, use of spaces over tabs, etc). 
> Conversion would be encouraged but voluntary, with the caveat that if
> someone needs to clean up code down the road (bug fixes, enhancements,
> etc) and if the original author isn't able to add it in themselves, it
> could be perltidy'd in order to help the developer (locate and fix the
> issue)|(add relevant enhancement where needed).

Don't get me wrong--I think whatever makes bioperl a better, more
maintainable beast should be what is done.  The bioperl gurus should
absolutely do what is best for them for code maintainability.

Sean

From n.haigh at sheffield.ac.uk  Fri Jun 15 11:17:15 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 15 Jun 2007 16:17:15 +0100
Subject: [Bioperl-l] Perltidy and... SVN and ...Re:  Perltidy
In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>	<467264C8.4020202@sendu.me.uk>
	<46727048.3080904@mail.nih.gov>
	<78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
Message-ID: <4672AD7B.4050109@sheffield.ac.uk>

Chris Fields wrote:
> On Jun 15, 2007, at 5:56 AM, Sean Davis wrote:
> 
>> Sendu Bala wrote:
>>> If its going to be difficult and a hassle, for such an unnecessary  
>>> thing
>>> I'm not sure its worth it. There are more pressing things to be  
>>> done for
>>> Bioperl.
>>>
>>> If I can just run perltidy on the entire package and commit, I'd  
>>> do it.
>>> If that's not appropriate, I won't.
>> I agree with the sentiment noted above.  I'm a bit of an outsider  
>> here,
>> but bioperl is a collaborative project.  Not everyone has the same
>> sentiments about what "correct" style means.  As a programmer, I  
>> really
>> wouldn't want significant changes on the style of my code.  And perl
>> happily puts up with many styles.  I would say leave things as they
>> are--let the individual programmers choose.  It reduces the amount of
>> work of questionable importance and allows the coding style freedom  
>> that
>> perl supports.
>>
>> Just my $.02.
>>
>> Sean
> 
> I tend to run it on modules that need some reformatting  
> (SearchIO::blast comes to mind).  I believe you're correct when this  
> comes down to programming style, but I think this echoes a sentiment  
> (frustration, perhaps) that some of us have with long-term  
> maintenance of said code.
> 
> Maybe a compromise:  include a copy of .perltidyrc with the  
> distribution that goes by what a consensus wants or by the general  
> rules laid out in Perl Best Practices (spaced settings, use of spaces  
> over tabs, etc).  

RE spaces, tabs etc - how well are the different coding styles handled
when displayed in html and via the online browsable cvs?

> Conversion would be encouraged but voluntary, with
> the caveat that if someone needs to clean up code down the road (bug  
> fixes, enhancements, etc) and if the original author isn't able to  
> add it in themselves, it could be perltidy'd in order to help the  
> developer (locate and fix the issue)|(add relevant enhancement where  
> needed).
> 
> chris
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From johnsonm at gmail.com  Fri Jun 15 15:37:26 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Fri, 15 Jun 2007 14:37:26 -0500
Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap
	start and stop coordinates??
In-Reply-To: <E22A8442-E00D-4732-9D80-EE61C75732B7@uiuc.edu>
References: <CED81D34E37D5043A1211565277A51E507E23161@exchkc02.stowers-institute.org>
	<79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu>
	<ebf5eb170705161211m6fb570b5r86ee055299993172@mail.gmail.com>
	<B012903E-7C0F-4E34-9BFE-E551855B6C62@uiuc.edu>
	<ebf5eb170705211348w57c37f18oeb128656c446cff@mail.gmail.com>
	<62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu>
	<ebf5eb170705211421w244933fcu4db8ba748653c090@mail.gmail.com>
	<9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu>
	<a79f6a4b0705211729j3ff17d60v610fab7f5e135303@mail.gmail.com>
	<E22A8442-E00D-4732-9D80-EE61C75732B7@uiuc.edu>
Message-ID: <ebf5eb170706151237x1eeda0e6y728384715cb6a21a@mail.gmail.com>

Patches waiting in Bugzilla (Bug #2299).  Changes:

-Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for
prokaryotic reports (Glimmer2/Glimmer3)
-Bio::Tools::Glimmer now produces features with Fuzzy or Split
locations as appropriate (partial or circular/wraparound predictions)
-Bio::Tools::Glimmer goes through the Glimmer3 .detail file to pull out
sequence lengths
-Bio::Tools::Run::Glimmer passes along the sequence length to
Bio::Tools::Glimmer for Glimmer2

I should probably modify Bio::Tools::Genemark to use
Bio::SeqFeature::Generic features for prokaryotic reports, to be
consistent, but this is more likely to surprise people.  If nobody
screams about the change to Bio::Tools::Glimmer, I'll do it at some
point.

On 5/21/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On May 21, 2007, at 7:29 PM, Torsten Seemann wrote:
>
> >> glimmer2/3 both assume the genome is circular by default (I'm
> >> assuming since Glimmer2/3 are used for bacterial genomes).  Acc. to
> >> the Glimmer3 release notes the detail file has the information in the
> >> header; from the Glimmer3 data used for tests:
> >
> > You beat me to the reply Chris - yes, Glimmer2/3 assume circular
> > chromosome by default. I had forgotten about this in earlier
> > discussions of the new Glimmer parsers as I normally run it in
> > --linear / -L mode (even if I know it is circular) because it is
> > easier to handle, and our sequencer/assembler team usually gets the
> > origin of replication right.
> >
> >> Command:  /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../BCTDNA
> >> Glimmer3.icm Glimmer3
> >
> > I did a double-take here - that's the path to my Glimmer3
> > installation! It took me a couple of minutes to realise that you got
> > it from the bioperl test data I created. D'oh! :-)
>
> Yep, I forgot about that!
>
> >> There are options available for glimmer3 (-L, -X) that specify a
> >> linear sequence or allow ORFs to extend past the end of the sequence
> >> analyzed (the latter assumes a linear sequence).
> >
> > If the -L mode should produce Bio::Location::Split objects, I guess if
> > -X is used
> > it should produce Bio::Location::Fuzzy objects too...
> >
> > --Torsten
>
> True, didn't think about that one.  Def. something to consider adding
> in.
>
> chris
>
>
>

From cjfields at uiuc.edu  Fri Jun 15 16:55:06 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 15:55:06 -0500
Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap
	start and stop coordinates??
In-Reply-To: <ebf5eb170706151237x1eeda0e6y728384715cb6a21a@mail.gmail.com>
References: <CED81D34E37D5043A1211565277A51E507E23161@exchkc02.stowers-institute.org>
	<79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu>
	<ebf5eb170705161211m6fb570b5r86ee055299993172@mail.gmail.com>
	<B012903E-7C0F-4E34-9BFE-E551855B6C62@uiuc.edu>
	<ebf5eb170705211348w57c37f18oeb128656c446cff@mail.gmail.com>
	<62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu>
	<ebf5eb170705211421w244933fcu4db8ba748653c090@mail.gmail.com>
	<9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu>
	<a79f6a4b0705211729j3ff17d60v610fab7f5e135303@mail.gmail.com>
	<E22A8442-E00D-4732-9D80-EE61C75732B7@uiuc.edu>
	<ebf5eb170706151237x1eeda0e6y728384715cb6a21a@mail.gmail.com>
Message-ID: <D09AF2F1-1459-4B6B-A3ED-85CEDE34D7B6@uiuc.edu>

I'll try getting to that tonight.  Been pretty tied up lately...

chris

On Jun 15, 2007, at 2:37 PM, Mark Johnson wrote:

> Patches waiting in Bugzilla (Bug #2299).  Changes:
>
> -Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for
> prokaryotic reports (Glimmer2/Glimmer3)
> -Bio::Tools::Glimmer now produces features with Fuzzy or Split
> locations as appropriate (partial or circular/wraparound predictions)
> -Bio::Tools:Glimmer goes through the Glimmer3 .detail file to pull out
> sequence lengths
> -Bio::Tools::Run::Glimmer passes along the sequence length to
> Bio::Tools::Glimmer for Glimmer2
>
> I should probably modify Bio::Tools::Genemark to use
> Bio::SeqFeature::Generic features for prokaryotic reports, to be
> consistent, but this is more likely to surprise people.  If nobody
> screams about the change to Bio::Tools::Glimmer, I'll do it at some
> point.
>
> On 5/21/07, Chris Fields <cjfields at uiuc.edu> wrote:
>>
>> On May 21, 2007, at 7:29 PM, Torsten Seemann wrote:
>>
>>>> glimmer2/3 both assume the genome is circular by default (I'm
>>>> assuming since Glimmer2/3 are used for bacterial genomes).  Acc. to
>>>> the Glimmer3 release notes the detail file has the information  
>>>> in the
>>>> header; from the Glimmer3 data used for tests:
>>>
>>> You beat me to the reply Chris - yes, Glimmer2/3 assume circular
>>> chromosome by default. I had forgotten about this in earlier
>>> discussions of the new Glimmer parsers as I normally run it in
>>> --linear / -L mode (even if I know it is circular) because it is
>>> easier to handle, and our sequencer/assembler team usually gets the
>>> origin of replication right.
>>>
>>>> Command:  /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../ 
>>>> BCTDNA
>>>> Glimmer3.icm Glimmer3
>>>
>>> I did a double-take here - that's the path to my Glimmer3
>>> installation! It took me a couple of minutes to realise that you got
>>> it from the bioperl test data I created. D'oh! :-)
>>
>> Yep, I forgot about that!
>>
>>>> There are options available for glimmer3 (-L, -X) that specify a
>>>> linear sequence or allow ORFs to extend past the end of the  
>>>> sequence
>>>> analyzed (the latter assumes a linear sequence).
>>>
>>> If the -L mode should produce Bio::Location::Split objects, I  
>>> guess if
>>> -X is used
>>> it should produce Bio::Location::Fuzzy objects too...
>>>
>>> --Torsten
>>
>> True, didn't think about that one.  Def. something to consider adding
>> in.
>>
>> chris
>>
>>
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From rvos at interchange.ubc.ca  Fri Jun 15 17:08:17 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Fri, 15 Jun 2007 14:08:17 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
Message-ID: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>

Hi,

I would very much prefer it if bioperl moved to svn. I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and moving it to an OBF repository, but I'd rather not go back to CVS.

Rutger


-----Original Message-----

> Date: Fri Jun 15 07:56:23 PDT 2007
> From: "Chris Fields" <cjfields at uiuc.edu>
> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
> To: "Sendu Bala" <bix at sendu.me.uk>
>
> 
> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
> 
> >>>> ...
> >>> Can we do any sort of massive conversion at some logical timepoint.
> >>> Probably after a branch release or something?  Because it basically
> >>> means we're going to have differences on nearly every line which is
> >>> going to make diff-ing difficult when debugging old/new versions.
> >>> Maybe it is not a problem because we aren't introducing and new  
> >>> bugs!
> >
> > Sorry, can you clarify the problem you envisage? And why would  
> > making a branch release help?
> 
> Maybe the worry is that mass conversion in such a large codebase  
> could lead to hard-to-locate bugs.  Shouldn't occur but who knows w/o  
> trying?
> 
> >> I agree; if we intend on doing this it should be all at once,  
> >> maybe  on a branch dedicated to ensure that code changes don't  
> >> tank tests  (they shouldn't but one never knows).  We would then  
> >> need a script up- and-running that tidies everything up prior to  
> >> commits (though what  happens if perltidy tanks?...).
> >> Sendu, up for it?
> >
> > If its going to be difficult and a hassle, for such an unnecessary  
> > thing I'm not sure its worth it. There are more pressing things to  
> > be done for Bioperl.
> >
> > If I can just run perltidy on the entire package and commit, I'd do  
> > it. If that's not appropriate, I won't.
> 
> The choices aren't necessarily all or nothing.  What about voluntary,  
> recommended use of a perltidy config file included with the  
> distribution, with additional 'caveats'?  See my response to Sean.
> 
> >>>> About svn
> > [snip]
> >> Stepped into that one, didn't I!  I'll look into how much effort  
> >> is  involved and try getting something going in the next month or  
> >> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as  
> >> well but it  might be worth looking into.
> >
> > I'd put this in the unnecessary-but-nice category as well. If it  
> > will be as easy as my ->new change, go ahead. If not, there are  
> > more pressing matters (POD fixing, test script updating and  
> > finishing...).
> 
> A few other open-bio projects have actively discussed a CVS->SVN  
> migration (BioRuby and I think BioPython, though the latter could be  
> wrong).  As I said, "it might be worth looking into" to weigh the  
> pros/cons, get others opinions from others who have made the  
> transition, etc.  We could, as Jason suggested, even set up a tester  
> SVN w/o making it the default codebase (lock it off to a few testers,  
> have CVS commits automatically/manually carry over to SVN, etc).
> 
> I agree with you that it's not feasible to switch over prior to a  
> release and that there are more pressing issues, but it doesn't hurt  
> having an open discussion about it.
> 
> chris
> 


From spiros at lokku.com  Fri Jun 15 17:40:32 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Fri, 15 Jun 2007 22:40:32 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>

On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
> Hi,
>
> I would very much prefer it if bioperl moved to svn. I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and move it to an OBF repository, but I'd rather not go back to CVS.
>
> Rutger
>

I second that, SVN seems like the reasonable choice. I would be more
than happy to help out as well.

Spiros

>
> -----Original Message-----
>
> > Date: Fri Jun 15 07:56:23 PDT 2007
> > From: "Chris Fields" <cjfields at uiuc.edu>
> > Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
> > To: "Sendu Bala" <bix at sendu.me.uk>
> >
> >
> > On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
> >
> > >>>> ...
> > >>> Can we do any sort of massive conversion at some logical timepoint.
> > >>> Probably after a branch release or something?  Because it basically
> > >>> means we're going to have differences on nearly every line which is
> > >>> going to make diff-ing difficult when debugging old/new versions.
> > >>> Maybe it is not a problem because we aren't introducing and new
> > >>> bugs!
> > >
> > > Sorry, can you clarify the problem you envisage? And why would
> > > making a branch release help?
> >
> > Maybe the worry is that mass conversion in such a large codebase
> > could lead to hard-to-locate bugs.  Shouldn't occur but who knows w/o
> > trying?
> >
> > >> I agree; if we intend on doing this it should be all at once,
> > >> maybe  on a branch dedicated to ensure that code changes don't
> > >> tank tests  (they shouldn't but one never knows).  We would then
> > >> need a script up- and-running that tidies everything up prior to
> > >> commits (though what  happens if perltidy tanks?...).
> > >> Sendu, up for it?
> > >
> > > If its going to be difficult and a hassle, for such an unnecessary
> > > thing I'm not sure its worth it. There are more pressing things to
> > > be done for Bioperl.
> > >
> > > If I can just run perltidy on the entire package and commit, I'd do
> > > it. If that's not appropriate, I won't.
> >
> > The choices aren't necessarily all or nothing.  What about voluntary,
> > recommended use of a perltidy config file included with the
> > distribution, with additional 'caveats'?  See my response to Sean.
> >
> > >>>> About svn
> > > [snip]
> > >> Stepped into that one, didn't I!  I'll look into how much effort
> > >> is  involved and try getting something going in the next month or
> > >> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
> > >> well but it  might be worth looking into.
> > >
> > > I'd put this in the unnecessary-but-nice category as well. If it
> > > will be as easy as my ->new change, go ahead. If not, there are
> > > more pressing matters (POD fixing, test script updating and
> > > finishing...).
> >
> > A few other open-bio projects have actively discussed a CVS->SVN
> > migration (BioRuby and I think BioPython, though the latter could be
> > wrong).  As I said, "it might be worth looking into" to weigh the
> > pros/cons, get others opinions from others who have made the
> > transition, etc.  We could, as Jason suggested, even set up a tester
> > SVN w/o making it the default codebase (lock it off to a few testers,
> > have CVS commits automatically/manually carry over to SVN, etc).
> >
> > I agree with you that it's not feasible to switch over prior to a
> > release and that there are more pressing issues, but it doesn't hurt
> > having an open discussion about it.
> >
> > chris
> >

From hlapp at gmx.net  Fri Jun 15 18:10:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 18:10:25 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
Message-ID: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>

So should we set up a sandbox svn repository and those who would like  
to help out

- take shots at migrating bioperl (any current cvs snapshot will do)  
to svn

- you document what you find yourself having to do in trying to make  
it work

- you report back when you think you have a working repository

- we all get a defined amount of time to test to our hearts' content,  
say 2 weeks

- you fix issues that were encountered

- report back when done, followed by retesting for, say 1 week

- iterate previous 2 steps until no issues and no objections to  
migration

- two more weeks of warning period to all developers to commit all  
outstanding changes, or reapply them to a future svn checkout

- pull the trigger by locking down cvs, applying the migration as  
worked out before, and announcing that BioPerl is now on svn

- get free beer at next BOSC (I'll pay if no one else does)

This may not be precisely the plan that needs to be executed, but  
it's probably somewhere along those lines.
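
To make the first step a bit more concrete, here is a rough sketch of a
trial conversion, assuming cvs2svn is the tool used (the thread does not
name one) and that a copy of the CVS repository files, not just a
checkout, is available.  All paths and the function names are
hypothetical, and everything is a dry run unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of a sandbox CVS -> SVN trial conversion. Every path below is a
# hypothetical placeholder; cvs2svn is an assumed tool choice.
CVS_SNAPSHOT="${CVS_SNAPSHOT:-/tmp/bioperl-cvs-snapshot/bioperl-live}"
SVN_SANDBOX="${SVN_SANDBOX:-/tmp/bioperl-svn-sandbox}"

# Print the command instead of executing it unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

migrate_sandbox() {
    # Convert the CVS repository snapshot into a fresh SVN repository.
    run cvs2svn --svnrepos "$SVN_SANDBOX" "$CVS_SNAPSHOT"
    # Check out trunk from the sandbox so testers can run the test suite.
    run svn checkout "file://$SVN_SANDBOX/trunk" bioperl-svn-test
}

# Usage: migrate_sandbox            (prints the two commands)
#        DRY_RUN=0 migrate_sandbox  (actually performs the conversion)
```

Whatever the volunteers end up doing differently from this would be
exactly the material worth documenting in the second step.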

If there are volunteers who would like to spearhead this, then power  
to you - I think everyone is in favor and the advantages of svn don't  
need to be debated. The only reason it hasn't happened yet is because  
no one has stepped forward who would have the energy.

I'm sure ChrisD will gladly create the svn sandbox if we have  
volunteers lined up to get going.

	-hilmar

On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:

> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>> Hi,
>>
>> I would very much prefer it if bioperl moved to svn. I'm  
>> considering merging Bio::Phylo (to the extent that that's possible/ 
>> practical) with bioperl and move it to an OBF repository, but I'd  
>> rather not go back to CVS.
>>
>> Rutger
>>
>
> I second that, SVN seems like the reasonable choice. I would be more
> than happy to help out as well.
>
> Spiros
>
>>
>> -----Original Message-----
>>
>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>
>>>
>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>
>>>>>>> ...
>>>>>> Can we do any sort of massive conversion at some logical  
>>>>>> timepoint.
>>>>>> Probably after a branch release or something?  Because it  
>>>>>> basically
>>>>>> means we're going to have differences on nearly every line  
>>>>>> which is
>>>>>> going to make diff-ing difficult when debugging old/new versions.
>>>>>> Maybe it is not a problem because we aren't introducing and new
>>>>>> bugs!
>>>>
>>>> Sorry, can you clarify the problem you envisage? And why would
>>>> making a branch release help?
>>>
>>> Maybe the worry is that mass conversion in such a large codebase
>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows  
>>> w/o
>>> trying?
>>>
>>>>> I agree; if we intend on doing this it should be all at once,
>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>> need a script up- and-running that tidies everything up prior to
>>>>> commits (though what  happens if perltidy tanks?...).
>>>>> Sendu, up for it?
>>>>
>>>> If its going to be difficult and a hassle, for such an unnecessary
>>>> thing I'm not sure its worth it. There are more pressing things to
>>>> be done for Bioperl.
>>>>
>>>> If I can just run perltidy on the entire package and commit, I'd do
>>>> it. If that's not appropriate, I won't.
>>>
>>> The choices aren't necessarily all or nothing.  What about  
>>> voluntary,
>>> recommended use of a perltidy config file included with the
>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>
>>>>>>> About svn
>>>> [snip]
>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>> is  involved and try getting something going in the next month or
>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>> well but it  might be worth looking into.
>>>>
>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>> more pressing matters (POD fixing, test script updating and
>>>> finishing...).
>>>
>>> A few other open-bio projects have actively discussed a CVS->SVN
>>> migration (BioRuby and I think BioPython, though the latter could be
>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>> pros/cons, get others opinions from others who have made the
>>> transition, etc.  We could, as Jason suggested, even set up a tester
>>> SVN w/o making it the default codebase (lock it off to a few  
>>> testers,
>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>
>>> I agree with you that it's not feasible to switch over prior to a
>>> release and that there are more pressing issues, but it doesn't hurt
>>> having an open discussion about it.
>>>
>>> chris
>>>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jason at bioperl.org  Fri Jun 15 18:23:15 2007
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 15 Jun 2007 15:23:15 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <AB7E0918-0EBA-47C9-8A64-FB8709230F2A@bioperl.org>

Sounds like a plan. I'll be curious to see if we can still keep
anonymous CVS working, as I'd like to not have to pull the plug on
that.  There are some threads out on the web about how to do this
with a commit rule on SVN.

Also, can someone who is close enough to all the SVN benefits please  
elaborate how it is going to help _this_ project?
Perhaps you would be willing to put a few words up -- like on (a to  
be created):
http://bioperl.org/wiki/BioPerl:Version_control_changeover

This way, if anonymous CVS is broken and/or developers who haven't
been paying attention come back to commit code and ask why things
changed, we don't have to compose long emails... =)

-jason
On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote:

> So should we set up a sandbox svn repository and those who would like
> to help out
>
> - take shots at migrating bioperl (any current cvs snapshot will do)
> to svn
>
> - you document what you find yourself having to do in trying to make
> it work
>
> - you report back when you think you have a working repository
>
> - we all get a defined amount of time to test to our hearts' content,
> say 2 weeks
>
> - you fix issues that were encountered
>
> - report back when done, followed by retesting for, say 1 week
>
> - iterate previous 2 steps until no issues and no objections to
> migration
>
> - two more weeks of warning period to all developers to commit all
> outstanding changes, or reapply them to a future svn checkout
>
> - pull the trigger by locking down cvs, applying the migration as
> worked out before, and announcing that BioPerl is now on svn
>
> - get free beer at next BOSC (I'll pay if no one else does)
>
> This may not be precisely the plan that needs to be executed, but
> it's probably somewhere along those lines.
>
> If there are volunteers who would like to spearhead this, then power
> to you - I think everyone is in favor and the advantages of svn don't
> need to be debated. The only reason it hasn't happened yet is because
> no one has stepped forward who would have the energy.
>
> I'm sure ChrisD will gladly create the svn sandbox if we have
> volunteers lined up to get going.
>
> 	-hilmar
>
> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
>
>> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>>> Hi,
>>>
>>> I would very much prefer it if bioperl moved to svn. I'm
>>> considering merging Bio::Phylo (to the extent that that's possible/
>>> practical) with bioperl and move it to an OBF repository, but I'd
>>> rather not go back to CVS.
>>>
>>> Rutger
>>>
>>
>> I second that, SVN seems like the reasonable choice. I would be more
>> than happy to help out as well.
>>
>> Spiros
>>
>>>
>>> -----Original Message-----
>>>
>>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>>
>>>>
>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>>
>>>>>>>> ...
>>>>>>> Can we do any sort of massive conversion at some logical
>>>>>>> timepoint.
>>>>>>> Probably after a branch release or something?  Because it
>>>>>>> basically
>>>>>>> means we're going to have differences on nearly every line
>>>>>>> which is
>>>>>>> going to make diff-ing difficult when debugging old/new  
>>>>>>> versions.
>>>>>>> Maybe it is not a problem because we aren't introducing and new
>>>>>>> bugs!
>>>>>
>>>>> Sorry, can you clarify the problem you envisage? And why would
>>>>> making a branch release help?
>>>>
>>>> Maybe the worry is that mass conversion in such a large codebase
>>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows
>>>> w/o
>>>> trying?
>>>>
>>>>>> I agree; if we intend on doing this it should be all at once,
>>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>>> need a script up- and-running that tidies everything up prior to
>>>>>> commits (though what  happens if perltidy tanks?...).
>>>>>> Sendu, up for it?
>>>>>
>>>>> If its going to be difficult and a hassle, for such an unnecessary
>>>>> thing I'm not sure its worth it. There are more pressing things to
>>>>> be done for Bioperl.
>>>>>
>>>>> If I can just run perltidy on the entire package and commit,  
>>>>> I'd do
>>>>> it. If that's not appropriate, I won't.
>>>>
>>>> The choices aren't necessarily all or nothing.  What about
>>>> voluntary,
>>>> recommended use of a perltidy config file included with the
>>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>>
>>>>>>>> About svn
>>>>> [snip]
>>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>>> is  involved and try getting something going in the next month or
>>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>>> well but it  might be worth looking into.
>>>>>
>>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>>> more pressing matters (POD fixing, test script updating and
>>>>> finishing...).
>>>>
>>>> A few other open-bio projects have actively discussed a CVS->SVN
>>>> migration (BioRuby and I think BioPython, though the latter  
>>>> could be
>>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>>> pros/cons, get others opinions from others who have made the
>>>> transition, etc.  We could, as Jason suggested, even set up a  
>>>> tester
>>>> SVN w/o making it the default codebase (lock it off to a few
>>>> testers,
>>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>>
>>>> I agree with you that it's not feasible to switch over prior to a
>>>> release and that there are more pressing issues, but it doesn't  
>>>> hurt
>>>> having an open discussion about it.
>>>>
>>>> chris
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From sheris at eps.berkeley.edu  Fri Jun 15 18:58:12 2007
From: sheris at eps.berkeley.edu (Sheri Simmons)
Date: Fri, 15 Jun 2007 15:58:12 -0700
Subject: [Bioperl-l] seq doesn't validate error
Message-ID: <200706151558.12911.sheris@eps.berkeley.edu>

Hi,
I'm getting an error as follows when I try to reverse complement a sequence 
string stored in a hash of arrays. The storage code is: 

		$nstarthash{$key} = [$sortchecks[0], join("", @nseq),
		                     join("", @{$seqhash{$key}})];

the sequence of interest is the element at index 1. 

Later, I try to retrieve this string for a subset of keys so I can reverse 
complement it based on input from another hash (%complement):

			my %revcomphash = map { my $read = $_;
			grep $complement{$read} eq 'C', %complement;
			{$_, (Bio::Seq->new(-seq =>$nstarthash{$_}[1]))->revcom->seq()};}
			 keys(%nstarthash); 


I get the following warning (long sequence edited for clarity):

-- -------------------- WARNING ---------------------
MSG: seq doesn't validate, mismatch is 1
---------------------------------------------------

------------- EXCEPTION  -------------
MSG: Attempting to set the sequence to [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] 
which does not look healthy
STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498
STACK toplevel ../quality_wrapper.pl:103

I cannot find any non-allowed characters in the sequence, and the 
de-referencing appears to work correctly. Can anyone help me?
I'm using the latest Bioperl installation (1.5.2) with ActivePerl5.8 on a 
Mepis 6.5 system. 

Thanks
Sheri

---------------------------------------------------------------------
Sheri Simmons
Department of Earth and Planetary Sciences
University of California, Berkeley
Berkeley, CA 94720-4767

From Kevin.M.Brown at asu.edu  Fri Jun 15 19:11:34 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 15 Jun 2007 16:11:34 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <200706151558.12911.sheris@eps.berkeley.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
Message-ID: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>

> I'm getting an error as follows when I try to reverse 
> complement a sequence string stored in a hash of arrays. The 
> storage code is: 
> 
> 		$nstarthash{$key} = [$sortchecks[0], join("", 
> @nseq), 		
> join("",@{$seqhash{$key}})];
> 
> the sequence of interest is the element at index 1. 
> 
> Later, I try to retrieve this string for a subset of keys so 
> I can reverse complement it based on input from another hash 
> (%complement):
> 
> 			my %revcomphash = map { my $read = $_;
> 			grep $complement{$read} eq 'C', %complement;
> 			{$_, (Bio::Seq->new(-seq 
> =>$nstarthash{$_}[1]))->revcom->seq()};}
> 			 keys(%nstarthash); 
> 
> 
> I get the following warning (long sequence edited for clarity):
> 
> -- -------------------- WARNING ---------------------
> MSG: seq doesn't validate, mismatch is 1
> ---------------------------------------------------
> 
> ------------- EXCEPTION  -------------
> MSG: Attempting to set the sequence to 
> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
> which does not look healthy
> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK 
> toplevel ../quality_wrapper.pl:103
> 
> I cannot find any non-allowed characters in the sequence, and 
> the de-referencing appears to work correctly. Can anyone help me?
> I'm using the latest Bioperl installation (1.5.2) with 
> ActivePerl5.8 on a Mepis 6.5 system. 

Try telling the Bio::Seq object what alphabet to use when creating it.
I tend to create them like:

Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')
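
A side note on the quoted map/grep, separate from the alphabet question: a grep called in void context throws its result away, so the map still visits every key of %nstarthash rather than only the reads flagged 'C'. Below is a minimal sketch of one way to filter first, assuming %complement holds 'C' for reads that should be reverse complemented. The sample data, the 'U' flag and the revcomp() helper are hypothetical. revcomp() is a plain-Perl stand-in so the sketch runs without BioPerl; in real code the Bio::Seq->new(...)->revcom->seq call would go in its place.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Seq->new(...)->revcom->seq.
# Handles only A, C, G, T and N (upper or lower case).
sub revcomp {
    my ($seq) = @_;
    my $rc = reverse $seq;               # scalar context: reverses the string
    $rc =~ tr/ACGTNacgtn/TGCANtgcan/;    # complement each base
    return $rc;
}

# Made-up sample data in the same shape as the original post:
# the sequence of interest sits at index 1 of each array ref.
my %nstarthash = (
    read1 => [ 10, 'AACG', 'raw1' ],
    read2 => [ 20, 'GGTT', 'raw2' ],
);
my %complement = ( read1 => 'C', read2 => 'U' );

# Filter the keys with grep first, then build the key/value pairs with map.
my %revcomphash =
    map  { ( $_ => revcomp( $nstarthash{$_}[1] ) ) }
    grep { defined $complement{$_} && $complement{$_} eq 'C' }
    keys %nstarthash;

print "$_ => $revcomphash{$_}\n" for sort keys %revcomphash;
```

With the sample data above, only read1 ends up in %revcomphash.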


From sheris at eps.berkeley.edu  Fri Jun 15 19:53:04 2007
From: sheris at eps.berkeley.edu (Sheri Simmons)
Date: Fri, 15 Jun 2007 16:53:04 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
	<1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
Message-ID: <200706151653.04135.sheris@eps.berkeley.edu>

Thanks for the suggestion, but that still gives the same error as before.
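
One more thing that may be worth checking is whether the stored string carries characters that do not show up when printed, such as spaces, newlines or a carriage return. Here is a minimal sketch, assuming that anything other than a letter or the symbols '-', '.' and '*' is suspect. That allowed set is an assumption for illustration and is not copied from BioPerl's own validation rule; find_suspect_chars() is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: list every character that is not a letter or one of
# '-', '.', '*'. The allowed set is an assumption, not BioPerl's exact rule.
sub find_suspect_chars {
    my ($seq) = @_;
    my @hits;
    while ( $seq =~ /([^A-Za-z\-\.\*])/g ) {
        # pos() points just past the match, so subtract 1 for its offset
        push @hits, sprintf( "offset %d: 0x%02X", pos($seq) - 1, ord($1) );
    }
    return @hits;
}

my $clean = "GCCCCTGTAATCGCTTTTATATCGTCAGCGATC";
my $dirty = "GCCC CTGT\n";    # made-up example with a space and a newline

print "clean: ", scalar( find_suspect_chars($clean) ), " suspect characters\n";
print "dirty: $_\n" for find_suspect_chars($dirty);
```

Running the same check on $nstarthash{$key}[1] just before the Bio::Seq->new call would show whether the string that reaches the constructor is what it appears to be.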

On Friday 15 June 2007 4:11 pm, Kevin Brown wrote:
> > I'm getting an error as follows when I try to reverse
> > complement a sequence string stored in a hash of arrays. The
> > storage code is:
> >
> > 		$nstarthash{$key} = [$sortchecks[0], join("",
> > @nseq),
> > join("",@{$seqhash{$key}})];
> >
> > the sequence of interest is the element at index 1.
> >
> > Later, I try to retrieve this string for a subset of keys so
> > I can reverse complement it based on input from another hash
> > (%complement):
> >
> > 			my %revcomphash = map { my $read = $_;
> > 			grep $complement{$read} eq 'C', %complement;
> > 			{$_, (Bio::Seq->new(-seq
> > =>$nstarthash{$_}[1]))->revcom->seq()};}
> > 			 keys(%nstarthash);
> >
> >
> > I get the following warning (long sequence edited for clarity):
> >
> > -- -------------------- WARNING ---------------------
> > MSG: seq doesn't validate, mismatch is 1
> > ---------------------------------------------------
> >
> > ------------- EXCEPTION  -------------
> > MSG: Attempting to set the sequence to
> > [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
> > which does not look healthy
> > STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
> > STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
> > STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK
> > toplevel ../quality_wrapper.pl:103
> >
> > I cannot find any non-allowed characters in the sequence, and
> > the de-referencing appears to work correctly. Can anyone help me?
> > I'm using the latest Bioperl installation (1.5.2) with
> > ActivePerl5.8 on a Mepis 6.5 system.
>
> Try telling the Bio::Seq object what alphabet to use when creating it.
> I tend to create them like:
>
> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')

-- 
Sheri Simmons
Department of Earth and Planetary Sciences
University of California, Berkeley
Berkeley, CA 94720-4767

From hlapp at gmx.net  Fri Jun 15 21:27:42 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 21:27:42 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
Message-ID: <EDC569BF-2E4B-4BFC-916A-665CC2FFABAF@gmx.net>

Could you post a ticket to the helpdesk (support at open-bio.org)?

	-hilmar

On Jun 15, 2007, at 9:08 PM, George Hartzell wrote:

> Hilmar Lapp writes:
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>
> Free Beer, huh?  Do you deliver?
>
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
>
> thanks!
>
> g.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Fri Jun 15 21:08:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Fri, 15 Jun 2007 21:08:32 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <18035.14352.963113.473274@almost.alerce.com>

Hilmar Lapp writes:
 > So should we set up a sandbox svn repository and those who would like  
 > to help out
 > 
 > - take shots at migrating bioperl (any current cvs snapshot will do)  
 > to svn

Free Beer, huh?  Do you deliver?

Can you package up a tarball of the cvs repository itself (bzip or gzip
would save some time)?

thanks!

g.

From cjfields at uiuc.edu  Fri Jun 15 21:42:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:42:05 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
Message-ID: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>

The browsable CVS has a 'Download tarball' link if that helps.

http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?cvsroot=bioperl

chris

On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:

> Hilmar Lapp writes:
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>
> Free Beer, huh?  Do you deliver?
>
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
>
> thanks!
>
> g.


From cjfields at uiuc.edu  Fri Jun 15 21:50:09 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:50:09 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>

I'll help out to the extent I can w/o having the SVN know-how.  We  
need (as Jason points out) someone who can detail the benefits and  
maybe keep an updated journal on the wiki.

I believe at least one or two of the other Bio* projects have contemplated
moving over to SVN, which may be worth checking out.

chris

On Jun 15, 2007, at 5:10 PM, Hilmar Lapp wrote:

> So should we set up a sandbox svn repository and those who would like
> to help out
>
> - take shots at migrating bioperl (any current cvs snapshot will do)
> to svn
>
> - you document what you find yourself having to do in trying to make
> it work
>
> - you report back when you think you have a working repository
>
> - we all get a defined amount of time to test to our hearts' content,
> say 2 weeks
>
> - you fix issues that were encountered
>
> - report back when done, followed by retesting for, say 1 week
>
> - iterate previous 2 steps until no issues and no objections to
> migration
>
> - two more weeks of warning period to all developers to commit all
> outstanding changes, or reapply them to a future svn checkout
>
> - pull the trigger by locking down cvs, applying the migration as
> worked out before, and announcing that BioPerl is now on svn
>
> - get free beer at next BOSC (I'll pay if no one else does)
>
> This may not be precisely the plan that needs to be executed, but
> it's probably somewhere along those lines.
>
> If there are volunteers who would like to spearhead this, then power
> to you - I think everyone is in favor and the advantages of svn don't
> need to be debated. The only reason it hasn't happened yet is because
> no one has stepped forward who would have the energy.
>
> I'm sure ChrisD will gladly create the svn sandbox if we have
> volunteers lined up to get going.
>
> 	-hilmar
>
> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
>
>> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>>> Hi,
>>>
>>> I would very much prefer it if bioperl moved to svn. I'm
>>> considering merging Bio::Phylo (to the extent that that's possible/
>>> practical) with bioperl and move it to an OBF repository, but I'd
>>> rather not go back to CVS.
>>>
>>> Rutger
>>>
>>
>> I second that, SVN seems like the reasonable choice. I would be more
>> than happy to help out as well.
>>
>> Spiros
>>
>>>
>>> -----Original Message-----
>>>
>>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>>
>>>>
>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>>
>>>>>>>> ...
>>>>>>> Can we do any sort of massive conversion at some logical
>>>>>>> timepoint.
>>>>>>> Probably after a branch release or something?  Because it
>>>>>>> basically
>>>>>>> means we're going to have differences on nearly every line
>>>>>>> which is
>>>>>>> going to make diff-ing difficult when debugging old/new  
>>>>>>> versions.
>>>>>>> Maybe it is not a problem because we aren't introducing and new
>>>>>>> bugs!
>>>>>
>>>>> Sorry, can you clarify the problem you envisage? And why would
>>>>> making a branch release help?
>>>>
>>>> Maybe the worry is that mass conversion in such a large codebase
>>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows
>>>> w/o
>>>> trying?
>>>>
>>>>>> I agree; if we intend on doing this it should be all at once,
>>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>>> need a script up- and-running that tidies everything up prior to
>>>>>> commits (though what  happens if perltidy tanks?...).
>>>>>> Sendu, up for it?
>>>>>
>>>>> If its going to be difficult and a hassle, for such an unnecessary
>>>>> thing I'm not sure its worth it. There are more pressing things to
>>>>> be done for Bioperl.
>>>>>
>>>>> If I can just run perltidy on the entire package and commit,  
>>>>> I'd do
>>>>> it. If that's not appropriate, I won't.
>>>>
>>>> The choices aren't necessarily all or nothing.  What about
>>>> voluntary,
>>>> recommended use of a perltidy config file included with the
>>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>>
>>>>>>>> About svn
>>>>> [snip]
>>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>>> is  involved and try getting something going in the next month or
>>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>>> well but it  might be worth looking into.
>>>>>
>>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>>> more pressing matters (POD fixing, test script updating and
>>>>> finishing...).
>>>>
>>>> A few other open-bio projects have actively discussed a CVS->SVN
>>>> migration (BioRuby and I think BioPython, though the latter  
>>>> could be
>>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>>> pros/cons, get others opinions from others who have made the
>>>> transition, etc.  We could, as Jason suggested, even set up a  
>>>> tester
>>>> SVN w/o making it the default codebase (lock it off to a few
>>>> testers,
>>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>>
>>>> I agree with you that it's not feasible to switch over prior to a
>>>> release and that there are more pressing issues, but it doesn't  
>>>> hurt
>>>> having an open discussion about it.
>>>>
>>>> chris
>>>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Fri Jun 15 22:12:55 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 22:12:55 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
Message-ID: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>

I think he meant the cvs repository itself, containing all the change  
data. -hilmar

On Jun 15, 2007, at 9:42 PM, Chris Fields wrote:

> The browsable CVS has a 'Download tarball' link if that helps.
>
> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/? 
> cvsroot=bioperl
>
> chris
>
> On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:
>
>> Hilmar Lapp writes:
>>> So should we set up a sandbox svn repository and those who would  
>>> like
>>> to help out
>>>
>>> - take shots at migrating bioperl (any current cvs snapshot will do)
>>> to svn
>>
>> Free Beer, huh?  Do you deliver?
>>
>> Can you package up a tarball of the cvs repository (bzip or gzip  
>> would
>> save some time) itself?
>>
>> thanks!
>>
>> g.
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Jun 15 22:37:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 21:37:55 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
	<6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
Message-ID: <F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>

Ah, got it.  Sorry.

George, planning on taking this up?

chris

On Jun 15, 2007, at 9:12 PM, Hilmar Lapp wrote:

> I think he meant the cvs repository itself, containing all the  
> change data. -hilmar
>
> On Jun 15, 2007, at 9:42 PM, Chris Fields wrote:
>
>> The browsable CVS has a 'Download tarball' link if that helps.
>>
>> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/? 
>> cvsroot=bioperl
>>
>> chris
>>
>> On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:
>>
>>> Hilmar Lapp writes:
>>>> So should we set up a sandbox svn repository and those who would  
>>>> like
>>>> to help out
>>>>
>>>> - take shots at migrating bioperl (any current cvs snapshot will  
>>>> do)
>>>> to svn
>>>
>>> Free Beer, huh?  Do you deliver?
>>>
>>> Can you package up a tarball of the cvs repository (bzip or gzip  
>>> would
>>> save some time) itself?
>>>
>>> thanks!
>>>
>>> g.
>>
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Sat Jun 16 04:20:57 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 16 Jun 2007 09:20:57 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
Message-ID: <46739D69.4090204@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George Hartzell wrote:
> Hilmar Lapp writes:
>  > So should we set up a sandbox svn repository and those who would like  
>  > to help out
>  > 
>  > - take shots at migrating bioperl (any current cvs snapshot will do)  
>  > to svn
> 
> Free Beer, huh?  Do you deliver?
> 
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
> 
> thanks!
> 
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Sounds like George might know what he's doing! I have a question about
setting up svn access. I believe access can be done in several ways,
over webdav, over ssh and probably others too. Do you have any knowledge
about the benefits of one over the other? I suppose I'm thinking of what
to implement to allow anonymous read access for users and authenticated
access for developers.

Nath

p.s. if you need any monkeys to do some work I'm happy to help out as
much as possible.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGc51pczuW2jkwy2gRAmi9AJ0XojVdh4ckXoc3bwVSmeNw95cR7QCfV+G9
Lb9NUEe4dkCakQ+Gc7Py98A=
=BG9m
-----END PGP SIGNATURE-----

From rvos at interchange.ubc.ca  Sat Jun 16 06:37:11 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Sat, 16 Jun 2007 03:37:11 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <15232024.1181990231860.JavaMail.myubc2@handel.my.ubc.ca>

I can volunteer some time to help out with this.

Rutger

-----Original Message-----

> Date: Fri Jun 15 15:10:25 PDT 2007
> From: "Hilmar Lapp" <hlapp at gmx.net>
> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
> To: spiros at lokku.com
>
> So should we set up a sandbox svn repository and those who would like  
> to help out
> 
> - take shots at migrating bioperl (any current cvs snapshot will do)  
> to svn
> 
> - you document what you find yourself having to do in trying to make  
> it work
> 
> - you report back when you think you have a working repository
> 
> - we all get a defined amount of time to test to our hearts' content,  
> say 2 weeks
> 
> - you fix issues that were encountered
> 
> - report back when done, followed by retesting for, say 1 week
> 
> - iterate previous 2 steps until no issues and no objections to  
> migration
> 
> - two more weeks of warning period to all developers to commit all  
> outstanding changes, or reapply them to a future svn checkout
> 
> - pull the trigger by locking down cvs, applying the migration as  
> worked out before, and announcing that BioPerl is now on svn
> 
> - get free beer at next BOSC (I'll pay if no one else does)
> 
> This may not be precisely the plan that needs to be executed, but  
> it's probably somewhere along those lines.
> 
> If there are volunteers who would like to spearhead this, then power  
> to you - I think everyone is in favor and the advantages of svn don't  
> need to be debated. The only reason it hasn't happened yet is because  
> no one has stepped forward who would have the energy.
> 
> I'm sure ChrisD will gladly create the svn sandbox if we have  
> volunteers lined up to get going.
> 
> 	-hilmar
> 
> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
> 
> > On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
> >> Hi,
> >>
> >> I would very much prefer it if bioperl moved to svn. I'm  
> >> considering merging Bio::Phylo (to the extent that that's possible/ 
> >> practical) with bioperl and move it to an OBF repository, but I'd  
> >> rather not go back to CVS.
> >>
> >> Rutger
> >>
> >
> > I second that, SVN seems like the reasonable choice. I would be more
> > than happy to help out as well.
> >
> > Spiros
> >
> >>
> >> -----Original Message-----
> >>
> >>> Date: Fri Jun 15 07:56:23 PDT 2007
> >>> From: "Chris Fields" <cjfields at uiuc.edu>
> >>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
> >>> To: "Sendu Bala" <bix at sendu.me.uk>
> >>>
> >>>
> >>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
> >>>
> >>>>>>> ...
> >>>>>> Can we do any sort of massive conversion at some logical  
> >>>>>> timepoint.
> >>>>>> Probably after a branch release or something?  Because it  
> >>>>>> basically
> >>>>>> means we're going to have differences on nearly every line  
> >>>>>> which is
> >>>>>> going to make diff-ing difficult when debugging old/new versions.
> >>>>>> Maybe it is not a problem because we aren't introducing and new
> >>>>>> bugs!
> >>>>
> >>>> Sorry, can you clarify the problem you envisage? And why would
> >>>> making a branch release help?
> >>>
> >>> Maybe the worry is that mass conversion in such a large codebase
> >>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows  
> >>> w/o
> >>> trying?
> >>>
> >>>>> I agree; if we intend on doing this it should be all at once,
> >>>>> maybe  on a branch dedicated to ensure that code changes don't
> >>>>> tank tests  (they shouldn't but one never knows).  We would then
> >>>>> need a script up- and-running that tidies everything up prior to
> >>>>> commits (though what  happens if perltidy tanks?...).
> >>>>> Sendu, up for it?
> >>>>
> >>>> If its going to be difficult and a hassle, for such an unnecessary
> >>>> thing I'm not sure its worth it. There are more pressing things to
> >>>> be done for Bioperl.
> >>>>
> >>>> If I can just run perltidy on the entire package and commit, I'd do
> >>>> it. If that's not appropriate, I won't.
> >>>
> >>> The choices aren't necessarily all or nothing.  What about  
> >>> voluntary,
> >>> recommended use of a perltidy config file included with the
> >>> distribution, with additional 'caveats'?  See my response to Sean.
> >>>
> >>>>>>> About svn
> >>>> [snip]
> >>>>> Stepped into that one, didn't I!  I'll look into how much effort
> >>>>> is  involved and try getting something going in the next month or
> >>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
> >>>>> well but it  might be worth looking into.
> >>>>
> >>>> I'd put this in the unnecessary-but-nice category as well. If it
> >>>> will be as easy as my ->new change, go ahead. If not, there are
> >>>> more pressing matters (POD fixing, test script updating and
> >>>> finishing...).
> >>>
> >>> A few other open-bio projects have actively discussed a CVS->SVN
> >>> migration (BioRuby and I think BioPython, though the latter could be
> >>> wrong).  As I said, "it might be worth looking into" to weigh the
> >>> pros/cons, get others opinions from others who have made the
> >>> transition, etc.  We could, as Jason suggested, even set up a tester
> >>> SVN w/o making it the default codebase (lock it off to a few  
> >>> testers,
> >>> have CVS commits automatically/manually carry over to SVN, etc).
> >>>
> >>> I agree with you that it's not feasible to switch over prior to a
> >>> release and that there are more pressing issues, but it doesn't hurt
> >>> having an open discussion about it.
> >>>
> >>> chris
> >>>
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> 
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================


From sdavis2 at mail.nih.gov  Sat Jun 16 07:21:47 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Sat, 16 Jun 2007 07:21:47 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
Message-ID: <4673C7CB.1030709@mail.nih.gov>

Chris Fields wrote:
> I'll help out to the extent I can w/o having the SVN know-how.  We  
> need (as Jason points out) someone who can detail the benefits and  
> maybe keep an updated journal on the wiki.
>
> I believe at least one or two of the other Bio* contemplated moving  
> over to SVN, which may be worth checking out.
>   
The bioconductor project is on SVN.  The project includes over 200 
packages (the equivalent of perl modules) with something around 150-200 
ACTIVE developers.  They also have a build system for several OSes that 
operates on a cron-like system with builds of several versions 
approximately daily.  Their system is running at something like revision 
30,000, so they have significant experience.  If anyone would like 
technical support, I can certainly ask the folks maintaining their site 
if they can give some input.  Let me know if anyone would like a contact 
person.

As for access, the typical access is over http (or https).  Access 
controls can be set up on the server side while allowing anonymous 
access for checkout.  There are many excellent SVN for every OS, so that 
should not be a problem. 

Sean

From cjfields at uiuc.edu  Sat Jun 16 10:02:35 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 16 Jun 2007 09:02:35 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4673C7CB.1030709@mail.nih.gov>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
Message-ID: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>


On Jun 16, 2007, at 6:21 AM, Sean Davis wrote:

> Chris Fields wrote:
>> I'll help out to the extent I can w/o having the SVN know-how.  We
>> need (as Jason points out) someone who can detail the benefits and
>> maybe keep an updated journal on the wiki.
>>
>> I believe at least one or two of the other Bio* contemplated moving
>> over to SVN, which may be worth checking out.
>>
> The bioconductor project is on SVN.  The project includes over 200
> packages (the equivalent of perl modules) with something around  
> 150-200
> ACTIVE developers.  They also have a build system for several OSes  
> that
> operates on a cron-like system with builds of several versions
> approximately daily.  Their system is running at something like  
> revision
> 30,000, so they have significant experience.  If anyone would like
> technical support, I can certainly ask the folks maintaining their  
> site
> if they can give some input.  Let me know if anyone would like a  
> contact
> person.
>
> As for access, the typical access is over http (or https).  Access
> controls can be set up on the server side while allowing anonymous
> access for checkout.  There are many excellent SVN for every OS, so  
> that
> should not be a problem.
>
> Sean

It looks like George Hartzell may be taking a crack at it, with  
Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
could have something testable relatively soon.  After that we'll need  
to work out a few other issues, basically what's on Hilmar's list.

chris


From hlapp at gmx.net  Sat Jun 16 10:40:08 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 16 Jun 2007 10:40:08 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <AB7E0918-0EBA-47C9-8A64-FB8709230F2A@bioperl.org>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<AB7E0918-0EBA-47C9-8A64-FB8709230F2A@bioperl.org>
Message-ID: <51E89347-4AF7-482E-98DB-BE1AA0138A91@gmx.net>

Just as an aside, even if we can't keep anonymous cvs working, I
would think that with apache URL rewriting and a small CGI script
that returns an appropriate page redirect, we can without too much
trouble keep functional the hyperlinks that people may have bookmarked.
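
To make the idea concrete, here is a minimal sketch of the redirect
logic, in Python purely for illustration (a Perl CGI would do the same
job). Both URL layouts are hypothetical placeholders, not the real
paths on the open-bio servers, and it assumes the apache rewriting
hands the old request path to the script.

```python
# Hypothetical URL layouts: neither the old CVS web path nor the new
# SVN browser address is taken from the real open-bio.org servers.
OLD_PREFIX = "/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/"
NEW_BASE = "http://svn.example.org/viewvc/bioperl-live/trunk/"


def redirect_target(old_path):
    """Map an old CVS web path to its assumed SVN equivalent, or None."""
    if not old_path.startswith(OLD_PREFIX):
        return None
    rest = old_path[len(OLD_PREFIX):]
    # Drop CVS-only query arguments such as ?cvsroot=bioperl.
    rest = rest.split("?", 1)[0]
    return NEW_BASE + rest


def cgi_response(old_path):
    """Build the raw CGI output: a 301 redirect, or a 404 if unmapped."""
    target = redirect_target(old_path)
    if target is None:
        return ("Status: 404 Not Found\r\n"
                "Content-Type: text/plain\r\n\r\n"
                "No mapping for this URL.\n")
    return "Status: 301 Moved Permanently\r\nLocation: %s\r\n\r\n" % target
```

A permanent (301) redirect is used here on the assumption that the old
links would never come back; a temporary one would work just as well
during a test period.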

	-hilmar

On Jun 15, 2007, at 6:23 PM, Jason Stajich wrote:

> Sounds like a plan, I'll be curious to see if we can still get keep  
> anonymous CVS working as I'd like to not have to pull the plug on  
> that.  There are some threads out on the web about how to do this  
> with a commit rule on SVN.
>
> Also, can someone who is close enough to all the SVN benefits  
> please elaborate how it is going to help _this_ project?
> Perhaps you would be willing to put a few words up -- like on (a to  
> be created):
> http://bioperl.org/wiki/BioPerl:Version_control_changeover
>
> This way if anonymous CVS is broken and/or developers who haven't  
> been paying attention come back to commit code ask why things  
> changed we don't have to compose long emails... =)
>
> -jason
> On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote:
>
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>>
>> - you document what you find yourself having to do in trying to make
>> it work
>>
>> - you report back when you think you have a working repository
>>
>> - we all get a defined amount of time to test to our hearts' content,
>> say 2 weeks
>>
>> - you fix issues that were encountered
>>
>> - report back when done, followed by retesting for, say 1 week
>>
>> - iterate previous 2 steps until no issues and no objections to
>> migration
>>
>> - two more weeks of warning period to all developers to commit all
>> outstanding changes, or reapply them to a future svn checkout
>>
>> - pull the trigger by locking down cvs, applying the migration as
>> worked out before, and announcing that BioPerl is now on svn
>>
>> - get free beer at next BOSC (I'll pay if no one else does)
>>
>> This may not be precisely the plan that needs to be executed, but
>> it's probably somewhere along those lines.
>>
>> If there are volunteers who would like to spearhead this, then power
>> to you - I think everyone is in favor and the advantages of svn don't
>> need to be debated. The only reason it hasn't happened yet is because
>> no one has stepped forward who would have the energy.
>
>>
>> I'm sure ChrisD will gladly create the svn sandbox if we have
>> volunteers lined up to get going.
>>
>> 	-hilmar
>>
>> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
>>
>>> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>>>> Hi,
>>>>
>>>> I would very much prefer it if bioperl moved to svn. I'm
>>>> considering merging Bio::Phylo (to the extent that that's possible/
>>>> practical) with bioperl and move it to an OBF repository, but I'd
>>>> rather not go back to CVS.
>>>>
>>>> Rutger
>>>>
>>>
>>> I second that, SVN seems like the reasonable choice. I would be more
>>> than happy to help out as well.
>>>
>>> Spiros
>>>
>>>>
>>>> -----Original Message-----
>>>>
>>>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>>>
>>>>>
>>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>>>
>>>>>>>>> ...
>>>>>>>> Can we do any sort of massive conversion at some logical
>>>>>>>> timepoint.
>>>>>>>> Probably after a branch release or something?  Because it
>>>>>>>> basically
>>>>>>>> means we're going to have differences on nearly every line
>>>>>>>> which is
>>>>>>>> going to make diff-ing difficult when debugging old/new  
>>>>>>>> versions.
>>>>>>>> Maybe it is not a problem because we aren't introducing and new
>>>>>>>> bugs!
>>>>>>
>>>>>> Sorry, can you clarify the problem you envisage? And why would
>>>>>> making a branch release help?
>>>>>
>>>>> Maybe the worry is that mass conversion in such a large codebase
>>>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows
>>>>> w/o
>>>>> trying?
>>>>>
>>>>>>> I agree; if we intend on doing this it should be all at once,
>>>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>>>> need a script up- and-running that tidies everything up prior to
>>>>>>> commits (though what  happens if perltidy tanks?...).
>>>>>>> Sendu, up for it?
>>>>>>
>>>>>> If its going to be difficult and a hassle, for such an  
>>>>>> unnecessary
>>>>>> thing I'm not sure its worth it. There are more pressing  
>>>>>> things to
>>>>>> be done for Bioperl.
>>>>>>
>>>>>> If I can just run perltidy on the entire package and commit,  
>>>>>> I'd do
>>>>>> it. If that's not appropriate, I won't.
>>>>>
>>>>> The choices aren't necessarily all or nothing.  What about
>>>>> voluntary,
>>>>> recommended use of a perltidy config file included with the
>>>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>>>
>>>>>>>>> About svn
>>>>>> [snip]
>>>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>>>> is  involved and try getting something going in the next  
>>>>>>> month or
>>>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>>>> well but it  might be worth looking into.
>>>>>>
>>>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>>>> more pressing matters (POD fixing, test script updating and
>>>>>> finishing...).
>>>>>
>>>>> A few other open-bio projects have actively discussed a CVS->SVN
>>>>> migration (BioRuby and I think BioPython, though the latter  
>>>>> could be
>>>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>>>> pros/cons, get others opinions from others who have made the
>>>>> transition, etc.  We could, as Jason suggested, even set up a  
>>>>> tester
>>>>> SVN w/o making it the default codebase (lock it off to a few
>>>>> testers,
>>>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>>>
>>>>> I agree with you that it's not feasible to switch over prior to a
>>>>> release and that there are more pressing issues, but it doesn't  
>>>>> hurt
>>>>> having an open discussion about it.
>>>>>
>>>>> chris
>>>>>
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Sat Jun 16 10:55:09 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 16 Jun 2007 10:55:09 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4673C7CB.1030709@mail.nih.gov>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
Message-ID: <B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>


On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:

> As for access, the typical access is over http (or https).

We're using svn+ssh here (NESCent) so the password is the same as the  
one you set for your account on the server, and you can use public/ 
private key negotiation for authentication.

I think the ability to not provide a password for every single
interaction is a requirement. Whether that requires svn+ssh or can
be made to work through https too, I don't know. On sf.net I have to
use https for svn and it doesn't ask me for the password each time.
Not sure how this works though, maybe some local caching?

We should not be using http, or any other protocol that sends
unencrypted passwords.

>   Access controls can be set up on the server side while allowing  
> anonymous access for checkout.  There are many excellent SVN for  
> every OS, so that should not be a problem.

On Mac OSX the most convenient way I have found is through fink. It  
does ask to install 30 other dependencies, which had me balk at  
first, but me doing it by hand is even worse than fink doing it, so I  
finally gave in and it's really a breeze. I've not had a single issue.

  From a sysadmin perspective, what might be worth keeping in mind is  
that svn is going to store everything in a database (BerkeleyDB I  
think). I.e., there is no such thing anymore as restoring individual  
source code files from backup if one gets accidentally corrupted on  
the server. It seems you have to restore the entire database, i.e.,  
the entire repository. I vaguely recall though that how svn manages  
the repository is actually configurable and that other storage than  
DB is possible too. Don't ask me for the pros and cons of one vs the  
other.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From rvos at interchange.ubc.ca  Sat Jun 16 13:09:18 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Sat, 16 Jun 2007 10:09:18 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>

CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already; for my own purposes I like the integration with eclipse (through the subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)).

For standalone use on OS X I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On Windows there is a really nice standalone client (TortoiseSVN) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (which seems a requirement).

Rutger




From rvos at interchange.ubc.ca  Sat Jun 16 13:15:45 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Sat, 16 Jun 2007 10:15:45 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca>

A brief word on the topic of perltidy: no. I like what it does, and I sort of follow one combination of its settings (-syn -sob -b), but if you run it on a whole source tree it'll screw up the diffs, and I'm still worried about it breaking things (though really it shouldn't; it creates a *.bak if something doesn't compile anymore).

Rutger




From george.heller at yahoo.com  Sat Jun 16 13:29:26 2007
From: george.heller at yahoo.com (George Heller)
Date: Sat, 16 Jun 2007 10:29:26 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
Message-ID: <959624.48556.qm@web56502.mail.re3.yahoo.com>

Hi all,
   
  I am looking at extracting the taxonomy hierarchy for some taxon ids. What I want to do is, for a given taxon id, say 33090, extract all taxon ids that are descendants of this taxon: not just the immediate children, but the children's children and so on. 
   
  Any ideas on the way I can go about doing this?
   
  George

       

From bix at sendu.me.uk  Sat Jun 16 14:21:38 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 16 Jun 2007 19:21:38 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <959624.48556.qm@web56502.mail.re3.yahoo.com>
References: <959624.48556.qm@web56502.mail.re3.yahoo.com>
Message-ID: <46742A32.90305@sendu.me.uk>

George Heller wrote:
> Hi all,
> 
> I am looking at extracting the taxonomy hierarchy for some taxon ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
> 
> Any ideas on the way I can go about doing this?

Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
some kind of looping structure, most easily a recursive sub.

If you happen to code up something neat and efficient, why not share it 
with us and we could add it to the Taxonomy module(s).
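
The shape of the recursion, as a language-neutral sketch (written in
Python only so it runs without a BioPerl install). The lookup table and
every id other than 33090 are made up for illustration, and the helper
each_descendent() merely stands in for whatever the taxonomy database
returns as the immediate children of a node.

```python
# Toy parent -> children table; all ids except 33090 are invented for
# this example and are not real NCBI taxon ids.
CHILDREN = {
    33090: [1001, 1002],
    1001: [2001, 2002],
    1002: [],
    2001: [3001],
    2002: [],
    3001: [],
}


def each_descendent(taxon_id):
    """Stand-in for the database lookup: immediate children only."""
    return CHILDREN.get(taxon_id, [])


def all_descendents(taxon_id):
    """Collect children, grandchildren and so on, depth-first."""
    found = []
    for child in each_descendent(taxon_id):
        found.append(child)
        # Recurse so that the child's own subtree is included too.
        found.extend(all_descendents(child))
    return found
```

Calling all_descendents(33090) on the toy table returns
[1001, 2001, 3001, 2002, 1002]. For a very deep tree an explicit stack
would avoid hitting a recursion limit, but for a taxonomy the depth is
small enough that plain recursion should be fine.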

From cjfields at uiuc.edu  Sat Jun 16 15:23:43 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 16 Jun 2007 14:23:43 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>
Message-ID: <A59B3FA2-6732-4DB2-9C9C-223DFF41D1E9@uiuc.edu>


On Jun 16, 2007, at 9:55 AM, Hilmar Lapp wrote:

>
> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>
>> As for access, the typical access is over http (or https).
>
> We're using svn+ssh here (NESCent) so the password is the same as the
> one you set for your account on the server, and you can use public/
> private key negotiation for authentication.
>
> I think the ability to not provide a password for every single
> interaction is a requirement. If that requires using svn+ssh or can
> be made to work through https too I don't know. On sf.net I have to
> use https for svn and it doesn't ask me for the password each time.
> Not sure how this works though, maybe some local caching?
>
> We should not be using http, or whatever other protocol that sends
> unencrypted passwords.

Agreed; it should be through ssh.

>>   Access controls can be set up on the server side while allowing
>> anonymous access for checkout.  There are many excellent SVN for
>> every OS, so that should not be a problem.
>
> On Mac OSX the most convenient way I have found is through fink. It
> does ask to install 30 other dependencies, which had me balk at
> first, but me doing it by hand is even worse than fink doing it, so I
> finally gave in and it's really a breeze. I've not had a single issue.
>
>   From a sysadmin perspective, what might be worth keeping in mind is
> that svn is going to store everything in a database (BerkeleyDB I
> think). I.e., there is no such thing anymore as restoring individual
> source code files from backup if one gets accidentally corrupted on
> the server. It seems you have to restore the entire database, i.e.,
> the entire repository. I vaguely recall though that how svn manages
> the repository is actually configurable and that other storage than
> DB is possible too. Don't ask me for the pros and cons of one vs the
> other.

MacPorts/DarwinPorts also has subversion, various language bindings,  
cvs2svn, and various perl modules.  There are also a few SVN GUIs  
lingering around (including live folders within Komodo).

chris


From cjfields at uiuc.edu  Sat Jun 16 15:18:06 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 16 Jun 2007 14:18:06 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca>
References: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <1A314D08-8F3C-4A4B-B58D-64AC7952F149@uiuc.edu>

I think it's viable as an option if the code really needs it.  After  
100+ commits some of the code has schizy coding styles, so cleaning  
it up helps.  In those cases having a perltidy config file present  
wouldn't hurt.  However I agree that it shouldn't be applied across  
every module and should be done judiciously (the commit message, for  
instance, should actually state the code was tidied).

chris

PS - Nice to see the ball is rolling on SVN!

On Jun 16, 2007, at 12:15 PM, rvos wrote:

> A brief word on the topic of perltidy: no. I like what it does, and  
> I sort of follow one of its settings (-syn -sob -b), but if you run  
> it on a whole source tree it'll screw up the diffs, and I'm still  
> worried about it breaking things (though really it shouldn't, it  
> creates a *.bak if something doesn't compile anymore).
>
> Rutger
>
>
>
> -----Original Message-----
>
>> Date: Sat Jun 16 10:09:18 PDT 2007
>> From: "rvos" <rvos at interchange.ubc.ca>
>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
>> To: "Hilmar Lapp" <hlapp at gmx.net>, "Sean Davis"  
>> <sdavis2 at mail.nih.gov>
>>
>> CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales  
>> talk has been expended over it already, for my own purpose I like  
>> the integration with eclipse (through subclipse plugin) and  
>> komodo, in addition to the atomic commits (so I can ctrl+c if I  
>> goof up (again)).
>>
>> For standalone use on osx I didn't use the fink one, but I forgot  
>> where I did get it from. It was very easy to set up, though. On  
>> windows there is a really nice standalone one (tortoisesvn) that  
>> integrates with the explorer so you can see on the file icons what  
>> the state of a file is. I know that there's a cvs2svn utility that  
>> converts your revision history (seems a requirement).
>>
>> Rutger
>>
>>
>> -----Original Message-----
>>
>>> Date: Sat Jun 16 07:55:09 PDT 2007
>>> From: "Hilmar Lapp" <hlapp at gmx.net>
>>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
>>> To: "Sean Davis" <sdavis2 at mail.nih.gov>
>>>
>>>
>>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>>>
>>>> As for access, the typical access is over http (or https).
>>>
>>> We're using svn+ssh here (NESCent) so the password is the same as  
>>> the
>>> one you set for your account on the server, and you can use public/
>>> private key negotiation for authentication.
>>>
>>> I think the ability to not provide a password for every single
>>> interaction is a requirement. If that requires using svn+ssh or can
>>> be made to work through https too I don't know. On sf.net I have to
>>> use https for svn and it doesn't ask me for the password each time.
>>> Not sure how this works though, maybe some local caching?
>>>
>>> We should not be using http, or whatever other protocol that sends
>>> unencrypted passwords.
>>>
>>>>   Access controls can be set up on the server side while allowing
>>>> anonymous access for checkout.  There are many excellent SVN for
>>>> every OS, so that should not be a problem.
>>>
>>> On Mac OSX the most convenient way I have found is through fink. It
>>> does ask to install 30 other dependencies, which had me balk at
>>> first, but me doing it by hand is even worse than fink doing it,  
>>> so I
>>> finally gave in and it's really a breeze. I've not had a single  
>>> issue.
>>>
>>>   From a sysadmin perspective, what might be worth keeping in  
>>> mind is
>>> that svn is going to store everything in a database (BerkeleyDB I
>>> think). I.e., there is no such thing anymore as restoring individual
>>> source code files from backup if one gets accidentally corrupted on
>>> the server. It seems you have to restore the entire database, i.e.,
>>> the entire repository. I vaguely recall though that how svn manages
>>> the repository is actually configurable and that other storage than
>>> DB is possible too. Don't ask me for the pros and cons of one vs the
>>> other.
>>>
>>> 	-hilmar
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hartzell at alerce.com  Sat Jun 16 13:47:01 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 16 Jun 2007 10:47:01 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
	<6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
	<F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>
Message-ID: <18036.8725.29073.619527@almost.alerce.com>

Chris Fields writes:
 > Ah, got it.  Sorry.
 > 
 > George, planning on taking this up?

I'm going to take a *peek*.  I just finished (unless someone finds
another issue) moving someone's cvs repository over to svn, so I have
some tools cobbled together and some knowledge in the cache.

I don't have too much idle time at the moment though, so if it gets
gooey I'll just summarize what I learn.  Either way it seems worth a
peek.

I will need the repository itself though.  I'll post a note to
support at open-bio.org.

g.

From jason at bioperl.org  Sat Jun 16 19:54:18 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 16 Jun 2007 16:54:18 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18036.8725.29073.619527@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
	<6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
	<F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>
	<18036.8725.29073.619527@almost.alerce.com>
Message-ID: <6F57475B-715F-49D1-B6D2-F3FD3ACCB728@bioperl.org>

Thanks George.
I'll respond to your support ticket as well but I put up tarballs of  
the repository as of today.

I had thought at one point ChrisD might have set up rsync-able access
to the whole repository through code.open-bio.org but for now I have
put up tarballs of most of the CVS dirs from bioperl
http://bioperl.org/uploads/

Just to say I already went through all the steps of running cvs2svn
myself and had problems getting the branches and all the tags back
out when I tried it.  You may want to start with a smaller repository
like bioperl-network or bioperl-db, as the initial cvs2svn conversion
script took quite a long time to run on bioperl-live.

Regarding ssh/https:
We have already gone through some of this for the blipkit and biojava
projects.  I think we'll still keep separate anonymous read-only
(code.open-bio.org) and writeable repositories (dev.open-bio.org), as
I think we are resisting any webapps on the development server because
we want that to be as locked down as possible.  For the newly created
svn repositories that I've been creating/using I just use svn+ssh and
that worked okay.


-jason

On Jun 16, 2007, at 10:47 AM, George Hartzell wrote:

> Chris Fields writes:
>> Ah, got it.  Sorry.
>>
>> George, planning on taking this up?
>
> I'm going to take a *peek*.  I just finished (unless someone finds
> another issue) moving someone's cvs repository over to svn, so I have
> some tools cobbled together and some knowledge in the cache.
>
> I don't have too much idle time at the moment though, so if it gets
> gooey I'll just summarize what I learn.  Either way it seems worth a
> peek.
>
> I will need the repository itself though.  I'll post a note to
> support at open-bio.org.
>
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From hartzell at alerce.com  Sat Jun 16 19:56:09 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 16 Jun 2007 16:56:09 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <46739D69.4090204@sheffield.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<46739D69.4090204@sheffield.ac.uk>
Message-ID: <18036.30873.609341.181853@almost.alerce.com>

Nathan S. Haigh writes:
 > [...]
 > Sounds like George might know what he's doing! 

Hey, I've been looking for a Marketing Director.  Want a job?

 > I have a question about
 > setting up svn access. I believe access can be done in several ways,
 > over webdav, over ssh and probably others too. Do you have any knowledge
 > about the benefits of one over the other? I suppose I'm thinking of what
 > to implement to allow anonymous read access for users and authenticated
 > access for developers.

There are two and a half ways to talk to the repository:

  - You can put it behind a web server (e.g. apache) and get at it
    using http/https.  Authentication and authorization happen using
    the normal web server tricks, so as long as you don't do anything
    silly (e.g. don't use basic auth, stick with mod_auth_digest),
    even http connections won't send passwords in the clear.  You can
    define users in .htpasswd files or use any of the fancier setups
    (e.g. sql databases, etc...).

  - You can talk to it via subversion's simple server, svnserve.
    There are two ways you usually talk to svnserve (neither of which
    sends passwords in the clear):

      * directly, using a URL like
          svn://svn.example.com/repo/proj/trunk
        when you do this the client either talks directly to a copy of
        svnserve running as a daemon, or possibly to something like
        inetd that'll start an svnserve as necessary.

        In this case, you define authentication and authorization info
        in an svnserve.conf file.

      * indirectly, using a URL like
          svn+ssh://svn.example.com/repo/proj/trunk/
        in which case you make an ssh connection to the server machine
        (and authenticate via ssh mechanisms, anything other than a
        key-pair will drive you nuts with repeated password requests)
        and then an svnserve process is started up for you in "tunnel
        mode".  Access control is coarse-grained and done via OS-level
        access permissions.

        Generally in this case you need to give out shell accounts to
        everyone involved, or (tsk, tsk) have them use a common
        account.  There's a cute trick in the svn book that shows how
        to use a shared ssh account but still have all of the changes
        in the repo keep track of the real user.  I've never tried
        it.... 

   - If you're on the same machine as the repo, you can simply use:
        file:///path/to/repo/proj/trunk

The biggest deciding factor is how you want to manage your users and
whether you're already messing around with a web server.  I've
generally worked in small groups where everyone's had ssh access, but
I've set it up the other ways too.

You can even access via multiple paths.  The only trick is that the
repository needs to be writable by whoever's committing.  If people
are running svnserve themselves (file: or svn+ssh:) and things aren't
set up right, then things go fubar: all the dirs in the repo need to
be group writable and have the magic (setgid) bit set so that any new
stuff created is also group writable, and users' umasks and group
membership need to be aligned.  Google's your friend here, and each of
the OSes/distros has a standard hack for making this work, usually
involving a wrapper app that takes care of things.
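
For what it's worth, the permissions side of that can be sketched in a
few lines of Perl.  This is only an illustration of the idea, not the
standard hack itself: the sub name make_group_writable is made up, it
assumes the repository already belongs to the right group, and it does
nothing about umasks (committers would still need something like
umask 002, which is usually what the wrapper takes care of).

```perl
use strict;
use warnings;
use File::Find;

# Make every directory under $repo group-writable and setgid, and every
# file group-writable, so new entries inherit the repository's group.
# Assumption: the tree is already owned by the shared developer group.
sub make_group_writable {
    my ($repo) = @_;
    find(sub {
        my @st = lstat $_ or return;     # skip entries we cannot stat
        return if -l _;                  # leave symlinks alone
        my $mode = $st[2] & 07777;
        # directories: g+rwx plus setgid; files: g+rw
        my $want = -d _ ? ($mode | 02070) : ($mode | 0060);
        chmod $want, $_ or warn "chmod $File::Find::name: $!\n";
    }, $repo);
}

# Hypothetical invocation: perl fixperms.pl /path/to/repo
make_group_writable($ARGV[0]) if @ARGV;
```

Run it once after creating the repository; it only ever adds
permission bits, so running it again is harmless.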

Feel free to ask any particular questions.

Phew,

g.

From jason at bioperl.org  Sat Jun 16 20:17:58 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 16 Jun 2007 17:17:58 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <200706151653.04135.sheris@eps.berkeley.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
	<1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
	<200706151653.04135.sheris@eps.berkeley.edu>
Message-ID: <6A369DE9-943A-4DF1-9DF0-F68E361C8C20@bioperl.org>

The error is clearly saying there must be a symbol or letter in
your sequence that violates the regexp.
I had modified the code in CVS to actually provide a more informative
mismatch message in the error, but this is probably not in the
release you are using.

Anyway, add this to see what is causing the problem:

print join(",",($nstarthash{$_}[1] =~ /([^$Bio::PrimarySeq::MATCHPATTERN]+)/g)), "\n";
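
If the culprit is whitespace or a newline it will be invisible in that
output, so here is a standalone sketch of the same check that prints
character codes instead.  The $allowed string is an assumption (a
local stand-in for $Bio::PrimarySeq::MATCHPATTERN, whose real value
may differ between releases), and the test sequence is made up.

```perl
use strict;
use warnings;

# Assumption: a local copy of the allowed-character class; the real value
# lives in $Bio::PrimarySeq::MATCHPATTERN and may differ between releases.
my $allowed = 'A-Za-z\-\.\*\?=~';

# Return every run of characters that falls outside the allowed class.
sub offending_runs {
    my ($seq) = @_;
    return $seq =~ /([^$allowed]+)/g;
}

# Made-up sequence containing a space, a newline and a digit.
my $seq = "GCCCCTGTAATC GCTTTT\nATATCG1";
my @bad = offending_runs($seq);

# Print each run as hex codes so non-printing characters become visible.
for my $run (@bad) {
    my $hex = join ' ', map { sprintf('0x%02X', ord($_)) } split //, $run;
    print "offending run: $hex\n";
}
```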

-jason
On Jun 15, 2007, at 4:53 PM, Sheri Simmons wrote:

> Thanks for the suggestion, but that still gives the same error as  
> before.
>
> On Friday 15 June 2007 4:11 pm, Kevin Brown wrote:
>>> I'm getting an error as follows when I try to reverse
>>> complement a sequence string stored in a hash of arrays. The
>>> storage code is:
>>>
>>> 		$nstarthash{$key} = [$sortchecks[0], join("",
>>> @nseq),
>>> join("",@{$seqhash{$key}})];
>>>
>>> the sequence of interest is the element at index 1.
>>>
>>> Later, I try to retrieve this string for a subset of keys so
>>> I can reverse complement it based on input from another hash
>>> (%complement):
>>>
>>> 			my %revcomphash = map { my $read = $_;
>>> 			grep $complement{$read} eq 'C', %complement;
>>> 			{$_, (Bio::Seq->new(-seq
>>> =>$nstarthash{$_}[1]))->revcom->seq()};}
>>> 			 keys(%nstarthash);
>>>
>>>
>>> I get the following warning (long sequence edited for clarity):
>>>
>>> -- -------------------- WARNING ---------------------
>>> MSG: seq doesn't validate, mismatch is 1
>>> ---------------------------------------------------
>>>
>>> ------------- EXCEPTION  -------------
>>> MSG: Attempting to set the sequence to
>>> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
>>> which does not look healthy
>>> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
>>> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
>>> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK
>>> toplevel ../quality_wrapper.pl:103
>>>
>>> I cannot find any non-allowed characters in the sequence, and
>>> the de-referencing appears to work correctly. Can anyone help me?
>>> I'm using the latest Bioperl installation (1.5.2) with
>>> ActivePerl5.8 on a Mepis 6.5 system.
>>
>> Try telling the Bio::Seq object what alphabet to use when creating  
>> it.
>> I tend to create them like:
>>
>> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')
>
> -- 
> Sheri Simmons
> Department of Earth and Planetary Sciences
> University of California, Berkeley
> Berkeley, CA 94720-4767
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From n.haigh at sheffield.ac.uk  Sun Jun 17 07:45:11 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 17 Jun 2007 12:45:11 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>
References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <46751EC7.8020609@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

rvos wrote:
> CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)).
> 
> For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement).
> 
> Rutger
> 
> 

Just to clarify, Subversion is also available as a command-line client
for Windows:
http://subversion.tigris.org/project_packages.html

TortoiseSVN is another svn client, with a GUI that integrates into the
Windows shell. I tried setting this up a while back to use ssh (via
PuTTY), but I wasn't successful. This may have been because I was just
starting out with svn, or because it was harder to set up in an earlier
version of TortoiseSVN.

Does anyone have experience of setting up svn on Windows to use ssh? If
the changeover takes place, I'm happy to write some howto's for setting
up svn clients for Windows.

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGdR7HczuW2jkwy2gRAmgOAJ96wLzVYbjqEPborZTsw6gwU6UitgCfV02v
8xHJvn/Eqf9LePR3Ei0ZaIw=
=t5pN
-----END PGP SIGNATURE-----

From george.heller at yahoo.com  Sun Jun 17 14:41:55 2007
From: george.heller at yahoo.com (George Heller)
Date: Sun, 17 Jun 2007 11:41:55 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46742A32.90305@sendu.me.uk>
Message-ID: <148654.15952.qm@web56511.mail.re3.yahoo.com>

Hi all,
   
  Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. 
   
  Thanks.
  George

Sendu Bala <bix at sendu.me.uk> wrote:
  George Heller wrote:
> Hi all,
> 
> I am looking at extracting the taxonomy hierarchy for some taxon ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
> 
> Any ideas on the way I can go about doing this?

Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
some kind of looping structure. Most easily a recursing sub.

If you happen to code up something neat and efficient, why not share it 
with us and we could add it to the Taxonomy module(s).



From jason at bioperl.org  Sun Jun 17 16:48:05 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sun, 17 Jun 2007 13:48:05 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <148654.15952.qm@web56511.mail.re3.yahoo.com>
References: <148654.15952.qm@web56511.mail.re3.yahoo.com>
Message-ID: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org>

I assume you already figured out how to setup a local taxonomydb?

You just want the extant species/leaves of the tree

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
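
To show what that one-liner is doing, here is a toy version on a plain
hash with no BioPerl involved.  The ids and the all_descendants and
is_leaf subs are made up for illustration; this is a sketch of the
idea (recurse to collect everything below a node, then keep the nodes
with no children), not the actual Bio::Taxon implementation.

```perl
use strict;
use warnings;

# Toy tree: parent id => list of child ids (made-up ids, not real taxa).
my %children = (
    1 => [2, 3],
    2 => [4, 5],
    3 => [],
    4 => [],
    5 => [6],
    6 => [],
);

# Recursively collect every node below $id, depth first.
sub all_descendants {
    my ($id) = @_;
    my @kids = @{ $children{$id} || [] };
    return map { ($_, all_descendants($_)) } @kids;
}

# A leaf is a node with no children.
sub is_leaf { !@{ $children{ $_[0] } || [] } }

my @all    = all_descendants(1);
my @leaves = grep { is_leaf($_) } @all;
print "all:    @all\n";
print "leaves: @leaves\n";
```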


-jason
On Jun 17, 2007, at 11:41 AM, George Heller wrote:

> Hi all,
>
>   Can anyone point me to some example that uses the  
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at  
> this, and I am not quite sure how to implement it.
>
>   Thanks.
>   George
>
> Sendu Bala <bix at sendu.me.uk> wrote:
>   George Heller wrote:
>> Hi all,
>>
>> I am looking at extracting the taxonomy hierarchy for some taxon ids.
>> What I plan to do is, for a given taxon id, say 33090, I want to
>> extract all taxon ids that are children of this species. I do not
>> just want the immediate children, but the children's children and so
>> on.
>>
>> Any ideas on the way I can go about doing this?
>
> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
>
> If you happen to code up something neat and efficient, why not  
> share it
> with us and we could add it to the Taxonomy module(s).
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From aaron.j.mackey at gsk.com  Sun Jun 17 22:35:42 2007
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Sun, 17 Jun 2007 22:35:42 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46742A32.90305@sendu.me.uk>
Message-ID: <OF9A874C93.CFF12016-ON852572FE.000E328D-852572FE.000E463E@gsk.com>

To do so efficiently, you might want to check out:

  http://www.oreillynet.com/pub/a/network/2002/11/27/bioconf.html

-Aaron

bioperl-l-bounces at lists.open-bio.org wrote on 06/16/2007 02:21:38 PM:

> George Heller wrote:
> > Hi all,
> > 
> > I am looking at extracting the taxonomy hierarchy for some taxon ids.
> > What I plan to do is, for a given taxon id, say 33090, I want to
> > extract all taxon ids that are children of this species. I do not
> > just want the immediate children, but the children's children and so
> > on.
> > 
> > Any ideas on the way I can go about doing this?
> 
> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
> 
> If you happen to code up something neat and efficient, why not share it 
> with us and we could add it to the Taxonomy module(s).
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From aaron.j.mackey at gsk.com  Sun Jun 17 22:34:12 2007
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Sun, 17 Jun 2007 22:34:12 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>
Message-ID: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>

> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
> 
> > As for access, the typical access is over http (or https).
> 
> We're using svn+ssh here (NESCent)

Let me just note that https is preferable to ssh for those poor slobs 
stuck behind a corporate firewall (svn happily prompts me for my proxy 
server's user/pass, then my https authentication realm's user/pass - all 
then get cached in some .svn/ file that I don't have to worry about again 
until my proxy server password changes once a month ...)

-Aaron


From george.heller at yahoo.com  Mon Jun 18 00:21:45 2007
From: george.heller at yahoo.com (George Heller)
Date: Sun, 17 Jun 2007 21:21:45 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org>
Message-ID: <487845.37410.qm@web56510.mail.re3.yahoo.com>

Thanks. And how can I assign the $node here in the below code, such that I can reference it to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id. 
   
  I have a local db setup, in which I have uploaded data using the load_ncbi_taxonomy.pl script. 
   
  Thanks.
  George
   
  Jason Stajich <jason at bioperl.org> wrote:
    I assume you already figured out how to setup a local taxonomydb?
  

  You just want the extant species/leaves of the tree
  

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;  

  
  -jason
    On Jun 17, 2007, at 11:41 AM, George Heller wrote:

    Hi all,
  

    Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. 
  

    Thanks.
    George
  

  Sendu Bala <bix at sendu.me.uk> wrote:
    George Heller wrote:
    Hi all,
  

  I am looking at extracting the taxonomy hierarchy for some taxon ids.
  What I plan to do is, for a given taxon id, say 33090, I want to
  extract all taxon ids that are children of this species. I do not
  just want the immediate children, but the children's children and so
  on.
  

  Any ideas on the way I can go about doing this?
  

  Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
  some kind of looping structure. Most easily a recursing sub.
  

  If you happen to code up something neat and efficient, why not share it 
  with us and we could add it to the Taxonomy module(s).
  

  _______________________________________________
  Bioperl-l mailing list
  Bioperl-l at lists.open-bio.org
  http://lists.open-bio.org/mailman/listinfo/bioperl-l


    --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/



From bix at sendu.me.uk  Mon Jun 18 06:44:00 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 11:44:00 +0100
Subject: [Bioperl-l] Network tests overhaul
Message-ID: <467661F0.2060703@sendu.me.uk>

When the test suite runs currently, most (the intent is all) tests skip 
if the test would require network (internet) access. This is to avoid 
tests failing not due to bugs in Bioperl code, but due to temporarily 
inaccessible servers. This is also to make running the test suite faster.

To do a complete test you currently have to set BIOPERLDEBUG to true, 
which activates the network test but also increases verbosity. This 
actually causes a problem, since when running the entire test suite the 
additional debug information is more a hindrance than a help, since the 
reams of printed information can hide significant warnings that may also 
get printed. It's also ugly.

The solution is to divorce activation of network tests from the request 
for verbosity. The obvious implementation is to have another environment 
variable, perhaps BIOPERLNETWORK. However, there is an opportunity to do 
something more appropriate. The running of networking tests should be a 
choice given to every end-user installing Bioperl. Debugging 
information, on the other hand, is only of interest to the developer 
working on a specific module under test, so can be left as a 'hidden' 
env var.


I have just committed one possible implementation along these lines.

You say:
perl Build.PL
as normal, and if you seem to have internet access it asks you if you'd 
like to run network tests. The default answer is no. If you answer yes, 
network tests will be enabled.

You can alternatively say:
perl Build.PL --network
and if you seem to have internet access, network tests will be enabled.
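
The decision described above boils down to something like the
following sketch.  The sub name and its arguments are made up for
illustration and this is not the committed code; in a real Build.PL
the prompt could be Module::Build's y_n() and the result could be
stored with $build->notes(network => ...), which is what the test
scripts read back.

```perl
use strict;
use warnings;

# Decide whether network tests should be enabled.
#   $flag     - true if --network was passed on the command line
#   $has_net  - true if a connectivity probe succeeded
#   $ask      - code ref that prompts the user and returns true/false
sub want_network_tests {
    my ($flag, $has_net, $ask) = @_;
    return 0 unless $has_net;    # no connection: never enable
    return 1 if $flag;           # explicit request: no prompt needed
    return $ask->() ? 1 : 0;     # otherwise ask; the prompt defaults to "no"
}

# Example: flag not given, connection available, user accepts the default.
my $enabled = want_network_tests(0, 1, sub { 0 });
print "network tests ", ($enabled ? "enabled" : "disabled"), "\n";
```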

Then you run the tests:
./Build test
Any tests written to support the new system will then skip network tests 
if they haven't been enabled.

The only test I've written to support the new system is t/RemoteBlast.t:
./Build test --test_files t/RemoteBlast.t --verbose


Adding support to test scripts consists of the following changes:

+ use Module::Build;
+ my $build = Module::Build->current(get_options => { network => {} });
+ my $do_network_tests = $build->notes('network');

! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests
---
! if (!$do_network_tests) { # skip network tests


I propose adding this support to all test scripts that carry out network 
tests. Does anyone have objections? Does anyone have alternate 
implementations that may be superior?

I specifically suggest we don't use an env var in addition to the above, 
because the multiple ways of doing things could lead to confusion. Which 
takes priority? Did a user really have the networking tests turned on 
when he reported his test results?


The one thing I need help with is identifying which tests attempt to 
access the internet. I think we caught most of them for the 1.5.2 
release, but I think there are more lurking around. Can anyone offer a 
way to systematically find at least the test scripts which access the 
internet, if not the specific tests within?

Cheers,
Sendu.

From bix at sendu.me.uk  Mon Jun 18 06:46:17 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 11:46:17 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID: <46766279.7050202@sendu.me.uk>

Sendu Bala wrote:
> Adding support to test scripts consists of the following changes:
> 
> + use Module::Build;
> + my $build = Module::Build->current(get_options => { network => {} });

That should read:
+ my $build = Module::Build->current();

> + my $do_network_tests = $build->notes('network');

From cjfields at uiuc.edu  Mon Jun 18 07:45:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 06:45:10 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <46766279.7050202@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk>
Message-ID: <C3AD4CC8-4B55-4613-B751-99E18C7A87B5@uiuc.edu>

The idea sounds good, though if we plan on doing this we need to  
update the Test HOWTO as well.

Some modules require only a few (<50% of the total) network tests; I  
think SeqFeature.t may be one, though I'm not sure.  Does this handle  
those cases?

chris

On Jun 18, 2007, at 5:46 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Adding support to test scripts consists of the following changes:
>>
>> + use Module::Build;
>> + my $build = Module::Build->current(get_options => { network =>  
>> {} });
>
> That should read:
> + my $build = Module::Build->current();
>
>> + my $do_network_tests = $build->notes('network');
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Mon Jun 18 07:49:18 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 12:49:18 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <C3AD4CC8-4B55-4613-B751-99E18C7A87B5@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk>
	<C3AD4CC8-4B55-4613-B751-99E18C7A87B5@uiuc.edu>
Message-ID: <4676713E.1000508@sendu.me.uk>

Chris Fields wrote:
> The idea sounds good, though if we plan on doing this we need to update 
> the Test HOWTO as well.
> 
> Some modules require only a few (<50% of the total) network tests; I 
> think SeqFeature.t may be one, though I'm not sure.  Does this handle 
> those cases?

Yes, the system just gives the test script a boolean describing if 
network tests should be run. The script can then do whatever it wants 
with the boolean. Skip all tests, skip no tests, skip just some tests... 
it's a drop-in replacement for the current 'debug' boolean based on 
BIOPERLDEBUG.
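
As a rough sketch of the "skip just some tests" case: the ok() calls
are placeholders rather than real Bioperl tests, and the eval around
Module::Build is my own addition here so the script still runs (and
skips) when it is started outside a build tree.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Ask the build whether network tests were enabled at "perl Build.PL" time.
# Wrapped in eval so the script still runs (and skips) outside a build tree.
my $do_network_tests = eval {
    require Module::Build;
    Module::Build->current->notes('network');
};

# Tests that never touch the network always run.
ok(1, 'local test');

SKIP: {
    skip 'network tests not enabled', 2 unless $do_network_tests;
    # Hypothetical placeholders for real remote checks.
    ok(1, 'remote service answered');
    ok(1, 'remote result parsed');
}
```

The plan stays at 3 either way, because skipped tests still count
towards the total.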


From hlapp at gmx.net  Mon Jun 18 08:38:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:38:25 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <487845.37410.qm@web56510.mail.re3.yahoo.com>
References: <487845.37410.qm@web56510.mail.re3.yahoo.com>
Message-ID: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net>

I'm a bit confused - it sounds like you have set up a local BioSQL  
database and loaded the NCBI taxonomy into the database. You can now  
use simple SQL to retrieve all descendants of a node in the tree  
given its NCBI taxonID such as

	SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, taxon n
	WHERE
	    n.ncbi_taxon_id = :taxonID
	AND tn.left_value > n.left_value
	AND tn.right_value < n.right_value
	AND tn.taxon_id = tnm.taxon_id
	AND tnm.name_class = 'scientific name'
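
In case the left_value/right_value comparison looks like magic, here
is a toy illustration in plain Perl, with no database and made-up ids.
It numbers a small tree depth-first the way a nested-set loader would
and then applies the same "strictly inside the interval" test as the
WHERE clause.  The sub names are hypothetical.

```perl
use strict;
use warnings;

# Toy tree (made-up ids): parent => children.
my %children = (1 => [2, 3], 2 => [4, 5], 3 => [], 4 => [], 5 => []);

# Depth-first walk assigning left/right values, as a nested-set loader would.
my (%left, %right);
my $counter = 0;
sub number_tree {
    my ($id) = @_;
    $left{$id} = ++$counter;
    number_tree($_) for @{ $children{$id} };
    $right{$id} = ++$counter;
}
number_tree(1);

# Same predicate as the WHERE clause: strictly inside the ancestor's interval.
sub descendants_of {
    my ($anc) = @_;
    return sort { $left{$a} <=> $left{$b} }
           grep { $left{$_} > $left{$anc} && $right{$_} < $right{$anc} }
           keys %left;
}

print "descendants of 2: @{[ descendants_of(2) ]}\n";
```

The point is that no recursion is needed at query time: one range
comparison finds the whole subtree.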

BioPerl doesn't have a Taxonomy::biosql module yet (though this would  
seem like a worthwhile thing to add), so you can't use the  
Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the
NCBI taxonomy database and indexes it, so you can simply use
Taxonomy::{get_taxon,get_all_Descendents} using the flatfile download
to achieve what you wanted to do in fewer than 5 lines of perl.

Although the recursive implementation of
Taxonomy::get_all_Descendents() won't be lightning fast, it may still
be perfectly fine for your application - are you sure it is not?
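
Something along these lines should do it.  This is an untested sketch:
the sub name descendant_ids and the paths are made up, and it assumes
nodes.dmp and names.dmp from the NCBI taxdump are in one directory
that BioPerl can write its index files into.

```perl
use strict;
use warnings;

# Return the NCBI taxon ids of every descendant of $taxid.
# Assumptions: nodes.dmp and names.dmp from the NCBI taxdump sit in $dir,
# and $dir is writable so the index files can be created on first use.
sub descendant_ids {
    my ($dir, $taxid) = @_;
    require Bio::DB::Taxonomy;    # loaded lazily; needs BioPerl installed
    my $db = Bio::DB::Taxonomy->new(
        -source    => 'flatfile',
        -directory => $dir,
        -nodesfile => "$dir/nodes.dmp",
        -namesfile => "$dir/names.dmp",
    );
    my $node = $db->get_taxon(-taxonid => $taxid)
        or die "taxon $taxid not found\n";
    return map { $_->id } $db->get_all_Descendents($node);
}

# Hypothetical invocation: perl descendants.pl /path/to/taxdump 33090
if (@ARGV == 2) {
    print "$_\n" for descendant_ids(@ARGV);
}
```

The $node returned by get_taxon is also what the $node in Jason's
one-liner would be.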

	-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

> Thanks. And how can I assign the $node here in the below code, such  
> that I can reference it to a particular taxon id record? I want to  
> retrieve all the descendents from the taxonomy hierarchy, given a  
> particular taxon id.
>
>   I have a local db setup, in which I have uploaded data using the  
> load_ncbi_taxonomy.pl script.
>
>   Thanks.
>   George
>
>   Jason Stajich <jason at bioperl.org> wrote:
>     I assume you already figured out how to setup a local taxonomydb?
>
>
>   You just want the extant species/leaves of the tree
>
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>
>
>   -jason
>     On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
>     Hi all,
>
>
>     Can anyone point me to some example that uses the  
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at  
> this, and I am not quite sure how to implement it.
>
>
>     Thanks.
>     George
>
>
>   Sendu Bala <bix at sendu.me.uk> wrote:
>     George Heller wrote:
>     Hi all,
>
>
>   I am looking at extracting the taxonomy hierarchy for some taxon  
> ids.
>   What I plan to do is, for a given taxon id, say 33090, I want to
>   extract all taxon ids that are children of this species. I do not
>   just want the immediate children, but the children's children and so
>   on.
>
>
>   Any ideas on the way I can go about doing this?
>
>
>   Well, you'll use Bio::DB::Taxonomy presumably, and  
> each_Descendent in
>   some kind of looping structure. Most easily a recursing sub.
>
>
>   If you happen to code up something neat and efficient, why not  
> share it
>   with us and we could add it to the Taxonomy module(s).
>
>
>
>
>
>
>   _______________________________________________
>   Bioperl-l mailing list
>   Bioperl-l at lists.open-bio.org
>   http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>     --
>   Jason Stajich
>   jason at bioperl.org
>   http://jason.open-bio.org/
>
>
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Mon Jun 18 08:44:22 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:44:22 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
Message-ID: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>

Just curious - how do you cvs commit then to an external repository?  
Is that open in the firewall?

It is true though that corporations typically will not permit any  
encrypted outgoing traffic through their firewall except https.  
sf.net only supports https for svn, AFAIK.

	-hilmar

On Jun 17, 2007, at 10:34 PM, aaron.j.mackey at gsk.com wrote:

>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>>
>>> As for access, the typical access is over http (or https).
>>
>> We're using svn+ssh here (NESCent)
>
> Let me just note that https is preferable to ssh for those poor slobs
> stuck behind a corporate firewall (svn happily prompts me for my proxy
> server's user/pass, then my https authentication realm's user/pass  
> - all
> then get cached in some .svn/ file that I don't have to worry about  
> again
> until my proxy server password changes once a month ...)
>
> -Aaron
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Mon Jun 18 08:47:56 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:47:56 -0400
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID: <B9BDBD4A-962D-4E83-8151-5D6EA8B69D3B@gmx.net>

Sounds like a great idea to me. -hilmar

On Jun 18, 2007, at 6:44 AM, Sendu Bala wrote:

> When the test suite runs currently, most (the intent is all) tests skip
> if the test would require network (internet) access. This is to avoid
> tests failing not due to bugs in Bioperl code, but due to temporarily
> inaccessible servers. This is also to make running the test suite
> faster.
>
> To do a complete test you currently have to set BIOPERLDEBUG to true,
> which activates the network test but also increases verbosity. This
> actually causes a problem, since when running the entire test suite the
> additional debug information is more a hindrance than a help, since the
> reams of printed information can hide significant warnings that may also
> get printed. It's also ugly.
>
> The solution is to divorce activation of network tests from the request
> for verbosity. The obvious implementation is to have another environment
> variable, perhaps BIOPERLNETWORK. However, there is an opportunity to do
> something more appropriate. The running of networking tests should be a
> choice given to every end-user installing Bioperl. Debugging
> information, on the other hand, is only of interest to the developer
> working on a specific module under test, so can be left as a 'hidden'
> env var.
>
>
> I have just committed one possible implementation along these lines.
>
> You say:
> perl Build.PL
> as normal, and if you seem to have internet access it asks you if you'd
> like to run network tests. The default answer is no. If you answer yes,
> network tests will be enabled.
>
> You can alternatively say:
> perl Build.PL --network
> and if you seem to have internet access, network tests will be enabled.
>
> Then you run the tests:
> ./Build test
> Any tests written to support the new system will then skip network tests
> if they haven't been enabled.
>
> The only test I've written to support the new system is t/RemoteBlast.t:
> ./Build test --test_files t/RemoteBlast.t --verbose
>
>
> Adding support to test scripts consists of the following changes:
>
> + use Module::Build;
> + my $build = Module::Build->current(get_options => { network => {} });
> + my $do_network_tests = $build->notes('network');
>
> ! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests
> ---
> ! if (!$do_network_tests) { # skip network tests
>
>
> I propose adding this support to all test scripts that carry out network
> tests. Does anyone have objections? Does anyone have alternate
> implementations that may be superior?
>
> I specifically suggest we don't use an env var in addition to the above,
> because the multiple ways of doing things could lead to confusion. Which
> takes priority? Did a user really have the networking tests turned on
> when he reported his test results?
>
>
> The one thing I need help with is identifying which tests attempt to
> access the internet. I think we caught most of them for the 1.5.2
> release, but I think there are more lurking around. Can anyone offer a
> way to systematically find at least the test scripts which access the
> internet, if not the specific tests within?
>
> Cheers,
> Sendu.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 18 08:55:53 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 07:55:53 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
Message-ID: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>

On Jun 18, 2007, at 7:44 AM, Hilmar Lapp wrote:

> Just curious - how do you cvs commit then to an external repository?
> Is that open in the firewall?
>
> It is true though that corporations typically will not permit any
> encrypted outgoing traffic through their firewall except https.
> sf.net only supports https for svn, AFAIK.
>
> 	-hilmar

If so it may be better to allow https, though I don't know how Chris  
D. and others feel about it.

Did we make a decision as to the fate of cvs if we get svn
up-and-running?  Keep it around (assuming svn commits would be carried
over to cvs and vice versa)?  Or see what happens over time?

chris

From sdavis2 at mail.nih.gov  Mon Jun 18 09:05:50 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 18 Jun 2007 09:05:50 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
Message-ID: <4676832E.5080704@mail.nih.gov>

aaron.j.mackey at gsk.com wrote:
>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>>
>>> As for access, the typical access is over http (or https).
>> We're using svn+ssh here (NESCent)
> 
> Let me just note that https is preferable to ssh for those poor slobs 
> stuck behind a corporate firewall (svn happily prompts me for my proxy 
> server's user/pass, then my https authentication realm's user/pass - all 
> then get cached in some .svn/ file that I don't have to worry about again 
> until my proxy server password changes once a month ...)

That would be my suggestion as well (although I added it only
parenthetically).

Sean

From hlapp at gmx.net  Mon Jun 18 09:13:27 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 09:13:27 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
Message-ID: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net>


On Jun 18, 2007, at 8:55 AM, Chris Fields wrote:

> Did we make a decision as to the fate of cvs if we get svn
> up-and-running?  Keep it around (assuming svn commits would be carried
> over to cvs and vice versa)?  Or see what happens over time?

Let's not plan for having cvs and svn writable repositories in  
parallel - that would create an administrative nightmare. Once the  
tests complete, there'll be a clean cut-over.

What Jason suggested is to try and continue a read-only (anonymous)  
cvs repository, updated from the svn repository that the developers  
use, aside from an anonymous svn repository mirroring the writable  
one. This would primarily be for maintaining working URLs for those  
folks who http-linked into the anonymous cvs repository. What I added  
earlier is that even if that fails to be feasible, you can achieve
the goal using some small CGI script and apache redirect to map
CVS-style links to the anonymous svn repository.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 18 09:31:35 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 08:31:35 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net>
Message-ID: <0E64DBD0-BBE9-411A-A146-70236EF558BB@uiuc.edu>


On Jun 18, 2007, at 8:13 AM, Hilmar Lapp wrote:

>
> On Jun 18, 2007, at 8:55 AM, Chris Fields wrote:
>
>> Did we make a decision as to the fate of cvs if we get svn
>> up-and-running?  Keep it around (assuming svn commits would be carried
>> over to cvs and vice versa)?  Or see what happens over time?
>
> Let's not plan for having cvs and svn writable repositories in  
> parallel - that would create an administrative nightmare. Once the  
> tests complete, there'll be a clean cut-over.

My thoughts as well.  Much simpler.

> What Jason suggested is to try and continue a read-only (anonymous)  
> cvs repository, updated from the svn repository that the developers  
> use, aside from an anonymous svn repository mirroring the writable  
> one. This would primarily be for maintaining working URLs for those  
> folks who http-linked into the anonymous cvs repository. What I  
> added earlier is that even if that fails to be feasible, you can  
> achieve the goal using some small CGI script and apache redirect to  
> map CVS-style links to the anonymous svn repository.
>
> 	-hilmar

I like the idea of a read-only cvs or a 'faux' cvs, though the former  
would initially be easier as we already have it available.  We could  
just lock it down at some switchover point to read-only (something I  
think Jason also suggested).

chris

From bix at sendu.me.uk  Mon Jun 18 09:13:33 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 14:13:33 +0100
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
Message-ID: <467684FD.3080300@sendu.me.uk>

Chris Fields wrote:
> 
> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>> If it's going to be difficult and a hassle, for such an unnecessary
>> thing I'm not sure its worth it. There are more pressing things to be 
>> done for Bioperl.
>>
>> If I can just run perltidy on the entire package and commit, I'd do 
>> it. If that's not appropriate, I won't.
> 
> The choices aren't necessarily all or nothing.  What about voluntary, 
> recommended use of a perltidy config file included with the 
> distribution, with additional 'caveats'?

I'm happy with that idea. Why not come up with something and make it 
available for us to try out?


Cheers,
Sendu.

From bix at sendu.me.uk  Mon Jun 18 09:26:36 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 14:26:36 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
Message-ID: <4676880C.9030009@sendu.me.uk>

Chris Fields wrote:
> If so it may be better to allow https, though I don't know how Chris  
> D. and others feel about it.

If it makes no difference to me as an end-user, I won't mind. But I 
won't want to enter my password even once, at the beginning of a 
session. If that's not possible with https, then ssh should be an option 
as well.


Unrelated, but it randomly just occurred to me: what happens to all the 
id lines at the top of modules? Eg:

$Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $

That's a cvs-specific thing, right? Do we delete them all? (Regardless, 
I wish we would, since they caused me no end of hassles during the 1.5.2 
release, doing updates across branches.)


> Did we make a decision as to the fate of cvs if we get svn
> up-and-running?  Keep it around (assuming svn commits would be carried
> over to cvs and vice versa)?  Or see what happens over time?

Well, I don't think hard decisions are possible until we know how it's
going to work in practice. I tried setting up my own svn repository 
once, but didn't keep it and can't remember much about it.

So, I suppose we'll play it by ear and decide things later. Is someone 
out there actively doing something leading toward a demonstration of how 
it will be?

From cjfields at uiuc.edu  Mon Jun 18 09:58:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 08:58:34 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <467684FD.3080300@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
	<467684FD.3080300@sendu.me.uk>
Message-ID: <DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>


On Jun 18, 2007, at 8:13 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>> If it's going to be difficult and a hassle, for such an unnecessary
>>> thing I'm not sure its worth it. There are more pressing things  
>>> to be
>>> done for Bioperl.
>>>
>>> If I can just run perltidy on the entire package and commit, I'd do
>>> it. If that's not appropriate, I won't.
>>
>> The choices aren't necessarily all or nothing.  What about voluntary,
>> recommended use of a perltidy config file included with the
>> distribution, with additional 'caveats'?
>
> I'm happy with that idea. Why not come up with something and make it
> available for us to try out?
>
>
> Cheers,
> Sendu.

Will do.  Maybe something that conforms to PBP; there's a PBP  
perltidy config on perlmonks, along with some emacs/vim related bits:

http://www.perlmonks.org/?node_id=516501

chris

From sdavis2 at mail.nih.gov  Mon Jun 18 10:03:35 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 18 Jun 2007 10:03:35 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4676880C.9030009@sendu.me.uk>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<4676880C.9030009@sendu.me.uk>
Message-ID: <467690B7.7090105@mail.nih.gov>

Sendu Bala wrote:
> Chris Fields wrote:
>> If so it may be better to allow https, though I don't know how Chris  
>> D. and others feel about it.
> 
> If it makes no difference to me as an end-user, I won't mind. But I 
> won't want to enter my password even once, at the beginning of a 
> session. If that's not possible with https, then ssh should be an option 
> as well.
> 
> 
> Unrelated, but it randomly just occurred to me: what happens to all the 
> id lines at the top of modules? Eg:
> 
> $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $
> 
> That's a cvs-specific thing, right? Do we delete them all? (Regardless, 
> I wish we would, since they caused me no end of hassles during the 1.5.2 
> release, doing updates across branches.)

See here:

http://svnbook.red-bean.com/en/1.0/ch07s02.html

Check out the section at the bottom having to do with svn:keywords.
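
A rough sketch of what that could look like in practice, assuming the
modules already carry a $Id$ line from cvs and live in a Subversion
working copy. The function name and the SVN override variable are made
up for illustration; only `svn propset svn:keywords` itself is a real
Subversion command.

```shell
#!/bin/sh
# Hypothetical helper: turn on expansion of the $Id$ keyword for every
# .pm file below the given directory of an svn working copy.
# SVN defaults to "svn"; set SVN=echo for a dry run that only prints
# the arguments each call would receive.
enable_id_keyword() {
    find "$1" -name '*.pm' -type f \
        -exec "${SVN:-svn}" propset svn:keywords Id {} \;
}
```

Something like `SVN=echo enable_id_keyword Bio` shows what would be
changed before anything is touched; the property changes still have to
be committed afterwards.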

Sean

From akarger at CGR.Harvard.edu  Mon Jun 18 10:10:57 2007
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 18 Jun 2007 10:10:57 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <46751EC7.8020609@sheffield.ac.uk>
References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>
	<46751EC7.8020609@sheffield.ac.uk>
Message-ID: <B9182BFF5B004245BABC12956EA6322E04AFA6BC@huls5.nucleus.harvard.edu>

 
> Just to clarify, subversion is available as command line for windows:
> http://subversion.tigris.org/project_packages.html
> 
> TortoiseSVN is another svn client with a GUI that integrates into the
> shell. I tried setting this up a while back to use ssh (via 
> PUTTY), but
> I wasn't successful. This may have been due to me just 
> starting out with
> svn or that it was harder to setup in an earlier version of 
> TortoiseSVN.
> 
> Does anyone have experience of setting up svn on Windows to 
> use ssh? If
> the changeover takes place, I'm happy to write some howto's 
> for setting
> up svn clients for Windows.

Here are some notes I wrote recently. I'm using this with command-line
svn, not TortoiseSVN. I would hope that it would work with Tortoise,
too, but I can't guarantee.

1. Run PuTTYgen (installed with PuTTY, probably in Start
menu->Programs->PuTTY) and follow directions to create a private key
file like C:\someplace\private_key.ppk and a public key. At this point,
you'll pick an ssh password, which is separate from your login password.

2. Get an account with the appropriate .ssh/authorized_keys file on the
host machine. (This is not Windows-specific. By the way, if you change
the lines of the authorized_keys file to start with, e.g., 
	command="svnserve -t -r /main/repos/dir",no-pty ssh-rsa AAAAB... comment
then (a) you're more secure because users can't open a real shell on the
computer, and (b) users don't need to type the repository directory in
their svn co commands.)

3. Set your environment variables (My Computer->Properties, Advanced
tab, click on Environment Variables. In the top half ("User variables
for ..."), click "New" and put in the variable name and value).

3a. Set the SVN_EDITOR environment variable to your favorite editor,
such as vim or emacs, or a full path to some other editor. If it's not
set, then either VISUAL or EDITOR must be set.

3b. Set the SVN_SSH environment variable to run PuTTY's "plink" program,
which is the Windows equivalent of command-line ssh. If you installed
PuTTY in the default location, set it to
"C:/Program Files/PuTTY/plink.exe". Note 1: use FORWARD slashes. Note 2:
include the quotation marks in the environment variable.

4. When you want to start using svn, you'll need to run Pageant (Start
menu->Programs->PuTTY), select "Add Key", browse to your private key
file, and enter the ssh password you chose in step 1 (not your login
password). Pageant will stay running until you quit it or logout, so you
can have multiple svn checkins etc., and you only need to type in your
password once.

5. Now just run command-line svn commands the same way you would on UNIX
(modulo Windows' brain-dead shell).

-Amir Karger


From cjfields at uiuc.edu  Mon Jun 18 10:24:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 09:24:00 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4676880C.9030009@sendu.me.uk>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<4676880C.9030009@sendu.me.uk>
Message-ID: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>

On Jun 18, 2007, at 8:26 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> If so it may be better to allow https, though I don't know how  
>> Chris  D. and others feel about it.
>
> If it makes no difference to me as an end-user, I won't mind. But I  
> won't want to enter my password even once, at the beginning of a  
> session. If that's not possible with https, then ssh should be an  
> option as well.

Aaron pointed out in a related post that https access is the  
preferred option behind a corporate firewall (svn prompts for proxy  
user/pass, then caches it).  Not sure how Jason/Hilmar/Chris D. feel  
about https or supporting both https+ssh.

...

>> Did we make a decision as to the fate of cvs if we get svn
>> up-and-running?  Keep it around (assuming svn commits would be carried
>> over to cvs and vice versa)?  Or see what happens over time?
>
> Well, I don't think hard decisions are possible until we know how  
> it's going to work in practice. I tried setting up my own svn
> repository once, but didn't keep it and can't remember much about it.

Agree; we'll need to work out specifics once we know how things work  
out using cvs2svn.  I think the idea is to test using a smaller  
distribution (maybe network or db) and move up from there.

> So, I suppose we'll play it by ear and decide things later. Is  
> someone out there actively doing something leading toward a  
> demonstration of how it will be?

George Hartzell is going to test it out, I believe, and will post  
something when he can.

chris

From dmessina at wustl.edu  Mon Jun 18 10:54:31 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 18 Jun 2007 09:54:31 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
	<467684FD.3080300@sendu.me.uk>
	<DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>
Message-ID: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu>

[Chris F]
> Will do.  Maybe something that conforms to PBP; there's a PBP
> perltidy config on perlmonks, along with some emacs/vim related bits:
>
> http://www.perlmonks.org/?node_id=516501


FYI, perltidy now has a built-in -pbp flag:

[from perltidy-20070508]
> -pbp, --perl-best-practices
> -pbp is an abbreviation for the parameters in the book Perl Best
> Practices by Damian Conway:
>
>     -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1 -nsfs -nolq
>     -wbb="% + - * / x != == >= <= =~ !~ < > | & =
>           **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x="
> Note that the -st and -se flags make perltidy act as a filter on
> one file only. These can be overridden with -nst and -nse if
> necessary.
>
[full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/bin/perltidy]


Dave


From dmessina at wustl.edu  Mon Jun 18 11:04:10 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 18 Jun 2007 10:04:10 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID: <C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>

Awesome, Sendu! Really glad you implemented this.


> Can anyone offer a
> way to systematically find at least the test scripts which access the
> internet, if not the specific tests within?

I think tests would be accessing the net indirectly through a BioPerl  
module (which may also be using indirect access), so it'd be hard to  
come up with a universal glob for that.

However:

	% grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l
	108

	% ls -1 bioperl-live/t | wc -l
	248

Less than half of the test files use BIOPERLDEBUG, so that narrows  
down the possibilities...

Dave


From bix at sendu.me.uk  Mon Jun 18 11:09:19 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 16:09:19 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
Message-ID: <4676A01F.30205@sendu.me.uk>

David Messina wrote:
>> Can anyone offer a
>> way to systematically find at least the test scripts which access the
>> internet, if not the specific tests within?
> 
> I think tests would be accessing the net indirectly through a BioPerl 
> module (which may also be using indirect access), so it'd be hard to 
> come up with a universal glob for that.
> 
> However:
> 
>     % grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l
>     108
> 
>     % ls -1 bioperl-live/t | wc -l
>     248
> 
> Less than half of the test files use BIOPERLDEBUG, so that narrows down 
> the possibilities...

Not necessarily. The problem is that there may be test scripts that have 
never even tried to skip network tests, and therefore don't use 
BIOPERLDEBUG. (Or that chose their own way to decide when to skip.)
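
For what it's worth, the complement of that grep is easy to list. A
minimal sketch, assuming a grep that supports -L (GNU and BSD grep do);
the function name is invented for illustration. It prints the test
scripts that never mention the marker, which are the ones that would
need checking by hand.

```shell
#!/bin/sh
# Hypothetical helper: print the *.t files below a test directory that
# contain no line matching the marker string.
#   $1 = test directory, e.g. bioperl-live/t
#   $2 = marker to look for (defaults to BIOPERLDEBUG)
tests_without_marker() {
    dir=$1
    marker=${2:-BIOPERLDEBUG}
    # grep -L prints the names of files in which nothing matched.
    find "$dir" -name '*.t' -type f -exec grep -L -- "$marker" {} + | sort
}
```

Usage would be `tests_without_marker bioperl-live/t`. A script showing
up in that list is only a candidate: it may simply not touch the
network at all.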

I was thinking along the lines of, does anyone know how to monitor 
accesses to the network card (or equivalent), getting information on 
which program (test script) requested the access?
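
One possible approach, sketched under the assumption of a Linux box
with strace installed: trace only the connect() system call for each
test script and look for internet address families in the log. The
function names are made up, and the pattern relies on strace printing
addresses as "sa_family=AF_INET" or "sa_family=AF_INET6".

```shell
#!/bin/sh
# Succeeds if an strace log contains a connect() to an IPv4 or IPv6
# address. Local AF_UNIX sockets (nscd and friends) are ignored.
log_shows_network() {
    grep -Eq 'connect\(.*sa_family=AF_INET6?,' "$1"
}

# Run one test script under strace, following child processes, and
# print the script name if it tried to open an internet connection.
trace_test() {
    script=$1
    log=$(mktemp) || return 1
    strace -f -e trace=connect -o "$log" perl "$script" >/dev/null 2>&1
    if log_shows_network "$log"; then
        echo "$script"
    fi
    rm -f "$log"
}
```

Looping `trace_test` over t/*.t would then give a list of scripts that
touch the network, DNS lookups included, without having to take the
machine offline.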

From cjfields at uiuc.edu  Mon Jun 18 11:41:28 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 10:41:28 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
	<467684FD.3080300@sendu.me.uk>
	<DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>
	<67E635BD-FC19-4046-949B-358B671299E6@wustl.edu>
Message-ID: <B3EDFCDD-0F3D-47C8-B3A8-A428F24B265A@uiuc.edu>


On Jun 18, 2007, at 9:54 AM, David Messina wrote:

> [Chris F]
>> Will do.  Maybe something that conforms to PBP; there's a PBP
>> perltidy config on perlmonks, along with some emacs/vim related bits:
>>
>> http://www.perlmonks.org/?node_id=516501
>
>
> FYI, perltidy now has a built-in -pbp flag:
>
> [from perltidy-20070508]
>> -pbp, --perl-best-practices
>> -pbp is an abbreviation for the parameters in the book Perl Best
>> Practices by Damian Conway:
>>
>>     -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1 -nsfs -nolq
>>     -wbb="% + - * / x != == >= <= =~ !~ < > | & =
>>           **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x="
>> Note that the -st and -se flags make perltidy act as a filter on
>> one file only. These can be overridden with -nst and -nse if
>> necessary.
>>
> [full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/bin/perltidy]
>
>
> Dave

<slaps head>  Makes sense that would eventually be incorporated.

If so there's no need to include a config (unless we want to stray
away from PBP-style).  We can just recommend everyone use that setting.

chris

From cjfields at uiuc.edu  Mon Jun 18 12:06:26 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 11:06:26 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676A01F.30205@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
Message-ID: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>


On Jun 18, 2007, at 10:09 AM, Sendu Bala wrote:

> David Messina wrote:
>>> ...
>> Less than half of the test files use BIOPERLDEBUG, so that narrows  
>> down
>> the possibilities...
>
> Not necessarily. The problem is that there may be test scripts that  
> have
> never even tried to skip network tests, and therefore don't use
> BIOPERLDEBUG. (Or that chose their own way to decide when to skip.)
>
> I was thinking along the lines of, does anyone know how to monitor
> accesses to the network card (or equivalent), getting information on
> which program (test script) requested the access?

EUtilities.t uses network tests predominately.  I'll switch over when  
I commit everything from the overhaul.

Couldn't you enable BIOPERLDEBUG, disable network access, then  
iterate through tests checking for those which fail or skip?  I think  
Test::Harness has a way to do this, using execute_tests().
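
A plain shell loop would do as a first pass, too. This is only a sketch
and not the Test::Harness route: it assumes each script exits non-zero
when something fails (as Test::More-based scripts do), that the Bioperl
modules are already on PERL5LIB, and that network access has been
disabled beforehand. The function name and the PERL override are
invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run every *.t script in a directory with
# BIOPERLDEBUG set and print the ones that exit non-zero.
# PERL defaults to "perl" and can be overridden.
failing_tests() {
    for t in "$1"/*.t; do
        [ -f "$t" ] || continue
        if ! BIOPERLDEBUG=1 "${PERL:-perl}" "$t" >/dev/null 2>&1; then
            echo "$t"
        fi
    done
}
```

With the network down, `failing_tests t` should leave roughly the set
of scripts that depend on it, plus anything that was already broken.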

chris


From bix at sendu.me.uk  Mon Jun 18 12:34:38 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 17:34:38 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
	<082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
Message-ID: <4676B41E.3050706@sendu.me.uk>

Chris Fields wrote:
> Couldn't you enable BIOPERLDEBUG, disable network access, then iterate 
> through tests checking for those which fail or skip?

Yes, good idea, though my dev machine is also my email/webserver so I'd 
rather come up with an alternate solution than one involving 'disable 
network access'.

Still, that's what I'll probably end up doing. Cheers!


Oh, Chris, Spiros, how goes the Test::More conversion? I might want to 
wait for you to finish, or join in? If you're not going to have time to 
do any more in the next few weeks, can you please update 
http://www.bioperl.org/wiki/TestMoreProgress removing your name (or in 
the opposite case, add your name in)? It's not quite clear to me which
tests are assigned to whom. Can someone clarify what the markings mean?

Cheers,
Sendu.

From cjfields at uiuc.edu  Mon Jun 18 12:43:31 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 11:43:31 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676B41E.3050706@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
	<082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
	<4676B41E.3050706@sendu.me.uk>
Message-ID: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>


On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> Couldn't you enable BIOPERLDEBUG, disable network access, then  
>> iterate through tests checking for those which fail or skip?
>
> Yes, good idea, though my dev machine is also my email/webserver so  
> I'd rather come up with an alternate solution than one involving  
> 'disable network access'.
>
> Still, that's what I'll probably end up doing. Cheers!
>
>
> Oh, Chris, Spiros, how goes the Test::More conversion? I might want  
> to wait for you to finish, or join in? If you're not going to have  
> time to do any more in the next few weeks, can you please update  
> http://www.bioperl.org/wiki/TestMoreProgress removing your name (or  
> in the opposite case, add your name in)? It's not quite clear to me
> which tests are assigned to whom. Can someone clarify what the  
> markings mean?
>
> Cheers,
> Sendu.

Not sure how far along Spiros is; I handed it over after I finished
up to the 'Q' tests.  In general, the ones marked out have been
converted over, and the ones with names next to them have been claimed.
If you need help I'll probably start back up again to finish them off;
we just need to divvy them up.

chris

From george.heller at yahoo.com  Mon Jun 18 13:07:59 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 10:07:59 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net>
Message-ID: <218165.62089.qm@web56505.mail.re3.yahoo.com>

What exactly is the "node n" in the query below? When I issue this query, it says, 
   
  relation "node" does not exist.
   
  I tried to use the get_all_Descendents method, but it looks like it recurses by calling each_Descendent, and that method is not implemented in Bio::DB::Taxonomy. It has just a single line,
   
  shift->throw_not_implemented();
   
  Thanks.
  George.

Hilmar Lapp <hlapp at gmx.net> wrote:
  I'm a bit confused - it sounds like you have set up a local BioSQL 
database and loaded the NCBI taxonomy into the database. You can now 
use simple SQL to retrieve all descendants of a node in the tree 
given its NCBI taxonID such as

SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
WHERE
n.ncbi_taxon_id = :taxonID
AND tn.left_value > n. left_value
AND tn.right_value < n.right_value
AND tn.taxon_id = tnm.taxon_id
AND tn.name_class = 'scientific_name'

BioPerl doesn't have a Taxonomy::biosql module yet (though this would 
seem like a worthwhile thing to add), so you can't use the 
Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the 
NCBI taxonomy database and indexes it, so you can simply use 
Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download 
to achieve what you wanted to do in a less than 5 lines of perl.

Although the recursive implementation of Taxonomy::get_all_Descendants 
() won't be lightning fast, it may still be perfectly fine for your 
application - are you sure it is not?

-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

> Thanks. And how can I assign the $node here in the below code, such 
> that I can reference it to a particular taxon id record? I want to 
> retrieve all the descendents from the taxonomy hierarchy, given a 
> particular taxon id.
>
> I have a local db setup, in which I have uploaded data using the 
> load_ncbi_taxonomy.pl script.
>
> Thanks.
> George
>
> Jason Stajich wrote:
> I assume you already figured out how to setup a local taxonomydb?
>
>
> You just want the extant species/leaves of the tree
>
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents;
>
>
>
> -jason
> On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
> Hi all,
>
>
> Can anyone point me to some example that uses the 
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at 
> this, and I am not quite sure how to implement it.
>
>
> Thanks.
> George
>
>
> Sendu Bala wrote:
> George Heller wrote:
> Hi all,
>
>
> I am looking at extracting the taxonomy hierarchy for some taxon 
> ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
>
>
> Any ideas on the way I can go about doing this?
>
>
> Well, you'll use Bio::DB::Taxonomy presumably, and 
> each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
>
>
> If you happen to code up something neat and efficient, why not 
> share it
> with us and we could add it to the Taxonomy module(s).
>
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================



From jason at bioperl.org  Mon Jun 18 13:53:28 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 10:53:28 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com>
References: <218165.62089.qm@web56505.mail.re3.yahoo.com>
Message-ID: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>

It is implemented in the implementing class - DB::Taxonomy is just  
the base class. For example see the flatfile implementation  
Bio::DB::Taxonomy::flatfile

See scripts/taxa/local_taxonomydb_query.PLS for an example of using it;
the nodes and names files come from the NCBI taxonomy database.

Here is an un-debugged copy+paste for your question that *should* work.

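use strict;
use warnings;
use Bio::DB::Taxonomy;

my $idx_dir = '/tmp';    # directory where the flatfile index gets written

# nodes.dmp and names.dmp come from the NCBI taxonomy dump
my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
my $db = Bio::DB::Taxonomy->new(-source    => 'flatfile',
                                -nodesfile => $nodesfile,
                                -namesfile => $namesfile,
                                -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');

# keep only the leaves (extant taxa) among all descendants
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;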
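
To illustrate the point about the base class versus the implementing class, here is a minimal, self-contained sketch. The Toy::Node and Toy::Node::InMemory packages and the %CHILDREN table are made up for illustration and are not BioPerl code; the only things borrowed from the thread are the method names each_Descendent, get_all_Descendents and is_Leaf. The base class leaves each_Descendent as a stub, the subclass supplies it, and the generic recursion then works unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical base class: defines the interface and the generic recursion.
package Toy::Node;

sub new {
    my ($class, %args) = @_;
    return bless { id => $args{id} }, $class;
}

sub id { return $_[0]->{id} }

# Stub, playing the role of shift->throw_not_implemented() in the thread.
sub each_Descendent {
    my $self = shift;
    die ref($self) . " does not implement each_Descendent\n";
}

# Depth-first recursion over whatever each_Descendent returns.
sub get_all_Descendents {
    my $self = shift;
    my @found;
    for my $child ($self->each_Descendent) {
        push @found, $child, $child->get_all_Descendents;
    }
    return @found;
}

# A node is a leaf when it has no direct children.
sub is_Leaf {
    my $self     = shift;
    my @children = $self->each_Descendent;
    return @children ? 0 : 1;
}

# Hypothetical implementing class: looks children up in an in-memory table.
package Toy::Node::InMemory;
our @ISA = ('Toy::Node');

# Toy parent => children table (made-up ids, not NCBI taxon ids).
my %CHILDREN = (
    1 => [2, 3],
    2 => [4, 5],
    3 => [],
    4 => [],
    5 => [],
);

sub each_Descendent {
    my $self = shift;
    return map { Toy::Node::InMemory->new(id => $_) }
           @{ $CHILDREN{ $self->id } || [] };
}

package main;

my $root   = Toy::Node::InMemory->new(id => 1);
my @all    = map { $_->id } $root->get_all_Descendents;
my @leaves = map { $_->id } grep { $_->is_Leaf } $root->get_all_Descendents;

print "all descendants: @all\n";
print "leaves only:     @leaves\n";
```

With the toy table above this prints 2 4 5 3 for all descendants and 4 5 3 for the leaves. Calling each_Descendent on a plain Toy::Node dies with the "does not implement" message, which is the same situation as calling it on the bare interface class.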


-jason


--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From hlapp at gmx.net  Mon Jun 18 18:10:00 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:10:00 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<4676880C.9030009@sendu.me.uk>
	<278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>
Message-ID: <989DBD68-896E-4FB9-9413-4A1060E88ABD@gmx.net>

https is working fine for me for sf.net repositories, and I only have  
to enter the password upon first commit (since checkout doesn't even  
need a password).

	-hilmar

On Jun 18, 2007, at 10:24 AM, Chris Fields wrote:

> Not sure how Jason/Hilmar/Chris D. feel about https or supporting  
> both https+ssh

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From george.heller at yahoo.com  Mon Jun 18 18:18:21 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 15:18:21 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>
Message-ID: <904670.24974.qm@web56513.mail.re3.yahoo.com>

I tried running the script below, and I am getting the following error:
   
  Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
Compilation failed in require at my.pl line 7.
BEGIN failed--compilation aborted at my.pl line 7.

  My script looks something like,
   
#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;

my $idx_dir = '/tmp';

# the variable name must match the one passed to -nodesfile below
my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodesfile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

# under "use strict" the loop variable has to be declared with "my"
foreach my $field (@extant_children) {
    print "$field";
    print "|";
    print "\n";
}

  And I am running the script using the command,
   
  perl myscript.pl -v --names names.dmp --nodes nodes.dmp
   
  and I have the nodes.dmp and names.dmp files in the current directory.
   
  Thanks,
  George
  


From hlapp at gmx.net  Mon Jun 18 18:27:19 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:27:19 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com>
References: <218165.62089.qm@web56505.mail.re3.yahoo.com>
Message-ID: <DEB0D23B-4FEC-418A-8AAB-FF4CBF4DAF65@gmx.net>


On Jun 18, 2007, at 1:07 PM, George Heller wrote:

> What exactly is the "node n" in the query below. When I issue this  
> query, it says,

Sorry, replace with "taxon". Jason answered the rest.
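
For anyone reading along, here is a small sketch of what the corrected query does. The SQL in the comment is the quoted query with "node n" replaced by "taxon n"; moving name_class to the taxon_name alias and spelling the class as 'scientific name' are assumptions about the BioSQL schema and the NCBI naming, so check them against your own database. The Perl below does not touch a database at all: the @taxon rows, their ids and the descendants_of helper are invented for illustration, and only show why the left_value/right_value comparison picks out a whole subtree.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Corrected query, with the alias n bound to the taxon table:
#
#   SELECT tn.*, tnm.name
#   FROM taxon tn, taxon_name tnm, taxon n
#   WHERE n.ncbi_taxon_id = :taxonID
#     AND tn.left_value  > n.left_value
#     AND tn.right_value < n.right_value
#     AND tn.taxon_id    = tnm.taxon_id
#     AND tnm.name_class = 'scientific name'   -- assumed column/value
#
# Toy rows numbered nested-set style (made-up ids):
#
#   1 (1,10)
#   |-- 2 (2,7)
#   |   |-- 4 (3,4)
#   |   `-- 5 (5,6)
#   `-- 3 (8,9)
my @taxon = (
    { ncbi_taxon_id => 1, left_value => 1, right_value => 10 },
    { ncbi_taxon_id => 2, left_value => 2, right_value => 7  },
    { ncbi_taxon_id => 3, left_value => 8, right_value => 9  },
    { ncbi_taxon_id => 4, left_value => 3, right_value => 4  },
    { ncbi_taxon_id => 5, left_value => 5, right_value => 6  },
);

# Hypothetical helper: same containment test as the WHERE clause above.
sub descendants_of {
    my ($taxon_id, @rows) = @_;

    # the row playing the role of "taxon n"
    my ($n) = grep { $_->{ncbi_taxon_id} == $taxon_id } @rows;
    return () unless $n;

    # a descendant's interval lies strictly inside the ancestor's interval
    my @ids = map  { $_->{ncbi_taxon_id} }
              grep {    $_->{left_value}  > $n->{left_value}
                     && $_->{right_value} < $n->{right_value} } @rows;
    @ids = sort { $a <=> $b } @ids;
    return @ids;
}

my @under_1 = descendants_of(1, @taxon);
my @under_2 = descendants_of(2, @taxon);
print "descendants of 1: @under_1\n";
print "descendants of 2: @under_2\n";
```

No recursion is needed here, which is the attraction of the left/right numbering: one range comparison returns children, grandchildren and so on in a single pass.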

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 18 18:33:40 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 17:33:40 -0500
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <904670.24974.qm@web56513.mail.re3.yahoo.com>
References: <904670.24974.qm@web56513.mail.re3.yahoo.com>
Message-ID: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>

As the error implies, your local version of perl doesn't seem to support
weak references, which means it doesn't have Scalar::Util (which was
added to core after perl 5.6.1, I think).  Try installing
Scalar::Util to see what happens.
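
A quick way to check this outside of BioPerl is sketched below. It only uses Scalar::Util itself; the weaken_available and demo_weak_link helpers are made up for this example. As far as I know, the "Weak references are not implemented" message comes from asking Scalar::Util to export weaken when the installed copy cannot provide it, so trying that import inside an eval should reproduce the problem on an affected machine.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util ();

# Hypothetical helper: returns 1 if Scalar::Util can export weaken(),
# 0 if the import dies (as in the error reported in this thread).
sub weaken_available {
    my $ok = eval { Scalar::Util->import('weaken'); 1 };
    return $ok ? 1 : 0;
}

# Hypothetical helper: build a parent/child cycle and weaken the
# upward link, which is the usual way tree nodes avoid leaking memory.
sub demo_weak_link {
    my %parent = (name => 'parent');
    my %child  = (name => 'child', ancestor => \%parent);
    $parent{child} = \%child;                  # strong link down
    Scalar::Util::weaken($child{ancestor});    # weak link back up
    return Scalar::Util::isweak($child{ancestor}) ? 1 : 0;
}

print "perl $], Scalar::Util $Scalar::Util::VERSION\n";

if (weaken_available()) {
    print "weaken is available\n";
    print "upward link is weak: ", (demo_weak_link() ? 'yes' : 'no'), "\n";
}
else {
    print "weaken is NOT available: $@";
    print "reinstalling Scalar::Util (List-Util on CPAN) may help\n";
}
```

If the script reports that weaken is not available, the output of the first print line (perl and Scalar::Util versions) is worth posting to the list along with the error.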

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Mon Jun 18 18:50:38 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:50:38 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>
References: <904670.24974.qm@web56513.mail.re3.yahoo.com>
	<707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>
Message-ID: <F433CCB4-781D-480E-8EF5-CD68E70B27B8@gmx.net>

The perl version appears to be 5.8.5 though, so something strange  
appears to be going on too.

George, can you please post the output of

	$ /usr/bin/perl -V

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:

> As the error implies, your local version of perl doesn't seem to support
> weak references, which means it doesn't have Scalar::Util (which was
> added to core after perl 5.6.1, I think).  Try installing
> Scalar::Util to see what happens.
>
> chris
>
> On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>
>> I tried running the below mentioned script and I seem to be getting
>> the following error:
>>
>>   Weak references are not implemented in the version of perl at /
>> usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
>> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/
>> Bio/Tree/Node.pm line 76.
>> Compilation failed in require at my.pl line 7.
>> BEGIN failed--compilation aborted at my.pl line 7.
>>
>>   My script looks something like,
>>
>>   #!/usr/bin/perl
>>   use strict;
>> #use warnings;
>> use DBI;
>>   use Bio::Tree::Node;
>> use Bio::DB::Taxonomy;
>> use Bio::DB::Taxonomy::flatfile;
>>   my $idx_dir = '/tmp';
>>
>> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
>> my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
>>                                -nodesfile => $nodesfile,
>>                                -namesfile => $namesfile,
>>                                -directory => $idx_dir);
>>  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>>  my @extant_children = grep { $_->is_Leaf } $node-
>>> get_all_Descendents;
>>
>>       foreach $field (@extant_children) {
>>          print "$field";
>>          print "|";
>>          print "\n";
>>       }
>>
>>   And I am running the script using the command,
>>
>>   perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>>
>>   and I have the nodes.dmp and names.dmp files in the current
>> directory.
>>
>>   Thanks,
>>   George
>>
>>
>> Jason Stajich <jason at bioperl.org> wrote:
>>   It is implemented in the implementing class - DB::Taxonomy is
>> just the base class. For example see the flatfile implementation
>> Bio::DB::Taxonomy::flatfile
>>
>>   See the scripts/taxa/local_taxonomydb_query.PLS for example using
>> it:
>>   nodes and names are from NCBI taxonomy database.
>>
>>
>>   Here is an un-debugged copy+paste for your question that *should*
>> work.
>>
>>
>>   use Bio::DB::Taxonomy
>>   my $idx_dir = '/tmp';
>>
>>
>>   my ($nodefile,$namesfile) = ('nodes.dmp,'names.dmp');
>>     my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
>>                                  -nodesfile => $nodesfile,
>>                                  -namesfile => $namesfile,
>>                                  -directory => $idx_dir);
>>      my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>>  my @extant_children = grep { $_->is_Leaf } $node-
>>> get_all_Descendents;
>>
>>
>>
>>
>>   -jason
>>
>>     On Jun 18, 2007, at 10:07 AM, George Heller wrote:
>>
>>     What exactly is the "node n" in the query below. When I issue
>> this query, it says,
>>
>>
>>     relation "node" does not exist.
>>
>>
>>     I tried to use the get_all_Descendents method but it looks like
>> in order to do a recursive call it calls the method
>> each_Descendent. This method is not implemented in
>> Bio::DB::Taxonomy. It just has a single line,
>>
>>
>>     shift->throw_not_implemented();
>>
>>
>>     Thanks.
>>     George.
>>
>>
>>   Hilmar Lapp <hlapp at gmx.net> wrote:
>>     I'm a bit confused - it sounds like you have set up a local  
>> BioSQL
>>   database and loaded the NCBI taxonomy into the database. You can  
>> now
>>   use simple SQL to retrieve all descendants of a node in the tree
>>   given its NCBI taxonID such as
>>
>>
>>   SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
>>   WHERE
>>   n.ncbi_taxon_id = :taxonID
>>   AND tn.left_value > n. left_value
>>   AND tn.right_value < n.right_value
>>   AND tn.taxon_id = tnm.taxon_id
>>   AND tn.name_class = 'scientific_name'
>>
>>
>>   BioPerl doesn't have a Taxonomy::biosql module yet (though this would
>>   seem like a worthwhile thing to add), so you can't use the
>>   Bio::DB::Taxonomy interface to do this against a BioSQL instance.
>>
>>
>>   However, BioPerl does support the flat-file download of the
>>   NCBI taxonomy database and indexes it, so you can simply use
>>   Taxonomy::{get_taxon,get_all_Descendents} with the flat-file download
>>   to achieve what you wanted to do in less than 5 lines of Perl.
>>
>>
>>   Although the recursive implementation of Taxonomy::get_all_Descendents()
>>   won't be lightning fast, it may still be perfectly fine for your
>>   application - are you sure it is not?
>>
>>
>>   -hilmar
>>
>>
>>   On Jun 18, 2007, at 12:21 AM, George Heller wrote:
>>
>>
>>     Thanks. And how can I assign the $node here in the below code,
>> such
>>   that I can reference it to a particular taxon id record? I want to
>>   retrieve all the descendents from the taxonomy hierarchy, given a
>>   particular taxon id.
>>
>>
>>   I have a local db setup, in which I have uploaded data using the
>>   load_ncbi_taxonomy.pl script.
>>
>>
>>   Thanks.
>>   George
>>
>>
>>   Jason Stajich wrote:
>>   I assume you already figured out how to setup a local taxonomydb?
>>
>>
>>
>>
>>   You just want the extant species/leaves of the tree
>>
>>
>>
>>
>>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>
>>
>>
>>
>>
>>
>>   -jason
>>   On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>>
>>
>>   Hi all,
>>
>>
>>
>>
>>   Can anyone point me to some example that uses the
>>   get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
>>   this, and I am not quite sure how to implement it.
>>
>>
>>
>>
>>   Thanks.
>>   George
>>
>>
>>
>>
>>   Sendu Bala wrote:
>>   George Heller wrote:
>>   Hi all,
>>
>>
>>
>>
>>   I am looking at extracting the taxonomy hierarchy for some taxon
>>   ids.
>>   What I plan to do is, for a given taxon id, say 33090, I want to
>>   extract all taxon ids that are children of this species. I do not
>>   just want the immediate children, but the children's children  
>> and so
>>   on.
>>
>>
>>
>>
>>   Any ideas on the way I can go about doing this?
>>
>>
>>
>>
>>   Well, you'll use Bio::DB::Taxonomy presumably, and
>>   each_Descendent in
>>   some kind of looping structure. Most easily a recursing sub.
>>
>>
>>
>>
>>   If you happen to code up something neat and efficient, why not
>>   share it
>>   with us and we could add it to the Taxonomy module(s).
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From george.heller at yahoo.com  Mon Jun 18 19:05:42 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 16:05:42 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <F433CCB4-781D-480E-8EF5-CD68E70B27B8@gmx.net>
Message-ID: <706979.34648.qm@web56509.mail.re3.yahoo.com>

This is the output of /usr/bin/perl -V

Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
  Platform:
    osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
    uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.3.4'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
  
Characteristics of this binary (from libperl):
  Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
  Built under linux
  Compiled at Jul 24 2006 18:28:10
  @INC:
    /usr/lib/perl5/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/5.8.5
    /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.5
    /usr/lib/perl5/site_perl/5.8.4
    /usr/lib/perl5/site_perl/5.8.3
    /usr/lib/perl5/site_perl/5.8.2
    /usr/lib/perl5/site_perl/5.8.1
    /usr/lib/perl5/site_perl/5.8.0
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.5
    /usr/lib/perl5/vendor_perl/5.8.4
    /usr/lib/perl5/vendor_perl/5.8.3
    /usr/lib/perl5/vendor_perl/5.8.2
    /usr/lib/perl5/vendor_perl/5.8.1
    /usr/lib/perl5/vendor_perl/5.8.0
    /usr/lib/perl5/vendor_perl
   
  Thanks.
  George

Hilmar Lapp <hlapp at gmx.net> wrote:
  The perl version appears to be 5.8.5 though, so something strange 
appears to be going on too.

George, can you please post the output of

$ /usr/bin/perl -V

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:

> As the error implies, your local version of perl doesn't seem to support
> weak references, which means it doesn't have Scalar::Util (which was
> added to core after perl 5.6.1, I think). Try installing
> Scalar::Util to see what happens.
>
> chris
>
> On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>
>> I tried running the below mentioned script and I seem to be getting
>> the following error:
>>
>> Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
>> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
>> Compilation failed in require at my.pl line 7.
>> BEGIN failed--compilation aborted at my.pl line 7.
>>
>> My script looks something like,
>>
>> #!/usr/bin/perl
>> use strict;
>> #use warnings;
>> use DBI;
>> use Bio::Tree::Node;
>> use Bio::DB::Taxonomy;
>> use Bio::DB::Taxonomy::flatfile;
>> my $idx_dir = '/tmp';
>>
>> my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
>> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>>     -nodesfile => $nodesfile,
>>     -namesfile => $namesfile,
>>     -directory => $idx_dir);
>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>
>> foreach my $field (@extant_children) {
>>     print "$field";
>>     print "|";
>>     print "\n";
>> }
>>
>> And I am running the script using the command,
>>
>> perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>>
>> and I have the nodes.dmp and names.dmp files in the current
>> directory.
>>
>> Thanks,
>> George

From jason at bioperl.org  Mon Jun 18 19:22:08 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 16:22:08 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <706979.34648.qm@web56509.mail.re3.yahoo.com>
References: <706979.34648.qm@web56509.mail.re3.yahoo.com>
Message-ID: <C93DF7A1-20AC-4474-BBC6-0C2598406EEB@bioperl.org>

Try installing the latest Scalar::Util
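
A quick way to check whether that helped: as far as I know, this
particular message is raised by Scalar::Util itself when it was installed
without its compiled (XS) part, which would explain why a perl as recent
as 5.8.5 can still report it. The sketch below only assumes that
Scalar::Util can be loaded; weaken_works is a made-up helper name, not
part of any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util ();    # import nothing; weaken is called by its full name below

# weaken_works() is a hypothetical helper name, not part of Scalar::Util.
# It returns 1 if weaken() can be called in this perl, 0 otherwise.
sub weaken_works {
    my $target = {};
    my $copy   = $target;    # second reference to the same hash
    return eval { Scalar::Util::weaken($copy); 1 } ? 1 : 0;
}

printf "Scalar::Util %s: weaken %s\n",
    Scalar::Util->VERSION,
    weaken_works() ? 'works' : 'is NOT available';
```

If this reports that weaken is not available, Bio::Tree::Node will most
likely keep failing at the same line until Scalar::Util is reinstalled
with a compiler available.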


--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From george.heller at yahoo.com  Mon Jun 18 20:04:00 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:04:00 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <C93DF7A1-20AC-4474-BBC6-0C2598406EEB@bioperl.org>
Message-ID: <424035.72876.qm@web56507.mail.re3.yahoo.com>

Ok, I installed the latest Scalar::Util and the script seems to be working. But I am confused about where exactly I need to look for the descendent taxon ids once the script has run. I did look in the /tmp/ directory, but I couldn't understand much.

Sorry to be bothering you; I really appreciate your patience.

Thanks.
George
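
For what it is worth, the files in /tmp should only be the index that the
flatfile module builds from nodes.dmp and names.dmp; the descendants
themselves are whatever the foreach loop prints to the terminal. Since
print "$field" prints the object reference rather than the taxon, a
sketch like the following may be closer to what is wanted. It assumes
the objects in @extant_children offer id and scientific_name accessors
(Bio::Taxon does, as far as I know). My::Taxon and taxon_lines are
made-up stand-ins so the snippet runs without the dump files, and the
two records are invented.

```perl
use strict;
use warnings;

# My::Taxon is a made-up stand-in for the objects returned by
# get_all_Descendents; it mimics only the two accessors used below.
package My::Taxon;
sub new             { my ($class, %args) = @_; return bless {%args}, $class }
sub id              { return $_[0]{id} }
sub scientific_name { return $_[0]{name} }

package main;

# taxon_lines() is a hypothetical helper: one "id|name" string per taxon.
sub taxon_lines {
    return map { join '|', $_->id, $_->scientific_name } @_;
}

# Invented records, standing in for the real @extant_children.
my @extant_children = (
    My::Taxon->new(id => 101, name => 'Example species A'),
    My::Taxon->new(id => 102, name => 'Example species B'),
);

print "$_\n" for taxon_lines(@extant_children);
```

In the real script the same loop body would run over the list returned
by get_all_Descendents, and the output could be redirected to a file
from the shell.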

Jason Stajich <jason at bioperl.org> wrote:
  Try installing the latest Scalar::Util  
    On Jun 18, 2007, at 4:05 PM, George Heller wrote:

    This is the output of /usr/bin/perl -V
  

  Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
    Platform:
      osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
      uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
      config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
      hint=recommended, useposix=true, d_sigaction=define
      usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
      useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
      use64bitint=undef use64bitall=undef uselongdouble=undef
      usemymalloc=n, bincompat5005=undef
    Compiler:
      cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
      optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
      cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
      ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
      intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
      d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
      ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
      alignbytes=4, prototype=define
    Linker and Libraries:
      ld='gcc', ldflags =' -L/usr/local/lib'
      libpth=/usr/local/lib /lib /usr/lib
      libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
      perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
      libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
      gnulibc_version='2.3.4'
    Dynamic Linking:
      dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
      cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
  

  Characteristics of this binary (from libperl):
    Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
    Built under linux
    Compiled at Jul 24 2006 18:28:10
    @INC:
      /usr/lib/perl5/5.8.5/i386-linux-thread-multi
      /usr/lib/perl5/5.8.5
      /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
      /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
      /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
      /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
      /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
      /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
      /usr/lib/perl5/site_perl/5.8.5
      /usr/lib/perl5/site_perl/5.8.4
      /usr/lib/perl5/site_perl/5.8.3
      /usr/lib/perl5/site_perl/5.8.2
      /usr/lib/perl5/site_perl/5.8.1
      /usr/lib/perl5/site_perl/5.8.0
      /usr/lib/perl5/site_perl
      /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
      /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
      /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
      /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
      /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
      /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
      /usr/lib/perl5/vendor_perl/5.8.5
      /usr/lib/perl5/vendor_perl/5.8.4
      /usr/lib/perl5/vendor_perl/5.8.3
      /usr/lib/perl5/vendor_perl/5.8.2
      /usr/lib/perl5/vendor_perl/5.8.1
      /usr/lib/perl5/vendor_perl/5.8.0
      /usr/lib/perl5/vendor_perl
  

    Thanks.
    George
      .
  

  Hilmar Lapp <hlapp at gmx.net> wrote:
    The perl version appears to be 5.8.5 though, so something strange 
  appears to be going on too.
  

  George, can you please post the output of
  

  $ /usr/bin/perl -V
  

  -hilmar
  

  On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
  

    As the error implies, your local version of perl doesn't seem to support
  weak references, which means it doesn't have Scalar::Util (which was
  added to core after perl 5.6.1, I think). Try installing
  Scalar::Util to see what happens.
  

  chris
  

  On Jun 18, 2007, at 5:18 PM, George Heller wrote:
  

    I tried running the below mentioned script and I seem to be getting
  the following error:
  

  Weak references are not implemented in the version of perl at
  /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
  BEGIN failed--compilation aborted at
  /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
  Compilation failed in require at my.pl line 7.
  BEGIN failed--compilation aborted at my.pl line 7.
  

  My script looks something like,
  

  #!/usr/bin/perl
  use strict;
  #use warnings;
  use DBI;
  use Bio::Tree::Node;
  use Bio::DB::Taxonomy;
  use Bio::DB::Taxonomy::flatfile;
  my $idx_dir = '/tmp';
  

  my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
  my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
  -nodesfile => $nodefile,
  -namesfile => $namesfile,
  -directory => $idx_dir);
  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

  foreach my $field (@extant_children) {
  print "$field";
  print "|";
  print "\n";
  }
  

  And I am running the script using the command,
  

  perl myscript.pl -v --names names.dmp --nodes nodes.dmp
  

  and I have the nodes.dmp and names.dmp files in the current
  directory.
  

  Thanks,
  George
  

  Jason Stajich wrote:
  It is implemented in the implementing class - DB::Taxonomy is
  just the base class. For example see the flatfile implementation
  Bio::DB::Taxonomy::flatfile
  

  See the scripts/taxa/local_taxonomydb_query.PLS for example using
  it:
  nodes and names are from NCBI taxonomy database.
  

  Here is an un-debugged copy+paste for your question that *should*
  work.
  

  use Bio::DB::Taxonomy;
  my $idx_dir = '/tmp';
  

  my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
  my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
  -nodesfile => $nodefile,
  -namesfile => $namesfile,
  -directory => $idx_dir);
  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

  -jason
  

  On Jun 18, 2007, at 10:07 AM, George Heller wrote:
  

  What exactly is the "node n" in the query below. When I issue
  this query, it says,
  

  relation "node" does not exist.
  

  I tried to use the get_all_Descendents method but it looks like
  in order to do a recursive call it calls the method
  each_Descendent. This method is not implemented in
  Bio::DB::Taxonomy. It just has a single line,
  

  shift->throw_not_implemented();
  

  Thanks.
  George.
  

  Hilmar Lapp wrote:
  I'm a bit confused - it sounds like you have set up a local BioSQL
  database and loaded the NCBI taxonomy into the database. You can now
  use simple SQL to retrieve all descendants of a node in the tree
  given its NCBI taxonID, such as
  

  SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
  WHERE
  n.ncbi_taxon_id = :taxonID
  AND tn.left_value > n.left_value
  AND tn.right_value < n.right_value
  AND tn.taxon_id = tnm.taxon_id
  AND tn.name_class = 'scientific_name'
  

  BioPerl doesn't have a Taxonomy::biosql module yet (though this would
  seem like a worthwhile thing to add), so you can't use the
  Bio::DB::Taxonomy interface to do this against a BioSQL instance.
  

  However, BioPerl does have support for the flat-file download of the
  NCBI taxonomy database and indexes it, so you can simply use
  Taxonomy::{get_taxon,get_all_Descendents} on the flatfile download
  to achieve what you want in fewer than 5 lines of perl.
  

  Although the recursive implementation of Taxonomy::get_all_Descendents()
  won't be lightning fast, it may still be perfectly fine for your
  application - are you sure it is not?
  

  -hilmar
  

  On Jun 18, 2007, at 12:21 AM, George Heller wrote:
  

  Thanks. And how can I assign the $node in the code below, such
  that it refers to a particular taxon id record? I want to
  retrieve all the descendants from the taxonomy hierarchy, given a
  particular taxon id.
  

  I have a local db setup, in which I have uploaded data using the
  load_ncbi_taxonomy.pl script.
  

  Thanks.
  George
  

  Jason Stajich wrote:
  I assume you already figured out how to setup a local taxonomydb?
  

  You just want the extant species/leaves of the tree
  

  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

  -jason
  On Jun 17, 2007, at 11:41 AM, George Heller wrote:
  

  Hi all,
  

  Can anyone point me to some example that uses the
  get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
  this, and I am not quite sure how to implement it.
  

  Thanks.
  George
  

  Sendu Bala wrote:
  George Heller wrote:
  Hi all,
  

  I am looking at extracting the taxonomy hierarchy for some taxon ids.
  What I plan to do is, for a given taxon id, say 33090, I want to
  extract all taxon ids that are children of this species. I do not
  just want the immediate children, but the children's children and so
  on.
  

  Any ideas on the way I can go about doing this?
  

  Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
  some kind of looping structure. Most easily a recursing sub.
  

  If you happen to code up something neat and efficient, why not share it
  with us and we could add it to the Taxonomy module(s).
  

  

  --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/
  

  

  --
  ===========================================================
  : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
  ===========================================================
  

  

  --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/
  

  

  Christopher Fields
  Postdoctoral Researcher
  Lab of Dr. Robert Switzer
  Dept of Biochemistry
  University of Illinois Urbana-Champaign
  

  

  -- 
  ===========================================================
  : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
  ===========================================================
  



    --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/



From jason at bioperl.org  Mon Jun 18 20:17:34 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 17:17:34 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <424035.72876.qm@web56507.mail.re3.yahoo.com>
References: <424035.72876.qm@web56507.mail.re3.yahoo.com>
Message-ID: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>

All the children are in this array.

You get to decide what you want to do with them. In the following
example I print the id, rank, and scientific name to the screen.
Because this is a taxonomy db query you are getting back
Bio::Taxonomy::Taxon objects, so read the documentation for that
module to see what you can do with the object.
I would also suggest spending a little time with the Getting Started
and HOWTO:Trees documentation on the website to get familiar with the
objects and nomenclature.


my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

for my $child ( @extant_children ) {
   print "id is ", $child->id, "\n"; # NCBI taxa id
   print "rank is ", $child->rank, "\n"; # e.g. species
   print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}
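
In case it helps to see what that grep is doing conceptually, here is a toy sketch in plain Perl (no BioPerl needed). It is only an illustration under my own assumptions, not how Bio::DB::Taxonomy is implemented: the ids are made up rather than real NCBI taxon ids, and build_children and all_descendants are hypothetical helper names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy stand-in for the first two columns of nodes.dmp:
# [ tax_id, parent_tax_id ]. These ids are invented for illustration.
my @rows = ( [2, 1], [3, 1], [4, 2], [5, 2], [6, 3] );

# Build a parent -> [children] lookup table.
sub build_children {
    my (@pairs) = @_;
    my %children;
    for my $pair (@pairs) {
        my ($id, $parent) = @$pair;
        push @{ $children{$parent} }, $id;
    }
    return %children;
}

# Recursively collect every descendant of $id
# (children, grandchildren, and so on).
sub all_descendants {
    my ($children, $id) = @_;
    my @found;
    for my $kid (@{ $children->{$id} || [] }) {
        push @found, $kid, all_descendants($children, $kid);
    }
    return @found;
}

my %children    = build_children(@rows);
my @descendants = all_descendants(\%children, 1);

# A leaf is a node that has no children of its own.
my @leaves = grep { !exists $children{$_} } @descendants;

print "descendants: @descendants\n";
print "leaves: @leaves\n";
```

Run as-is, this prints the five descendants of node 1 (2 4 5 3 6) and then the three leaves (4 5 6), which is the same "all descendants, then keep only the leaves" shape as the BioPerl one-liner above.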

On Jun 18, 2007, at 5:04 PM, George Heller wrote:

> Ok, I installed the latest Scalar::Util and the script seems to
> be working. But I am confused about where exactly I need to look for
> the descendant taxon ids once the script has run. I did look in the
> /tmp/ directory, but I couldn't understand much.
>
>   Sorry to bother you; I really appreciate your patience.
>
>   Thanks.
>   George

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From george.heller at yahoo.com  Mon Jun 18 20:29:31 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:29:31 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>
Message-ID: <369098.81077.qm@web56507.mail.re3.yahoo.com>

But the problem is that I don't really get any output on the screen. In the /tmp directory I get 4 files, namely parents, nodes, id2names and names2id, but I don't know what to make of them. This is what my script looks like:

#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;

my $idx_dir = '/tmp';
my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

for my $child ( @extant_children ) {
  print "id is ", $child->id, "\n"; # NCBI taxa id
  print "rank is ", $child->rank, "\n"; # e.g. species
  print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

Thanks.
George
  
Jason Stajich <jason at bioperl.org> wrote:
    All the children are in this array.  
  

  You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name out to the screen.  
  Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects so read the documentation for this module to see what you can do with the object.
    I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.
  

  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

  for my $child ( @extant_children ) {
      print "id is ", $child->id, "\n"; # NCBI taxa id
    print "rank is ", $child->rank, "\n"; # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
  }


  

    Hilmar Lapp <hlapp at gmx.net> wrote:
      The perl version appears to be 5.8.5 though, so something strange 
    appears to be going on too.
  

    George, can you please post the output of
  

    $ /usr/bin/perl -V
  

    -hilmar
  

    On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
  

      As the error implies, your local version of perl doesn't seem to support
    weak references, which means it doesn't have Scalar::Util (which was
    added to core after perl 5.6.1, I think). Try installing
    Scalar::Util to see what happens.
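
A quick way to check the diagnosis above is to ask perl directly whether weaken() works. This is only a minimal sketch: the helper name weaken_supported is made up for illustration, and it assumes Scalar::Util can at least be loaded.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util ();

# Hypothetical helper: returns 1 if weaken() works on this perl, 0 otherwise.
sub weaken_supported {
    my $target = {};
    my $ref    = $target;
    # A Scalar::Util built without XS support dies here instead of weakening.
    my $ok = eval { Scalar::Util::weaken($ref); 1 };
    return $ok ? 1 : 0;
}

printf "perl %s, Scalar::Util %s\n", $], Scalar::Util->VERSION;
print weaken_supported()
    ? "weak references are available\n"
    : "weak references are NOT available; try reinstalling Scalar::Util\n";
```

If this reports that weak references are not available, any module that imports weaken at compile time will likely fail in the same way.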
  

    chris
  

    On Jun 18, 2007, at 5:18 PM, George Heller wrote:
  

      I tried running the below mentioned script and I seem to be getting
    the following error:
  

    Weak references are not implemented in the version of perl at
    /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
    BEGIN failed--compilation aborted at
    /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
    Compilation failed in require at my.pl line 7.
    BEGIN failed--compilation aborted at my.pl line 7.
  

    My script looks something like,
  

    #!/usr/bin/perl
    use strict;
    #use warnings;
    use DBI;
    use Bio::Tree::Node;
    use Bio::DB::Taxonomy;
    use Bio::DB::Taxonomy::flatfile;
    my $idx_dir = '/tmp';   # directory where the flatfile index is written

    # NCBI taxonomy dump files, expected in the current directory
    my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
    my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
                                   -nodesfile => $nodefile,
                                   -namesfile => $namesfile,
                                   -directory => $idx_dir);
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    # keep only the leaves (extant taxa) among all descendants
    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

    foreach my $field (@extant_children) {
        print "$field";
        print "|";
        print "\n";
    }
  

    And I am running the script using the command,
  

    perl myscript.pl -v --names names.dmp --nodes nodes.dmp
  

    and I have the nodes.dmp and names.dmp files in the current
    directory.
  

    Thanks,
    George
  

    Jason Stajich wrote:
    It is implemented in the implementing class - DB::Taxonomy is
    just the base class. For example see the flatfile implementation
    Bio::DB::Taxonomy::flatfile
  

    See the scripts/taxa/local_taxonomydb_query.PLS for example using
    it:
    nodes and names are from NCBI taxonomy database.
  

    Here is an un-debugged copy+paste for your question that *should*
    work.
  

    use Bio::DB::Taxonomy;
    my $idx_dir = '/tmp';

    my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
    my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
                                   -nodesfile => $nodefile,
                                   -namesfile => $namesfile,
                                   -directory => $idx_dir);
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

    -jason
  

    On Jun 18, 2007, at 10:07 AM, George Heller wrote:
  

    What exactly is the "node n" in the query below. When I issue
    this query, it says,
  

    relation "node" does not exist.
  

    I tried to use the get_all_Descendents method but it looks like
    in order to do a recursive call it calls the method
    each_Descendent. This method is not implemented in
    Bio::DB::Taxonomy. It just has a single line,
  

    shift->throw_not_implemented();
  

    Thanks.
    George.
  

    Hilmar Lapp wrote:
    I'm a bit confused - it sounds like you have set up a local BioSQL
    database and loaded the NCBI taxonomy into the database. You can now
    use simple SQL to retrieve all descendants of a node in the tree
    given its NCBI taxonID such as
  

    SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
    WHERE
    n.ncbi_taxon_id = :taxonID
    AND tn.left_value > n. left_value
    AND tn.right_value < n.right_value
    AND tn.taxon_id = tnm.taxon_id
    AND tn.name_class = 'scientific_name'
  

    BioPerl doesn't have a Taxonomy::biosql module yet (though this would
    seem like a worthwhile thing to add), so you can't use the
    Bio::DB::Taxonomy interface to do this against a BioSQL instance.
  

    However, BioPerl does support the flat-file download of the
    NCBI taxonomy database and indexes it, so you can simply use
    Taxonomy::{get_taxon,get_all_Descendents} on the flatfile download
    to achieve what you wanted in fewer than 5 lines of perl.
  

    Although the recursive implementation of
    Taxonomy::get_all_Descendents() won't be lightning fast, it may still
    be perfectly fine for your application - are you sure it is not?
  

    -hilmar
  

    On Jun 18, 2007, at 12:21 AM, George Heller wrote:
  

    Thanks. And how can I assign the $node here in the below code, such
    that I can reference it to a particular taxon id record? I want to
    retrieve all the descendents from the taxonomy hierarchy, given a
    particular taxon id.
  

    I have a local db setup, in which I have uploaded data using the
    load_ncbi_taxonomy.pl script.
  

    Thanks.
    George
  

    Jason Stajich wrote:
    I assume you already figured out how to setup a local taxonomydb?
  

    You just want the extant species/leaves of the tree
  

    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

    -jason
    On Jun 17, 2007, at 11:41 AM, George Heller wrote:
  

    Hi all,
  

    Can anyone point me to some example that uses the
    get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
    this, and I am not quite sure how to implement it.
  

    Thanks.
    George
  

    Sendu Bala wrote:
    George Heller wrote:
    Hi all,
  

    I am looking at extracting the taxonomy hierarchy for some taxon ids.
    What I plan to do is, for a given taxon id, say 33090, I want to
    extract all taxon ids that are children of this species. I do not
    just want the immediate children, but the children's children and so
    on.
  

    Any ideas on the way I can go about doing this?
  

    Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
    some kind of looping structure. Most easily a recursing sub.
  

    If you happen to code up something neat and efficient, why not share it
    with us and we could add it to the Taxonomy module(s).
  

    _______________________________________________
    Bioperl-l mailing list
    Bioperl-l at lists.open-bio.org
    http://lists.open-bio.org/mailman/listinfo/bioperl-l
  

    --
    Jason Stajich
    jason at bioperl.org
    http://jason.open-bio.org/
  

    --
    ===========================================================
    : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
    ===========================================================
  

    Christopher Fields
    Postdoctoral Researcher
    Lab of Dr. Robert Switzer
    Dept of Biochemistry
    University of Illinois Urbana-Champaign
  


From jason at bioperl.org  Mon Jun 18 21:05:43 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 18:05:43 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <369098.81077.qm@web56507.mail.re3.yahoo.com>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
Message-ID: <F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>

The files are indexes because you are indexing a flatfile - this  
speeds up the lookup so the second time you run the script it doesn't  
have to index.
You don't need to look at the files, they won't make sense to a human!

The reason it isn't printing anything is that someone didn't really write
the implementation quite right. This code was overhauled by Sendu
before the last release; I guess something didn't quite get connected.

I checked in code that has the Bio::Taxon delegating now to a DB
handle for the each_Descendent call.
You can either patch your code or just use the code listed here:
  http://bioperl.org/wiki/Module:Bio::DB::Taxonomy
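
For anyone wondering what the recursion behind get_all_Descendents amounts to, here is a rough, self-contained sketch. It is not the BioPerl implementation: the %children_of map, the numeric ids and the helper names all_descendants and is_leaf are made up purely for illustration, standing in for what each_Descendent and is_Leaf would give you on real taxon objects.

```perl
use strict;
use warnings;

# Toy parent => children map standing in for the taxonomy; the ids are invented.
my %children_of = (
    1 => [2, 3],
    2 => [4, 5],
    3 => [],
    4 => [],
    5 => [6],
    6 => [],
);

# Depth-first collection of every descendant of $id.
sub all_descendants {
    my ($id) = @_;
    my @found;
    for my $child (@{ $children_of{$id} || [] }) {
        push @found, $child, all_descendants($child);
    }
    return @found;
}

# A leaf is a node with no children of its own.
sub is_leaf {
    my ($id) = @_;
    return scalar(@{ $children_of{$id} || [] }) == 0;
}

my @all    = all_descendants(1);
my @leaves = grep { is_leaf($_) } @all;
print "descendants: @all\n";   # prints: descendants: 2 4 5 6 3
print "leaves: @leaves\n";     # prints: leaves: 4 6 3
```

The grep over the full descendant list mirrors the grep { $_->is_Leaf } idiom used in the script quoted below.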

On Jun 18, 2007, at 5:29 PM, George Heller wrote:

> But the problem is that I don't really get any output on the
> screen. In the /tmp directory I get 4 files, namely parents, nodes,
> id2names and names2id, but I don't know what to make of them. This
> is what my script looks like,
>
> #!/usr/bin/perl
> use strict;
> #use warnings;
> use DBI;
> use Bio::Tree::Node;
> use Bio::DB::Taxonomy;
> use Bio::DB::Taxonomy::flatfile;
> my $idx_dir = '/tmp';
>
> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
> my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
>                                -nodesfile => $nodefile,
>                                -namesfile => $namesfile,
>                                -directory => $idx_dir);
> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> for my $child ( @extant_children ) {
>   print "id is ", $child->id, "\n"; # NCBI taxa id
>   print "rank is ", $child->rank, "\n"; # e.g. species
>   print "scientific name is ", $child->scientific_name, "\n"; # scientific name
> }
>
> Thanks.
>   George

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From torsten.seemann at infotech.monash.edu.au  Mon Jun 18 21:21:04 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:21:04 +1000
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676A01F.30205@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
Message-ID: <a79f6a4b0706181821p12a2e138xade9c30895e45068@mail.gmail.com>

Sendu,

> >> Can anyone offer a
> >> way to systematically find at least the test scripts which access the
> >> internet, if not the specific tests within?

Perhaps you could use 'strace' to list network system calls for each
test script, and grep out AF_INET connections?

% strace -e trace=network command_to_test 2>&1 | grep AF_INET

I'm not an strace expert but it might do what you need.

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010

From george.heller at yahoo.com  Mon Jun 18 21:16:10 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 18:16:10 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
Message-ID: <815364.33231.qm@web56512.mail.re3.yahoo.com>

Works perfectly. Thanks so much Jason, Hilmar, Chris. You've been a great help!
   
  Thanks.
  George

Jason Stajich <jason at bioperl.org> wrote:
  The files are indexes because you are indexing a flatfile - this speeds up the lookup so the second time you run the script it doesn't have to index.  You don't need to look at the files, they won't make sense to a human!
  

  The reason it isn't printing anything is that someone didn't really write the implementation quite right. This code was overhauled by Sendu before the last release; I guess something didn't quite get connected.
  

  I checked in code that has the Bio::Taxon delegating now to a DB handle for the each_Descendent call.
  You can either patch your code  or just use the code listed here:
     http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

  
    On Jun 18, 2007, at 5:29 PM, George Heller wrote:

    But the problem is that I don't really get any output on the screen. In the /tmp directory I get 4 files, namely parents, nodes, id2names and names2id, but I don't know what to make of them. This is what my script looks like:
  

    #!/usr/bin/perl
    use strict;
    #use warnings;
    use DBI;
    use Bio::Tree::Node;
    use Bio::DB::Taxonomy;
    use Bio::DB::Taxonomy::flatfile;
    my $idx_dir = '/tmp';
    my $nodefile;
    my $namesfile;

    my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
    my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                                   -nodesfile => $nodefile,
                                   -namesfile => $namesfile,
                                   -directory => $idx_dir);
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

    for my $child ( @extant_children ) {
      print "id is ", $child->id, "\n"; # NCBI taxa id
      print "rank is ", $child->rank, "\n"; # e.g. species
      print "scientific name is ", $child->scientific_name, "\n"; # scientific name
    }
  

  Thanks.
    George
  

  Jason Stajich <jason at bioperl.org> wrote:
      All the children are in this array.  
  

    You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name out to the screen.  
    Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects so read the documentation for this module to see what you can do with the object.
      I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.
  

    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

    for my $child ( @extant_children ) {
      print "id is ", $child->id, "\n"; # NCBI taxa id
      print "rank is ", $child->rank, "\n"; # e.g. species
      print "scientific name is ", $child->scientific_name, "\n"; # scientific name
    }
  

      On Jun 18, 2007, at 5:04 PM, George Heller wrote:
  

      Ok, I installed the latest Scalar::Util and the script seems to be working. But I am confused about where exactly I need to look for the descendent taxon ids once the script is run. I did look into the /tmp/ directory, but I couldn't understand much. 
  

      Sorry to be bothering, really appreciate your patience.
  

      Thanks.
      George
  

    Jason Stajich <jason at bioperl.org> wrote:
      Try installing the latest Scalar::Util  
        On Jun 18, 2007, at 4:05 PM, George Heller wrote:
  

        This is the output of /usr/bin/perl -V
  

      Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
        Platform:
          osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
          uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
          config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
          hint=recommended, useposix=true, d_sigaction=define
          usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
          useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
          use64bitint=undef use64bitall=undef uselongdouble=undef
          usemymalloc=n, bincompat5005=undef
        Compiler:
          cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
          optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
          cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
          ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
          intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
          d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
          ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
          alignbytes=4, prototype=define
        Linker and Libraries:
          ld='gcc', ldflags =' -L/usr/local/lib'
          libpth=/usr/local/lib /lib /usr/lib
          libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
          perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
          libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
          gnulibc_version='2.3.4'
        Dynamic Linking:
          dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
          cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
  

      Characteristics of this binary (from libperl):
        Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
        Built under linux
        Compiled at Jul 24 2006 18:28:10
        @INC:
          /usr/lib/perl5/5.8.5/i386-linux-thread-multi
          /usr/lib/perl5/5.8.5
          /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.5
          /usr/lib/perl5/site_perl/5.8.4
          /usr/lib/perl5/site_perl/5.8.3
          /usr/lib/perl5/site_perl/5.8.2
          /usr/lib/perl5/site_perl/5.8.1
          /usr/lib/perl5/site_perl/5.8.0
          /usr/lib/perl5/site_perl
          /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.5
          /usr/lib/perl5/vendor_perl/5.8.4
          /usr/lib/perl5/vendor_perl/5.8.3
          /usr/lib/perl5/vendor_perl/5.8.2
          /usr/lib/perl5/vendor_perl/5.8.1
          /usr/lib/perl5/vendor_perl/5.8.0
          /usr/lib/perl5/vendor_perl
  

        Thanks.
        George
          .
  

      Hilmar Lapp <hlapp at gmx.net> wrote:
        The perl version appears to be 5.8.5 though, so something strange 
      appears to be going on too.
  

      George, can you please post the output of
  

      $ /usr/bin/perl -V
  

      -hilmar
  

      On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
  

        As the error implies, your local version of perl doesn't seem to
      support weak references, which means it doesn't have Scalar::Util
      (which was added to core after perl 5.6.1, I think). Try installing
      Scalar::Util to see what happens.
  

      chris
  

      On Jun 18, 2007, at 5:18 PM, George Heller wrote:
  

        I tried running the below mentioned script and I seem to be getting
      the following error:
  

      Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
      BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
      Compilation failed in require at my.pl line 7.
      BEGIN failed--compilation aborted at my.pl line 7.
  

      My script looks something like,
  

      #!/usr/bin/perl
      use strict;
      #use warnings;
      use DBI;
      use Bio::Tree::Node;
      use Bio::DB::Taxonomy;
      use Bio::DB::Taxonomy::flatfile;
      my $idx_dir = '/tmp';
  

      my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
      my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
      -nodesfile => $nodefile,
      -namesfile => $namesfile,
      -directory => $idx_dir);
      my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
      my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

      foreach my $field (@extant_children) {
      print "$field";
      print "|";
      print "\n";
      }
  

      And I am running the script using the command,
  

      perl myscript.pl -v --names names.dmp --nodes nodes.dmp
  

      and I have the nodes.dmp and names.dmp files in the current
      directory.
  

      Thanks,
      George
  

      Jason Stajich wrote:
      It is implemented in the implementing class - DB::Taxonomy is
      just the base class. For example see the flatfile implementation
      Bio::DB::Taxonomy::flatfile
  

      See the scripts/taxa/local_taxonomydb_query.PLS for example using
      it:
      nodes and names are from NCBI taxonomy database.
  

      Here is an un-debugged copy+paste for your question that *should*
      work.
  

      use Bio::DB::Taxonomy;
      my $idx_dir = '/tmp';

      my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
      my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
                                     -nodesfile => $nodefile,
                                     -namesfile => $namesfile,
                                     -directory => $idx_dir);
      my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
      my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

      -jason
  

      On Jun 18, 2007, at 10:07 AM, George Heller wrote:
  

      What exactly is the "node n" in the query below. When I issue
      this query, it says,
  

      relation "node" does not exist.
  

      I tried to use the get_all_Descendents method but it looks like
      in order to do a recursive call it calls the method
      each_Descendent. This method is not implemented in
      Bio::DB::Taxonomy. It just has a single line,
  

      shift->throw_not_implemented();
  

      Thanks.
      George.
  

      Hilmar Lapp wrote:
      I'm a bit confused - it sounds like you have set up a local BioSQL
      database and loaded the NCBI taxonomy into the database. You can now
      use simple SQL to retrieve all descendants of a node in the tree
      given its NCBI taxonID such as
  

      SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
      WHERE
      n.ncbi_taxon_id = :taxonID
      AND tn.left_value > n. left_value
      AND tn.right_value < n.right_value
      AND tn.taxon_id = tnm.taxon_id
      AND tn.name_class = 'scientific_name'
  

      BioPerl doesn't have a Taxonomy::biosql module yet (though this would
      seem like a worthwhile thing to add), so you can't use the
      Bio::DB::Taxonomy interface to do this against a BioSQL instance.
  

      However, BioPerl does have support for the flat-file download of the
      NCBI taxonomy database and indexes it, so you can simply use
      Taxonomy::{get_taxon,get_all_Descendents} using the flatfile download
      to achieve what you wanted to do in less than 5 lines of perl.
  

      Although the recursive implementation of
      Taxonomy::get_all_Descendents() won't be lightning fast, it may still
      be perfectly fine for your application - are you sure it is not?
  

      -hilmar
  

      On Jun 18, 2007, at 12:21 AM, George Heller wrote:
  

      Thanks. And how can I assign the $node here in the below code, such
      that I can reference it to a particular taxon id record? I want to
      retrieve all the descendents from the taxonomy hierarchy, given a
      particular taxon id.
  

      I have a local db setup, in which I have uploaded data using the
      load_ncbi_taxonomy.pl script.
  

      Thanks.
      George
  

      Jason Stajich wrote:
      I assume you already figured out how to setup a local taxonomydb?
  

      You just want the extant species/leaves of the tree
  

      my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

      -jason
      On Jun 17, 2007, at 11:41 AM, George Heller wrote:
  

      Hi all,
  

      Can anyone point me to some example that uses the
      get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
      this, and I am not quite sure how to implement it.
  

      Thanks.
      George
  

      Sendu Bala wrote:
      George Heller wrote:
      Hi all,
  

      I am looking at extracting the taxonomy hierarchy for some taxon ids.
      What I plan to do is, for a given taxon id, say 33090, I want to
      extract all taxon ids that are children of this species. I do not
      just want the immediate children, but the children's children and so
      on.
  

      Any ideas on the way I can go about doing this?
  

      Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
      some kind of looping structure. Most easily a recursing sub.
  

      If you happen to code up something neat and efficient, why not share
      it with us and we could add it to the Taxonomy module(s).
  

      _______________________________________________
      Bioperl-l mailing list
      Bioperl-l at lists.open-bio.org
      http://lists.open-bio.org/mailman/listinfo/bioperl-l
  

      --
      Jason Stajich
      jason at bioperl.org
      http://jason.open-bio.org/
  

  

      --
      ===========================================================
      : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
      ===========================================================
  

  

      --
      Jason Stajich
      jason at bioperl.org
      http://jason.open-bio.org/
  

  

      Christopher Fields
      Postdoctoral Researcher
      Lab of Dr. Robert Switzer
      Dept of Biochemistry
      University of Illinois Urbana-Champaign
  

  

      -- 
      ===========================================================
      : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
      ===========================================================
  

  

        --
      Jason Stajich
      jason at bioperl.org
      http://jason.open-bio.org/
  

  

      --
    Jason Stajich
    jason at bioperl.org
    http://jason.open-bio.org/
  



    --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/



From torsten.seemann at infotech.monash.edu.au  Mon Jun 18 21:26:41 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:26:41 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To: <a79f6a4b0706121718g4b0ca6a4m97f253b2e2b84059@mail.gmail.com>
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
	<a79f6a4b0706121718g4b0ca6a4m97f253b2e2b84059@mail.gmail.com>
Message-ID: <a79f6a4b0706181826x4ccc4ee5n8ddafa703ad162a3@mail.gmail.com>

(Sean, please reply to the bioperl-l list rather than to me personally
so everyone can read it. I'm reposting it here.)

> > I posted this on the gbrowse list earlier. I'm looking to convert gff
> > data files into xml. Does anyone know of a module written to do this
> > already?
>
> What DTD do you want the XML to conform to?
> eg. ChadoXML, TinySeq XML, TIGR XML ... ?

Hi Torsten,
I'm collaborating with other groups and want web-service compatible
functionality for various tools. Normally the analysis tools I'm using
generate gff output. I'm going to have to wrap this output in XML with
XSL stylesheet for end-users to view. Haven't done it before and don't
know what DTD to use. The bp_seqconvert.pl doesn't accept gff format.
I would imagine the DTD would be quite short as the gff files are very
standard, I just don't have any experience with these DTD
requirements.
--Sean O'Keeffe <limericksean at gmail.com>

From sac at bioperl.org  Tue Jun 19 02:42:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Mon, 18 Jun 2007 23:42:27 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
Message-ID: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>

On 6/16/07, Jason Stajich <jason at bioperl.org> wrote:
> [...]
> Just to say I already went through all the steps of running cvs2svn
> myself and had problems gathering back out the branches and all the
> tags when I tried it.  You may want to start with a smaller repository
> like bioperl-network or bioperl-db, as the initial cvs2svn conversion
> script took quite a long time to run on bioperl-live.

Might this be a good opportunity to investigate partitioning
bioperl-live into sub-repositories? There has been talk in the past of
defining a set of "core" modules separate from other functionally
related groups of modules that would be viewed as optional extensions.
The goal being to help manage growth and simplify releases. There are
currently 892 modules under Bio/.

In addition to simplifying the migration to SVN, it would also have
other benefits. Say some new functionality or a slew of fixes were
added to Bio::Graphics. We could turn around a new Bio::Graphics
release quickly without having to work on getting various other parts
up to snuff that aren't related to graphics (Biblio, DB, PopGen,
Search etc.). Maintenance and releases of the various extensions would
be more parallelizable, orchestrated by separate ring leaders.

Over time, as a set of functionality matures, it would see fewer
updates and there would be less of a need for users to
download/install/test it. This could make bioperl easier to customize,
extend, and grok in general.

Long term, it should ease development and release cycles, but it will
involve a bit of near term bullet-biting. We'd need to get clear on
how to partition things, including modules, tests, docs, installation
logic, etc. and we'd probably need new integration tests to verify
that the subsets continue working together.

What do folks think? Would this SVN-based, re-partitioned bioperl-live
constitute a 2.0 release? Any volunteers to help assemble a roadmap
and milestones? Should I go on dreaming?

Cheers,
Steve

From bix at sendu.me.uk  Tue Jun 19 03:01:05 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 08:01:05 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
Message-ID: <46777F31.7030402@sendu.me.uk>

Jason Stajich wrote:
> The reason it isn't printing anything is someone didn't really write  
> the implementation quite right. This code was overhauled by Sendu  
> before the last release; I guess something didn't quite get connected.
> 
> I checked in code that has the Bio::Taxon delegating now to a DB  
> handle for the each_Descendent call.
> You can either patch your code  or just use the code listed here:
>   http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

I've reverted that change.

For some reason the docs for Bio::Taxon::each_Descendent aren't showing 
up on the website, but they state:

---
Note that this method never asks the database for the descendents; it 
will only return objects you have manually set with add_Descendent(), or 
where this was done for you by making a Bio::Tree::Tree with this object 
as an argument to new().

To get the database descendents use 
$taxon->db_handle->each_Descendent($taxon).
---


I also have a note in the Synopsis for the module:

---
# Though be careful with each_Descendent - unless you add_Descendent()
# yourself, you won't get an answer because unlike for ancestor(),
# Bio::Taxon does not ask the database for the answer. You can ask the
# database yourself using the same method:
($human) = $homo->db_handle->each_Descendent($homo);
---


This is quite deliberate and is to prevent Bad Things from happening. 
(Can't exactly remember the reasoning now, but I know it was good.)
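
To make the pattern concrete, here is a minimal sketch of walking the whole
subtree by asking the db handle at every level instead of the node. ToyDB is
a hypothetical in-memory stand-in I made up so the snippet runs without a
taxonomy dump; only the each_Descendent() name mirrors the real method, and
the toy works on plain ids where the real call takes a Bio::Taxon.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Bio::DB::Taxonomy handle: maps a parent id
# to its child ids. Only the each_Descendent() name mirrors the real API.
package ToyDB;

sub new {
    my ($class, %children) = @_;
    return bless { children => {%children} }, $class;
}

sub each_Descendent {
    my ($self, $id) = @_;
    return @{ $self->{children}{$id} || [] };
}

package main;

# Collect every descendant, depth first, by asking the db handle for the
# immediate children of each node in turn.
sub all_descendents {
    my ($db, $id) = @_;
    my @found;
    for my $child ($db->each_Descendent($id)) {
        push @found, $child, all_descendents($db, $child);
    }
    return @found;
}

my $db = ToyDB->new(1 => [2, 3], 2 => [4], 3 => []);
print join(',', all_descendents($db, 1)), "\n";   # prints 2,4,3
```

With a real database the loop body would presumably call
$taxon->db_handle->each_Descendent($taxon), as in the Synopsis note above.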

From bix at sendu.me.uk  Tue Jun 19 03:41:57 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 08:41:57 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
Message-ID: <467788C5.6070406@sendu.me.uk>

Steve Chervitz wrote:
> Might this be a good opportunity to investigate partitioning
> bioperl-live into sub-repositories? There has been talk in the past of
> defining a set of "core" modules separate from other functionally
> related groups of modules that would be viewed as optional extensions.
> The goal being to help manage growth and simplify releases. There are
> currently 892 modules under Bio/.
> 
> In addition to simplifying the migration to SVN, it would also have
> other benefits. Say some new functionality or a slew of fixes were
> added to Bio::Graphics. We could turn around a new Bio::Graphics
> release quickly without having to work on getting various other parts
> up to snuff that aren't related to graphics (Biblio, DB, PopGen,
> Search etc.). Maintenance and releases of the various extensions would
> be more parallelizable, orchestrated by separate ring leaders.
> 
> Over time, as a set of functionality matures, it would see fewer
> updates and there would be less of a need for users to
> download/install/test it. This could make bioperl easier to customize,
> extend, and grok in general.
> 
> Long term, it should ease development and release cycles

I actually take the opposite view. Breaking things up makes testing and 
releases more difficult.

If one person acts as pumpkin for all the sub-parts, his work-load 
increases almost linearly with the number of sub-parts. If each sub-part 
gets its own pumpkin, where do all these pumpkins come from? It seems to 
me that frequently authors will write modules but inevitably their 
circumstance changes and they can no longer devote the time to look 
after them. Having a single pumpkin and 'forcing' him to make sure 
everything works (regardless of his personal interest in the module) 
seems more reliable than hoping there will be a person interested enough 
in each sub-part to handle its release.

Since all sub-parts will at the least interact with the 'true' core set 
of Bioperl modules, they need to be tested and potentially re-released 
every time the true core is updated. And since some sub-parts will 
interact with other sub-parts, there will need to be coordinated 
joint-testing and release of multiple sub-parts.

What happens when users report problems? We ask them what version 
they're running. Right now '1.5.2' means a specific thing, and it's 
trivial for someone to confirm the same problem by installing 1.5.2. 
What happens when users have to list out all the versions of all the 
sub-parts they have? Who is going to consistently recreate a user's 
hodge-podge of versions in order to confirm a bug? Won't the advice 
instead be: "update all versions to the latest and get back to us"?

So, as I see it, all sub-parts would best be tested and released with a 
single new version number every time one sub-part is updated 
(significantly). In which case, why have sub-parts at all? Keeping 
things the way they are now means ease of release for the pumpkin and 
ease of installation for end-users (only one install command to issue to 
CPAN). Having 'true' sub-parts (each with its own pumpkin), in my 
fatalistic view, is just going to lead to some useful sub-parts being 
abandoned and never updated, even where updates may be desirable.

Each and every Bio:: module could have been released separately by its 
respective author. As I see it, one of the main values of 'Bioperl' is 
that it's one (reasonably) consistent collection of modules that lowers 
the barrier of entry for new Bioinformaticians, giving them extremely 
easy access to a whole host of functionality with a single install.

From hlapp at gmx.net  Tue Jun 19 08:47:02 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 08:47:02 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46777F31.7030402@sendu.me.uk>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
Message-ID: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>

So the real mistake was to write

  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

instead of

  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents($node);

I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the  
database?

If this is correct, can we highlight this in the documentation? It's  
a small difference that everyone failed to spot.

If it is not correct, then maybe we need to revisit the rationale for  
why a Bio::DB::Taxonomy::get_all_Descendents may not query the  
underlying database.

Also, in my reading of Bio::Taxonomy::Taxon it won't use the database  
either for ancestor(). Which would be consistent with its other methods.

I.e., the bottom line is don't use Node or Taxon objects for  
hierarchy queries that you expect to use an underlying database, use  
the Bio::DB::Taxonomy object instead. It makes sense, but is it true?

	-hilmar

On Jun 19, 2007, at 3:01 AM, Sendu Bala wrote:

> Jason Stajich wrote:
>> The reason it isn't printing anything is someone didn't really write
>> the implementation quite right. This code was overhauled by Sendu
>> before the last release; I guess something didn't quite get connected.
>>
>> I checked in code that has the Bio::Taxon delegating now to a DB
>> handle for the each_Descendent call.
>> You can either patch your code  or just use the code listed here:
>>   http://bioperl.org/wiki/Module:Bio::DB::Taxonomy
>
> I've reverted that change.
>
> For some reason the docs for Bio::Taxon::each_Descendent aren't  
> showing
> up on the website, but they state:
>
> ---
> Note that this method never asks the database for the descendents; it
> will only return objects you have manually set with add_Descendent 
> (), or
> where this was done for you by making a Bio::Tree::Tree with this  
> object
> as an argument to new().
>
> To get the database descendents use
> $taxon->db_handle->each_Descendent($taxon).
> ---
>
>
> I also have a note in the Synopsis for the module:
>
> ---
> # Though be careful with each_Descendent - unless you add_Descendent()
> # yourself, you won't get an answer because unlike for ancestor(),
> # Bio::Taxon does not ask the database for the answer. You can ask the
> # database yourself using the same method:
> ($human) = $homo->db_handle->each_Descendent($homo);
> ---
>
>
> This is quite deliberate and is to prevent Bad Things from happening.
> (Can't exactly remember the reasoning now, but I know it was good.)

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From rvos at interchange.ubc.ca  Tue Jun 19 09:05:25 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Tue, 19 Jun 2007 06:05:25 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <15433211.1182258325544.JavaMail.myubc2@brahms.my.ubc.ca>


> Unrelated, but it randomly just occurred to me: what happens to all the 
> id lines at the top of modules? Eg:
> 
> $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $
> 
> That's a cvs-specific thing, right? Do we delete them all? (Regardless, 
> I wish we would, since they caused me no end of hassles during the 1.5.2 
> release, doing updates across branches.)

If you run something like 'svn propset svn:keywords Id' on the file/folder/recursively, svn picks up on the $Id tag. The structure of the resulting string would be a little different, because svn revision numbers are simply auto-increasing integers (afaik) - so any regular expressions that cleverly want to include the revision number in $VERSION would need to be updated.
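
As a rough illustration of the regex change, here is a sketch that pulls the
revision out of either style of Id string. The sub name revision_from_id is
made up, the svn revision number 12345 is invented for the example, and the
svn layout (file, revision, date, time, author) is what I understand svn to
emit once the keyword is enabled.

```perl
use strict;
use warnings;

# Return the revision from a CVS- or SVN-style Id keyword string,
# or undef if neither layout matches.
sub revision_from_id {
    my ($id) = @_;
    # CVS layout: "$Id: file,v 1.28 date time author state $"
    return $1 if $id =~ /\$Id: \S+,v (\d+(?:\.\d+)+) /;
    # SVN layout: "$Id: file 12345 yyyy-mm-dd time author $"
    return $1 if $id =~ /\$Id: \S+ (\d+) \d{4}-\d{2}-\d{2} /;
    return undef;
}

# Single quotes keep Perl from interpolating $Id.
print revision_from_id('$Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $'), "\n";
print revision_from_id('$Id: bl2seq.pm 12345 2007-06-19 13:05:25Z sendu $'), "\n";
```

The two calls print 1.28 and 12345 respectively. Note that the svn value is a
plain integer, so anything that expects a dotted number for $VERSION would
still need adjusting.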


From bix at sendu.me.uk  Tue Jun 19 10:25:26 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 15:25:26 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
	<5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
Message-ID: <4677E756.6050200@sendu.me.uk>

Hilmar Lapp wrote:
> So the real mistake was to write
> 
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
> 
> instead of
> 
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents($node);
> 
> I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the  
> database?

Yes, the database object methods use the database. I don't even think it 
makes sense to question that. What else would it do?


> If this is correct, can we highlight this in the documentation? It's  
> a small difference that everyone failed to spot.

The documentation for what? I've already clearly pointed out the gotcha 
in Bio::Taxon.


> Also, in my reading of Bio::Taxonomy::Taxon it won't use the database  
> either for ancestor(). Which would be consistent with its other methods.

Bio::Taxonomy::Taxon? It's deprecated. Bio::Taxon is what we're dealing 
with, and it /does/ use the db to get the ancestor, unless the ancestor 
is manually set (see below for explanation).


> I.e., the bottom line is don't use Node or Taxon objects for  
> hierarchy queries that you expect to use an underlying database, use  
> the Bio::DB::Taxonomy object instead. It makes sense, but is it true?

Almost. It happens to be true but ideally wouldn't be the case. The 
confusion and problems arise, I guess, because we have two ways to 
access/create hierarchies and both of them are built from the same 
building block (Bio::Taxon objects).

On the one hand we have Bio::DB::Taxonomy and on the other we have 
Bio::Tree::Tree.

Tree objects are easy: you have a Taxon object created in memory for 
each and every node in the tree. Each Taxon knows its ancestor and 
descendants by storing references to the relevant Taxon objects in the 
tree. You 'navigate' through the tree by grabbing a Taxon inside it and 
asking the Taxon itself for its ancestor or descendant.

This leaves us with the Taxon object having the methods ancestor() and 
each_Descendent(), which we'll expect to work in other circumstances.

Bio::DB::Taxonomy returns single Taxon objects from the database on 
request. Now we still expect our ancestor() and each_Descendent() 
methods to work, but if things were set up like Bio::Tree::Tree we'd end 
up pulling the entire database into memory because we'd have to create 
all the Taxon objects that are ancestors and descendants, recursively, 
every time we request a single Taxon (which is wasteful in the case of 
Bio::DB::Taxonomy::flatfile and slow/not allowed in the case of 
Bio::DB::Taxonomy::entrez).

The solution? We simply don't create the immediate ancestor or 
descendant Taxon objects of the requested Taxon, and instead implement 
the Taxon methods to ask the database to create them on demand, if they 
don't already exist. Well, that idea is fine (and necessary) for the 
ancestor method, but we run into problems with each_Descendent().

The problem arises when we create Bio::Tree::Tree objects from a Taxon 
we got from the database. Being able to do that is why Bio::Taxon is 
shared between them, as it is a very desirable thing to do: you can 
instantly create a lineage tree for a Taxon of interest and then use all 
the Bio::Tree::Tree methods on it. Unfortunately one of those methods is 
get_nodes() which is implemented using each_Descendent() and 
get_all_Descendents(). If each_Descendent() asked the database for the 
real answer, we'd end up pulling the entire database into the tree.

So my implementation was to not ask the database and just warn people in 
the docs. Ideally it /would/ use the database, because that's what a 
user would expect. Can anyone see an alternate way around the problem?
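
The asymmetry described above (ancestor fetched from the database on demand, descendants only ever read from memory) can be sketched in plain Perl. This is a toy model under stated assumptions, not the Bio::Taxon implementation: ToyTaxon and the %PARENT_OF hash are hypothetical stand-ins for a Taxon object and a taxonomy database.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a taxonomy database: child name => parent name.
my %PARENT_OF = (
    'Homo sapiens' => 'Homo',
    'Homo'         => 'Primates',
    'Primates'     => 'Mammalia',
    'Mammalia'     => 'Eukaryota',
);

package ToyTaxon;

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name}, db => $args{db}, children => [] }, $class;
}

sub name { $_[0]{name} }

# ancestor(): use the stored object if there is one, otherwise ask the
# database once and cache the resulting object.
sub ancestor {
    my ($self) = @_;
    return $self->{ancestor} if $self->{ancestor};
    my $db = $self->{db} or return;
    my $parent_name = $db->{ $self->{name} };
    return unless defined $parent_name;
    $self->{ancestor} = ToyTaxon->new(name => $parent_name, db => $db);
    return $self->{ancestor};
}

# each_Descendent(): only returns children already attached in memory;
# it never consults the database.
sub each_Descendent { @{ $_[0]{children} } }

package main;

my $homo = ToyTaxon->new(name => 'Homo', db => \%PARENT_OF);
my @kids = $homo->each_Descendent;

print $homo->ancestor->name, "\n";   # fetched lazily from the toy database
print scalar(@kids), "\n";           # nothing attached in memory yet
```

In this model the database knows that 'Homo sapiens' sits below 'Homo', yet each_Descendent() reports no children, which is the gotcha being discussed.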

From hlapp at gmx.net  Tue Jun 19 12:14:38 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 12:14:38 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <4677E756.6050200@sendu.me.uk>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
	<5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
	<4677E756.6050200@sendu.me.uk>
Message-ID: <C2348A85-2F44-4AD5-8996-DDA19B79F994@gmx.net>

Sorry I was accidentally looking at an older branch.

Reading through the Taxon module, though, I get more confused than I am  
comfortable with.

Here's what I understand of your description of the problem:

- We would like nodes returned from Bio::DB::Taxonomy to use the  
database for all hierarchical queries.

- We would like nodes used in a Bio::Tree::Tree not to use the  
database for any hierarchical query.

What I understand we have is

- Taxon node objects that have a db_handle set will use the database  
for ancestor(), unless it has been set manually (?), but not for  
each_Descendent().

- Taxon node objects that don't have a db_handle set won't use a  
database but will function normally otherwise.

- This is needed to prevent Bio::Tree::Tree methods from pulling the  
entire tree into memory.

If this is correct (I'm not sure it is), it sounds like we want to  
temporarily divorce taxonomy nodes from their database capabilities  
while they are being queried in a tree context?

I'm still trying to understand - if I create a Bio::Tree::Tree from a  
single node, will the tree automatically contain all nodes along the  
lineage of ancestors up to the root? So, even if extracting this  
lineage involved querying a database it would be acceptable, but not  
for querying descendents?

It sounds to me like what is needed is that nodes that get added to a  
tree need to be stripped of their database capabilities. This could  
be achieved by creating a wrapper class that delegates all non- 
hierarchical methods to the wrapped Taxon object, and overriding all  
hierarchical queries to not use a database. I'm not sure I fully  
understand yet though, but the inconsistent behavior will be sure to  
throw people off track.
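
A minimal sketch of the wrapper idea, assuming plain Perl AUTOLOAD delegation. DbNode and TreeNodeWrapper are hypothetical class names invented for the example; DbNode only stands in for a database-backed Taxon and is not BioPerl code.

```perl
use strict;
use warnings;

# Hypothetical database-aware node, standing in for a Taxon with a db handle.
package DbNode;
sub new             { my ($class, %args) = @_; return bless {%args}, $class }
sub scientific_name { $_[0]{name} }
sub rank            { $_[0]{rank} }
sub each_Descendent { die "would query the database\n" }

# Hypothetical wrapper: hierarchy methods are answered from memory only,
# every other method is delegated to the wrapped node.
# Note: parent and child hold references to each other, so a real version
# would probably weaken one side to avoid a reference cycle.
package TreeNodeWrapper;
our $AUTOLOAD;

sub new {
    my ($class, $node) = @_;
    return bless { node => $node, ancestor => undef, children => [] }, $class;
}

sub ancestor {
    my $self = shift;
    $self->{ancestor} = shift if @_;
    return $self->{ancestor};
}

sub add_Descendent {
    my ($self, $child) = @_;
    $child->ancestor($self);
    push @{ $self->{children} }, $child;
    return scalar @{ $self->{children} };
}

sub each_Descendent { @{ $_[0]{children} } }

# Delegate any other method call to the wrapped object.
sub AUTOLOAD {
    my $self = shift;
    (my $method = $AUTOLOAD) =~ s/.*:://;
    return if $method eq 'DESTROY';
    return $self->{node}->$method(@_);
}

package main;

my $genus   = TreeNodeWrapper->new(DbNode->new(name => 'Homo', rank => 'genus'));
my $species = TreeNodeWrapper->new(DbNode->new(name => 'Homo sapiens', rank => 'species'));
$genus->add_Descendent($species);

print $genus->rank, "\n";
print join(', ', map { $_->scientific_name } $genus->each_Descendent), "\n";
```

The wrapped node's own each_Descendent() is never reached, so a tree built from such wrappers cannot trigger database traffic through the hierarchy methods.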

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cain.cshl at gmail.com  Tue Jun 19 14:41:52 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Tue, 19 Jun 2007 14:41:52 -0400
Subject: [Bioperl-l] [Gmod-gbrowse] is this a bp_genbank2gff3.pl bug?
In-Reply-To: <18039.61086.829726.809888@gargle.gargle.HOWL>
References: <18039.61086.829726.809888@gargle.gargle.HOWL>
Message-ID: <1182278512.2592.42.camel@localhost.localdomain>

Hi Alessandra,

I cc'ed your message to the bioperl and sequence ontology mailing lists,
since your question is relevant to both.

Converting genbank files to GFF3 is excruciatingly difficult; I
generally find that I can use the genbank2gff3 script to get me most of
the way there, but then I need to do some manual fixing to make it
'right'.

I am using bioperl-live, since there have been several fixes to the
script since bioperl 1.5.2 was released, including the most recent fixes
from me today (when I started working on this); I would suggest you use
bioperl-live as well.  I ran the script on chrY.

Most (perhaps all) of the errors fit into a few categories:

  - CDS doesn't have a phase, although the GFF3 spec requires CDSes to have
a phase.  Since it can be a little bit of a hassle to calculate, I
understand why it was left out, but I'll submit a bug report to have
those calculated.  If you are planning on loading the GFF file into
Chado, you can use the --noCDS option to get exons instead of CDSes,
which makes the problem go away (the validator has a bug here though--it
reports the polypeptide derives_from mRNA as invalid, but it is correct;
I'm reporting that directly to the author).  Here's the bioperl bug
report:

  http://bugzilla.open-bio.org/show_bug.cgi?id=2322

  - "invalid type pair" is caused by the genbank file using feature
types in a way that conflicts with the Sequence Ontology.  For example,
it has STS features that are part_of a gene, and pseudogenic_region
features that are part_of a pseudogene.  I don't know if there would be
an easy way to catch
this in the conversion script.  You may need to fix these by hand.  If
the problems occur for features that you don't care about, you can use
the --filter option to leave them out of the resulting GFF file (for
example, adding '--filter STS' would leave all STS features out of the
file).  Also, if you don't plan on loading these into Chado (which does
require SO-compliance) but instead plan on using a Bio::DB::SeqFeature
database, these errors may not be a problem.

  - "invalid type" is caused by feature types that are not in SOFA
(Sequence Ontology for Feature Annotation), though the terms probably
are in SO.  I thought at one point we discussed allowing any SO type to
appear in the GFF3 type column, but that is not what the spec says now.
I don't see this type of error as causing a problem for either
Bio::DB::SeqFeature or Chado.  Chado allows features to be typed with
anything that is in SO and does not restrict to SOFA.
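
For the missing CDS phase described in the first item above, the calculation can be sketched roughly as follows. This assumes the usual GFF3 meaning of phase (the number of bases to skip at the start of a segment to reach the next complete codon) and that the first segment starts on a codon boundary; cds_phases and the coordinates are made up for the example and are not taken from the genbank2gff3 script.

```perl
use strict;
use warnings;

# Given CDS segment lengths in 5'->3' order of the coding strand, return the
# GFF3 phase (0, 1 or 2) of each segment. Assumes the first segment starts
# on a complete codon (phase 0), which does not hold for partial CDSes.
# cds_phases is a hypothetical helper name.
sub cds_phases {
    my @lengths = @_;
    my @phases;
    my $phase = 0;
    for my $len (@lengths) {
        push @phases, $phase;
        # Bases left over after skipping $phase bases and taking whole codons
        my $leftover = ($len - $phase) % 3;
        # Bases the next segment must supply to finish that codon
        $phase = (3 - $leftover) % 3;
    }
    return @phases;
}

my @segments = ([100, 199], [300, 350], [400, 460]);   # made-up [start, end]
my @lengths  = map { $_->[1] - $_->[0] + 1 } @segments;
print join(',', cds_phases(@lengths)), "\n";
```

For features on the minus strand the segments would have to be ordered from highest to lowest coordinate before calling the helper.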

Scott


On Tue, 2007-06-19 at 16:56 +0200, Alessandra Bilardi wrote:
> Hi all,
> 
> I used bp_genbank2gff3.pl from CVS bioperl to create GFF3 from a human
> GenBank file. I ran the online validate_gff3 on human.gff and it reports
> non-unique IDs, so loading it into the gbrowse database gives errors.
> 
> I attach the error files for hs_ref_chrY.gbk and hs_ref_chr1.gbk, which
> I downloaded from ftp://ftp.ncbi.nih.gov/genomes/H_sapiens
> Elements with non-unique IDs are:
> - CDS or pseudo*exon without mRNA and parent 
> - STS with equal start and end
> - tRNA with equal name
> 
> If this is a bp_genbank2gff3.pl bug, can you rectify bp_genbank2gff3.pl?
> If I'm mistaken, can you help me?
> 
> Thanks very much for the help in advance,
> 
> Alessandra.
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From sac at bioperl.org  Tue Jun 19 14:54:39 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Tue, 19 Jun 2007 11:54:39 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <467788C5.6070406@sendu.me.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
Message-ID: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>

Valid points, Sendu. I wonder if there might be a best-of-both-worlds
approach here. I would not be advocating for a major slice and dice,
but just identifying a few large, reasonably well established and
encapsulated blocks of functionality that could be managed more
independently and segregating them away from the rest. For example:
DB, Graphics, Search+SearchIO, Tools.

Once per year, we could have a "whole caboodle" release where the core
and all sub parts are tested and released as a group, as we currently
do. Then, updates to the sub parts can occur as-needed but without
necessarily involving updates to other sub parts or the core.

The onus would be on the pumpkin for the sub part release to make sure
it continues to work with the last whole caboodle release. This would
minimize the number of release clashes, since sub part updates would
only be sanctioned relative to the last caboodle release, and it would
ensure that the whole set continues to interoperate.

Perhaps it would be worth experimenting with such an approach so we
can judge it based on actual experience. We could identify one
functional sub part and segregate it out, do a release cycle or two,
along with a sub part release, and decide if this makes things easier
or harder, for devs as well as users. We could always bring it back
into the fold if it doesn't work out.

My fear is that as bioperl continues to grow, the monolithic approach
will become increasingly onerous for a single release pumpkin to
manage, and harder to find someone who feels up to the task. It could
also discourage new developers from diving into the codebase if it
looks too deep. And they are our lifeblood.

A more functionally segregated bioperl codebase could lower the
activation energy needed to recruit release pumpkins and new devs,
leading to more release iterations, fewer bugs, more features, and
more sustainable growth.

When I first discovered Bioperl in 1996, it had three modules. At
~900, I  probably wouldn't have joined ranks as a developer (well, I
probably would, but it would have taken a while to digest it and
become a contributor).

Steve

On 6/19/07, Sendu Bala <bix at sendu.me.uk> wrote:
> Steve Chervitz wrote:
> > Might this be a good opportunity to investigate partitioning
> > bioperl-live into sub-repositories? There has been talk in the past of
> > defining a set of "core" modules separate from other functionally
> > related groups of modules that would be viewed as optional extensions.
> > The goal being to help manage growth and simplify releases. There are
> > currently 892 modules under Bio/.
> >
> > In addition to simplifying the migration to SVN, it would also have
> > other benefits. Say some new functionality or a slew of fixes were
> > added to Bio::Graphics. We could turn around a new Bio::Graphics
> > release quickly without having to work on getting various other parts
> > up to snuff that aren't related to graphics (Biblio, DB, PopGen,
> > Search etc.). Maintenance and releases of the various extensions would
> > be more parallelizable, orchestrated by separate ring leaders.
> >
> > Over time, as a set of functionality matures, it would see fewer
> > updates and there would be less of a need for users to
> > download/install/test it. This could make bioperl easier to customize,
> > extend, and grok in general.
> >
> > Long term, it should ease development and release cycles
>
> I actually take the opposite view. Breaking things up makes testing and
> releases more difficult.
>
> If one person acts as pumpkin for all the sub-parts, his work-load
> increases almost linearly with the number of sub-parts. If each sub-part
> gets its own pumpkin, where do all these pumpkins come from? It seems to
> me that frequently authors will write modules but inevitably their
> circumstance changes and they can no longer devote the time to look
> after them. Having a single pumpkin and 'forcing' him to make sure
> everything works (regardless of his personal interest in the module)
> seems more reliable than hoping there will be a person interested enough
> in each sub-part to handle its release.
>
> Since all sub-parts will at the least interact with the 'true' core set
> of Bioperl modules, they need to be tested and potentially re-released
> every time the true core is updated. And since some sub-parts will
> interact with other sub-parts, there will need to be coordinated
> joint-testing and release of multiple sub-parts.
>
> What happens when users report problems? We ask them what version
> they're running. Right now '1.5.2' means a specific thing, and it's
> trivial for someone to confirm the same problem by installing 1.5.2.
> What happens when users have to list out all the versions of all the
> sub-parts they have? Who is going to consistently recreate a user's
> hodge-podge of versions in order to confirm a bug? Won't the advice
> instead be: "update all versions to the latest and get back to us"?
>
> So, as I see it, all sub-parts would best be tested and released with a
> single new version number every time one sub-part is updated
> (significantly). In which case, why have sub-parts at all? Keeping
> things the way they are now means ease of release for the pumpkin and
> ease of installation for end-users (only one install command to issue to
> CPAN). Having 'true' sub-parts (each with its own pumpkin), in my
> fatalistic view, is just going to lead to some useful sub-parts being
> abandoned and never updated, even where updates may be desirable.
>
> Each and every Bio:: module could have been released separately by its
> respective author. As I see it, one of the main values of 'Bioperl' is
> that it's one (reasonably) consistent collection of modules that lowers
> the barrier of entry for new Bioinformaticians, giving them extremely
> easy access to a whole host of functionality with a single install.
>

From bix at sendu.me.uk  Tue Jun 19 15:13:39 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 20:13:39 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
Message-ID: <46782AE3.2090703@sendu.me.uk>

Steve Chervitz wrote:
> Valid points, Sendu. I wonder if there might be a best-of-both-worlds
> approach here.
[snip]

You haven't convinced me, but I'd go along with the majority decision if 
best-of-both-worlds was picked.


> DB, Graphics, Search+SearchIO, Tools.

I will, however, say that DB interleaves into too many core modules. It 
should stay in core. Tools? It's hardly touched anyway, so I don't see 
the value of taking it out, what with Bio::Tools::Run already being its 
own package. Most Bioperl users probably get Bioperl just to do 
something Blast related, so all Blast stuff really ought to stay in core.

Graphics is an obvious choice and I agree. Updated frequently, and has 
its own release needs. It also has some of the trickier dependencies, so 
would make installing core simpler.

I can imagine plucking Search+SearchIO out, and it's something that needs 
regular updating. Another good candidate.


> Perhaps it would be worth experimenting with such an approach so we
> can judge it based on actual experience. We could identify one
> functional sub part and segregate it out, do a release cycle or two,
> along with a sub part release, and decide if this makes things easier
> or harder, for devs as well as users.

Well, we already have the run package. It's a split-off subpart that gets 
updated. The only 'experiment' left to do is finding it its own pumpkin.

From bix at sendu.me.uk  Tue Jun 19 15:48:50 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 20:48:50 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <C2348A85-2F44-4AD5-8996-DDA19B79F994@gmx.net>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
	<5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
	<4677E756.6050200@sendu.me.uk>
	<C2348A85-2F44-4AD5-8996-DDA19B79F994@gmx.net>
Message-ID: <46783322.30309@sendu.me.uk>

Hilmar Lapp wrote:
> Here's what I understand of your description of the problem:
> 
> - We would like nodes returned from Bio::DB::Taxonomy to use the  
> database for all hierarchical queries.
> 
> - We would like nodes used in a Bio::Tree::Tree not to use the  
> database for any hierarchical query.

Correct.


> What I understand that we have is
> 
> - Taxon node objects that have a db_handle set will use the database  
> for ancestor(), unless it has been set manually (?), but not for  
> each_Descendent().
> 
> - Taxon node objects that don't have a db_handle set won't use a  
> database but will function normally otherwise.
> 
> - This is needed to prevent Bio::Tree::Tree methods from pulling the  
> entire tree into memory.

Correct.


> If this is correct (I'm not sure it is), it sounds like we want to  
> temporarily divorce taxonomy nodes from their database capabilities  
> while they are being queried in a tree context?

Yes.


> I'm still trying to understand - if I create a Bio::Tree::Tree from a  
> single node, will the tree automatically contain all nodes along the  
> lineage of ancestors up to the root? So, even if extracting this  
> lineage involved querying a database it would be acceptable, but not  
> for querying descendents?

Yes. Asking the database for all the ancestors up to root only pulls a 
couple of nodes into the tree and is exactly what the user would want to 
happen. But if nodes are allowed to get their descendants from the 
database, when we get the root node from the database, we'd get all the 
root's descendants, and then for each of those we'd get all /their/ 
descendants... that's when the whole db gets sucked in.


> It sounds to me like what is needed is that nodes that get added to a  
> tree need to be stripped of their database capabilities. This could  
> be achieved by creating a wrapper class that delegates all non- 
> hierarchical methods to the wrapped Taxon object, and overriding all  
> hierarchical queries to not use a database. I'm not sure I fully  
> understand yet though, but the inconsistent behavior will be sure to  
> throw people off track.

When we're making a tree from a db Taxon we need db access to find all 
the ancestors; we just don't want to get any descendants outside our 
initiating Taxon's direct lineage.
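
To make the difference in scale concrete, here is a toy comparison of the two walks, using the same two lineages as the tests further down. It is only a sketch: %parent_of is a hypothetical in-memory stand-in for the database, and neither helper is BioPerl code.

```perl
use strict;
use warnings;

# Toy database holding two lineages: child => parent.
my %parent_of = (
    'Homo sapiens' => 'Homo',     'Homo'     => 'Primates',
    'Primates'     => 'Mammalia', 'Mammalia' => 'Eukaryota',
    'Mus musculus' => 'Mus',      'Mus'      => 'Rodentia',
    'Rodentia'     => 'Mammalia',
);

# Walk upwards only: the start node plus each ancestor up to the root.
sub lineage_to_root {
    my ($name) = @_;
    my @lineage = ($name);
    while (defined(my $parent = $parent_of{ $lineage[-1] })) {
        push @lineage, $parent;
    }
    return reverse @lineage;
}

# Walk downwards recursively: a node plus everything below it.
sub with_all_descendants {
    my ($name) = @_;
    my @children = sort grep { $parent_of{$_} eq $name } keys %parent_of;
    return ($name, map { with_all_descendants($_) } @children);
}

my @lineage = lineage_to_root('Homo');
my @subtree = with_all_descendants($lineage[0]);
printf "lineage of Homo: %d nodes\n", scalar @lineage;
printf "root plus all descendants: %d nodes\n", scalar @subtree;
```

With a real taxonomy database the second walk, started from the root, would touch every node in the database rather than just the handful on one lineage.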


use Test::More 'no_plan';
use Bio::DB::Taxonomy;
use Bio::Tree::Tree;

# Build an in-memory 'list' database from two lineages
my @names = ('Eukaryota', 'Mammalia', 'Primates', 'Homo', 'Homo sapiens');
my @ranks = qw(superkingdom class order genus species);
my $db = Bio::DB::Taxonomy->new(-source => 'list', -names => \@names,
                                                    -ranks => \@ranks);

@names = ('Eukaryota', 'Mammalia', 'Rodentia', 'Mus', 'Mus musculus');
$db->add_lineage(-names => \@names, -ranks => \@ranks);


# Hierarchy queries on a Taxon that came straight from the database
my $homo = $db->get_taxon(-name => 'Homo');
isa_ok($homo, 'Bio::Taxon'); # PASS

is $homo->ancestor->scientific_name, 'Primates'; # PASS
my @descs = $homo->each_Descendent;
is scalar(@descs), 1; # FAIL, we wanted it to contain the 'Homo sapiens' node


# Hierarchy queries on a tree built from that Taxon
my $lineage = Bio::Tree::Tree->new(-node => $homo);
is $lineage->get_root_node->scientific_name, 'Eukaryota'; # PASS
my @nodes = $lineage->get_nodes;
is scalar(@nodes), 4; # PASS: we didn't pull in Rodentia which would be 8

(on that last test I can't remember if the answer might actually be 5 
because our lineage does contain 'Homo sapiens')


If anyone can figure out how to get all those to pass, please let me know.

From cjfields at uiuc.edu  Tue Jun 19 17:15:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 19 Jun 2007 16:15:00 -0500
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
Message-ID: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>


On Jun 19, 2007, at 1:54 PM, Steve Chervitz wrote:

> Valid points, Sendu. I wonder if there might be a best-of-both-worlds
> approach here. I would not be advocating for a major slice and dice,
> but just identifying a few large, reasonably well established and
> encapsulated blocks of functionality that could be managed more
> independently and segregating them away from the rest. For example:
> DB, Graphics, Search+SearchIO, Tools.

There should also be a consensus between the core devs on this; I  
don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing  
their opinions as it will directly impact projects which rely on core  
functionality (GBrowse/GMOD, bioperl-db, etc).  I also agree with  
George that this should be postponed until after svn issues are taken  
care of.

That said, I think this is a good idea in general, though we'll  
need to be careful which ones we segregate out as non-core.  I agree  
with your choices; I would add in Bio::Restriction, Bio::Assembly,  
Bio::Structure, and a few more.  As long as the distribution required  
installation of 'core' prior to test runs it shouldn't be too much of  
a problem.

In order for this to work we would need to delineate what defines  
'core' (how broad the definition should be), then identify those  
modules that don't fit and decide what to do with them.  Would we  
want to split the others into separate packages or lump together as a  
bioperl-auxiliary (horrid name, but you get my point)?  Too many  
could be a logistical nightmare, as Sendu has pointed out.

> Once per year, we could have a "whole caboodle" release where the core
> and all sub parts are tested and released as a group, as we currently
> do. Then, updates to the sub parts can occur as-needed but without
> necessarily involving updates to other sub parts or the core.

Sounds fine by me.  Actually, my thought was we could reimplement  
Bundle::BioPerl on CPAN (which Module::Build effectively obsoleted)  
to install all the necessary subpackages in order to emulate an old- 
style 'core' installation, or act as an 'install everything BioPerl- 
related' Bundle.  Regular updates of the subpackages to CPAN should  
just require updating the Bundle (which would update only the  
relevant parts, at least I believe it would).
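
For readers who have not seen one, a CPAN bundle is just a small module whose POD has a CONTENTS section listing one module per line, optionally followed by a minimum version and a comment after a dash. The sketch below shows the general shape only; the bundle name, version and the choice of listed modules are hypothetical and do not describe any released Bundle::BioPerl.

```perl
package Bundle::BioPerl::Everything;    # hypothetical bundle name

use strict;
use warnings;

our $VERSION = '0.01';                  # hypothetical version

1;

__END__

=head1 NAME

Bundle::BioPerl::Everything - hypothetical bundle pulling in core plus split-out parts

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerl::Everything'

=head1 CONTENTS

Bio::Root::Version 1.5.2 - stands for the core distribution

Bio::Graphics::Panel - stands for a split-out Graphics package (assumption)

Bio::SearchIO - stands for a split-out Search+SearchIO package (assumption)

=cut
```

CPAN.pm resolves each listed module to the distribution that contains it, so naming one module per sub-package should be enough to install that whole sub-package.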

> The onus would be on the pumpkin for the sub part release to make sure
> it continues to work with the last whole caboodle release. This would
> minimize the number of release clashes, since sub part updates would
> only be sanctioned relative to the last caboodle release, and it would
> ensure that the whole set continues to interoperate.
>
> Perhaps it would be worth experimenting with such an approach so we
> can judge it based on actual experience. We could identify one
> functional sub part and segregate it out, do a release cycle or two,
> along with a sub part release, and decide if this makes things easier
> or harder, for devs as well as users. We could always bring it back
> into the fold if it doesn't work out.
>
> My fear is that as bioperl continues to grow, the monolithic approach
> will become increasingly onerous for a single release pumpkin to
> manage, and harder to find someone who feels up to the task. It could
> also discourage new developers from diving into the codebase if it
> looks too deep. And they are our lifeblood.

Agreed!

> A more functionally segregated bioperl codebase could lower the
> activation energy needed to recruit release pumpkins and new devs,
> leading to more release iterations, fewer bugs, more features, and
> more sustainable growth.

'Activation energy.'  Hmm.  Spoken like a true biologist.

> When I first discovered Bioperl in 1996, it had three modules. At
> ~900, I  probably wouldn't have joined ranks as a developer (well, I
> probably would, but it would have taken a while to digest it and
> become a contributor).
>
> Steve

I pretty much agree, though this will require quite a bit more  
discussion.

chris


From hlapp at gmx.net  Tue Jun 19 17:57:54 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 17:57:54 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
Message-ID: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>


On Jun 19, 2007, at 5:15 PM, Chris Fields wrote:

> There should also be a consensus between the core devs on this; I
> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing
> their opinions

The problem I have increasingly had with BioPerl (aside from the fact  
that it's written in Perl ;) is the plethora of dependencies I need  
to install, not the number of modules.

But every time I've been told that that's what Perl is all about, and  
I should shut up and install the bundle. Idiosyncratically I don't  
like bundles that clutter up my hard disk with stuff I'll never use,  
and in this sense if BioPerl is divided into 10 packages I will have  
to think about whether I need each one, and do a separate CVS  
checkout - and regular update - of each one (though granted, I  
believe there are ways the multiple checkout and update thing can be  
taken care of).

In reality, this may be a rapidly disappearing trait though of those  
who have grown up in a time when they proudly spent all their savings  
to buy that new computer because it had a 20MB hard disk, compared to  
the two 360k floppy drives the previous one had.

So don't ask me, just don't make it too hard for the dinosaurs.

> as it will directly impact projects which rely on core
> functionality (GBrowse/GMOD, bioperl-db, etc).

Well, I hope there are ways to limit that?

> I also agree with George that this should be postponed until after  
> svn issues are taken care of.

I agree entirely. Please don't throw this in the same bin or tie one  
to the other. The migration is neither easier nor faster nor better  
testable with a partitioned BioPerl.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Jun 19 21:48:20 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 19 Jun 2007 20:48:20 -0500
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
Message-ID: <D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>


On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote:

> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote:
>
>> There should also be a consensus between the core devs on this; I
>> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing
>> their opinions
>
> The problem I have increasingly had with BioPerl (aside from the fact
> that it's written in Perl ;) is the plethora of dependencies I need
> to install, not the number of modules.
>
> But every time I've been told that that's what Perl is all about, and
> I should shut up and install the bundle. Idiosyncratically I don't
> like bundles that clutter up my hard disk with stuff I'll never use,
> and in this sense if BioPerl is divided into 10 packages I will have
> to think about each one whether I need it, and do a separate CVS
> checkout - and regular update - of each one (though granted, I
> believe there are ways the multiple checkout and update thing can be
> taken care of).

I agree; the fewer dependencies the better.  We could divide it up  
into a small, focused core package with only a few dependencies, and  
1-3 more containing the focused bits which require the most  
maintenance (Graphics, SearchIO/Tools, etc).  I worry about having  
too many more.

> In reality, this may be a rapidly disappearing trait though of those
> who have grown up in a time when they proudly spent all their savings
> to buy that new computer because it had a 20MB hard disk, compared to
> the two 360k floppy drives the previous one had.
>
> So don't ask me, just don't make it too hard for the dinosaurs.

There would need to be some way of getting an old-style full-blown  
core installation regardless of how many subdistros we would divvy
core up into.  My thought for CPAN was having Bundle::BioPerl take  
over this but I'm not sure if it's still being used.  Maybe there are  
other ways for svn/cvs.

>> as it will directly impact projects which rely on core
>> functionality (GBrowse/GMOD, bioperl-db, etc).
>
> Well, I hope there are ways to limit that?

I believe so, yes, particularly for bioperl-db.  I would think  
splitting off Bio::Graphics or Bio::DB* will have some effect on  
GBrowse/GFF.

>> I also agree with George that this should be postponed until after
>> svn issues are taken care of.
>
> I agree entirely. Please don't throw this in the same bin or tie one
> to the other. The migration is neither easier nor faster nor better
> testable with a partitioned BioPerl.
>
> 	-hilmar

We def. have to complete transition to subversion first, then think  
about this some more.

chris

From n.haigh at sheffield.ac.uk  Wed Jun 20 02:31:24 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 20 Jun 2007 07:31:24 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
Message-ID: <4678C9BC.10206@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote:
> 
>> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote:
>>
>>> There should also be a consensus between the core devs on this; I
>>> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing
>>> their opinions
>> The problem I have increasingly had with BioPerl (aside from the fact
>> that it's written in Perl ;) is the plethora of dependencies I need
>> to install, not the number of modules.
>>
>> But every time I've been told that that's what Perl is all about, and
>> I should shut up and install the bundle. Idiosyncratically I don't
>> like bundles that clutter up my hard disk with stuff I'll never use,
>> and in this sense if BioPerl is divided into 10 packages I will have
>> to think about each one whether I need it, and do a separate CVS
>> checkout - and regular update - of each one (though granted, I
>> believe there are ways the multiple checkout and update thing can be
>> taken care of).
> 
> I agree; the fewer dependencies the better.  We could divide it up  
> into a small, focused core package with only a few dependencies, and  
> 1-3 more containing the focused bits which require the most  
> maintenance (Graphics, SearchIO/Tools, etc).  I worry about having  
> too many more.
> 
>> In reality, this may be a rapidly disappearing trait though of those
>> who have grown up in a time when they proudly spent all their savings
>> to buy that new computer because it had a 20MB hard disk, compared to
>> the two 360k floppy drives the previous one had.
>>
>> So don't ask me, just don't make it too hard for the dinosaurs.
> 
> There would need to be some way of getting an old-style full-blown  
> core installation regardless of how many subdistros we would divy  
> core up into.  My thought for CPAN was having Bundle::BioPerl take  
> over this but I'm not sure if it's still being used.  Maybe there are  
> other ways for svn/cvs.

Personally, I think this use of Bundle::Bioperl is more in line with
what CPAN Bundles were meant to do - "a bundle is a collection of
modules that comprise a cohesive unit". Under that definition you could
probably put the whole of Bioperl but I won't go there! When a package
is updated and a new release is made, this should be
installable/updatable via cpan as well as updating the bundle with the
correct version. This way you can get all of Bioperl via the bundle, or
just install the sub-packages on their own.
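
As a rough illustration of what such a bundle file could look like (a
sketch only; the bundle name and the module names listed under CONTENTS
are made up for the example and do not exist), a CPAN bundle is just a
small module whose POD has a CONTENTS section with one module per line:

```perl
package Bundle::HypotheticalBioPerl;    # hypothetical bundle name
use strict;
use warnings;

our $VERSION = '0.01';

=head1 NAME

Bundle::HypotheticalBioPerl - sketch of a bundle pulling in every sub-package

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::HypotheticalBioPerl'

=head1 CONTENTS

Hypothetical::BioPerl::Core 1.0 - the small core (made-up name)

Hypothetical::BioPerl::Graphics - an optional sub-package (made-up name)

=cut

1;
```

Each CONTENTS line is a module name, an optional minimum version, and an
optional comment after " - ", so a new sub-package release would only
need its version bumped in that one line.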

If the switch over to svn takes place, will all the Bioperl-* projects
move over at the same time? If so, will they go into their own svn
repository or into the same one? Since with svn you can checkout any
subtree of the repository I'm not clear on the pros and cons of either
of these options.

Am I right in thinking that there is a way for cvs to define a "project"
such that when you checkout that "project" it actually checks out
multiple projects behind the scene? I'm sure I've seen this somewhere,
possibly when the project is dependent on some 3rd party code that is
also in cvs. If this is possible, I'm sure it will also be possible with
svn. This could then allow something like the following to happen after
the split up of Bioperl. The following projects could be defined:
bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project"
called "bioperl" would actually checkout the real projects called
bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems
that this ought to be possible, doesn't it?


> 
>>> as it will directly impact projects which rely on core
>>> functionality (GBrowse/GMOD, bioperl-db, etc).
>> Well, I hope there are ways to limit that?
> 
> I believe so, yes, particularly for bioperl-db.  I would think  
> splitting off Bio::Graphics or Bio::DB* will have some effect on  
> GBrowse/GFF.
> 
>>> I also agree with George that this should be postponed until after
>>> svn issues are taken care of.
>> I agree entirely. Please don't throw this in the same bin or tie one
>> to the other. The migration is neither easier nor faster nor better
>> testable with a partitioned BioPerl.
>>
>> 	-hilmar
> 
> We def. have to complete transition to subversion first, then think  
> about this some more.
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeMm7czuW2jkwy2gRAi+CAJ9cNZ70GojV7eviRjdWTFLk/MKYoACg2Ls4
op9sQTZyeK6G6taFhTAPMYc=
=7NRw
-----END PGP SIGNATURE-----

From hlapp at gmx.net  Wed Jun 20 07:46:16 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 07:46:16 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <4678C9BC.10206@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
Message-ID: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote:

> If the switch over to svn takes place, will all the Bioperl-* projects
> move over at the same time?

They are under the same CVSROOT right now. Locking down some
sub-repositories but not others may be odd or impossible.

> If so, will they go into their own svn repository or into the same  
> one?

Good question, I'm not sure about the pros and cons one way or the
other either. The fewer repositories the less sysadmin work in
fine-graining permissions.

	-hilmar

- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFGeRONuV6N2JxL7qsRAoYTAJ9GVuC0j4szCcWTg7yWGoxN3YFucQCgogJ8
Ims4d150lsX0vXtDwGI1lKg=
=K4++
-----END PGP SIGNATURE-----

From n.haigh at sheffield.ac.uk  Wed Jun 20 07:57:22 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 20 Jun 2007 12:57:22 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
Message-ID: <46791622.6080409@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hilmar Lapp wrote:
> 
> On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote:
> 
>> If the switch over to svn takes place, will all the Bioperl-* projects
>> move over at the same time?
> 
> They are under the same CVSROOT right now. Locking down some
> sub-repositories but not others may be odd or impossible.
> 
>> If so, will they go into their own svn repository or into the same one?
> 
> Good question, I'm not sure about the pros and cons one way or the other
> either. The fewer repositories the less sysadmin work in fine-graining
> permissions.
> 
>     -hilmar
> 


I don't think there is any major reason why the following single repos
wouldn't do the trick:

/--
  |-bioperl-live
  |     |--- trunk
  |     |--- branches
  |     |--- tags
  |
  |-bioperl-run
        |--- trunk
        |--- branches
        |--- tags

Any reason why this couldn't be used?

I know some people don't like the idea of the revision number
incrementing for the whole repository if it contains several "projects".
However, revision numbers are really only a way for svn to keep track of
things and a very large revision number shouldn't really "upset" anyone.

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeRYiczuW2jkwy2gRApS5AJsHl73MWZP8aMfOqlLgTYuzpMWmQgCg3VqA
1Vj8BSUnanpdjYYLE6eGanU=
=bOqK
-----END PGP SIGNATURE-----

From hlapp at gmx.net  Wed Jun 20 08:08:33 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 08:08:33 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <46791622.6080409@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
	<46791622.6080409@sheffield.ac.uk>
Message-ID: <DBFDD481-4377-4E7C-A4F6-B1B57A4D0A9F@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote:

> I don't think there is any major reason why the following single repos
> wouldn't do the trick:
>
> /--
>   |-bioperl-live
>   |     |--- trunk
>   |     |--- branches
>   |     |--- tags
>   |
>   |-bioperl-run
>         |--- trunk
>         |--- branches
>         |--- tags
>
> Any reason why this couldn't be used?

That would work fine except that there are several more sub-projects  
(bioperl-db, bioperl-graphics, bioperl-microarray, and a few more).

That should still be fine. I think what needs to be recognized is the  
limitations it puts on permission granularity. If it's all the same  
repository (as is now) then having commit rights to one (subproject)  
will mean commit rights to all. From my perspective that's fine, it  
has worked great so far.

	-hilmar

- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFGeRjFuV6N2JxL7qsRAj3dAJ42r1C8By29DNTUP9Ts0Lf5dOcS9QCgjSE1
hckjT7LBtHcmwGI8B+BKQIM=
=gYfA
-----END PGP SIGNATURE-----

From hartzell at alerce.com  Tue Jun 19 15:53:39 2007
From: hartzell at alerce.com (George Hartzell)
Date: Tue, 19 Jun 2007 12:53:39 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
Message-ID: <18040.13379.217277.992742@almost.alerce.com>

Steve Chervitz writes:
 > On 6/16/07, Jason Stajich <jason at bioperl.org> wrote:
 > > [...]
 > > Just to say I already went through all the steps of running cvs2svn
 > > myself and had problems gathering back out the branches and all the
 > > tags when I tried it.  If you want to start with a smaller repository
 > > like bioperl-network or bioperl-db as the initial cvs2svn conversion
 > > script took quite a long time to run on bioperl-live.
 > 
 > Might this be a good opportunity to investigate partitioning
 > bioperl-live into sub-repositories? [...]

I'd say that the time to do this kind of rearrangement would be
*after* the svn repo's set up.  That way you'll be able to track stuff
back through to the beginning of time.

g.


From sdavis2 at mail.nih.gov  Wed Jun 20 08:44:08 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 20 Jun 2007 08:44:08 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN
	and	...Re:	Perltidy)
In-Reply-To: <DBFDD481-4377-4E7C-A4F6-B1B57A4D0A9F@gmx.net>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>	<4678C9BC.10206@sheffield.ac.uk>	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>	<46791622.6080409@sheffield.ac.uk>
	<DBFDD481-4377-4E7C-A4F6-B1B57A4D0A9F@gmx.net>
Message-ID: <46792118.4030205@mail.nih.gov>

Hilmar Lapp wrote:
> 
> On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote:
> 
>> I don't think there is any major reason why the following single repos
>> wouldn't do the trick:
> 
>> /--
>>   |-bioperl-live
>>   |     |--- trunk
>>   |     |--- branches
>>   |     |--- tags
>>   |
>>   |-bioperl-run
>>         |--- trunk
>>         |--- branches
>>         |--- tags
> 
>> Any reason why this couldn't be used?
> 
> That would work fine except that there are several more sub-projects  
> (bioperl-db, bioperl-graphics, bioperl-microarray, and a few more).
> 
> That should still be fine. I think what needs to be recognized is the  
> limitations it puts on permission granularity. If it's all the same  
> repository (as is now) then having commit rights to one (subproject)  
> will mean commit rights to all. From my perspective that's fine, it  
> has worked great so far.

Actually, I think there are ways of creating per-directory access
control.  See here:

http://svnbook.red-bean.com/en/1.2/svn-book.html#svn.serverconfig.svnserve.auth.general

With Apache-based https access, such access control is relatively
straightforward, it appears.  With the standalone svn server over ssh,
one needs to use "commit hook scripts" to limit access.  But I think it
is possible (admitting that I have not tried to do this...).
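
For illustration only, a path-based authorization file of the kind the
Apache setup reads (via mod_authz_svn) might look something like the
fragment below. The group and user names are invented, and the paths
assume the single-repository layout proposed earlier in this thread:

```
[groups]
core-devs = hypothetical_alice, hypothetical_bob
graphics-devs = hypothetical_carol

[/]
* = r
@core-devs = rw

[/bioperl-graphics]
@graphics-devs = rw
```

Here everyone can read everything, the core group can commit anywhere,
and the graphics group can commit only under its own subtree.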

Sean

From hartzell at alerce.com  Wed Jun 20 09:23:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 20 Jun 2007 06:23:32 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <4678C9BC.10206@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
Message-ID: <18041.10836.728079.835572@almost.alerce.com>

Nathan S. Haigh writes:
 > [...]
 > If the switch over to svn takes place, will all the Bioperl-* projects
 > move over at the same time? If so, will they go into their own svn
 > repository or into the same one? Since with svn you can checkout any
 > subtree of the repository I'm not clear on the pro's and cons of either
 > of these options.

I'm planning to drop the projects from the top of the CVSROOT into a
single svn repository:

    bioperl-ext bioperl-pipeline biodata bioperl-gui
    bioperl-run bioperl-cookbook bioperl-live biosql-schema
    bioperl-corba-client bioperl-microarray html bioperl-corba-server
    bioperl-network task-manager bioperl-das-client bioperl-papers
    xml-html bioperl-db bioperl-pedigree

although that's open to feedback from the core members.

As a progress report, I've built a demo repos with -run, -ext, and
-live in it and asked a couple of folks to take a peek at it.  When
I get a bit further along I'll figure out how to get something for the
public to test.

 > Am I right in thinking that there is a way for cvs to define a "project"
 > such that when you checkout that "project" it actually checks out
 > multiple projects behind the scene? I'm sure I've seen this somewhere,
 > possibly when the project is dependent on some 3rd party code that is
 > also in cvs. If this is possible, I'm sure it will also be possible with
 > svn. This could then allow something like the following to happen after
 > the split up of Bioperl. The following projects could be defined:
 > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project"
 > called "bioperl" would actually checkout the real projects call
 > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems
 > that this ought to be possible, doesn't it?
 > [...]

I don't think that there's any functionality like that in svn.

g.

From hartzell at alerce.com  Wed Jun 20 09:26:04 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 20 Jun 2007 06:26:04 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <46791622.6080409@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
	<46791622.6080409@sheffield.ac.uk>
Message-ID: <18041.10988.375946.833182@almost.alerce.com>

Nathan S. Haigh writes:
 > -----BEGIN PGP SIGNED MESSAGE-----
 > Hash: SHA1
 > 
 > Hilmar Lapp wrote:
 > > 
 > > On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote:
 > > 
 > >> If the switch over to svn takes place, will all the Bioperl-* projects
 > >> move over at the same time?
 > > 
 > > They are under the same CVSROOT right now. Locking down some
 > > sub-repositories but not others may be odd or impossible.
 > > 
 > >> If so, will they go into their own svn repository or into the same one?
 > > 
 > > Good question, I'm not sure about the pros and cons one way or the other
 > > either. The fewer repositories the less sysadmin work in fine-graining
 > > permissions.
 > > 
 > >     -hilmar
 > > 
 > 
 > 
 > I don't think there is any major reason why the following single repos
 > wouldn't do the trick:
 > 
 > /--
 >   |-bioperl-live
 >   |     |--- trunk
 >   |     |--- branches
 >   |     |--- tags
 >   |
 >   |-bioperl-run
 >         |--- trunk
 >         |--- branches
 >         |--- tags
 > 
 > Any reason why this couldn't be used?
 > [...]

That's exactly the way that I'm setting it up.

g.

From n.haigh at sheffield.ac.uk  Wed Jun 20 09:33:33 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 20 Jun 2007 14:33:33 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <18041.10836.728079.835572@almost.alerce.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>	<4678C9BC.10206@sheffield.ac.uk>
	<18041.10836.728079.835572@almost.alerce.com>
Message-ID: <46792CAD.5060700@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George Hartzell wrote:
> Nathan S. Haigh writes:
>  > [...]
>  > If the switch over to svn takes place, will all the Bioperl-* projects
>  > move over at the same time? If so, will they go into their own svn
>  > repository or into the same one? Since with svn you can checkout any
>  > subtree of the repository I'm not clear on the pro's and cons of either
>  > of these options.
> 
> I'm planning to drop the projects from the top of the CVSROOT into a
> single svn repository:
> 
>     bioperl-ext bioperl-pipeline biodata bioperl-gui
>     bioperl-run bioperl-cookbook bioperl-live biosql-schema
>     bioperl-corba-client bioperl-microarray html bioperl-corba-server
>     bioperl-network task-manager bioperl-das-client bioperl-papers
>     xml-html bioperl-db bioperl-pedigree
> 
> although that's open to feedback from the core members.
> 
> As a progress report, I've built a demo repos with -run, -ext, and
> -live in it and asked a couple of folks to take a peek at it.  When
> I get a bit further along I'll figure out how to get something for the
> public to test.

Could I take a peek??

> 
>  > Am I right in thinking that there is a way for cvs to define a "project"
>  > such that when you checkout that "project" it actually checks out
>  > multiple projects behind the scene? I'm sure I've seen this somewhere,
>  > possibly when the project is dependent on some 3rd party code that is
>  > also in cvs. If this is possible, I'm sure it will also be possible with
>  > svn. This could then allow something like the following to happen after
>  > the split up of Bioperl. The following projects could be defined:
>  > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project"
>  > called "bioperl" would actually checkout the real projects call
>  > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems
>  > that this ought to be possible, doesn't it?
>  > [...]
> 
> I don't think that there's any functionality like that in svn.


I did come across this which might help:
http://subversion.tigris.org/servlets/ReadMsg?listName=users&msgNo=43561
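
As far as I know, the two mechanisms that fit the description are alias
modules in CVS and the svn:externals property in svn. Both fragments
below are sketches: the project names are the ones floated in this
thread and the URLs are placeholders, not real locations.

An alias entry in CVSROOT/modules, so that checking out "bioperl"
fetches both real modules:

```
bioperl -a bioperl-core bioperl-graphics
```

A possible value for the svn:externals property on a "bioperl"
directory, so that checking out that directory also pulls in the two
sub-projects:

```
bioperl-core      http://svn.example.org/hypothetical/bioperl-core/trunk
bioperl-graphics  http://svn.example.org/hypothetical/bioperl-graphics/trunk
```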

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeSytczuW2jkwy2gRAnlUAJ4pjhPlYlqOm+M882Ni116MJVzPCwCbB3Su
sWDAmqFhGgtlyeawaIGSV14=
=zeAY
-----END PGP SIGNATURE-----

From bix at sendu.me.uk  Wed Jun 20 11:38:20 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 20 Jun 2007 16:38:20 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
Message-ID: <467949EC.9040100@sendu.me.uk>

In considering updating all the test scripts to take advantage of the 
new network option, and/or reimplementing them in Test::More, I thought 
now would be a good time to standardize all the test scripts and reduce 
the possibility of having to alter them all in the future if something 
changes.

For example we could decide on an alternate way of choosing to run 
network tests, or a new way of deciding to output debug information. 
There are also some inconsistencies in the messages produced by tests 
skipping all, and even an unfortunate mistake that has been copy/pasted 
through a lot of test scripts.

My solution is t/lib/BioperlTest.pm (documented with perldoc)

We go from this:

----
use strict;
our $DEBUG;

BEGIN {
   $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;
	
   eval { require Test::More; };
   if( $@ ) {
     use lib 't/lib';
   }
   use Test::More; # the mistake!
	
   use Module::Build;
   my $build = Module::Build->current();
   my $do_network_tests = $build->notes('network');

   eval {
     require IO::String;
     require LWP;
     require LWP::UserAgent;
   };
   if ($@) {
     plan skip_all => 'IO::String or LWP or LWP::UserAgent not installed.
This means Bio::Tools::Run::RemoteBlast is not usable. Skipping tests';
   }
   elsif (!$do_network_tests) {
     plan skip_all => 'Network tests have not been requested, skipping
all';
   }
   else {
     plan tests => 21;
   }

   #...
}

my $obj = Bio::Object->new(-verbose => $DEBUG);
#...
----

To this:

----
use strict;

BEGIN {
   use lib 't/lib';
   use BioperlTest;

   test_begin(-requires_modules => [qw(IO::String LWP LWP::UserAgent)],
              -requires_networking => 1,
              -tests => 21);

   #...
}

my $obj = Bio::Object->new(-verbose => test_debug());
#...
----
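
To give a feel for what a helper like test_begin() has to do behind the
scenes, here is a minimal sketch. It is not the code in
t/lib/BioperlTest.pm: the function names skip_reason and
test_begin_sketch and the -network_enabled argument are invented for
the example (the real scripts read the 'network' note from
Module::Build, as in the "before" version above).

```perl
use strict;
use warnings;

# Return a skip message, or undef if the tests should go ahead.
sub skip_reason {
    my %args = @_;

    # Try to load each required module; report the first one missing.
    for my $module (@{ $args{-requires_modules} || [] }) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        return "The optional module $module is not installed, skipping all"
            unless eval { require $file; 1 };
    }

    # -network_enabled is a hypothetical stand-in for the build note.
    if ($args{-requires_networking} && !$args{-network_enabled}) {
        return 'Network tests have not been requested, skipping all';
    }

    return;
}

# Plan the tests, or skip the whole script with one consistent message.
sub test_begin_sketch {
    my %args = @_;
    require Test::More;

    my $reason = skip_reason(%args);
    Test::More::plan(skip_all => $reason) if defined $reason;  # exits here
    Test::More::plan(tests => $args{-tests});
}

# Prints the networking skip message, since 'strict' is always available.
print skip_reason(-requires_modules    => ['strict'],
                  -requires_networking => 1,
                  -network_enabled     => 0), "\n";
```

Keeping the decision in one function that only returns a message is
what lets the wording and the opt-in mechanism change later without
touching every test script.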


Can anyone identify problems with this approach? Is the interface 
presented by BioperlTest flexible enough that any changes would only be 
additions for new functionality (and therefore all test scripts wouldn't 
need to be altered)? Is BioperlTest missing anything you'd like?

Are there any objections to me updating all tests in this manner? For an 
example, see t/RemoteBlast.t
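
For anyone wondering about the mistake: one plausible reading is that
'use' statements are processed at compile time, so wrapping them in an
if block does not make them conditional. The sketch below demonstrates
that behaviour in isolation; the directory names are placeholders and
do not need to exist.

```perl
use strict;
use warnings;

BEGIN {
    my $need_fallback = 0;    # pretend the eval/require succeeded

    if ($need_fallback) {
        # 'use' is processed while this block is being compiled, so the
        # path is added even though this branch is never entered.
        use lib 'hypothetical/added_anyway';
    }

    if ($need_fallback) {
        # A run-time equivalent that really is conditional.
        require lib;
        lib->import('hypothetical/only_if_needed');
    }
}

my $added_anyway   = grep { $_ eq 'hypothetical/added_anyway' } @INC;
my $only_if_needed = grep { $_ eq 'hypothetical/only_if_needed' } @INC;

print "added_anyway:   ", ($added_anyway   ? "in \@INC" : "absent"), "\n";
print "only_if_needed: ", ($only_if_needed ? "in \@INC" : "absent"), "\n";
```

Run as is, this should report the first path as present in @INC and the
second as absent.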


Cheers,
Sendu.

From spiros at lokku.com  Wed Jun 20 11:49:48 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Wed, 20 Jun 2007 16:49:48 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
	<082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
	<4676B41E.3050706@sendu.me.uk>
	<4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
Message-ID: <bba689ec0706200849p3d32ffb8wee14bbeb2027e905@mail.gmail.com>

Yep, they are not all done. Some still need to be ported over; I'm doing
some here and there at home. However, the recent email Sendu sent, the
one about abstracting the setup of testing, is actually something I was
thinking about myself, so it might be a better way to tackle the problem.
For one, it would save us from duplicating the same 30 lines of code
across all tests.

As far as network tests are concerned, I've always been an avid hater of
them. I believe they bring more trouble than they contribute, due to the
diversity of setups people have. My way of tackling them was always to
group all the tests that required live access into one file and then
forcibly run just that - iff needed and not by default. Like I said,
that's just my opinion; I've been bitten by them one time too many.
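
A minimal sketch of that opt-in idea using a Test::More SKIP block. The
environment variable name is made up for the example, and the
live-access check is only a placeholder:

```perl
use strict;
use warnings;
use Test::More tests => 2;

# HYPOTHETICAL opt-in switch; off unless the user sets it explicitly.
my $run_network = $ENV{HYPOTHETICAL_BIOPERL_NETWORK} ? 1 : 0;

ok(1, 'offline check always runs');

SKIP: {
    skip 'network tests have not been requested', 1 unless $run_network;

    # A real script would contact a remote service here.
    ok(1, 'placeholder for a live-access check');
}
```

Either way the script reports two tests, so the plan stays valid whether
or not live access was requested.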

Spiros

On 6/18/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote:
>
> > Chris Fields wrote:
> >> Couldn't you enable BIOPERLDEBUG, disable network access, then
> >> iterate through tests checking for those which fail or skip?
> >
> > Yes, good idea, though my dev machine is also my email/webserver so
> > I'd rather come up with an alternate solution than one involving
> > 'disable network access'.
> >
> > Still, that's what I'll probably end up doing. Cheers!
> >
> >
> > Oh, Chris, Spiros, how goes the Test::More conversion? I might want
> > to wait for you to finish, or join in? If you're not going to have
> > time to do any more in the next few weeks, can you please update
> > http://www.bioperl.org/wiki/TestMoreProgress removing your name (or
> > in the opposite case, add your name in)? Its not quite clear to me
> > which tests are assigned to whom. Can someone clarify what the
> > markings mean?
> >
> > Cheers,
> > Sendu.
>
> Not sure how far along spiros is; I handed it over after I finished
> up to the 'Q' tests.  In general the ones marked out have been
> converted over, ones with names next to them have been claimed.  If
> you need help I'll prob. start back up again to finish them off; we
> just need to divy them up.
>
> chris
>

From hlapp at gmx.net  Wed Jun 20 12:27:47 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 12:27:47 -0400
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467949EC.9040100@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
Message-ID: <A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>

Very cool! Sounds like a no-brainer to me to adopt this in all the  
tests. -hilmar

On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote:

> In considering updating all the test scripts to take advantage of the
> new network option, and/or reimplementing them in Test::More, I thought
> now would be a good time to standardize all the test scripts and reduce
> the possibility of having to alter them all in the future if something
> changes.
>
> For example we could decide on an alternate way of choosing to run
> network tests, or a new way of deciding to output debug information.
> There are also some inconsistencies in the messages produced by tests
> skipping all, and even an unfortunate mistake that has been copy/pasted
> through a lot of test scripts.
>
> My solution is t/lib/BioperlTest.pm (documented with perldoc)
>
> We go from this:
>
> ----
> use strict;
> our $DEBUG;
>
> BEGIN {
>    $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;
> 	
>    eval { require Test::More; };
>    if( $@ ) {
>      use lib 't/lib';
>    }
>    use Test::More; # the mistake!
> 	
>    use Module::Build;
>    my $build = Module::Build->current();
>    my $do_network_tests = $build->notes('network');
>
>    eval {
>      require IO::String;
>      require LWP;
>      require LWP::UserAgent;
>    };
>    if ($@) {
>      plan skip_all => 'IO::String or LWP or LWP::UserAgent not installed. ' .
>          'This means Bio::Tools::Run::RemoteBlast is not usable. Skipping tests';
>    }
>    elsif (!$do_network_tests) {
>      plan skip_all => 'Network tests have not been requested, skipping all';
>    }
>    else {
>      plan tests => 21;
>    }
>
>    #...
> }
>
> my $obj = Bio::Object->new(-verbose => $DEBUG);
> #...
> ----
>
> To this:
>
> ----
> use strict;
>
> BEGIN {
>    use lib 't/lib';
>    use BioperlTest;
>
>    test_begin(-requires_modules => [qw(IO::String LWP LWP::UserAgent)],
>               -requires_networking => 1,
>               -tests => 21);
>
>    #...
> }
>
> my $obj = Bio::Object->new(-verbose => test_debug());
> #...
> ----
>
>
> Can anyone identify problems with this approach? Is the interface
> presented by BioperlTest flexible enough that any changes would only
> be additions for new functionality (and therefore all test scripts
> wouldn't need to be altered)? Is BioperlTest missing anything you'd
> like?
>
> Are there any objections to me updating all tests in this manner? For
> an example, see t/RemoteBlast.t
>
>
> Cheers,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
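
The -requires_modules idea in the quoted proposal can be illustrated with
a minimal sketch. This is not the code in t/lib/BioperlTest.pm; the helper
name missing_modules() is hypothetical and only shows one way a test
helper might decide whether to skip, using nothing beyond core Perl.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT the real BioperlTest code): return the names
# of the modules in @_ that cannot be loaded on this system.
sub missing_modules {
    my @missing;
    for my $module (@_) {
        # Convert Foo::Bar to Foo/Bar.pm so require() can take a string.
        (my $file = "$module.pm") =~ s{::}{/}g;
        push @missing, $module unless eval { require $file; 1 };
    }
    return @missing;
}

# File::Spec is a core module; the second name is deliberately bogus.
my @missing = missing_modules(qw(File::Spec No::Such::Module::Here));
print "Would skip all tests, missing: @missing\n" if @missing;
```

A real helper would presumably pass such a list to
'plan skip_all => ...' so that every test script reports missing
prerequisites with the same wording.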

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 20 12:44:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 20 Jun 2007 11:44:01 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>
References: <467949EC.9040100@sendu.me.uk>
	<A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>
Message-ID: <BF4BB95D-4B4F-4336-9FA4-AE7B0C961C96@uiuc.edu>

Agreed!  You've already created an example case so there's something  
to go off of.

I plan on changing some EUtilities tests soon so I'll try  
implementing this, basing off your RemoteBlast.t implementation.   
Seems clear enough on the surface; if I run into problems I'll post.

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From wollenbergk at mail.nih.gov  Wed Jun 20 14:11:04 2007
From: wollenbergk at mail.nih.gov (Wollenberg, Kurt (NIH/NIAID))
Date: Wed, 20 Jun 2007 14:11:04 -0400
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
Message-ID: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>

Greetings:

I am working on a script to take a list of sequence IDs, extract the
sequences from GenPept, and then run a BLAST search for each of the
retrieved sequences. I am having a problem with the sequence retrieval,
where some sequences are found and others are not and it's not obvious to me
why this is. 

For example, using a text file containing the two following IDs as input:
SKG3_YEAST
NEM1_YEAST

My script 

while( <IN> ) {
  chomp;
  my $seqid = $_;
  my $seq_obj = get_sequence( 'genpept', $seqid );
}

will create a sequence object for the first ID, (print "Accession of
",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct accession
number) but for the second I am told

-------------------- WARNING ---------------------
MSG: id (NEM1_YEAST) does not exist
---------------------------------------------------

When I pull up these records using the Entrez cross-database search in my web
browser I find genpept records for both SKG3_YEAST and NEM1_YEAST (using
these search terms). In both records these IDs reside in the same field
("DBSOURCE    swissprot: locus") so I'm mystified why get_sequence finds one
but not the other. Any advice would be greatly appreciated.

Cheers,
Kurt Wollenberg, Ph.D.
Phylogenetics and Sequence Analysis Consultant
Biocomputing Research Consulting Section
Bioinformatics and Scientific IT Program (BSIP)
NIH/NIAID/OTIS
Contractor, Lockheed Martin
http://bioinformatics.niaid.nih.gov

Disclaimer:
The information in this e-mail and any of its attachments is confidential
and may contain sensitive information. It should not be used by anyone who
is not the original intended recipient. If you have received this e-mail in
error please inform the sender and delete it from your mailbox or any other
storage devices. National Institute of Allergy and Infectious Diseases shall
not accept liability for any statements made that are sender's own and not
expressly made on behalf of the NIAID by one of its representatives.


From bosborne11 at verizon.net  Wed Jun 20 14:59:39 2007
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 20 Jun 2007 14:59:39 -0400
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
In-Reply-To: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>
Message-ID: <C29EF15B.EAF7%bosborne11@verizon.net>

Kurt,

I can't answer your question, but I wouldn't use Bio::Perl myself; I'd use
Bio::DB::GenPept:

501 ~>perl -e 'use Bio::DB::GenPept; $db = Bio::DB::GenPept->new; $seq =
$db->get_Seq_by_acc("NEM1_YEAST"); print $seq->seq;'
MNALKYFSNHLITTKKQKKINVEVTKNQDLLGPSKEVSNKYTSHSENDCVSEVDQQYDHSSSHLKESDQNQERKNS
VPKKPKALRSILIEKIASILWALLLFLPYYLIIKPLMSLWFVFTFPLSVIERRVKHTDKRNRGSNASENELPVSSS
NINDSSEKTNPKNCNLNTIPEAVEDDLNASDEIILQRDNVKGSLLRAQSVKSRPRSYSKSELSLSNHSSSNTVFGT
KRMGRFLFPKKLIPKSVLNTQKKKKLVIDLDETLIHSASRSTTHSNSSQGHLVEVKFGLSGIRTLYFIHKRPYCDL
FLTKVSKWYDLIIFTASMKEYADPVIDWLESSFPSSFSKRYYRSDCVLRDGVGYIKDLSIVKDSEENGKGSSSSLD
DVIIIDNSPVSYAMNVDNAIQVEGWISDPTDTDLLNLLPFLEAMRYSTDVRNILALKHGEKAFNIN
502 ~>

It's true that Bio::Perl is easy to use, but it's also _very_ limited.

Brian O.




From cjfields at uiuc.edu  Wed Jun 20 16:11:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 20 Jun 2007 15:11:34 -0500
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
In-Reply-To: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>
References: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>
Message-ID: <F9F5A58E-4767-49C4-80F2-DEE3CA474C01@uiuc.edu>

I'm assuming you are using the Bio::Perl exported sub get_sequence().
I am able to reproduce the issue using bioperl-live; it's an odd
issue, as direct use of Bio::DB::GenPept works fine:

use Bio::DB::GenPept;

my $factory = Bio::DB::GenPept->new();

my @accs = qw(SKG3_YEAST NEM1_YEAST);

my $io = $factory->get_Stream_by_acc(\@accs);

while (my $seq = $io->next_seq) {
     print "Accession:",$seq->accession,"\n";
}

chris



Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From sac at bioperl.org  Thu Jun 21 02:32:47 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Wed, 20 Jun 2007 23:32:47 -0700
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <BF4BB95D-4B4F-4336-9FA4-AE7B0C961C96@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>
	<BF4BB95D-4B4F-4336-9FA4-AE7B0C961C96@uiuc.edu>
Message-ID: <8f200b4c0706202332w25a09547k1de20f24466877d9@mail.gmail.com>

Looks like a nice refactor. After it's in place, don't forget to
update the wiki:
http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests

Steve


From staffa at niehs.nih.gov  Thu Jun 21 14:36:12 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Thu, 21 Jun 2007 14:36:12 -0400
Subject: [Bioperl-l] BIO::DB::FASTA  ID
Message-ID: <C2A03D5E.4DE9%staffa@niehs.nih.gov>

The program below returns only 1527 IDs from a FASTA file that I have
constructed, which has 1820 header lines:
mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa
1820
It actually does not return the first 3 IDs,
nor the 5th, nor 7..36, 38, 39, 41..44, ...
The header lines are of variable length and the sequence lines are 80
characters, except the last line of a record, which may be shorter.
Is there some caveat that I am ignoring in my format that breaks
Bio::DB::Fasta?


#!/usr/bin/perl
#
#
#
use strict;
use Bio::DB::Fasta;
use Bio::Tools::SeqWords;
use Bio::Seq;
use Bio::SeqIO;
$|=1;
#
#
my $Dpse_UTR_file_for_T_orthologs =
"/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa";
my $db = Bio::DB::Fasta->new
('/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa',
  -reindex,  -makeid => \&make_my_id);
my @ids = $db->ids;
my $number_in = @ids;
print "number of Dpse IDs = $number_in\n";
foreach my $id (@ids){
print "$id\n";
}
sub make_my_id {
#       parse header line:
#       >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT
    my $line = shift;
#    print "line = $line\n";
    $line =~ />(\w+) /;
    my $ID = $1;
#    print "ID = $ID\n";
    return $ID;
      }

-------------- next part --------------
A non-text attachment was scrubbed...
Name: T_orthologs_Dpse_genes.fa
Type: application/octet-stream
Size: 5033676 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070621/07c354d0/attachment-0001.obj 

From jason at bioperl.org  Thu Jun 21 17:19:14 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 21 Jun 2007 14:19:14 -0700
Subject: [Bioperl-l] BIO::DB::FASTA  ID
In-Reply-To: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
References: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
Message-ID: <F3A92546-08EE-4AD5-BFCE-BF006D153AD7@bioperl.org>

Hey Nick -
I think
a) your IDs are not unique
b) you need to declare the function make_my_id BEFORE your call  
Bio::DB::Fasta->new if you want your function to be used.

$ grep "^>" T_orthologs_Dpse_genes.fa | awk '{print $1}' | sort | uniq | wc -l
1527
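
One more thing that may be worth checking in the posted script, offered
as an observation and not as a confirmed diagnosis: the call passes
'-reindex,' followed by '-makeid => \&make_my_id'. The sketch below
assumes the constructor reads its options as key/value pairs (check the
Bio::DB::Fasta POD for the actual behaviour); pair_options() is a
hypothetical stand-in for that pairing, not BioPerl code. Under that
assumption the bare '-reindex' consumes '-makeid' as its value, so the
callback is never registered.

```perl
use strict;
use warnings;

# The callback from the posted script, reduced to its essentials.
sub make_my_id {
    my ($line) = @_;
    return $line =~ /^>(\w+) / ? $1 : undef;
}

# Hypothetical stand-in for a constructor that treats its arguments as
# key/value pairs (an assumption about the real constructor).
sub pair_options {
    my @args = @_;
    my %opts;
    while (@args) {
        my $key = shift @args;
        $opts{$key} = shift @args;    # undef when the list has odd length
    }
    return \%opts;
}

# As posted: '-reindex' has no value of its own.
my $as_posted = pair_options('-reindex', '-makeid' => \&make_my_id);
# With an explicit value, every key lines up with its intended value.
my $paired = pair_options('-reindex' => 1, '-makeid' => \&make_my_id);

for my $case ([ 'as posted', $as_posted ], [ 'paired', $paired ]) {
    my ($label, $opts) = @$case;
    printf "%-10s -makeid is %s\n", $label,
        ref $opts->{'-makeid'} eq 'CODE' ? 'a code ref' : 'not set';
}
```

If the assumption holds, writing '-reindex => 1' keeps the pairs aligned.
Uniqueness of the IDs is a separate question, and the grep count above
already speaks to it.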


-jason

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From mkiwala at watson.wustl.edu  Thu Jun 21 17:23:46 2007
From: mkiwala at watson.wustl.edu (Michael Kiwala)
Date: Thu, 21 Jun 2007 16:23:46 -0500
Subject: [Bioperl-l] BIO::DB::FASTA  ID
In-Reply-To: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
References: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
Message-ID: <467AEC62.2040508@watson.wustl.edu>

You only have 1527 unique IDs in the file.

~$ grep '^>' Desktop/T_orthologs_Dpse_genes.fa|cut -d\  -f1|sort -u|wc -l
1527


Change your make_my_id function to make sure the IDs are unique.
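
A minimal sketch of one way to do that, in core Perl only. The header
lines are made up for illustration (only the first follows the example
in the posted script), and make_unique_id() is a hypothetical name. It
keeps the first word of the header and appends a counter to repeats, so
no two records share a key. Whether a stateful callback like this suits
a -makeid hook is an assumption to verify, since the suffix depends on
the order in which headers are seen.

```perl
use strict;
use warnings;

# Made-up header lines; the last two share the same first word.
my @headers = (
    '>Dpse_GA13134 CG14636 NO UTR',
    '>Dpse_GA10001 CG00001 first copy',
    '>Dpse_GA10001 CG00002 second copy',
);

my %seen;    # how many times each bare ID has been seen so far

# Return the first word of the header; repeats get ".2", ".3", ...
sub make_unique_id {
    my ($line) = @_;
    my ($id) = $line =~ /^>(\S+)/ or return;
    my $count = ++$seen{$id};
    return $count == 1 ? $id : "$id.$count";
}

my @ids = map { make_unique_id($_) } @headers;
print "$_\n" for @ids;

# Report which bare IDs occurred more than once.
my @dups = grep { $seen{$_} > 1 } sort keys %seen;
print "duplicated: @dups\n";
```

Listing the duplicated names first is probably the more useful step,
since repeated IDs may point to a problem in how the file was built.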



From bix at sendu.me.uk  Mon Jun 25 09:06:27 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 25 Jun 2007 14:06:27 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467949EC.9040100@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
Message-ID: <467FBDD3.8050009@sendu.me.uk>

Sendu Bala wrote:
> In considering updating all the test scripts to [... use] t/lib/BioperlTest.pm

I'm now in the process of converting all test scripts. In addition to 
those things mentioned previously, BioperlTest now also provides the 
methods test_input_file() and test_output_file().


This:
----
use Bio::Root::IO;
my $output_file = Bio::Root::IO->catfile(qw(t data temp.file));
$obj->new(-file => ">$output_file");

END {
   unlink($output_file);
}

...

$obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file)));
----


Becomes this:
----
my $output_file = test_output_file();
$obj->new(-file => ">$output_file");

...

$obj->new(-file => test_input_file('input.file'));
----


I should think the benefits are obvious, especially for the output
files: because END blocks are used inconsistently, or not at all, some
tests leave output data behind on occasion.

test_input_file() is helpful as shorthand, but it also gets rid of
many tests' usage of Bio::Root::IO. Relying on something you're
installing, and which is tested in another test script, to work in the
current test script without testing it there seems like a no-no to me.
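
The clean-up problem described above can be sketched with the core
module File::Temp. This is an illustration of the general technique, not
the BioperlTest implementation; sketch_output_file() is a hypothetical
name, and whether test_output_file() works this way is an assumption.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for test_output_file(): return the name of a
# fresh temporary file that File::Temp removes when the program exits,
# so no END block is needed in the test script.
sub sketch_output_file {
    my ($fh, $filename) = tempfile(UNLINK => 1);
    close $fh;
    return $filename;
}

my $output_file = sketch_output_file();

# Write to it the way a test would, via a ">$output_file" style open.
open my $out, '>', $output_file or die "Cannot write $output_file: $!";
print {$out} "temporary test output\n";
close $out;

print "wrote ", -s $output_file, " bytes; the file is removed at exit\n";
```

Because the removal is registered when the file is created, a test
script cannot forget it, which is the inconsistency the message
complains about.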

From cjfields at uiuc.edu  Mon Jun 25 09:39:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 08:39:21 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467FBDD3.8050009@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
Message-ID: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>

On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> In considering updating all the test scripts to [... use] t/lib/BioperlTest.pm
>
> I'm now in the process of converting all test scripts. In addition to
> those things mentioned previously, BioperlTest now also provides the
> methods test_input_file() and test_output_file().
>
>
> This:
> ----
> use Bio::Root::IO;
> my $output_file = Bio::Root::IO->catfile(qw(t data temp.file));
> $obj->new(-file => ">$output_file");
>
> END {
>    unlink($output_file);
> }
>
> ...
>
> $obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file)));
> ----
>
>
> Becomes this:
> ----
> my $output_file = test_output_file();
> $obj->new(-file => ">$output_file");
>
> ...
>
> $obj->new(-file => test_input_file('input.file'));
> ----
>
>
> I should think the benefits are obvious, especially for the output
> files, which thanks to inconsistency of using END blocks correctly
> or at all, leaves some output data behind on occasion.

Sounds fine by me, though it's a lot of work.  BTW, did we ever  
decide whether to finish up with Test::More conversion?  I haven't  
heard back yet; let me know what you want to do.

> test_input_file() is helpful for the shorthand, but also gets rid of
> many tests' usage of Bio::Root::IO (relying on something you're
> installing and testing in another test script to work in the current
> test script, without testing it in your own test script seems like a
> no-no to me).

Well, in a way isn't that itself a test of the class (whether it  
breaks or not)?  ; >

Do test_input_file() and test_output_file() handle directory
structures in an OS-safe way like catfile()?  For instance, I plan on
adding test data to a new directory similar to Bio::Graphics
(t/data/eutil) to prevent cluttering of the t/data directory.  I could
use '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the
base directory is 't/data' but that may not be cross-platform
compatible with win32 file systems, which may still expect something
like 't\data\eutil\input.xml'.

chris

From bix at sendu.me.uk  Mon Jun 25 09:45:23 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 25 Jun 2007 14:45:23 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
Message-ID: <467FC6F3.6080705@sendu.me.uk>

Chris Fields wrote:
> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:
>> I should think the benefits are obvious, especially for the output
>> files, which thanks to inconsistency of using END blocks correctly or at
>> all, leaves some output data behind on occasion.
> 
> Sounds fine by me, though it's a lot of work.  BTW, did we ever decide 
> whether to finish up with Test::More conversion?  I haven't heard back 
> yet; let me know what you want to do.

I'm doing the remaining Test::More conversions at the same time.


> Do test_input_file() and test_output_file() handle directory structures 
> in an OS-safe way like catfile()?  For instance, I plan on adding test 
> data to a new directory similar to Bio::Graphics (t/data/eutil) to 
> prevent cluttering of the t/data directory.  I could use 
> '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the base 
> directory is 't/data' but that may not be cross-platform compatible with 
> win32 file systems, which may still expect something like 
> 't\data\eutil\input.xml'.

It's platform-independent, currently implemented using File::Spec. So 
you'll say:

$obj->new(-file => test_input_file('eutil', 'input.xml'));

It's all documented in the POD of BioperlTest.
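
For readers without the module to hand, the File::Spec approach can be
sketched as follows. sketch_input_file() is a hypothetical stand-in, and
the 't/data' base directory is taken from the discussion above; the real
test_input_file() may differ in detail.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical stand-in for test_input_file(): join the given path
# components onto t/data using the current platform's separator.
sub sketch_input_file {
    my @components = @_;
    return File::Spec->catfile('t', 'data', @components);
}

# Pass each directory level as its own argument instead of embedding
# '/' or '\' in a single string.
my $path = sketch_input_file('eutil', 'input.xml');
print "$path\n";
```

Passing the components separately is what keeps the call portable: the
caller never writes a path separator.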


From cjfields at uiuc.edu  Mon Jun 25 09:49:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 08:49:51 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467FC6F3.6080705@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
	<467FC6F3.6080705@sendu.me.uk>
Message-ID: <679B8E76-C090-4A29-B843-99B5853FE2FB@uiuc.edu>


On Jun 25, 2007, at 8:45 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:
>>> I should think the benefits are obvious, especially for the output
>>> files, which thanks to inconsistency of using END blocks correctly
>>> or at all, leaves some output data behind on occasion.
>> Sounds fine by me, though it's a lot of work.  BTW, did we ever  
>> decide whether to finish up with Test::More conversion?  I haven't  
>> heard back yet; let me know what you want to do.
>
> I'm doing the remaining Test::More conversions at the same time.

Okay.  Just didn't want to do any redundant work if it's already  
being/been done.

>> Do test_input_file() and test_output_file() handle directory  
>> structures in an OS-safe way like catfile()?  For instance, I plan  
>> on adding test data to a new directory similar to Bio::Graphics (t/ 
>> data/eutil) to prevent cluttering of the t/data directory.  I  
>> could use '$obj->new(-file => test_input_file('/eutil/ 
>> input.xml'))' if the base directory is 't/data' but that may not  
>> be cross-platform compatible with win32 file systems, which may  
>> still expect something like 't\data\eutil\input.xml'.
>
> It's platform-independent, currently implemented using File::Spec.  
> So you'll say:
>
> $obj->new(-file => test_input_file('eutil', 'input.xml'));
>
> It's all documented in the POD of BioperlTest.

yay!

chris

From mmokrejs at ribosome.natur.cuni.cz  Mon Jun 25 12:06:24 2007
From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ)
Date: Mon, 25 Jun 2007 18:06:24 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <467254DD.3010505@mrc-lmb.cam.ac.uk>
References: <466938F6.7050903@ribosome.natur.cuni.cz>	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>	<467178AE.5040905@ribosome.natur.cuni.cz>	<46717990.6040509@ribosome.natur.cuni.cz>
	<467254DD.3010505@mrc-lmb.cam.ac.uk>
Message-ID: <467FE800.4010300@ribosome.natur.cuni.cz>


Dave Howorth wrote:
> Martin MOKREJŠ wrote:
>>>> Also, there is a *huge* amount of documentation and examples on
>>>> the BioPerl website.
>>>>
>>>> http://www.bioperl.org/wiki/HOWTOs
>>> You mean 
>>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
>>>  ? ;-)
>> $ perl embl2picture.pl ~/99.gb | display - Error returned while
>> evaluating value of 'description' option for glyph
>> Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature
>> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl
>> line 141, <GEN0> line 125.
> 
> Hmm an error at line 141 of a 69 line script? Methinks you're not
> actually running the script that's presented on the wiki page you
> quoted. I cut-and-pasted the script and your file and it worked for me
> (at least, it produced an image, along with a bunch of OOPS lines)

Maybe you used the first version of the script?  There are two or more
scripts, I used the very last one.

M.


From cjfields at uiuc.edu  Mon Jun 25 12:48:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 11:48:30 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <467FE7B0.3010904@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
	<CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
	<46723F91.60501@ribosome.natur.cuni.cz>
	<A2212781-75F3-4BB7-967F-1668B682E84E@uiuc.edu>
	<467FE7B0.3010904@ribosome.natur.cuni.cz>
Message-ID: <B9DB370F-FB17-4DEF-9664-37489D84FC05@uiuc.edu>

Martin,

Keep bioperl-related discussion on the bioperl mail list.  The large  
majority of this isn't biopython-related, but maybe some devs there  
can add to this?

On Jun 25, 2007, at 11:05 AM, Martin MOKREJŠ wrote:

...

> Would you please tell me exactly what is wrong with the spacing?

Here's a section of the seq record attached to your previous email:

DEFINITION .
ACCESSION .
VERSION .
SOURCE .
   ORGANISM .

Normally there is a fixed column width for any data present in a  
field, so it would look more like this:

DEFINITION  PYR4 (DIHYDROOROTASE, PYRIMIDIN 4, dihydroorotase); dihydroorotase
            [Arabidopsis thaliana].
ACCESSION   NP_194024
VERSION     NP_194024.1  GI:15235865
DBSOURCE    REFSEQ: accession NM_118422.3
KEYWORDS    .
SOURCE      Arabidopsis thaliana (thale cress)
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons;
            rosids; eurosids II; Brassicales; Brassicaceae; Arabidopsis.

Here's the relevant bit in the latest release notes:

"The second part of each sequence entry record contains the information
appropriate to its keyword, in positions 13 to 80 for keywords and
positions 11 to 80 for the sequence."

The bioperl devs try to make our parsers as flexible as possible but  
others may not, so it's something in ApE that should probably be  
fixed.  And as mentioned to you several times in the past on the mail  
list and on bugzilla, don't expect sequence records which sway from  
the standard (in this case, the release notes) to parse correctly in  
all cases.  We can try supporting some that sway from that standard  
but only up to a point.  If it causes additional bugs, headaches, or  
degrades performance it won't be supported.
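
For anyone who wants to check a file before feeding it to a parser, the column rule quoted above can be sketched in a few lines of plain Perl. This is only an illustration under stated assumptions: the function name misaligned_header_lines is made up, keywords are assumed to be upper case with at most three leading spaces (as for ORGANISM or PUBMED), and only the header before FEATURES/ORIGIN is examined. It is not how the bioperl parsers work.

```perl
use strict;
use warnings;

# Hypothetical checker: returns the header lines whose data does not
# start in column 13. Assumes columns 1-12 hold an (optionally indented)
# upper-case keyword padded with spaces, or spaces only on continuation
# lines.
sub misaligned_header_lines {
    my ($record) = @_;
    my @bad;
    for my $line (split /\n/, $record) {
        last if $line =~ /^(?:FEATURES|ORIGIN)\b/;   # header section only
        next unless $line =~ /\S/;                   # skip blank lines
        my $ok = length($line) > 12
            && substr($line, 0, 12) =~ /^ {0,3}[A-Z]* +$/
            && substr($line, 12, 1) =~ /\S/;
        push @bad, $line unless $ok;
    }
    return @bad;
}

my $record = "DEFINITION .\nACCESSION   NP_194024\nVERSION .\n";
print "misaligned: $_\n" for misaligned_header_lines($record);
```

In this sample the DEFINITION and VERSION lines are reported, because their data starts before column 13, while the ACCESSION line passes.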

> ...
> Well, I just copy&pasted the script from the bioperl webpages, I think
> from a tutorial or FAQ, don't remember anymore.

Well, can't help you if you can't point out where the code originated  
from.  We would like to know so it can be corrected.

> ...
> Well, my search for such tools available on Unix to be used in a  
> script,
> non-interactively, completely failed. My last hope except getting  
> improved
> ApE is to use the GenomeDiagram under biopython, but so far my .gb  
> files
> cannot be parsed yet. :(
> Martin

As mentioned previously you will likely have to code for it yourself  
(perl or python) or help debug the relevant biopython code to get it  
working.  We can't/won't do this for you unless/until it's something  
we feel warrants implementation.  Judging by the bug list, we also  
haven't the time nor inclination to code for it.  Sorry but we have  
other priorities besides doing your work for you.

chris


From jesper at krogh.cc  Tue Jun 26 03:05:32 2007
From: jesper at krogh.cc (Jesper Krogh)
Date: Tue, 26 Jun 2007 09:05:32 +0200 (CEST)
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
Message-ID: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>

Hi List.

Trying to parse the embl database, the embl-parser fails on: AB019196
http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196


------------- EXCEPTION: Bio::Root::Exception -------------
MSG: AB019196 seems to have an invalid species classification.
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359
STACK: Bio::SeqIO::embl::_read_EMBL_Species /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
STACK: Bio::SeqIO::embl::next_seq /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
STACK: -e:1
-----------------------------------------------------------


It seems to be dissatisfied with this:
OS   Acetobacter aceti
OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;
OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.

Thanks.
-- 
Jesper Krogh


From cjfields at uiuc.edu  Tue Jun 26 09:13:50 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 08:13:50 -0500
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
Message-ID: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>

I can verify this using bioperl-live.  Can you file this as a bug?

http://bugzilla.open-bio.org/

chris

On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote:

> Hi List.
>
> Trying to parse the embl database, the embl-parser fails on: AB019196
> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196
>
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: AB019196 seems to have an invalid species classification.
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359
> STACK: Bio::SeqIO::embl::_read_EMBL_Species
> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
> STACK: Bio::SeqIO::embl::next_seq
> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
> STACK: -e:1
> -----------------------------------------------------------
>
>
> It seems to be dissatisfied with this:
> OS   Acetobacter aceti
> OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;
> OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.
>
> Thanks.
> -- 
> Jesper Krogh
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From suji_ramin at yahoo.com  Tue Jun 26 00:58:36 2007
From: suji_ramin at yahoo.com (SujiBala)
Date: Mon, 25 Jun 2007 21:58:36 -0700 (PDT)
Subject: [Bioperl-l] Error in constructing Phylogenetic tree using
	BioPerl
Message-ID: <571051.26423.qm@web51107.mail.re2.yahoo.com>

Hi,
  This is Sujatha from Singapore. I am trying to construct a phylogenetic tree using DNAStatistics and the Kimura method, but I am getting the following error message. It would be nice if you could help me resolve this problem as soon as possible.
   
  Error message:
    Must supply  a valid Bio::Align::AlignI for the _align parameter  in the distance 
  My program:
  use Bio::AlignIO;
use Bio::Align::DNAStatistics;
use Bio::Tree::DistanceFactory;
# for a dna alignment  can also use ProteinStatistics
@aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
$stats = Bio::Align::DNAStatistics->new;
$mat = $stats->distance( -align  => @aln,-method => 'Kimura');
$dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ');
$tree = $dfactory->make_tree($mat);
   
  I am using a clustalw-formatted fasta file with more than one sequence.
   

SujiBala



From bartels.stefan at mh-hannover.de  Tue Jun 26 05:26:03 2007
From: bartels.stefan at mh-hannover.de (don esteban)
Date: Tue, 26 Jun 2007 02:26:03 -0700 (PDT)
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To: <BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>
References: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
	<BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>
Message-ID: <11302459.post@talk.nabble.com>


Try using the proxy configuration in your script:

$ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080";


L Xu wrote:
> 
> I do have an internet connection but do not use a proxy server.
> I tested the network connection with the ping command (below). The NCBI
> website 
> does not respond. Is there any special network setting needed for 
> connecting to the NCBI website?
> Thank you so much.
> 
> C:\>ping www.yahoo.com
> 
> Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:
> 
> Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=360ms TTL=45
> 
> Ping statistics for 69.147.114.210:
>     Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
> Approximate round trip times in milli-seconds:
>     Minimum = 312ms, Maximum = 363ms, Average = 338ms
> 
> C:\>ping www.ncbi.nlm.nih.gov
> 
> Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:
> 
> Request timed out.
> Request timed out.
> Request timed out.
> Request timed out.
> 
> Ping statistics for 130.14.29.110:
>     Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
> 
> 
> 
> = = = Original message = = =
> 
> Judging by the output it looks like you have no network access or can't 
> connect to the server (what remoteblast needs). Make sure you don't need 
> proxy settings.
> 
> To preempt the next question, no, I'm not going to explain what a proxy 
> is. The RemoteBlast docs show how to set them, and Google is a wonderful 
> tool...
> 
> chris
> 
> On Jun 13, 2007, at 7:16 AM, L Xu wrote:
> 
> 
>    ...
> -------------------- WARNING ---------------------
> MSG: <HTML>
> <HEAD><TITLE>An Error Occurred</TITLE></HEAD>
> <BODY>
> <H1>An Error Occurred</H1>
> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
> </BODY>
> </HTML>
> 
> ---------------------------------------------------
> ...
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Example-code-in-Bioperl-Tutorial-tf3914295.html#a11302459
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From rahall2 at ualr.edu  Tue Jun 26 09:51:08 2007
From: rahall2 at ualr.edu (Roger Hall)
Date: Tue, 26 Jun 2007 08:51:08 -0500
Subject: [Bioperl-l] Tuesday: ill
Message-ID: <000001c7b7f9$0d029040$4601a8c0@LIBERAL2>

Well I guess I won't be in today after all.
 
Michael, Stephen, and Ames: please call me from the grad office at 10 on
my cell phone (744-8514). 
 
Phil: please go ahead and meet with Tim, and let me know what questions
remain afterwards.
 
Thanks!
 
Roger Hall
Technical Director
MidSouth Bioinformatics Center
University of Arkansas at Little Rock
(501) 569-8074
 

From cjfields at uiuc.edu  Tue Jun 26 10:02:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 09:02:29 -0500
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <4681185D.5030402@cam.ac.uk>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
	<246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
	<4681185D.5030402@cam.ac.uk>
Message-ID: <EC86EE5C-02DF-4E4F-AF25-6E53925CBC1F@uiuc.edu>

I'll try getting to that ASAP (as well as a few bugs).  The problem is  
we have to patch this in 2-3 places (SeqIO::swiss, SeqIO::embl) due  
to repeated code issues, something I'm trying to rectify with a new  
set of parsers.  Just haven't had the time to work on them lately  
unfortunately.

chris

On Jun 26, 2007, at 8:45 AM, Roy Chaudhuri wrote:

> Sorry, replied to this but forgot to cc the list.
>
> It looks like a related problem to bug 2288 that I filed about  
> Bio::SeqIO::swiss - the period after subgen. is what causes the  
> problems since it is interpreted as a separator between nodes. I  
> put a patch in for Bio::SeqIO::swiss that works for me, but I guess  
> it might have side effects.
>
> Roy.
> --
> Dr. Roy Chaudhuri
> Department of Veterinary Medicine
> University of Cambridge, U.K.
>
> Chris Fields wrote:
>> I can verify this using bioperl-live.  Can you file this as a bug?
>> http://bugzilla.open-bio.org/
>> chris
>> On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote:
>>> Hi List.
>>>
>>> Trying to parse the embl database, the embl-parser fails on:  
>>> AB019196
>>> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196
>>>
>>>
>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>> MSG: AB019196 seems to have an invalid species classification.
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/ 
>>> Root.pm:359
>>> STACK: Bio::SeqIO::embl::_read_EMBL_Species
>>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
>>> STACK: Bio::SeqIO::embl::next_seq
>>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
>>> STACK: -e:1
>>> -----------------------------------------------------------
>>>
>>>
>>> It seems to be dissatisfied with this:
>>> OS   Acetobacter aceti
>>> OC   Bacteria; Proteobacteria; Alphaproteobacteria;  
>>> Rhodospirillales;
>>> OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.
>>>
>>> Thanks.
>>> -- 
>>> Jesper Krogh
>>>
>>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From rrc22 at cam.ac.uk  Tue Jun 26 09:45:01 2007
From: rrc22 at cam.ac.uk (Roy Chaudhuri)
Date: Tue, 26 Jun 2007 14:45:01 +0100
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
	<246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
Message-ID: <4681185D.5030402@cam.ac.uk>

Sorry, replied to this but forgot to cc the list.

It looks like a related problem to bug 2288 that I filed about 
Bio::SeqIO::swiss - the period after subgen. is what causes the problems 
since it is interpreted as a separator between nodes. I put a patch in 
for Bio::SeqIO::swiss that works for me, but I guess it might have side 
effects.
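
To make the failure mode concrete, here is a minimal sketch of one way to split OC lines so that only the final period terminates the classification and the semicolons alone separate nodes. It is an illustration only: the function name parse_oc_lines is made up, and this is neither the patch attached to bug 2288 nor the code in Bio::SeqIO::embl or Bio::SeqIO::swiss.

```perl
use strict;
use warnings;

# Hypothetical helper: turns EMBL OC lines into a list of taxonomy nodes.
# Only the period at the very end of the classification is removed, so an
# internal abbreviation such as "subgen." stays inside its node.
sub parse_oc_lines {
    my @oc_lines = @_;
    my $text = join ' ', map { my $l = $_; $l =~ s/^OC\s+//; $l } @oc_lines;
    $text =~ s/\.\s*$//;                      # drop the terminating period
    my @nodes;
    for my $node (split /;/, $text) {
        $node =~ s/^\s+|\s+$//g;              # trim surrounding whitespace
        push @nodes, $node if length $node;
    }
    return @nodes;
}

my @nodes = parse_oc_lines(
    'OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;',
    'OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.',
);
print "$_\n" for @nodes;
```

With the AB019196 lines this yields seven nodes, the last being "Acetobacter subgen. Acetobacter". Whether splitting on semicolons alone is safe for every record in the database is an open question, which is presumably where the side effects mentioned above could come in.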

Roy.
--
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.

Chris Fields wrote:
> I can verify this using bioperl-live.  Can you file this as a bug?
> 
> http://bugzilla.open-bio.org/
> 
> chris
> 
> On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote:
> 
>> Hi List.
>>
>> Trying to parse the embl database, the embl-parser fails on: AB019196
>> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196
>>
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: AB019196 seems to have an invalid species classification.
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359
>> STACK: Bio::SeqIO::embl::_read_EMBL_Species
>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
>> STACK: Bio::SeqIO::embl::next_seq
>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
>> STACK: -e:1
>> -----------------------------------------------------------
>>
>>
>> It seems to be dissatisfied with this:
>> OS   Acetobacter aceti
>> OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;
>> OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.
>>
>> Thanks.
>> -- 
>> Jesper Krogh
>>
>>
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 


From bix at sendu.me.uk  Tue Jun 26 10:13:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 26 Jun 2007 15:13:48 +0100
Subject: [Bioperl-l] Error in constructing Phylogenetic tree
	using	BioPerl
In-Reply-To: <571051.26423.qm@web51107.mail.re2.yahoo.com>
References: <571051.26423.qm@web51107.mail.re2.yahoo.com>
Message-ID: <46811F1C.3020307@sendu.me.uk>

SujiBala wrote:
> Hi,
>   This is Sujatha from Singapore. I am trying to construct a phylogenetic tree using DNAStatistics and the Kimura method, but I am getting the following error message. It would be nice if you could help me resolve this problem as soon as possible.
>    
>   Error message:
>     Must supply  a valid Bio::Align::AlignI for the _align parameter  in the distance 
>   My program:
>   use Bio::AlignIO;
> use Bio::Align::DNAStatistics;
> use Bio::Tree::DistanceFactory;
> # for a dna alignment  can also use ProteinStatistics
> @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
> $stats = Bio::Align::DNAStatistics->new;
> $mat = $stats->distance( -align  => @aln,-method => 'Kimura');

Without looking at the docs for these modules, it is immediately obvious 
that Bio::AlignIO->new() is going to return an instance of Bio::AlignIO 
and not an array of alignments. It is also obvious that the -align => 
parameter for the distance() method can't take an array of anything (but 
probably an array ref?).

Check the documentation and make sure you know what objects you're 
generating and passing around.

From schlesi at ebi.ac.uk  Tue Jun 26 10:59:13 2007
From: schlesi at ebi.ac.uk (Felix Schlesinger)
Date: Tue, 26 Jun 2007 15:59:13 +0100
Subject: [Bioperl-l] PAML parser
Message-ID: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>

Hello,

I am trying to use the PAML result parser (BioPerl
Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15.
However on all outputs I have tested no result object is returned
(next_result is undef). This includes the HIV and Lysin datasets
included with PAML.
My code is:

my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir =>
"/.");
my $result = $codemlp->next_result;
foreach my $model ( $result->get_NSSite_results ) {
...

and the error is: Can't call method "get_NSSite_results" on an
undefined value ...

I can include the mlc file if needed. Is this supposed to work? Or do
I have to run paml from bioperl to parse the results?

Thanks
  Felix

From Xianjun.Dong at bccs.uib.no  Tue Jun 26 10:35:17 2007
From: Xianjun.Dong at bccs.uib.no (Xianjun Dong)
Date: Tue, 26 Jun 2007 16:35:17 +0200
Subject: [Bioperl-l] bug for PAML::Baseml
Message-ID: <46812425.8000509@ii.uib.no>

An HTML attachment was scrubbed...
URL: http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070626/cb3d8193/attachment-0001.html 

From Xianjun.Dong at bccs.uib.no  Tue Jun 26 11:40:47 2007
From: Xianjun.Dong at bccs.uib.no (Xianjun Dong)
Date: Tue, 26 Jun 2007 17:40:47 +0200
Subject: [Bioperl-l] bug for PAML::Baseml
In-Reply-To: <46812425.8000509@ii.uib.no>
References: <46812425.8000509@ii.uib.no>
Message-ID: <4681337F.1000902@ii.uib.no>

An HTML attachment was scrubbed...
URL: http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070626/604ce866/attachment.html 

From hartzell at alerce.com  Tue Jun 26 14:12:04 2007
From: hartzell at alerce.com (George Hartzell)
Date: Tue, 26 Jun 2007 14:12:04 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
Message-ID: <18049.22260.967524.353173@almost.alerce.com>


There don't seem to be any .cvsignore files in the repository, or in
CVSROOT/cvsignore.

Am I missing something, or don't we use them?

g.


From cjfields at uiuc.edu  Tue Jun 26 15:54:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 14:54:25 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18049.22260.967524.353173@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
Message-ID: <74515C87-5553-4AF0-9B83-26F3E71E15C8@uiuc.edu>

Not sure.  You may want to email support at open-bio.org; my guess is  
Chris D or Jason would have an answer.

chris

On Jun 26, 2007, at 1:12 PM, George Hartzell wrote:

>
> There don't seem to be any .cvsignore files in the repository, or in
> CVSROOT/cvsignore.
>
> Am I missing something, or don't we use them?
>
> g.
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Tue Jun 26 15:55:21 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 26 Jun 2007 16:55:21 -0300
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18049.22260.967524.353173@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
Message-ID: <E6FC4C83-7C71-4D3D-902A-3DE79E02A57C@gmx.net>

Maybe we've been using the default?

On Jun 26, 2007, at 3:12 PM, George Hartzell wrote:

>
> There don't seem to be any .cvsignore files in the repository, or in
> CVSROOT/cvsignore.
>
> Am I missing something, or don't we use them?
>
> g.
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Tue Jun 26 16:21:30 2007
From: hartzell at alerce.com (George Hartzell)
Date: Tue, 26 Jun 2007 16:21:30 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
Message-ID: <18049.30026.61328.134490@almost.alerce.com>

Chris Fields writes:
 > [...]
 > It looks like George Hartzell may be taking a crack at it, with  
 > Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
 > could have something testable relatively soon.  After that we'll need  
 > to work out a few other issues, basically what's on Hilmar's list.

There's a repository at file:///home/hartzell/bioperl with all of the
component projects in place.

If you have a dev.open-bio.org account and you're in the bioperl
group, you're good to get at it via:

  file:///home/hartzell/bioperl

or 

  svn+ssh://dev.open-bio.org/home/hartzell/bioperl

There are a couple of things to think about:

  - how are we going to provide access?  I *think* that I heard a
    decision to use http:// and https://.  Who gets to set that up?

  - what do we want to do about keywords?  The cvs2svn tool guesses
    and automatically sets the svn:keywords property to Author Date
    Revision and Id on many of the files in the tree.  If it looks
    like it got it right, we can stick with it.  Or, we can disable
    that conversion; I've cribbed a little script that'll grep out
    files using Id and set the svn:keywords property accordingly.

  - what do we want to do about svn:ignore?  I haven't seen any
    .cvsignore files.
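
For the keywords option, the grep-and-set approach could look roughly like the sketch below. It is not the cribbed script mentioned above, just an illustration: the function name files_using_id_keyword is made up, and the script only prints the svn propset commands for review instead of running them.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: returns the files under $root that contain a
# CVS-style Id keyword, either unexpanded or expanded.
sub files_using_id_keyword {
    my ($root) = @_;
    my @hits;
    find({
        no_chdir => 1,
        wanted   => sub {
            my $path = $File::Find::name;
            return unless -f $path;
            return if $path =~ m{/(?:CVS|\.svn)/};   # skip VCS metadata
            open my $fh, '<', $path or return;
            while (my $line = <$fh>) {
                if ($line =~ /\$Id(?::[^\$]*)?\$/) {
                    push @hits, $path;
                    last;
                }
            }
            close $fh;
        },
    }, $root);
    my @sorted = sort @hits;
    return @sorted;
}

# Usage: perl find_id_files.pl /path/to/working/copy
if (@ARGV) {
    print "svn propset svn:keywords Id $_\n"
        for files_using_id_keyword($ARGV[0]);
}
```

Printing the commands first makes it easy to compare the list against what cvs2svn guessed before touching any properties.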

Beyond that, how does the repo look?

How are we going to cut over?

Are we going to try to push svn commits to the read-mostly CVS repo,
or just keep it around for history's sake (I lean towards the latter)?

g.

From jason at bioperl.org  Tue Jun 26 19:22:20 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:22:20 -0300
Subject: [Bioperl-l] PAML parser
In-Reply-To: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>
References: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>
Message-ID: <D536496C-D716-42DF-B614-DD43C1B13A67@bioperl.org>

Can you make sure you have the latest and greatest version of these  
modules from the CVS repository?  We had to fix things to parse 3.15  
-- I can't tell if this is the problem or something else.
You can also add -verbose => 1 when you initialize the object and it  
may spit out more warnings about whether it is having problems.
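
A minimal sketch of what that could look like, with a guard so that a failed parse gives a readable message instead of "Can't call method ... on an undefined value". The wrapper name count_nssite_models is made up for this example, and it assumes the mlc file path is passed on the command line; it requires BioPerl to be installed.

```perl
use strict;
use warnings;
use Bio::Tools::Phylo::PAML;

# Hypothetical wrapper: parses one PAML result with verbose warnings on
# and returns the number of NSsites model results found.
sub count_nssite_models {
    my ($mlc_file) = @_;
    my $parser = Bio::Tools::Phylo::PAML->new(
        -file    => $mlc_file,
        -verbose => 1,
    );
    my $result = $parser->next_result;
    die "No result could be parsed from $mlc_file\n"
        unless defined $result;
    my @models = $result->get_NSSite_results;
    return scalar @models;
}

# Usage: perl check_mlc.pl mlc
printf "%d NSsites model(s) parsed\n", count_nssite_models($ARGV[0])
    if @ARGV;
```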


-jason

On Jun 26, 2007, at 11:59 AM, Felix Schlesinger wrote:

> Hello,
>
> I am trying to use the PAML result parser (BioPerl
> Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15.
> However on all outputs I have tested no result object is returned
> (next_result is undef). This includes the HIV and Lysin datasets
> included with PAML.
> My code is:
>
> my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir =>
> "/.");
> my $result = $codemlp->next_result;
> foreach my $model ( $result->get_NSSite_results ) {
> ...
>
> and the error is: Can't call method "get_NSSite_results" on an
> undefined value ...
>
> I can include the mlc file if needed. Is this supposed to work? Or do
> I have to run paml from bioperl to parse the results?
>
> Thanks
>   Felix

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Tue Jun 26 19:27:05 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:27:05 -0300
Subject: [Bioperl-l] Error in constructing Phylogenetic tree
	using	BioPerl
In-Reply-To: <46811F1C.3020307@sendu.me.uk>
References: <571051.26423.qm@web51107.mail.re2.yahoo.com>
	<46811F1C.3020307@sendu.me.uk>
Message-ID: <A99815DC-0FC2-4019-B0C4-CA8EA713FEB0@bioperl.org>


On Jun 26, 2007, at 11:13 AM, Sendu Bala wrote:

> SujiBala wrote:
>> Hi,
>>   This is Sujatha from Singapore. I am trying to construct a
>> phylogenetic tree using DNAStatistics and the Kimura method, but I am
>> getting the following error message. It would be nice if you could
>> help me resolve this problem as soon as possible.
>>
>>   Error message:
>>     Must supply  a valid Bio::Align::AlignI for the _align  
>> parameter  in the distance
>>   My program:
>>   use Bio::AlignIO;
>> use Bio::Align::DNAStatistics;
>> use Bio::Tree::DistanceFactory;
>> # for a dna alignment  can also use ProteinStatistics
>> @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
>> $stats = Bio::Align::DNAStatistics->new;
>> $mat = $stats->distance( -align  => @aln,-method => 'Kimura');
>

yep you want to call next_aln on the Bio::AlignIO object.
I fixed the example code in the HOWTO so it should work properly now;
http://bioperl.org/wiki/HOWTO:Trees#Constructing_Trees
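
For reference, a corrected version of the quoted script might look like the sketch below: the Bio::AlignIO object is kept in a scalar, and the alignment it yields from next_aln is what gets passed to distance(). The wrapper name build_nj_tree is made up for this example and the file name is taken from the command line; the HOWTO linked above remains the reference. It requires BioPerl to be installed.

```perl
use strict;
use warnings;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;
use Bio::Tree::DistanceFactory;

# Hypothetical wrapper around the steps from the original script.
sub build_nj_tree {
    my ($file) = @_;
    # new() returns a stream object, not the alignment itself
    my $alnio = Bio::AlignIO->new(-file => $file, -format => 'clustalw');
    my $aln = $alnio->next_aln
        or die "No alignment could be read from $file\n";
    my $stats = Bio::Align::DNAStatistics->new;
    # -align takes a single alignment object, not an array
    my $mat = $stats->distance(-align => $aln, -method => 'Kimura');
    my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ');
    return $dfactory->make_tree($mat);
}

# Usage: perl make_tree.pl out4.fa
my $tree = build_nj_tree($ARGV[0]) if @ARGV;
```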

> Without looking at the docs for these modules, it is immediately  
> obvious
> that Bio::AlignIO->new() is going to return an instance of  
> Bio::AlignIO
> and not an array of alignments. It is also obvious that the -align =>
> parameter for the distance() method can't take an array of anything  
> (but
> probably an array ref?).
>
> Check the documentation and make sure you know what objects you're
> generating and passing around.

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Tue Jun 26 19:29:11 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:29:11 -0300
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <E6FC4C83-7C71-4D3D-902A-3DE79E02A57C@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
	<E6FC4C83-7C71-4D3D-902A-3DE79E02A57C@gmx.net>
Message-ID: <5A8FD8A3-9593-4925-AA74-D4B03CDC1C34@bioperl.org>

We don't have one. I have one on my local machine that basically
defines *~ and .#*, so I never had a problem.

Feel free to propose one if you think it is important; I never really
thought it was.

On Jun 26, 2007, at 4:55 PM, Hilmar Lapp wrote:

> Maybe we've been using the default?
>
> On Jun 26, 2007, at 3:12 PM, George Hartzell wrote:
>
>>
>> There don't seem to be any .cvsignore files in the repository, or in
>> CVSROOT/cvsignore.
>>
>> Am I missing something, or don't we use them?
>>
>> g.
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From j_martin at lbl.gov  Tue Jun 26 21:01:29 2007
From: j_martin at lbl.gov (Joel Martin)
Date: Tue, 26 Jun 2007 18:01:29 -0700
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To: <11302459.post@talk.nabble.com>
References: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
	<BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>
	<11302459.post@talk.nabble.com>
Message-ID: <20070627010129.GA8628@eniac.jgi-psf.org>

Hello, 
  The tutorial code snippet is an endless loop; I think it's supposed
to remove the rid.  As the only print statement you added is after the
endless loop, you aren't seeing anything happen.

Use the code from this instead,

perldoc Bio::Tools::Run::RemoteBlast

  The bptutorial.pl does have a note that it's not useful and to read the pod
for Bio::Tools::Run::RemoteBlast, it's in the next sentences after the code
snippet you used.  

  Though, as it's a tutorial example it might be nice to remove the while
loop .. or at least add the sleep(5) part.
http://www.bioperl.org/wiki/Bptutorial.pl#Running_BLAST_.28using_RemoteBlast.pm.29

  Aside from that, you may have network issues, but the failed ping
doesn't show it: www.ncbi.nlm.nih.gov doesn't respond to ping at all,
as far as I can tell.

Joel


On Tue, Jun 26, 2007 at 02:26:03AM -0700, don esteban wrote:
> 
> Try using the Proxyconfiguration in your script:
> 
> $ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080";
> 
> 
> 
> 
> L Xu wrote:
> > 
> > I do have an internet connection but do not use a proxy server.
> > I tested the network connection with the ping command (below). The NCBI
> > website
> > does not respond. Is there any special network setting needed for
> > connecting to the NCBI website?
> > Thank you so much.
> > 
> > C:\>ping www.yahoo.com
> > 
> > Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:
> > 
> > Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
> > Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
> > Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
> > Reply from 69.147.114.210: bytes=32 time=360ms TTL=45
> > 
> > Ping statistics for 69.147.114.210:
> >     Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
> > Approximate round trip times in milli-seconds:
> >     Minimum = 312ms, Maximum = 363ms, Average = 338ms
> > 
> > C:\>ping www.ncbi.nlm.nih.gov
> > 
> > Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:
> > 
> > Request timed out.
> > Request timed out.
> > Request timed out.
> > Request timed out.
> > 
> > Ping statistics for 130.14.29.110:
> >     Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
> > 
> > 
> > 
> > = = = Original message = = =
> > 
> > Judging by the output it looks like you have no network access or can't
> > connect to the server (what remoteblast needs). Make sure you don't need
> > proxy settings.
> > 
> > To preempt the next question, no, I'm not going to explain what a proxy
> > is. The RemoteBlast docs show how to set them, and Google is a wonderful
> > tool...
> > 
> > chris
> > 
> > On Jun 13, 2007, at 7:16 AM, L Xu wrote:
> > 
> > 
> >    ...
> > -------------------- WARNING ---------------------
> > MSG: <HTML>
> > <HEAD><TITLE>An Error Occurred</TITLE></HEAD>
> > <BODY>
> > <H1>An Error Occurred</H1>
> > 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
> > </BODY>
> > </HTML>
> > 
> > ---------------------------------------------------
> > ...
> > 

From melvinp at pacific.net.sg  Wed Jun 27 01:25:08 2007
From: melvinp at pacific.net.sg (Melvin P)
Date: Wed, 27 Jun 2007 13:25:08 +0800
Subject: [Bioperl-l] finding statistics on AA
Message-ID: <4681F4B4.8010609@pacific.net.sg>

Hi, I am new to BioPerl. I am trying to find out whether there is any class
that I can use to compute occupancy numbers/occurrence counts, pseudocounts,
observed frequencies, etc., given a few amino acid sequences. For example,
what is the observed frequency of residue i at position p? My objective
is to analyze the information content. Thanks.
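
To make the quantities in the question concrete, here is a minimal
plain-Perl sketch (not a BioPerl class) that computes, for each column,
the occurrence counts, pseudocount-smoothed frequencies, and information
content in bits relative to a uniform 20-letter background. The function
name column_profile, the toy sequences, and the uniform background are
all assumptions made for illustration; no small-sample correction is
applied.

```perl
use strict;
use warnings;

# Per-column counts, smoothed frequencies and information content (bits)
# for equal-length, already aligned protein strings.
sub column_profile {
    my ($seqs, $pseudo) = @_;
    my @alphabet = split //, 'ACDEFGHIKLMNPQRSTVWY';
    my $max_bits = log(scalar @alphabet) / log(2);
    my @profile;
    for my $p (0 .. length($seqs->[0]) - 1) {
        my %count = map { $_ => 0 } @alphabet;
        for my $seq (@$seqs) {
            my $res = uc substr($seq, $p, 1);
            $count{$res}++ if exists $count{$res};   # gaps/unknowns are ignored
        }
        my $observed = 0;
        $observed += $_ for values %count;
        # Denominator includes one pseudocount per alphabet letter;
        # fall back to 1 so an all-gap column does not divide by zero.
        my $total = ($observed + $pseudo * scalar(@alphabet)) || 1;
        my %freq;
        my $entropy = 0;
        for my $aa (@alphabet) {
            my $f = ($count{$aa} + $pseudo) / $total;
            $freq{$aa} = $f;
            $entropy -= $f * log($f) / log(2) if $f > 0;
        }
        push @profile,
            { count => \%count, freq => \%freq, info => $max_bits - $entropy };
    }
    return \@profile;
}

# Usage: observed frequency of K at position 2 (index 1), add-one pseudocount.
my $profile = column_profile([qw(MKVLA MKILA MRVLS MKVLA)], 1);
printf "f(K at pos 2) = %.3f, info = %.3f bits\n",
    $profile->[1]{freq}{K}, $profile->[1]{info};
```

If the sequences are in a Bio::SimpleAlign object, the strings can be
collected with something like map { $_->seq } $aln->each_seq.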


From bix at sendu.me.uk  Wed Jun 27 06:23:58 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 11:23:58 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <467FBDD3.8050009@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
Message-ID: <46823ABE.2080300@sendu.me.uk>

Sendu Bala wrote:
> Sendu Bala wrote:
>> In considering updating all the test scripts to [... use] 
>> t/lib/BioperlTest.pm
> 
> I'm now in the process of converting all test scripts.

And I've now completed that job (for bioperl-live at least), except for 
t/EUtilities.t since I know Chris is working on it.


In addition to converting to Test::More where necessary, I've also made 
all pseudo-TODO blocks real ones. Previously I had advised using SKIP 
blocks instead since TODO blocks need a Test::Harness upgrade. However I 
think in the next release we ought to make such upgrading compulsory 
(which should be automatic when combined with compulsory usage of 
Module::Build and Test::More in turn: users simply have to update CPAN).
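
For anyone unfamiliar with the distinction, a minimal Test::More sketch
of a real TODO block next to a SKIP block. The test descriptions and the
RUN_NETWORK_TESTS environment variable are hypothetical, chosen only for
this example; they are not taken from the BioPerl test suite.

```perl
use strict;
use warnings;
use Test::More tests => 3;

ok(1, 'implemented feature works');

# A TODO block still runs its tests; a failure is reported as
# "not ok ... # TODO" and does not make the script fail.
TODO: {
    local $TODO = 'hypothetical feature not implemented yet';
    ok(0, 'unimplemented feature');
}

# A SKIP block does not run its tests at all when the condition holds;
# the count passed to skip() must match the number of tests skipped.
SKIP: {
    skip 'network tests not enabled', 1 unless $ENV{RUN_NETWORK_TESTS};
    ok(1, 'network-dependent check');
}
```

The practical difference is that a TODO test starts being reported as
unexpectedly passing once the feature is implemented, whereas a skipped
test says nothing until someone removes the skip.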


The conversion to BioperlTest directly led to the discovery and fixing 
of 6 minor bugs, so was certainly not without merit.


No user or developer needs to have BIOPERLDEBUG permanently set to true 
anymore. To run all tests you just have to answer yes to the BioDBGFF 
and networking questions of 'perl Build.PL'. With './Build test' you 
then get clean, easy-to-read output where it is obvious to see that we 
currently have these issues:

t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in 
another thread.

t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, 
t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and 
t/Annotation.t all have TODO tests. If you know about those modules, now 
would be a great time to implement those TODOs!

Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are 
deprecated' warnings.


To debug a particular test you could say:
BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t


I've updated the HOWTO for writing test scripts:
http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests

From cjfields at uiuc.edu  Wed Jun 27 07:55:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 06:55:47 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <46823ABE.2080300@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk>
Message-ID: <DC0F57B9-D733-4C89-9B7A-65E1ADFCFDD2@uiuc.edu>


On Jun 27, 2007, at 5:23 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Sendu Bala wrote:
>>> In considering updating all the test scripts to [... use]
>>> t/lib/BioperlTest.pm
>>
>> I'm now in the process of converting all test scripts.
>
> And I've now completed that job (for bioperl-live at least), except  
> for
> t/EUtilities.t since I know Chris is working on it.

The network tests will be much shorter; the bulk will be transferred  
to a new suite for the backend Bio::Tools::EUtilities parser (which  
will test static files in t/data/eutils, so no dynamic changes).

> In addition to converting to Test::More where necessary, I've also  
> made
> all pseudo-TODO blocks real ones. Previously I had advised to use SKIP
> blocks instead since TODO blocks need a Test::Harness upgrade.  
> However I
> think in the next release we ought to make such upgrading compulsory
> (which should be automatic when combined with compulsory usage of
> Module::Build and Test::More in turn: users simply have to update  
> CPAN).

Sounds good to me, but there may be some grumblings out there.

Having specific TODOs is nice b/c we can test them w/o fails.  Handy.

> The conversion to BioperlTest directly led to the discovery and fixing
> of 6 minor bugs, so was certainly not without merit.
>
>
> No user or developer needs to have BIOPERLDEBUG permanently set to  
> true
> anymore. To run all tests you just have to answer yes to the BioDBGFF
> and networking questions of 'perl Build.PL'. With './Build test' you
> then get clean, easy-to-read output where it is obvious to see that we
> currently have these issues:
>
> t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in
> another thread.
>
> t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t,
> t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and
> t/Annotation.t all have TODO tests. If you know about those  
> modules, now
> would be a great time to implement those TODOs!

The RNA_SearchIO.t is from ERPIN output; there's no easy way to  
generate it beyond having the user supply the info (or having the  
program author change the output).

Will have to look at the others to see what's involved; maybe  
something for the priority list?

> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are
> deprecated' warnings.

I ran into this with XML::Simple data structures recently; there was  
an easy way around it via XML::Simple's ForceArray option.  It has to  
do with attempting to assign data to/from a hash in a specific way  
involving array references (though I can't remember exactly how; I  
slept since then).
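
One plausible reading of the XML::Simple case, as a hedged sketch: by
default a single child element comes back as a hash ref while repeated
children come back as an array ref, so code written for one shape ends
up indexing the other (on Perls with pseudo-hash support, using an array
ref as a hash is what triggers the warning). ForceArray makes the shape
consistent. The <set>/<gene> XML and the gene_ids helper are invented
for this example and are not the actual Bio::SeqIO::entrezgene fix.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Return the id attribute of every <gene> child (hypothetical element name).
sub gene_ids {
    my ($xml) = @_;
    # ForceArray => ['gene'] : <gene> is always an array ref, even if single.
    # KeyAttr => []          : disable folding on name/key/id attributes.
    my $data = XMLin($xml, ForceArray => ['gene'], KeyAttr => []);
    return map { $_->{id} } @{ $data->{gene} || [] };
}

my @one = gene_ids('<set><gene id="1"/></set>');
my @two = gene_ids('<set><gene id="1"/><gene id="2"/></set>');
print "one: @one\ntwo: @two\n";
```

With this setting the calling code can always treat $data->{gene} as a
list of hashes, whether the file contains one record or many.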

> To debug a particular test you could say:
> BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t
>
>
> I've updated the HOWTO for writing test scripts:
> http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests

Good work!

chris

From schlesi at ebi.ac.uk  Wed Jun 27 07:57:27 2007
From: schlesi at ebi.ac.uk (Felix Schlesinger)
Date: Wed, 27 Jun 2007 12:57:27 +0100
Subject: [Bioperl-l] Selecting columns from alignment
Message-ID: <7317d50c0706270457i1c3d92a8hb124fa663f51b837@mail.gmail.com>

Hi,

is there an elegant way to select columns from an alignment object
fulfilling a certain property (for example less than x gaps)?
Everything I can see from Align::AlignI seems to involve looking at
the individual sequences, creating lots of slices and appending them.
If there a better way in bioperl or failing that, does anyone know a
software package with similar functionality (t-coffee has lots of
filters for alignments, but nothing to select columns besides by
position it seems). Ideally this would also return a mapping from old
to new positions in one of the sequences of course.

Thanks
  Felix
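
A minimal plain-Perl sketch of the kind of column filter described
above, working on the aligned strings themselves (for a Bio::SimpleAlign
these could be collected with map { $_->seq } $aln->each_seq). The
function names select_columns and residue_at_column, and the choice of
'-' and '.' as gap characters, are assumptions for this example and not
existing BioPerl methods.

```perl
use strict;
use warnings;

# Keep only columns with at most $max_gaps gap characters.
# Returns (\@filtered_rows, \@kept), where $kept->[new_col] = old_col (0-based).
sub select_columns {
    my ($rows, $max_gaps) = @_;
    my @kept;
    for my $col (0 .. length($rows->[0]) - 1) {
        my $gaps = grep { substr($_, $col, 1) =~ /[-.]/ } @$rows;
        push @kept, $col if $gaps <= $max_gaps;
    }
    my @filtered = map {
        my $row = $_;
        join '', map { substr($row, $_, 1) } @kept;
    } @$rows;
    return (\@filtered, \@kept);
}

# 1-based residue number of $row at alignment column $col (0-based),
# or undef if that column is a gap in this row.
sub residue_at_column {
    my ($row, $col) = @_;
    return if substr($row, $col, 1) =~ /[-.]/;
    my $n = () = substr($row, 0, $col + 1) =~ /[^-.]/g;
    return $n;
}

my @rows = ('AC-GT', 'A--GT', 'ACTG-');
my ($filtered, $kept) = select_columns(\@rows, 1);
print join("\n", @$filtered), "\n";
print "kept old columns: @$kept\n";
```

The @kept array is the new-to-old column map; combining it with
residue_at_column on the original row gives the position of each kept
column in any one sequence.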

From cjfields at uiuc.edu  Wed Jun 27 10:36:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 09:36:41 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>


On Jun 26, 2007, at 3:21 PM, George Hartzell wrote:

> ...
> If you have a dev.open-bio.org account and you're in the bioperl
> group, you're good to get at it via:
>
>   file:///home/hartzell/bioperl
>
> or
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl

I managed to get it working using file://.  Haven't tried svn+ssh yet  
but I've had persistent problems getting ssh to work properly on my  
macbook; not sure why yet but I haven't had time to play around with it.

> There are a couple of things to think about:
>
>   - how are we going to provide access.  I *think* that I heard a
>     decision to use http:// and https://.  Who gets to set that up?

That hasn't been decided yet and will be up to a consensus of the  
core devs, but I think the odds are in favor of allowing https:// but  
against allowing http://.

As for setup that could be anyone with admin privs, though it may be  
best left up to Chris D, Jason, or Mauricio.

>   - what do we want to do about keywords.  The cvs2svn tool guesses
>     and automatically sets the svn:keywords property to Author Date
>     Revision and Id on many of the files in the tree.  If it looks
>     like it got it right, we can stick with it.  Or, we can disable
>     that conversion and I've cribbed a little script that'll grep out
>     files using Id and set the svn:keywords property accordingly.

Probably again a consensus issue, but you can choose one route.  My  
inclination is the former if it's easier.

>   - what do we want to do about svn:ignore?  I haven't seen any
>     .cvsignore files.

Not sure.  I've never used one personally, but (as Jason suggests) if  
you have ideas for one you can propose them, or we can suggest devs  
set up svn:ignore locally.

> Beyond that, how does the repo look?

Seems fine, though a simple 'svn co file:///home/hartzell/bioperl'  
checkout gets everything (all distros, branches, etc).  We need to  
make sure everyone uses
'svn co file:///home/hartzell/bioperl/bioperl-live/trunk /live'
or similar if they just want the latest core/db/etc.

We'll also need to start a svn wiki page to show how to get relevant  
distros (similar in style probably to the cvs page, with dev  
information, how to set up ssh keys, https stuff, etc).

> How are we going to cut over?
>
> Are we going to try to push svn commits to the read-mostly CVS repo,
> or just keep it around for history's sake (I lean towards the latter).

I think a clean cut-over.  Everyone would be warned to hold commits  
for a day (lest they be lost), then probably do something in this order:

- switch cvs to read-only except for svn commits
- run a clean cvs2svn
- set up svn as read/write
- set up test commits to cvs via svn
- disable cvs commit messages to bioperl-guts, enable svn commit  
messages in its place.
- push svn commits over to read-only cvs

cvs >>must<< be read-only after that point (no cvs->svn commits),  
with write access only available through svn.  If at some future  
point there is no reason to keep it around or that it is more trouble  
than it's worth, we can make a decision then on cvs's fate.

> g.

chris

From rvos at interchange.ubc.ca  Wed Jun 27 10:23:25 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Wed, 27 Jun 2007 07:23:25 -0700 (PDT)
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
Message-ID: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>

 
> Are we going to try to push svn commits to the read-mostly CVS repo,
> or just keep it around for history's sake (I lean towards the latter).

I'm a little confused - surely once the svn is up and running we'll want *no more* cvs commits? Parallel repositories that each accumulate stuff will be a nightmare. I'm probably just not getting your point.

Rutger


From cjfields at uiuc.edu  Wed Jun 27 11:18:03 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 10:18:03 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>


On Jun 27, 2007, at 9:23 AM, rvos wrote:

>
>> Are we going to try to push svn commits to the read-mostly CVS repo,
>> or just keep it around for history's sake (I lean towards the  
>> latter).
>
> I'm a little confused - surely once the svn is up and running we'll  
> want *no more* cvs commits? Parallel repositories that each  
> accumulate stuff will be a nightmare. I'm probably just not getting  
> your point.
>
> Rutger

Most projects make a clean break with cvs (no more commits) for the  
reasons you point out.  Not sure how the other core devs feel about  
that but I could go for that; it would def. prevent headaches.  We  
could keep cvs for the time being as read-only, with no svn->cvs  
syncing.

There are a few projects which have (as a phase-out plan) old read-only  
cvs repositories available, with an automatic svn->cvs commit  
following every new svn commit.  Not sure how that works, esp. for  
branching/merging and so on which I could see potentially getting hairy.

chris

From cjfields at uiuc.edu  Wed Jun 27 12:05:49 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 11:05:49 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <5EA56270-3427-4995-B3C1-2789229AACF1@uiuc.edu>


On Jun 26, 2007, at 3:21 PM, George Hartzell wrote:

> ...If you have a dev.open-bio.org account and you're in the bioperl
> group, you're good to get at it via:
>
>   file:///home/hartzell/bioperl
>
> or
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl

Did manage to get svn+ssh working (with some password harassment);  
core tests passed enough that I think everything's okay.  If ssh keys  
are set up correctly (mine aren't) it should work fine.

chris

From dmessina at wustl.edu  Wed Jun 27 12:27:32 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 27 Jun 2007 11:27:32 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
Message-ID: <BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>

> [Chris]
>
> I managed to get it working using file://.  Haven't tried svn+ssh yet
> but I've had persistent problems getting ssh to work properly on my
> macbook; not sure why yet but I haven't had time to play around  
> with it.

I just did a checkout and a test commit, both via svn+ssh -- works  
great for me.


>> [George]
>>
>>   - what do we want to do about keywords.  The cvs2svn tool guesses
>>     and automatically sets the svn:keywords property to Author Date
>>     Revision and Id on many of the files in the tree.  If it looks
>>     like it got it right, we can stick with it.  Or, we can disable
>>     that conversion and I've cribbed a little script that'll grep out
>>     files using Id and set the svn:keywords property accordingly.


I would think we would want "Author Date Id Rev URL" set on  
everything, no? So either cvs2svn or your tool (whichever you think  
is better), followed by

	svn propset svn:keywords "Author Date Id Rev URL" *

from the root of a working copy would take care of all of the  
existing files in the repository, I think.

George knows more about this than I do, but I think you can set up a  
global config file with

	[miscellany]
	enable-auto-props = yes
	[auto-props]
	* = svn:keywords="Author Date Id Rev URL"

to ensure it gets set on any future additions to the repository.


>>   - what do we want to do about svn:ignore?  I haven't seen any
>>     .cvsignore files.
>
> Not sure.  I've never used one personally, but (as Jason suggests) if
> you have ideas for one you can propose them, or we can suggest devs
> set up svn:ignore locally.

I use the default global-ignores

	global-ignores = *.o *.lo *.la #*# .*.rej *.rej .*~ *~ .#* .DS_Store

(again, in my system-wide config file), but I'm not tied to that. I  
do think we should have one, though; individuals can easily override  
any settings in the system-wide config with their own
~/.subversion/config.


>> Beyond that, how does the repo look?

Looks great, George! Thanks for doing this.


Dave

From hartzell at alerce.com  Wed Jun 27 13:00:53 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 13:00:53 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <18050.38853.526224.791878@almost.alerce.com>

rvos writes:
 >  
 > > Are we going to try to push svn commits to the read-mostly CVS repo,
 > > or just keep it around for history's sake (I lean towards the latter).
 > 
 > I'm a little confused - surely once the svn is up and running we'll
 > want *no more* cvs commits? Parallel repositories that each
 > accumulate stuff will be a nightmare. I'm probably just not getting
 > your point. 

There had been some talk of keeping a CVS repository around as a
read-only mirror of the svn repo, presumably for people whose habits
or setup won't let them use svn.

In theory, each commit to the svn repo can be automagically pushed
down into CVS w/out user intervention, google will tell you how but
I've never run anything that way.

g.

From dmessina at wustl.edu  Wed Jun 27 13:27:01 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 27 Jun 2007 12:27:01 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
Message-ID: <99969FC2-479E-408C-AADB-7664EBE937CF@wustl.edu>

> [Chris]
> We'll also need to start a svn wiki page to show how to get relevant
> distros (similar in style probably to the cvs page, with dev
> information, how to set up ssh keys, https stuff, etc).

I cloned the CVS page and have started adapting it for Subversion:

	http://www.bioperl.org/wiki/Using_Subversion

I'll do some more on it later today, but if anyone wants to fiddle  
with it in the interim, please do.


Dave


From n.haigh at sheffield.ac.uk  Wed Jun 27 14:44:16 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 19:44:16 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <46823ABE.2080300@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk>
Message-ID: <4682B000.2050707@sheffield.ac.uk>

Sendu Bala wrote:
> Sendu Bala wrote:
>> Sendu Bala wrote:
>>> In considering updating all the test scripts to [... use] 
>>> t/lib/BioperlTest.pm
>> I'm now in the process of converting all test scripts.
> 
> And I've now completed that job (for bioperl-live at least), except for 
> t/EUtilities.t since I know Chris is working on it.
> 
> 
> In addition to converting to Test::More where necessary, I've also made 
> all pseudo-TODO blocks real ones. Previously I had advised to use SKIP 
> blocks instead since TODO blocks need a Test::Harness upgrade. However I 
> think in the next release we ought to make such upgrading compulsory 
> (which should be automatic when combined with compulsory usage of 
> Module::Build and Test::More in turn: users simply have to update CPAN).
> 
> 
> The conversion to BioperlTest directly led to the discovery and fixing 
> of 6 minor bugs, so was certainly not without merit.
> 
> 
> No user or developer needs to have BIOPERLDEBUG permanently set to true 
> anymore. To run all tests you just have to answer yes to the BioDBGFF 
> and networking questions of 'perl Build.PL'. With './Build test' you 
> then get clean, easy-to-read output where it is obvious to see that we 
> currently have these issues:
> 
> t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in 
> another thread.
> 
> t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, 
> t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and 
> t/Annotation.t all have TODO tests. If you know about those modules, now 
> would be a great time to implement those TODOs!
> 
> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are 
> deprecated' warnings.

Ah, that reminds me!

I recently tried to do an install of the cvs head (a week or two ago) on
a clean installation of Debian 4.0 (etch). During the installation of
dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
Bioperl. I seem to remember this circular dependency cropping up before
- am I correct - and can you remind me how this was "fixed"?

Cheers
Nath

From bix at sendu.me.uk  Wed Jun 27 14:52:01 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 19:52:01 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B000.2050707@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
Message-ID: <4682B1D1.3080206@sendu.me.uk>

Nathan S. Haigh wrote:
> I recently tried to do an install of the cvs head (a week or two ago) on
> a clean installation of Debian 4.0 (etch). During the installation of
> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
> Bioperl. I seem to remember this circular dependency cropping up before
> - am I correct - and can you remind me how this was "fixed"?

Yes, it always happens. It was 'fixed' by being completely ignored by 
me. Installation is guaranteed to fail, but if you really want it, 
trying to install again after you already have Bioperl installed will 
result in success.

Clearly something nicer could be done. Suggestions on a postcard...

From cjfields at uiuc.edu  Wed Jun 27 15:01:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 14:01:01 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B000.2050707@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
Message-ID: <A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>


On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote:

> Sendu Bala wrote:
>> ...
>> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are
>> deprecated' warnings.
>
> Ah, that reminds me!
>
> I recently tried to do an install of the cvs head (a week or two  
> ago) on
> a clean installation of Debian 4.0 (etch). During the installation of
> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
> Bioperl. I seem to remember this circular dependency cropping up  
> before
> - am I correct - and can you remind me how this was "fixed"?
>
> Cheers
> Nath

Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part  
of Bioperl (and he could become a dev).  That would solve it.

chris

From n.haigh at sheffield.ac.uk  Wed Jun 27 15:16:40 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 20:16:40 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
Message-ID: <4682B798.1010409@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> 
> On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote:
> 
>> Sendu Bala wrote:
>>> ...
>>> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are
>>> deprecated' warnings.
>>
>> Ah, that reminds me!
>>
>> I recently tried to do an install of the cvs head (a week or two ago) on
>> a clean installation of Debian 4.0 (etch). During the installation, of
>> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
>> Bioperl. I seem to remember this circular dependency cropping up before
>> - am I correct - and can you remind me how this was "fixed"?
>>
>> Cheers
>> Nath
> 
> Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part of
> Bioperl (and he could be come a dev).  That would solve it.
> 
> chris

Just to put the feelers out to see what people think.

It seems (to me at least) that Bioperl modules could/should? be released
as individual modules and that "bioperl" would really constitute a
"bundle" of all these modules - in terms of CPAN anyway. Am I correct in
this thinking? The Bio::ASN1::EntrezGene could simply require a
particular module rather than the whole of bioperl - might get out of
the circular dependency theoretically!?

I'm not suggesting moving in this direction, but just wondered what
others thought about this concept?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgreYczuW2jkwy2gRAi5IAJ9/Alq1fktEmAF16DlKcBVcy7d+jQCeIj+X
tOFQUQ7cGJLUITEDw1+QLxc=
=Yc+g
-----END PGP SIGNATURE-----

From cjfields at uiuc.edu  Wed Jun 27 15:31:44 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 14:31:44 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B798.1010409@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
	<4682B798.1010409@sheffield.ac.uk>
Message-ID: <33C76559-4771-4FDC-9EEA-1645BC3C576C@uiuc.edu>


On Jun 27, 2007, at 2:16 PM, Nathan S. Haigh wrote:

> ...
>
> Just to put the feelers out to see what people think.
>
> It seems (to me at least) that Bioperl modules could/should? be released
> as individual modules and that "bioperl" would really constitute a
> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
> this thinking? The Bio::ASN1::EntrezGene could simply require a
> particular module rather than the whole of bioperl - might get out of
> the circular dependency theoretically!?
>
> I'm not suggesting moving in this direction, but just wondered what
> others thought about this concept?
>
> Nath

Well, Steve suggested splitting some of core into distinct groups,  
which I tend to agree with in some respects (speed up releases for  
those modules, such as SearchIO, DB, Graphics).  The problem we have  
yet to solve is what we consider 'core'.  Is it Bio::Seq and  
related?  Should it include Bio::DB*?  Should it just be Bio::*  
modules with no or very few external dependencies?  And so on...
Probably not a decision we want to make immediately (until after svn
migration, tests finished, maybe a release or two, a beer)...

The Bioperl module dependency that Bio::ASN1::EntrezGene has is  
Bio::Index::AbstractSeq.  You could try a test build of  
Bio::ASN1::EntrezGene to see what happens.

chris


From hlapp at gmx.net  Wed Jun 27 15:49:15 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 16:49:15 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
Message-ID: <E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>


On Jun 27, 2007, at 1:27 PM, David Messina wrote:

> I would think we would want "Author Date Id Rev URL" set on
> everything, no?. So either cvs2svn or your tool (whichever you think
> is better), followed by
>
> 	svn propset svn:keywords "Author Date Id Rev URL" *

Shouldn't this be done recursively?
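
On the recursion question: `svn propset` accepts `-R` (`--recursive`), so the property can be applied to a whole working copy rather than only the top-level entries matched by `*`. A minimal Python sketch follows; the helper names and the default path are illustrative assumptions, not part of any existing migration tool.

```python
import subprocess

KEYWORDS = "Author Date Id Rev URL"  # keyword list proposed in the thread


def propset_keywords_command(path=".", keywords=KEYWORDS, recursive=True):
    """Build the argv for setting svn:keywords on `path` (hypothetical helper)."""
    cmd = ["svn", "propset"]
    if recursive:
        cmd.append("-R")  # descend into subdirectories of `path`
    cmd += ["svn:keywords", keywords, path]
    return cmd


def apply_keywords(path="."):
    """Run the command inside a working copy; raises if svn exits non-zero."""
    subprocess.run(propset_keywords_command(path), check=True)
```

Whether binary files (bioperl-run may have some) should be excluded from keyword expansion is a separate question that this sketch does not address.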

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Jun 27 15:50:27 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 16:50:27 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
Message-ID: <E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>


On Jun 27, 2007, at 12:18 PM, Chris Fields wrote:

> Most projects make a clean break with cvs (no more commits) for the
> reasons you point out.  Not sure how the other core devs feel about
> that but I could go for that; it would def. prevent headaches.

There shouldn't be any cvs write support after the cut-over I think.  
I don't see the benefit that would justify the huge headache potential.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 27 16:01:40 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:01:40 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
Message-ID: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>


On Jun 27, 2007, at 2:50 PM, Hilmar Lapp wrote:

>
> On Jun 27, 2007, at 12:18 PM, Chris Fields wrote:
>
>> Most projects make a clean break with cvs (no more commits) for the
>> reasons you point out.  Not sure how the other core devs feel about
>> that but I could go for that; it would def. prevent headaches.
>
> There shouldn't be any cvs write support after the cut-over I  
> think. I don't see the benefit that would justify the huge headache  
> potential.
>
> 	-hilmar

Agreed, so maybe we should set that in stone.  That means no svn->cvs  
syncing post-migration as well, I assume.

Now how about a quick straw poll, what kind of access?  svn+ssh is  
already available, but some (Aaron among them) have indicated they  
would like https as well (not sure how involved it would be to set up).

chris

From hlapp at gmx.net  Wed Jun 27 16:08:40 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 17:08:40 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
Message-ID: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net>


On Jun 27, 2007, at 5:01 PM, Chris Fields wrote:

> That means no svn->cvs syncing post-migration as well, I assume.

That's a bit of a different story. People out there have URL links  
into our anonymous CVS repository. If it's not too troublesome (and I
tend to think it's not) I'd like to maintain those in working
order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi  
script that maps between the URL flavors (i.e., that maps a CVS-style  
URL to the equivalent SVN link).
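
A rough sketch of the URL-mapping idea, written in Python for illustration (the real script could just as well be a few lines of Perl). The CVS path prefix, the SVN base URL, the `trunk` layout and the choice of a 301 status are all assumptions made for the example; the actual web front ends may be laid out differently.

```python
from urllib.parse import urlsplit

# Hypothetical URL layouts, for illustration only.
CVS_PREFIX = "/cgi-bin/viewcvs/viewcvs.cgi/"
SVN_BASE = "http://example.org/svn/"


def cvs_to_svn(cvs_url):
    """Map a CVS-style web URL to an assumed SVN URL, or None if it does not match."""
    path = urlsplit(cvs_url).path  # the query string (e.g. ?cvsroot=...) is dropped
    if not path.startswith(CVS_PREFIX):
        return None
    module, _, rest = path[len(CVS_PREFIX):].partition("/")
    if not module:
        return None
    return f"{SVN_BASE}{module}/trunk/{rest}"


def redirect_response(cvs_url):
    """Return CGI response headers: a permanent redirect, or 404 if unmapped."""
    target = cvs_to_svn(cvs_url)
    if target is None:
        return "Status: 404 Not Found\r\n\r\n"
    return f"Status: 301 Moved Permanently\r\nLocation: {target}\r\n\r\n"
```

Links pinned to a specific CVS revision or branch would need extra handling that this sketch leaves out.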

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Wed Jun 27 16:15:10 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 16:15:10 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
Message-ID: <18050.50510.84363.355034@almost.alerce.com>

David Messina writes:
 > > [Chris]
 > >
 > > I managed to get it working using file://.  Haven't tried svn+ssh yet
 > > but I've had persistent problems getting ssh to work properly on my
 > > macbook; not sure why yet but I haven't had time to play around  
 > > with it.
 > 
 > I just did a checkout and a test commit, both via svn+ssh -- works  
 > great for me.

Is there anyone working outside of bioperl-{run,live,ext}?

g.


From bix at sendu.me.uk  Wed Jun 27 16:22:13 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 21:22:13 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B798.1010409@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
	<4682B798.1010409@sheffield.ac.uk>
Message-ID: <4682C6F5.4020406@sendu.me.uk>

Nathan S. Haigh wrote:
> It seems (to me at least) that Bioperl modules could/should? be released
> as individual modules and that "bioperl" would really constitute a
> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
> this thinking? The Bio::ASN1::EntrezGene could simply require a
> particular module rather than the whole of bioperl - might get out of
> the circular dependency theoretically!?

No, it wouldn't. The 'problem' only arises because the user is 
/choosing/ to install both Bioperl and Bio::ASN1::EntrezGene at the same 
time. So even if Bioperl was released as separate modules there would 
still be that 'bundle' and users would still choose to do the same 
thing: install all the Bioperl modules as well as all its /optional/ 
recommended modules. And there lies the problem: Bio::ASN1::EntrezGene 
requires  Bioperl modules, and one Bioperl module requires 
Bio::ASN1::EntrezGene, so the circularity isn't solved.


(FYI:
Bio::ASN1::EntrezGene requires Bio::Index::AbstractSeq
Bio::Index::AbstractSeq requires a couple of Bioperl modules, including 
Bio::Root::Root

Bio::SeqIO::entrezgene requires Bio::ASN1::EntrezGene and a bunch of 
Bioperl modules, including Bio::Root::Root.
)


You only avoid circularity by choosing not to install everything in one 
go. Which is something you can do right now with no problems.

From n.haigh at sheffield.ac.uk  Wed Jun 27 16:24:18 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 21:24:18 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
In-Reply-To: <E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
Message-ID: <4682C772.5070502@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hilmar Lapp wrote:
> On Jun 27, 2007, at 12:18 PM, Chris Fields wrote:
> 
>> Most projects make a clean break with cvs (no more commits) for the
>> reasons you point out.  Not sure how the other core devs feel about
>> that but I could go for that; it would def. prevent headaches.
> 
> There shouldn't be any cvs write support after the cut-over I think.  
> I don't see the benefit that would justify the huge headache potential.
> 
> 	-hilmar

I agree. A clean switch from cvs read/write to svn read/write plus cvs
read only sounds the least problematic!

However, how will links to cvs be dealt with? Links on Bioperl could be
switched over to point to svn, but what about possible links from
external sources? Maybe a more generic approach of redirection could
work? Or a simple warning page stating the fact that we have moved from
cvs to svn and provide a common link to follow?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgsdyczuW2jkwy2gRAtuyAKDIpN0TNX0U7sTuE3i+fj6WFZ1K0QCfcX7Y
81KurFwJlRtYFxSmLZP56Sk=
=pp7b
-----END PGP SIGNATURE-----

From hlapp at gmx.net  Wed Jun 27 16:30:19 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 17:30:19 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>


On Jun 26, 2007, at 5:21 PM, George Hartzell wrote:

>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>

Cool - this works for me.

One thing I notice is that in cvs log you see which version is in
which branch, which is useful for answering user queries that might be a
version problem. svn log doesn't seem to want to show that. Does
anyone have ideas for how to do this in svn?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Jun 27 16:32:18 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 17:32:18 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <4682C772.5070502@sheffield.ac.uk>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<4682C772.5070502@sheffield.ac.uk>
Message-ID: <D080DC49-A2A4-44E4-9027-A63C1772CD85@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jun 27, 2007, at 5:24 PM, Nathan S. Haigh wrote:

> However, how will links to cvs be dealt with?

Well, as I said before, one can probably write a cgi script in a couple
of lines of Perl that returns the appropriate redirect URL with a
redirect status code.

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFGgslWuV6N2JxL7qsRAvsTAKDjR18NzWzlj74mCF+diNpe2dLV2ACgn/4Y
f6sJ/ngeKEGpKHgyAHM1DAA=
=8n0E
-----END PGP SIGNATURE-----

From cjfields at uiuc.edu  Wed Jun 27 16:50:11 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:50:11 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>
Message-ID: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu>


On Jun 27, 2007, at 3:30 PM, Hilmar Lapp wrote:

>
> On Jun 26, 2007, at 5:21 PM, George Hartzell wrote:
>
>>
>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>>
>
> Cool - this works for me.
>
> One thing I notice is that in cvs log you see which version is in  
> which branch which is useful to answer user queries that might be a  
> version problem. svn log doesn't seem to want to show that. Does  
> anyone have ideas for how to do this in svn?
>
> 	-hilmar

We prob. should move it to a new directory ASAP which george can
write to when he needs to update.  cvs is in /home/repository/bioperl,
so maybe something similar, like /home/svn/repository/bioperl?

chris


From cjfields at uiuc.edu  Wed Jun 27 16:51:37 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:51:37 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net>
Message-ID: <4D8CAAD9-4774-47FB-84E0-7FBA50EC377B@uiuc.edu>


On Jun 27, 2007, at 3:08 PM, Hilmar Lapp wrote:

>
> On Jun 27, 2007, at 5:01 PM, Chris Fields wrote:
>
>> That means no svn->cvs syncing post-migration as well, I assume.
>
> That's a bit of a different story. People out there have URL links
> into our anonymous CVS repository. If it's not too troublesome (and I
> tend to think it's not) I'd like to maintain those in working
> order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi
> script that maps between the URL flavors (i.e., that maps a
> CVS-style URL to the equivalent SVN link).
>
> 	-hilmar

I'll try getting a wiki page up as a checklist for this, including  
what direction we're heading in, ideas (your list and CGI redirect  
ideas, svn:ignore issues, etc).  Dave has already started on the
'getting bioperl using svn' wiki page.

If we intend to sync cvs with svn we need to find the right tools or  
at least check for other projects which have done something similar.   
I haven't googled on that yet but I'll attempt to tonight.

chris


From cjfields at uiuc.edu  Wed Jun 27 16:53:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:53:08 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <C2A83EA3.EC27%bosborne11@verizon.net>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
Message-ID: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>

bioperl-run also.  I think the run CVS repo has some binary files, so  
if there are any problems with cvs2svn it'll be there.

chris

On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote:

> George,
>
> bioperl-db and bioperl-network should be included, I think.
>
> Brian O
>
>
> On 6/27/07 4:15 PM, "George Hartzell" <hartzell at alerce.com> wrote:
>
>> David Messina writes:
>>>> [Chris]
>>>>
>>>> I managed to get it working using file://.  Haven't tried svn+ssh yet
>>>> but I've had persistent problems getting ssh to work properly on my
>>>> macbook; not sure why yet but I haven't had time to play around
>>>> with it.
>>>
>>> I just did a checkout and a test commit, both via svn+ssh -- works
>>> great for me.
>>
>> Is there anyone working outside of bioperl-{run,live,ext}?
>>
>> g.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Wed Jun 27 17:05:50 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 22:05:50 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682C6F5.4020406@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk>
Message-ID: <4682D12E.3000803@sendu.me.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> It seems (to me at least) that Bioperl modules could/should? be released
>> as individual modules and that "bioperl" would really constitute a
>> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
>> this thinking? The Bio::ASN1::EntrezGene could simply require a
>> particular module rather than the whole of bioperl - might get out of
>> the circular dependency theoretically!?
> 
> No, it wouldn't.
[snip]
> You only avoid circularity by choosing not to install everything in one 
> go.

Errr... I take that back. Since CPAN bundles install things in a certain 
order, you just have to make sure that everything Bio::ASN1::EntrezGene 
needs is installed first, then Bio::ASN1::EntrezGene, then 
Bio::SeqIO::entrezgene.

But the main problem with this approach is that maintenance, 
global-style code improvements and releases become a nightmare. I could, 
perhaps, imagine a scenario where the repository stayed as-is (one 
monolithic collection), but the dist action of Build.PL could be altered 
to generate a release package per module instead of one big release 
package of all modules, as is currently the case.

Is there much value in doing that? Does anyone want me to look into the 
feasibility of such a thing?
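
The install-order point can be sketched with a topological sort. The Python below is only an illustration: the module-level edges are the ones listed earlier in this thread (a subset of the real requirements), and the distribution names in `DIST_DEPS` are placeholders for "all of Bioperl" and the separately distributed Bio::ASN1::EntrezGene.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

# Module-level requirements as listed in this thread (subset only).
MODULE_DEPS = {
    "Bio::Index::AbstractSeq": {"Bio::Root::Root"},
    "Bio::ASN1::EntrezGene": {"Bio::Index::AbstractSeq"},
    "Bio::SeqIO::entrezgene": {"Bio::ASN1::EntrezGene", "Bio::Root::Root"},
}

# The same relationships collapsed to whole distributions (placeholder names).
DIST_DEPS = {
    "bioperl": {"Bio-ASN1-EntrezGene"},
    "Bio-ASN1-EntrezGene": {"bioperl"},
}


def install_order(deps):
    """Return an install order with prerequisites first, or None on a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


print(install_order(MODULE_DEPS))  # an order exists at module granularity
print(install_order(DIST_DEPS))    # None: the cycle only appears per distribution
```

At module granularity the graph is acyclic, so a bundle could list the modules in that order; treated as two monolithic distributions, each requires the other and no valid order exists.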

From bosborne11 at verizon.net  Wed Jun 27 16:19:47 2007
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 27 Jun 2007 16:19:47 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
In-Reply-To: <18050.50510.84363.355034@almost.alerce.com>
Message-ID: <C2A83EA3.EC27%bosborne11@verizon.net>

George,

bioperl-db and bioperl-network should be included, I think.

Brian O


On 6/27/07 4:15 PM, "George Hartzell" <hartzell at alerce.com> wrote:

> David Messina writes:
>>> [Chris]
>>> 
>>> I managed to get it working using file://.  Haven't tried svn+ssh yet
>>> but I've had persistent problems getting ssh to work properly on my
>>> macbook; not sure why yet but I haven't had time to play around
>>> with it.
>> 
>> I just did a checkout and a test commit, both via svn+ssh -- works
>> great for me.
> 
> Is there anyone working outside of bioperl-{run,live,ext}?
> 
> g.
> 


From n.haigh at sheffield.ac.uk  Wed Jun 27 17:25:53 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 22:25:53 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682D12E.3000803@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
Message-ID: <4682D5E1.2030507@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> It seems (to me at least) that Bioperl modules could/should? be released
>>> as individual modules and that "bioperl" would really constitute a
>>> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
>>> this thinking? The Bio::ASN1::EntrezGene could simply require a
>>> particular module rather than the whole of bioperl - might get out of
>>> the circular dependency theoretically!?
>>
>> No, it wouldn't.
> [snip]
>> You only avoid circularity by choosing not to install everything in
>> one go.
> 
> Errr... I take that back. Since CPAN bundles install things in a certain
> order, you just have to make sure that everything Bio::ASN1::EntrezGene
> needs is installed first, then Bio::ASN1::EntrezGene, then
> Bio::SeqIO::entrezgene.
> 
> But the main problem with this approach is that maintenance,
> global-style code improvements and releases become a nightmare. I could,
> perhaps, imagine a scenario where the repository stayed as-is (one
> monolithic collection), but the dist action of Build.PL could be altered
> to generate a release package per module instead of one big release
> package of all modules, as is currently the case.
> 
> Is there much value in doing that? Does anyone want me to look into the
> feasibility of such a thing?


I think the value would be in other external modules being able to use
bioperl modules with more ease (not sure how many modules have depended,
or currently depend, on bioperl) as they would depend on a single module
rather than the whole package. However, how would the dependencies of
each module be handled? I'm clearly thinking aloud, but... maybe this
would tease apart "cliques" of interdependent modules, which could
themselves be shipped as bundles (e.g. Bio::Graphics), with a
"master" bioperl bundle that installs all the bioperl modules.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgtXhczuW2jkwy2gRAiftAKDZQGDpaq5saEyE3ZfPyFqli4j+8QCfXbIB
2EZjccEFEzfFlx4H47gzwLk=
=nobl
-----END PGP SIGNATURE-----

From hlapp at gmx.net  Wed Jun 27 17:35:28 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 18:35:28 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
	<4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
Message-ID: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>

Is there a reason not to port every subproject over?

	-hilmar

On Jun 27, 2007, at 5:53 PM, Chris Fields wrote:

> bioperl-run also.  I think the run CVS repo has some binary files, so
> if there are any problems with cvs2svn it'll be there.
>
> chris
>
> On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote:
>
>> George,
>>
>> bioperl-db and bioperl-network should be included, I think.
>>
>> Brian O
>>
>>
>> On 6/27/07 4:15 PM, "George Hartzell" <hartzell at alerce.com> wrote:
>>
>>> David Messina writes:
>>>>> [Chris]
>>>>>
>>>>> I managed to get it working using file://.  Haven't tried svn+ssh yet
>>>>> but I've had persistent problems getting ssh to work properly on my
>>>>> macbook; not sure why yet but I haven't had time to play around
>>>>> with it.
>>>>
>>>> I just did a checkout and a test commit, both via svn+ssh -- works
>>>> great for me.
>>>
>>> Is there anyone working outside of bioperl-{run,live,ext}?
>>>
>>> g.
>>>
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 27 17:36:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 16:36:29 -0500
Subject: [Bioperl-l] Splits again, formerly  Test overhaul complete
In-Reply-To: <4682D12E.3000803@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
Message-ID: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>


On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> It seems (to me at least) that Bioperl modules could/should? be released
>>> as individual modules and that "bioperl" would really constitute a
>>> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
>>> this thinking? The Bio::ASN1::EntrezGene could simply require a
>>> particular module rather than the whole of bioperl - might get out of
>>> the circular dependency theoretically!?
>> No, it wouldn't.
> [snip]
>> You only avoid circularity by choosing not to install everything  
>> in one go.
>
> Errr... I take that back. Since CPAN bundles install things in a  
> certain order, you just have to make sure that everything  
> Bio::ASN1::EntrezGene needs is installed first, then  
> Bio::ASN1::EntrezGene, then Bio::SeqIO::entrezgene.
>
> But the main problem with this approach is that maintenance,
> global-style code improvements and releases become a nightmare. I could,
> perhaps, imagine a scenario where the repository stayed as-is (one
> monolithic collection), but the dist action of Build.PL could be
> altered to generate a release package per module instead of one big
> release package of all modules, as is currently the case.
>
> Is there much value in doing that? Does anyone want me to look into  
> the feasibility of such a thing?

Not for the time being, at least in my opinion.  Too much on our  
plate at this point with svn migration, test conversion, bugzilla  
running over (next point of attack!), etc.  Maybe something to think  
about after, though I like the idea of a few splits to core as Steve  
suggested (SearchIO, Graphics, some LWP-related DB modules).

My (albeit extreme) thought is to have a lean-and-mean set of 'core'  
modules with as few external dependencies as possible, which could  
work around the circular dependency issue in this case:

                dep.on                  dep.on
Bio::Auxiliary -----> ASN1::EntrezGene -----> core
(with EntrezGene)                            (basic SeqIO, Index, DB, etc)
       \---->------>--- dep.on ->----->----->----/

Bioperl auxiliary modules would list core as a required dependency  
along with anything else needed for that particular aux. section  
(i.e. XML parsers, LWP, GD, etc.).  The whole mess, if needed, would  
be installed using Bundle::BioPerl or similar, with no part released  
w/o testing on the whole 'base' to ensure proper interaction.

If a fix needed to be made in one set, make the fix, test against  
bioperl 'base' as a whole, and release when possible.  No need to  
wait for a full-fledged 1.5.3 release.

Maybe wishful thinking...

chris


From cjfields at uiuc.edu  Wed Jun 27 17:44:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 16:44:47 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
	<4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
	<9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>
Message-ID: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>

We should port them all, yes.

chris

On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote:

> Is there a reason not to port every subproject over?
>
> 	-hilmar


From cjfields at uiuc.edu  Wed Jun 27 17:53:02 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 16:53:02 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682D5E1.2030507@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<4682D5E1.2030507@sheffield.ac.uk>
Message-ID: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu>


On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote:

>> ...
>> Is there much value in doing that? Does anyone want me to look into the
>> feasibility of such a thing?
>
>
> I think the value would be in other external modules being able to use
> bioperl modules with more ease (not sure how many modules have, or
> currently depend on bioperl) as they would depend on a single module,
> rather than the whole package. However, how would the dependencies of
> each module be handled? I'm clearly thinking aloud, but....Maybe this
> would tease apart "cliques" of modules that are interdependent? and
> could in themselves be shipped as bundles e.g. Bio::Graphics and have a
> "master" bioperl bundle that installs all the bioperl modules.

See my response to Sendu, and Steve Chervitz's original post and  
related thread:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315

which pretty much covers the same ground.  I think at most 4-5 split  
'cliques', including core, with the fewest possible dependencies in  
core.  If we do any of this, it prob. should wait until after an svn  
migration and bugzilla bug stomping unless there is a (well-argued)  
advantage to doing it now.

chris

From n.haigh at sheffield.ac.uk  Wed Jun 27 18:07:31 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 23:07:31 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<4682D5E1.2030507@sheffield.ac.uk>
	<1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu>
Message-ID: <4682DFA3.9090100@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> 
> On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote:
> 
>>> ...
>>> Is there much value in doing that? Does anyone want me to look into the
>>> feasibility of such a thing?
>>
>>
>> I think the value would be in other external modules being able to use
>> bioperl modules with more ease (not sure how many modules have, or
>> currently depend on bioperl) as they would depend on a single module,
>> rather than the whole package. However, how would the dependencies of
>> each module be handled? I'm clearly thinking aloud, but....Maybe this
>> would tease apart "cliques" of modules that are interdependent? and
>> could in themselves be shipped as bundles e.g. Bio::Graphics and have a
>> "master" bioperl bundle that installa all the bioperl modules.
> 
> See my response to Sendu, and Steve Chervitz's original post and related
> thread:
> 
> http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315
> 
> which pretty much covers the same ground.  I think at most 4-5 split
> 'cliques', including core, with the fewest possible dependencies in
> core.  If we do any of this, it prob. should wait until after an svn
> migration and bugzilla bug stomping unless there is a (well-argued)
> advantage to doing it now.
> 
> chris


That's fine by me - or should I say, the best way forward - I was really
just thinking aloud :)

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgt+jczuW2jkwy2gRAhPmAKDCgI1BOp/MOQVUQhQGqWaRRfPTaACfTPix
TSi/e8PtYTwpxn6x+ewrjBs=
=7Vp1
-----END PGP SIGNATURE-----

From bix at sendu.me.uk  Wed Jun 27 18:43:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 23:43:48 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
Message-ID: <4682E824.1050507@sendu.me.uk>

Chris Fields wrote:
> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote:
>> But the main problem with this approach is that maintenance, global- 
>> style code improvements and releases become a nightmare. I could,  
>> perhaps, imagine a scenario where the repository stayed as-is (one  
>> monolithic collection), but the dist action of Build.PL could be  
>> altered to generate a release package per module instead of one big  
>> release package of all modules, as is currently the case.
>>
>> Is there much value in doing that? Does anyone want me to look into  
>> the feasibility of such a thing?
> 
> Not for the time being, at least in my opinion.  Too much on our  
> plate at this point with svn migration, test conversion, bugzilla  
> running over (next point of attack!), etc.  Maybe something to think  
> about after, though I like the idea of a few splits to core as Steve  
> suggested (SearchIO, Graphics, some LWP-related DB modules).
[snip]
> If a fix needed to be made in one set, make the fix, test against  
> bioperl 'base' as a whole, and release when possible.  No need to  
> wait for a full-fledged 1.5.3 release.

What advantage is there of these defined splits instead of individual 
modules? As I see it you lose some of the potential benefits of breaking 
Bioperl up completely, whilst also suffering the maintenance problems I 
outlined in my objection to Steve's post.

Being able to work on all Bioperl from a single cvs (ne svn) check out/ 
archive, whilst distributing it as individual modules on CPAN seems like 
the best of both worlds to me. What am I missing?

From hartzell at alerce.com  Wed Jun 27 20:41:01 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 20:41:01 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>
	<9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu>
Message-ID: <18051.925.23313.932916@almost.alerce.com>

Chris Fields writes:
 > [...]
 > We prob. should move it to a new directory ASAP which george can  
 > write to when he needs to update.  cvs is in /home/repository/ 
 > bioperl, so maybe something similar, like /home/svn/repository/bioperl?

I'd be parsimonious (lazy...) and go for /home/svn/bioperl.

g.

From hartzell at alerce.com  Wed Jun 27 20:46:29 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 20:46:29 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
Message-ID: <18051.1253.87485.235496@almost.alerce.com>

Chris Fields writes:
 > [...]
 > Now how about a quick straw poll, what kind of access?  svn+ssh is  
 > already available, but some (Aaron among them) have indicated they  
 > would like https as well (not sure how involved it would be to set up).

What we do here, in large part, depends on what our host machine makes
available to us.

Is there an apache instance that we can use?  Maybe a separate one?

May someone among us configure it, or do we need to ask for help?  (in
other words, does anyone have sudo?)

Is there some reason to not include http: (using Digest authentication
so that passwords aren't passed in the clear?)?  Maybe even go so far
as to ask why bother with https:, it's not like we need to transfer
any data encrypted....

g.

From dmessina at wustl.edu  Wed Jun 27 23:02:25 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 27 Jun 2007 22:02:25 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
Message-ID: <D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>


On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote:

>
> On Jun 27, 2007, at 1:27 PM, David Messina wrote:
>
>> I would think we would want "Author Date Id Rev URL" set on
>> everything, no?. So either cvs2svn or your tool (whichever you think
>> is better), followed by
>>
>> 	svn propset svn:keywords "Author Date Id Rev URL" *
>
> Shouldn't this be done recursively?


Yep, good catch! Thanks, Hilmar.

Should be:

	svn propset --recursive svn:keywords "Author Date Id Rev URL" *
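
If this ends up being scripted rather than typed, a minimal Python sketch
could look like the following. The function names and the dry-run default
are assumptions made for illustration; only the svn command line itself
comes from the line above. Passing the arguments as a list keeps the
keyword string as a single argument without any shell quoting.

```python
import subprocess
from typing import List, Sequence

KEYWORDS = "Author Date Id Rev URL"


def propset_keywords_cmd(paths: Sequence[str], recursive: bool = True) -> List[str]:
    """Build the argv for `svn propset svn:keywords ...` without running it."""
    cmd = ["svn", "propset"]
    if recursive:
        cmd.append("--recursive")
    cmd += ["svn:keywords", KEYWORDS]
    cmd += list(paths)
    return cmd


def set_keywords(paths: Sequence[str], dry_run: bool = True) -> List[str]:
    """Return the command; only execute it when dry_run is False."""
    cmd = propset_keywords_cmd(paths)
    if not dry_run:
        # Requires an svn client and a working copy; raises on failure.
        subprocess.run(cmd, check=True)
    return cmd


print(" ".join(set_keywords(["Bio", "t"])))
```

The paths "Bio" and "t" are placeholders for whatever top-level entries
the working copy actually contains.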


From jason at bioperl.org  Wed Jun 27 23:29:09 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 28 Jun 2007 00:29:09 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <18051.1253.87485.235496@almost.alerce.com>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
Message-ID: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>

I think Chris D and I will need to confer a bit on https+svn.  I
don't know when we'll have a good chance to discuss everything.  At
some point this discussion may need to be taken off bioperl-l and
kept to just the interested parties, as we're delving into hardware
geek land.

The repository machine (dev) is a locked-down machine, meaning it really
only runs ssh and very few other servers, and httpd is not one of them.  We have
anonymous CVS (client and through httpd browsing) running on a  
separate machine (code) that has the info rsynced over every 10 or 15  
minutes. The foundation websites and mailing lists run on a third  
machine (portal).


If we decide to support https we'll need to spend a little time  
deciding how well we can keep it locked down - it will only be https  
not http for example and we may want to see about limiting ssh access  
to everyone if we migrate all OBF projects over to SVN and only  
support https.

Again to re-iterate what I think we would do:
  - SVN read/write will live on 'dev'; _WHEN_ we switch over there will
be no more writes to the CVS repository. It will be available by ssh+svn
and potentially by https+svn
  - SVN read-only will live on 'code', it will be accessible by http+svn
  - CVS read-only will live on 'code', this will only be a sync from  
the SVN to the CVS.  See http://svn2cvs.tigris.org/ for details


As I tried to ask for in the past, would someone also illustrate the  
importance of why _WE_ need to switch to SVN on a wiki page on  
Bioperl so that when someone complains/asks about this in the future  
the arguments are already laid out.  I am basically fine with it, but  
I don't honestly see a compelling reason beyond what has been  
mentioned wrt better integration in IDEs.
http://bioperl.org/wiki/Why_SVN

-jason
On Jun 27, 2007, at 9:46 PM, George Hartzell wrote:

> Chris Fields writes:
>> [...]
>> Now how about a quick straw poll, what kind of access?  svn+ssh is
>> already available, but some (Aaron among them) have indicated they
>> would like https as well (not sure how involved it would be to set  
>> up).
>
> What we do here, in large part, depends on what our host machine makes
> available to us.
>
> Is there an apache instance that we can use?  Maybe a separate one?
>
> May someone among us configure it, or do we need to ask for help?  (in
> other words, does anyone have sudo?)
>
> Is there some reason to not include http: (using Digest authentication
> so that passwords aren't passed in the clear?)?  Maybe even go so far
> as to ask why bother with https:, it's not like we need to transfer
> any data encrypted....
>
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Wed Jun 27 23:51:32 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 28 Jun 2007 00:51:32 -0300
Subject: [Bioperl-l] Splits again
In-Reply-To: <4682E824.1050507@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
Message-ID: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>

Hey guys - I'm wading in a bit late as I haven't had time to keep up
with the whole discussion.

So you are suggesting 800+ individual CPAN modules?  I don't think  
that is a good idea.  Why would you split up Bio::Seq::RichSeq and  
Bio::Seq into two separate packages for example? I think if you  
really want to move away from the monolithic install it has to be  
more logical by function - but I am not that optimistic that this is  
going to actually be easier for people.  Maybe I'm misunderstanding.

What are the arguments for separating things -- to make it so people
aren't scared by the number of modules so they'll code?  It seems
like some people just want it to be installed and run scripts - does
having them install dozens of modules work?  Do we need to consider
how much this would suck for someone who can't use CPAN or
Module::Build to automate dependency tracking and installation?  How
does it work when modules are deprecated?

I'm not sure I have made up my mind on what I'd like to see, but at  
some point I think we need to get a clearer idea of what audience we  
are trying to serve best.  If we want it to be easy to install, maybe we
should invest time into making OSX double-click installers, RPMs, and
the Windows stuff easily installable.  Or do we want to serve the
developers who aren't using SVN, so that we push out releases of
modules ASAP?  I just am not clear on the motivation for some of the
proposed changes.

Also - the main point I wanted to make - Can I suggest we spend a  
little time discussing what it will take to get a stable release for  
the current code as it stands (bioperl-live and bioperl-run)?  It  
seems like we really need to do this first so that we have a stable  
release that can be followed by CVS -> SVN migration, then consider  
major changes to the repository structure and release packaging, and  
potential deprecation and incorporation of other modules.


I assume there is no chance that we'd have a 1.6 candidate by BOSC  
next month?

Will it be productive to schedule a fair amount of time at BOSC  
discussing how to partition out the packages into separate sub- 
packages after we've done a successful release rather than trying to  
change things right now? I realize not everyone will be there but  
maybe it will be easier to interact on this then.

I think it will also be time to talk with Lincoln/Scott about how  
Gbrowse is structured and if that is working for them.  There is so
much code in different places that I think we need to figure out how
to structure it properly so those packages can be released.  It would  
probably mean moving Bio::Graphics, Bio::DB::GFF and  
Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages  
so they could be released more regularly on par with Gbrowse  
schedules.   Also I think someone needs to figure out Bio::Tools::GFF  
vs Bio::FeatureIO -- what do we want to do?  I don't think we really  
fully support GFF3 that well -- the X2GFF scripts probably need some  
more good testing (where X is BLAST, FASTA, Sim4, GenBank, EMBL,
etc.) and/or migration to the proper GFF writing.


-jason
On Jun 27, 2007, at 7:43 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote:
>>> But the main problem with this approach is that maintenance, global-
>>> style code improvements and releases become a nightmare. I could,
>>> perhaps, imagine a scenario where the repository stayed as-is (one
>>> monolithic collection), but the dist action of Build.PL could be
>>> altered to generate a release package per module instead of one big
>>> release package of all modules, as is currently the case.
>>>
>>> Is there much value in doing that? Does anyone want me to look into
>>> the feasibility of such a thing?
>>
>> Not for the time being, at least in my opinion.  Too much on our
>> plate at this point with svn migration, test conversion, bugzilla
>> running over (next point of attack!), etc.  Maybe something to think
>> about after, though I like the idea of a few splits to core as Steve
>> suggested (SearchIO, Graphics, some LWP-related DB modules).
> [snip]
>> If a fix needed to be made in one set, make the fix, test against
>> bioperl 'base' as a whole, and release when possible.  No need to
>> wait for a full-fledged 1.5.3 release.
>
> What advantage is there of these defined splits instead of individual
> modules? As I see it you lose some of the potential benefits of  
> breaking
> Bioperl up completely, whilst also suffering the maintenance  
> problems I
> outlined in my objection to Steve's post.
>
> Being able to work on all Bioperl from a single cvs (ne svn) check  
> out/
> archive, whilst distributing it as individual modules on CPAN seems  
> like
> the best of both worlds to me. What am I missing?

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From chris at bioteam.net  Thu Jun 28 00:08:25 2007
From: chris at bioteam.net (Chris Dagdigian)
Date: Thu, 28 Jun 2007 00:08:25 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
Message-ID: <97A3257B-8E00-48D7-8B7D-51AD728CB8F7@bioteam.net>


My understanding of "https+svn" is that it is actually WebDAV-over- 
HTTP which means that not only would we need to light up a HTTPD  
server on the developer box we'd also have to get a stable mod_dav  
module installed (sometimes not trivial) and then we would have to  
figure out how to handle the authentication bits. Right now with SSH  
we use Unix group permissions to figure out who can write to what  
repository -- WebDAV makes this a lot more complicated.

Forcing encryption over https will prevent someone from sniffing a  
developer password which removes the main security issue. The next  
problem is going to be integrating the DAV module with Linux PAM so  
that existing usernames and passwords can be used, -OR- we have to  
set up and maintain an entirely separate set of username and password  
maps for each developer and each SVN project.

I'm not super concerned about this -- BioTeam runs svn internally and  
we expose our SVN for employees both via WebDAV and SVN+SSH - it's  
not that hard to set up.

My biggest concern really has to do with how much extra work this  
will mean for the OBF sysadmin team. If there is an easy way to get a  
stable Apache/DAV/SVN integration going with authentication coming  
from Linux PAM then this is no big deal. If we have to manually  
maintain separate authentication lists then it will be kind of a hassle.

Like Jason mentioned, the OBF currently segregates "stuff" onto three  
different servers with three levels of security:

- dev.open-bio.org -- Developers only, SSH access only (main  
sourcecode repository for OBF)
- portal.open-bio.org -- Websites, Wikis, Blogs, Mailing list servers  
and helpdesk.open-bio.org
- code.open-bio.org -- "Disposable" anonymous access server that we  
can easily burn/wipe/reinstall if it ever gets hacked

Everything else that Jason mentioned is fine and easy to set up (if  
not already running):

  - SVN+SSH for developers
  - Anonymous SVN and Anonymous RSYNC for community access on  
code.open-bio.org
  - svn2cvs for whomever wants it on code.open-bio.org
  - web based SVN code browser installed on http://code.open-bio.org


Regards,
Chris


On Jun 27, 2007, at 11:29 PM, Jason Stajich wrote:

> I think Chris D and I will need to confer a bit on https+svn.  I  
> don't know when we'll have a good chance to discuss everything.  At  
> some point this discussion is may need to be taken off bioperl and  
> just the interested parties as we're delving into hardware geek land.
>
> The repository machine (dev) is a locked down machine meaning it  
> only really runs ssh and not many servers include httpd.  We have  
> anonymous CVS (client and through httpd browsing) running on a  
> separate machine (code) that has the info rsynced over every 10 or  
> 15 minutes. The foundation websites and mailing lists run on a  
> third machine (portal).
>
>
> If we decide to support https we'll need to spend a little time  
> deciding how well we can keep it locked down - it will only be  
> https not http for example and we may want to see about limiting  
> ssh access to everyone if we migrate all OBF projects over to SVN  
> and only support https.
>
> Again to re-iterate what I think we would do:
>  - SVN read/write will live on 'dev', _WHEN_ we switch over no  
> writes to the CVS repository. It will be available by ssh+svn and  
> potentially by https+svn
>  - SVN read-only will live on 'code', it will be accessible by http 
> +svn
>  - CVS read-only will live on 'code', this will only be a sync from  
> the SVN to the CVS.  See http://svn2cvs.tigris.org/ for details
>
>
> As I tried to ask for in the past, would someone also illustrate  
> the importance of why _WE_ need to switch to SVN on a wiki page on  
> Bioperl so that when someone complains/asks about this in the  
> future the arguments are already laid out.  I am basically fine  
> with it, but I don't honestly see a compelling reason beyond what  
> has been mentioned wrt better integration in IDEs.
> http://bioperl.org/wiki/Why_SVN
>
> -jason
> On Jun 27, 2007, at 9:46 PM, George Hartzell wrote:
>
>> Chris Fields writes:
>>> [...]
>>> Now how about a quick straw poll, what kind of access?  svn+ssh is
>>> already available, but some (Aaron among them) have indicated they
>>> would like https as well (not sure how involved it would be to  
>>> set up).
>>
>> What we do here, in large part, depends on what our host machine  
>> makes
>> available to us.
>>
>> Is there an apache instance that we can use?  Maybe a separate one?
>>
>> May someone among us configure it, or do we need to ask for help?   
>> (in
>> other words, does anyone have sudo?)
>>
>> Is there some reason to not include http: (using Digest  
>> authentication
>> so that passwords aren't passed in the clear?)?  Maybe even go so far
>> as to ask why bother with https:, it's not like we need to transfer
>> any data encrypted....
>>
>> g.
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>


From cjfields at uiuc.edu  Thu Jun 28 00:18:03 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 23:18:03 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4682E824.1050507@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
Message-ID: <FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>


On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:

> Chris Fields wrote:
> ...
>> If a fix needed to be made in one set, make the fix, test against   
>> bioperl 'base' as a whole, and release when possible.  No need to   
>> wait for a full-fledged 1.5.3 release.
>
> What advantage is there of these defined splits instead of  
> individual modules? As I see it you lose some of the potential  
> benefits of breaking Bioperl up completely, whilst also suffering  
> the maintenance problems I outlined in my objection to Steve's post.
>
> Being able to work on all Bioperl from a single cvs (ne svn) check  
> out/ archive, whilst distributing it as individual modules on CPAN  
> seems like the best of both worlds to me. What am I missing?

Okay, be forewarned, here's my long-winded reasoning.  The short and
sweet version: I (very) respectfully don't agree with you, at least  
re: the idea we should commit all modules to CPAN independently.  It  
doesn't make any sense to me, but maybe you can elaborate more?   
Maybe I'm misinterpreting what you mean?

Also, I agree with Steve C. that core is anything but a  
representation of a 'core' set of modules, and some sections could  
(should?) be split off into discrete, cohesive units.  We may be  
alone in that camp, though it doesn't seem so (it's popped up more  
than a few times, in one form or another).  If you want an in-depth  
explanation for both opinions, read on (below my sig), or feel free  
to bypass it.  I'll understand.

Finally, all of this should wait until later.  Much later, like after  
a decent release, after svn, etc kind of 'later'.  I think we can  
agree on that.

.
.
.
.
.

Still here?  Okay... each issue (skip as needed):

Individual CPAN modules:

CPAN is not our personal versioning system; it may be if a  
distribution consists of only a few modules, but not when it's one of  
the largest distros present.  If someone wants to update an  
individual bioperl module for a quick bug fix they are more than  
welcome to download it via cvs, svn, or even using a web browser, and  
replace the one they have.  In most cases, it works w/o problems.   
With Module::Build you have even made it easier if a full  
installation is necessary.

I'm trying to reason how one could break up the individual SeqIO/ 
SearchIO/otherIO modules into single module distributions.  They are  
intrinsically tied together (SeqIO::genbank won't work w/o SeqIO,  
which relies on the various interfaces, RootIO, and on down).  How  
would tests be run off CPAN when the modules are distributed  
independently?  Would they also be individually distributed?  What  
would you use to tie all the individual modules together?  How would  
you explain to the CPAN maintainers that you want to split bioperl  
into 990 individual modules, all updated independently, but intend on  
bundling them afterwards anyway?

I'm failing to see the advantages to this approach, but if you can  
find an example where this was done successfully on CPAN or elsewhere  
maybe I could see what you mean.

Splitting up core:

Here are the advantages of a defined split as Steve and
I see them (off the top of my head).  Some of this probably reiterates
my previous points, as well as Steve's, so apologies in advance.

- A lean, mean, focused set of bioperl base modules (core) w/o or  
with very few external deps, minimal installation issues, etc.  The  
very basic stuff to get up and running.

- BioPerl bundled modules (Nathan's 'cliques') with defined, focused  
functionality, code, and tests, which add a bit more 'sugar' to the  
base functionality of the core.  If you only care about parsing BLAST  
reports, get SearchIO, which requires core and optionally other  
modules (XML::SAX).  If you want additional DB functionality apart  
from the very basic ones in core, install DB (with its additional
requirements, including core, DBI, and so on).  Same with Graphics,  
Tools, Tree/Phylo, etc.  We just need to define and limit the number  
of splits.

- Easier to add additional bundled modules.  For instance, I could  
focus all of my RNA work into a discrete set of modules (say, bioperl- 
rna) which I maintain, I ensure works with the latest core code, I  
ensure also plays well with the other children =) , and I distribute  
via CPAN.  Same with EUtilities, which could go into a separated DB- 
related set or stay in core.

- If we want a full-fledged 'install everything', the CPAN Bundle  
system is available.  I think it's easier to use a Bundle for 4-5,  
even 10 groups of modules as opposed to over 900.

- A Bundle or a build file where discrete distributions are listed  
(Bio::SearchIO, etc) wouldn't need to be updated every time a new  
module is added to a distribution.  I suppose this could be  
automated, but why have the additional headache?

- A chance to cut out some cruft.  We all know that particular areas  
need work or a complete overhaul (Restriction, Structure, maybe a few  
others).  Smaller, concentrated sets of modules would, I believe, be
easier to maintain, and those that don't get used will eventually fall
out of favor and may be dropped or replaced by better-maintained
modules.  Survival of the fittest.

- We already have had practice; bioperl-db, bioperl-run, bioperl- 
network, and others.  Those that have been routinely maintained and  
enjoy wide use (db, run, network) have survived; others not so much  
(corba-related stuff, microarray, ext, etc., though the code is still  
available if someone else wants to take it up and revive it!).

Disadvantages of a defined split:

- The initial headache of identifying which groups go where,  
coordinating with those who rely on bioperl (GMOD, etc) on how this  
will be set up, so on...

- Separate groups of modules require testing together to ensure  
functionality is consistent and maintained (something I think you  
pointed out previously).

- I think there is an increased possibility of branching.

- Extra headaches for devs, who have to keep track of the various  
critical distributions and make sure they work well together.

- Maybe others, but it's getting late here.  Add more as needed; I'm  
sure there are a number more.


chris

From cjfields at uiuc.edu  Thu Jun 28 01:17:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 00:17:01 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
Message-ID: <671B8432-28DA-47DA-9E0C-66AF0E3D5973@uiuc.edu>

D'oh!  Just when I wanted to go to bed.  It's not fair, you're in  
California...

On Jun 27, 2007, at 10:51 PM, Jason Stajich wrote:

> Hey guys - I'm wading in a bit late as I haven't had time to keep up
> with whole discussion.
>
> So you are suggesting 800+ individual CPAN modules?  I don't think
> that is a good idea.  Why would you split up Bio::Seq::RichSeq and
> Bio::Seq into two separate packages for example? I think if you
> really want to move away from the monolithic install it has to be
> more logical by function - but I am not that optimistic that this is
> going to actually be easier for people.  Maybe I'm misunderstanding.

Okay, so maybe it wasn't just me.

> What are the arguments for separating things -- to make it so people
> aren't scared by the number of modules so they'll code?  It seems
> like some people just want it to be installed and run scripts - does
> having them install dozens of modules work.  Do we need to consider
> people how much this would suck if someone can't use CPAN or
> Module::Builder to automate dependancy tracking installation?  How
> does it work when modules are deprecated?

What I envision for core is maybe not just one distribution, but a  
cluster of distributions:

base - Bio::Seq; Bio::SeqIO; Bio::AlignIO, some Bio::DB, associated  
modules.  Bare bones, with as few dependencies as possible.
aux - Any Bio::SeqIO, Bio::AlignIO, Bio::DB etc. that requires  
additional modules.
search - Bio::Search and SearchIO
tools - Bio::Tools, Bio::Restriction, maybe DB modules, GFF-related  
stuff?
graphics - Bio::Graphics.  Maybe GMOD-related stuff here?

The last four would list bioperl-core as a dependency themselves  
along with any other modules necessary.  We could also have the core  
Build.PL ask the user if they want to install the other non-base  
distros, and maybe include bioperl-db, bioperl-network, and
bioperl-run in the loop if requested.

All would be installed as a bundle similar to Bundle::BioPerl, but  
have regular CPAN point releases (1.x.x) independently from one  
another i.e. for bug fixes, with a yearly/biyearly timed full release  
(1.x) of the whole shebang.  Any point release for any 'core'  
distribution would have to be tested against the others prior to  
release.

This is basically following Steve's train of thought, though more  
elaborated:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315

> I'm not sure I have made up my mind on what I'd like to see, but at
> some point I think we need to get a clearer idea of what audience we
> are trying to serve best.  If we want it to be easy to install maybe we
> should invest time into making OSX double-click installers, RPMs, and
> the Windows stuff easily installable.  If we want to serve the
> developers who aren't using SVN so we want to push out releases of
> modules ASAP?  I just am not clear on the motivation for some of the
> proposed changes.

I think regular CPAN releases with updated PPMs hosted via portal  
work fine for the most part, but it would be nice to host RPMs.   
Others (Allen Day, for instance) have donated time to generate RPMs  
but they seem to lag behind a bit more.

The original idea for svn arose from an unrelated thread with Mark  
Johnson discussing something (Glimmer maybe?) and took off from  
there.  I was actually pretty surprised it took on a life of its
own.  As for the motivation to switch, I haven't specifically used it  
myself, but the large number of responses seem to indicate others  
have and seem happy with it.  Rutger Vos had also indicated he would  
move Bio::Phylo over to the repo if we used svn.  We def. should  
address the issues you bring up (why _WE_ need svn) more succinctly  
but that shouldn't be an issue.

> Also - the main point I wanted to make - Can I suggest we spend a
> little time discussing what it will take to get a stable release for
> the current code as it stands (bioperl-live and bioperl-run)?  It
> seems like we really need to do this first so that we have a stable
> release that can be followed by CVS -> SVN migration, then consider
> major changes to the repository structure and release packaging, and
> potential deprecation and incorporation of other modules.

Agreed.  We prob. need to schedule a good couple of days (or so) to  
squash bugs.

> I assume there is no chance that we'd have a 1.6 candidate by BOSC
> next month?

Um, not likely as nothing has been addressed Feature/Annotation-wise  
(overloads are still there, methods have not been deprecated, etc).   
There was an underlying assumption these would have an effect on
GMOD-related stuff (I remember reading a post from Scott Cain in the mail
archive mentioning something along these lines after the 1.5 release  
hubbub).

Maybe a quick 1.5.3 for BOSC, with a 1.6 for fall?

> Will it be productive to schedule a fair amount of time at BOSC
> discussing how to partition out the packages into separate sub-
> packages after we've done a successful release rather than trying to
> change things right now? I realize not everyone will be there but
> maybe it will be easier to interact on this then.

How many are going to be there?  I can't go this year except on my  
own dime (which I don't have many of, student loans and all, sorry),  
though I'll likely be in a new lab by spring which is likely more  
amenable to funding.  If there is a hackathon in the late fall
(post-Sept) I'll make it a point to go regardless.

> I think it will also be time to talk with Lincoln/Scott about how
> Gbrowse is structured and if that is working for them.  There is too
> much code in different places that I think we need to figure out how
> to structure it properly so those packages can be released.  It would
> probably mean moving Bio::Graphics, Bio::DB::GFF and
> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages
> so they could be released more regularly on par with Gbrowse
> schedules.   Also I think someone needs to figure out Bio::Tools::GFF
> vs Bio::FeatureIO -- what do we want to do?  I don't think we really
> fully support GFF3 that well -- the X2GFF scripts probably need some
> more good testing (where X is BLAST,FASTA,Sim4, GenBank, EMBL,
> etc... ) and or migration to the proper GFF writing.
>
>
> -jason

Will Lincoln or Scott be at BOSC?

chris


From dmessina at wustl.edu  Thu Jun 28 01:21:58 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 00:21:58 -0500
Subject: [Bioperl-l] finding statistics on AA
In-Reply-To: <4681F4B4.8010609@pacific.net.sg>
References: <4681F4B4.8010609@pacific.net.sg>
Message-ID: <F57E70E8-BBDA-45CF-B2C7-E05AED04F4C6@wustl.edu>

Hi Melvin,

I don't think BioPerl has any information content-related code. I'm  
not terribly familiar with it myself, but the usual recommendation is  
to look at the EMBOSS package:

	http://en.wikipedia.org/wiki/EMBOSS

Dave


From bix at sendu.me.uk  Thu Jun 28 02:38:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 07:38:48 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
Message-ID: <46835778.5070901@sendu.me.uk>

Jason Stajich wrote:
> So you are suggesting 800+ individual CPAN modules?
> I don't think that is a good idea.  Why would you split up
> Bio::Seq::RichSeq and Bio::Seq into two separate packages for
> example? I think if you really want to move away from the monolithic
> install it has to be more logical by function - but I am not that
> optimistic that this is going to actually be easier for people.
> Maybe I'm misunderstanding.
> 
> What are the arguments for separating things -- to make it so people
>  aren't scared by the number of modules so they'll code?  It seems
> like some people just want it to be installed and run scripts - does
> having them install dozens of modules work?  Do we need to consider
> how much this would suck if someone can't use CPAN or
> Module::Builder to automate dependency tracking installation?  How
> does it work when modules are deprecated?

See my upcoming reply to Chris. Briefly, if the only change is to the
dist action of Build.PL, we can make a single archive of all modules
available to non-CPAN users, and individual modules available to CPAN
users. No problems.


> Also - the main point I wanted to make - Can I suggest we spend a
> little time discussing what it will take to get a stable release for
> the current code as it stands (bioperl-live and bioperl-run)?  It
> seems like we really need to do this first so that we have a stable
> release that can be followed by CVS -> SVN migration, then consider
> major changes to the repository structure and release packaging, and
> potential deprecation and incorporation of other modules.

I'd recommend that a 'stable' release shouldn't happen until we resolve
all the missing tests and bugzilla bugs (because I think the opportunity
should be taken to have it stable both in terms of interface /and/
bugs). Which is a lot of work.


> I assume there is no chance that we'd have a 1.6 candidate by BOSC
> next month?

None.

From bix at sendu.me.uk  Thu Jun 28 03:25:03 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 08:25:03 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
Message-ID: <4683624F.6020402@sendu.me.uk>

Chris Fields wrote:
> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:
>> What advantage is there of these defined splits instead of  
>> individual modules? As I see it you lose some of the potential  
>> benefits of breaking Bioperl up completely, whilst also suffering  
>> the maintenance problems I outlined in my objection to Steve's post.
>>
>> Being able to work on all Bioperl from a single cvs (ne svn) check  
>> out/ archive, whilst distributing it as individual modules on CPAN  
>> seems like the best of both worlds to me. What am I missing?
> 
> Okay, forewarned, but here's my long-winded reasoning.  The short and  
> sweet version: I (very) respectfully don't agree with you, at least  
> re: the idea we should commit all modules to CPAN independently. It  
> doesn't make any sense to me, but maybe you can elaborate more?   
> Maybe I'm misinterpreting what you mean?

The short and sweet version: my proposal has all the benefits of yours, 
but none of the disadvantages. What's not to like?


> Finally, all of this should wait until later.  Much later, like after  
> a decent release, after svn, etc kind of 'later'.  I think we can  
> agree on that.

Hmm, not really. If it can be implemented by a change in just Build.PL 
and ModuleBuildBioperl, it's really independent of everything else.
That's the beauty of it: the only thing that changes is how things are 
uploaded to and downloaded from CPAN. The only person that normally 
deals with that issue is the pumpkin for a release, and he only cares 
about it at release time.

In fact, if we're going to do it at all it makes sense to try it out on 
a minor release like 1.5.3. We've already got experience of doing it 
split-style from 1.5.2. (And let me tell you: splits at the code-base 
level suck.)


> Individual CPAN modules:
> 
> CPAN is not our personal versioning system; it may be if a  
> distribution consists of only a few modules, but not when it's one of  
> the largest distros present.  If someone wants to update an  
> individual bioperl module for a quick bug fix they are more than  
> welcome to download it via cvs, svn, or even using a web browser, and  
> replace the one they have.

And where is the harm in letting them do it via CPAN as well? In fact, 
there are significant benefits:


> I'm trying to reason how one could break up the individual SeqIO/ 
> SearchIO/otherIO modules into single module distributions.  They are  
> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO,  
> which relies on the various interfaces, RootIO, and on down).  How  
> would tests be run off CPAN when the modules are distributed  
> independently?

Bio::SeqIO::genbank would have a dependency on the latest version of 
Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies.

So when a user wants to get the latest version of Bio::SeqIO::genbank, 
they no longer have to worry about what other modules in its dependency 
hierarchy they should also install.

Instead they just request Bio::SeqIO::genbank which itself ensures you 
have the latest version of all its dependencies before installing itself 
and running its tests.

When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank 
users should have, he could just call './Build dist Bio::SeqIO::genbank' 
which would generate a new package for Bio::SeqIO::genbank suitable for 
uploading to CPAN. No more long release cycles and having to constantly 
tell people to 'use CVS' to get working Bioperl code.
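
A minimal sketch of how such a per-module dependency could be declared,
assuming a plain Module::Build Build.PL rather than our ModuleBuildBioperl
subclass. The version numbers are placeholders, and the per-module
'./Build dist' invocation above is a proposal, not an existing feature:

```perl
# Build.PL for a hypothetical single-module distribution.
use strict;
use warnings;
use Module::Build;

my $build = Module::Build->new(
    module_name  => 'Bio::SeqIO::genbank',
    dist_version => '1.5.3',        # placeholder version number
    license      => 'perl',
    requires     => {
        # Assumed minimum version of the parent module; a CPAN client
        # would normally fetch it first if it is missing or too old.
        'Bio::SeqIO' => '1.5.3',
    },
);

# Writes the ./Build script used for 'test', 'install' and 'dist'.
$build->create_build_script;
```

Bio::SeqIO would carry its own 'requires' list in the same way, so the
chain of dependencies is followed one distribution at a time.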


> Would they also be individually distributed?  What  
> would you use to tie all the individual modules together?  How would  
> you explain to the CPAN maintainers that you want to split bioperl  
> into 990 individual modules, all updated independently, but intend on  
> bundling them afterwards anyway?

They would be tied together by a CPAN bundle. You don't have to 
'explain' anything to the CPAN maintainers because you're not doing 
anything wrong. In fact, you're using it the way you're supposed to.


> Splitting up core:
> 
> As I see it, here are the advantages of a defined split as Steve and  
> I see it (off the top of my head).  Some of this probably reiterates  
> my previous points, as well as Steve's, so apologies in advance.

Below I answer with how it would be with my single-module approach 
compared to the defined splits.


> - A lean, mean, focused set of bioperl base modules (core) w/o or  
> with very few external deps, minimal installation issues, etc.  The  
> very basic stuff to get up and running.

Even leaner, even more focused.


> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused  
> functionality, code, and tests, which add a bit more 'sugar' to the  
> base functionality of the core.  If you only care about parsing BLAST  
> reports, get SearchIO, which requires core and optionally other  
> modules (XML::SAX).  If you want additional DB functionality apart  
> from the very basic ones in core, install DB (with its additional
> requirements, including core, DBI, and so on).  Same with Graphics,  
> Tools, Tree/Phylo, etc.  We just need to define and limit the number  
> of splits.

The same can be achieved with CPAN bundles for each kind of functional 
grouping you can think of. And since it's just a single text file that
defines such a grouping, it's easy to change or add new ones as you feel
like it, as opposed to the rather more permanent and substantial effort 
of creating one of your splits on the code-base level.

Also, the world doesn't have to rely on /our/ ideas of what a useful 
functional split is. If someone just wants to parse Blast results, they 
can just use CPAN to install Bio::SearchIO::blast_pull instead of having 
to install all of SearchIO.
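
For illustration, a CPAN bundle is an ordinary module file whose POD has
a CONTENTS section listing one module per line, optionally followed by a
minimum version. A sketch, assuming a hypothetical bundle name
(Bundle::BioPerl::BlastParsing) and assuming the listed modules were
available on CPAN as separate distributions:

```perl
package Bundle::BioPerl::BlastParsing;   # hypothetical bundle name

use strict;
use warnings;

our $VERSION = '0.01';                   # placeholder version

1;

__END__

=head1 NAME

Bundle::BioPerl::BlastParsing - hypothetical bundle for parsing BLAST reports

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerl::BlastParsing'

=head1 CONTENTS

Bio::SearchIO

Bio::SearchIO::blast_pull

=cut
```

Changing the grouping means editing the CONTENTS list and uploading a
new version of this one file.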


> - Easier to add additional bundled modules.  For instance, I could  
> focus all of my RNA work into a discrete set of modules (say, bioperl- 
> rna) which I maintain, I ensure works with the latest core code, I  
> ensure also plays well with the other children =) , and I distribute  
> via CPAN.  Same with EUtilities, which could go into a separated DB- 
> related set or stay in core.

And if you lose interest in them? They eventually die because they no 
longer have someone looking after them by default (the pumpkin and other 
devs). Alternatively you could just make a CPAN bundle. One text file! 
Easy! No duplication of modules in CPAN, no new hassle for you or the 
Bioperl 'core' pumpkin to ensure that the latest version of each work 
with each other and other splits.


> - If we want a full-fledged 'install everything', the CPAN Bundle  
> system is available.  I think it's easier to use a Bundle for 4-5,  
> even 10 groups of modules as opposed to over 900.

No, it isn't any easier. It's /equally/ easy to install a bundle of 900
single-module packages as it is to install 5 packages covering the same
900 modules.

When not installing absolutely everything, but perhaps 'most' things, 
there's the additional benefit that it would be easier to skip a 
particular Bio::module because you didn't want to install its external 
dependencies and weren't that interested in it anyway.


> - A Bundle or a build file where discrete distributions are listed  
> (Bio::SearchIO, etc) wouldn't need to be updated every time a new  
> module is added to a distribution.  I suppose this could be  
> automated, but why have the additional headache?

Yes, it would be automated, and no, it wouldn't at all be any kind of 
additional headache. I'm proposing a fully-automated system that the 
pumpkin wouldn't even have to think about. Much /less/ of a headache
than dealing with splits. Orders of magnitude easier to deal with.


> - A chance to cut out some cruft.  We all know that particular areas  
> need work or a complete overhaul (Restriction, Structure, maybe a few  
> others).  Smaller, concentrated sets of modules I believe would be  
> easier to maintain, and those that don't get use will eventually fall  
> out of favor and may be lost or replaced from the more maintained  
> group of modules.  Survival of the fittest.

And the smallest, most concentrated set of modules is the individual module.


> - We already have had practice; bioperl-db, bioperl-run, bioperl- 
> network, and others.  Those that have been routinely maintained and  
> enjoy wide use (db, run, network) have survived; others not so much  
> (corba-related stuff, microarray, ext, etc., though the code is still  
> available if someone else wants to take it up and revive it!).

The reason some of these existing splits (microarray, ext) have fallen by
the way-side? /Because/ they're splits. If they had been part of
bioperl-live all along, they'd have been kept in a working, compatible
state and would have been released along with everything else in 1.5.2.


> Disadvantages of a defined split:
> 
> - The initial headache of identifying which groups go where,  
> coordinating with those who rely on bioperl (GMOD, etc) on how this  
> will be set up, so on...

No need to worry about this with individual modules.


> - Separate groups of modules require testing together to ensure  
> functionality is consistent and maintained (something I think you  
> pointed out previously).

No need to worry.


> - I think an increased possibility of branching is possible.
> 
> - Extra headaches for devs, who have to keep track of the various  
> critical distributions and make sure they work well together.

No headaches.

From charles-listes+bioperl at plessy.org  Thu Jun 28 03:40:04 2007
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Thu, 28 Jun 2007 16:40:04 +0900
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
Message-ID: <20070628074004.GD6338@kunpuu.plessy.org>

Dear developers,

I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if
it would make sense to call it "bioperl-live" and distribute it in
parallel with the stable 1.4.0 version, if bioperl-live means "the
current developer version".

If I am wrong, can somebody explain to me what bioperl-live refers to
exactly?

Have a nice day,

-- 
Charles Plessy
Debian-med packaging team
Wako, Saitama, Japan

From n.haigh at sheffield.ac.uk  Thu Jun 28 04:23:10 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 09:23:10 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683624F.6020402@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
Message-ID: <46836FEE.5030203@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Chris Fields wrote:
>> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:
>>> What advantage is there of these defined splits instead of 
>>> individual modules? As I see it you lose some of the potential 
>>> benefits of breaking Bioperl up completely, whilst also suffering 
>>> the maintenance problems I outlined in my objection to Steve's post.
>>>
>>> Being able to work on all Bioperl from a single cvs (ne svn) check 
>>> out/ archive, whilst distributing it as individual modules on CPAN 
>>> seems like the best of both worlds to me. What am I missing?
>>
>> Okay, forewarned, but here's my long-winded reasoning.  The short and 
>> sweet version: I (very) respectfully don't agree with you, at least 
>> re: the idea we should commit all modules to CPAN independently. It 
>> doesn't make any sense to me, but maybe you can elaborate more?  
>> Maybe I'm misinterpreting what you mean?
> 
> The short and sweet version: my proposal has all the benefits of yours,
> but none of the disadvantages. What's not to like?
> 
> 
>> Finally, all of this should wait until later.  Much later, like after 
>> a decent release, after svn, etc kind of 'later'.  I think we can 
>> agree on that.
> 
> Hmm, not really. If it can be implemented by a change in just Build.PL
> and ModuleBuildBioperl, it's really independent of everything else.
> That's the beauty of it: the only thing that changes is how things are
> uploaded to and downloaded from CPAN. The only person that normally
> deals with that issue is the pumpkin for a release, and he only cares
> about it at release time.
> 
> In fact, if we're going to do it at all it makes sense to try it out on
> a minor release like 1.5.3. We've already got experience of doing it
> split-style from 1.5.2. (And let me tell you: splits at the code-base
> level suck.)
> 
> 
>> Individual CPAN modules:
>>
>> CPAN is not our personal versioning system; it may be if a 
>> distribution consists of only a few modules, but not when it's one of 
>> the largest distros present.  If someone wants to update an 
>> individual bioperl module for a quick bug fix they are more than 
>> welcome to download it via cvs, svn, or even using a web browser, and 
>> replace the one they have.
> 
> And where is the harm in letting them do it via CPAN as well? In fact,
> there are significant benefits:
> 
> 
>> I'm trying to reason how one could break up the individual SeqIO/
>> SearchIO/otherIO modules into single module distributions.  They are 
>> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, 
>> which relies on the various interfaces, RootIO, and on down).  How 
>> would tests be run off CPAN when the modules are distributed 
>> independently?
> 
> Bio::SeqIO::genbank would have a dependency on the latest version of
> Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies.
> 
> So when a user wants to get the latest version of Bio::SeqIO::genbank,
> they no longer have to worry about what other modules in its dependency
> hierarchy they should also install.
> 
> Instead they just request Bio::SeqIO::genbank which itself ensures you
> have the latest version of all its dependencies before installing itself
> and running its tests.

This was my thinking when I first brought this up at the
beginning/splitting of this thread. Thinking of modules as the
constituent parts of a larger package should make it far easier for
people to define dependencies, and users would only need to install
those parts they require. As Sendu points out, if the user
wants to convert seqs from genbank to fasta they could simply install
Bio::SeqIO::genbank and Bio::SeqIO::fasta and they would get all the
other modules that are the dependencies of Bio::SeqIO::genbank and
Bio::SeqIO::fasta.
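
A rough sketch of that idea in plain Perl: given a map of which module
requires which, work out what gets installed and in what order. The
dependency map here is illustrative only (the real modules depend on
more than this), and install_order/_visit are made-up helper names:

```perl
use strict;
use warnings;

# Illustrative dependency map, not the real BioPerl dependency graph.
my %requires = (
    'Bio::SeqIO::genbank' => ['Bio::SeqIO'],
    'Bio::SeqIO::fasta'   => ['Bio::SeqIO'],
    'Bio::SeqIO'          => ['Bio::Root::IO'],
    'Bio::Root::IO'       => ['Bio::Root::Root'],
    'Bio::Root::Root'     => [],
);

# Return everything needed for @wanted, dependencies first, each once.
sub install_order {
    my ($deps, @wanted) = @_;
    my (%seen, @order);
    _visit($deps, $_, \%seen, \@order) for @wanted;
    return @order;
}

# Depth-first walk: install a module's dependencies before the module.
sub _visit {
    my ($deps, $module, $seen, $order) = @_;
    return if $seen->{$module}++;          # already handled
    _visit($deps, $_, $seen, $order) for @{ $deps->{$module} || [] };
    push @$order, $module;
}

print "$_\n"
    for install_order(\%requires, 'Bio::SeqIO::genbank', 'Bio::SeqIO::fasta');
```

A shared dependency such as Bio::SeqIO shows up only once in the
resulting order, even though both requested modules need it.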

> 
> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank
> users should have, he could just call './Build dist Bio::SeqIO::genbank'
> which would generate a new package for Bio::SeqIO::genbank suitable for
> uploading to CPAN. No more long release cycles and having to constantly
> tell people to 'use CVS' to get working Bioperl code.

However, how would the test suite work out with this? e.g. when someone
installs Bio::SeqIO::genbank they want to have the tests associated with
Bio::SeqIO::genbank to be run. Would there be tests that would be run
redundantly if for example someone installed Bio::SeqIO::genbank and
Bio::SeqIO::fasta?

> 
> 
>> Would they also be individually distributed?  What  would you use to
>> tie all the individual modules together?  How would  you explain to
>> the CPAN maintainers that you want to split bioperl  into 990
>> individual modules, all updated independently, but intend on  bundling
>> them afterwards anyway?
> 
> They would be tied together by a CPAN bundle. You don't have to
> 'explain' anything to the CPAN maintainers because you're not doing
> anything wrong. In fact, you're using it the way you're supposed to.

Yep. Real modules are released as modules, each with its own set of
dependencies. Then use CPAN bundles the way they were supposed to be
used: for distributing a set of CPAN modules that provide a coherent set
of functionality. You "could" also bundle in other authors' modules e.g.
Bio::ASN1::EntrezGene?

> 
> 
>> Splitting up core:
>>
>> As I see it, here are the advantages of a defined split as Steve and 
>> I see it (off the top of my head).  Some of this probably reiterates 
>> my previous points, as well as Steve's, so apologies in advance.
> 
> Below I answer with how it would be with my single-module approach
> compared to the defined splits.
> 
> 
>> - A lean, mean, focused set of bioperl base modules (core) w/o or 
>> with very few external deps, minimal installation issues, etc.  The 
>> very basic stuff to get up and running.
> 
> Even leaner, even more focused.
> 
> 
>> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused 
>> functionality, code, and tests, which add a bit more 'sugar' to the 
>> base functionality of the core.  If you only care about parsing BLAST 
>> reports, get SearchIO, which requires core and optionally other 
>> modules (XML::SAX).  If you want additional DB functionality apart 
>> from the very basic ones in core, install DB (with its additional 
>> requirements, including core, DBI, and so on).  Same with Graphics, 
>> Tools, Tree/Phylo, etc.  We just need to define and limit the number 
>> of splits.
> 
> The same can be achieved with CPAN bundles for each kind of functional
> grouping you can think of. And since it's just a single text file that
> defines such a grouping, it's easy to change or add new ones as you feel
> like it, as opposed to the rather more permanent and substantial effort
> of creating one of your splits on the code-base level.
> 
> Also, the world doesn't have to rely on /our/ ideas of what a useful
> functional split is. If someone just wants to parse Blast results, they
> can just use CPAN to install Bio::SearchIO::blast_pull instead of having
> to install all of SearchIO.
> 
> 
>> - Easier to add additional bundled modules.  For instance, I could 
>> focus all of my RNA work into a discrete set of modules (say, bioperl-
>> rna) which I maintain, I ensure works with the latest core code, I 
>> ensure also plays well with the other children =) , and I distribute 
>> via CPAN.  Same with EUtilities, which could go into a separated DB-
>> related set or stay in core.
> 
> And if you lose interest in them? They eventually die because they no
> longer have someone looking after them by default (the pumpkin and other
> devs). Alternatively you could just make a CPAN bundle. One text file!
> Easy! No duplication of modules in CPAN, no new hassle for you or the
> Bioperl 'core' pumpkin to ensure that the latest version of each work
> with each other and other splits.

Hmm, how would module versions be handled? Wouldn't this approach
require each module to have its own independent version number, which
could then be used for building the dependencies? Each new release of
that module would only bump that module's version number.

Bundles can specify the minimum version of a module to be installed,
such that bug fixes to individual modules can be released into CPAN and
would automatically get picked up when installing bundles etc.
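
As a sketch of the version check this implies, using the core 'version'
module. The module versions below are invented for illustration, and
needs_update is a made-up helper, not part of CPAN.pm:

```perl
use strict;
use warnings;
use version;

# Hypothetical minimum versions, as a bundle might list them.
my %minimum = (
    'Bio::SeqIO'          => '1.5.2',
    'Bio::SeqIO::genbank' => '1.5.3',
);

# Hypothetical versions already installed on the user's machine.
my %installed = (
    'Bio::SeqIO'          => '1.5.2',
    'Bio::SeqIO::genbank' => '1.5.2',
);

# True if the module is missing or older than the required minimum.
sub needs_update {
    my ($have, $want) = @_;
    return 1 unless defined $have;
    return version->parse($have) < version->parse($want) ? 1 : 0;
}

for my $module (sort keys %minimum) {
    my $action = needs_update($installed{$module}, $minimum{$module})
        ? 'install newer release'
        : 'keep installed version';
    print "$module: $action\n";
}
```

Only the module whose minimum was bumped gets reinstalled; everything
else is left alone.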

I'm not quite sure how the current stable/dev releases would work. I
assume bug fixes would have to be made on a branch e.g. branch 1.6 and
released to CPAN from there. Then when the next stable release is made,
all module versions would be bumped and released to CPAN, along with any
modifications to the content of the bundle. Is it possible to have
stable and developer release bundles that specify the minimum stable and
developer module versions respectively?


> 
> 
>> - If we want a full-fledged 'install everything', the CPAN Bundle 
>> system is available.  I think it's easier to use a Bundle for 4-5, 
>> even 10 groups of modules as opposed to over 900.
> 
> No, it isn't any easier. It's /equally/ easy to install a bundle of 900
> single-module packages as it is to install 5 packages covering the same
> 900 modules.
> 
> When not installing absolutely everything, but perhaps 'most' things,
> there's the additional benefit that it would be easier to skip a
> particular Bio::module because you didn't want to install its external
> dependencies and weren't that interested in it anyway.
> 
> 
>> - A Bundle or a build file where discrete distributions are listed 
>> (Bio::SearchIO, etc) wouldn't need to be updated every time a new 
>> module is added to a distribution.  I suppose this could be 
>> automated, but why have the additional headache?
> 
> Yes, it would be automated, and no, it wouldn't at all be any kind of
> additional headache. I'm proposing a fully-automated system that the
> pumpkin wouldn't even have to think about. Much /less/ of a headache
> than dealing with splits. Orders of magnitude easier to deal with.
> 
> 
>> - A chance to cut out some cruft.  We all know that particular areas 
>> need work or a complete overhaul (Restriction, Structure, maybe a few 
>> others).  Smaller, concentrated sets of modules I believe would be 
>> easier to maintain, and those that don't get use will eventually fall 
>> out of favor and may be lost or replaced from the more maintained 
>> group of modules.  Survival of the fittest.
> 
> And the smallest, most concentrated set of modules is the individual
> module.
> 
> 
>> - We already have had practice; bioperl-db, bioperl-run, bioperl-
>> network, and others.  Those that have been routinely maintained and 
>> enjoy wide use (db, run, network) have survived; others not so much 
>> (corba-related stuff, microarray, ext, etc., though the code is still 
>> available if someone else wants to take it up and revive it!).
> 
> The reason some of these existing splits (microarray, ext) have fallen by
> the way-side? /Because/ they're splits. If they had been part of
> bioperl-live all along, they'd have been kept in a working, compatible
> state and would have been released along with everything else in 1.5.2.
> 
> 
>> Disadvantages of a defined split:
>>
>> - The initial headache of identifying which groups go where, 
>> coordinating with those who rely on bioperl (GMOD, etc) on how this 
>> will be set up, so on...
> 
> No need to worry about this with individual modules.
> 
> 
>> - Separate groups of modules require testing together to ensure 
>> functionality is consistent and maintained (something I think you 
>> pointed out previously).
> 
> No need to worry.

Maybe we need to worry about how the tests are run when installing
individual modules etc?

> 
> 
>> - I think an increased possibility of branching is possible.
>>
>> - Extra headaches for devs, who have to keep track of the various 
>> critical distributions and make sure they work well together.
> 
> No headaches.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg2/uczuW2jkwy2gRAlR4AJ44kHIXWWapNVGOIrkFBJdP9rn3vwCdErhT
VkymyXNshguE44/RilEXWDA=
=O5ex
-----END PGP SIGNATURE-----

From n.haigh at sheffield.ac.uk  Thu Jun 28 04:27:54 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 09:27:54 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683624F.6020402@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
Message-ID: <4683710A.9010808@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Chris Fields wrote:
>> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:
>>> What advantage is there of these defined splits instead of 
>>> individual modules? As I see it you lose some of the potential 
>>> benefits of breaking Bioperl up completely, whilst also suffering 
>>> the maintenance problems I outlined in my objection to Steve's post.
>>>
>>> Being able to work on all Bioperl from a single cvs (ne svn) check 
>>> out/ archive, whilst distributing it as individual modules on CPAN 
>>> seems like the best of both worlds to me. What am I missing?
>>
>> Okay, forewarned, but here's my long-winded reasoning.  The short and 
>> sweet version: I (very) respectfully don't agree with you, at least 
>> re: the idea we should commit all modules to CPAN independently. It 
>> doesn't make any sense to me, but maybe you can elaborate more?  
>> Maybe I'm misinterpreting what you mean?
> 
> The short and sweet version: my proposal has all the benefits of yours,
> but none of the disadvantages. What's not to like?
> 
> 
>> Finally, all of this should wait until later.  Much later, like after 
>> a decent release, after svn, etc kind of 'later'.  I think we can 
>> agree on that.
> 
> Hmm, not really. If it can be implemented by a change in just Build.PL
> and ModuleBuildBioperl, it's really independent of everything else.
> That's the beauty of it: the only thing that changes is how things are
> uploaded to and downloaded from CPAN. The only person that normally
> deals with that issue is the pumpkin for a release, and he only cares
> about it at release time.
> 
> In fact, if we're going to do it at all it makes sense to try it out on
> a minor release like 1.5.3. We've already got experience of doing it
> split-style from 1.5.2. (And let me tell you: splits at the code-base
> level suck.)
> 
> 
>> Individual CPAN modules:
>>
>> CPAN is not our personal versioning system; it may be if a 
>> distribution consists of only a few modules, but not when it's one of 
>> the largest distros present.  If someone wants to update an 
>> individual bioperl module for a quick bug fix they are more than 
>> welcome to download it via cvs, svn, or even using a web browser, and 
>> replace the one they have.
> 
> And where is the harm in letting them do it via CPAN as well? In fact,
> there are significant benefits:
> 
> 
>> I'm trying to reason how one could break up the individual SeqIO/
>> SearchIO/otherIO modules into single module distributions.  They are 
>> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, 
>> which relies on the various interfaces, RootIO, and on down).  How 
>> would tests be run off CPAN when the modules are distributed 
>> independently?
> 
> Bio::SeqIO::genbank would have a dependency on the latest version of
> Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies.
> 
> So when a user wants to get the latest version of Bio::SeqIO::genbank,
> they no longer have to worry about what other modules in its dependency
> hierarchy they should also install.
> 
> Instead they just request Bio::SeqIO::genbank which itself ensures you
> have the latest version of all its dependencies before installing itself
> and running its tests.
> 
> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank
> users should have, he could just call './Build dist Bio::SeqIO::genbank'
> which would generate a new package for Bio::SeqIO::genbank suitable for
> uploading to CPAN. No more long release cycles and having to constantly
> tell people to 'use CVS' to get working Bioperl code.
> 
> 
>> Would they also be individually distributed?  What  would you use to
>> tie all the individual modules together?  How would  you explain to
>> the CPAN maintainers that you want to split bioperl  into 990
>> individual modules, all updated independently, but intend on  bundling
>> them afterwards anyway?
> 
> They would be tied together by a CPAN bundle. You don't have to
> 'explain' anything to the CPAN maintainers because you're not doing
> anything wrong. In fact, you're using it the way you're supposed to.
> 


The successor to Bundles - may prove interesting:
http://search.cpan.org/~adamk/Task-1.01/lib/Task.pm


> 
>> Splitting up core:
>>
>> As I see it, here are the advantages of a defined split as Steve and 
>> I see it (off the top of my head).  Some of this probably reiterates 
>> my previous points, as well as Steve's, so apologies in advance.
> 
> Below I answer with how it would be with my single-module approach
> compared to the defined splits.
> 
> 
>> - A lean, mean, focused set of bioperl base modules (core) w/o or 
>> with very few external deps, minimal installation issues, etc.  The 
>> very basic stuff to get up and running.
> 
> Even leaner, even more focused.
> 
> 
>> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused 
>> functionality, code, and tests, which add a bit more 'sugar' to the 
>> base functionality of the core.  If you only care about parsing BLAST 
>> reports, get SearchIO, which requires core and optionally other 
>> modules (XML::SAX).  If you want additional DB functionality apart 
>> from the very basic ones in core, install DB (with its additional 
>> requirements, including core, DBI, and so on).  Same with Graphics, 
>> Tools, Tree/Phylo, etc.  We just need to define and limit the number 
>> of splits.
> 
> The same can be achieved with CPAN bundles for each kind of functional
> grouping you can think of. And since it's just a single text file that
> defines such a grouping, it's easy to change or add new ones as you feel
> like it, as opposed to the rather more permanent and substantial effort
> of creating one of your splits on the code-base level.
> 
> Also, the world doesn't have to rely on /our/ ideas of what a useful
> functional split is. If someone just wants to parse Blast results, they
> can just use CPAN to install Bio::SearchIO::blast_pull instead of having
> to install all of SearchIO.
> 
> 
>> - Easier to add additional bundled modules.  For instance, I could 
>> focus all of my RNA work into a discrete set of modules (say, bioperl-
>> rna) which I maintain, I ensure works with the latest core code, I 
>> ensure also plays well with the other children =) , and I distribute 
>> via CPAN.  Same with EUtilities, which could go into a separated DB-
>> related set or stay in core.
> 
> And if you lose interest in them? They eventually die because they no
> longer have someone looking after them by default (the pumpkin and other
> devs). Alternatively you could just make a CPAN bundle. One text file!
> Easy! No duplication of modules in CPAN, no new hassle for you or the
> Bioperl 'core' pumpkin to ensure that the latest version of each work
> with each other and other splits.
> 
> 
>> - If we want a full-fledged 'install everything', the CPAN Bundle 
>> system is available.  I think it's easier to use a Bundle for 4-5, 
>> even 10 groups of modules as opposed to over 900.
> 
> No, it isn't any easier. It's /equally/ easy to install a bundle of 900
> packages of 900 modules as it is to install 5 packages of 900 modules.
> 
> When not installing absolutely everything, but perhaps 'most' things,
> there's the additional benefit that it would be easier to skip a
> particular Bio::module because you didn't want to install its external
> dependencies and weren't that interested in it anyway.
> 
> 
>> - A Bundle or a build file where discrete distributions are listed 
>> (Bio::SearchIO, etc) wouldn't need to be updated every time a new 
>> module is added to a distribution.  I suppose this could be 
>> automated, but why have the additional headache?
> 
> Yes, it would be automated, and no, it wouldn't at all be any kind of
> additional headache. I'm proposing a fully-automated system that the
> pumpkin wouldn't even have to think about it. Much /less/ of a headache
> than dealing with splits. Orders of magnitude easier to deal with.
> 
> 
>> - A chance to cut out some cruft.  We all know that particular areas 
>> need work or a complete overhaul (Restriction, Structure, maybe a few 
>> others).  Smaller, concentrated sets of modules I believe would be 
>> easier to maintain, and those that don't get use will eventually fall 
>> out of favor and may be lost or replaced from the more maintained 
>> group of modules.  Survival of the fittest.
> 
> And the smallest, most concentrated set of modules is the individual
> module.
> 
> 
>> - We already have had practice; bioperl-db, bioperl-run, bioperl-
>> network, and others.  Those that have been routinely maintained and 
>> enjoy wide use (db, run, network) have survived; others not so much 
>> (corba-related stuff, microarray, ext, etc., though the code is still 
>> available if someone else wants to take it up and revive it!).
> 
> The reason some of these existing splits (microarray, ext) have fallen by
> the wayside? /Because/ they're splits. If they had been part of
> bioperl-live all along, they'd have been kept in a working, compatible
> state and would have been released along with everything else in 1.5.2.
> 
> 
>> Disadvantages of a defined split:
>>
>> - The initial headache of identifying which groups go where, 
>> coordinating with those who rely on bioperl (GMOD, etc) on how this 
>> will be set up, so on...
> 
> No need to worry about this with individual modules.
> 
> 
>> - Separate groups of modules require testing together to ensure 
>> functionality is consistent and maintained (something I think you 
>> pointed out previously).
> 
> No need to worry.
> 
> 
>> - I think an increased possibility of branching is possible.
>>
>> - Extra headaches for devs, who have to keep track of the various 
>> critical distributions and make sure they work well together.
> 
> No headaches.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg3EKczuW2jkwy2gRAriiAJ47Qz9jTshEXuaG0XMYrUTI0hHqAwCeL45r
r/BykCKbM9lqJM0khARuEms=
=NB4B
-----END PGP SIGNATURE-----

From n.haigh at sheffield.ac.uk  Thu Jun 28 04:51:19 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 09:51:19 +0100
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org>
References: <20070628074004.GD6338@kunpuu.plessy.org>
Message-ID: <46837687.7010101@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Charles Plessy wrote:
> Dear developers,
> 
> I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if
> it would make sense to call it "bioperl-live" and distribute it in
> parallel with the stable 1.4.0 version, if bioperl-live means "the
> current developer version".
> 
> If I am wrong, can somebody explain to me what bioperl-live exactly refers
> to?
> 
> Have a nice day,
> 

bioperl-live really means the HEAD of the cvs repository so is the most
bleeding-edge code available.

Version 1.5.* is the developer release, while 1.4.* is the stable
release. However, there have been few updates to the 1.4.* release, which
means that it is in practice less stable than the 1.5.* dev release. I
think the consensus was to have more rapid release cycles of the stable
branch in future in order to avoid this. I'm sure there are others more
qualified to expand on or correct this if needs be.

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg3aHczuW2jkwy2gRAo5pAJ95BGqrA5bLwRKNfUQi/HfBnkUJjwCg0mYB
/fHFyYkqAvcmOSxu4djPll0=
=KwVH
-----END PGP SIGNATURE-----

From bix at sendu.me.uk  Thu Jun 28 05:11:39 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 10:11:39 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <46836FEE.5030203@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk> <46836FEE.5030203@sheffield.ac.uk>
Message-ID: <46837B4B.7060705@sendu.me.uk>

Nathan S. Haigh wrote:
(Please try and snip more: don't quote whole posts just to reply to 
certain paragraphs)

> Sendu Bala wrote:
>> Chris Fields wrote:
>> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank
>> users should have, he could just call './Build dist Bio::SeqIO::genbank'
>> which would generate a new package for Bio::SeqIO::genbank suitable for
>> uploading to CPAN. No more long release cycles and having to constantly
>> tell people to 'use CVS' to get working Bioperl code.
> 
> However, how would the test suite work out with this? e.g. when someone
> installs Bio::SeqIO::genbank they want to have the tests associated with
> Bio::SeqIO::genbank to be run. Would there be tests that would be run
> redundantly if for example someone installed Bio::SeqIO::genbank and
> Bio::SeqIO::fasta?

We would want to move to a strict test-script-per-module system. But 
that's desirable in any case, as it would greatly ease reaching our goal 
of complete test coverage, and subsequent maintenance of those tests.

The genbank test would only run tests specific to genbank parsing, and 
likewise for fasta. They would both have a dependency on Bio::SeqIO, and 
if that was also recently updated, it would get installed prior to you 
installing genbank (and therefore run its own generic SeqIO tests), but 
wouldn't get installed again (wouldn't run its tests again) when you 
install fasta afterwards.
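
As a rough illustration of that install-once behaviour, here is a minimal
sketch in Python. It is not how CPAN or Build.PL is implemented; the DEPENDS
table is a hypothetical stand-in for each module's own dependency metadata,
and the version check ("if that was also recently updated") is left out for
brevity.

```python
# Hypothetical dependency table; real dependencies would come from each
# module's own metadata rather than a hard-coded dictionary.
DEPENDS = {
    "Bio::SeqIO::genbank": ["Bio::SeqIO"],
    "Bio::SeqIO::fasta": ["Bio::SeqIO"],
    "Bio::SeqIO": ["Bio::Root::IO"],
    "Bio::Root::IO": [],
}


def install(module, installed, log):
    """Install `module` after its dependencies, testing each module once."""
    if module in installed:
        return  # already installed: no reinstall, no second test run
    for dep in DEPENDS.get(module, []):
        install(dep, installed, log)
    log.append(module)  # stands in for "run this module's tests, then install"
    installed.add(module)


installed, log = set(), []
install("Bio::SeqIO::genbank", installed, log)
install("Bio::SeqIO::fasta", installed, log)
print(log)
```

The log shows the generic SeqIO step happening only once, before genbank,
and not again when fasta is requested afterwards.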


On the subject of tests, I'm reminded of another benefit of the 
individual-module approach. Currently if a test fails during a CPAN 
install, nothing gets installed. Users do one of:

# refuse to install at all (strict sys-admins)
# cry and give up (newbies)
# cry and seek help (newbies who really really need Bioperl)
# force install, leaving them in some undefined state because they 
didn't understand the problems (most remaining users)
# force install, happy that the problems are ok (some Bioperl devs)

With a bundle of individual modules you would install virtually all 
Bioperl modules with no problems, and the problems with the remainder 
would be clear to everyone. No one would need to force install since the 
test results would now be meaningful: the thing you're trying to 
install really isn't going to work if the tests are failing. If you 
really needed that particular Bioperl module you could then pay 
particular attention to why it's failing (most likely some problem with 
an external dependency).


>>> Would they also be individually distributed?  What  would you use to
>>> tie all the individual modules together?
>>
>> They would be tied together by a CPAN bundle. You don't have to
>> 'explain' anything to the CPAN maintainers because you're not doing
>> anything wrong. In fact, you're using it the way you're supposed to.
> 
> Yep. Real modules are released as modules, each with their own set of
> dependencies. Then use CPAN bundles the way they were meant to be used:
> for distributing a set of CPAN modules that make a coherent set of
> functionality. You "could" also bundle in other authors' modules, e.g.
> Bio::ASN1::EntrezGene?

Any bundle featuring Bio::SeqIO::entrezgene would necessarily include 
Bio::ASN1::EntrezGene in the bundle.


> Hmm, how would module versions be handled? Wouldn't this approach
> require each module to have its own independent version number, which
> could then be used for building the dependencies? Each new release of
> that module would only bump that module's version number.

Yes, that's how it would work. No more global version number.


> Bundles can specify the minimum version of a module to be installed,
> such that bug fixes to individual modules can be released into CPAN and
> would automatically get picked up when installing bundles etc.

Yes.


> I'm not quite sure how the current stable/dev releases would work. I
> assume bug fixes would have to be made on a branch, e.g. branch 1.6, and
> released to CPAN from there. Then when the next stable release is made,
> all module versions would be bumped and released to CPAN, with any
> modifications to the content of the bundle made at the same time. Is it
> possible to have stable and developer release bundles that are able to
> specify the minimum stable and developer module versions respectively?

No, the distinction becomes pretty meaningless. We could still do big 
major releases, but modules wouldn't be version-bumped. The big release 
would just be an update of the bundle that specifies the latest version 
of all Bioperl modules.

Remember that bundles only specify the minimum version, not the required 
version: in this brave new world users would end up with the same 
versions of modules whether they installed a 1.8 bundle or a 1.7 bundle.
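
To make the minimum-version point concrete, here is a small sketch in
Python. All of the version numbers and both bundle definitions are invented
for illustration; the only assumption about CPAN is that an install fetches
the newest indexed release of a module rather than the minimum a bundle
asks for.

```python
# Hypothetical data: the newest version of each module currently on CPAN.
LATEST_ON_CPAN = {"Bio::SeqIO": (1, 4), "Bio::SeqIO::genbank": (1, 9)}

# Hypothetical bundles: each lists only a *minimum* acceptable version.
BUNDLE_1_7 = {"Bio::SeqIO": (1, 2), "Bio::SeqIO::genbank": (1, 5)}
BUNDLE_1_8 = {"Bio::SeqIO": (1, 4), "Bio::SeqIO::genbank": (1, 8)}


def install_bundle(bundle, installed):
    """Return the module versions present after installing `bundle`."""
    result = dict(installed)
    for module, minimum in bundle.items():
        have = result.get(module)
        if have is None or have < minimum:
            # An install always fetches the newest release, not the minimum.
            result[module] = LATEST_ON_CPAN[module]
    return result


fresh_1_7 = install_bundle(BUNDLE_1_7, {})
fresh_1_8 = install_bundle(BUNDLE_1_8, {})
print(fresh_1_7 == fresh_1_8)
```

On a fresh machine both bundles yield the same set of versions, which is why
a bundle alone cannot reproduce an old release.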

The only way to get a true snapshot of 1.7 after it was released would 
be if we took snapshots and archived them, making them available from 
bioperl.org (or by checking out the 1.7 tag from cvs/svn).

I don't see that as a significant problem. You lose the trivial benefit 
of being able to install old snapshots from CPAN. The people who have a 
great need to install old snapshots can find their way to bioperl.org no 
problem.

From bix at sendu.me.uk  Thu Jun 28 04:50:09 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 09:50:09 +0100
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org>
References: <20070628074004.GD6338@kunpuu.plessy.org>
Message-ID: <46837641.8050106@sendu.me.uk>

Charles Plessy wrote:
> I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if
> it would make sense to call it "bioperl-live" and distribute it in
> parallel with the stable 1.4.0 version, if bioperl-live means "the
> current developer version".
> 
> If I am wrong, can somebody explain to me what bioperl-live exactly refers
> to?

bioperl-live is the name of the CVS repository containing what is 
currently considered the 'Core package' or core modules.
http://www.bioperl.org/wiki/Using_CVS

If you want to call it something to distinguish it from stable, call it 
'developer' vs 'stable' or '1.5.2' vs '1.4.0'.

To distinguish them both from the other packages, call them 'core' vs 
'run' etc.


From hlapp at gmx.net  Thu Jun 28 06:31:29 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 28 Jun 2007 07:31:29 -0300
Subject: [Bioperl-l] Splits again
In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
Message-ID: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>


On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote:

> [...] Also - the main point I wanted to make - Can I suggest we spend a
> little time discussing what it will take to get a stable release for
> the current code as it stands (bioperl-live and bioperl-run)?  It
> seems like we really need to do this first so that we have a stable
> release that can be followed by CVS -> SVN migration, then consider
> major changes to the repository structure and release packaging, and
> potential deprecation and incorporation of other modules.

I agree we need to discuss a path towards 1.6, but I think that  
should be kept separate from the cvs->svn migration. Otherwise one  
stalls the other (by stopping people who seem to have the energy and  
motivation right now to do one but not the other) for no really good  
reason.

> I assume there is no chance that we'd have a 1.6 candidate by BOSC
> next month?

I'm not sure that's feasible, but if someone steps up  
maybe it is.

>
> Will it be productive to schedule a fair amount of time at BOSC
> discussing how to partition out the packages into separate sub-
> packages after we've done a successful release rather than trying to
> change things right now?

I agree. I also don't think that people are partitioning right now  
(other than the existing partitioning), though maybe I'm mistaken.

> [...]
> It would  probably mean moving Bio::Graphics, Bio::DB::GFF and
> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages
> so they could be released more regularly on par with Gbrowse
> schedules.

Possibly. I'm not fully sure why those modules couldn't also be  
released more often out of the "main trunk" of modules. In Java/ant,  
it'd be relatively easy to write build script filters that select the  
appropriate modules and package them on the fly. I'm not sure whether  
the build tools for Perl can do that too, though.
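
For what it's worth, the kind of filter meant here could look roughly like
the following Python sketch. It only shows the selection step, not the
packaging; the TRUNK listing and the GBROWSE_SUPPORT patterns are
hypothetical examples, not an agreed package definition.

```python
from fnmatch import fnmatch

# Hypothetical file listing from a single checkout of the main trunk.
TRUNK = [
    "Bio/Seq.pm",
    "Bio/SeqIO/genbank.pm",
    "Bio/Graphics/Panel.pm",
    "Bio/DB/GFF.pm",
    "Bio/DB/SeqFeature/Store.pm",
]

# Hypothetical package definition: glob patterns naming what goes in it.
GBROWSE_SUPPORT = ["Bio/Graphics/*", "Bio/DB/GFF.pm", "Bio/DB/SeqFeature/*"]


def select(files, patterns):
    """Return the files that match at least one of the glob patterns."""
    return [f for f in files if any(fnmatch(f, p) for p in patterns)]


for path in select(TRUNK, GBROWSE_SUPPORT):
    print(path)
```

Changing what a package contains then means editing a short list of
patterns rather than moving code between repositories.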

>   Also I think someone needs to figure out Bio::Tools::GFF
> vs Bio::FeatureIO -- what do we want to do?

I believe FeatureIO has the ontology download tied into it?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Thu Jun 28 06:47:39 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 28 Jun 2007 07:47:39 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
Message-ID: <F2858007-63BC-4E72-B5BD-5420BE39E6D2@gmx.net>


On Jun 28, 2007, at 12:29 AM, Jason Stajich wrote:

> As I tried to ask for in the past, would someone also illustrate the
> importance of why _WE_ need to switch to SVN on a wiki page on
> Bioperl so that when someone complains/asks about this in the future
> the arguments are already laid out.  I am basically fine with it, but
> I don't honestly see a compelling reason beyond what has been
> mentioned wrt better integration in IDEs.
> http://bioperl.org/wiki/Why_SVN

I guess at the end of the day svn is just the system of choice for  
new developers. I've had people who started with svn tell me that cvs  
seems a lot harder to use. The newer projects are all on svn and, for  
example, integrating Bio::Phylo into BioPerl should not become a question  
of the revision control system.

At the end of the day if being on svn makes it easier for new people  
to contribute it's enough of an argument for me, whether it's  
rational or not.

IMHO, there are two advantages that svn has over cvs. First,  
directories are versioned, have properties, and generally are the  
same class of citizens as files. They can be added, renamed, and  
removed from the repository. In cvs, we all know what a hassle it is  
to rename or even retire directories. Second, svn log gives you the  
commits, i.e., the set of changes that constituted one particular  
commit (and therefore version increase). In cvs that's hard or  
impossible to reconstruct.

Bottom line - I don't think many people if any will question why we  
moved from cvs to svn ...

My $0.02 ...

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Wed Jun 27 20:34:37 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 20:34:37 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
	<4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
	<9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>
	<1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>
Message-ID: <18051.541.684705.567954@almost.alerce.com>

Chris Fields writes:
 > We should port them all, yes.
 > 
 > chris
 > 
 > On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote:
 > 
 > > Is there a reason not to port every subproject over?
 > >
 > > 	-hilmar

They're all there, or at least everything that I found in the CVS repo is.
Some of the directories were empty and some had very little content; I
was just mechanical about it.

Here's what I have:

  [hartzell at dev ~]$ svn ls file://`pwd`/bioperl
  biodata/
  bioperl-cookbook/
  bioperl-corba-client/
  bioperl-corba-server/
  bioperl-das-client/
  bioperl-db/
  bioperl-ext/
  bioperl-gui/
  bioperl-live/
  bioperl-microarray/
  bioperl-network/
  bioperl-papers/
  bioperl-pedigree/
  bioperl-pipeline/
  bioperl-run/
  biosql-schema/
  html/
  task-manager/
  xml-html/

I wasn't very clear in my original request, but I was hoping that
someone out there who's familiar with the various out-of-the-way bits
and pieces could take a look at them.  I was afraid that everyone was
just checking out bioperl-live and doing 'make test'.

Someone (chris?) made a point about binary files in bioperl-run.  It'd
be great if someone in the know could check on them.

Also, to the degree that it's possible, look around at various tags
and branches and see if they're what you'd expect.

Thanks!

g.

From bix at sendu.me.uk  Thu Jun 28 08:21:37 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 13:21:37 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <4683A7D1.8070403@sendu.me.uk>

George Hartzell wrote:
> Chris Fields writes:
>  > [...]
>  > It looks like George Hartzell may be taking a crack at it, with  
>  > Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
>  > could have something testable relatively soon.  After that we'll need  
>  > to work out a few other issues, basically what's on Hilmar's list.
> 
> There's a repository on file:///home/hartzell/bioperl with all of the
> components projects in place.
> 
> If you have a dev.open-bio.org account and you're in the bioperl
> group, you're good to get at it via:
> 
>   file:///home/hartzell/bioperl

I'm confused. Presumably that only works whilst logged into 
dev.open-bio.org?


>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl

I just tried:

svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl

on Mac OS X and things seemed to go well, except for this error message 
at the end:


svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
svn: Can't move source to dest
svn: Can't move 
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
to 
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
No such file or directory

I also ended up with only:
bioperl-corba-server    bioperl-db              bioperl-live 
bioperl-network         bioperl-papers          biosql-schema


Am I doing something totally wrong here?

From hartzell at alerce.com  Thu Jun 28 08:32:36 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:32:36 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN
	and	...Re:	Perltidy]
In-Reply-To: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
Message-ID: <18051.43620.481558.447399@almost.alerce.com>

Jason Stajich writes:
 > [...]
 > The repository machine (dev) is a locked down machine meaning it only  
 > really runs ssh and not many servers include httpd.  We have  
 > anonymous CVS (client and through httpd browsing) running on a  
 > separate machine (code) that has the info rsynced over every 10 or 15  
 > minutes.

A great way to provide a read-only mirror of the repos. for anonymous
users is to have svnsync running out of cron on code.open-bio.org,
configured to pull from the dev.open-bio.org repository.  It might
actually work to have rsync mirror the fsfs-backed repository, but
that's scary-poking-into-the-internals.

g.


From hartzell at alerce.com  Thu Jun 28 08:43:37 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:43:37 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
Message-ID: <18051.44281.831316.749586@almost.alerce.com>

David Messina writes:
 > 
 > On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote:
 > 
 > >
 > > On Jun 27, 2007, at 1:27 PM, David Messina wrote:
 > >
 > >> I would think we would want "Author Date Id Rev URL" set on
 > >> everything, no?. So either cvs2svn or your tool (whichever you think
 > >> is better), followed by
 > >>
 > >> 	svn propset svn:keywords "Author Date Id Rev URL" *
 > >
 > > Shouldn't this be done recursively?
 > 
 > 
 > Yep, good catch! Thanks, Hilmar.
 > 
 > Should be:
 > 
 > 	svn propset --recursive svn:keywords "Author Date Id Rev URL" *

That's not quite what you want either.  It'll set the keyword
property on all of the files, including things where you probably
don't want expansion to happen (e.g. images, someone said there are
binary wads in bioperl-run, etc...).

The Right Thing To Do is to grub around (grep) for '\$Id:' (and the
others) and set svn:keywords only on files that are already using
keywords.  I have a bourne shell hack that'll do this, although it's
painful because it has to run in working directories....
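
The same idea can be sketched in Python (this is not the shell hack
mentioned above, just an illustration under assumptions). It walks a working
copy, looks for files that already contain one of the keyword markers, and
builds the matching svn propset commands without running them. The keyword
list and the demo file names are assumptions.

```python
import os
import re
import shlex
import tempfile

KEYWORDS = "Author Date Id Rev URL"
# Matches both the unexpanded form ($Id$) and the expanded form ($Id: ... $).
MARKER = re.compile(rb"\$(?:Author|Date|Id|Rev|URL)(?::[^$\n]*)?\$")


def files_using_keywords(root):
    """Yield paths under `root` whose contents already contain a keyword."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Never descend into Subversion's own administrative directories.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                if MARKER.search(handle.read()):
                    yield path


def propset_commands(root):
    """Build (but do not run) one svn propset command per matching file."""
    return [
        "svn propset svn:keywords %s %s" % (shlex.quote(KEYWORDS), shlex.quote(p))
        for p in files_using_keywords(root)
    ]


# Small self-contained demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as demo:
    with open(os.path.join(demo, "Seq.pm"), "w") as fh:
        fh.write("# $Id$\n")
    with open(os.path.join(demo, "logo.png"), "wb") as fh:
        fh.write(b"\x89PNG\r\n")
    commands = propset_commands(demo)
    for command in commands:
        print(command)
```

Only the file that already carries a keyword marker gets a command; the
binary file is left alone. Printing the commands first makes it easy to
review the list before anything touches the working copy.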

Once we settle on a list of keywords to use, I'll take a whack at the
demo repository.

Likewise, you probably DON'T want to use this in your config file:

	  [miscellany]
	  enable-auto-props = yes
	  [auto-props]
	  * = svn:keywords="Author Date Id Rev URL"

since it'll do the same thing.

The Right Thing To Do is the more tedious:

	  [auto-props]
	  *.pl = svn:keywords="Author Date Id Rev URL"
	  *.pm = svn:keywords="Author Date Id Rev URL"
	  *.c = svn:keywords="Author Date Id Rev URL"

A bit of googling will give you a good starting point for the list,
and we should probably maintain a common one somewhere in the repo.

I don't think that there's a server side way of doing this, short of
running some script via a hook around commit time.
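
For what it's worth, such a hook might look roughly like the sketch
below.  It assumes the standard pre-commit calling convention (REPOS
and TXN as the two arguments) and uses svnlook; the function names are
hypothetical.  Note that a pre-commit hook can only reject a commit,
not add the property itself, so this merely refuses newly added
.pl/.pm files that lack svn:keywords.

```shell
#!/bin/sh
# Hypothetical pre-commit hook; Subversion calls it as: pre-commit REPOS TXN

# Read `svnlook changed` output on stdin; print paths of added .pl/.pm files.
added_perl_files() {
    grep -E '^A' | cut -c5- | grep -E '\.(pl|pm)$'
}

# Print added Perl files in transaction $2 of repository $1 lacking svn:keywords.
missing_keywords() {
    svnlook changed -t "$2" "$1" | added_perl_files |
    while IFS= read -r path; do
        svnlook propget -t "$2" "$1" svn:keywords "$path" >/dev/null 2>&1 ||
            printf '%s\n' "$path"
    done
}

# Only act when invoked the way Subversion invokes a pre-commit hook.
if [ "$#" -eq 2 ]; then
    missing=$(missing_keywords "$1" "$2")
    if [ -n "$missing" ]; then
        echo "svn:keywords is not set on:" >&2
        echo "$missing" >&2
        exit 1
    fi
fi
```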

g.

From hartzell at alerce.com  Thu Jun 28 08:54:40 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:54:40 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN
	and	...Re:	Perltidy]
In-Reply-To: <F2858007-63BC-4E72-B5BD-5420BE39E6D2@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
	<F2858007-63BC-4E72-B5BD-5420BE39E6D2@gmx.net>
Message-ID: <18051.44944.982207.37624@almost.alerce.com>

Hilmar Lapp writes:
 > [...]
 > IMHO, there's two advantages that svn has over cvs. First,  
 > directories are versioned, have properties, and generally are the  
 > same class of citizens as files. They can be added, renamed, and  
 > removed from the repository. In cvs, we all know what a hassle it is  
 > to rename or even retire directories. Second, svn log gives you the  
 > commits, i.e., the set of changes that constituted one particular  
 > commit (and therefore version increase). In cvs that's hard or  
 > impossible to reconstruct.

Two more:

  - svn groups changes into revisions, so that they can be considered
    together; CVS versions individual files.
  - subversion tracks renames/moves correctly,
  - subversion commits are atomic, so you never have to worry about
    whether all of your stuff made it into the repos. at the same time [if
    you've never had to un-muck this, count yourself blessed!] ,
  - svk, which allows disconnected development while still committing
    your work to a repo at natural points along the way (you can
    revert, branch, etc.... to your heart's content).

[yeah, that's 3, err, 4. Math is hard.]

g.

From cjfields at uiuc.edu  Thu Jun 28 09:07:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 08:07:24 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
	<23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>
Message-ID: <01812F01-9409-49FB-9061-330FA52177C1@uiuc.edu>


On Jun 28, 2007, at 5:31 AM, Hilmar Lapp wrote:

>
> On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote:
>
>> ...It
>> seems like we really need to do this first so that we have a stable
>> release that can be followed by CVS -> SVN migration, then consider
>> major changes to the repository structure and release packaging, and
>> potential deprecation and incorporation of other modules.
>
> I agree we need to discuss a path towards 1.6, but I think that
> should be kept separate from the cvs->svn migration. Otherwise one
> stalls the other (by stopping people who seem to have the energy and
> motivation right now to do one but not the other) for no really good
> reason.

It's good to discuss it as long as it doesn't take time and energy  
away from other priorities.

>> I assume there is no chance that we'd have a 1.6 candidate by BOSC
>> next month?
>
> I'm not sure that's feasible, but if someone steps up
> maybe it is.

Maybe a 1.5.3 and (if we work hard on it) a 1.6 soon after.  Then  
maybe work on partitioning if everyone's up for it and a scheme is  
worked out.

>> Will it be productive to schedule a fair amount of time at BOSC
>> discussing how to partition out the packages into separate sub-
>> packages after we've done a successful release rather than trying to
>> change things right now?
>
> I agree. I also don't think that people are partitioning right now
> (other than the existing partitioning), though maybe I'm mistaken.

The original proposal was based on Steve's idea of splitting up  
core.  I don't think a partition is feasible at this point, at least  
until we put more thought into it  (our energy should be focused  
elsewhere), but it's well worth discussing as a future path.

At this time there are two proposals:

1)  Steve's and my 'split into discrete sections' proposal, where we  
split core into self-sustaining sections with a common core listed as  
a dependency, tying installation of all together with a Bundle or  
similar.

2)  Sendu's 'break everything up' approach where all modules are  
submitted independently to CPAN, with their own tests, dependencies,  
etc.

There are advantages and disadvantages to both approaches.  Not sure  
if CPAN would go for the latter (it's pretty drastic), but I don't  
know for sure.  If you want in on that discussion (in this thread)  
feel free to join in!  The more the merrier!

>> [...]
>> It would  probably mean moving Bio::Graphics, Bio::DB::GFF and
>> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages
>> so they could be released more regularly on par with Gbrowse
>> schedules.
>
> Possibly. I'm not fully sure why those modules couldn't also be
> released more often out of the "main trunk" of modules. In Java/ant,
> it'd be relatively easy to write build script filters that select the
> appropriate modules and package them on the fly. I'm not sure whether
> the build tools for Perl can do that too, though.

Both approaches above would probably use Module::Build to install  
other bioperl dependencies, each of which could have its own  
dependency set, possibly using a Bundle to tie everything together.

>>   Also I think someone needs to figure out Bio::Tools::GFF
>> vs Bio::FeatureIO -- what do we want to do?
>
> I believe FeatureIO has the ontology download tied into it?
>
> 	-hilmar

 From recent posts here and on the gbrowse mail list by Scott and  
Lincoln, it seemed like they were moving away from using Bio::DB::GFF  
and were trying to get users to switch to Bio::DB::SeqFeature.  Maybe
we should get a more direct response?

chris


From hartzell at alerce.com  Thu Jun 28 09:16:18 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 09:16:18 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683A7D1.8070403@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
Message-ID: <18051.46242.942184.758493@almost.alerce.com>

Sendu Bala writes:
 > George Hartzell wrote:
 > > Chris Fields writes:
 > >  > [...]
 > >  > It looks like George Hartzell may be taking a crack at it, with  
 > >  > Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
 > >  > could have something testable relatively soon.  After that we'll need  
 > >  > to work out a few other issues, basically what's on Hilmar's list.
 > > 
 > > There's a repository at file:///home/hartzell/bioperl with all of the
 > > component projects in place.
 > > 
 > > If you have a dev.open-bio.org account and you're in the bioperl
 > > group, you're good to get at it via:
 > > 
 > >   file:///home/hartzell/bioperl
 > 
 > I'm confused. Presumably that only works whilst logged into 
 > dev.open-bio.org?

Yes, that only works if you're actually on the machine.

 > >   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
 > 
 > I just tried:
 > 
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
 > 
 > on Mac OS X and things seemed to go well, except for this error message 
 > at the end:
 > 
 > 
 > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
 > svn: Can't move source to dest
 > svn: Can't move 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
 > to 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
 > No such file or directory
 > 
 > I also ended up with only:
 > bioperl-corba-server    bioperl-db              bioperl-live 
 > bioperl-network         bioperl-papers          biosql-schema
 > 
 > 
 > Am I doing something totally wrong here?

It looks like you tried to check out the *entire* repository.  It
never occurred to me to try that.  I'll take a look at what you
reported.

g.

From bix at sendu.me.uk  Thu Jun 28 09:20:19 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 14:20:19 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18051.46242.942184.758493@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.46242.942184.758493@almost.alerce.com>
Message-ID: <4683B593.3050108@sendu.me.uk>

George Hartzell wrote:
> Sendu Bala writes:
>> I just tried:
>> 
>> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
[snip]
> It looks like you tried to check out the *entire* repository.

Yes. If you don't want everything, how does one 'browse' the repository
to find out the address of the thing you /do/ want?


> It never occurred to me to try that.  I'll take a look at what you 
> reported.

Cheers.


From bix at sendu.me.uk  Thu Jun 28 09:27:29 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 14:27:29 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18049.22260.967524.353173@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
Message-ID: <4683B741.5020600@sendu.me.uk>

George Hartzell wrote:
> There don't seem to be any .cvsignore files in the repository, or in
> CVSROOT/cvsignore.
> 
> Am I missing something, or don't we use them?

It would be great to have the following files svn:ignored :

In all package roots:
? Build
? MANIFEST
? MANIFEST.SKIP
? META.yml
? _build
? bioperl-*.tar.bz2
? bioperl-*.tar.gz
? bioperl-*.zip
? blib
? cover_db

In any and all directories:
? .DS_Store
? .DAV

In bioperl-live:
? t/BioDBSeqFeature.t
? t/BioDBSeqFeature_BDB.t
? t/BioDBSeqFeature_mysql.t


Can't think of anything else right now.
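
In case it helps, here is a rough sketch of how the package-root list
above could be applied with svn:ignore.  The helper names are made up,
and it assumes it is run against a checked-out package root.  Since
svn:ignore is set per directory, the "any and all directories" entries
(.DS_Store, .DAV) are probably better handled through each user's
global-ignores setting, so they are left out here.

```shell
#!/bin/sh
# Patterns wanted in every package root, one per line (from the list above).
package_root_ignores() {
    cat <<'EOF'
Build
MANIFEST
MANIFEST.SKIP
META.yml
_build
bioperl-*.tar.bz2
bioperl-*.tar.gz
bioperl-*.zip
blib
cover_db
EOF
}

# Set svn:ignore on one working-copy directory (hypothetical helper).
set_package_ignores() {
    dir=$1
    list=$(mktemp) || return 1
    package_root_ignores > "$list"
    svn propset svn:ignore -F "$list" "$dir"
    status=$?
    rm -f "$list"
    return "$status"
}

# Example: set_package_ignores bioperl-live
```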

Thanks for your efforts,
Sendu.

From cjfields at uiuc.edu  Thu Jun 28 09:30:43 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 08:30:43 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683A7D1.8070403@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
Message-ID: <A2B0A715-BEF7-4632-91B3-1A215FBFE3D5@uiuc.edu>


On Jun 28, 2007, at 7:21 AM, Sendu Bala wrote:

>> ...
>>   file:///home/hartzell/bioperl
>
> I'm confused. Presumably that only works whilst logged into
> dev.open-bio.org?

Yes, it's just a tester.

>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>
> I just tried:
>
> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl

Try

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk /mybiodir

to check out the main trunk for core.

chris


From hartzell at alerce.com  Thu Jun 28 09:57:00 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 09:57:00 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683A7D1.8070403@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
Message-ID: <18051.48684.996884.134046@almost.alerce.com>

Sendu Bala writes:
 > [...]
 > I just tried:
 > 
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
 > 
 > on Mac OS X and things seemed to go well, except for this error message 
 > at the end:
 > 
 > 
 > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
 > svn: Can't move source to dest
 > svn: Can't move 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
 > to 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
 > No such file or directory
 > 
 > I also ended up with only:
 > bioperl-corba-server    bioperl-db              bioperl-live 
 > bioperl-network         bioperl-papers          biosql-schema
 > 
 > 
 > Am I doing something totally wrong here?

So, you probably wanted something like

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk

to pick up the head of the bioperl live tree (or
/.../bioperl-run/trunk, etc...).

I just checked out

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/

and it ran to completion and gave me 

   (delicious)[6:50am]~/tmp>>ls bioperl | cat
   biodata
   bioperl-cookbook
   bioperl-corba-client
   bioperl-corba-server
   bioperl-das-client
   bioperl-db
   bioperl-ext
   bioperl-gui
   bioperl-live
   bioperl-microarray
   bioperl-network
   bioperl-papers
   bioperl-pedigree
   bioperl-pipeline
   bioperl-run
   biosql-schema
   html
   task-manager
   xml-html

Can another mac os x user out there give the Great Big Checkout a try
and see if it runs to completion.  Potential problems that come to
mind are:

  - the "mac's are case insensitive, sort of" problem
  - you filled up your disk
  - something else.

g.

From charles-listes+bioperl at plessy.org  Thu Jun 28 09:44:56 2007
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Thu, 28 Jun 2007 22:44:56 +0900
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <46837687.7010101@sheffield.ac.uk>
References: <20070628074004.GD6338@kunpuu.plessy.org>
	<46837687.7010101@sheffield.ac.uk>
Message-ID: <20070628134456.GB14492@kunpuu.plessy.org>

On Thu, Jun 28, 2007 at 09:51:19AM +0100, Nathan S. Haigh wrote:
> 
> Version 1.5.* is the developer release, while the 1.4.* is the stable
> release. However, there have been few updates to the 1.4.* release which
> means that it is more unstable than the 1.5.* dev release. I think the
> consensus, was to have more rapid release cycles of the stable branch in
> future in order to avoid this. I'm sure there are others more qualified
> to expand/correct me on this if needs be.

Ok, thank you all for the answers. I think that I will simply upgrade
bioperl to 1.5.2 in Debian testing, and maybe rename it bioperl-core
when I package other components.

Have a nice day,

-- 
Charles Plessy
Debian-Med packaging team
Wako, Saitama, Japan

From bix at sendu.me.uk  Thu Jun 28 10:19:49 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 15:19:49 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18051.48684.996884.134046@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
Message-ID: <4683C385.3050904@sendu.me.uk>

George Hartzell wrote:
> Sendu Bala writes:
>  > [...]
>  > I just tried:
>  > 
>  > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>  > 
>  > on Mac OS X and things seemed to go well, except for this error message 
>  > at the end:
>  > 
>  > 
>  > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
>  > svn: Can't move source to dest
>  > svn: Can't move 
>  > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
>  > to 
>  > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
>  > No such file or directory
>  > 
>  > I also ended up with only:
>  > bioperl-corba-server    bioperl-db              bioperl-live 
>  > bioperl-network         bioperl-papers          biosql-schema

I tried again in the same location and it told me I had to 'svn 
cleanup', which I did. But subsequently it kept complaining about files 
already being there.


> I just checked out
> 
>   svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/
> 
> and it ran to completion
[snip]
> Can another mac os x user out there give the Great Big Checkout a try
> and see if it runs to completion.  Potential problems that come to
> mind are:
> 
>   - the "mac's are case insensitive, sort of" problem
>   - you filled up your disk
>   - something else.

Well, I didn't run out of disc space. After a rm -fr * and trying again 
it failed at exactly the same point, in the same way.

svn co 
svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data

causes this repeatable problem:

[...]
A    data/phredfile.phd
svn: In directory 'data'
svn: Can't move source to dest
svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 
'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory

That is with Mac OS X svn command-line client, version 1.4.4

I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with 
a linux svn command-line client, version 1.2.3.
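
One way to test the case-insensitivity theory (it is only a theory at
this point) would be to look for names in that directory that differ
only by case.  A small sketch, with a made-up function name, that reads
a list of paths and prints the lower-cased form of any that collide:

```shell
#!/bin/sh
# Read paths on stdin; print (lower-cased) any path that occurs more than
# once when case is ignored, i.e. names that would clash on a
# case-insensitive filesystem.
case_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Example, run against the directory that fails to check out:
#   svn ls -R svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | case_collisions
```

No output would mean there are no such clashes and the cause lies
elsewhere.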


Cheers,
Sendu.

From dmessina at wustl.edu  Thu Jun 28 11:08:59 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 10:08:59 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18051.44281.831316.749586@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
	<18051.44281.831316.749586@almost.alerce.com>
Message-ID: <F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>

> [George]
> Likewise, you probably DON'T want to use this in your config file:
>
> 	  enable-auto-props = yes
> 	  * = svn:keywords="Author Date Id Rev URL"
>
> since it'll do the same thing.

Ah, so I've been doing it wrong all along then. :) Thanks, George!


> The Right Thing To Do is a more tedious
>
> 	  *.pl = svn:keywords="Author Date Id Rev URL"
> 	  *.pm = svn:keywords="Author Date Id Rev URL"
>   	  *.c = svn:keywords="Author Date Id Rev URL"
>
> A bit of googling will give you a good starting point for the list,
> and we should probably maintain a common one somewhere in the repo.


I've googled around and gathered the following as a possible list for  
our repo. Since I obviously don't know what I'm doing :), of course  
adjust and refine as necessary.

Dave

-------
[auto-props]
# Code formats
*.c          = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cpp        = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.h          = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.java       = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.as         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cgi        = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.js         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/javascript
*.php        = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-php
*.pl         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl; svn:executable
*.pm         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl
*.py         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-python; svn:executable
*.sh         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-sh; svn:executable

# Image formats
*.bmp        = svn:mime-type=image/bmp
*.gif        = svn:mime-type=image/gif
*.ico        = svn:mime-type=image/ico
*.jpeg       = svn:mime-type=image/jpeg
*.jpg        = svn:mime-type=image/jpeg
*.png        = svn:mime-type=image/png
*.tif        = svn:mime-type=image/tiff
*.tiff       = svn:mime-type=image/tiff

# Data formats
*.pdf        = svn:mime-type=application/pdf
*.avi        = svn:mime-type=video/avi
*.doc        = svn:mime-type=application/msword
*.eps        = svn:mime-type=application/postscript
*.gz         = svn:mime-type=application/gzip
*.mov        = svn:mime-type=video/quicktime
*.mp3        = svn:mime-type=audio/mpeg
*.ppt        = svn:mime-type=application/vnd.ms-powerpoint
*.ps         = svn:mime-type=application/postscript
*.psd        = svn:mime-type=application/photoshop
*.rtf        = svn:mime-type=text/rtf
*.swf        = svn:mime-type=application/x-shockwave-flash
*.tgz        = svn:mime-type=application/gzip
*.wav        = svn:mime-type=audio/wav
*.xls        = svn:mime-type=application/vnd.ms-excel
*.zip        = svn:mime-type=application/zip

# Text formats
.htaccess    = svn:mime-type=text/plain
*.css        = svn:mime-type=text/css
*.dtd        = svn:mime-type=text/xml
*.html       = svn:mime-type=text/html
*.ini        = svn:mime-type=text/plain
*.sql        = svn:mime-type=text/x-sql
*.txt        = svn:mime-type=text/plain
*.xhtml      = svn:mime-type=text/xhtml+xml
*.xml        = svn:mime-type=text/xml
*.xsd        = svn:mime-type=text/xml
*.xsl        = svn:mime-type=text/xml
*.xslt       = svn:mime-type=text/xml
*.xul        = svn:mime-type=text/xul
*.yml        = svn:mime-type=text/plain
CHANGES      = svn:mime-type=text/plain
COPYING      = svn:mime-type=text/plain
INSTALL      = svn:mime-type=text/plain
Makefile*    = svn:mime-type=text/plain
README       = svn:mime-type=text/plain
TODO         = svn:mime-type=text/plain


From dmessina at wustl.edu  Thu Jun 28 11:11:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 10:11:23 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.46242.942184.758493@almost.alerce.com>
	<4683B593.3050108@sendu.me.uk>
Message-ID: <F55A8B8A-B7B8-4354-85B7-E459B3679E41@wustl.edu>

> [Sendu]
>
> Yes. If you don't want everything, how does one 'browse' the  
> repository
> to find out the address of the thing you /do/ want?

svn ls file:///home/hartzell/bioperl

(while logged into dev.open-bio.org) or

svn ls svn+ssh://dev.open-bio.org/home/hartzell/bioperl
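
Putting the two steps together, a small sketch (helper names are made
up; it assumes each project keeps the trunk/tags layout seen in this
thread) that lists the top level and then checks out just one
project's trunk instead of the whole repository:

```shell
#!/bin/sh
REPO=svn+ssh://dev.open-bio.org/home/hartzell/bioperl

# Build the trunk URL for one project, e.g. bioperl-live.
trunk_url() {
    printf '%s/%s/trunk\n' "$REPO" "$1"
}

# List the top-level projects, then check out only one project's trunk
# into a directory named after the project.
checkout_trunk() {
    svn ls "$REPO" &&
    svn co "$(trunk_url "$1")" "$1"
}

# Example: checkout_trunk bioperl-live
```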

From n.haigh at sheffield.ac.uk  Thu Jun 28 11:13:58 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 16:13:58 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>	<18051.46242.942184.758493@almost.alerce.com>
	<4683B593.3050108@sendu.me.uk>
Message-ID: <4683D036.5060109@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> George Hartzell wrote:
>> Sendu Bala writes:
>>> I just tried:
>>>
>>> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
> [snip]
>> It looks like you tried to check out the *entire* repository.
> 
> Yes. If you don't want everything, how does one 'browse' the repository
> to find out the address of the thing you /do/ want?
> 

You could try:
svn ls

or

svn ls -R

to get a list of directories.

> 
>> It never occured to me to try that.  I'll take a look at what you 
>> reported.
> 
> Cheers.
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg9A2czuW2jkwy2gRAgirAKCnMAg6a7W7RM22O2rOi4vD5w3HPwCePsku
akLhIszoQbRc/aVX3d/Jp7w=
=mlHY
-----END PGP SIGNATURE-----

From cjfields at uiuc.edu  Thu Jun 28 11:20:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 10:20:46 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683C385.3050904@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
Message-ID: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu>

I can replicate the same problem (Mac OS X) with a full checkout:

svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
svn: Can't move source to dest
svn: Can't move
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base'
to
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base':
No such file or directory

What local (mac) svn version are you using?  I'm running off macports:

svn --version
svn, version 1.4.4 (r25188)
    compiled Jun 16 2007, 23:40:53

chris

On Jun 28, 2007, at 9:19 AM, Sendu Bala wrote:
...

> I tried again in the same location and it told me I had to 'svn
> cleanup', which I did. But subsequently it kept complaining about  
> files
> already being there.
>>
> [snip]
>> Can another mac os x user out there give the Great Big Checkout a try
>> and see if it runs to completion.  Potential problems that come to
>> mind are:
>>
>>   - the "mac's are case insensitive, sort of" problem
>>   - you filled up your disk
>>   - something else.
>
> Well, I didn't run out of disc space. After a rm -fr * and trying  
> again
> it failed at exactly the same point, in the same way.
>
> svn co
> svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
>
> causes this repeatable problem:
>
> [...]
> A    data/phredfile.phd
> svn: In directory 'data'
> svn: Can't move source to dest
> svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to
> 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or  
> directory
>
> That is with Mac OS X svn command-line client, version 1.4.4
>
> I can get bioperl-live/tags/release-0-9-2/t/data to check out fine  
> with
> a linux svn command-line client, version 1.2.3.
>
>
> Cheers,
> Sendu.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Jun 28 11:37:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 10:37:27 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683624F.6020402@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
Message-ID: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>

On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> ...
>
> The short and sweet version: my proposal has all the benefits of  
> yours, but none of the disadvantages. What's not to like?

The short and sweet version: I'm more convinced after you laid out  
your argument in detail, which would have saved me some typing last  
night, BTW, thanks! ; >

The other core devs need to chip in and we need to openly (candidly)  
discuss it some more (I've added Hilmar to this).  There is also a  
tenable solution that allows both aspects ('cliques' and single mode)  
which might make everybody happy.

Let's say we only want to install Bio::SeqIO::genbank.  The
Bio::SeqIO::genbank Build.PL would only install what was needed (as
you indicated), only Bio::SeqIO::genbank-related tests would run
(along with dependency tests, if available), and life would go on.
However, what if we wanted to install everything in
SeqIO/DB/AlignIO/etc?

We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO
modules installed or a select few (maybe a quick 'install all (y/n)?'
followed by a list, which installs them one at a time along with
dependencies), or have the option to specifically denote them as
passed args to SeqIO's Build.PL, something like
'perl Build.PL -install-plugins genbank embl swiss',
'perl Build.PL -install-plugins all', etc.  If a specific module
(Bio::SeqIO::genbank) is installed directly then maybe the
installation q&a's of subsequent modules could be bypassed when
installing down the dependency tree with additional passed args.

This would, in effect, be a bioperl-specific mini-CPAN within CPAN.   
Nice!

Now, this doesn't address several related issues, such as how we  
handle versioning of the independent modules (should be in a  
controlled manner), what we do about deprecated modules which linger  
about on CPAN, how we deal with PPMs/RPMs/packaging, and so on.  All  
have possible reasonable ways they can be addressed, I believe.   
Also, I think we should still think about doing regular full-scale  
'stable' (1.#) releases (sort of our stamp of approval for that batch  
of modules at that point in time, with a reasonable 'sell-by' date).

Again, it should be seriously discussed among the core devs and the  
bioperl community at large prior to any serious work on it, and it  
would be quite a large-scale project, but possibly worth it.  It can  
only go forward if there is enough momentum behind it.

>> Finally, all of this should wait until later.  Much later, like  
>> after  a decent release, after svn, etc kind of 'later'.  I think  
>> we can  agree on that.
>
> Hmm, not really. If it can be implemented by a change in just  
> Build.PL and ModuleBuildBioperl, it's really independent of
> everything else. That's the beauty of it: the only thing that  
> changes is how things are uploaded to and downloaded from CPAN. The  
> only person that normally deals with that issue is the pumpkin for  
> a release, and he only cares about it at release time.
>
> In fact, if we're going to do it at all it makes sense to try it  
> out on a minor release like 1.5.3. We've already got experience of  
> doing it split-style from 1.5.2. (And let me tell you: splits at  
> the code-base level suck.)

BOSC is coming up, and I would like to focus on getting svn migration  
taken care of ASAP (which is sounding more and more like we plan on  
moving all open-bio over, unless I misread Jason's post?) and  
stomping of bugs (my next priority after EUtilities).  Maybe in the  
interim we should try focusing on bug squashing, get out a quick  
standard dev release (1.5.3) before BOSC, and then a few of us could  
all communicate there via email/text/IM/phone off-list?  Maybe post  
updates via the bioperl blog and list?

> And where is the harm in letting them do it via CPAN as well? In  
> fact, there are significant benefits:
...

I'm already pretty convinced...

> The same can be achieved with CPAN bundles for each kind of  
> functional grouping you can think of. And since it's just a single
> text file that defines such a grouping, it's easy to change or add
> new ones as you feel like it, as opposed to the rather more  
> permanent and substantial effort of creating one of your splits on  
> the code-base level.

... or it could be run right in Module::Build for specific parent  
classes (as I mention above).  Bundling could be instituted for  
something like a standard GBrowse release (Bundle::BioPerl::GBrowse)  
where the functionality might be more spread out (Bio::DB*,  
Bio::Graphics, Bio::FeatureIO, etc).  For a full-scale old-style core  
install, another Bundle (Bundle::BioPerl::Standard).

...

> Yes, it would be automated, and no, it wouldn't at all be any kind  
> of additional headache. I'm proposing a fully-automated system that  
> the pumpkin wouldn't even have to think about. Much /less/ of a
> headache than dealing with splits. Orders of magnitude easier to  
> deal with.

The 'headache' would be the initial setup (splitting test, individual  
Build.PL, etc), but this could be done stepwise or section-wise, I  
suppose.
...

> And the smallest, most concentrated set of modules is the  
> individual module.

Well, only if it runs correctly (i.e. has the entire dep. tree  
installed).  But the 'follow' tests would handle that.

> The reason some of these existing splits (microarray, ext) have
> fallen by the way-side? /Because/ they're splits. If they had been  
> part of bioperl-live all along, they'd have been kept in a working,  
> compatible state and would have been released along with everything  
> else in 1.5.2

Microarray fell out of favor for other reasons (there are much faster
ways to do the same thing via R), though I think it still could be
salvaged if someone wanted to take it up.

The other bioperl distros (network, db, run, etc) would also
have to follow the same path as core, but I guess they could
be bundled as well.

> ...
> No headaches.

I already have one, sorry!

chris

From n.haigh at sheffield.ac.uk  Thu Jun 28 11:53:52 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 16:53:52 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
Message-ID: <4683D990.8090909@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote:
> 
>> Chris Fields wrote:
>>> ...
>>
>> The short and sweet version: my proposal has all the benefits of
>> yours, but none of the disadvantages. What's not to like?
> 
> The short and sweet version: I'm more convinced after you laid out your
> argument in detail, which would have saved me some typing last night,
> BTW, thanks! ; >
> 
> The other core devs need to chip in and we need to openly (candidly)
> discuss it some more (I've added Hilmar to this).  There is also a
> tenable solution that allows both aspects ('cliques' and single mode)
> which might make everybody happy.

Couldn't "cliques" simply be satisfied with CPAN Bundles?

> 
> Let's say we only want to install Bio::SeqIO::genbank.  The
> Bio::SeqIO::genbank Build.PL would only install what was needed (as you
> indicated), only Bio::SeqIO::genbank-related tests would run (along with
> dependency test, if available), and life would go on.  However, what if
> we wanted to install everything in SeqIO/DB/AlignIO/etc?

I think this might be where Bundles come in for installing these
"cliques" of related modules?

- -- snip --

> 
>> Yes, it would be automated, and no, it wouldn't at all be any kind of
>> additional headache. I'm proposing a fully-automated system that the
>> pumpkin wouldn't even have to think about it. Much /less/ of a
>> headache than dealing with splits. Orders of magnitude easier to deal
>> with.
> 
> The 'headache' would be the initial setup (splitting test, individual
> Build.PL, etc), but this could be done stepwise or section-wise, I suppose.

Yes, I think this is where most of the labour will be. However, setting
the test suite up like this would be beneficial with or without
publishing modules individually.

- -- snip --
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg9mQczuW2jkwy2gRAlfBAKCFP7XUvWXsjycSv0MVGN3Ru40D/wCcDiDg
UKE/Q/wA3gu1Gb7S6rarCQw=
=WQdY
-----END PGP SIGNATURE-----

From bix at sendu.me.uk  Thu Jun 28 12:03:54 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 17:03:54 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
Message-ID: <4683DBEA.90005@sendu.me.uk>

Chris Fields wrote:
> On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote:
> Let's say we only want to install Bio::SeqIO::genbank.  The 
> Bio::SeqIO::genbank Build.PL would only install what was needed (as you 
> indicated), only Bio::SeqIO::genbank-related tests would run (along with 
> dependency test, if available), and life would go on.  However, what if 
> we wanted to install everything in SeqIO/DB/AlignIO/etc?
> 
> We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO 
> modules installed or a select few (maybe a quick 'install all (y/n)?' 
> followed by a list, which installs them one at a time along with 
> dependencies), or have the option to specifically denote them as passed 
> args to SeqIO's Build.PL, something like 'perl Build.PL -install-plugins 
> genbank embl swiss', 'perl Build.PL -install-plugins all', etc.  If a 
> specific module (Bio::SeqIO::genbank) is installed directly then maybe 
> the installation q&a's of followed modules could be bypassed when 
> installing down the dependency tree with additional passed args.

I'd probably stay away from something like this. My primary reason 
being, off-the-top-of-my-head I don't see how to get it to work. If 
you're installing Bio::SeqIO for the first time via CPAN you can't ask 
it to install Bio::SeqIO::genbank et al. at the same time because 
Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some circularity.

I also wouldn't want these things to be complicated. There should be 
little in the way of questions to ask during install. Each module's 
Build.PL should be ultra-simple with no advanced logic at all. It should 
just specify things that are absolute requirements. This simplicity 
helps avoid some of the problems we face by distributing the monolithic 
Bioperl.

No, much better for us and for users to provide a Bundle::Bio-SeqIO.


> Now, this doesn't address several related issues, such as how we handle 
> versioning of the independent modules (should be in a controlled 
> manner),

When a module is changed, it gets a version bump. Nothing complicated 
needs to be done. Transparent and obvious, behaving like all other CPAN 
modules would be my choice.


> what we do about deprecated modules which linger about on CPAN,

Deleting them from CPAN seems appropriate.


> how we deal with PPMs/RPMs/packaging, and so on.  All have possible 
> reasonable ways they can be addressed, I believe.  Also, I think we 
> should still think about doing regular full-scale 'stable' (1.#) 
> releases (sort of our stamp of approval for that batch of modules at 
> that point in time, with a reasonable 'sell-by' date).

Yes, we can still choose to take a snapshot and announce it to the 
world, but at the module-level nothing special would happen. There would 
just be an updated Bundle::Bioperl-everything (or whatever).


> Again, it should be seriously discussed among the core devs and the 
> bioperl community at large prior to any serious work on it, and it would 
> be quite a large-scale project, but possibly worth it.  It can only go 
> forward if there is enough momentum behind it.

The requirement for this approach is per-module test scripts, which, as I
identified already, are very desirable anyway so we can hit 100% test
coverage.

So, regardless of anything else can we all agree that per-module test 
scripts are a good idea and should be worked on? If so, I'll look into 
the feasibility and figure out how much work will be involved.

From cjfields at uiuc.edu  Thu Jun 28 13:17:50 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 12:17:50 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683DBEA.90005@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
Message-ID: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>


On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote:

> ...
> I'd probably stay away from something like this. My primary reason  
> being, off-the-top-of-my-head I don't see how to get it to work. If  
> you're installing Bio::SeqIO for the first time via CPAN you can't  
> ask it to install Bio::SeqIO::genbank et al. at the same time  
> because Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some  
> circularity.

True...

> I also wouldn't want these things to be complicated. There should  
> be little in the way of questions to ask during install. Each  
> module's Build.PL should be ultra-simple with no advanced logic at  
> all. It should just specify things that are absolute requirements.  
> This simplicity helps avoid some of the problems we face by  
> distributing the monolithic Bioperl.
>
> No, much better for us and for users to provide a Bundle::Bio-SeqIO.

I just don't want too much Bundle-itis, as it'll get confusing for
newbies (e.g. Vista-itis, or AdobeCS-itis).  It should be limited to
functional grouping (SeqIO, AlignIO, DB, etc), 'install everything',  
or distribution-specific (GBrowse).

I also think (though Hilmar may veto this) that we should work on  
integrating bioperl-db, network, etc. into this if it goes forward.

Here's a question: how do we plan on handling uploading bioperl  
updates to CPAN via PAUSE?  Do we want to run every single module  
through one pumpkin?  Or do we want to have a core dev group PAUSE  
account?  I can see, for instance, removing everything
EUtilities-related and submitting it independently using my own PAUSE
account, but it would be nice to have it under an umbrella
'bioperl-devs' account instead.

>> Now, this doesn't address several related issues, such as how we  
>> handle versioning of the independent modules (should be in a  
>> controlled manner),
>
> When a module is changed, it gets a version bump. Nothing  
> complicated needs to be done. Transparent and obvious, behaving  
> like all other CPAN modules would be my choice.
>
>> what we do about deprecated modules which linger about on CPAN,
>
> Delete them from CPAN seems appropriate.

I know you can do that via PAUSE, but I think it lingers about on  
search.cpan.org (unless that's been fixed).  This would prob. have to  
be used sparingly.

>> how we deal with PPMs/RPMs/packaging, and so on.  All have  
>> possible reasonable ways they can be addressed, I believe.  Also,  
>> I think we should still think about doing regular full-scale  
>> 'stable' (1.#) releases (sort of our stamp of approval for that  
>> batch of modules at that point in time, with a reasonable
>> 'sell-by' date).
>
> Yes, we can still choose to take a snapshot and announce it to the  
> world, but at the module-level nothing special would happen. There  
> would just be an updated Bundle::Bioperl-everything (or whatever).

Right, it would basically be a stamp of certification.

>> Again, it should be seriously discussed among the core devs and  
>> the bioperl community at large prior to any serious work on it,  
>> and it would be quite a large-scale project, but possibly worth  
>> it.  It can only go forward if there is enough momentum behind it.
>
> The requirement for this approach is per-module test scripts. Which  
> as I identified already, is very desirable anyway so we can hit  
> 100% test coverage.
>
> So, regardless of anything else can we all agree that per-module  
> test scripts are a good idea and should be worked on? If so, I'll  
> look into the feasibility and figure out how much work will be  
> involved.

I think so, but the feasibility issue is critical.  Do we want cvs/svn
to be divided up into 900 subdirectories (one for each module),
or do we want to have a similar directory structure as we have now,
but with each module in its own directory?  Or leave everything as
is and generate Build.PL on-the-fly (prob. least feasible)?

This is where it might be wise to do it piece-meal at first (maybe  
starting with something somewhat segregated like Bio::Tools), then  
progress from there.

chris


From hartzell at alerce.com  Thu Jun 28 13:38:48 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 13:38:48 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
	<18051.44281.831316.749586@almost.alerce.com>
	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
Message-ID: <18051.61992.627473.323346@almost.alerce.com>

David Messina writes:
 > > [George]
 > > Likewise, you probably DON'T want to use this in your config file:
 > >
 > > 	  enable-auto-props = yes
 > > 	  * = svn:keywords="Author Date Id Rev URL"
 > >
 > > since it'll do the same thing.
 > 
 > Ah, so I've been doing it wrong all along then. :) Thanks, George!

It's not *wrong* if it's never done anything to you that you've
regretted.  The right answer depends on your situation....

 > [...]
 > I've googled around and gathered the following as a possible list for  
 > our repo. Since I obviously don't know what I'm doing :), of course  
 > adjust and refine as necessary.
 > 

That's a great starting point.  Do you have write access to the wiki?
Could you link it off of the instructions for using svn?

g.

From hartzell at alerce.com  Thu Jun 28 14:06:50 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 14:06:50 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683C385.3050904@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
Message-ID: <18051.63674.685297.426813@almost.alerce.com>

Sendu Bala writes:
 > [...]
 > I tried again in the same location and it told me I had to 'svn 
 > cleanup', which I did. But subsequently it kept complaining about files 
 > already being there.

You need to do the cleanup because svn exited gracelessly and you
needed to help it get back on its feet.  The cleanup doesn't remove
the stuff that you did get checked out, so it's still there getting in
the way of your new checkout.
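
One way to sidestep the leftovers, as a minimal sketch: move the partial
working copy aside and check out into a clean directory.  The function
name fresh_checkout and the ".partial" suffix are made up for
illustration; "svn checkout URL PATH" is the only real svn call used.

```shell
#!/bin/sh
# Hypothetical helper (not part of svn): retry a failed checkout in a
# clean directory.
# Usage: fresh_checkout URL DEST
fresh_checkout() {
    url=$1
    dest=$2
    if [ -e "$dest" ]; then
        # Keep the partial working copy around rather than deleting it.
        mv "$dest" "$dest.partial.$$" || return 1
    fi
    svn checkout "$url" "$dest"
}
```

For the case above that would be something like
"fresh_checkout svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data data".
It only gets the old files out of the way; it won't help if the
checkout itself keeps failing.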

 > [...]
 > svn co 
 > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
 > 
 > causes this repeatable problem:
 > 
 > [...]
 > A    data/phredfile.phd
 > svn: In directory 'data'
 > svn: Can't move source to dest
 > svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 
 > 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory
 > 
 > That is with Mac OS X svn command-line client, version 1.4.4
 > 
 > I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with 
 > a linux svn command-line client, version 1.2.3.

I'm not 100% sure what's going on here, but I'm inclined to say "get a
real computer" (and yes, I'm typing this on a mac...).  I have a mac
pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony
the tiger used to say)....

I think that we're having trouble with case sensitivity.  My only
evidence is that I can see where there have been both HUMBETGLOA.FASTA
and HUMBETGLOA.fasta in the tree at various times.  I can't figure out
anything else that's weird about that file.  On the other hand, I
can't see how this would cause the error you're seeing.

The experiment would be to grab a usb or firewire disk (or even a
memory stick), partition/format it as case sensitive (or even *unix*)
and try to do

 svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data

into it.  If it works, voila.  If not, I'll keep making stuff up, err,
thinking about it.
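
Before reformatting anything, a quick probe can tell you whether a given
directory sits on a case-insensitive filesystem.  This is only a sketch:
the function name is_case_insensitive is hypothetical, and it just
creates a file under one spelling and looks for it under another.

```shell
#!/bin/sh
# Hypothetical probe (not part of svn or bioperl).
# Returns 0 if DIR is case-insensitive, 1 if case-sensitive,
# 2 if the probe directory could not be created.
is_case_insensitive() {
    dir=${1:-.}
    probe=$(mktemp -d "$dir/caseprobe.XXXXXX") || return 2
    : > "$probe/HUMBETGLOA.FASTA"
    if [ -e "$probe/HUMBETGLOA.fasta" ]; then
        result=0    # the lowercase spelling finds the uppercase file
    else
        result=1
    fi
    rm -rf "$probe"
    return "$result"
}

if is_case_insensitive "${TMPDIR:-/tmp}"; then
    echo "case-insensitive"
else
    echo "case-sensitive"
fi
```

Point it at the directory you plan to check out into rather than at the
temp dir, since different volumes on the same machine can differ.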

g.

From dmessina at wustl.edu  Thu Jun 28 14:15:32 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 13:15:32 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu>
Message-ID: <459D9BC0-4FBA-4560-80A8-E6243DE9D9CC@wustl.edu>

Same svn error here on the full checkout.


> What local (mac) svn version are you using?  I'm running off macports:
>
> svn --version
> svn, version 1.4.4 (r25188)
>     compiled Jun 16 2007, 23:40:53

I have svn 1.4.3.

% svn --version
svn, version 1.4.3 (r23084)
    compiled Apr  1 2007, 02:47:14

Copyright (C) 2000-2006 CollabNet.
Subversion is open source software, see http://subversion.tigris.org/
This product includes software developed by CollabNet
(http://www.Collab.Net/).

The following repository access (RA) modules are available:

* ra_dav : Module for accessing a repository via WebDAV (DeltaV) protocol.
   - handles 'http' scheme
* ra_svn : Module for accessing a repository using the svn network protocol.
   - handles 'svn' scheme
* ra_local : Module for accessing a repository on local disk.
   - handles 'file' scheme


From cjfields at uiuc.edu  Thu Jun 28 14:54:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 13:54:15 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18051.63674.685297.426813@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
Message-ID: <D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>


On Jun 28, 2007, at 1:06 PM, George Hartzell wrote:

> ...
> I'm not 100% sure what's going on here, but I'm inclined to say "get a
> real computer" (and yes, I'm typing this on a mac...).  I have a mac
> pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony
> the tiger used to say)....

Ouch!  Though it could be worse (**coughwindowscough**).

> I think that we're having trouble with case sensitivity.  My only
> evidence is that I can see where there have been both HUMBETGLOA.FASTA
> and HUMBETGLOA.fasta in the tree at various times.  I can't figure out
> anything else that's weird about that file.  On the other hand, I
> can't see how this would cause the error you're seeing though.

Odd that other branches (including the main trunk) work but that one  
doesn't.

> The experiment would be to grab a usb or firewire disk (or even a
> memory stick), partition/format it as case sensitive (or even *unix*)
> and try to do
>
>  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
>
> into it.  If it works, voila.  If not, I'll keep making stuff up, err,
> thinking about it.
>
> g.

I'll have to figure out why I can't get ssh keys to work locally to  
test it out more (I have a usb drive to test with); just don't have  
time at the moment.

chris


From dmessina at wustl.edu  Thu Jun 28 14:47:04 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 13:47:04 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18051.61992.627473.323346@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
	<18051.44281.831316.749586@almost.alerce.com>
	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
Message-ID: <0027C4E0-26B1-41F3-8FD8-EAB5465CA80E@wustl.edu>

> That's a great starting point.  Do you have write access to the wiki?
> Could you link it off of the instructions for using svn?

Done.

http://www.bioperl.org/wiki/Svn_auto-props

linked from:
http://www.bioperl.org/wiki/Using_Subversion (bottom of page)


From bix at sendu.me.uk  Thu Jun 28 15:19:35 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 20:19:35 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
Message-ID: <468409C7.7020102@sendu.me.uk>

Chris Fields wrote:
> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote:
> Here's a question: how do we plan on handling uploading bioperl  
> updates to CPAN via PAUSE?  Do we want to run every single module  
> through one pumpkin?  Or do we want to have a core dev group PAUSE  
> account?  I can see, for instance, removing everything
> EUtilities-related and submitting it independently using my own PAUSE
> account, but it would be nice to have it under an umbrella
> 'bioperl-devs' account instead.

All Bioperl modules (except the Bundle!) are owned by BIOPERLML on 
PAUSE. It's a little awkward since PAUSE is uploader-centric, but see my
notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release

And certainly, everything that wants to consider itself part of Bioperl
(and gain the benefit of lots of devs looking after it) should
have BIOPERLML as the primary owner.


> I think so, but the feasibility issue is critical.  Do we want cvs/svn
> to be divided up into 900 subdirectories (one for each module),
> or do we want to have a similar directory structure as we have now,
> but with each module in its own directory?  Or leave everything as
> is and generate Build.PL on-the-fly (prob. least feasible)?

Very definitely the latter. The key benefit of my approach is that the 
organisation stays as is and that a snapshot of the repository remains a 
single directory of modules in Bio so that people don't have to 
'install' Bioperl, they can still just uncompress the archive (or check 
out the package from svn) and point their PERL5LIB to the root dir of 
the package.
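
For anyone who hasn't used it that way, here is a minimal sketch of the
no-install setup.  The path in BIOPERL_ROOT is an assumption; use
wherever you unpacked the archive or checked out the package.

```shell
#!/bin/sh
# Assumed path to an unpacked archive or svn checkout; adjust to taste.
BIOPERL_ROOT="$HOME/src/bioperl-live"

# Prepend the package root, keeping any existing PERL5LIB entries.
PERL5LIB="$BIOPERL_ROOT${PERL5LIB:+:$PERL5LIB}"
export PERL5LIB

# Only try loading a module if the snapshot is really there.
if [ -d "$BIOPERL_ROOT/Bio" ]; then
    perl -MBio::SeqIO -e 'print "Bio::SeqIO loaded\n"'
fi
```

The root dir is the one that contains the Bio directory, not Bio
itself, because perl resolves Bio::SeqIO to Bio/SeqIO.pm relative to
each PERL5LIB entry.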

For that reason I very much like the idea of folding the current 
split-out packages (run, network etc.) back into the core package so 
everything is in one place. Folding them back in should obviously wait
until everything is in place and working with core already.


My proposal obviously wasn't very clear. As far as all other devs are 
concerned, nothing changes at all (except for lots of new improved test 
scripts). The pumpkin will, however, be able to say:

./Build dist

Right now that generates the distribution archives (in different 
compression formats) - one big archive containing everything.
My proposal is simply that instead it generates lots of archives, one 
archive per module. It will also generate some Bundles and whatever else 
might be needed.
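
Purely as an illustration of the splitting step (this is not what
ModuleBuildBioperl does today, and a real per-module archive would also
need its own Build.PL, tests and version), the idea is roughly the
following.  The toy tree and the Bio-SeqIO-genbank naming are
assumptions made up for the sketch.

```shell
#!/bin/sh
# Toy stand-in for a repository snapshot: two empty module files.
work=$(mktemp -d) || exit 1
mkdir -p "$work/src/Bio/SeqIO" "$work/dist"
: > "$work/src/Bio/SeqIO.pm"
: > "$work/src/Bio/SeqIO/genbank.pm"

# One archive per module: Bio/SeqIO/genbank.pm -> Bio-SeqIO-genbank.tar.gz
(
    cd "$work/src" || exit 1
    find Bio -name '*.pm' | while read -r pm; do
        # Strip the .pm suffix and turn path separators into dashes.
        dist=$(printf '%s\n' "${pm%.pm}" | tr '/' '-')
        tar -czf "$work/dist/$dist.tar.gz" "$pm"
    done
)

ls "$work/dist"
```

The source tree is left untouched; only what lands in the dist
directory changes.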

I don't envisage any major difficulties in achieving this. The 
'feasibility' issue I was going to look into was strictly regarding 
doing all the new test scripts.

From hartzell at alerce.com  Thu Jun 28 15:43:38 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 15:43:38 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
Message-ID: <18052.3946.224905.415905@almost.alerce.com>

Chris Fields writes:
 > 
 > On Jun 28, 2007, at 1:06 PM, George Hartzell wrote:
 > 
 > > ...
 > > I'm not 100% sure what's going on here, but I'm inclined to say "get a
 > > real computer" (and yes, I'm typing this on a mac...).  I have a mac
 > > pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony
 > > the tiger used to say)....
 > 
 > Ouch!  Though it could be worse (**coughwindowscough**).
 > 
 > > I think that we're having trouble with case sensitivity.  My only
 > > evidence is that I can see where there have been both HUMBETGLOA.FASTA
 > > and HUMBETGLOA.fasta in the tree at various times.  I can't figure out
 > > anything else that's weird about that file.  On the other hand, I
 > > can't see how this would cause the error you're seeing though.
 > 
 > Odd that other branches (including the main trunk) work but that one  
 > doesn't.
 > 
 > > The experiment would be to grab a usb or firewire disk (or even a
 > > memory stick), partition/format it as case sensitive (or even *unix*)
 > > and try to do
 > >
 > >  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
 > >
 > > into it.  If it works, voila.  If not, I'll keep making stuff up, err,
 > > thinking about it.
 > >
 > > g.
 > 
 > I'll have to figure out why I can't get ssh keys to work locally to  
 > test it out more (I have a usb drive to test with); just don't have  
 > time at the moment.

I just did the experiment, and filename case-insensitivity seems to be
breaking something.

I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.

I reformatted a memory stick to be case sensitive and co of

  bioperl/bioperl-live/tags/release-0-9-2/t 

worked.  Then I made a directory in my home dir (the normal,
case-insensitive mac filesystem), tried the same checkout there, and
got the same error as above.

I can get a copy of the trunk, so I'm inclined to ask someone to
mention the problem on the wiki and then just ignore it.

g.

From cjfields at uiuc.edu  Thu Jun 28 16:29:09 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 15:29:09 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <468409C7.7020102@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
Message-ID: <026156F4-4C46-4CC6-82B5-07FC5326A244@uiuc.edu>


On Jun 28, 2007, at 2:19 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote:
>> Here's a question: how do we plan on handling uploading bioperl
>> updates to CPAN via PAUSE?  Do we want to run every single module
>> through one pumpkin?  Or do we want to have a core dev group PAUSE
>> account?  I can see, for instance, removing everything EUtilities-
>> related and submitting it independently using my own PAUSE account,
>> but it would be nice to have it under an umbrella 'bioperl-devs'
>> account instead.
>
> All Bioperl modules (except the Bundle!) are owned by BIOPERLML on
> PAUSE. It's a little awkward since PAUSE is uploader-centric, but see my
> notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release
>
> And certainly, everything that wants to consider itself part of Bioperl
> (and gain the benefit of lots of devs looking after it) should have
> BIOPERLML as the primary owner.

Alrighty then.

>> I think so, but the feasibility issue is critical.  Do we want cvs/
>> svn to be divided up into 900 subdirectories (one for each module),
>> or do we want to have a similar directory structure as we have now,
>> but with each module in its own directory?  Or leave everything as
>> is and generate Build.PL on-the-fly (prob. least feasible)?
>
> Very definitely the latter. The key benefit of my approach is that the
> organisation stays as is and that a snapshot of the repository remains a
> single directory of modules in Bio so that people don't have to
> 'install' Bioperl, they can still just uncompress the archive (or check
> out the package from svn) and point their PERL5LIB to the root dir of
> the package.

Okay, makes sense.

> For that reason I very much like the idea of folding the current
> split-out packages (run, network etc.) back into the core package so
> everything is one place. Folding them back in should obviously wait
> until everything is in place and working with core already.

I agree, but that's up to Brian, Hilmar, and the others who donated  
the packages (or at least a consensus of core devs).  One thing at a  
time.

> My proposal obviously wasn't very clear. As far as all other devs are
> concerned, nothing changes at all (except for lots of new improved test
> scripts). The pumpkin will, however, be able to say:
>
> ./Build dist
>
> Right now that generates the distribution archives (in different
> compression formats) - one big archive containing everything.
> My proposal is simply that instead it generates lots of archives, one
> archive per module. It will also generate some Bundles and whatever else
> might be needed.

We'll need to define which tests and data go with each module and
so on.

> I don't envisage any major difficulties in achieving this. The
> 'feasibility' issue I was going to look into was strictly regarding
> doing all the new test scripts.

Okay.  Maybe it's worth doing on a branch as a test run when 1.5.3
is ready to go.  We'll still need to get thoughts on this from other
core devs out there, and it prob. should wait until everybody is
comfortable with the idea.

chris


From dmessina at wustl.edu  Thu Jun 28 18:13:48 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 17:13:48 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
Message-ID: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>

Coming late to this party, I'm replying to snippets from multiple  
emails.


> [Chris]
> what we do about deprecated modules which linger
> about on CPAN

> [Sendu]
> Delete them from CPAN seems appropriate.

I coulda sworn this was frowned upon, but a recent thread suggests  
it's totally kosher.

	http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html


> [Sendu]
> So, regardless of anything else can we all agree that per-module test
> scripts are a good idea and should be worked on?

I agree.


> [Sendu]
> people don't have to
> 'install' Bioperl, they can still just uncompress the archive (or  
> check
> out the package from svn) and point their PERL5LIB to the root dir of
> the package.

Could you elaborate a bit on how this works? How is XS code that  
needs compiling handled? Or the scripts directory? I would love to be  
able to do this.


> [Sendu]
> For that reason I very much like the idea of folding the current
> split-out packages (run, network etc.) back into the core package so
> everything is one place. Folding them back in should obviously wait
> until everything is in place and working with core already.

 From an organizational standpoint, I'm concerned that with ~900  
modules in core right now, adding all of the additional stuff from  
the split-out packages would make for a daunting directory.

But as you said, this is way down the road, so this proposal doesn't  
bear on the other, closer-to-now issues on the table.


> [Chris]
> Okay.  Maybe it's worth doing on a branch  as a test run when 1.5.3
> is ready to go.  We'll still need to get thoughts on this from other
> core devs out there, and it prob. should wait until everybody is
> comfortable with the idea.

If we go forward with the CPAN split plan, I like the idea of having  
a trial. We can foresee some of the issues that such a change may  
bring, and yet still more no doubt wait for us once we do it.


Dave

From bix at sendu.me.uk  Thu Jun 28 18:59:35 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 23:59:35 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
Message-ID: <46843D57.2080409@sendu.me.uk>

David Messina wrote:
>> people don't have to 'install' Bioperl, they can still just
>> uncompress the archive (or check out the package from svn) and
>> point their PERL5LIB to the root dir of the package.
> 
> Could you elaborate a bit on how this works? How is XS code that 
> needs compiling handled? Or the scripts directory? I would love to be
> able to do this.

I meant for the most part. Core doesn't have any XS code so that's not 
an issue. Scripts can be run manually like any other perl script. When 
you discover something isn't working because of a missing external 
dependency, you just install it. (But that happens very rarely.)

Personally I've /never/ installed Bioperl and used that installed set of 
modules. I've always just pointed my PERL5LIB at the distribution folder 
or my cvs checkout.
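
As a minimal sketch of that setup (the directory below is a
hypothetical location, so substitute wherever the archive was unpacked
or the checkout lives):

```shell
# Hypothetical location of the unpacked archive or cvs/svn checkout.
BIOPERL_DIR="$HOME/src/bioperl-live"

# Prepend it to PERL5LIB, keeping any entries that are already there.
export PERL5LIB="$BIOPERL_DIR${PERL5LIB:+:$PERL5LIB}"

# Show the first directory perl will now search for modules.
echo "${PERL5LIB%%:*}"
```

After that, something like "perl -MBio::Seq -e 1" should exit quietly
if the modules are being picked up from that directory.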

Which makes me a strange candidate for advocating all these 
CPAN-specific changes, but there you go ;)

From cjfields at uiuc.edu  Thu Jun 28 19:03:02 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 18:03:02 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
Message-ID: <8B6FBB52-5CCE-4122-876C-B9827C86E46E@uiuc.edu>


On Jun 28, 2007, at 5:13 PM, David Messina wrote:

> Coming late to this party, I'm replying to snippets from multiple  
> emails.
>
>
>> [Chris]
>> what we do about deprecated modules which linger
>> about on CPAN
>
>> [Sendu]
>> Delete them from CPAN seems appropriate.
>
> I coulda sworn this was frowned upon, but a recent thread suggests  
> it's totally kosher.
>
> 	http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html

As long as it doesn't show up somewhere to confuse newbies I'm okay  
with it.

>> [Sendu]
>> people don't have to
>> 'install' Bioperl, they can still just uncompress the archive (or  
>> check
>> out the package from svn) and point their PERL5LIB to the root dir of
>> the package.
>
> Could you elaborate a bit on how this works? How is XS code that  
> needs compiling handled? Or the scripts directory? I would love to  
> be able to do this.

Maybe Sendu can add to this, but the XS code is limited to
bioperl-ext AFAIK.  We could keep that separate until it plays well with
bioperl itself.

Scripts and examples - maybe packaged along with a Bundle?

>> [Sendu]
>> For that reason I very much like the idea of folding the current
>> split-out packages (run, network etc.) back into the core package so
>> everything is one place. Folding them back in should obviously wait
>> until everything is in place and working with core already.
>
> From an organizational standpoint, I'm concerned that with ~900  
> modules in core right now, adding all of the additional stuff from  
> the split-out packages would make for a daunting directory.
>
> But as you said, this is way down the road, so this proposal  
> doesn't bear on the other, closer-to-now issues on the table.

Well, the code in bioperl-db and network complements code in core, so
I agree with Sendu they belong there.  They should be under the same
scrutiny as the rest anyway (code, tests, etc), but won't be bundled
unless there is an 'install everything' Bundle.

>> [Chris]
>> Okay.  Maybe it's worth doing on a branch  as a test run when 1.5.3
>> is ready to go.  We'll still need to get thoughts on this from other
>> core devs out there, and it prob. should wait until everybody is
>> comfortable with the idea.
>
> If we go forward with the CPAN split plan, I like the idea of  
> having a trial. We can foresee some of the issues that such a  
> change may bring, and yet still more no doubt wait for us once we  
> do it.

That's what branches are for; testing stuff out like this.

chris

From hartzell at alerce.com  Thu Jun 28 19:05:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 19:05:32 -0400
Subject: [Bioperl-l] problem with binary files.
Message-ID: <18052.16060.932502.183552@almost.alerce.com>


Ok, after pointing out the problem with setting the svn:keywords
property on binary files, it turns out that I *did* that.  Worse yet,
I set the svn:eol-style to 'native' on everything, including binary
files, so depending on your platform they're likely to be fubar.

For example, bioperl-run/t/data/H_pylori_J99.glimmer2.icm may or may
not be what you expect it to be, depending on whether your eol-style
matches the server's and whether any conversions were done.

I'll touch up the way that the little tool I'm using calls cvs2svn and
redo the repository.
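
For checking a working copy by hand, a rough sketch along these lines
could flag files that should not carry svn:eol-style. The NUL-byte
test is an assumed heuristic, not how cvs2svn decides, and the
function names are made up:

```shell
# Heuristic (an assumption, not cvs2svn's own rule): a file that
# contains at least one NUL byte is treated as binary.
is_binary() {
    total=$(( $(wc -c < "$1") ))
    without_nul=$(( $(tr -d '\000' < "$1" | wc -c) ))
    [ "$total" -ne "$without_nul" ]
}

# Print every file under a directory that looks binary, skipping
# svn's own .svn administrative directories.
list_binary_files() {
    find "$1" -name .svn -prune -o -type f -print |
    while IFS= read -r f; do
        if is_binary "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Each file reported could then be cleaned up in a working copy with:
#   svn propdel svn:eol-style FILE
#   svn propdel svn:keywords FILE
```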

g.

From n.haigh at sheffield.ac.uk  Fri Jun 29 02:59:21 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 29 Jun 2007 07:59:21 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
References: <467949EC.9040100@sendu.me.uk>	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>	<4682C6F5.4020406@sendu.me.uk>
	<4682D12E.3000803@sendu.me.uk>	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>	<4682E824.1050507@sendu.me.uk>	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>	<4683624F.6020402@sendu.me.uk>	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
Message-ID: <4684ADC9.8040404@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

- -- split --
>> [Sendu]
>> For that reason I very much like the idea of folding the current
>> split-out packages (run, network etc.) back into the core package so
>> everything is one place. Folding them back in should obviously wait
>> until everything is in place and working with core already.
> 
>  From an organizational standpoint, I'm concerned that with ~900  
> modules in core right now, adding all of the additional stuff from  
> the split-out packages would make for a daunting directory.
> 
> But as you said, this is way down the road, so this proposal doesn't  
> bear on the other, closer-to-now issues on the table.
> 

I don't think this is an issue - it would simply mean everything is
under the same version control hierarchy. And with svn it's Soooooo much
easier to fiddle around with directory structures

> 
> 
>> [Chris]
>> Okay.  Maybe it's worth doing on a branch  as a test run when 1.5.3
>> is ready to go.  We'll still need to get thoughts on this from other
>> core devs out there, and it prob. should wait until everybody is
>> comfortable with the idea.
> 
> If we go forward with the CPAN split plan, I like the idea of having  
> a trial. We can foresee some of the issues that such a change may  
> bring, and yet still more no doubt wait for us once we do it.
> 

Under svn it would be easy to make an "svn copy" of run, network etc
into a branch of live to test this out. Disk space shouldn't be a
problem either: since we are looking at the bioperl-* packages being
under the same svn repository, "svn copy" operations are cheap.

> 
> Dave
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGhK3JczuW2jkwy2gRAtI2AJ4kNrpGY8XMMh9KxOqs+l0PrEVcwgCfVFj6
BCvltmPyWF4ImueYmd7VFAc=
=ktl+
-----END PGP SIGNATURE-----

From n.haigh at sheffield.ac.uk  Fri Jun 29 03:05:33 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 29 Jun 2007 08:05:33 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
 Perltidy]
In-Reply-To: <18051.61992.627473.323346@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
Message-ID: <4684AF3D.5090907@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George Hartzell wrote:

- -- snip --

>  > [...]
>  > I've googled around and gathered the following as a possible list for  
>  > our repo. Since I obviously don't know what I'm doing :), of course  
>  > adjust and refine as necessary.
>  > 
> 
> That's a great starting point.  Do you have write access to the wiki?
> Could you link it off of the instructions for using svn?
> 
> g.

Don't .t files need adding to the auto-props?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGhK89czuW2jkwy2gRAnRGAJ0VnBNVBAdQdfUnqPhmvsyQnD/bswCggSHC
/Iivb6Lc4/51bUdrTmRQYlE=
=V+t2
-----END PGP SIGNATURE-----

From sac at bioperl.org  Fri Jun 29 04:25:36 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Fri, 29 Jun 2007 01:25:36 -0700
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
Message-ID: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>

On 6/27/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On Jun 26, 2007, at 3:21 PM, George Hartzell wrote:
>
> > ...
> > If you have a dev.open-bio.org account and you're in the bioperl
> > group, you're good to get at it via:
> >
> >   file:///home/hartzell/bioperl
> >
> > or
> >
> >   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>
> I managed to get it working using file://.  Haven't tried svn+ssh yet
> but I've had persistent problems getting ssh to work properly on my
> macbook; not sure why yet but I haven't had time to play around with it.

Are you using the ssh that comes installed with OSX? If so, I'd
recommend installing openssh from MacPorts. I recall having issues
with the stock version which were resolved by using the more
up-to-date version you can get via MacPorts.

BTW, I haven't been able to check out the new svn repository via
svn+ssh:// because I can't get svn to authenticate with an alternative
username. My username on dev.open-bio.org differs from what it is on
my local machine, so I issue a command such as:

steve at localhost $ svn --username sac checkout
svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk

but I get challenged with:
steve at dev.open-bio.org's password:

I also tried putting the --username argument after the subcommand, but
it still wants to use my local username. I can ssh -l sac into the dev
box no problem. Any suggestions?

Steve

From bix at sendu.me.uk  Fri Jun 29 04:52:42 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 29 Jun 2007 09:52:42 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
Message-ID: <4684C85A.5030206@sendu.me.uk>

Steve Chervitz wrote:
> BTW, I haven't been able to check out the new svn repository via
> svn+ssh:// because I can't get svn to authenticate with an alternative
> username. My username on dev.open-bio.org differs from what it is on
> my local machine, so I issue a command such as:
> 
> steve at localhost $ svn --username sac checkout
> svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
> 
> but I get challenged with:
> steve at dev.open-bio.org's password:
> 
> I also tried putting the --username argument after the subcommand, but
> it still wants to use my local username. I can ssh -l sac into the dev
> box no problem. Any suggestions?

Set up your ssh key on the dev machine. I'm also on a machine with the 
wrong username and it works even without attempting to supply the 
correct one.

It does, however, show the 'Welcome to the new developer system' message 
2 or 3 times for every svn+ssh action, which freaks me out a little.

From N.Haigh at sheffield.ac.uk  Fri Jun 29 05:32:38 2007
From: N.Haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 29 Jun 2007 10:32:38 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
Message-ID: <1183109558.4684d1b69bcec@webmail.shef.ac.uk>

Quoting Steve Chervitz <sac at bioperl.org>:

-- snip --

> BTW, I haven't been able to check out the new svn repository via
> svn+ssh:// because I can't get svn to authenticate with an alternative
> username. My username on dev.open-bio.org differs from what it is on
> my local machine, so I issue a command such as:
> 
> steve at localhost $ svn --username sac checkout
> svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
> 
> but I get challenged with:
> steve at dev.open-bio.org's password:
> 
> I also tried putting the --username argument after the subcommand, but
> it still wants to use my local username. I can ssh -l sac into the dev
> box no problem. Any suggestions?
> 
> Steve


You could try:
svn+ssh://USERNAME@dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
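
A small sketch of how such a URL is put together, with the remote
username in front of the host separated by an "@". The svn_ssh_url
name is made up for illustration, and "sac" is just the username
mentioned earlier in the thread:

```shell
# Hypothetical helper: build an svn+ssh URL that carries the remote
# username, so ssh does not fall back to the local login name.
svn_ssh_url() {
    user="$1"; host="$2"; path="$3"
    printf 'svn+ssh://%s@%s%s\n' "$user" "$host" "$path"
}

svn_ssh_url sac dev.open-bio.org /home/hartzell/bioperl/bioperl-live/trunk
# prints: svn+ssh://sac@dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
```

The printed URL is what would be handed to "svn checkout".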

Nath

From dmessina at wustl.edu  Fri Jun 29 08:28:26 2007
From: dmessina at wustl.edu (David Messina)
Date: Fri, 29 Jun 2007 07:28:26 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
Message-ID: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>

>
> BTW, I haven't been able to check out the new svn repository via
> svn+ssh:// because I can't get svn to authenticate with an alternative
> username.

I have the same issue. I set up a stanza in my ~/.ssh/config:

Host dev.open-bio.org
   User dave_messina

where dave_messina is my dev.open-bio.org username.
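
For anyone who would rather script it, a sketch that appends such a
stanza only if the host is not already listed. The function name is
made up, and the config file is passed in explicitly so it can be
tried on a scratch file first:

```shell
# Hypothetical helper: add a "Host ... / User ..." stanza to an ssh
# config file unless a stanza for exactly that host already exists.
add_ssh_user_stanza() {
    config="$1"; host="$2"; user="$3"
    if ! grep -qxF "Host $host" "$config" 2>/dev/null; then
        printf 'Host %s\n   User %s\n' "$host" "$user" >> "$config"
    fi
}

# Example call (not run here, since it edits a real file):
#   add_ssh_user_stanza ~/.ssh/config dev.open-bio.org dave_messina
```

With the stanza in place, plain svn+ssh://dev.open-bio.org/... URLs
pick up the right username because svn simply runs ssh underneath.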


From cjfields at uiuc.edu  Fri Jun 29 13:00:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 29 Jun 2007 12:00:27 -0500
Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository]
In-Reply-To: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
Message-ID: <F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>


On Jun 29, 2007, at 7:28 AM, David Messina wrote:

>>
>> BTW, I haven't been able to check out the new svn repository via
>> svn+ssh:// because I can't get svn to authenticate with an  
>> alternative
>> username.
>
> I have the same issue. I set up a stanza in my ~/.ssh/config:
>
> Host dev.open-bio.org
>    User dave_messina
>
> where dave_messina is my dev.open-bio.org username.

I changed to the macports ssh w/o luck.  It appears the key is  
offered up, so maybe the problem is how I have everything set up on  
dev (though I followed everything on the wiki):

....
  Contact 'support at open-bio.org' for
your new login information.
======================================
debug1: Authentications that can continue: publickey,gssapi-with-mic,password
debug1: Next authentication method: publickey
debug1: Offering public key: /Users/cjfields/.ssh/id_dsa
debug2: we sent a publickey packet, wait for reply
debug1: Authentications that can continue: publickey,gssapi-with-mic,password
debug2: we did not send a packet, disable method
debug1: Next authentication method: password

It's odd; I can use passwordless logins for other servers (admittedly  
Mac servers) w/o problems using ssh keys, but dev.open-bio.org always  
prompts for a password regardless.

My feeling is it's something with my local ssh or sshd config; I'll  
try fiddling with it to see what happens.  Anyone have suggestions?   
I've lost enough hair as is; don't want to lose more!

chris

From sac at bioperl.org  Fri Jun 29 13:07:45 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Fri, 29 Jun 2007 10:07:45 -0700
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <1183109558.4684d1b69bcec@webmail.shef.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<1183109558.4684d1b69bcec@webmail.shef.ac.uk>
Message-ID: <8f200b4c0706291007x2b765323n75c9003a47fe7cbb@mail.gmail.com>

On 6/29/07, Nathan S. Haigh <N.Haigh at sheffield.ac.uk> wrote:
> Quoting Steve Chervitz <sac at bioperl.org>:
>
> -- snip --
>
> > BTW, I haven't been able to check out the new svn repository via
> > svn+ssh:// because I can't get svn to authenticate with an alternative
> > username. My username on dev.open-bio.org differs from what it is on
> > my local machine, so I issue a command such as:
> >
> > steve at localhost $ svn --username sac checkout
> > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
> >
> > but I get challenged with:
> > steve at dev.open-bio.org's password:
> >
> > I also tried putting the --username argument after the subcommand, but
> > it still wants to use my local username. I can ssh -l sac into the dev
> > box no problem. Any suggestions?
>
> [...]
> You could try:
> svn+ssh://USERNAME@dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk

Bingo. Thanks for the tips, guys.

BTW, setting up ssh keys was not the issue, since my key is already
set up on the dev machine. The svn --username setting appears to not
be operative at the ssh layer. I suspected this might be the case
given that the usage info says:

 $ svn --help co
  --username arg           : specify a username ARG
  --password arg           : specify a password ARG

which seemed insecure. I didn't want to send my password in the clear,
and didn't know whether svn would hand it off to ssh. It wasn't
even sending my username to ssh, so I knew something was wrong. These
args are probably only intended for accessing local svn repositories,
or non-svn+ssh-based checkouts.

BTW, the svn+ssh check out on Mac OS X works for me. I'm using svn and
openssh installed via MacPorts:

$ svn --version
svn, version 1.4.4 (r25188)
   compiled Jun 28 2007, 23:51:53

$ ssh -V
OpenSSH_4.6p1, OpenSSL 0.9.8e 23 Feb 2007

Steve

From hartzell at alerce.com  Fri Jun 29 15:19:31 2007
From: hartzell at alerce.com (George Hartzell)
Date: Fri, 29 Jun 2007 15:19:31 -0400
Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository]
In-Reply-To: <F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
	<F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>
Message-ID: <18053.23363.102371.602742@almost.alerce.com>

Chris Fields writes:
 > 
 > On Jun 29, 2007, at 7:28 AM, David Messina wrote:
 > 
 > >>
 > >> BTW, I haven't been able to check out the new svn repository via
 > >> svn+ssh:// because I can't get svn to authenticate with an  
 > >> alternative
 > >> username.
 > >
 > > I have the same issue. I set up a stanza in my ~/.ssh/config:
 > >
 > > Host dev.open-bio.org
 > >    User dave_messina
 > >
 > > where dave_messina is my dev.open-bio.org username.
 > 
 > I changed to the macports ssh w/o luck.  It appears the key is  
 > offered up, so maybe the problem is how I have everything set up on  
 > dev (though I followed everything on the wiki):

A couple of things to check.

  - make sure that you put your public key in ~/.ssh/authorized_keys2
    (not authorized_keys)

  - make sure that authorized_keys2 is chmod'ed 600 (644 might be
    enough...).

  - make sure that ~/.ssh is chmod'ed 700.

  - make sure that your home directory is 755.

Then see if it works.  You might be able to relax some of those
protections a bit, but ssh's uptight about letting other people mess
with that data.
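
The checklist above can be sketched as a small shell function. The
fix_ssh_perms name is made up, and the home directory is passed in as
an argument so it can be tried on a scratch directory first:

```shell
# Hypothetical helper: apply the permissions from the checklist to the
# given home directory (on the dev machine this would be "$HOME").
fix_ssh_perms() {
    home_dir="$1"
    chmod 755 "$home_dir"
    chmod 700 "$home_dir/.ssh"
    chmod 600 "$home_dir/.ssh/authorized_keys2"
}

# Example call on the server (not run here):
#   fix_ssh_perms "$HOME"
```

Running "ls -ld ~ ~/.ssh ~/.ssh/authorized_keys2" afterwards shows
whether the modes took effect.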

g.


From dmessina at wustl.edu  Fri Jun 29 18:47:14 2007
From: dmessina at wustl.edu (David Messina)
Date: Fri, 29 Jun 2007 17:47:14 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <4684AF3D.5090907@sheffield.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
Message-ID: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>

> [Nathan]
> Don't .t files need adding to the auto-props?

Yes -- thanks for reminding me. Please feel free to add it to the  
wiki page. I'll be tweaking it some more later on in any case.


Dave

From n.haigh at sheffield.ac.uk  Sat Jun 30 05:55:56 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 30 Jun 2007 10:55:56 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
 Perltidy]
In-Reply-To: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
Message-ID: <468628AC.9060200@sheffield.ac.uk>

David Messina wrote:
>> [Nathan]
>> Don't .t files need adding to the auto-props?
> 
> Yes -- thanks for reminding me. Please feel free to add it to the wiki 
> page. I'll be tweaking it some more later on in any case.
> 
> 
> Dave

I noticed this has already been done. I have just been through the 
t/data dir and added a list of the extensions I found (without props). 
Some files have no extension at all; how should these be dealt with? 
There is also a plethora of file naming styles, which means there's a 
pretty long list of non-standard extensions. So at some point someone 
will commit a new data file with a new extension (often describing what 
program created the output or the test for which it's intended) that 
won't be in the auto-props file. Can you think of a way around this?
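
A rough way to get the list of extensions actually in use, in case anyone
wants to repeat the survey: the sketch below assumes a POSIX shell with
find, sed and awk. list_extensions is a hypothetical helper name, and
t/data is only the default argument.

```shell
#!/bin/sh
# Sketch only: count files per extension under a directory.
# Files with no "." in their name are reported as "(none)".
list_extensions() {
    dir="${1:-t/data}"
    # Skip version-control bookkeeping directories, keep regular files.
    find "$dir" \( -name CVS -o -name .svn \) -prune -o -type f -print |
        # Strip the directory part, leaving the bare file name.
        sed 's|.*/||' |
        # The text after the last "." counts as the extension.
        awk -F. 'NF > 1 { print $NF; next } { print "(none)" }' |
        sort | uniq -c | sort -rn
}
```

Anything in the output that is not covered by the auto-props list is a
candidate for either a new entry or a rename.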

Nath

From cjfields at uiuc.edu  Sat Jun 30 08:48:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 07:48:10 -0500
Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository]
In-Reply-To: <18053.23363.102371.602742@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
	<F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>
	<18053.23363.102371.602742@almost.alerce.com>
Message-ID: <3874B4EE-0119-40BC-8B92-11133A766417@uiuc.edu>


On Jun 29, 2007, at 2:19 PM, George Hartzell wrote:

> Chris Fields writes:
>>
>> On Jun 29, 2007, at 7:28 AM, David Messina wrote:
>>
>>>>
>>>> BTW, I haven't been able to check out the new svn repository via
>>>> svn+ssh:// because I can't get svn to authenticate with an
>>>> alternative
>>>> username.
>>>
>>> I have the same issue. I set up a stanza in my ~/.ssh/config:
>>>
>>> Host dev.open-bio.org
>>>    User dave_messina
>>>
>>> where dave_messina is my dev.open-bio.org username.
>>
>> I changed to the macports ssh w/o luck.  It appears the key is
>> offered up, so maybe the problem is how I have everything set up on
>> dev (though I followed everything on the wiki):
>
> A couple of things to check.
>
>   - make sure that you put your public key in ~/.ssh/authorized_keys2
>     (not authorized_keys)
>
>   - make sure that authorized_keys2 is chmod'ed 600 (644 might be
>     enough...).
>
>   - make sure that ~/.ssh is chmoded 700.
>
>   - make sure that your home directory is 755.
>
> Then see if it works.  You might be able to relax some of those
> protections a bit, but ssh's uptight about letting other people mess
> with that data.
>
> g.

Got it working; it was the permissions on my home dir (the last  
one).  Thanks George!

chris

From dmessina at wustl.edu  Sat Jun 30 11:37:44 2007
From: dmessina at wustl.edu (David Messina)
Date: Sat, 30 Jun 2007 10:37:44 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <468628AC.9060200@sheffield.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
Message-ID: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>

> I have just been through the t/data dir and added a list of  
> extensions I found

Thanks! That's a big help. I'll add prop definitions to those shortly.


>  There are some files without extensions, how should these be dealt  
> with?

If you look in the text files section, there are some files there  
which don't have extensions, e.g. AUTHORS, BUGS. There's also

	Makefile.*

so we have some flexibility in how svn knows to auto-prop a file. I  
haven't read up on the details yet to find out how it handles files  
that match multiple criteria -- it may be dependent simply on the  
order they're defined.


> There seems to be a plethora of file naming styles which means  
> there's a pretty long list of non-standard extensions. So at some  
> point someone will commit a new data file with a new extension  
> (often describing what program created the output or the test for  
> which it's intended) that won't be in the auto-props file - can you  
> think of a way around this?

I've been thinking about this a bit. How about this?

- We have just "standard" files and extensions (like *.blast,  
*.fasta) in the auto-props list.

- We manually add props for the files that have nonstandard,  
arbitrary extensions so all the files have now are prop'd.

- At some point we rename those nonstandard files to have standard  
extensions. Especially for the t/data/ files, we'll have to make sure  
to update the tests that rely on them.

- We can have the suggested list of extensions for new files that get  
added. I don't think we need to strictly enforce this just for the  
sake of svn (after all, its primary function of version control will  
work just fine without any properties set), but it would be nice if  
we could try to keep to it mostly.

Many distros come with an /etc/mime.types file which has the list of  
officially registered MIME types. I found a script that will take  
this list and convert it into auto-props format. I don't think we  
need to support *all* of the gazillion filetypes since our repository  
will never see most of them, but we certainly could.
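
To give an idea of what such a conversion involves, here is a minimal
sketch. It is not the script mentioned above, just an illustration that
assumes the usual mime.types layout (MIME type first, then zero or more
extensions, with "#" starting a comment). mime_to_autoprops is a made-up
name.

```shell
#!/bin/sh
# Sketch only: turn mime.types lines into auto-props style entries.
# Lines with a type but no extension are skipped.
mime_to_autoprops() {
    # $1 is the MIME type; any remaining fields are file extensions.
    awk '$1 !~ /^#/ && NF > 1 {
        for (i = 2; i <= NF; i++)
            printf "*.%s = svn:mime-type=%s\n", $i, $1
    }' "${1:-/etc/mime.types}"
}
```

The output lines would go under the [auto-props] section of the
Subversion client config, most likely trimmed down to the types we
actually expect to see.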


Dave

From dmessina at wustl.edu  Sat Jun 30 12:26:27 2007
From: dmessina at wustl.edu (David Messina)
Date: Sat, 30 Jun 2007 11:26:27 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
	<461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
Message-ID: <D6917C62-FA0C-4261-ACFD-014DEF4D89E6@wustl.edu>


On Jun 30, 2007, at 10:37 AM, David Messina wrote:

> - We manually add props for the files that have nonstandard,
> arbitrary extensions so all the files have now are prop'd.

Er, that should be

- We manually add props for the files that have nonstandard,  
arbitrary extensions so that all the files now in the repository are  
prop'd.


From n.haigh at sheffield.ac.uk  Sat Jun 30 13:25:58 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 30 Jun 2007 18:25:58 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
 Perltidy]
In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
	<461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
Message-ID: <46869226.70203@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

- -- snip --
> 
> 
>> There seems to be a plethora of file naming styles which means there's
>> a pretty long list of non-standard extensions. So at some point
>> someone will commit a new data file with a new extension (often
>> describing what program created the output or the test for which it's
>> intended) that won't be in the auto-props file - can you think of a
>> way around this?
> 
> I've been thinking about this a bit. How about this?
> 
> - We have just "standard" files and extensions (like *.blast, *.fasta)
> in the auto-props list.

I think the list of seq formats recognised by Bioperl in Bio::SeqIO and
Bio::AlignIO would be a good start, as these are likely to be the ones
that are sensitive to file format recognition and thus could break tests
if renamed.

I think a lot of people have used "." in file names as an alternative to
a space. I think it would be beneficial to use an underscore "_" in
these cases and leave the "." to represent the beginning of the file
extension.

> 
> - We manually add props for the files that have nonstandard, arbitrary
> extensions so all the files that we currently have now are prop'd.
> 
> - At some point we rename those nonstandard files to have standard
> extensions. Especially for the t/data/ files, we'll have to make sure to
> update the tests that rely on them.

Nice and easy with svn :)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGhpHiczuW2jkwy2gRAuZ5AKCnd2MvCsvSn1NemDVMmabnieR2vACg1Qk0
pYVvXwxq0lpiGfM09RQ6A1I=
=3Lhw
-----END PGP SIGNATURE-----

From cjfields at uiuc.edu  Sat Jun 30 15:11:52 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 14:11:52 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <D6917C62-FA0C-4261-ACFD-014DEF4D89E6@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
	<461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
	<D6917C62-FA0C-4261-ACFD-014DEF4D89E6@wustl.edu>
Message-ID: <C274666B-9771-4296-80BB-8DFFB036F29C@uiuc.edu>


On Jun 30, 2007, at 11:26 AM, David Messina wrote:

>
> On Jun 30, 2007, at 10:37 AM, David Messina wrote:
>
>> - We manually add props for the files that have nonstandard,
>> arbitrary extensions so all the files have now are prop'd.
>
> Er, that should be
>
> - We manually add props for the files that have nonstandard,
> arbitrary extensions so that all the files now in the repository are
> prop'd.

Do we need to define every filetype extension, or can there be a  
fallback (eg if it isn't on the list or has no extension it's plain  
text)?

chris

From hlapp at gmx.net  Sat Jun 30 17:26:22 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 17:26:22 -0400
Subject: [Bioperl-l] Splits again
In-Reply-To: <468409C7.7020102@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
Message-ID: <A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>


On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote:

> [...]
> Very definitely the latter. The key benefit of my approach is that  
> the organisation stays as is and that a snapshot of the repository  
> remains a single directory of modules in Bio so that people don't  
> have to 'install' Bioperl, they can still just uncompress the  
> archive (or check out the package from svn) and point their  
> PERL5LIB to the root dir of the package.

I think this is absolutely key to keep in mind. Anything without this  
feature will likely be a non-starter.
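
For anyone unfamiliar with that way of working, it amounts to something
like the sketch below. The function name use_bioperl_checkout and any
path passed to it are made up for illustration; the only real requirement
is that PERL5LIB points at the directory that contains Bio/.

```shell
#!/bin/sh
# Sketch only: put an unpacked archive or svn checkout on Perl's search
# path without installing anything. use_bioperl_checkout is hypothetical.
use_bioperl_checkout() {
    checkout_dir="$1"
    # Prepend the checkout; keep any existing PERL5LIB entries after it.
    PERL5LIB="$checkout_dir${PERL5LIB:+:$PERL5LIB}"
    export PERL5LIB
}
```

After use_bioperl_checkout "$HOME/src/bioperl-live" (a hypothetical
location), something like perl -MBio::Seq -e 1 should load the module
straight from the checkout.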

I don't really have time to follow the discussion let alone  
participate, so really all I can contribute is to offer some  
sanity/reality checks (such as the above).

In this sense, I understand a release pumpkin will generate ~900  
packages to upload to CPAN? How much hassle is that compared to what  
uploading a bioperl release means right now?

How brittle is all the Build.PL code that will be needed to automate  
all of this, and how difficult will it be to maintain? For example,  
if someone adds in 10 new modules, what Build.PL-related work will  
need to be done?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Sat Jun 30 17:32:52 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 30 Jun 2007 22:32:52 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
	<A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>
Message-ID: <4686CC04.6000403@sendu.me.uk>

Hilmar Lapp wrote:
> On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote:
> 
>> [...]
>> Very definitely the latter. The key benefit of my approach is that  
>> the organisation stays as is and that a snapshot of the repository  
>> remains a single directory of modules in Bio so that people don't  
>> have to 'install' Bioperl, they can still just uncompress the  
>> archive (or check out the package from svn) and point their  
>> PERL5LIB to the root dir of the package.
[snip]
> In this sense, I understand a release pumpkin will generate ~900  
> packages to upload to CPAN? How much hassle is that compared to what  
> uploading a bioperl release means right now?

I'd have to investigate. I did my uploads using the PAUSE website, which 
for 900 packages would be unfeasible. Will have to see if the process 
can be automated.


> How brittle is all the Build.PL code that will be needed to automate  
> all of this, and how difficult will it be to maintain? For example,  
> if someone adds in 10 new modules, what Build.PL-related work will  
> need to be done?

Well, my plan will be that once the work is done, you won't need to 
touch the Build.PL code again. My intent is that the pumpkin can just 
type one command and not think about anything.

As for the reality, I won't know until I think about it properly and 
experiment.

From hlapp at gmx.net  Sat Jun 30 19:36:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 19:36:45 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18052.3946.224905.415905@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
	<18052.3946.224905.415905@almost.alerce.com>
Message-ID: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>


On Jun 28, 2007, at 3:43 PM, George Hartzell wrote:

> I just did the experiment, and filename-insensitivity seems to be
> breaking something.
>
> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.
>
> I reformatted a memory stick to be case sensitive and co of
>
>   bioperl/bioperl-live/tags/release-0-9-2/t
>
> worked, then I made a directory in my home dir (normal mac thing) and
> got the same error as above.

You picked up a rename of a file from lower case extension to upper  
case extension. Unfortunately, there are several months between  
adding the upper-case and removing the lower-case version.

We can reconstruct what happened with this using svn log on the  
directory (this does not require a checkout):

$ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk/t/data

Searching for HUMBETGLOA yields the following two commits that added  
one and removed the other:

------------------------------------------------------------------------
r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines
Changed paths:
    M /bioperl-live/trunk/t/SearchIO.t
    A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA
    A /bioperl-live/trunk/t/data/cysprot1.FASTA

added tests for FASTA

------------------------------------------------------------------------
r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines
Changed paths:
    A /bioperl-live/trunk/t/data/HUMBETGLOA.fa
    D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta

renaming file to avoid clobbering on windows

Unfortunately, both files are in the tag (again, no checkout required):

$ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta
HUMBETGLOA.FASTA
HUMBETGLOA.fasta

We can remove the offending version from the repository (again,  
without needing a checkout):

$ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta

I did this, and now the tag checks out fine on OSX. Can anyone confirm?

(BTW the ability to operate on the repository w/o needing a checkout  
is another advantage of svn)

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Jun 30 20:40:53 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 19:40:53 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
	<18052.3946.224905.415905@almost.alerce.com>
	<2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
Message-ID: <A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>

Checkout worked for me (Mac OS X) using both:

svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/

so removing the offending file worked (good catch!).  Haven't run a  
full co but probably isn't necessary.

chris

On Jun 30, 2007, at 6:36 PM, Hilmar Lapp wrote:

>
> On Jun 28, 2007, at 3:43 PM, George Hartzell wrote:
>
>> I just did the experiment, and filename-insensitivity seems to be
>> breaking something.
>>
>> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.
>>
>> I reformatted a memory stick to be case sensitive and co of
>>
>>   bioperl/bioperl-live/tags/release-0-9-2/t
>>
>> worked, then I made a directory in my home dir (normal mac thing) and
>> got the same error as above.
>
> You picked up a rename of a file from lower case extension to upper  
> case extension. Unfortunately, there are several months between  
> adding the upper-case and removing the lower-case version.
>
> We can reconstruct what happened with this using svn log on the  
> directory (this does not require a checkout):
>
> $ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk/t/data
>
> Searching for HUMBETGLOA yields the following two commits that  
> added one and removed the other:
>
> ------------------------------------------------------------------------
> r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines
> Changed paths:
>    M /bioperl-live/trunk/t/SearchIO.t
>    A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA
>    A /bioperl-live/trunk/t/data/cysprot1.FASTA
>
> added tests for FASTA
>
> ------------------------------------------------------------------------
> r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines
> Changed paths:
>    A /bioperl-live/trunk/t/data/HUMBETGLOA.fa
>    D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta
>
> renaming file to avoid clobbering on windows
>
> Unfortunately, both files are in the tag (again, no checkout  
> required):
>
> $ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta
> HUMBETGLOA.FASTA
> HUMBETGLOA.fasta
>
> We can remove the offending version from the repository (again,  
> without needing a checkout):
>
> $ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta
>
> I did this, and now the tag checks out fine on OSX. Can anyone  
> confirm?
>
> (BTW the ability to operate on the repository w/o needing a  
> checkout is another advantage of svn)
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hartzell at alerce.com  Sat Jun 30 20:48:06 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 30 Jun 2007 17:48:06 -0700
Subject: [Bioperl-l] Take 2 of the new subversion repository.
Message-ID: <18054.63942.316904.413911@almost.alerce.com>


There's a second cut at the subversion repository.  I've done a better
job of setting svn:keywords and svn:eol-style on various files.  The
defaults were more cautious and I used an auto-props file based on
the wiki version.

  svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2

The old repository's still around as

  svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1

I renamed it so that people wouldn't work with it by mistake.  If, for
some hard-to-imagine reason, you have a working copy that you want to
run against it, you should be able to do an svn switch --relocate on
your working copy and be back in shape.  In fact, it might be a good
time to give it a try....

g.

From hartzell at alerce.com  Sat Jun 30 21:17:18 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 30 Jun 2007 18:17:18 -0700
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
	<18052.3946.224905.415905@almost.alerce.com>
	<2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
	<A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>
Message-ID: <18055.158.30409.808612@almost.alerce.com>

Chris Fields writes:
 > Checkout worked for me (Mac OS X) using both:
 > 
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/
 > 
 > so removing the offending file worked (good catch!).  Haven't run a  
 > full co but probably isn't necessary.
 > [...]

I'll keep a note of that as something to do when I prepare the final
cut of the repository.

g.


From jason at bioperl.org  Sat Jun 30 21:25:30 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 30 Jun 2007 18:25:30 -0700
Subject: [Bioperl-l] Take 2 of the new subversion repository.
In-Reply-To: <18054.63942.316904.413911@almost.alerce.com>
References: <18054.63942.316904.413911@almost.alerce.com>
Message-ID: <D8C71EF7-6E2E-498E-8638-373512ADE3EE@bioperl.org>

Thanks George -
I also did
chgrp -R bioperl /home/hartzell/bioperl_take?
to make sure the group permission was set right.

We may also want to do a chmod g+s on all the dirs in there as well  
so that permissions are preserved when this gets deployed for real.
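
In case it helps, the two steps could be combined roughly as below. This
is only a sketch assuming a POSIX shell; share_with_group is a made-up
name. The point of the setgid bit on directories is that new files
created inside them pick up the directory's group.

```shell
#!/bin/sh
# Sketch only: hand a repository tree to a group and set the setgid bit
# on every directory in it. share_with_group is a hypothetical helper.
share_with_group() {
    repo_dir="$1"
    group="$2"
    chgrp -R "$group" "$repo_dir" || return 1
    # Directories only; regular files are left untouched.
    find "$repo_dir" -type d -exec chmod g+s {} +
}
```

For the test repositories that would be something like
share_with_group /home/hartzell/bioperl_take2 bioperl.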

If anyone wants to, feel free to make some changes to files and commit  
them, and to make some branches/tags to play around a little bit, since  
we'll likely throw this away and do it again from a locked down version  
from CVS at some appointed time.

Do you know how to have svn commit messages generate summary emails  
as well?

-j
On Jun 30, 2007, at 5:48 PM, George Hartzell wrote:

>
> There's a second cut at the subversion repository.  I've done a better
> job of setting svn:keywords and svn:eol-style on various files.  The
> defaults were more cautious and I used an auto-props files based on
> the wiki version.
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2
>
> The old repository's still around as
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1
>
> I renamed it so that people wouldn't work with it by mistake.  If, for
> some hard-to-imagine reason, you have a working copy that you want to
> run against it, you should be able to do an svn switch --relocate on
> your working copy and be back in shape.  In fact, it might be a good
> time to give it a try....
>
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From hlapp at gmx.net  Sat Jun 30 22:21:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 22:21:25 -0400
Subject: [Bioperl-l] Take 2 of the new subversion repository.
In-Reply-To: <18054.63942.316904.413911@almost.alerce.com>
References: <18054.63942.316904.413911@almost.alerce.com>
Message-ID: <5F53A433-BAA9-431D-A0C5-5955690D0B73@gmx.net>


On Jun 30, 2007, at 8:48 PM, George Hartzell wrote:

> I renamed it so that people wouldn't work with it by mistake.  If, for
> some hard-to-imagine reason, you have a working copy that you want to
> run against it,

It's not so hard to imagine - checking out the entire repository  
takes a long time.

> you should be able to do an svn switch --relocate on
> your working copy and be back in shape.  In fact, it might be a good
> time to give it a try....

It doesn't work:

svn: The repository at 'svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2' has uuid '31277767-6726-dc11-ab4c-0019e3f901d6', but the WC has '27e854f1-f323-dc11-8c1b-0019e3f901d6'

You can't relocate to a totally new repository (relocating to  
bioperl_take1 does work though).
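
A quick way to tell in advance whether a relocate can work is to compare
the two UUIDs. The sketch below assumes a POSIX shell and relies on the
"Repository UUID:" line that svn info prints; uuid_from_info is a
hypothetical helper that just pulls that value out of svn info output on
stdin.

```shell
#!/bin/sh
# Sketch only: extract the repository UUID from `svn info` output.
# uuid_from_info is a made-up name; it reads the text on stdin.
uuid_from_info() {
    sed -n 's/^Repository UUID: //p'
}
```

Running svn info . | uuid_from_info in the working copy and
svn info URL | uuid_from_info against the new location should print the
same value if svn switch --relocate is going to be accepted.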

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Jun 30 22:39:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 21:39:27 -0500
Subject: [Bioperl-l] Take 2 of the new subversion repository.
In-Reply-To: <D8C71EF7-6E2E-498E-8638-373512ADE3EE@bioperl.org>
References: <18054.63942.316904.413911@almost.alerce.com>
	<D8C71EF7-6E2E-498E-8638-373512ADE3EE@bioperl.org>
Message-ID: <7C6FD6C9-CBED-40D3-BA90-4B34F79E6DE0@uiuc.edu>

There are a few CPAN modules available; here's one:

http://search.cpan.org/~dwheeler/SVN-Notify-2.66/lib/SVN/Notify.pm

chris

On Jun 30, 2007, at 8:25 PM, Jason Stajich wrote:

> Thanks George -
> I also did
> chgrp -R bioperl /home/hartzell/bioperl_take?
> to make sure the group permission was set right.
>
> We may also want to do a chmod g+s on all the dirs in there as well
> so that permissions are preserved when this gets deployed for real.
>
> If anyone wants to, please make some changes to files and commit them,
> and make some branches/tags to play around a little bit; we'll likely
> throw this away and do it again from a locked-down version of CVS at
> some appointed time.
>
> Do you know how to have svn commit messages generate summary emails
> as well?
>
> -j
> On Jun 30, 2007, at 5:48 PM, George Hartzell wrote:
>
>>
>> There's a second cut at the subversion repository.  I've done a  
>> better
>> job of setting svn:keywords and svn:eol-style on various files.  The
>> defaults were more cautious and I used an auto-props file based on
>> the wiki version.
>>
>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2
>>
>> The old repository's still around as
>>
>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1
>>
>> I renamed it so that people wouldn't work with it by mistake.  If, for
>> some hard-to-imagine reason, you have a working copy that you want to
>> run against it, you should be able to do an svn switch --relocate on
>> your working copy and be back in shape.  In fact, it might be a good
>> time to give it a try....
>>
>> g.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sat Jun 30 22:46:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 21:46:05 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4686CC04.6000403@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
	<A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>
	<4686CC04.6000403@sendu.me.uk>
Message-ID: <D10BF6DE-D8A6-448A-8850-A7B13AE54266@uiuc.edu>


On Jun 30, 2007, at 4:32 PM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote:
>>> [...]
>>> Very definitely the latter. The key benefit of my approach is  
>>> that  the organisation stays as is and that a snapshot of the  
>>> repository  remains a single directory of modules in Bio so that  
>>> people don't  have to 'install' Bioperl, they can still just  
>>> uncompress the  archive (or check out the package from svn) and  
>>> point their  PERL5LIB to the root dir of the package.
> [snip]
>> In this sense, I understand a release pumpkin will generate ~900   
>> packages to upload to CPAN? How much hassle is that compared to  
>> what  uploading a bioperl release means right now?
>
> I'd have to investigate. I did my uploads using the PAUSE website,  
> which for 900 packages would be unfeasible. Will have to see if the  
> process can be automated.

Not that they would care one way or another but maybe we should  
contact the CPAN maintainers to get their thoughts.  They might have  
some ideas...

>> How brittle is all the Build.PL code that will be needed to  
>> automate  all of this, and how difficult will it be to maintain?  
>> For example,  if someone adds in 10 new modules, what Build.PL- 
>> related work will  need to be done?
>
> Well, my plan will be that once the work is done, you won't need to  
> touch the Build.PL code again. My intent is that the pumpkin can  
> just type one command and not think about anything.
>
> As for the reality, I won't know until I think about it properly  
> and experiment.

A good experiment for a branch.  I still think this could be  
accomplished step-wise; for instance run a quick test using something  
with a simple dependency tree like Bio::Root::Root (only needs  
RootI), finish up with Bio::Root*, then work down into PrimarySeq,  
Seq, etc.  Submit them to CPAN piecemeal or in batches (all  
Bio::Seq*, so on).

If the Build.PL, etc are to be generated on the fly then maybe there  
should be a simple way of registering or matching tests to modules  
(or vice versa) to ease the pain, particularly for new code.

chris


From hlapp at gmx.net  Sat Jun 30 22:56:04 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 22:56:04 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
	<18052.3946.224905.415905@almost.alerce.com>
	<2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
	<A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>
Message-ID: <E250DB37-E2C1-4F71-A2FE-B64603EB69FD@gmx.net>

It turns out that both files are also present on the release-0-9-3,  
bioperl-1-0-0, bioperl-1-0-alpha, and bioperl-1-0-alpha2-rc tags, so add

$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/ 
home/hartzell/bioperl/bioperl-live/tags/release-0-9-3/t/data/ 
HUMBETGLOA.fasta
$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/ 
home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-0/t/data/ 
HUMBETGLOA.fasta
$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/ 
home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-alpha/t/data/ 
HUMBETGLOA.fasta
$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/ 
home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-alpha2-rc/t/data/ 
HUMBETGLOA.fasta

to the post-processing commands.
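
Since the four commands differ only in the tag name, they could also be generated from a list. A rough sketch, assuming the same tags URL as above; the function name print_rm_commands is made up, and nothing is executed, the commands are only printed for review:

```shell
#!/bin/sh
# Base URL of the tags directory, taken from the commands above.
TAGS_URL="svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags"

# Print one "svn rm" command per tag name given as an argument.
# This is a dry run: the commands are printed, not executed.
print_rm_commands() {
    for tag in "$@"; do
        printf 'svn rm -m "Removing offending duplicate" %s/%s/t/data/HUMBETGLOA.fasta\n' \
            "$TAGS_URL" "$tag"
    done
}

print_rm_commands release-0-9-3 bioperl-1-0-0 bioperl-1-0-alpha bioperl-1-0-alpha2-rc
```

Once the printed commands look right, the output can be piped to sh to run them.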

	-hilmar

On Jun 30, 2007, at 8:40 PM, Chris Fields wrote:

> Checkout worked for me (Mac OS X) using both:
>
> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- 
> live/tags/release-0-9-2/t/data
> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- 
> live/tags/release-0-9-2/
>
> so removing the offending file worked (good catch!).  Haven't run a  
> full co but probably isn't necessary.
>
> chris
>
> On Jun 30, 2007, at 6:36 PM, Hilmar Lapp wrote:
>
>>
>> On Jun 28, 2007, at 3:43 PM, George Hartzell wrote:
>>
>>> I just did the experiment, and filename-insensitivity seems to be
>>> breaking something.
>>>
>>> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.
>>>
>>> I reformatted a memory stick to be case sensitive and co of
>>>
>>>   bioperl/bioperl-live/tags/release-0-9-2/t
>>>
>>> worked, then I made a directory in my home dir (normal mac thing)  
>>> and
>>> got the same error as above.
>>
>> You picked up a rename of a file from lower case extension to  
>> upper case extension. Unfortunately, there are several months  
>> between adding the upper-case and removing the lower-case version.
>>
>> We can reconstruct what happened with this using svn log on the  
>> directory (this does not require a checkout):
>>
>> $ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/ 
>> bioperl/bioperl-live/trunk/t/data
>>
>> Searching for HUMBETGLOA yields the following two commits that  
>> added one and removed the other:
>>
>> --------------------------------------------------------------------- 
>> ---
>> r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2  
>> lines
>> Changed paths:
>>    M /bioperl-live/trunk/t/SearchIO.t
>>    A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA
>>    A /bioperl-live/trunk/t/data/cysprot1.FASTA
>>
>> added tests for FASTA
>>
>> --------------------------------------------------------------------- 
>> ---
>> r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2  
>> lines
>> Changed paths:
>>    A /bioperl-live/trunk/t/data/HUMBETGLOA.fa
>>    D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta
>>
>> renaming file to avoid clobbering on windows
>>
>> Unfortunately, both files are in the tag (again, no checkout  
>> required):
>>
>> $ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/ 
>> bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i  
>> fasta
>> HUMBETGLOA.FASTA
>> HUMBETGLOA.fasta
>>
>> We can remove the offending version from the repository (again,  
>> without needing a checkout):
>>
>> $ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- 
>> live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta
>>
>> I did this, and now the tag checks out fine on OSX. Can anyone  
>> confirm?
>>
>> (BTW the ability to operate on the repository w/o needing a  
>> checkout is another advantage of svn)
>>
>> 	-hilmar
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Fri Jun  1 04:06:04 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 01 Jun 2007 09:06:04 +0100
Subject: [Bioperl-l] ClustalW Score?
In-Reply-To: <1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu>
References: <00e201c7a2de$91f60f50$2d01a8c0@PICO><DFEEDFC9-68C4-4821-846F-69AC9559C70B@bioperl.org><465E9B58.1020403@sendu.me.uk>	<49B6333A-18B9-4B63-80EF-81C57A295494@bioperl.org>
	<1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu>
Message-ID: <465FD36C.5060603@sendu.me.uk>

Kevin Brown wrote:
>> you're right --- it is not really my code, I was just 
>> elaborating Kevin's example --- it would probably need to be 
>> more specific or perhaps the last Score seen is sufficient 
>> for what one is trying to capture?
> 
> I took that code from a pairwise clustal alignment script that I wrote
> to deal with aligning a bunch of short sequences against a long one to
> see where they line up at.  When all of them were fed to Clustal the
> short sequences all ended up aligned to each other and not well aligned
> to the longer sequence.  I only saw one score in the output from the
> pairwise, so that is what I used to find a reasonable value.

Ok, well I've hedged my bets and used both. Now committed to CVS.


From jy at genseq.co.uk  Fri Jun  1 22:39:48 2007
From: jy at genseq.co.uk (Jean-Yves Sireau)
Date: Sat, 2 Jun 2007 10:39:48 +0800
Subject: [Bioperl-l] Genseq
Message-ID: <20070602103948.093d713c@jys.my.regentmarkets.com>

Dear List members,

I would like to let you know of the formation of Genseq Ltd., a
bioinformatics company that will (in time!) offer genome sequencing to
high net worth individuals and bioinformatic analysis of the sequence
data to detect predisposition to illness.  The company's website is
www.genseq.co.uk

Genseq would be willing to sponsor bioperl, whether financially or by
providing resources, notably for any bioperl-related activities in the
Asia Pacific region.  Genseq's bioinformatics team will be based in
Cyberjaya (Malaysia), and we are particularly interested in promoting
bioperl in Malaysia.  We are also actively recruiting at the moment
in Malaysia and India.

If there were sufficient demand, we would be willing to organise a
bioperl conference in Cyberjaya at the Cyberview Lodge
(www.cyberview-lodge.com), which would be the ideal place for such a
conference in Malaysia.

Looking forward to your comments, suggestions and proposals.

Best regards
Jean-Yves Sireau

-- 

Jean-Yves Sireau
CEO, Genseq Ltd.
www.genseq.co.uk


From cjfields at uiuc.edu  Sat Jun  2 01:16:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 00:16:05 -0500
Subject: [Bioperl-l] EUtilities overhaul started
Message-ID: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>

To anyone using Bio::DB::EUtilities,

I am in the midst of a major overhaul to the various EUtilities tools  
and to Bio::DB::GenericWebDBI (the latter of which I am forming into  
more or less a test bed for other database interfaces).  I'm about  
80% done at this point, and will likely start committing changes this  
coming week.

The overall interface will change (something I had warned about in  
the Bio::DB::EUtilities POD) but I am hoping it will be more  
intuitive and easier to use in the long run.  I'll describe the  
overall redesign and use in an upcoming HOWTO (as recommended by  
Brian a while back).

If anyone has any suggestions/ideas/flames, please let me know!

Cheers!

chris


From cjfields at uiuc.edu  Sat Jun  2 10:39:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 09:39:25 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <e572b3c70706020628v71b10e7bm34cebfab4954890c@mail.gmail.com>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
	<e572b3c70706020628v71b10e7bm34cebfab4954890c@mail.gmail.com>
Message-ID: <AF243C87-B82E-4C33-939D-2B84B9E41537@uiuc.edu>

Yes, there are a few odd issues, though that's one I've not heard of  
yet.  You might try one of the sub-nucleotide databases (nuccore,  
nucest, nucgss).

I'll try looking into it and (if necessary) pester NCBI about it.   
I'll pass this on to the mail list to see if anyone else knows about  
the problem.

chris

On Jun 2, 2007, at 8:28 AM, Bernd Brandt wrote:

> Hi Chris,
>
> Thanks for your work on EUtilities.
> For a production task, I used EUtilities directly (given your
> announced overhaul). I noticed a recent problem at NCBI (reported two
> weeks ago to NCBI, no reply yet). Possibly you may run into this with
> testing: if you ePOST gi ids to the EU server and then use this set in
> Esearch (using the query key) no results are returned for the
> nucleotide database.
> ESearches like "db=$db%23$QueryKey" typically fail if the $db is
> nucleotide (but work if $db='protein'). The XML output has Count 0 and
> an empty QueryTranslationSet for db=nucleotide only.
> For completeness, I attach a simple test script I used.
>
>
> Best regards,
> Bernd
>
>
> On 6/2/07, Chris Fields <cjfields at uiuc.edu> wrote:
>> To anyone using Bio::DB::EUtilities,
>>
>> I am in the midst of a major overhaul to the various EUtilities tools
>> and to Bio::DB::GenericWebDBI (the latter which I am forming into
>> more or less a test bed for other database interfaces).  I'm about
>> 80% done at this point, and will likely start committing changes this
>> coming week.
>>
>> The overall interface will change (something I had warned about in
>> the Bio::DB::EUtilities POD) but I am hoping it will be more
>> intuitive and easier to use in the long run.  I'll describe the
>> overall redesign and use in an upcoming HOWTO (as recommended by
>> Brian a while back).
>>
>> If anyone has any suggestions/ideas/flames, please let me know!
>>
>> Cheers!
>>
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> <EUsearch.pl>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Jun  3 00:51:57 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 23:51:57 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <e572b3c70706020948l708f14c8q706b65c73617c86d@mail.gmail.com>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
	<e572b3c70706020628v71b10e7bm34cebfab4954890c@mail.gmail.com>
	<AF243C87-B82E-4C33-939D-2B84B9E41537@uiuc.edu>
	<e572b3c70706020948l708f14c8q706b65c73617c86d@mail.gmail.com>
Message-ID: <1A2AF5C4-6A58-4FDD-A4CA-6ABCE30F0D1B@uiuc.edu>

I can confirm this; however it only relates to the use of history  
with esearch and nucleotide (use of the history with other eutils  
seems to work fine); retrieving sequences via efetch is not  
affected.  If I find out anything more I'll post something on the  
mail list.
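
For anyone who wants to try the workaround by hand, here is a minimal sketch of building an ESearch URL that refers to a history entry. The base URL and the db, term, WebEnv and usehistory parameter names are as I understand the NCBI E-utilities documentation and should be checked against it; the function name esearch_history_url is made up for illustration:

```shell
#!/bin/sh
# Base URL of NCBI ESearch (an assumption; check the current NCBI documentation).
ESEARCH_URL="https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Build an ESearch URL that searches within an earlier history entry.
#   $1 = database (e.g. nuccore instead of nucleotide)
#   $2 = query key returned by EPost
#   $3 = WebEnv string returned by EPost (assumed to be URL-safe already)
esearch_history_url() {
    # %%23 prints a literal %23, the URL-encoded '#' in front of the query key.
    printf '%s?db=%s&term=%%23%s&WebEnv=%s&usehistory=y\n' \
        "$ESEARCH_URL" "$1" "$2" "$3"
}

# Hypothetical usage: the key and WebEnv values come from a previous EPost call.
#   esearch_history_url nuccore 1 "$webenv"
```

Swapping nucleotide for nuccore in the first argument is the only change needed to test the behaviour described above.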

chris

On Jun 2, 2007, at 11:48 AM, Bernd Brandt wrote:

> I can confirm that using the correct sub-nucleotide database works
> (nuccore in my case).
> This seems to be a quite recent change/bug at NCBI. Until recently,
> db=nucleotide worked. Moreover, EInfo still lists nucleotide as valid
> db.
> It is not optimal to have to choose the sub-database and the searches
> work via the Entrez web-interface. Note that this problem is related
> to the ESearch and db=nucleotide.
>
> bernd
>
> On 6/2/07, Chris Fields <cjfields at uiuc.edu> wrote:
>> Yes, there are a few odd issues, though that's one I've not heard of
>> yet.  You might try one of the sub-nucleotide databases (nuccore,
>> nucest, nucgss).
>>
>> I'll try looking into it and (if necessary) pester NCBI about it.
>> I'll pass this on to the mail list to see if anyone else knows about
>> the problem.
>>
>> chris
>>
>> On Jun 2, 2007, at 8:28 AM, Bernd Brandt wrote:
>>
>> > Hi Chris,
>> >
>> > Thanks for your work on EUtilities.
>> > For a production task, I used EUtilities directly (given your
>> > announced overhaul). I noticed a recent problem at NCBI  
>> (reported two
>> > weeks ago to NCBI, no reply yet). Possibly you may run into this  
>> with
>> > testing: if you ePOST gi ids to the EU server and then use this  
>> set in
>> > Esearch (using the query key) no results are returned for the
>> > nucleotide database.
>> > ESearches like "db=$db%23$QueryKey" typically fail if the $db is
>> > nucleotide (but work if $db='protein'). The XML output has Count  
>> 0 and
>> > an empty QueryTranslationSet for db=nucleotide only.
>> > For completeness, I attach a simple test script I used.
>> >
>> >
>> > Best regards,
>> > Bernd
>> >
>> >
>> > On 6/2/07, Chris Fields <cjfields at uiuc.edu> wrote:
>> >> To anyone using Bio::DB::EUtilities,
>> >>
>> >> I am in the midst of a major overhaul to the various EUtilities  
>> tools
>> >> and to Bio::DB::GenericWebDBI (the latter which I am forming into
>> >> more or less a test bed for other database interfaces).  I'm about
>> >> 80% done at this point, and will likely start committing  
>> changes this
>> >> coming week.
>> >>
>> >> The overall interface will change (something I had warned about in
>> >> the Bio::DB::EUtilities POD) but I am hoping it will be more
>> >> intuitive and easier to use in the long run.  I'll describe the
>> >> overall redesign and use in an upcoming HOWTO (as recommended by
>> >> Brian a while back).
>> >>
>> >> If anyone has any suggestions/ideas/flames, please let me know!
>> >>
>> >> Cheers!
>> >>
>> >> chris
>> >> _______________________________________________
>> >> Bioperl-l mailing list
>> >> Bioperl-l at lists.open-bio.org
>> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >>
>> >> <EUsearch.pl>
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From basu at pharm.stonybrook.edu  Sun Jun  3 10:44:18 2007
From: basu at pharm.stonybrook.edu (Siddhartha Basu)
Date: Sun, 03 Jun 2007 10:44:18 -0400
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
Message-ID: <web-5961520@pharm.stonybrook.edu>

On Sat, 2 Jun 2007 00:16:05 -0500
  Chris Fields <cjfields at uiuc.edu> wrote:
> To anyone using Bio::DB::EUtilities,
> 
> I am in the midst of a major overhaul to the various 
>EUtilities tools  
> and to Bio::DB::GenericWebDBI (the latter which I am 
>forming into  
> more or less a test bed for other database interfaces). 
> I'm about  
> 80% done at this point, and will likely start committing 
>changes this  
> coming week.
> 
> The overall interface will change (something I had 
>warned about in  
> the Bio::DB::EUtilities POD) but I am hoping it will be 
>more  
> intuitive and easier to use in the long run.  I'll 
>describe the  
> overall redesign and use in an upcoming HOWTO (as 
>recommended by  
> Brian a while back).

Hi chris,
As a frequent user of EUtilities, I hope this API 
facelift and the upcoming HOWTO will be helpful.
One thing I noticed is that for each eutil call, such 
as efetch, epost, esearch or esummary, a new 
'Bio::DB::EUtilities' object has to be
instantiated. After that, its parameters cannot be changed 
at runtime with a call such as
$eutils->id('ids'). For example:

my $eutils = Bio::DB::EUtilities->new ( -id => $id,
                                        -eutil => 'esummary',
                                        -db => 'protein',
                                      );
my $ct = $eutils->get_response->content();

## -- now i cannot do this...
$eutils->id($newid);
$ct = $eutils->get_response->content();

Is the new API going to address something along these lines, 
or is there currently any way to reuse
the object?
Thanks again for this nice toolkit.

-siddhartha


> 
> If anyone has any suggestions/ideas/flames, please let 
>me know!
> 
> Cheers!
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Sun Jun  3 19:52:39 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 3 Jun 2007 18:52:39 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <web-5961520@pharm.stonybrook.edu>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
	<web-5961520@pharm.stonybrook.edu>
Message-ID: <5120BD7B-CA89-46E4-8D6B-6B24C1F93A5E@uiuc.edu>

On Jun 3, 2007, at 9:44 AM, Siddhartha Basu wrote:

> ...
> Hi chris,
> As a frequent user of EUtilities, I hope this API facelift  
> and the upcoming HOWTO will be helpful.
> One thing I noticed is that for each eutil call, such as  
> efetch, epost, esearch or esummary, a new 'Bio::DB::EUtilities' object  
> has to be instantiated. After that, its parameters cannot be changed
> at runtime with a call such as $eutils->id('ids'). For example:
>
> my $eutils = Bio::DB::EUtilities->new ( -id => $id,
>                                        -eutil => 'esummary',
>                                        -db => 'protein',
>                                      );
> my $ct = $eutils->get_response->content();
>
> ## -- now i cannot do this...
> $eutils->id($newid);
> $ct = $eutils->get_response->content();

I'll have to check up on that, though changing id() should work with  
the old API.  It won't matter with the new API (it works fine), but  
it is still troubling...

> Is the new API going to address something along these lines, or is  
> there currently any way to reuse
> the object?
> Thanks again for this nice toolkit.
>
> -siddhartha

The old API was based upon the idea of creating discrete user agents  
for each eutil to retrieve data.  The problem with the old interface  
is it attempts to do too much (take care of parameters, set up  
requests, retrieve responses, parse data, etc), and many tasks  
required instantiating a new EUtilities object.  I was never really  
satisfied with it.

The new interface is a composition of three classes: the web user  
agent (LWP::UserAgent), a class encapsulating parameter handling, and  
a parser class (all of which can be used independently if needed).  When  
parameters change, a new request is made 'lazily' (i.e. only when  
needed).  Similarly, when data is requested after any parameter  
change, a new parser instance is created and the new response is parsed.

With that in mind you can now do the following:
----------------------------------------
my @params = (-eutil => 'esearch',
               -db    => 'protein',
               -term => 'BRCA1',
               -retmax => 100);

my $eutil = Bio::DB::EUtilities->new(@params);

# no need to get response first; get_ids() calls that if needed

my @ids = $eutil->get_ids;

# below changes only those parameters, leaves all others set as before
$eutil->set_parameters(-eutil => 'efetch',
                        -id  => \@ids,
                        -retmode => 'text',
                        -rettype => 'fasta');

# sends streamed content directly to a file
$eutil->get_response(-content_file => 'seqs.fas');

# or to a LWP::UserAgent-supported request callback
$eutil->get_response(-content_cb => \&my_cb);

my @newparams = (-eutil => 'esearch',
               -db    => 'protein',
               -term => 'BRCA2',
               -retmax => 100);

# Resets eutility to passed parameters (or undef)
$eutil->reset_parameters(@newparams);

# retrieve new IDs
my @new_ids = $eutil->get_ids;
----------------------------------------

Note the same eutil object is used for all of the above, so to answer  
your last question, yes, you should be able to create data pipelines  
using the same object if necessary.

chris


From sac at bioperl.org  Mon Jun  4 13:56:57 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Mon, 4 Jun 2007 10:56:57 -0700
Subject: [Bioperl-l] question about Bio::Restriction::Analysis
In-Reply-To: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu>
References: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu>
Message-ID: <8f200b4c0706041056o4dbaadfexddf9f82fc33c6da@mail.gmail.com>

Hi Apurva,

I'm cc:ing the list to let others know you have found performance
issues with Bio::Restriction::Analysis. Ideally, we should focus on
addressing those issues rather than fixing a module that is now
deprecated.

But taking a quick look at my Bio::Tools::RestrictionEnzyme module,
I'm not sure why HpaII would give slower performance relative to other
non-ambiguous cutters. This enzyme has a 4-base recognition sequence
CCGG, and if you're feeding it a large CG-rich input sequence, that
could be a factor. To test, you might try using some other 4-base
cutters that aren't CG-rich (TaqI, TasI) or try some other input
sequences. There is no special flag to indicate that the enzyme is
non-ambiguous. The module handles that automatically.

Good luck,
Steve

On 6/4/07, Apurva Narechania <apurva at cshl.edu> wrote:
> Hi Rob and Steve,
>
> I was hoping you could answer a quick performance question regarding
> the Bio::Restriction::Analysis module. I have found that though this
> module works well, it is considerably slower than the deprecated
> Bio::Tools::RestrictionEnzyme. I see that there are two algorithms
> available to your module, and since I am using HpaII, a non-ambiguous
> enzyme, I thought I might find similar performance to the older,
> deprecated module, but I do not. Is it possible that I am not setting
> the non-ambiguous flag correctly? Does it need to be set in the first
> place?
>
> As far as Bio::Tools::RestrictionEnzyme, though it is faster, I have
> found instances where it is inaccurate, especially in calculating
> fragments of extremely small size 1-5 base pairs, so I would like to
> use your module if possible. It just seems slow to me.
>
> Can you clarify?
>
> I have copied my code below since it is a short, simple script.
>
> Thanks!
> Apurva Narechania
> Ware Lab
> Cold Spring Harbor Labs
>
> ----------
>
> #!/usr/bin/perl
>
> # This program generates a fasta of restriction frags given an
> # input fasta and a restriction cut site
>
> use Getopt::Std;
> use Bio::Seq;
> use Bio::SeqIO;
> use strict;
>
> use Bio::Tools::RestrictionEnzyme;
>
> my %opts = ();
> getopts ('f:', \%opts);
> my $fasta  = $opts{'f'};
>
> # read fasta file
> my $seqin = Bio::SeqIO -> new (-format => 'Fasta', -file => "$fasta");
>
> my $x = 0;
> while (my $sequence_obj = $seqin -> next_seq()){
>      $x++;
>      my $id = $sequence_obj->id();
>
>      print STDERR "$x Working on $id\n";
>
>      # generate the rx object
>      my $ra = new Bio::Tools::RestrictionEnzyme(-NAME=>'HpaII');
>
>      my @frags = $ra->cut_seq($sequence_obj);
>
>      my $counter = 0;
>      foreach my $frag (@frags){
>          $counter++;
>          my $length = length ($frag);
>          print ">$id.$counter length=$length\n$frag\n";
>      }
>
> }
>
>


From anhthu.tieu at gsf.de  Tue Jun  5 04:14:09 2007
From: anhthu.tieu at gsf.de (Tieu, Anh-Thu)
Date: Tue, 5 Jun 2007 10:14:09 +0200
Subject: [Bioperl-l] problems with image maps and IE 6 or higher
Message-ID: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de>

Hi, 

 I have a problem using the bioperl image maps function with IE6 or
 higher. It might be a more general problem with IE6 rather than with bioperl,
 but as I used bioperl to create my image maps, I thought I could still post this problem 
 here and ask for people's opinions. I wondered whether anyone else has faced the same problem and,
 if so, whether they could share their experiences and solutions. 
 
  
<div>
<p><img src="/ggtc/tmp_bilder/19727dab708e1cbf567dd48480febb96.png" usemap="mapnameD064C01" style="border:2px solid #CCCCCC;"/></p>
<map name="mapnameD064C01" id="mapnameD064C01">
<area shape="rect" coords="108,0,608,20" href="javascript:void(0)" onclick="javascript:void(zmenu( 'scale' ));;return false;" title="scale " alt="scale " target="_blank"/>
<area shape="rect" coords="234,44,244,55" href="javascript:void(0)" onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', 'start: ', '', 'stop: 0', '', 'length:  bp', '', 'identity: ', '', 'e-value: ' ));;return false;" title="alignment5 " alt="alignment5 " target="_blank"/>
<area shape="rect" coords="241,57,247,68" href="javascript:void(0)" onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', 'start: ', '', 'stop: 0', '', 'length:  bp', '', 'identity: ', '', 'e-value: ' ));;return false;" title="integration_pt " alt="integration_pt " target="_blank"/>
<area shape="rect" coords="108,70,608,81" href="javascript:void(0)" onclick="javascript:void(zmenu( 'Nphs1                                   ', '', 'ensembl_id: ENSMUSG00000006649', '', 'start: 30168485', '', 'stop: 30195968', '', 'length: 27483 bp' ));;return false;" title="gene " alt="gene " target="_blank"/>
<area shape="rect" coords="108,83,117,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'exon1', '', 'start: 30168485', '', 'stop: 30169003', '', 'length: 518 bp' ));;return false;" title="exon1 " alt="exon1 " target="_blank"/>
<area shape="rect" coords="117,83,119,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'intron1', '', 'start: 30169004', '', 'stop: 30169083', '', 'length: 79 bp ' ));;return false;" title="intron1 " alt="intron1 " target="_blank"/>
<area shape="rect" coords="119,83,123,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'exon2', '', 'start: 30169084', '', 'stop: 30169299', '', 'length: 215 bp' ));;return false;" title="exon2 " alt="exon2 " target="_blank"/>
<area shape="rect" coords="123,83,124,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'intron2', '', 'start: 30169300', '', 'stop: 30169373', '', 'length: 73 bp ' ));;return false;" title="intron2
...
</div>


This is part of the code I used in my HTML file to display the image map, and it
runs beautifully with Mozilla 1.7 and the latest Firefox version. In IE6,
however, the clickable pop-ups do not appear or work.

I would appreciate any help, and thank you all in advance.
 
 Best regards, 
 
 
 Anh-Thu
________________________________________________________________________
GSF-Forschungszentrum

Ingolstädter Landstr. 1

85764 München-Neuherberg, Germany

Chairman of Supervisory Board: MinDir Dr. Peter Lange

Board of Directors: Prof. Dr. Günther Wess and Dr. Nikolaus Blum

Register of Societies: Amtsgericht München HRB 6466


From lstein at cshl.edu  Tue Jun  5 09:56:57 2007
From: lstein at cshl.edu (Lincoln Stein)
Date: Tue, 5 Jun 2007 09:56:57 -0400
Subject: [Bioperl-l] problems with image maps and IE 6 or higher
In-Reply-To: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de>
References: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de>
Message-ID: <6dce9a0b0706050656n783d27b3u9229f948b2710d90@mail.gmail.com>

Hi Anh-Thu,

Could you send me a snippet of the code that is generating this imagemap? It
looks like you are relying on a javascript library for the zmenu() call, and
it may be that this library is in need of updating.

You might also consider replacing the library with Sheldon McKay's popup
balloon library, located at
http://www.wormbase.org/wiki/index.php/Balloon_Tooltips

Lincoln


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From cjfields at uiuc.edu  Tue Jun  5 11:28:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 5 Jun 2007 10:28:24 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <46656D64.7010508@ribosome.natur.cuni.cz>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
Message-ID: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>

Martin,

The example file you give in the bioperl bugzilla report has several  
blank annotation lines which may lead to additional problems.  When  
the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM,  
DEFINITION, etc) then it expects there will also be relevant data  
(text descriptions) accompanying it; I assume the BioPython parser  
expects likewise though I may be wrong.

AFAIK the inclusion of field names w/o text isn't GenBank/EMBL-compliant.
In GenBank records, fields lacking text either have a '.' instead or
are left out entirely:

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

We could add a fix but you should probably contact the ApE developers  
and request that field names w/o text be left out or have '.' added.

chris

On Jun 5, 2007, at 9:04 AM, Martin MOKREJŠ wrote:

> Ezequiel Panepucci wrote:
>>>     genbank entry = parser.parse(fhandle)
>>
>> there is a space character between "genbank" and "entry".
>> It is a syntax error.
>> I suppose you meant "genbank_entry" ?
>
> Yes, the next command was right and has shown the error. Sorry, I  
> forgot
> to delete the first attempt. ;-)
>
>>>> genbank_entry = parser.parse(fhandle)
> Traceback (most recent call last):
>  File "<stdin>", line 1, in ?
>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py",  
> line 187, in parse
>    self._scanner.feed(handle, self._consumer)
>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",  
> line 360, in feed
>    self._feed_first_line(consumer, self.line)
>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",  
> line 835, in _feed_first_line
>    assert False, \
> AssertionError: Did not recognise the LOCUS line layout:
> LOCUS               6499 bp ds-DNA     linear       02-AUG-2006
>
>>>>
>
> Martin

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From stewarta at nmrc.navy.mil  Tue Jun  5 11:34:14 2007
From: stewarta at nmrc.navy.mil (Andrew Stewart)
Date: Tue, 5 Jun 2007 11:34:14 -0400
Subject: [Bioperl-l] Setting attributes on a Bio::DB::GFF::Feature object
Message-ID: <95C9F539-A4C4-4B6A-8DA8-079B957BF909@nmrc.navy.mil>

I see bidirectional mutator methods for source, type, strand, etc. in  
the Bio::DB::GFF::Feature documentation but I see that ->attributes  
is only able to get and not set the feature attributes.  Is there no  
way to modify the attributes of a Bio::DB::GFF::Feature live?


--
Andrew Stewart
Research Assistant, Genomics Team
Navy Medical Research Center (NMRC)
Biological Defense Research Directorate (BDRD)
BDRD Annex
12300 Washington Avenue, 2nd Floor
Rockville, MD 20852

email: stewarta at nmrc.navy.mil
phone: 301-231-6700 Ext 270


From cjfields at uiuc.edu  Tue Jun  5 12:07:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 5 Jun 2007 11:07:41 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
	<24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
Message-ID: <D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>

One thing I missed which explains the biopython error: the LOCUS line
is missing the locus identifier (see the NCBI example record link).
This doesn't choke the bioperl parser but it appears to stop the
biopython parser in its tracks (maybe a feature instead of a bug!).

You should try adding a unique identifier (maybe the name of the file  
or record) to the LOCUS line to see if it works:

LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006

The bioperl parser in CVS writes out the correct alphabet when this  
is added:

LOCUS       testfile                6499 bp    ds-DNA  linear   02-AUG-2006

I'll try adding a warning to the bioperl parser for this.
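
A rough way to spot this before parsing, sketched in plain Perl: if the
first field after the LOCUS keyword is a number followed by "bp", the
identifier is missing and the length has taken its place. The helper name
locus_has_identifier is hypothetical, and this is only an illustration,
not the warning mentioned above or anything taken from BioPerl itself.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): returns 1 if a GenBank
# LOCUS line carries an identifier between "LOCUS" and the length field,
# 0 otherwise.
sub locus_has_identifier {
    my ($line) = @_;
    # Split on whitespace: LOCUS, name, length, "bp", ...
    my @fields = split ' ', $line;
    return 0 unless @fields >= 4 && $fields[0] eq 'LOCUS';
    # A numeric second field followed by "bp" means the name is missing
    # and the length has moved into its position.
    return 0 if $fields[1] =~ /^\d+$/ && $fields[2] eq 'bp';
    return 1;
}

# The two LOCUS lines discussed in this thread.
my $without_name = 'LOCUS               6499 bp ds-DNA     linear       02-AUG-2006';
my $with_name    = 'LOCUS       testfile                6499 bp    ds-DNA  linear   02-AUG-2006';

for my $line ($without_name, $with_name) {
    my $status = locus_has_identifier($line) ? 'has name' : 'NO NAME ';
    print "$status: $line\n";
}
```

A purely numeric locus name still passes, because the field after it is
then the length rather than "bp".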

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From staffa at niehs.nih.gov  Tue Jun  5 22:00:34 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Tue, 05 Jun 2007 22:00:34 -0400
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <C170E69F.246E%staffa@niehs.nih.gov>
Message-ID: <C28B8D82.51AE%staffa@niehs.nih.gov>

If I knew exactly what this error message meant, I might be able to
find my mistake.
I don't see much difference between this program and programs that worked.
Can I assume that new() worked because an index file exists?
I don't know how the filehandle UTR_TT_GENES gets involved.
Maybe I should use some other module, but I really would like to have
get_Seq_by_id functionality.

The error message:
Dpse ortholog = Dpse_GA17307
fetching GA17307
Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,
<UTR_TT_GENES> line 4.

Relevant code:
#!/usr/bin/perl
#
#
#
use strict;
use Bio::DB::Fasta;
use Bio::Tools::SeqWords;
use Bio::Seq;
use Bio::SeqIO;
#
my $db = Bio::DB::Fasta->new(
    '/home/staffa/clients/Kari/D_pse_genome/testit/TT_orthologs_Dpse_genes.fa',
    -makeid => \&make_my_id);
...
...
...
my $pse_obj = $db->get_Seq_by_id('GA17307');
my $pse_sequence = $pse_obj->seq;


Nick Staffa 
Telephone: 919-316-4569  (NIEHS: 6-4569)
Scientific Computing Support Group
NIEHS Information Technology Support Services Contract
(Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov)
National Institute of Environmental Health Sciences
National Institutes of Health
Research Triangle Park, North Carolina


From jason at bioperl.org  Tue Jun  5 23:12:40 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 5 Jun 2007 20:12:40 -0700
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <C28B8D82.51AE%staffa@niehs.nih.gov>
References: <C28B8D82.51AE%staffa@niehs.nih.gov>
Message-ID: <EC9E4A2E-2C06-4ADE-8317-9E25DDF1C9C4@bioperl.org>

The filehandle is probably not important; Perl just appends the most
recently read filehandle and its line number to the message whenever one
is open.

More importantly, what is on line 84?

My guess is that you are trying to get a sequence out and it doesn't
exist. Some error-checking code around the lines that fetch the sequence
would be helpful.


--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/



From torsten.seemann at infotech.monash.edu.au  Wed Jun  6 02:06:37 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 6 Jun 2007 16:06:37 +1000
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <C28B8D82.51AE%staffa@niehs.nih.gov>
References: <C170E69F.246E%staffa@niehs.nih.gov>
	<C28B8D82.51AE%staffa@niehs.nih.gov>
Message-ID: <a79f6a4b0706052306r16f7ce61y28448c18349ac3f4@mail.gmail.com>

Nick,

> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,

The error makes it pretty clear. You are calling the ->seq method on
an undefined value, i.e. $pse_obj, so the lookup returned nothing.

> my $pse_obj = $db->get_Seq_by_id('GA17307');

# check we got something!
die "sequence not in database" unless $pse_obj;

> my $pse_sequence = $pse_obj->seq;


-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010


From shameer at ncbs.res.in  Wed Jun  6 02:27:42 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Wed, 6 Jun 2007 11:57:42 +0530 (IST)
Subject: [Bioperl-l] Validation of files using BioPerl
Message-ID: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>

Dear All,

How can I validate an input file in fasta/PIR/GenPept/PDB format using
BioPerl? (This is to keep new users from submitting invalid files to
servers.) Is there a module available?

Many thanks in advance,
-- 
Shameer Khadar


From cjfields at uiuc.edu  Wed Jun  6 08:37:28 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 6 Jun 2007 07:37:28 -0500
Subject: [Bioperl-l] Validation of files using BioPerl
In-Reply-To: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>
References: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>
Message-ID: <39F5F622-0C93-4DC5-B969-491F789FC932@uiuc.edu>

It has been discussed but never coded.  I believe if a file passes
through the Bio::SeqIO parser it's generally considered validly
formatted (spacing, balanced quotes), though the parser doesn't
specifically check FT keys and qualifiers for invalid ones, look for
missing annotation, check taxonomy, etc.

As long as the end sequence mark (//) is present for every sequence, you
could try splitting the file into chunks (read with 'local $/ = "//";')
and passing each chunk as a filehandle (via IO::String) to a
Bio::SeqIO object wrapped in an eval block (the parser resets $/, so
it should work).  Follow the eval with a check of $@ for caught
errors.  It might get tedious for big sequences...
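
A minimal sketch of that idea, under a few assumptions: the helper names
split_records and record_error are made up for illustration, the record
separator is "//" followed by a newline (so a "//" inside a URL does not
split a record), and a core-Perl in-memory filehandle stands in for
IO::String. It only reports problems the parser actually throws, so it is
a format smoke test rather than a full validator.

```perl
use strict;
use warnings;

# Split flat-file text into records, each ending with the "//" terminator
# line. Trailing text without a terminator comes back as a final chunk.
sub split_records {
    my ($text) = @_;
    my @records;
    open my $in, '<', \$text or die "cannot read in-memory text: $!";
    {
        local $/ = "//\n";    # read one record per <$in>
        while (my $chunk = <$in>) {
            push @records, $chunk if $chunk =~ /\S/;
        }
    }
    close $in;
    return @records;
}

# Try to parse one record. Returns an empty string on success, or the
# error text caught by eval. Bio::SeqIO is loaded lazily so that
# split_records() can be used on its own.
sub record_error {
    my ($record, $format) = @_;
    require Bio::SeqIO;
    open my $fh, '<', \$record or die "cannot read in-memory record: $!";
    my $seq = eval {
        Bio::SeqIO->new(-fh => $fh, -format => $format)->next_seq;
    };
    return "$@" if $@;
    return defined $seq ? '' : 'parser returned no sequence';
}

# Usage (the file name and format are placeholders):
#   my $text = do { local $/; open my $f, '<', 'input.gb' or die $!; <$f> };
#   my $n = 0;
#   for my $rec (split_records($text)) {
#       $n++;
#       my $err = record_error($rec, 'genbank');
#       print "record $n: ", ($err ? "FAILED: $err" : 'ok'), "\n";
#   }
```

Parsing record by record keeps one bad entry from hiding the state of the
rest of the file, at the cost of creating a new parser for each record.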

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From staffa at niehs.nih.gov  Wed Jun  6 10:40:49 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Wed, 06 Jun 2007 10:40:49 -0400
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <a79f6a4b0706052306r16f7ce61y28448c18349ac3f4@mail.gmail.com>
Message-ID: <C28C3FB1.4B73%staffa@niehs.nih.gov>

Indeed.
One must know what is actually in his header,
AND 
one must write the appropriate make_id subroutine
AND
one must specify the exact ID.
THEN things might work.
And they did!
THANK YOU
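
For anyone hitting the same error, here is a sketch of what such a
subroutine might look like. The header layout is an ASSUMPTION (something
like ">Dpse_GA17307 description"); the real file may differ, so the
pattern has to be adapted to what is actually in the headers. As far as I
know, Bio::DB::Fasta hands the callback the whole header line, including
the leading ">".

```perl
use strict;
use warnings;

# Hypothetical -makeid callback. ASSUMPTION: headers look like
# ">Dpse_GA17307 optional description". Returns the first word of the
# header with an optional "Dpse_" prefix removed, or undef if the line
# does not look like a FASTA header.
sub make_my_id {
    my ($header) = @_;
    my ($id) = $header =~ /^>\s*(?:Dpse_)?(\S+)/;
    return $id;
}

# Quick check on a made-up header line before building the index.
print make_my_id(">Dpse_GA17307 hypothetical description\n"), "\n";
```

The callback is passed to the constructor as in the original script
(-makeid => \&make_my_id), and the ID given to get_Seq_by_id must then be
exactly what the callback returns. If an index built with a different ID
scheme already exists, it probably needs rebuilding; I believe the
-reindex option of Bio::DB::Fasta->new() forces that.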



From jaudall at gmail.com  Wed Jun  6 17:51:33 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Wed, 6 Jun 2007 15:51:33 -0600
Subject: [Bioperl-l] blastxml interation
Message-ID: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>

I was searching in the Deobfuscator under
Bio::Search::Result::BlastResult, but there doesn't seem to be a
method to extract the iteration number from a
blastxml report.  I can see this number being possibly useful to count the
number of queries that didn't hit anything, since there are no empty reports in
the blastxml output.  If I'm missing something, I would welcome an example
of how to retrieve the result iteration number.  Thanks in advance for any
suggestions.

Josh


From dmessina at wustl.edu  Wed Jun  6 18:18:26 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 6 Jun 2007 17:18:26 -0500
Subject: [Bioperl-l] blastxml interation
In-Reply-To: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
Message-ID: <CBBAAD1F-563D-4B43-B086-F707989939EA@wustl.edu>

I think you want to look at the hits(), num_hits() and no_hits_found()
methods. There is a private method _next_iteration_index() which
should do what you asked for, but num_hits() looks like the better way.

By the way, hits() and num_hits() are listed on the Deobfuscator as
having no documentation. This (as the text below shows) is incorrect and
is due to some nonstandard formatting issues which I will correct.
_next_iteration_index() isn't listed on the Deobfuscator because it's
a private method.


Hope this helps!
Dave


hits()

This method overrides Bio::Search::Result::GenericResult::hits to take
into account the possibility of multiple iterations, as occurs in
PSI-BLAST reports.
If there are multiple iterations, all 'new' hits for all iterations
are returned.
These are the hits that did not occur in a previous iteration.
See Also: Bio::Search::Result::GenericResult::hits

num_hits()

This method overrides Bio::Search::Result::GenericResult::num_hits to
take into account the possibility of multiple iterations, as occurs in
PSI-BLAST reports.
If there are multiple iterations, calling num_hits() returns the
number of 'new' hits for each iteration. These are the hits that did
not occur in a previous iteration.
See Also: Bio::Search::Result::GenericResult::num_hits

no_hits_found()

  Usage     : $nohits = $blast->no_hits_found( $iteration_number );
  Purpose   : Get boolean indicator indicating whether or not any hits
              were present in the report.
              This is NOT the same as determining the number of hits via
              the hits() method, which will return zero hits if there were
              no hits in the report or if all hits were filtered out
              during the parse.

              Thus, this method can be used to distinguish these
              possibilities for hitless reports generated when filtering.

  Returns   : Boolean
  Argument  : (optional) integer indicating the iteration number (PSI-BLAST)
              If iteration number is not specified and this is a PSI-BLAST
              result, then this method will return true only if all
              iterations had no hits found.
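
If the goal is just to find the queries that hit nothing, one possible
approach (a sketch, not an existing BioPerl feature) is to compare the
list of submitted query IDs against the results that report at least one
hit. The helper name queries_without_hits is made up, and the sketch
assumes that query_name() returns the same IDs you submitted, which is
worth checking against a real report first.

```perl
use strict;
use warnings;

# Given an array ref of all submitted query IDs and a Bio::SearchIO-style
# parser object, return the IDs without hits: either their result reports
# zero hits, or no result for them appears in the report at all.
sub queries_without_hits {
    my ($all_ids, $searchio) = @_;
    my %has_hits;
    while (my $result = $searchio->next_result) {
        $has_hits{ $result->query_name } = 1 if $result->num_hits > 0;
    }
    return grep { !$has_hits{$_} } @$all_ids;
}

# Usage (requires BioPerl; the file name and IDs are placeholders):
#   use Bio::SearchIO;
#   my $in = Bio::SearchIO->new(-format => 'blastxml',
#                               -file   => 'report.xml');
#   my @hitless = queries_without_hits([qw(query1 query2 query3)], $in);
#   print "$_\n" for @hitless;
```

Because missing results and zero-hit results are treated alike, this does
not depend on the iteration number being available.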


From apurva at cshl.edu  Wed Jun  6 19:51:45 2007
From: apurva at cshl.edu (Apurva Narechania)
Date: Wed, 6 Jun 2007 19:51:45 -0400
Subject: [Bioperl-l] non-palindromic issue in Bio::Restriction::Analysis
Message-ID: <3F7C7E33-416A-4141-969A-DDC4716E8A44@cshl.edu>

Hi,

I was hoping you could confirm and give me some feedback on an issue
I think I've found with the Bio::Restriction::Analysis module. I am
using the enzyme AciI, a non-palindromic restriction enzyme with a 5'
C | CGC 3' recognition site. The module should search both the
forward and the reverse complement strings in the case of a
non-palindromic enzyme. I have found that this works only
intermittently. For example, the following sequence:

GAAAAAAACAAAGGAAGAAGCTAGCTAGCAGGGCACGCGGTTTGAGGATGGCTGGTGGCCGACCGCAGGGCG 
CGCGGTTG
GAGGATTGCTGGTGGCCGACCAGATGAAACTCACGCGCGGCTGGGGACAGCTGGAATATTTGGGCGGCGGCG 
GCTGGTAT
TACGGGAAAGGAGAGATAGGGTTTTGGACGGCAGCAGCTGGTATTTGGGCCACCAATTTTGCGCGCCAGTAC 
AGGACACC
GATGCCGCAAATTGCACAATGCCTTTTATGGCGACTGACAGTGCGATGCTATAGGTATGAATTGTCGACTGA 
CAAAGTGA
CACTATTCACATATAAATATAACGAATAACACTCAGTTGGAATATAGACATATGCCGACTCACCATCTGTGG 
CAATGTAT
ACCGACTAACAATTCGATGCTAATTCTCTATTTATAGCGACAGTCGTCAGACACTAATTTGGTGTTGTGGTA 
TAATGCTA
GTGCCTCACCGCTGTAGGTGTTGGTCTACTGGTGC

should digest into 10 fragments using this enzyme, but the module
produces only 7. Could you please confirm this behavior and, if you
observe it, suggest some possible fixes? This may be a bug in the
_non_pal_enz method, or I may be overlooking something pretty obvious.
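
For an independent count, here is a small plain-Perl sketch (it does not
use Bio::Restriction::Analysis, and the helper names are made up). It
counts the recognition site on the given strand and, for a
non-palindromic site, its reverse complement as well. For a linear
sequence the expected number of fragments is then the total number of
sites plus one, assuming every site is cut at a distinct position.

```perl
use strict;
use warnings;

# Reverse complement of a DNA string (plain A/C/G/T only).
sub revcom {
    my ($seq) = @_;
    my $rc = scalar reverse $seq;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Count a recognition site on both strands of a linear sequence. A site
# on the bottom strand appears on the top strand as its reverse
# complement. Returns (forward count, reverse count).
sub count_sites_both_strands {
    my ($seq, $site) = @_;
    $seq  = uc $seq;
    $site = uc $site;
    my $rc_site = revcom($site);

    # Zero-width lookahead so that overlapping sites are all counted.
    my $forward = () = $seq =~ /(?=\Q$site\E)/g;

    # A palindromic site reads the same on both strands: count it once.
    my $reverse = 0;
    if ($rc_site ne $site) {
        $reverse = () = $seq =~ /(?=\Q$rc_site\E)/g;
    }
    return ($forward, $reverse);
}

# Toy example with one CCGC site and one GCGG (reverse-strand) site.
my ($fwd, $rev) = count_sites_both_strands('AACCGCAAGCGGAA', 'CCGC');
printf "forward: %d, reverse: %d, expected fragments: %d\n",
    $fwd, $rev, $fwd + $rev + 1;
```

To run this on the sequence above, the line breaks and spaces introduced
by the mail wrapping have to be removed first.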

Thanks,
Apurva Narechania.


From cjfields at uiuc.edu  Wed Jun  6 20:51:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 6 Jun 2007 19:51:00 -0500
Subject: [Bioperl-l] blastxml interation
In-Reply-To: <CBBAAD1F-563D-4B43-B086-F707989939EA@wustl.edu>
References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
	<CBBAAD1F-563D-4B43-B086-F707989939EA@wustl.edu>
Message-ID: <B494A9F2-80CE-4761-B67F-127B37358819@uiuc.edu>

Joshua,

Just to make sure there is no confusion, do you mean a
Bio::Search::Iteration::IterationI-based object?  The iteration tags
apparently have multiple meanings in BLAST XML output (multiple
queries, multiple PSI-BLAST iterations).  The current
SearchIO::blastxml parser returns multiple
Bio::Search::Result::BlastResult objects based on the iterations, so
PSI-BLAST output is treated as multiple BLAST reports regardless
(i.e. no Iteration objects).  This is something I want to rectify but
it may not be an easy fix.

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Wed Jun  6 20:45:14 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 6 Jun 2007 20:45:14 -0400
Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db
Message-ID: <DC1816E2-68C0-400C-A777-F6D14DE2B870@gmx.net>

I have added support to BioSQL and bioperl-db for schemas in  
PostgreSQL. A schema in PostgreSQL is more or less a namespace for  
database objects (tables, indexes, views, etc) within a database.

(A database in PostgreSQL is similar to the concept of a user in  
Oracle or MySQL, and therefore for the latter two schemas are  
synonymous with a user. [Not sure I'm still up-to-date on this for  
MySQL, but at least that's what I recall.])

When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts,  
you specify the schema in which BioSQL resides using the --schema  
option.

If you are using bioperl-db as a library, the Bio::DB::BioDB->new()
call also accepts a -schema named parameter, and Bio::DB::DBContextI
objects have a $dbc->schema() property for getting/setting the
schema.  Bio::DB::SimpleDBContext->new() accepts a -schema parameter
too, and you may also add the property to the .bioperldb connection
parameter file (-schema => 'yourschemahere').

Thanks to Brian Osborne for being the instigator (and tester, and
for adding the code to load_ncbi_taxonomy.pl - I came too late).

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jaudall at gmail.com  Wed Jun  6 17:41:08 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Wed, 6 Jun 2007 15:41:08 -0600
Subject: [Bioperl-l] blastxml interation number
Message-ID: <52cea20c0706061441n96ce803v9422e8d14461c2bd@mail.gmail.com>

I was searching in the Deobfuscator under
Bio::Search::Result::BlastResult, but there doesn't seem to be a
method to extract the iteration number from a
blastxml report.  I can see this number being very useful to count the
number of queries that didn't hit anything, since there are no empty reports in
the blastxml output.  If I'm missing something, I would welcome an example
of how to retrieve the result iteration number; otherwise I'm suggesting that
an iteration_count feature be added to the Result object.  Thanks in advance
for any suggestions.

Josh


From holland at ebi.ac.uk  Thu Jun  7 03:33:25 2007
From: holland at ebi.ac.uk (Richard Holland)
Date: Thu, 07 Jun 2007 08:33:25 +0100
Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db
In-Reply-To: <DC1816E2-68C0-400C-A777-F6D14DE2B870@gmx.net>
References: <DC1816E2-68C0-400C-A777-F6D14DE2B870@gmx.net>
Message-ID: <4667B4C5.6070107@ebi.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sounds great.

BioJava users shouldn't need to change anything to get this to work as
PostgreSQL JDBC connection objects already require you to specify a schema.

cheers,
Richard


Hilmar Lapp wrote:
> I have added support to BioSQL and bioperl-db for schemas in PostgreSQL.
> A schema in PostgreSQL is more or less a namespace for database objects
> (tables, indexes, views, etc) within a database.
> 
> (A database in PostgreSQL is similar to the concept of a user in Oracle
> or MySQL, and therefore for the latter two schemas are synonymous with a
> user. [Not sure I'm still up-to-date on this for MySQL, but at least
> that's what I recall.])
> 
> When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts, you
> specify the schema in which BioSQL resides using the --schema option.
> 
> If you are using bioperl-db as a library, the Bio::DB::BioDB->new() call
> also accepts a -schema named parameter, and Bio::DB::DBContextI objects
> have a $dbc->schema() property for getting/setting the schema,
> Bio::DB::SimpleDBContext->new() accepts a -schema parameter, and you may
> also add the property to the .bioperldb connection parameter file
> (-schema => 'yourschemahere').
> 
> Thanks to Brian Osborne for being the instigator (and tester, and for
> adding the code to load_ncbi_taxonomy.pl - I came too late).
> 
>     -hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
> 
> 
> 
> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGZ7TF4C5LeMEKA/QRApwUAJ48q46iX152pB6Xcc/717Ie8foUTQCgm3ij
W/+0iO/ZsNDn1pLuf5yXbYA=
=asUn
-----END PGP SIGNATURE-----


From mmokrejs at ribosome.natur.cuni.cz  Thu Jun  7 10:26:44 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Thu, 07 Jun 2007 16:26:44 +0200
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
	<24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
	<D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>
Message-ID: <466815A4.9060505@ribosome.natur.cuni.cz>

Hi,

Chris Fields wrote:
> One thing I missed which explains the biopython error: the LOCUS line is 
> missing the locus identifier (see the NCBI example record link).  This 
> doesn't choke the bioperl parser but it appears to stop the biopython 
> parser in its tracks (maybe a feature instead of a bug!).
> 
> You should try adding a unique identifier (maybe the name of the file or 
> record) to the LOCUS line to see if it works:
> 
> LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006
> 
> The bioperl parser in CVS writes out the correct alphabet when this is 
> added:
> 
> LOCUS       testfile                6499 bp    ds-DNA  linear   02-AUG-2006
> 
> I'll try adding a warning to the bioperl parser for this.

I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305 but let me
emphasize the LOCUS line now contains 

LOCUS                      pRL        5428 bp ds-DNA   linear       07-JUN-2007


which still does not comply with the line you have proposed. But it can be
parsed by bioperl-live from cvs. Is it still wrong? Testcase as pRL.gb-new
in the bugzilla record #2305.

Martin

> 
> chris
> 
> On Jun 5, 2007, at 10:28 AM, Chris Fields wrote:
> 
>> Martin,
>>
>> The example file you give in the bioperl bugzilla report has several
>> blank annotation lines which may lead to additional problems.  When
>> the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM,
>> DEFINITION, etc) then it expects there will also be relevant data
>> (text descriptions) accompanying it; I assume the BioPython parser
>> expects likewise though I may be wrong.
>>
>> AFAIK the inclusion of field names w/o text isn't GenBank/EMBL-
>> compliant.  GenBank records lacking text either have a '.' instead or
>> are left out entirely:
>>
>> http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html
>>
>> We could add a fix but you should probably contact the ApE developers
>> and request that field names w/o text be left out or have '.' added.
>>
>> chris
>>
>> On Jun 5, 2007, at 9:04 AM, Martin MOKREJŠ wrote:
>>
>>> Ezequiel Panepucci wrote:
>>>>>     genbank entry = parser.parse(fhandle)
>>>>
>>>> there is a space character between "genbank" and "entry".
>>>> It is a syntax error.
>>>> I suppose you meant "genbank_entry" ?
>>>
>>> Yes, the next command was right and has shown the error. Sorry, I
>>> forgot
>>> to delete the first attempt. ;-)
>>>
>>>>>> genbank_entry = parser.parse(fhandle)
>>> Traceback (most recent call last):
>>>  File "<stdin>", line 1, in ?
>>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py",
>>> line 187, in parse
>>>    self._scanner.feed(handle, self._consumer)
>>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>>> line 360, in feed
>>>    self._feed_first_line(consumer, self.line)
>>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>>> line 835, in _feed_first_line
>>>    assert False, \
>>> AssertionError: Did not recognise the LOCUS line layout:
>>> LOCUS               6499 bp ds-DNA     linear       02-AUG-2006
>>>
>>>>>>
>>>
>>> Martin
>>> _______________________________________________
>>> BioPython mailing list  -  BioPython at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biopython
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>>
>> _______________________________________________
>> BioPython mailing list  -  BioPython at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> 
> 

-- 
Dr. Martin Mokrejs
Dept. of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs


From cjfields at uiuc.edu  Thu Jun  7 11:31:45 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 7 Jun 2007 10:31:45 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <466815A4.9060505@ribosome.natur.cuni.cz>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
	<24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
	<D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>
	<466815A4.9060505@ribosome.natur.cuni.cz>
Message-ID: <2A403865-F1E8-4D19-8D19-455C22E7C6D9@uiuc.edu>

On Jun 7, 2007, at 9:26 AM, Martin MOKREJŠ wrote:

> Hi,
>
> Chris Fields wrote:
>> One thing I missed which explains the biopython error: the LOCUS  
>> line is missing the locus identifier (see the NCBI example record  
>> link).  This doesn't choke the bioperl parser but it appears to  
>> stop the biopython parser in its tracks (maybe a feature instead  
>> of a bug!).
>> You should try adding a unique identifier (maybe the name of the  
>> file or record) to the LOCUS line to see if it works:
>> LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006
>> The bioperl parser in CVS writes out the correct alphabet when  
>> this is added:
>> LOCUS       testfile                6499 bp    ds-DNA  linear   02-AUG-2006
>> I'll try adding a warning to the bioperl parser for this.
>
> I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305 but
> let me emphasize the LOCUS line now contains
> LOCUS                      pRL        5428 bp ds-DNA   linear       07-JUN-2007
>
>
> which still does not comply with the line you have proposed. But it
> can be parsed by bioperl-live from cvs. Is it still wrong? Testcase as
> pRL.gb-new in the bugzilla record #2305.
>
> Martin

That should work.  There isn't a strict uniqueness test (that would
require caching and isn't worth the trouble IMHO), though you do need
a unique accession/locus for each record if you plan on indexing them
in the future.
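
As a rough illustration of the check discussed in this thread, the sketch below spots a LOCUS line without an identifier by looking at the whitespace-separated fields. It is plain Perl and is not what the bioperl or biopython parsers do internally; locus_name is a hypothetical helper, and the rule "a number followed by 'bp' right after LOCUS means the name is missing" is an assumption made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the locus identifier, or nothing if the line
# goes straight from LOCUS to "<length> bp".
sub locus_name {
    my ($line) = @_;
    my @field = split ' ', $line;
    return unless @field >= 3 && $field[0] eq 'LOCUS';
    # Assumption: "<digits> bp" directly after LOCUS means no identifier.
    return if $field[1] =~ /^\d+$/ && $field[2] eq 'bp';
    return $field[1];
}

# The three LOCUS lines quoted in this thread.
my @lines = (
    'LOCUS               6499 bp ds-DNA     linear       02-AUG-2006',
    'LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006',
    'LOCUS                      pRL        5428 bp ds-DNA   linear       07-JUN-2007',
);

for my $line (@lines) {
    my $name = locus_name($line);
    print defined $name ? "identifier: $name\n" : "no identifier\n";
}
```

The check ignores column positions, which is why the pRL line passes here even though its spacing differs from the proposed layout.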

Parsing GenBank data produced by third-party software is problematic
at best; some programs seem to follow no firm rule for their GenBank
output, even though the specification is plainly stated in the NCBI
release notes.  My take on that is to have a stricter (read: follows
the release notes) GenBank parser which passes the data in the record
off to default handler methods.  A user could then replace the default
handlers with their own, either by subclassing the default handler
class and overriding the methods or by adding their own code
references directly.
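
A minimal sketch of that handler idea, in plain Perl, might look like the following. None of it is existing BioPerl code: the package names My::GenBank::Handler and My::GenBank::LenientHandler, the dispatch method and the handle_* naming scheme are all invented here to show the two extension routes (subclassing, or passing in code references).

```perl
use strict;
use warnings;

# Hypothetical default handler class (not part of BioPerl).
package My::GenBank::Handler;

sub new {
    my ($class, %custom) = @_;
    # %custom maps a field keyword to a user-supplied code reference
    return bless { custom => {%custom}, record => {} }, $class;
}

# Route one "KEYWORD value" pair: a user code ref wins, then a method
# called handle_KEYWORD, then the catch-all handle_default.
sub dispatch {
    my ($self, $key, $value) = @_;
    if (my $code = $self->{custom}{$key}) {
        return $code->($self, $value);
    }
    my $method = $self->can("handle_$key") || $self->can('handle_default');
    return $self->$method($value, $key);
}

sub handle_default {
    my ($self, $value, $key) = @_;
    $self->{record}{$key} = $value;
}

# Strict default: refuse a LOCUS line that starts with "<length> bp".
sub handle_LOCUS {
    my ($self, $value) = @_;
    die "LOCUS line lacks an identifier\n" if $value =~ /^\d+\s+bp\b/;
    ($self->{record}{LOCUS}) = split ' ', $value;
}

# Lenient subclass for third-party output: override one method only.
package My::GenBank::LenientHandler;
our @ISA = ('My::GenBank::Handler');

sub handle_LOCUS {
    my ($self, $value) = @_;
    my ($name) = split ' ', $value;
    $self->{record}{LOCUS} = $name =~ /^\d+$/ ? 'unknown' : $name;
}

package main;

my $lenient = My::GenBank::LenientHandler->new;
$lenient->dispatch(LOCUS      => '6499 bp ds-DNA linear 02-AUG-2006');
$lenient->dispatch(DEFINITION => '.');
print "$lenient->{record}{LOCUS}\n";
```

The other route would be My::GenBank::Handler->new(LOCUS => sub { ... }), which swaps in a code reference for a single field without a subclass.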

chris

...


From rich at thevillas.eclipse.co.uk  Fri Jun  8 07:00:45 2007
From: rich at thevillas.eclipse.co.uk (richard)
Date: Fri, 08 Jun 2007 12:00:45 +0100
Subject: [Bioperl-l] protparam
Message-ID: <466936DD.8080604@thevillas.eclipse.co.uk>


Hi,

I noticed that in April someone asked whether there was a bioperl mod 
for obtaining protein sequence related properties using protparam.
I have a module that could potentially be submitted to bioperl for this 
purpose. Does anybody have any thoughts on whether it should go in?

Example script and the module are at:

http://81.5.159.173/webshare/ 


Cheers
Rich


From cjfields at uiuc.edu  Fri Jun  8 08:37:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 8 Jun 2007 07:37:27 -0500
Subject: [Bioperl-l] protparam
In-Reply-To: <466936DD.8080604@thevillas.eclipse.co.uk>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
Message-ID: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>

Richard,

We'll gladly add this in, though it'll need to be bioperlized  
(inherit Bio::Root::Root).  We also generally ask for tests but it  
should be easy to write up a quick test suite using any protein seq.

If you can, could you add some bioperl-like POD to the module (i.e.
SYNOPSIS, AUTHOR, DESCRIPTION, etc)?

thanks!

chris

On Jun 8, 2007, at 6:00 AM, richard wrote:

>
> Hi,
>
> I noticed that in April someone asked whether there was a bioperl mod
> for obtaining protein sequence related properties using protparam.
> I have a module that could potentially be submitted to bioperl for  
> this
> purpose. Does anybody have any thoughts on whether it should go in?
>
> Example script and the module are at:
>
> http://81.5.159.173/webshare/
>
>
> Cheers
> Rich
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From mmokrejs at ribosome.natur.cuni.cz  Fri Jun  8 07:09:42 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Fri, 08 Jun 2007 13:09:42 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file?
Message-ID: <466938F6.7050903@ribosome.natur.cuni.cz>

Hi,
  how can I convert a GenBank/EMBL formatted file to a GFF file? The manpage for
Bio::Graphics::FeatureFile does not help me with this. The information is in
the file, so I just want to extract the features to GFF format; presumably the
sequence has to be stored somewhere as well ...
 Is there a tool that can convert it automatically? ;) This would be great. I
can't make the GFF manually for every file. Other programs draw plasmid maps
automatically from GenBank formatted input, so how can I do it in bioperl?
Thanks for the help,
Martin


From shameer at ncbs.res.in  Fri Jun  8 10:11:00 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Fri, 8 Jun 2007 19:41:00 +0530 (IST)
Subject: [Bioperl-l] protparam
In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
	<4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
Message-ID: <54411.192.168.1.1.1181311860.squirrel@mail.ncbs.res.in>

Richard,

I was the one who asked for a protparam module in bioperl!
Good job.

Cheers,
SK

> Richard,
>
> We'll gladly add this in, though it'll need to be bioperlized
> (inherit Bio::Root::Root).  We also generally ask for tests but it
> should be easy to write up a quick test suite using any protein seq.
>
> If you can could you add some bioperl-like POD to the module (i.e.
> SYNOPSIS, AUTHOR, DESCRIPTION, etc)?
>
> thanks!
>
> chris
>
> On Jun 8, 2007, at 6:00 AM, richard wrote:
>
>>
>> Hi,
>>
>> I noticed that in April someone asked whether there was a bioperl mod
>> for obtaining protein sequence related properties using protparam.
>> I have a module that could potentially be submitted to bioperl for
>> this
>> purpose. Does anybody have any thoughts on whether it should go in?
>>
>> Example script and the module are at:
>>
>> http://81.5.159.173/webshare/
>>
>>
>> Cheers
>> Rich
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From dmessina at wustl.edu  Fri Jun  8 10:58:20 2007
From: dmessina at wustl.edu (David Messina)
Date: Fri, 8 Jun 2007 09:58:20 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <466938F6.7050903@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
Message-ID: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>

Hi Martin,

You're in luck -- the BioPerl core distribution includes two scripts  
for doing just that:

	genbank2gff
	genbank2gff3

Look in the scripts directory of the distro.

Also, there is a *huge* amount of documentation and examples on the  
BioPerl website.

	http://www.bioperl.org/wiki/HOWTOs

Reading those, reading the FAQ, and searching the mailing list  
archives are where I look first when I don't know how to do something  
in BioPerl.


Dave

--
Dave Messina
Senior Analyst, Assembly Group
Genome Sequencing Center
Washington University
St. Louis, MO


From rich at thevillas.eclipse.co.uk  Fri Jun  8 11:51:21 2007
From: rich at thevillas.eclipse.co.uk (richard)
Date: Fri, 08 Jun 2007 16:51:21 +0100
Subject: [Bioperl-l] protparam
In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
	<4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
Message-ID: <46697AF9.2090502@thevillas.eclipse.co.uk>


Hi,

ok, great, that's no problem. I'll add the POD and bioperlize it,

thanks
Rich

Chris Fields wrote:
> Richard,
>
> We'll gladly add this in, though it'll need to be bioperlized  
> (inherit Bio::Root::Root).  We also generally ask for tests but it  
> should be easy to write up a quick test suite using any protein seq.
>
> If you can could you add some bioperl-like POD to the module (i.e.  
> SYNOPSIS, AUTHOR, DESCRIPTION, etc)?
>
> thanks!
>
> chris
>
> On Jun 8, 2007, at 6:00 AM, richard wrote:
>
>   
>> Hi,
>>
>> I noticed that in April someone asked whether there was a bioperl mod
>> for obtaining protein sequence related properties using protparam.
>> I have a module that could potentially be submitted to bioperl for  
>> this
>> purpose. Does anybody have any thoughts on whether it should go in?
>>
>> Example script and the module are at:
>>
>> http://81.5.159.173/webshare/
>>
>>
>> Cheers
>> Rich
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>     
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>   


From cjfields at uiuc.edu  Fri Jun  8 13:45:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 8 Jun 2007 12:45:17 -0500
Subject: [Bioperl-l] protparam
In-Reply-To: <46697AF9.2090502@thevillas.eclipse.co.uk>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
	<4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
	<46697AF9.2090502@thevillas.eclipse.co.uk>
Message-ID: <AA43E9C9-7064-438A-89A9-12E4B21E4F04@uiuc.edu>

Another issue is namespace.  I suggest Bio::Tools::ProtParam, though  
there may be some others out there.

We can add support for direct Bio::Seq/PrimarySeq input and other  
odds and ends once it's committed.  Good work!

chris

On Jun 8, 2007, at 10:51 AM, richard wrote:

>
> Hi,
>
> ok, great, that's no problem. I'll add the POD and bioperlize it,
>
> thanks
> Rich
>
> Chris Fields wrote:
>> Richard,
>>
>> We'll gladly add this in, though it'll need to be bioperlized
>> (inherit Bio::Root::Root).  We also generally ask for tests but it
>> should be easy to write up a quick test suite using any protein seq.
>>
>> If you can could you add some bioperl-like POD to the module (i.e.
>> SYNOPSIS, AUTHOR, DESCRIPTION, etc)?
>>
>> thanks!
>>
>> chris
>>
>> On Jun 8, 2007, at 6:00 AM, richard wrote:
>>
>>
>>> Hi,
>>>
>>> I noticed that in April someone asked whether there was a bioperl  
>>> mod
>>> for obtaining protein sequence related properties using protparam.
>>> I have a module that could potentially be submitted to bioperl for
>>> this
>>> purpose. Does anybody have any thoughts on whether it should go in?
>>>
>>> Example script and the module are at:
>>>
>>> http://81.5.159.173/webshare/
>>>
>>>
>>> Cheers
>>> Rich
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Mon Jun 11 07:30:24 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 11 Jun 2007 07:30:24 -0400
Subject: [Bioperl-l] script to load ITIS taxonomy
Message-ID: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>

Hi all -

I added a script to load the ITIS taxonomy (www.itis.gov) into the  
phylodb module. It is called load_itis_taxonomy.pl and is in the  
scripts/ directory.

It is independent of BioPerl right now (the ITIS download is either a  
MS SQL Server or an Informix dump - no kidding), but I'm hoping that  
at some point support for this can be integrated into Bio::TreeIO.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 11 08:24:50 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 11 Jun 2007 07:24:50 -0500
Subject: [Bioperl-l] script to load ITIS taxonomy
In-Reply-To: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>
References: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>
Message-ID: <99AC6C0F-10DD-4587-AFB3-32BC495CD2BD@uiuc.edu>


On Jun 11, 2007, at 6:30 AM, Hilmar Lapp wrote:

> Hi all -
>
> I added a script to load the ITIS taxonomy (www.itis.gov) into the
> phylodb module. It is called load_itis_taxonomy.pl and is in the
> scripts/ directory.
>
> It is independent of BioPerl right now (the ITIS download is either a
> MS SQL Server or an Informix dump - no kidding), but I'm hoping that
> at some point support for this can be integrated into Bio::TreeIO.
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

I second the TreeIO support.  Anyone up for it?

chris


From ryanx07 at hotmail.com  Mon Jun 11 11:24:31 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Mon, 11 Jun 2007 10:24:31 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F372C263F25DA66E3F2B986B41A0@phx.gbl>

I just started to learn BioPerl by reading the BioPerl Tutorial on the
BioPerl website. I tried the 1st example on my Windows machine:
use Bio::Perl;
$seq_object = get_sequence('swiss',"ID ROA1_HUMAN");
write_sequence(">roa1.fasta",'fasta',$seq_object);

I got the following error:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:153
STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
STACK: t8.pl:7

I cannot figure out what is wrong and cannot find a solution on the web.
Could someone help me please?

Also, this leads to my 2nd question: is there a way to search the archive
of the current list?

Thanks so much


R



From dmessina at wustl.edu  Mon Jun 11 12:34:29 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 11:34:29 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F372C263F25DA66E3F2B986B41A0@phx.gbl>
References: <BAY106-F372C263F25DA66E3F2B986B41A0@phx.gbl>
Message-ID: <25517EA3-7BDA-44AC-BDF3-93A6810D9D63@wustl.edu>

The example code works here, but I'm on OS X. Could you tell us which  
version of Perl and BioPerl you are using, and which operating system?

Are you getting anything in the roa1.fasta file?


> is there a way to search in the archieve of the current list?

http://www.bioperl.org/wiki/Mailing_lists


Dave


From dmessina at wustl.edu  Mon Jun 11 14:48:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 13:48:23 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F39783926A21896CCB15CD9B41A0@phx.gbl>
References: <BAY106-F39783926A21896CCB15CD9B41A0@phx.gbl>
Message-ID: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu>

Hi,

Please use 'Reply All' so everyone on the list can follow the  
discussion.

Try adding the following line after the line that starts with  
$seq_object:

	print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";

And then run the program again. What do you get? Could you post a  
complete printout of what you're doing?


Dave


On Jun 11, 2007, at 11:45 AM, L Xu wrote:
> I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and  
> activeperl 5.8.8.819 Thank you very much.


From johnsonm at gmail.com  Mon Jun 11 20:45:13 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Mon, 11 Jun 2007 19:45:13 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
Message-ID: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>

    This bit in Bio::SeqFeature::Gene::Exon is causing me some
problems trying to extend Bio::Tools::Glimmer to handle 'wraparound'
genes (circular genomes):

sub location {
   my ($self,$value) = @_;

   if(defined($value) && $value->isa('Bio::Location::SplitLocationI')) {
       $self->throw("split or compound location is not allowed ".
                    "for an object of type " . ref($self));
   }
   return $self->SUPER::location($value);
}

    That seems to be there all the way back to the initial revision
(checked in by Hilmar).  I presume it's there because of code like
this ( from the seq() method in Bio::SeqFeature::Generic):

# assumming our seq object is sensible, it should not have to yank
# the entire sequence out here.

my $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());

    That's not going to work too well with a feature that has a
Bio::Location::Split location.  Fixing it up seems straightforward, if
a bit hackish.  Something like:

my $seq;

if (ref($self->location()) eq 'Bio::Location::Split') {
    # Concatenate the sequence under each sub-location, in order
    my $seqstring = '';
    my @sublocs = $self->location()->sub_Location();

    foreach my $subloc (@sublocs) {
        $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(),
                                                 $subloc->end())->seq();
    }

    # Assign to the outer $seq (a second "my" here would shadow it)
    $seq = Bio::Seq->new(
                         -id  => $self->{'_gsf_seq'}->display_id(),
                         -seq => $seqstring
                        );
}
else {
    $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());
}

    I don't see any companion to trunc() in Bio::PrimarySeqI for
joining sequences.  A join() would be handy, and make the above
cleaner.
    Comments, suggestions, rotten fruit?
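
To show what such a join would amount to for a wraparound gene, here is a small sketch on plain strings rather than Bio::PrimarySeqI objects. The function name spliced_string and the [start, end] pair layout are made up for this example; coordinates are assumed to be 1-based and inclusive, as in BioPerl locations.

```perl
use strict;
use warnings;

# Hypothetical helper: cut each [start, end] range (1-based, inclusive)
# out of the sequence string and join the pieces in the order given.
sub spliced_string {
    my ($seq, @ranges) = @_;
    return join '',
        map { substr($seq, $_->[0] - 1, $_->[1] - $_->[0] + 1) } @ranges;
}

# Toy circular genome of 12 bases; the "gene" runs from base 10 across
# the origin to base 3.
my $genome = 'AAACCCGGGTTT';
my $gene   = spliced_string($genome, [10, 12], [1, 3]);
print "$gene\n";    # prints TTTAAA
```

Strand handling is left out on purpose; a real join() would also have to reverse-complement minus-strand pieces.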


From torsten.seemann at infotech.monash.edu.au  Tue Jun 12 02:18:27 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 12 Jun 2007 16:18:27 +1000
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
Message-ID: <a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>

Mark,

> if (ref($self->location()) eq 'Bio::Location::Split') {
>     my $seqstring;
>     my @sublocs = $self->location()->sub_Location();
>
>     foreach my $subloc (@sublocs) {
>         $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(),
> $subloc->end())->seq();
>     }

Can you use the ->spliced_seq() method to do this?

http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010


From pengchy at yahoo.com.cn  Tue Jun 12 03:00:46 2007
From: pengchy at yahoo.com.cn (=?gb2312?q?=D1=EE=20=C5=F4=B3=CC?=)
Date: Tue, 12 Jun 2007 15:00:46 +0800 (CST)
Subject: [Bioperl-l] Can't locate loadable object for module
	TFBS::Ext::pwmsearch
Message-ID: <66745.92089.qm@web15205.mail.cnb.yahoo.com>

hi all,
   
  Today I downloaded the TFBS package from http://forkhead.cgb.ki.se/TFBS/, uncompressed it, and copied all the files contained in the TFBS and Ext directories to the directory "C:\perl\site\lib", then put Ext under the TFBS directory. When I ran the example script1.pl, I got this error message:
   
  Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC (@INC contains: C:/perl/site/lib C:/perl/lib .) at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
Compilation failed in require at script1.pl line 3, <DATA> line 206.
BEGIN failed--compilation aborted at script1.pl line 3, <DATA> line 206.
shell returned 2
   
  When I ran the list_matrices.pl script, I got the same message. But when I emptied the pwmsearch.pm file, I got the following message instead:
   
  TFBS/Ext/pwmsearch.pm did not return a true value at :/perl/site/lib/TFBS/Matr
x/PWM.pm line 141, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 11, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137,
DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 17, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52,
DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line2, <DATA> line 206.
Compilation failed in require at script1.pl line 3, <DATA> line 206.
BEGIN failed--compilation aborted at script1.pl line 3, <DATA> line 206.
   
  Has anyone else met the same problem? Is it a bug in the TFBS package?


Best wishes!

Sincerely, Pengcheng
       


From bix at sendu.me.uk  Tue Jun 12 03:32:02 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 12 Jun 2007 08:32:02 +0100
Subject: [Bioperl-l] Can't locate loadable object for
	module	TFBS::Ext::pwmsearch
In-Reply-To: <66745.92089.qm@web15205.mail.cnb.yahoo.com>
References: <66745.92089.qm@web15205.mail.cnb.yahoo.com>
Message-ID: <466E4BF2.7020504@sendu.me.uk>

Pengcheng Yang wrote:
> hi all,
> 
> Today, I download the TFBS package from
> http://forkhead.cgb.ki.se/TFBS/, and uncompress it and copy all the
> files contained in the TFBS and Ext directories to directory
> "C:\perl\site\lib", then put Ext under the TFBS directory. I run the
> example script1.pl, but a wrong message respond:
> 
> Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC

You have to follow the installation instructions in the README file.
Copying the files out is insufficient - you have to 'make'.


From ryanx07 at hotmail.com  Tue Jun 12 07:30:09 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 06:30:09 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu>
Message-ID: <BAY106-F120C708A32F5077BA4DE68B4190@phx.gbl>

Here is the code:

use Bio::Perl;
$seq_object = get_sequence('swiss',"ROA1_HUMAN");
print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";
write_sequence(">roa1.fasta",'fasta',$seq_object);

The output looks the same as with the previous version:

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\~Scripts>perl test.pl

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:153
STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
STACK: test.pl:7
-----------------------------------------------------------

Thanks.


>From: David Messina <dmessina at wustl.edu>
>To: L Xu <ryanx07 at hotmail.com>
>CC: BioPerl list <bioperl-l at lists.open-bio.org>
>Subject: Re: [Bioperl-l] basic questions
>Date: Mon, 11 Jun 2007 13:48:23 -0500
>
>Hi,
>
>Please use 'Reply All' so everyone on the list can follow the  discussion.
>
>Try adding the following line after the line that starts with  $seq_object:
>
>	print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";
>
>And then run the program again. What do you get? Could you post a  complete 
>printout of what you're doing?
>
>
>Dave
>
>
>On Jun 11, 2007, at 11:45 AM, L Xu wrote:
>>I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and  
>>activeperl 5.8.8.819 Thank you very much.
>



From pengchy at yahoo.com.cn  Tue Jun 12 10:33:15 2007
From: pengchy at yahoo.com.cn (Pengcheng Yang)
Date: Tue, 12 Jun 2007 22:33:15 +0800 (CST)
Subject: [Bioperl-l] Reply: Re: basic questions
In-Reply-To: <BAY106-F120C708A32F5077BA4DE68B4190@phx.gbl>
Message-ID: <936780.8655.qm@web15215.mail.cnb.yahoo.com>


I have the same problem.

I guess that the SwissProt database has some problems!

code:
use Bio::DB::SwissProt;
$sp = new Bio::DB::SwissProt;
$seq = $sp->get_Seq_by_id('KPY1_ECOLI'); 
print ref($seq),"\t",$seq->display_id,"\n"

the message:

------------- EXCEPTION  -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK Bio::SeqIO::swiss::next_seq C:/perl/site/lib/Bio\SeqIO\swiss.pm:180
STACK Bio::DB::WebDBSeqI::get_Seq_by_id C:/perl/site/lib/Bio/DB/WebDBSeqI.pm:154
STACK toplevel t.pl:7

--------------------------------------




Best wishes!

Sincerely, Pengcheng




From drummike at gmail.com  Tue Jun 12 11:49:36 2007
From: drummike at gmail.com (Mike Williams)
Date: Tue, 12 Jun 2007 11:49:36 -0400
Subject: [Bioperl-l] Re: [Bioperl-l] Reply: Re: basic questions
In-Reply-To: <936780.8655.qm@web15215.mail.cnb.yahoo.com>
References: <BAY106-F120C708A32F5077BA4DE68B4190@phx.gbl>
	<936780.8655.qm@web15215.mail.cnb.yahoo.com>
Message-ID: <bc95ab8d0706120849qc60ee50qf743f4a7342580e1@mail.gmail.com>

On 6/12/07, Pengcheng Yang <pengchy at yahoo.com.cn> wrote:
> I got the same questions.
> I guess that the swissprote database has some problems!
> code:
> use Bio::DB::SwissProt;
> $sp = new Bio::DB::SwissProt;
> $seq = $sp->get_Seq_by_id('KPY1_ECOLI');
> print ref($seq),"\t",$seq->display_id,"\n"
> ------------- EXCEPTION  -------------
> MSG: swissprot stream with no ID. Not swissprot in my book
> STACK toplevel t.pl:7

This is a different problem.  The id was not valid.  If you change
KPY1 to KPYK1 it works fine.

$seq = $sp->get_Seq_by_id('KPYK1_ECOLI');
print ref($seq),"\t",$seq->display_id,"\n"

[mike at Wheatley]$ ./bio_quest2.pl
Bio::Seq::RichSeq       KPYK1_ECOLI

If you got this example from the BioPerl site, would you please post
the URL?  It seems to me this same problem has come up before, but I
could not find it in the archives or on the web site.

Mike


From ryanx07 at hotmail.com  Tue Jun 12 11:42:28 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 10:42:28 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F277321F382D18F01FE6C77B4190@phx.gbl>

I tested another piece of code from the tutorial (the second test on the
same machine) and got an error again. I don't know what happened; please help.
Thanks so much.

=========================================================== Code:
use Bio::Restriction::EnzymeCollection;
my $all_collection = Bio::Restriction::EnzymeCollection;
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection){
   print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
   # prints name, recognition site, overhang
}
=========================================== Results:

C:\~Scripts>perl t9.pl
Can't use string ("Bio::Restriction::EnzymeCollecti") as a HASH ref while "strict refs" in use at C:/Perl/site/lib/Bio/Restriction/EnzymeCollection.pm line 236.


= = = Original message = = =

On Jun 11, 2007, at 11:45 AM, L Xu wrote:

   I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and
activeperl 5.8.8.819 Thank you very much.



From limericksean at gmail.com  Tue Jun 12 12:04:40 2007
From: limericksean at gmail.com (Sean O'Keeffe)
Date: Tue, 12 Jun 2007 18:04:40 +0200
Subject: [Bioperl-l] gff2xml
Message-ID: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>

Hi all,
I posted this on the gbrowse list earlier. I'm looking to convert gff
data files into xml. Does anyone know of a module written to do this
already?

respect,
sean.


From johnsonm at gmail.com  Tue Jun 12 12:10:45 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 11:10:45 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
Message-ID: <ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>

On 6/12/07, Torsten Seemann <torsten.seemann at infotech.monash.edu.au> wrote:
> Can you use the ->spliced_seq() method to do this?
>
> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11
>
> --
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> --Tel +61 3 9905 9010

    Actually, I'd forgotten about spliced_seq().  That seems like it
will Do The Right Thing.  It's just up to the invoker to call
spliced_seq() instead of seq() as appropriate.
    So, is there any other code that will break if I modify
Bio::SeqFeature::Gene::Exon::location to not throw an exception when
encountering Bio::Location::SplitLocationI?  I'm wondering if it's
just a paranoid check or if it's there to guard against something.  If
the latter, I need to know what code to fix.  I'll dig and look, but
if anybody knows or has an idea, save me some time.  I suppose I can
just change it and see what tests start failing. 8)


From dmessina at wustl.edu  Tue Jun 12 12:11:36 2007
From: dmessina at wustl.edu (David Messina)
Date: Tue, 12 Jun 2007 11:11:36 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F277321F382D18F01FE6C77B4190@phx.gbl>
References: <BAY106-F277321F382D18F01FE6C77B4190@phx.gbl>
Message-ID: <30B8F841-E694-4577-8C15-8703E846CDFE@wustl.edu>

Hmm, it almost looks like you're having an issue with line breaks.

The 'swissprot stream with no ID' error made me think that perhaps  
Perl wasn't seeing the second argument to get_sequence. And then your  
new program has the error 'Can't use string  
("Bio::Restriction::EnzymeCollecti")' where the end of the word is  
cut off.

I don't know how ActivePerl handles Windows vs UNIX line breaks.  Are
there any example scripts that come with ActivePerl? If there are,
and they run correctly, perhaps you could look to see how the line
breaks are done and make sure that your program does it the same way.

Other than that, I'm not seeing an obvious answer to your problem --  
anyone else have a suggestion?

Perhaps the easiest thing for you to do would be to reinstall BioPerl  
and make sure that you run the full test suite and that all of the  
tests pass. My guess is that something in your current setup is not  
quite right.

Dave


From cjfields at uiuc.edu  Tue Jun 12 12:42:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 11:42:29 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
Message-ID: <E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>


On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:

> On 6/12/07, Torsten Seemann  
> <torsten.seemann at infotech.monash.edu.au> wrote:
>> Can you use the ->spliced_seq() method to do this?
>>
>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ 
>> SeqFeatureI.html#POD11
>>
>> --
>> --Torsten Seemann
>> --Victorian Bioinformatics Consortium, Monash University
>> --Tel +61 3 9905 9010
>
>     Actually, I'd forgotten about spliced_seq().  That seems like it
> will Do The Right Thing.  It's just up to the invoker to call
> spliced_seq() instead of seq() as appropriate.
>     So, is there any other code that will break if I modify
> Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
> just a paranoid check or if it's there to guard against something.  If
> the latter, I need to know what code to fix.  I'll dig and look, but
> if anybody knows or has an idea, save me some time.  I suppose I can
> just change it and see what tests start failing. 8)

I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to  
describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs  
state that the Exon class is used to specifically describe exons, as  
the name implies.  Exons are primarily eukaryotic in origin, so you  
shouldn't encounter wraparounds, and should not have split locations  
by definition (which likely explains the exception).

Wouldn't a SeqFeature::Generic work just as well using a split location?

chris


From johnsonm at gmail.com  Tue Jun 12 12:59:54 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 11:59:54 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
Message-ID: <ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>

    That's a good point.  Both Bio::Tools::Glimmer and
Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
prokaryotic sequence (multiple exons for eukaryotic).  There are
eukaryotic and prokaryotic versions of both predictor families.  Maybe
the most elegant solution would be to simply modify both modules to
only emit Bio::SeqFeature::Generic features when operating on
prokaryotic mode output?  Fix the data model and the problem goes
away.  8)



From ryanx07 at hotmail.com  Tue Jun 12 13:17:18 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 12:17:18 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F19A3F4E0FD58F28A6CD765B4190@phx.gbl>

I reinstalled ActivePerl and BioPerl; ActivePerl is now 5.8.8 build 820.
However, both scripts still produce the same errors on my computer. I also
tested the code on another WinXP computer with the same versions of
ActivePerl and BioPerl: the SwissProt script worked there, but the
restriction enzyme script produced the same error.




From cjfields at uiuc.edu  Tue Jun 12 13:51:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 12:51:47 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F19A3F4E0FD58F28A6CD765B4190@phx.gbl>
References: <BAY106-F19A3F4E0FD58F28A6CD765B4190@phx.gbl>
Message-ID: <D01CF97A-FE62-4E40-A3DD-FAFD97D8BA45@uiuc.edu>

This is an instance where 'use strict' would have shown the problem  
right away.  You left off your constructor call:

my $all_collection = Bio::Restriction::EnzymeCollection;

should be

my $all_collection = Bio::Restriction::EnzymeCollection->new;
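
A minimal sketch of why the error message talks about a string being used
as a HASH ref. It assumes a hypothetical class, My::Collection, which is
not part of BioPerl and only stands in for any class whose objects are
blessed hashes. Calling a method on the class name instead of on an
object passes the name itself as the invocant:

```perl
use strict;
use warnings;

# Hypothetical stand-in class (not BioPerl): objects are blessed hashes,
# and methods dereference $self as a hash.
package My::Collection;

sub new {
    my ($class, %args) = @_;
    my $self = { enzymes => $args{enzymes} || [] };
    return bless $self, $class;
}

sub count {
    my ($self) = @_;
    return scalar @{ $self->{enzymes} };   # needs $self to be a hash ref
}

package main;

# Correct: call the constructor first, then call methods on the object.
my $collection = My::Collection->new(enzymes => ['EcoRI', 'BamHI']);
print "object call: ", $collection->count, "\n";

# Mistake: using the class name as if it were an object. The method runs
# with $self set to the string "My::Collection", so the hash dereference
# dies under "strict refs".
my $class_name = 'My::Collection';
my $ok = eval { $class_name->count; 1 };
print "class call failed: $@" unless $ok;
```

The second call is trapped with eval only so that the script can print
the message and carry on; in a real script the fix is simply to call
->new before using the object.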

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From ryanx07 at hotmail.com  Tue Jun 12 14:11:15 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 13:11:15 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F317762269B8D57367D89F3B4190@phx.gbl>

Thank you very much. That got the script a bit further, but now I get the
following error:

C:\~Scripts>perl t9.pl
Can't locate object method "name" via package "Bio::Restriction::EnzymeCollection" at t9.pl line 5, <DATA> line 532.

I checked the documentation; there is no "name" method for the package.
Thanks.




From cjfields at uiuc.edu  Tue Jun 12 14:35:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 13:35:15 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F317762269B8D57367D89F3B4190@phx.gbl>
References: <BAY106-F317762269B8D57367D89F3B4190@phx.gbl>
Message-ID: <287E93E2-1902-4796-971E-B1DCA805D032@uiuc.edu>

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme  
objects, each with its own name().  Using grouped methods like  
'$collection->cutters(6)' will retrieve a new EnzymeCollection  
containing all six-cutters from the original collection.  You should  
use one of the EnzymeCollection accessor methods to retrieve the  
enzyme that you wanted first or iterate through them all.  This works  
for me:

use Bio::Restriction::EnzymeCollection;
my $all_collection = Bio::Restriction::EnzymeCollection->new();
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection->each_enzyme){
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
}
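
The difference between looping over the collection and looping over its
members can be sketched without BioPerl. The classes below, My::Enzyme
and My::EnzymeCollection, are hypothetical stand-ins written for this
illustration only; just the method name each_enzyme is borrowed from the
example above:

```perl
use strict;
use warnings;

# Hypothetical stand-ins for illustration; these are not the BioPerl classes.
package My::Enzyme;
sub new  { my ($class, $name) = @_; return bless { name => $name }, $class }
sub name { return $_[0]{name} }

package My::EnzymeCollection;
sub new {
    my ($class, @names) = @_;
    return bless { enzymes => [ map { My::Enzyme->new($_) } @names ] }, $class;
}
# Returns the list of member objects, in the spirit of each_enzyme().
sub each_enzyme { return @{ $_[0]{enzymes} } }

package main;

my $collection = My::EnzymeCollection->new(qw(EcoRI BamHI HindIII));

# Looping over the collection object itself runs the body exactly once,
# with the loop variable bound to the collection, not to an enzyme.
my @seen;
for my $item ($collection) { push @seen, ref $item }
print "direct loop sees: @seen\n";

# Asking the collection for its members gives one object per enzyme.
my @names;
for my $enz ($collection->each_enzyme) { push @names, $enz->name }
print "each_enzyme sees: @names\n";
```

In the first loop, a call such as $item->name would be looked up on the
collection class, which is what the "Can't locate object method" message
reports.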

chris

On Jun 12, 2007, at 1:11 PM, L Xu wrote:

> Thank you very much. That got the script a bit further, but now I
> get the following error:
>
> C:\~Scripts>perl t9.pl
> Can't locate object method "name" via package "Bio::Restriction::EnzymeCollection" at t9.pl line 5, <DATA> line 532.
>
> I checked the documentation; there is no "name" method for the
> package. Thanks.


From johnsonm at gmail.com  Tue Jun 12 15:07:57 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 14:07:57 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
	<ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
Message-ID: <ebf5eb170706121207p4ad86a6cr9af85e766168cfbe@mail.gmail.com>

I'll wait a day, and if there is no opinion to the contrary, implement
it this way.

On 6/12/07, Mark Johnson <johnsonm at gmail.com> wrote:
>     That's a good point.  Both Bio::Tools::Glimmer and
> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
> prokaryotic sequence (multiple exons for eukaryotic).  There are
> eukaryotic and prokaryotic versions of both predictor families.  Maybe
> the most elegant solution would be to simply modify both modules to
> only emit Bio::SeqFeature::Generic features when operating on
> prokaryotic mode output?  Fix the data model and the problem goes
> away.  8)


From torsten.seemann at infotech.monash.edu.au  Tue Jun 12 20:18:27 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 13 Jun 2007 10:18:27 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
Message-ID: <a79f6a4b0706121718g4b0ca6a4m97f253b2e2b84059@mail.gmail.com>

Sean

> I posted this on the gbrowse list earlier. I'm looking to convert gff
> data files into xml. Does anyone know of a module written to do this
> already?

What DTD do you want the XML to conform to?
e.g. ChadoXML, TinySeq XML, TIGR XML ... ?

What program are you trying to get to load the XML?

BioPerl has some Bio::SeqIO::xxxxx modules for some XML formats that
you could use. There is a script, bp_seqconvert.pl (try
"bp_seqconvert.pl -h"), which comes with BioPerl and may be useful.

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010


From hlapp at gmx.net  Tue Jun 12 20:55:57 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 12 Jun 2007 20:55:57 -0400
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
Message-ID: <0915FAB4-E554-4E65-BA3F-1B916F0F95FC@gmx.net>

I think it was just trying to guard against people trying to do  
stupid things.

I'm actually not sure that representing locations on a circular  
genome using split locations really is the best thing. I'm wondering  
whether one shouldn't rather introduce a CircularLocation object  
(though obviously it isn't the location that's circular...).

Just a thought. In the end, if you have a way to make this work that
you feel comfortable with, then go for it.

	-hilmar


-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Tue Jun 12 20:57:06 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 12 Jun 2007 20:57:06 -0400
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
	<ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
Message-ID: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>

I like that. Don't force a model to do what you want if it doesn't  
really apply anyway.

	-hilmar


-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Jun 12 21:20:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 20:20:41 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
	<ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
	<80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>
Message-ID: <951EB9CA-2066-4CD1-BCD5-4E00232CA507@uiuc.edu>

It will be interesting to see if bioperl handles wrap-around split  
locations via spliced_seq() and other methods.  I can't see why it  
wouldn't but one never knows.  Might be something to add to location  
tests at some point...

chris

On Jun 12, 2007, at 7:57 PM, Hilmar Lapp wrote:

> I like that. Don't force a model to do what you want if it doesn't
> really apply anyway.
>
> 	-hilmar
>
> On Jun 12, 2007, at 12:59 PM, Mark Johnson wrote:
>
>>     That's a good point.  Both Bio::Tools::Glimmer and
>> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
>> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
>> prokaryotic sequence (multiple exons for eukaryotic).  There are
>> eukaryotic and prokaryotic versions of both predictor families.   
>> Maybe
>> the most elegant solution would be to simply modify both modules to
>> only emit Bio::SeqFeature::Generic features when operating on
>> prokaryotic mode output?  Fix the data model and the problem goes
>> away.  8)
>>
>> On 6/12/07, Chris Fields <cjfields at uiuc.edu> wrote:
>>>
>>> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:
>>>
>>>> On 6/12/07, Torsten Seemann
>>>> <torsten.seemann at infotech.monash.edu.au> wrote:
>>>>> Can you use the ->spliced_seq() method to do this?
>>>>>
>>>>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/
>>>>> SeqFeatureI.html#POD11
>>>>>
>>>>> --
>>>>> --Torsten Seemann
>>>>> --Victorian Bioinformatics Consortium, Monash University
>>>>> --Tel +61 3 9905 9010
>>>>
>>>>     Actually, I'd forgotten about spliced_seq().  That seems  
>>>> like it
>>>> will Do The Right Thing.  It's just up to the invoker to call
>>>> spliced_seq() instead of seq() as appropriate.
>>>>     So, is there any other code that will break if I modify
>>>> Bio::SeqFeature::Gene::Exon::location to not throw an exception  
>>>> when
>>>> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
>>>> just a paranoid check or if it's there to guard against
>>>> something.  If
>>>> the latter, I need to know what code to fix.  I'll dig and look,  
>>>> but
>>>> if anybody knows or has an idea, save me some time.  I suppose I  
>>>> can
>>>> just change it and see what tests start failing. 8)
>>>
>>> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to
>>> describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs
>>> state that the Exon class is used to specifically describe exons, as
>>> the name implies.  Exons are primarily eukaryotic in origin, so you
>>> shouldn't encounter wraparounds, and should not have split locations
>>> by definition (which likely explains the exception).
>>>
>>> Wouldn't a SeqFeature::Generic work just as well using a split
>>> location?
>>>
>>> chris
>>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From ryanx07 at hotmail.com  Wed Jun 13 08:16:15 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 07:16:15 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>

Thanks so much, Chris, it works now.
All the code I tested was copied from the BioPerl Tutorial. Why does it have 
such problems: is it a platform issue or a difference between BioPerl 
versions? So far I have tested 6 scripts; three work and three don't.

Here is the problem for the 3rd failed script:
=================================
use strict;
use Bio::Tools::Run::RemoteBlast;
my $remote_blast = Bio::Tools::Run::RemoteBlast->new (
         -prog => 'blastn', -data => 'ecoli', -expect => '1e-10' );
my $r = $remote_blast->submit_blast("d1.fa");
my $rc;
while ( my @rids = $remote_blast->each_rid ) {
    for my $rid ( @rids ) {
       $rc = $remote_blast->retrieve_blast($rid);
    }
}
print "$rc\n"; #I just want to print sth here before parsing the result
=========================================================d1.fa
>example
CCCTTCAGGTACCCCGAGGTAACACGAGACACTCGGGATCTGGGAAGGGGACTGGGGCTTCTTTAAAAGCGCTCAGTTTAAAAAGCTTCTATGCCTGAATAGGTGACCGGAGGCCGGCACC
=========================================================result
C:\>perl t13.pl

-------------------- WARNING ---------------------
MSG: <HTML>
<HEAD><TITLE>An Error Occurred</TITLE></HEAD>
<BODY>
<H1>An Error Occurred</H1>
500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
</BODY>
</HTML>

---------------------------------------------------

-------------------- WARNING ---------------------
MSG: <HTML>
<HEAD><TITLE>An Error Occurred</TITLE></HEAD>
<BODY>
<H1>An Error Occurred</H1>
500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
</BODY>
</HTML>

---------------------------------------------------
Terminating on signal SIGINT(2)

C:\>


Please help me to correct the problem, thanks.


= = = Original message = = =

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme objects, 
each with its own name().  Using grouped methods like 
'$collection->cutters(6)' will retrieve a new EnzymeCollection containing 
all six-cutters from the original collection.  You should use one of the 
EnzymeCollection accessor methods to retrieve the enzyme that you wanted 
first or iterate through them all.  This works for me:

use Bio::Restriction::EnzymeCollection;
my $all_collection = Bio::Restriction::EnzymeCollection->new();
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection->each_enzyme) {
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
}


chris

On Jun 12, 2007, at 1:11 PM, L Xu wrote:


Thank you very much, that got the script a bit further, but I got the 
following error:

C:\~Scripts>perl t9.pl
Can't locate object method "name" via package "Bio::Restriction::EnzymeCollection" at t9.pl line 5, <DATA> line 532.

I checked the documentation; there is no "name" method for the package. 
Thanks.



From cjfields at uiuc.edu  Wed Jun 13 10:41:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 09:41:55 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
References: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
Message-ID: <4F7BE556-BD8C-4378-BDE7-1F31364F49DA@uiuc.edu>

Judging by the output it looks like you have no network access or  
can't connect to the server (what remoteblast needs).  Make sure you  
don't need proxy settings.

To preempt the next question, no, I'm not going to explain what a  
proxy is.  The RemoteBlast docs show how to set them, and Google is a  
wonderful tool...

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:

> ...
> -------------------- WARNING ---------------------
> MSG: <HTML>
> <HEAD><TITLE>An Error Occurred</TITLE></HEAD>
> <BODY>
> <H1>An Error Occurred</H1>
> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
> </BODY>
> </HTML>
>
> ---------------------------------------------------
> ...


From ryanx07 at hotmail.com  Wed Jun 13 11:01:07 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 10:01:07 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID: <BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>

I do have an internet connection but do not use a proxy server.
I tested the network connection with the ping command (below). The NCBI website 
does not respond. Is there any special network setting needed for 
connecting to the NCBI website?
Thank you so much.

C:\>ping www.yahoo.com

Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:

Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
Reply from 69.147.114.210: bytes=32 time=360ms TTL=45

Ping statistics for 69.147.114.210:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 312ms, Maximum = 363ms, Average = 338ms

C:\>ping www.ncbi.nlm.nih.gov

Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:

Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 130.14.29.110:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
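
Note that ping uses ICMP, which some servers and firewalls drop, so a
timed-out ping does not by itself show that HTTP on port 80 is blocked.
A minimal sketch of a more direct check follows, using only the core
IO::Socket::INET module. The sub name can_connect and the script name
check_port.pl are made up for illustration; this is a generic TCP test and
not part of BioPerl or RemoteBlast.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: return 1 if a TCP connection to $host:$port can be
# opened within $timeout seconds, otherwise 0.
sub can_connect {
    my ($host, $port, $timeout) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => $timeout,
    );
    return 0 unless $sock;
    close $sock;
    return 1;
}

# Usage: perl check_port.pl www.ncbi.nlm.nih.gov 80
# Nothing is contacted unless a host is given on the command line.
if (@ARGV) {
    my ($host, $port) = @ARGV;
    $port ||= 80;
    print can_connect($host, $port, 10)
        ? "TCP connection to $host:$port succeeded\n"
        : "TCP connection to $host:$port failed\n";
}
```

If this reports success while the BioPerl script still fails, the problem
is more likely in proxy or HTTP settings than in basic connectivity.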


= = = Original message = = =

Judging by the output it looks like you have no network access or can't 
connect to the server (what remoteblast needs).  Make sure you don't need 
proxy settings.

To preempt the next question, no, I'm not going to explain what a proxy 
is.  The RemoteBlast docs show how to set them, and Google is a wonderful 
tool...

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:


   ...
-------------------- WARNING ---------------------
MSG: <HTML>
<HEAD><TITLE>An Error Occurred</TITLE></HEAD>
<BODY>
<H1>An Error Occurred</H1>
500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
</BODY>
</HTML>

---------------------------------------------------
...



From cjfields at uiuc.edu  Wed Jun 13 12:14:22 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 11:14:22 -0500
Subject: [Bioperl-l] method naming
Message-ID: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>

Some quick questions on method naming.  I couldn't find this on the  
mail list previously and just want some opinions.

1) Is there any preference on how to name a method that returns a  
list of class instances vs. data?  I have seen 'each' (each_Location,  
each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.  
simple (hits, hsps).

2) Do we want methods that return objects to have the object name  
in Title Case (each_Location, get_Seq_by_id, etc.), or does it really  
matter?

chris


From dmessina at wustl.edu  Wed Jun 13 12:41:53 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 13 Jun 2007 11:41:53 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>

> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data?  I have seen 'each' (each_Location,
> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
> simple (hits, hsps).

I'd prefer 'get_all' because it's more intuitive to me what the  
method is doing. 'Each' is too programmer-y.


> 2) Do we want have methods which return objects have the object name
> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
> matter?

I like Title Case because it reinforces the notion that what you're  
getting back is a specific object with that name (Seq) rather than  
the generic thing that the name represents (AGTCTGTGATAT, the actual  
sequence as a string).


Dave


From hlapp at gmx.net  Wed Jun 13 13:03:59 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 13:03:59 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>

We set a convention a while back on how to name these. It is  
implemented in the bioperl.lisp file (too bad no one is using emacs  
any more these days - it's a great editor), and in fact we started a  
renaming campaign (not sure when that was) on the SeqI and  
SeqFeatureI classes (you'll still see the old names aliased).

However, we never got to finish the clean up.

The convention was to use get_{ClassName}s, and get_all_{ClassName}s  
if there is a difference from the former (mostly because of  
hierarchical data; for example, features can be nested, and  
get_all_SeqFeatures returns them all flattened out, while  
get_SeqFeatures returns only the top-level objects), and for modifying,  
add_{ClassName} and remove_{ClassName}s.

The class name was to be in title case to emphasize the fact that it  
is an array of objects you'd be getting back (and what kind of  
objects). If it is strings or any other scalar type, the name would  
be in lower case.
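
As a rough illustration of the convention described above, here is a
minimal sketch in plain Perl. My::FeatureHolder is a made-up class for this
example only; it is not a BioPerl module and does not reproduce the real
Bio::Seq or Bio::SeqFeatureI implementations.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical container class, for illustration only (not a BioPerl module).
package My::FeatureHolder;

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{-name}, features => [] }, $class;
}

# Scalar data (a string): lower-case method name.
sub name { return $_[0]->{name} }

# Modifiers: add_{ClassName} and remove_{ClassName}s.
sub add_SeqFeature {
    my ($self, $feat) = @_;
    push @{ $self->{features} }, $feat;
    return;
}

sub remove_SeqFeatures {
    my ($self) = @_;
    my @removed = @{ $self->{features} };
    $self->{features} = [];
    return @removed;
}

# get_{ClassName}s: only the top-level objects.
sub get_SeqFeatures { return @{ $_[0]->{features} } }

# get_all_{ClassName}s: nested objects too, flattened into one list.
sub get_all_SeqFeatures {
    my ($self) = @_;
    return map { ($_, $_->get_all_SeqFeatures) } $self->get_SeqFeatures;
}

package main;

my $top  = My::FeatureHolder->new(-name => 'top');
my $gene = My::FeatureHolder->new(-name => 'gene');
my $exon = My::FeatureHolder->new(-name => 'exon');
$gene->add_SeqFeature($exon);
$top->add_SeqFeature($gene);

print join(',', map { $_->name } $top->get_SeqFeatures), "\n";      # gene
print join(',', map { $_->name } $top->get_all_SeqFeatures), "\n";  # gene,exon
```

The Title Case part of each name (SeqFeature) signals that objects come
back, while name() stays lower case because it returns a plain string.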

	-hilmar

On Jun 13, 2007, at 12:14 PM, Chris Fields wrote:

> Some quick questions on method naming.  I couldn't find this on the
> mail list previously and just want some opinions.
>
> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data?  I have seen 'each' (each_Location,
> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
> simple (hits, hsps).
>
> 2) Do we want have methods which return objects have the object name
> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
> matter?
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 13 13:19:43 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 12:19:43 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
Message-ID: <B7E2E5CA-3027-4D25-B9EA-998D2BC59DBB@uiuc.edu>

Sounds good.  I also agree with Dave on the use of 'each', as it's a  
bit ambiguous (it seems to imply iteration as opposed to returning a  
whole list).

We probably need to post this somewhere on the wiki for future  
reference; maybe in Advanced BioPerl?  I'll add this in shortly.

chris

On Jun 13, 2007, at 12:03 PM, Hilmar Lapp wrote:

> We set a convention a while back on how to name these. It is  
> implemented in the bioperl.lisp file (too bad no one is using emacs  
> any more these days - it's a great editor), and in fact we started  
> a renaming campaign (not sure when that was) on the SeqI and  
> SeqFeatureI classes (you'll still see the old names aliased).
>
> However, we never got to finish the clean up.
>
> The convention was to use get_{ClassName}s, and get_all_{ClassName} 
> s if there is a difference to the former (mostly because of  
> hierarchical data; for example features can be nested, and  
> get_all_SeqFeatures returns them all flattened out, while  
> get_SeqFeatures returns only the top objects), and for modifying  
> add_{ClassName} and remove_{ClassName}s.
>
> The class name was to be in title case to emphasize the fact that  
> it is an array of object you'd be getting back (and what kind of  
> objects). If it is strings or any other scalar type, the name would  
> be in lower case.
>
> 	-hilmar
>
> On Jun 13, 2007, at 12:14 PM, Chris Fields wrote:
>
>> Some quick questions on method naming.  I couldn't find this on the
>> mail list previously and just want some opinions.
>>
>> 1) Is there any preference on how to name a method that returns a
>> list of class instances vs. data?  I have seen 'each' (each_Location,
>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
>> simple (hits, hsps).
>>
>> 2) Do we want have methods which return objects have the object name
>> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
>> matter?
>>
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Wed Jun 13 14:43:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 13:43:41 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <467036FC.8000505@watson.wustl.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
	<467036FC.8000505@watson.wustl.edu>
Message-ID: <286EE81C-0926-4AAE-9110-02948DFADF36@uiuc.edu>


On Jun 13, 2007, at 1:27 PM, Michael Kiwala wrote:

>
> David Messina wrote:
>>> 1) Is there any preference on how to name a method that returns a
>>> list of class instances vs. data?  I have seen  
>>> 'each' (each_Location,
>>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures)  
>>> vs.
>>> simple (hits, hsps).
>>>
>>
>> I'd prefer 'get_all' because it's more intuitive to me what the   
>> method is doing. 'Each' is too programmer-y.
>>
>>
>>
> When I think 'get_all', I think of a method that returns a list of  
> objects at once. When I think of 'each', I think of a method that  
> returns a scalar but can be called multiple times to iterate over a  
> set of objects.

Yep, hence the ambiguity issue (and my confusion).  I think it was so  
you could both iterate and return a list using this:

for my $obj ($seq->each_Class) {...}
my @objs = $seq->each_Class;

I use 'next' and 'get/get_all' as an iterator and get accessor  
(similar to how it's used in Bio::SearchIO):

while (my $obj = $seq->next_Class) {...}
my @objs = $seq->get_Class; # or get_all_Class for flattened lists

which to me is much clearer.
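
To make the distinction concrete, here is a minimal sketch of a class that
offers both styles. My::Result, next_Hit, get_Hits and rewind are made-up
names for this example; they only mimic the next/get pattern described
above and are not the Bio::SearchIO implementation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical class, for illustration only (not part of BioPerl).
package My::Result;

sub new {
    my ($class, @hits) = @_;
    return bless { hits => [@hits], pos => 0 }, $class;
}

# List accessor: returns everything at once.
sub get_Hits { return @{ $_[0]->{hits} } }

# Iterator: one item per call, undef once the list is exhausted.
sub next_Hit {
    my ($self) = @_;
    return if $self->{pos} >= @{ $self->{hits} };
    return $self->{hits}[ $self->{pos}++ ];
}

# Start iterating from the beginning again.
sub rewind { $_[0]->{pos} = 0; return }

package main;

# Plain strings stand in for hit objects here.
my $result = My::Result->new(qw(hitA hitB hitC));

while (defined(my $hit = $result->next_Hit)) {
    print "next: $hit\n";
}

my @hits = $result->get_Hits;
print "all: @hits\n";
```

The iterator keeps state between calls, which is why a separate name
(next_) is less ambiguous than an 'each_' method that returns a full list.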

chris


From mkiwala at watson.wustl.edu  Wed Jun 13 14:27:08 2007
From: mkiwala at watson.wustl.edu (Michael Kiwala)
Date: Wed, 13 Jun 2007 13:27:08 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
Message-ID: <467036FC.8000505@watson.wustl.edu>


David Messina wrote:
>> 1) Is there any preference on how to name a method that returns a
>> list of class instances vs. data?  I have seen 'each' (each_Location,
>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
>> simple (hits, hsps).
>>     
>
> I'd prefer 'get_all' because it's more intuitive to me what the  
> method is doing. 'Each' is too programmer-y.
>
>
>   
When I think 'get_all', I think of a method that returns a list of 
objects at once. When I think of 'each', I think of a method that 
returns a scalar but can be called multiple times to iterate over a set 
of objects.


From sac at bioperl.org  Wed Jun 13 17:17:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Wed, 13 Jun 2007 14:17:27 -0700
Subject: [Bioperl-l] method naming
In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
Message-ID: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>

On 6/13/07, Hilmar Lapp <hlapp at gmx.net> wrote:
> We set a convention a while back on how to name these. It is
> implemented in the bioperl.lisp file (too bad no one is using emacs
> any more these days - it's a great editor),

As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we
could improve the visibility of bioperl.lisp. In truth, I had
forgotten about it, though it turns out I was loading an old version
of it. (Btw, using the latest version of bioperl.lisp with xemacs
21.4.17, I don't get a bioperl menu item, though I can access bioperl
functions via M-x. Suggestions?)

I see bioperl.lisp is mentioned twice parenthetically in the advanced
bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here
would help. While we're at it, maybe we could add a bioperl.vi file to
the distribution (if you can do such things with vi/vim).

On 6/13/07, Chris Fields <cjfields at uiuc.edu> wrote:
> We probably need to post this somewhere on the wiki for future
> reference; maybe in Advanced BioPerl?  I'll add this in shortly.

Another idea: Add a method naming check to the set of audits we
perform on CVS committed code. It could check for agreement with our
conventions and warn if nothing was found (may not be a problem
though).
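
A very rough sketch of what such a check might look like, assuming it only
needs to scan source text for sub names. The sub audit_method_names and
its two rules ('each_' prefixes, and names starting with a capital letter)
are assumptions drawn from points raised in this thread; this is not an
existing BioPerl audit script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical audit: return one warning string per sub name in $source
# that conflicts with the naming points raised in this thread.
sub audit_method_names {
    my ($source) = @_;
    my @warnings;
    while ($source =~ /^\s*sub\s+(\w+)/mg) {
        my $name = $1;
        if ($name =~ /^each_/) {
            push @warnings,
                "$name: 'each_' prefix; consider get_/get_all_ or next_";
        }
        elsif ($name =~ /^[A-Z]/) {
            push @warnings,
                "$name: method name starts with a capital letter";
        }
    }
    return @warnings;
}

# Small made-up sample to run the audit against.
my $example = <<'END';
sub each_Location { }
sub get_all_SeqFeatures { }
sub Species { }
sub location { }
END

print "$_\n" for audit_method_names($example);
```

A real audit would probably need a list of exceptions, since old names are
still aliased in several classes.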

Steve


From arareko at campus.iztacala.unam.mx  Wed Jun 13 18:03:34 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 13 Jun 2007 17:03:34 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
Message-ID: <467069B6.7080003@campus.iztacala.unam.mx>

Around the time of the 1.5.2 release, I jumped on the idea of creating a 
BioPerl template for Komodo. Chris F handed me one he had already made, 
but in the end I didn't have enough spare time to get into it. If someone 
wants to give it a try, please let Chris F or me know.

Regards,
Mauricio.

Steve Chervitz wrote:
> On 6/13/07, Hilmar Lapp <hlapp at gmx.net> wrote:
>> We set a convention a while back on how to name these. It is
>> implemented in the bioperl.lisp file (too bad no one is using emacs
>> any more these days - it's a great editor),
> 
> As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we
> could improve the visibility of bioperl.lisp. In truth, I had
> forgotten about it, though lit turns out I was loading an old version
> of it. (Btw, using the latest version of bioperl.lisp with xemacs
> 21.4.17, I don't get a bioperl menu item, though I can access bioperl
> functions via M-x. Suggestions?)
> 
> I see bioperl.lisp is mentioned twice parenthetically in the advanced
> bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here
> would help. While we're at it, maybe we could add a bioperl.vi file to
> the distribution (if you can do such things with vi/vim).
> 
> On 6/13/07, Chris Fields <cjfields at uiuc.edu> wrote:
>> We probably need to post this somewhere on the wiki for future
>> reference; maybe in Advanced BioPerl?  I'll add this in shortly.
> 
> Another idea: Add a method naming check to the set of audits we
> perform on CVS committed code. It could check for agreement with our
> conventions and warn if nothing was found (may not be a problem
> though).
> 
> Steve
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From hlapp at gmx.net  Wed Jun 13 18:41:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 18:41:45 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
Message-ID: <FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>


On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:

> using the latest version of bioperl.lisp with xemacs 21.4.17, I  
> don't get a bioperl menu item

I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item  
is showing up just beautifully. (BTW it also has very nice icons for  
various functions - though I always feel guilty for using keystrokes  
instead.)

Is GNU Emacs finally winning this? ;)

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jason at bioperl.org  Wed Jun 13 18:58:51 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 13 Jun 2007 15:58:51 -0700
Subject: [Bioperl-l] method naming
In-Reply-To: <FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
Message-ID: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>

Post your dueling screenshots to the wiki!

I had started a couple of IDE pages on the wiki a while ago:
  http://bioperl.org/wiki/Emacs
  http://bioperl.org/wiki/Emacs_template
  http://bioperl.org/wiki/Vi

If anyone is feeling excited enough to write a few more IDE pages and  
link them into a common article that would be great.

-jason
On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote:

>
> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:
>
>> using the latest version of bioperl.lisp with xemacs 21.4.17, I
>> don't get a bioperl menu item
>
> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item
> it showing up just beautifully. (BTW it also have very nice icons for
> various functions - though I always feel guilty for using keystrokes
> instead.)
>
> Is GNU Emacs finally winning this? ;)
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From cjfields at uiuc.edu  Wed Jun 13 19:08:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 18:08:17 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
	<4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
Message-ID: <E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>

Would probably be worth writing one up for Komodo since Mauricio,  
Sendu, and I use it.

I updated the Advanced BioPerl page with Hilmar's method naming  
suggestions/rules (as well as a few I found dating back a number of years  
on the mail list).  It might be worth a glance in case there are any  
changes needed:

http://www.bioperl.org/wiki/Advanced_BioPerl

chris

On Jun 13, 2007, at 5:58 PM, Jason Stajich wrote:

> Post your dualing screenshots to the wiki!
>
> I had started a couple of IDE pages on the wiki a while ago:
>  http://bioperl.org/wiki/Emacs
>  http://bioperl.org/wiki/Emacs_template
>  http://bioperl.org/wiki/Vi
>
> If anyone is feeling excited enough to write a few more IDE pages  
> and link them into a common article that would be great.
>
> -jason
> On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote:
>
>>
>> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:
>>
>>> using the latest version of bioperl.lisp with xemacs 21.4.17, I
>>> don't get a bioperl menu item
>>
>> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item
>> it showing up just beautifully. (BTW it also have very nice icons for
>> various functions - though I always feel guilty for using keystrokes
>> instead.)
>>
>> Is GNU Emacs finally winning this? ;)
>>
>> 	-hilmar
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Wed Jun 13 19:28:17 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 19:28:17 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
	<4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
	<E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>
Message-ID: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net>

Thanks Chris for doing this - looks great. The only comment that I  
have is that method names should never start with a capital letter.  
If the getter/setter is for a single object (as opposed to a list),  
the name should probably be similar (if not identical) to the class  
being expected and returned, but lower-case.

E.g., $feature->location(), $seq->species() etc

	-hilmar

On Jun 13, 2007, at 7:08 PM, Chris Fields wrote:

> Would probably be worth writing one up for Komodo since Mauricio,  
> Sendu, and I use it.
>
> I updated the Advanced BioPerl page with Hilmar's methods  
> suggestions/rules (as well as a few I found dating back a number of  
> years on the mail list).  It might be worth a glance in case there  
> are any changes needed:
>
> http://www.bioperl.org/wiki/Advanced_BioPerl
>
> chris
>
> On Jun 13, 2007, at 5:58 PM, Jason Stajich wrote:
>
>> Post your dueling screenshots to the wiki!
>>
>> I had started a couple of IDE pages on the wiki a while ago:
>>  http://bioperl.org/wiki/Emacs
>>  http://bioperl.org/wiki/Emacs_template
>>  http://bioperl.org/wiki/Vi
>>
>> If anyone is feeling excited enough to write a few more IDE pages  
>> and link them into a common article that would be great.
>>
>> -jason
>> On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote:
>>
>>>
>>> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:
>>>
>>>> using the latest version of bioperl.lisp with xemacs 21.4.17, I
>>>> don't get a bioperl menu item
>>>
>>> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu  
>>> item
>>> is showing up just beautifully. (BTW it also has very nice icons  
>>> for
>>> various functions - though I always feel guilty for using keystrokes
>>> instead.)
>>>
>>> Is GNU Emacs finally winning this? ;)
>>>
>>> 	-hilmar
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> Jason Stajich
>> jason at bioperl.org
>> http://jason.open-bio.org/
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 13 19:44:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 18:44:08 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
	<4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
	<E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>
	<06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net>
Message-ID: <91AF2018-EC27-49FD-A4D1-C31C0E73DEFB@uiuc.edu>

Agreed.  We can definitely add that in.

As we edge towards another release we try another round of cleaning  
up.  I wouldn't mind pushing out another 1.5 point release before  
summer's up if possible; most of the tough work was done for v.1.5.2  
by Sendu.

chris

On Jun 13, 2007, at 6:28 PM, Hilmar Lapp wrote:

> Thanks Chris for doing this - looks great. The only comment that I
> have is that method names should never start with a capital letter.
> If the getter/setter is for a single object (as opposed to a list),
> the name should probably be similar (if not identical) to the class
> being expected and returned, but lower-case.
>
> E.g., $feature->location(), $seq->species() etc
>
> 	-hilmar
>
> On Jun 13, 2007, at 7:08 PM, Chris Fields wrote:
>
>> Would probably be worth writing one up for Komodo since Mauricio,
>> Sendu, and I use it.
>>
>> I updated the Advanced BioPerl page with Hilmar's methods
>> suggestions/rules (as well as a few I found dating back a number of
>> years on the mail list).  It might be worth a glance in case there
>> are any changes needed:
>>
>> http://www.bioperl.org/wiki/Advanced_BioPerl
>>
>> chris
...


From johncumbers at gmail.com  Wed Jun 13 20:20:42 2007
From: johncumbers at gmail.com (John Cumbers)
Date: Wed, 13 Jun 2007 20:20:42 -0400
Subject: [Bioperl-l] How can I pull out all instances of a motif from a
	genome sequence and output them as a BED file?
Message-ID: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>

Hello,

I have a simple problem, I'm trying to search a genome sequence for a motif,
I then want to output a BED file to display all the locations of this motif
on the UCSC Genome Browser.  I could not find a script to do this, so I
started to write my own.   I'm new to perl and my code below was my attempt
to read the sequence string and output the index bp of the start of each
motif.  With this I could build the BED file myself, which requires start
and finish base pairs.

For the first motif I can output the start index, but when I try and read
the next one off the sequence it does not work.  Instead I just get an
output of a list of 1's.  I realise that this is more a request for some
simple perl help, but any help much appreciated.

Best wishes,
John


# turn my FASTA file into a seq object
$seq_object = read_sequence("Drosophila.Chr3.test.AE014296.fasta");
$sequence_as_a_string = $seq_object->seq();  #turn it into a string
# search $sequence_as_a_string  string for motif AAA as example
# if found, return the index that it is found at

while ($sequence_as_a_string =~ m/AAA/g) {
  print "Found '$&'.  Next attempt at character " .
pos($sequence_as_a_string)+1 . "\n";
}


-- 
John Cumbers,  Graduate Student
Biology and Medicine
Brown University, Box G-W
Providence, Rhode Island, 02912, USA
Tel USA: +1 401 523 8190,  Fax: +1 401 863-2166
UK to USA: 0207 617 7824


From cjfields at uiuc.edu  Wed Jun 13 21:58:37 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 20:58:37 -0500
Subject: [Bioperl-l] How can I pull out all instances of a motif from a
	genome sequence and output them as a BED file?
In-Reply-To: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>
References: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>
Message-ID: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>

This is answered in the FAQ (sorry if the URL wraps, but we don't  
like tinyurls):

http://www.bioperl.org/wiki/ 
FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_. 
22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F

chris
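
For what it is worth, the "list of 1's" in the question looks like an
operator-precedence effect rather than a regex problem. In Perl, "."
and "+" have the same precedence and associate left to right, so the
print statement first concatenates the message with pos(), then adds 1
to that string (which is numerically 0), and prints just 1.
Parenthesising the sum, as in (pos($sequence_as_a_string) + 1), avoids
that. A minimal sketch that also emits BED-style lines is given below.
It assumes a toy sequence in place of $seq_object->seq(), a placeholder
chromosome name "chr3L", and a hypothetical helper motif_to_bed(); it
reports non-overlapping matches only.

```perl
use strict;
use warnings;

# Return BED lines (chrom, 0-based start, exclusive end) for every
# non-overlapping match of $motif in $sequence.
sub motif_to_bed {
    my ( $chrom, $sequence, $motif ) = @_;
    my @lines;
    while ( $sequence =~ /$motif/g ) {
        # $-[0] is the 0-based offset where the match starts and
        # $+[0] is the offset just past its end, which matches the
        # chromStart/chromEnd convention of BED.
        push @lines, join( "\t", $chrom, $-[0], $+[0] );
    }
    return @lines;
}

# Toy sequence standing in for $seq_object->seq()
my $sequence_as_a_string = "CCAAAGGAAAT";
print "$_\n" for motif_to_bed( "chr3L", $sequence_as_a_string, "AAA" );
```

Using the @- and @+ arrays avoids computing the start from pos() and
the motif length, which would go wrong for motifs of variable length.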

On Jun 13, 2007, at 7:20 PM, John Cumbers wrote:

> Hello,
>
> I have a simple problem, I'm trying to search a genome sequence for  
> a motif,
> I then want to output a BED file to display all the locations of  
> this motif
> on the UCSC Genome Browser.  I could not find a script to do this,  
> so I
> started to write my own.   I'm new to perl and my code below was my  
> attempt
> to read the sequence string and output the index bp of the start of  
> each
> motif.  With this I could build the BED file myself, which requires  
> start
> and finish base pairs.
>
> For the first motif I can output the start index, but when I try  
> and read
> the next one off the sequence it does not work.  Instead I just get an
> output of a list of 1's.  I realise that this is more a request for  
> some
> simple perl help, but any help much appreciated.
>
> Best wishes,
> John
>
>
> $seq_object = read_sequence 
> ("Drosophila.Chr3.test.AE014296.fasta");  #turn
> my FASTA file into a seq object.
> $sequence_as_a_string = $seq_object->seq();  #turn it into a string
> # search $sequence_as_a_string  string for motif AAA as example
> # if found, return the index that it is found at
>
> while ($sequence_as_a_string =~ m/AAA/g) {
>   print "Found '$&'.  Next attempt at character " .
> pos($sequence_as_a_string)+1 . "\n";
> }
>
>
>
> -- 
> John Cumbers,  Graduate Student
> Biology and Medicine
> Brown University, Box G-W
> Providence, Rhode Island, 02912, USA
> Tel USA: +1 401 523 8190,  Fax: +1 401 863-2166
> UK to USA: 0207 617 7824

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Thu Jun 14 00:08:04 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 13 Jun 2007 21:08:04 -0700
Subject: [Bioperl-l] wiki bulk update
Message-ID: <992B2C7A-E944-4C69-BDE0-B0B0F6D1274D@bioperl.org>

I did a bulk update of Module pages for new modules that had  
been created since we last set up these pages.
I outlined a little bit of what it requires behind the scenes:

http://bioperl.org/wiki/BioPerl:Module_pages

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From bix at sendu.me.uk  Thu Jun 14 05:35:00 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 10:35:00 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
Message-ID: <46710BC4.3060302@sendu.me.uk>

It is preferable to have ->new syntax over new Object syntax, as 
outlined here: 
http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules

I propose making this syntax change in all Bioperl POD documentation, so 
that the bad syntax is no longer suggested/encouraged. Any objections? 
If not, I'll go ahead and commit the changes.

(affects 907 modules in live)


Cheers,
Sendu.
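
The two call styles under discussion can be compared in a small sketch.
My::Object is a hypothetical stand-in class (any class with a new()
constructor is called the same way), and the -name argument is only an
illustrative BioPerl-style named parameter.

```perl
use strict;
use warnings;

package My::Object;    # hypothetical class, for illustration only

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

sub name { return $_[0]->{'-name'} }

package main;

# Discouraged indirect object syntax, which Perl has to guess how to parse:
#   my $obj = new My::Object(-name => 'example');

# Preferred explicit class-method call:
my $obj = My::Object->new( -name => 'example' );
print ref($obj), " ", $obj->name, "\n";
```

Both forms end up calling the same constructor; the arrow form just
leaves no room for the parser to mistake "new" for an ordinary function.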


From bix at sendu.me.uk  Thu Jun 14 06:01:02 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 11:01:02 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46710BC4.3060302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
Message-ID: <467111DE.6060800@sendu.me.uk>

Sendu Bala wrote:
> It is preferable to have ->new syntax over new Object syntax, as 
> outlined here: 
> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules 
> 
> 
> I propose making this syntax change in all Bioperl POD documentation,

Actually, I propose making the change to code as well.


From hlapp at gmx.net  Thu Jun 14 08:47:47 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 14 Jun 2007 08:47:47 -0400
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <467111DE.6060800@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk>
Message-ID: <0D7CD74F-DCB3-44F8-9AC7-144B1BD58946@gmx.net>

Sounds fine to me. People do go by working examples, and I've seen  
inconsistent examples leading to confusion on the end of newbies.

	-hilmar

On Jun 14, 2007, at 6:01 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- 
>> oriented_programming_and_modules
>>
>>
>> I propose making this syntax change in all Bioperl POD documentation,
>
> Actually, I propose making the change to code as well.
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Jun 14 08:55:18 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 07:55:18 -0500
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <467111DE.6060800@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk>
Message-ID: <EC0DB8AB-F7C8-423B-9566-34B3FD24B3EC@uiuc.edu>

Sounds fine by me.  I may actually start tackling some of the feature/ 
annotation overloading stuff myself to see what happens (I'll drop a  
notice when that occurs).

chris

On Jun 14, 2007, at 5:01 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- 
>> oriented_programming_and_modules
>>
>>
>> I propose making this syntax change in all Bioperl POD documentation,
>
> Actually, I propose making the change to code as well.


From tanzeem.mb at gmail.com  Thu Jun 14 02:27:19 2007
From: tanzeem.mb at gmail.com (tanzeem)
Date: Wed, 13 Jun 2007 23:27:19 -0700 (PDT)
Subject: [Bioperl-l] Problem working with remoteblast submit method in
	webbrowser.
Message-ID: <11114623.post@talk.nabble.com>


I have a program that uses the BioPerl RemoteBlast module to compare an
amino acid FASTA file against the Swiss-Prot database. The submit_blast()
method works when run from the command line, but when the program is run
from a web browser it returns -1. I was trying to adapt the code from the
RemoteBlast synopsis for my needs.
-- 
View this message in context: http://www.nabble.com/Problem-working-with-remoteblast-submit-method-in-webbrowser.-tf3919886.html#a11114623
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From bix at sendu.me.uk  Thu Jun 14 11:34:27 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 16:34:27 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46710BC4.3060302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
Message-ID: <46716003.2030302@sendu.me.uk>

Sendu Bala wrote:
> It is preferable to have ->new syntax over new Object syntax, as 
> outlined here: 
> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
> 
> I propose making this syntax change in all Bioperl POD documentation, so 
> that the bad syntax is no longer suggested/encouraged. Any objections? 
> If not, I'll go ahead and commit the changes.
> 
> (affects 907 modules in live)

It was actually 515 modules & test scripts from live, 48 from run, 21
from db and 2 from network.

Now committed. Before and after my changes these were failing:


Failed Test     Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/BioGraphics.t    3   768    38    3  3-5
t/PodSyntax.t      9  2304  2195    9  378 614 660 1023 1197 1512 1558
                                        1932 2106
t/Sopma.t          2   512    16    2  8 15
t/genbank.t        2   512   247    2  122-123


BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
(unintentional?).

Sopma may not be a bug: results from server might have changed.

genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163
-> 1.164 not doing what the new tests expect.

PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
you working on that, or can I fix those errors?

Anyone care to look into those things?

Cheers,
Sendu.


From cjfields at uiuc.edu  Thu Jun 14 12:35:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 11:35:21 -0500
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <AAFC1021-9E3A-4C31-A9B8-4B0046F907A1@uiuc.edu>

The genbank commit was mine so I'll look into it; may be that I  
hadn't finished up the bug work.  If I have time I'll look into  
Sopma as well (unless you get to it first).

chris

On Jun 14, 2007, at 10:34 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- 
>> oriented_programming_and_modules
>>
>> I propose making this syntax change in all Bioperl POD  
>> documentation, so
>> that the bad syntax is no longer suggested/encouraged. Any  
>> objections?
>> If not, I'll go ahead and commit the changes.
>>
>> (affects 907 modules in live)
>
> It was actually 515 modules & test scripts from live, 48 from run, 21
> from db and 2 from network.
>
> Now committed. Before and after my changes these were failing:
>
>
> Failed Test     Stat Wstat Total Fail  List of Failed
> ---------------------------------------------------------------------- 
> ---------
> t/BioGraphics.t    3   768    38    3  3-5
> t/PodSyntax.t      9  2304  2195    9  378 614 660 1023 1197 1512 1558
>                                         1932 2106
> t/Sopma.t          2   512    16    2  8 15
> t/genbank.t        2   512   247    2  122-123
>
>
> BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
> (unintentional?).
>
> Sopma may not be a bug: results from server might have changed.
>
> genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm  
> 1.163
> -> 1.164 not doing what the new tests expect.
>
> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan,  
> are
> you working on that, or can I fix those errors?
>
> Anyone care to look into those things?
>
> Cheers,
> Sendu.
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Thu Jun 14 12:43:43 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 17:43:43 +0100
Subject: [Bioperl-l] Perltidy
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <4671703F.4010109@sheffield.ac.uk>

I'm just wondering if anyone passes their modules through perltidy in
order for them to have the same look/feel? If so, do you have a
.perltidyrc file? Also, is it worth running the Bioperl modules through it?

Nath


From n.haigh at sheffield.ac.uk  Thu Jun 14 12:36:37 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 17:36:37 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <46716E95.3090604@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as 
>> outlined here: 
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
>>
>> I propose making this syntax change in all Bioperl POD documentation, so 
>> that the bad syntax is no longer suggested/encouraged. Any objections? 
>> If not, I'll go ahead and commit the changes.
>>
>> (affects 907 modules in live)
> 
> It was actually 515 modules & test scripts from live, 48 from run, 21
> from db and 2 from network.
> 
> Now committed. Before and after my changes these were failing:
> 
> 
> Failed Test     Stat Wstat Total Fail  List of Failed
> -------------------------------------------------------------------------------
> t/BioGraphics.t    3   768    38    3  3-5
> t/PodSyntax.t      9  2304  2195    9  378 614 660 1023 1197 1512 1558
>                                         1932 2106
> t/Sopma.t          2   512    16    2  8 15
> t/genbank.t        2   512   247    2  122-123
> 
> 
> BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
> (unintentional?).
> 
> Sopma may not be a bug: results from server might have changed.
> 
> genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163
> -> 1.164 not doing what the new tests expect.
> 
> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
> you working on that, or can I fix those errors?
> 

I can fix these - although I'm still trying to get my new Debian 4.0
system up-to-speed so it might take me a little while! RE the
PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't
installed. However, would it be better to have Test::Pod in t/lib so
that it runs on the user's system during installation or leave it as is?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGcW6VczuW2jkwy2gRAv3dAKCURgd4F881MhbessKxNh/cPrJu2wCeLwnS
7olroF2e6+4I0biz6fWRmu4=
=s3hK
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Thu Jun 14 13:15:24 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 18:15:24 +0100
Subject: [Bioperl-l] Perltidy
In-Reply-To: <4671703F.4010109@sheffield.ac.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk>
Message-ID: <467177AC.8060104@sendu.me.uk>

Nathan S. Haigh wrote:
> I'm just wondering if anyone passes their modules through perltidy in
> order for them to have the same look/feel? If so, do you have a
> .perltidyrc file? Also, is it worth running the Bioperl modules through it?

I don't use it, but I was contemplating the same thing. Chris uses it 
from time to time and I think we have a similar taste in style.

But we'd have to hammer something out that was agreeable to everyone.


From mmokrejs at ribosome.natur.cuni.cz  Thu Jun 14 13:19:42 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Thu, 14 Jun 2007 19:19:42 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
Message-ID: <467178AE.5040905@ribosome.natur.cuni.cz>


David Messina wrote:
> Hi Martin,
> 
> You're in luck -- the BioPerl core distribution includes two scripts  
> for doing just that:
> 
> 	genbank2gff

Somehow these scripts were not installed for me on Gentoo, but I have them in the
cvs copy. ;-) Anyway, the one above is not for me, I do not need the GFF database,
or better to say I have no intent to install that unknown thing, seems like an overkill
for my case. I just want to render a plasmid map.

> 	genbank2gff3

This one seems more promising but still with current cvs checkout I get...

$ perl /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl --in stdin --out stdout < ~/99.gb 
# Input: stdin
Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, <FH> line 7.
Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, <FH> line 7.
Can't call method "binomial" on an undefined value at /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl line 675, <FH> line 125.
$
$ bp_seqconvert.pl --from genbank --to embl < ~/IRESite/gb/99.gb 
Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, <STDIN> line 7.
Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, <STDIN> line 7.
ID   unknown; SV 1; circular; unassigned DNA; STD; UNC; 5391 BP.
XX
AC   unknown;
XX
XX
XX
CC   ApEinfo:methylated:0
...

Oh dear, I have just manually edited the files and still they are wrong? Oh no. :(

> 
> Look in the scripts directory of the distro.
> 
> Also, there is a *huge* amount of documentation and examples on the  
> BioPerl website.
> 
> 	http://www.bioperl.org/wiki/HOWTOs

You mean http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File ? ;-)

> 
> Reading those, reading the FAQ, and searching the mailing list  
> archives are where I look first when I don't know how to do something  
> in BioPerl.
> 
> 
> Dave
> 
> --
> Dave Messina
> Senior Analyst, Assembly Group
> Genome Sequencing Center
> Washington University
> St. Louis, MO
> 
> 
> 
> 
> 
> 

-- 
Dr. Martin Mokrejs
Dept. of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 99.gb
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070614/fc6e601a/attachment-0002.pl>

From mmokrejs at ribosome.natur.cuni.cz  Thu Jun 14 13:23:28 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Thu, 14 Jun 2007 19:23:28 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <467178AE.5040905@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
Message-ID: <46717990.6040509@ribosome.natur.cuni.cz>

Martin MOKREJŠ wrote:

>> Also, there is a *huge* amount of documentation and examples on the  
>> BioPerl website.
>>
>>     http://www.bioperl.org/wiki/HOWTOs
> 
> You mean 
> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File 
> ? ;-)

$ perl embl2picture.pl ~/99.gb | display -
Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature Bio::Location::Simple=HASH(0x893ebac): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature Bio::Location::Simple=HASH(0x893e720): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.
$

The plasmid is a circular DNA, why is the diagram linear? ;-)

Martin


From bix at sendu.me.uk  Thu Jun 14 13:03:34 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 18:03:34 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716E95.3090604@sheffield.ac.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<46716E95.3090604@sheffield.ac.uk>
Message-ID: <467174E6.1090001@sendu.me.uk>

Nathan S. Haigh wrote:
>> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
>> you working on that, or can I fix those errors?
> 
> I can fix these - although I'm still trying to get my new Debian 4.0
> system up-to-speed so it might take me a little while! RE the
> PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't
> installed. However, would it be better to have Test::Pod in t/lib so
> that it runs on the user's system during installation or leave it as is?

Leave it as is. Every-day users don't need to check the syntax of the 
pod. In fact, it really only needs to be done once, prior to packaging 
up a new release.


From n.haigh at sheffield.ac.uk  Thu Jun 14 13:32:37 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 18:32:37 +0100
Subject: [Bioperl-l] Perltidy
In-Reply-To: <467177AC.8060104@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
Message-ID: <46717BB5.8000706@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> I'm just wondering if anyone passes their modules through perltidy in
>> order for them to have the same look/feel? If so, do you have a
>> .perltidyrc file? Also, is it worth running the Bioperl modules
>> through it?
> 
> I don't use it, but I was contemplating the same thing. Chris uses it
> from time to time and I think we have a similar taste in style.
> 
> But we'd have to hammer something out that was agreeable to everyone.

A starting place may be Perl Best Practices by Damian Conway:
http://www.oreilly.com/catalog/perlbp/


The perltidyrc file can be found here:
http://www.perlmonks.org/?node_id=485885

I also found this nice thread with some ideas, including some code that
causes emacs to auto-perltidy everything you use cperl-mode with. I don't
use emacs myself, but here's the link if anyone is interested:
http://www.perlmonks.org/?node_id=516501

Nath


From johnsonm at gmail.com  Thu Jun 14 13:38:31 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Thu, 14 Jun 2007 12:38:31 -0500
Subject: [Bioperl-l] Perltidy
In-Reply-To: <467177AC.8060104@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
Message-ID: <ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>

    The nice thing about Perl Tidy is that everybody can have their
own config file.  There could be a bioperl default config that gets
applied at checkin time.  Anybody that didn't like it could script
checkouts to get run through their own config.  Diffs might get a
little hairy, but as long as you tidy before diffing, it shouldn't be
too bad.  Speaking of which....coding style is controversial enough,
but since that's already been opened, what about CVS vs Subversion? 8)
 Some of the scripting for this sort of thing might be easier in
Subversion.  Though maybe something like Git would fit the developer
model better (more support for distributed development).

On 6/14/07, Sendu Bala <bix at sendu.me.uk> wrote:
> Nathan S. Haigh wrote:
> > I'm just wondering if anyone passes their modules through perltidy in
> > order for them to have the same look/feel? If so, do you have a
> > .perltidyrc file? Also, is it worth running the Bioperl modules through it?
>
> I don't use it, but I was contemplating the same thing. Chris uses it
> from time to time and I think we have a similar taste in style.
>
> But we'd have to hammer something out that was agreeable to everyone.
>


From n.haigh at sheffield.ac.uk  Thu Jun 14 13:39:39 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 18:39:39 +0100
Subject: [Bioperl-l] cvs changes in working copy
Message-ID: <46717D5B.5040108@sheffield.ac.uk>

Not sure if I'm being dense or if it's because I've been working with
svn recently, but - how do I get a list of files that are different in
my working copy compared to the repository?

Cheers
Nath
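
One common way to get such a list is "cvs -n -q update", which reports
what an update would do without touching the working copy: each line is
a one-letter status (M for locally modified, U or P for files that are
newer in the repository, A and R for scheduled adds and removes, C for
conflicts, ? for files unknown to CVS) followed by a path. A minimal
sketch that groups that report by status is shown below; the helper
name parse_cvs_update() and the --run switch are assumptions made for
this example only.

```perl
use strict;
use warnings;

# Group the lines printed by "cvs -n -q update" by their status letter.
sub parse_cvs_update {
    my ($output) = @_;
    my %files_by_status;
    for my $line ( split /\n/, $output ) {
        # Each report line is a one-letter status, a space, and a path;
        # anything else (warnings, server chatter) is skipped.
        next unless $line =~ /^([UPARMC?]) (.+)$/;
        push @{ $files_by_status{$1} }, $2;
    }
    return \%files_by_status;
}

# Only call cvs when asked to, so the sketch loads without a checkout.
if ( @ARGV && $ARGV[0] eq '--run' ) {
    my $output = `cvs -n -q update 2>&1`;
    my $report = parse_cvs_update( defined $output ? $output : '' );
    for my $status ( sort keys %$report ) {
        print "$status $_\n" for @{ $report->{$status} };
    }
}
```

Run inside a checkout with --run, this prints the same information as
the plain cvs command, sorted by status letter.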


From cjfields at uiuc.edu  Thu Jun 14 13:46:38 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 12:46:38 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
Message-ID: <CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>

Is 99.gb supposed to be a GenBank file?  And you're loading it into  
embl2picture (which I assume takes EMBL format files)?  Without  
example code we can easily make the wrong assumptions (i.e. that this  
is user error and not a BioPerl problem).

Also, I don't believe the feature plotting scripts plot circular  
chromosomes/plasmids.  If you want this functionality you'll have to  
code it for yourself.

chris

On Jun 14, 2007, at 12:23 PM, Martin MOKREJŠ wrote:

> Martin MOKREJŠ wrote:
>
>>> Also, there is a *huge* amount of documentation and examples on the
>>> BioPerl website.
>>>
>>>     http://www.bioperl.org/wiki/HOWTOs
>>
>> You mean
>> http://www.bioperl.org/wiki/ 
>> HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
>> ? ;-)
>
> $ perl embl2picture.pl ~/99.gb | display -
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature  
> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature  
> Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature  
> Bio::Location::Simple=HASH(0x893ebac): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature  
> Bio::Location::Simple=HASH(0x893e720): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature  
> Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
> $
>
> The plasmid is a circular DNA, why is the diagram in linear? ;-)
>
> Martin
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From arareko at campus.iztacala.unam.mx  Thu Jun 14 13:57:35 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 14 Jun 2007 12:57:35 -0500
Subject: [Bioperl-l] Perltidy
In-Reply-To: <46717BB5.8000706@sheffield.ac.uk>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk> <46717BB5.8000706@sheffield.ac.uk>
Message-ID: <4671818F.5040902@campus.iztacala.unam.mx>

I think a consensus .perltidyrc could be placed in the source distribution.

Mauricio.

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> I'm just wondering if anyone passes their modules through perltidy in
>>> order for them to have the same look/feel? If so, do you have a
>>> .perltidyrc file? Also, is it worth running the Bioperl modules
>>> through it?
>> I don't use it, but I was contemplating the same thing. Chris uses it
>> from time to time and I think we have a similar taste in style.
>>
>> But we'd have to hammer something out that was agreeable to everyone.
> 
> A starting place may be Perl Best Practices by Damian Conway:
> http://www.oreilly.com/catalog/perlbp/
> 
> 
> The perltidyrc file can be found here:
> http://www.perlmonks.org/?node_id=485885
> 
> I also found this nice thread with some ideas, inc some code that causes
> emacs to auto-perltidy everything you use cperl-mode with. I don't use
> emacs myself, but here's the link if anyone is interested:
> http://www.perlmonks.org/?node_id=516501
> 
> Nath

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From cjfields at uiuc.edu  Thu Jun 14 14:32:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 13:32:41 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
Message-ID: <BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>

To chip in on this, I only use perltidy when I need to clean bioperl  
code up for debugging (particularly if blocks are hard to see) and  
just use the defaults.  I agree it would be nice to have everything  
tidied up but it'll definitely need to be a consensus config file.

About svn, I like the idea of eventually migrating to using it over  
CVS (I think BioPython and BioJava have plans to but I'm not sure)  
but I don't really know enough to say how feasible/difficult the  
migration path would be.  Anyone know?

chris

On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote:

>     The nice thing about Perl Tidy is that everybody can have their
> own config file.  There could be a bioperl default config that gets
> applied at checkin time.  Anybody that didn't like it could script
> checkouts to get run through their own config.  Diffs might get a
> little hairy, but as long as you tidy before diffing, it shouldn't be
> too bad.  Speaking of which....coding style is controversial enough,
> but since that's already been opened, what about CVS vs Subversion? 8)
>  Some of the scripting for this sort of thing might be easier in
> Subversion.  Though maybe something like Git would fit the developer
> model better (more support for distributed development).
>
> On 6/14/07, Sendu Bala <bix at sendu.me.uk> wrote:
>> Nathan S. Haigh wrote:
>>> I'm just wondering if anyone passes their modules through  
>>> perltidy in
>>> order for them to have the same look/feel? If so, do you have a
>>> .perltidyrc file? Also, is it worth running the Bioperl modules  
>>> through it?
>>
>> I don't use it, but I was contemplating the same thing. Chris uses it
>> from time to time and I think we have a similar taste in style.
>>
>> But we'd have to hammer something out that was agreeable to everyone.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnsonm at gmail.com  Thu Jun 14 14:46:24 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Thu, 14 Jun 2007 13:46:24 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
Message-ID: <ebf5eb170706141146r6e07efffhbb98a6d101c45ccd@mail.gmail.com>

    If there was a default/standard/consensus bioperl perltidy config
file, I would probably use it prior to checkin, on my own, so I could
code in my schizophrenic style without worrying about starting any
format wars.  When I'm fixing or enhancing somebody else's code, I
always try and adapt to whatever style they used, even if it grates on
my nerves.  I'd love to not have to worry about that with Bioperl.  Of
course, nobody will ever agree on a standard, so it's probably a moot
point.  8)

On 6/14/07, Chris Fields <cjfields at uiuc.edu> wrote:
> To chip in on this, I only use perltidy when I need to clean bioperl
> code up for debugging (particularly if blocks are hard to see) and
> just use the defaults.  I agree it would be nice to have everything
> tidied up but it'll definitely need to be a consensus config file.
>
> About svn, I like the idea of eventually migrating to using it over
> CVS (I think BioPython and BioJava have plans to but I'm not sure)
> but I don't really know enough to say how feasible/difficult the
> migration path would be.  Anyone know?
>
> chris


From jason at bioperl.org  Thu Jun 14 15:00:09 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 14 Jun 2007 12:00:09 -0700
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
Message-ID: <CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>


On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:

> To chip in on this, I only use perltidy when I need to clean bioperl
> code up for debugging (particularly if blocks are hard to see) and
> just use the defaults.  I agree it would be nice to have everything
> tidied up but it'll definitely need to be a consensus config file.
>

Can we do any sort of massive conversion at some logical timepoint?
Probably after a branch release or something?  Because it basically
means we're going to have differences on nearly every line, which is
going to make diff-ing difficult when debugging old/new versions.
Maybe it is not a problem because we aren't introducing any new bugs!

> About svn, I like the idea of eventually migrating to using it over
> CVS (I think BioPython and BioJava have plans to but I'm not sure)
> but I don't really know enough to say how feasible/difficult the
> migration path would be.  Anyone know?
>

It's doable but non-trivial.  cvs2svn (python gah!) script exists to  
help in this.  There are pros and cons to converting.   There is a  
fair amount of documentation and other pointers out there that point  
to the CVS server for getting latest code so we'd need to think about  
whether we'd support some sort of backwards compatible SVN -> CVS for  
read-only or what.

Mostly it will need someone to lead the charge - I made a go at doing  
it in the winter, but I really don't have the SVN-foo to make this  
work.  We'd need someone with SVN experience to step up and help.   
You can always try and we can play with the converted repository for  
a while without making it the new code base.

-j

> chris
>
> On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote:
>
>>     The nice thing about Perl Tidy is that everybody can have their
>> own config file.  There could be a bioperl default config that gets
>> applied at checkin time.  Anybody that didn't like it could script
>> checkouts to get run through their own config.  Diffs might get a
>> little hairy, but as long as you tidy before diffing, it shouldn't be
>> too bad.  Speaking of which....coding style is controversial enough,
>> but since that's already been opened, what about CVS vs  
>> Subversion? 8)
>>  Some of the scripting for this sort of thing might be easier in
>> Subversion.  Though maybe something like Git would fit the developer
>> model better (more support for distributed development).
>>
>> On 6/14/07, Sendu Bala <bix at sendu.me.uk> wrote:
>>> Nathan S. Haigh wrote:
>>>> I'm just wondering if anyone passes their modules through
>>>> perltidy in
>>>> order for them to have the same look/feel? If so, do you have a
>>>> .perltidyrc file? Also, is it worth running the Bioperl modules
>>>> through it?
>>>
>>> I don't use it, but I was contemplating the same thing. Chris  
>>> uses it
>>> from time to time and I think we have a similar taste in style.
>>>
>>> But we'd have to hammer something out that was agreeable to  
>>> everyone.
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Thu Jun 14 15:01:27 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 14 Jun 2007 12:01:27 -0700
Subject: [Bioperl-l] cvs changes in working copy
In-Reply-To: <46717D5B.5040108@sheffield.ac.uk>
References: <46717D5B.5040108@sheffield.ac.uk>
Message-ID: <EE64F124-7DA2-4FB1-BE9B-C267126FCF6F@bioperl.org>

cvs update | grep '^M'

On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote:

> Not sure if I'm being dense or if it's because I've been working with
> svn recently, but - how do I get a list of files that are different in
> my working copy compared to the repository?
>
> Cheers
> Nath

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From cjfields at uiuc.edu  Thu Jun 14 15:20:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 14:20:46 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
Message-ID: <C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>


On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote:

>
> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:
>
>> To chip in on this, I only use perltidy when I need to clean bioperl
>> code up for debugging (particularly if blocks are hard to see) and
>> just use the defaults.  I agree it would be nice to have everything
>> tidied up but it'll definitely need to be a consensus config file.
>>
>
> Can we do any sort of massive conversion at some logical timepoint?
> Probably after a branch release or something?  Because it basically
> means we're going to have differences on nearly every line, which is
> going to make diff-ing difficult when debugging old/new versions.
> Maybe it is not a problem because we aren't introducing any new bugs!

I agree; if we intend on doing this it should be all at once, maybe  
on a branch dedicated to ensure that code changes don't tank tests  
(they shouldn't but one never knows).  We would then need a script up- 
and-running that tidies everything up prior to commits (though what  
happens if perltidy tanks?...).
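
As a rough idea of what such a pre-commit check could look like, here is a minimal sketch. It assumes a shared profile named .perltidyrc at the top of the checkout and uses perltidy's --profile, -st (write to standard output) and -se (errors to standard error) options; the function name untidy_files and the TIDY_CMD variable are made up for illustration. It only reports files whose tidied form differs and rewrites nothing, so a perltidy failure shows up as a reported file rather than a mangled one.

```shell
# Formatter command; overridable so another tool or profile can be swapped in.
# The .perltidyrc location is an assumption, not an agreed project file.
TIDY_CMD=${TIDY_CMD:-"perltidy --profile=.perltidyrc -st -se"}

# Print the name of every file whose formatted output differs from
# what is currently on disk. Nothing is modified.
untidy_files() {
    for f in "$@"; do
        # Formatter output arrives on stdin of cmp and is compared with the file.
        if ! $TIDY_CMD "$f" | cmp -s - "$f"; then
            echo "$f"
        fi
    done
}

# Usage (hypothetical): untidy_files Bio/SeqIO.pm Bio/Seq.pm
```

An empty result would mean the listed files already match the profile; anything printed would need tidying (or a closer look) before commit.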

Sendu, up for it?

>> About svn, I like the idea of eventually migrating to using it over
>> CVS (I think BioPython and BioJava have plans to but I'm not sure)
>> but I don't really know enough to say how feasible/difficult the
>> migration path would be.  Anyone know?
>>
>
> It's doable but non-trivial.  cvs2svn (python gah!) script exists to
> help in this.  There are pros and cons to converting.   There is a
> fair amount of documentation and other pointers out there that point
> to the CVS server for getting latest code so we'd need to think about
> whether we'd support some sort of backwards compatible SVN -> CVS for
> read-only or what.
>
> Mostly it will need someone to lead the charge - I made a go at doing
> it in the winter, but I really don't have the SVN-foo to make this
> work.  We'd need someone with SVN experience to step up and help.
> You can always try and we can play with the converted repository for
> a while without making it the new code base.
>
> -j

Stepped into that one, didn't I!  I'll look into how much effort is  
involved and try getting something going in the next month or two,  
maybe sooner if time permits.  I'm lacking on SVN-foo as well but it  
might be worth looking into.

chris


From arareko at campus.iztacala.unam.mx  Thu Jun 14 15:50:39 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 14 Jun 2007 14:50:39 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
Message-ID: <46719C0F.5010706@campus.iztacala.unam.mx>

Chris Fields wrote:
> On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote:
> 
>> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:
>>
>>> About svn, I like the idea of eventually migrating to using it over
>>> CVS (I think BioPython and BioJava have plans to but I'm not sure)
>>> but I don't really know enough to say how feasible/difficult the
>>> migration path would be.  Anyone know?
>>>
>> It's doable but non-trivial.  cvs2svn (python gah!) script exists to
>> help in this.  There are pros and cons to converting.   There is a
>> fair amount of documentation and other pointers out there that point
>> to the CVS server for getting latest code so we'd need to think about
>> whether we'd support some sort of backwards compatible SVN -> CVS for
>> read-only or what.
>>
>> Mostly it will need someone to lead the charge - I made a go at doing
>> it in the winter, but I really don't have the SVN-foo to make this
>> work.  We'd need someone with SVN experience to step up and help.
>> You can always try and we can play with the converted repository for
>> a while without making it the new code base.
>>
>> -j
> 
> Stepped into that one, didn't I!  I'll look into how much effort is  
> involved and try getting something going in the next month or two,  
> maybe sooner if time permits.  I'm lacking on SVN-foo as well but it  
> might be worth looking into.
> 
> chris
> 

Chris D has worked with CVS-SVN transitioning for other projects, maybe 
he can shed some light on this.

Mauricio.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From sac at bioperl.org  Thu Jun 14 17:33:39 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Thu, 14 Jun 2007 14:33:39 -0700
Subject: [Bioperl-l] How can I pull out all instances of a motif from a
	genome sequence and output them as a BED file?
In-Reply-To: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>
References: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>
	<5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>
Message-ID: <8f200b4c0706141433i37267774u1dc2193d8508c47b@mail.gmail.com>

This issue was discussed recently here. Check out this thread:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15046/focus=15048

Some of the tools mentioned in the FAQ item Chris mentioned do not
report where the match occurred, only that a match occurred
(String::Approx, agrep), though some do report match
locations (fuzznuc, fuzzprot; not sure about TFBS).

My Bio::Tools::SeqPattern module does not even perform any matches, it
just encapsulates a regular expression for a nuc or protein motif and
knows how to handle ambiguity code expansion and reverse
complementing. The idea is that you can use this to convert a
biological sequence motif into a string suitable for use in a perl
regex. Adding a match() method to this module would be handy.

There is an example script for it in examples/tools of the distro (which,
btw, references an obsolete module, so it won't run as is -- I'll fix).

Steve

On 6/13/07, Chris Fields <cjfields at uiuc.edu> wrote:
> This is answered in the FAQ (sorry if the URL wraps, but we don't
> like tinyurls):
>
> http://www.bioperl.org/wiki/
> FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_.
> 22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F
>
> chris
>
> On Jun 13, 2007, at 7:20 PM, John Cumbers wrote:
>
> > Hello,
> >
> > I have a simple problem, I'm trying to search a genome sequence for
> > a motif,
> > I then want to output a BED file to display all the locations of
> > this motif
> > on the UCSC Genome Browser.  I could not find a script to do this,
> > so I
> > started to write my own.   I'm new to perl and my code below was my
> > attempt
> > to read the sequence string and output the index bp of the start of
> > each
> > motif.  With this I could build the BED file myself, which requires
> > start
> > and finish base pairs.
> >
> > For the first motif I can output the start index, but when I try
> > and read
> > the next one off the sequence it does not work.  Instead I just get an
> > output of a list of 1's.  I realise that this is more a request for
> > some
> > simple perl help, but any help much appreciated.
> >
> > Best wishes,
> > John
> >
> >
> > $seq_object = read_sequence
> > ("Drosophila.Chr3.test.AE014296.fasta");  #turn
> > my FASTA file into a seq object.
> > $sequence_as_a_string = $seq_object->seq();  #turn it into a string
> > # search $sequence_as_a_string  string for motif AAA as example
> > # if found, return the index that it is found at
> >
> > while ($sequence_as_a_string =~ m/AAA/g) {
> >   print "Found '$&'.  Next attempt at character " .
> > pos($sequence_as_a_string)+1 . "\n";
> > }
> >
> >
> >
> > --
> > John Cumbers,  Graduate Student
> > Biology and Medicine
> > Brown University, Box G-W
> > Providence, Rhode Island, 02912, USA
> > Tel USA: +1 401 523 8190,  Fax: +1 401 863-2166
> > UK to USA: 0207 617 7824
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>


From hlapp at gmx.net  Thu Jun 14 19:04:11 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 14 Jun 2007 19:04:11 -0400
Subject: [Bioperl-l] cvs changes in working copy
In-Reply-To: <EE64F124-7DA2-4FB1-BE9B-C267126FCF6F@bioperl.org>
References: <46717D5B.5040108@sheffield.ac.uk>
	<EE64F124-7DA2-4FB1-BE9B-C267126FCF6F@bioperl.org>
Message-ID: <3B262E6A-2C90-49FA-BCA1-BF1900C5AC3A@gmx.net>

Actually, that will update your working copy. If you just wanted to
take a peek you would use cvs status:

$ cvs status | grep 'Locally Modified'
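
Another option, if the update-style report is preferred: cvs has a global -n option that stops it from changing anything on disk, so the report can be previewed without touching the working copy. The sketch below is a small filter for that report; the function name modified_only is made up for illustration.

```shell
# Keep only the "M <path>" lines of update-style output and strip the
# one-letter status column, leaving bare file paths.
modified_only() {
    grep '^M ' | cut -c3-
}

# Usage (dry run: -n makes no changes on disk, -q trims directory chatter):
#   cvs -nq update | modified_only
```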

	-hilmar

On Jun 14, 2007, at 3:01 PM, Jason Stajich wrote:

> cvs update | grep '^M'
>
> On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote:
>
>> Not sure if I'm being dense or if it's because I've been working with
>> svn recently, but - how do I get a list of files that are  
>> different in
>> my working copy compared to the repository?
>>
>> Cheers
>> Nath
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From mmokrejs at ribosome.natur.cuni.cz  Fri Jun 15 03:28:17 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Fri, 15 Jun 2007 09:28:17 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
	<CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
Message-ID: <46723F91.60501@ribosome.natur.cuni.cz>

Chris Fields wrote:
> Is 99.gb supposed to be a GenBank file?  And you're loading it into 

Yes, it was attached to the email. ;)

> embl2picture (which I assume takes EMBL format files)?  Without example 
> code we can easily make the wrong assumptions (i.e. that this is user 
> error and not a BioPerl problem).

use constant USAGE =><<END;
Usage: $0 <file>
   Render a GenBank/EMBL entry into drawable form.
   Return as a GIF or PNG image on standard output.
 
   File must be in embl, genbank, or another SeqIO-
   recognized format.  Only the first entry will be
   rendered.
 
Example to try:
   embl2picture.pl factor7.embl | display -
 
END

> 
> Also, I don't believe the feature plotting scripts plot circular 
> chromosomes/plasmids.  If you want this functionality you'll have to 
> code it for yourself.

That's a pity it does not, but at least someone could improve the docs. ;)
Unfortunately I don't have the time to rewrite the code myself now,
I need a working, standalone, already available tool. :(
M.

> 
> chris
> 
> On Jun 14, 2007, at 12:23 PM, Martin MOKREJŠ wrote:
> 
>> Martin MOKREJŠ wrote:
>>
>>>> Also, there is a *huge* amount of documentation and examples on the
>>>> BioPerl website.
>>>>
>>>>     http://www.bioperl.org/wiki/HOWTOs
>>>
>>> You mean
>>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File 
>>>
>>> ? ;-)
>>
>> $ perl embl2picture.pl ~/99.gb | display -
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature 
>> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature 
>> Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature 
>> Bio::Location::Simple=HASH(0x893ebac): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature 
>> Bio::Location::Simple=HASH(0x893e720): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature 
>> Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>> $
>>
>> The plasmid is a circular DNA, why is the diagram in linear? ;-)
>>
>> Martin
>>
>>
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> 
> 

-- 
Dr. Martin Mokrejs
Dept. of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs


From dhoworth at mrc-lmb.cam.ac.uk  Fri Jun 15 04:59:09 2007
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Fri, 15 Jun 2007 09:59:09 +0100
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
Message-ID: <467254DD.3010505@mrc-lmb.cam.ac.uk>

Martin MOKREJŠ wrote:
>>> Also, there is a *huge* amount of documentation and examples on
>>> the BioPerl website.
>>> 
>>> http://www.bioperl.org/wiki/HOWTOs
>> You mean 
>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
>>  ? ;-)
> 
> $ perl embl2picture.pl ~/99.gb | display - Error returned while
> evaluating value of 'description' option for glyph
> Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature
> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl
> line 141, <GEN0> line 125.

Hmm an error at line 141 of a 69 line script? Methinks you're not
actually running the script that's presented on the wiki page you
quoted. I cut-and-pasted the script and your file and it worked for me
(at least, it produced an image, along with a bunch of OOPS lines)

HTH, Dave


From n.haigh at sheffield.ac.uk  Fri Jun 15 06:21:38 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 15 Jun 2007 11:21:38 +0100
Subject: [Bioperl-l] Installation using --install_base
Message-ID: <46726832.7080601@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm setting up a new installation of Debian 4.0 at home and thought I'd
try to install BioPerl as a normal user rather than root. So in CPAN
options I set the --install_base to /home/username/perl and set PERL5LIB
to point to the same place.

Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root
user and ask to install all optional modules, it tries to install them
through CPAN - however it seems to fail because some dependencies don't
seem to want to install in a user directory.

Has anyone else found this or might I be doing something wrong?
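
One thing that may be worth checking here: as far as I understand Module::Build, --install_base puts pure-Perl modules under <base>/lib/perl5 rather than directly in <base>, so PERL5LIB would need to point at that subdirectory, not at the install base itself. A minimal sketch under that assumption follows; the helper name perl5lib_for_base is made up.

```shell
# Print the library directory that corresponds to an install base,
# assuming the <base>/lib/perl5 layout used by Module::Build.
perl5lib_for_base() {
    printf '%s/lib/perl5\n' "${1%/}"   # ${1%/} drops one trailing slash
}

# Usage (paths are examples only):
#   export PERL5LIB="$(perl5lib_for_base "$HOME/perl")${PERL5LIB:+:$PERL5LIB}"
#   perl Build.PL --install_base "$HOME/perl"
#   ./Build && ./Build test && ./Build install
```

Dependencies pulled in through CPAN would still need their own install options set to the same base, which this sketch does not cover.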

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGcmgyczuW2jkwy2gRAtgqAKDIv717ciVHr5V+Z1kqPV2a++E8dgCfYr2a
VPt4tEPLW2J+BiKnN3B8aV8=
=c+9z
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Fri Jun 15 06:07:04 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 15 Jun 2007 11:07:04 +0100
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
Message-ID: <467264C8.4020202@sendu.me.uk>

Chris Fields wrote:
> On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote:
> 
>> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:
>>
>>> To chip in on this, I only use perltidy when I need to clean bioperl
>>> code up for debugging (particularly if blocks are hard to see) and
>>> just use the defaults.  I agree it would be nice to have everything
>>> tidied up but it'll definitely need to be a consensus config file.
>>>
>> Can we do any sort of massive conversion at some logical timepoint.
>> Probably after a branch release or something?  Because it basically
>> means we're going to have differences on nearly every line which is
>> going to make diff-ing difficult when debugging old/new versions.
>> Maybe it is not a problem because we aren't introducing any new bugs!

Sorry, can you clarify the problem you envisage? And why would making a 
branch release help?


> I agree; if we intend on doing this it should be all at once, maybe  
> on a branch dedicated to ensure that code changes don't tank tests  
> (they shouldn't but one never knows).  We would then need a script  
> up-and-running that tidies everything up prior to commits (though what  
> happens if perltidy tanks?...).
> 
> Sendu, up for it?

If it's going to be difficult and a hassle for such an unnecessary thing, 
I'm not sure it's worth it. There are more pressing things to be done for 
Bioperl.

If I can just run perltidy on the entire package and commit, I'd do it. 
If that's not appropriate, I won't.


>>> About svn
[snip]
> Stepped into that one, didn't I!  I'll look into how much effort is  
> involved and try getting something going in the next month or two,  
> maybe sooner if time permits.  I'm lacking on SVN-foo as well but it  
> might be worth looking into.

I'd put this in the unnecessary-but-nice category as well. If it will be 
as easy as my ->new change, go ahead. If not, there are more pressing 
matters (POD fixing, test script updating and finishing...).


From n.haigh at sheffield.ac.uk  Fri Jun 15 06:35:40 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 15 Jun 2007 11:35:40 +0100
Subject: [Bioperl-l] Installation using --install_base
Message-ID: <46726B7C.7070902@sheffield.ac.uk>

I'm setting up a new installation of Debian 4.0 at home and thought I'd
try to install BioPerl as a normal user rather than root. So in the CPAN
options I set --install_base to /home/username/perl and set PERL5LIB
to point to the same place.

Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non-root
user and ask to install all optional modules, it tries to install them
through CPAN. However, it seems to fail because some dependencies don't
seem to want to install in a user directory.

Has anyone else found this or might I be doing something wrong?

Nath


From bix at sendu.me.uk  Fri Jun 15 06:45:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 15 Jun 2007 11:45:48 +0100
Subject: [Bioperl-l] Installation using --install_base
In-Reply-To: <46726832.7080601@sheffield.ac.uk>
References: <46726832.7080601@sheffield.ac.uk>
Message-ID: <46726DDC.8090202@sendu.me.uk>

Nathan S. Haigh wrote:
> I'm setting up a new installation of Debian 4.0 at home and though I'd
> try to install BioPerl as a normal user rather than root. So in CPAN
> options I set the --install_base to /home/username/perl and set PERL5LIB
> to point to the same place.
> 
> Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root
> user and ask to install all optional modules, it tries to install them
> through CPAN - however it seems to fail because some dependencies don't
> seem to want to install in a user directory.
> 
> Has anyone else found this or might I be doing something wrong?

You'll need to configure CPAN to install into your user directory. 
Upgrade CPAN to the latest version, then read the docs on its various 
configurable options. I thought I had at least mentioned this in the Bioperl 
INSTALL doc. If not, can someone come up with a concise clarification?
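
As a rough illustration of the CPAN configuration mentioned above, the
following is a minimal sketch, not the text of the INSTALL doc. It assumes
the install base is $HOME/perl (the path from the question) and that the
installed CPAN.pm supports the makepl_arg and mbuildpl_arg settings. Note
that modules installed under an install base end up in its lib/perl5
subdirectory, so PERL5LIB probably needs to point there rather than at the
base directory itself.

```shell
# Assumed install base, taken from the path mentioned in the question.
INSTALL_BASE="$HOME/perl"

# Modules installed with --install_base land in lib/perl5 beneath the base,
# so PERL5LIB should point there, keeping any existing entries after it.
PERL5LIB="$INSTALL_BASE/lib/perl5${PERL5LIB:+:$PERL5LIB}"
export PERL5LIB

# Print the settings to enter at the cpan shell prompt so that dependencies
# built with ExtUtils::MakeMaker or Module::Build use the same base.
cat <<EOF
o conf makepl_arg "INSTALL_BASE=$INSTALL_BASE"
o conf mbuildpl_arg "--install_base $INSTALL_BASE"
o conf commit
EOF
```

The export line would normally go in a shell startup file so that it is
also in effect when "perl Build.PL" is run for BioPerl itself.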


From sdavis2 at mail.nih.gov  Fri Jun 15 06:56:08 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 15 Jun 2007 06:56:08 -0400
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <467264C8.4020202@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
Message-ID: <46727048.3080904@mail.nih.gov>

Sendu Bala wrote:
> If its going to be difficult and a hassle, for such an unnecessary thing 
> I'm not sure its worth it. There are more pressing things to be done for 
> Bioperl.
> 
> If I can just run perltidy on the entire package and commit, I'd do it. 
> If that's not appropriate, I won't.

I agree with the sentiment noted above.  I'm a bit of an outsider here,
but bioperl is a collaborative project.  Not everyone has the same
sentiments about what "correct" style means.  As a programmer, I really
wouldn't want significant changes on the style of my code.  And perl
happily puts up with many styles.  I would say leave things as they
are--let the individual programmers choose.  It reduces the amount of
work of questionable importance and allows the coding style freedom that
perl supports.

Just my $.02.

Sean


From cjfields at uiuc.edu  Fri Jun 15 10:05:07 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 09:05:07 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <46723F91.60501@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
	<CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
	<46723F91.60501@ribosome.natur.cuni.cz>
Message-ID: <A2212781-75F3-4BB7-967F-1668B682E84E@uiuc.edu>


On Jun 15, 2007, at 2:28 AM, Martin MOKREJŠ wrote:

> Chris Fields wrote:
>> Is 99.gb supposed to be a GenBank file?  And you're loading it into
>
> Yes, it was attached to the email. ;)

<bring foot to mouth and insert>

Sorry about that.  I notice that '.' was added, but the spacing  
seemed off.  I think bioperl catches that fine but it's something  
Wayne should consider.

>> embl2picture (which I assume takes EMBL format files)?  Without  
>> example
>> code we can easily make the wrong assumptions (i.e. that this is user
>> error and not a BioPerl problem).
>
> use constant USAGE =><<END;
> Usage: $0 <file>
>    Render a GenBank/EMBL entry into drawable form.
>    Return as a GIF or PNG image on standard output.
>
>    File must be in embl, genbank, or another SeqIO-
>    recognized format.  Only the first entry will be
>    rendered.
>
> Example to try:
>    embl2picture.pl factor7.embl | display -
>
> END

Horribly named script (should be seq2picture, since it converts both  
gb/embl).  The use of 'all_tags' makes me think the script version  
you are using is old, as those methods have long since been renamed.   
Dave has it working though, so maybe your version has been updated?   
The 'use of uninitialized value in' warnings are probably from inclusion  
of mandatory fields with no data or '.'.

>> Also, I don't believe the feature plotting scripts plot circular
>> chromosomes/plasmids.  If you want this functionality you'll have to
>> code it for yourself.
>
> That's a pity it does not, but at least if someone could improve  
> the docs. ;)
> Unfortunately I don't have the time to rewrite the code myself now,
> I need a working, standalone, already available tool. :(
> M.

As I said, unless someone shows interest and codes it, it just won't get  
done.  We have had very little interest in this, either b/c there are  
tools already out there to do this very thing (multitudes of plasmid  
drawing programs, some free like ApE) or b/c nobody's bothered to  
write it up.

chris


From cjfields at uiuc.edu  Fri Jun 15 10:22:23 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 09:22:23 -0500
Subject: [Bioperl-l] Perltidy and... SVN and ...Re:  Perltidy
In-Reply-To: <46727048.3080904@mail.nih.gov>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov>
Message-ID: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>


On Jun 15, 2007, at 5:56 AM, Sean Davis wrote:

> Sendu Bala wrote:
>> If its going to be difficult and a hassle, for such an unnecessary  
>> thing
>> I'm not sure its worth it. There are more pressing things to be  
>> done for
>> Bioperl.
>>
>> If I can just run perltidy on the entire package and commit, I'd  
>> do it.
>> If that's not appropriate, I won't.
>
> I agree with the sentiment noted above.  I'm a bit of an outsider  
> here,
> but bioperl is a collaborative project.  Not everyone has the same
> sentiments about what "correct" style means.  As a programmer, I  
> really
> wouldn't want significant changes on the style of my code.  And perl
> happily puts up with many styles.  I would say leave things as they
> are--let the individual programmers choose.  It reduces the amount of
> work of questionable importance and allows the coding style freedom  
> that
> perl supports.
>
> Just my $.02.
>
> Sean

I tend to run it on modules that need some reformatting  
(SearchIO::blast comes to mind).  I believe you're correct when this  
comes down to programming style, but I think this echoes a sentiment  
(frustration, perhaps) that some of us have with long-term  
maintenance of said code.

Maybe a compromise:  include a copy of .perltidyrc with the  
distribution that goes by what a consensus wants or by the general  
rules laid out in Perl Best Practices (spaced settings, use of spaces  
over tabs, etc).  Conversion would be encouraged but voluntary, with  
the caveat that if someone needs to clean up code down the road (bug  
fixes, enhancements, etc) and if the original author isn't able to  
add it in themselves, it could be perltidy'd in order to help the  
developer (locate and fix the issue)|(add relevant enhancement where  
needed).
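
A shared profile along these lines might look like the sketch below. It is
only an illustration: the file name perltidyrc.example is hypothetical, and
the option choices are assumptions standing in for whatever a consensus
would settle on. The module path is the SearchIO::blast example mentioned
above, and the tidy step is skipped if perltidy or the file is missing.

```shell
# Write a hypothetical shared profile. One option per line; '#' starts a comment.
cat > perltidyrc.example <<'EOF'
# Start from the Perl Best Practices option set
-pbp
# 4-column indentation (spaces are the default; -t would select tabs)
-i=4
# Wrap lines at 78 columns
-l=78
EOF

# Tidy a single module with that profile, only when the tool and file exist.
# -b edits the file in place and keeps the original as a .bak copy.
module="Bio/SearchIO/blast.pm"
if command -v perltidy >/dev/null 2>&1 && [ -f "$module" ]; then
    perltidy -pro=perltidyrc.example -b "$module"
fi
```

Keeping the profile in the distribution rather than in each developer's
home directory means a voluntary tidy run gives the same result for
everyone.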

chris


From cjfields at uiuc.edu  Fri Jun 15 10:56:23 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 09:56:23 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <467264C8.4020202@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
Message-ID: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>


On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:

>>>> ...
>>> Can we do any sort of massive conversion at some logical timepoint.
>>> Probably after a branch release or something?  Because it basically
>>> means we're going to have differences on nearly every line which is
>>> going to make diff-ing difficult when debugging old/new versions.
>>> Maybe it is not a problem because we aren't introducing and new  
>>> bugs!
>
> Sorry, can you clarify the problem you envisage? And why would  
> making a branch release help?

Maybe the worry is that mass conversion in such a large codebase  
could lead to hard-to-locate bugs.  Shouldn't occur but who knows w/o  
trying?
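
One cheap way to try is sketched below, under the assumption that a tidy
run should change nothing but whitespace. The function name is
hypothetical, and the check is only a heuristic: a mismatch is not proof of
a bug (perltidy can add optional semicolons by default), and a match says
nothing about whitespace inside strings or here-docs.

```shell
# Succeeds when two files are identical once all whitespace is removed.
# Intended as a rough sanity check on a tidied copy of a module.
same_ignoring_whitespace() {
    siw_a=$(tr -d '[:space:]' < "$1")
    siw_b=$(tr -d '[:space:]' < "$2")
    [ "$siw_a" = "$siw_b" ]
}

# Possible usage (by default perltidy writes its output to <file>.tdy):
#   perltidy Foo.pm && same_ignoring_whitespace Foo.pm Foo.pm.tdy \
#       || echo "Foo.pm: non-whitespace change, inspect by hand"
```

Files that fail the check could be left untidied or reviewed by hand, and
the test suite still has to pass either way.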

>> I agree; if we intend on doing this it should be all at once,  
>> maybe  on a branch dedicated to ensure that code changes don't  
>> tank tests  (they shouldn't but one never knows).  We would then  
>> need a script up-and-running that tidies everything up prior to  
>> commits (though what  happens if perltidy tanks?...).
>> Sendu, up for it?
>
> If its going to be difficult and a hassle, for such an unnecessary  
> thing I'm not sure its worth it. There are more pressing things to  
> be done for Bioperl.
>
> If I can just run perltidy on the entire package and commit, I'd do  
> it. If that's not appropriate, I won't.

The choices aren't necessarily all or nothing.  What about voluntary,  
recommended use of a perltidy config file included with the  
distribution, with additional 'caveats'?  See my response to Sean.

>>>> About svn
> [snip]
>> Stepped into that one, didn't I!  I'll look into how much effort  
>> is  involved and try getting something going in the next month or  
>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as  
>> well but it  might be worth looking into.
>
> I'd put this in the unnecessary-but-nice category as well. If it  
> will be as easy as my ->new change, go ahead. If not, there are  
> more pressing matters (POD fixing, test script updating and  
> finishing...).

A few other open-bio projects have actively discussed a CVS->SVN  
migration (BioRuby and I think BioPython, though I could be wrong  
about the latter).  As I said, "it might be worth looking into" to weigh  
the pros/cons, get opinions from others who have made the  
transition, etc.  We could, as Jason suggested, even set up a tester  
SVN w/o making it the default codebase (lock it off to a few testers,  
have CVS commits automatically/manually carry over to SVN, etc).

I agree with you that it's not feasible to switch over prior to a  
release and that there are more pressing issues, but it doesn't hurt  
having an open discussion about it.

chris


From sdavis2 at mail.nih.gov  Fri Jun 15 11:15:57 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 15 Jun 2007 11:15:57 -0400
Subject: [Bioperl-l] Perltidy and... SVN and ...Re:  Perltidy
In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov>
	<78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
Message-ID: <4672AD2D.2090001@mail.nih.gov>

Chris Fields wrote:
> 
> On Jun 15, 2007, at 5:56 AM, Sean Davis wrote:
> 
>> Sendu Bala wrote:
>>> If its going to be difficult and a hassle, for such an unnecessary thing
>>> I'm not sure its worth it. There are more pressing things to be done for
>>> Bioperl.
>>>
>>> If I can just run perltidy on the entire package and commit, I'd do it.
>>> If that's not appropriate, I won't.
>>
>> I agree with the sentiment noted above.  I'm a bit of an outsider here,
>> but bioperl is a collaborative project.  Not everyone has the same
>> sentiments about what "correct" style means.  As a programmer, I really
>> wouldn't want significant changes on the style of my code.  And perl
>> happily puts up with many styles.  I would say leave things as they
>> are--let the individual programmers choose.  It reduces the amount of
>> work of questionable importance and allows the coding style freedom that
>> perl supports.
>>
>> Just my $.02.
>>
>> Sean
> 
> I tend to run it on modules that need some reformatting (SearchIO::blast
> comes to mind).  I believe you're correct when this comes down to
> programming style, but I think this echoes a sentiment (frustration,
> perhaps) that some of us have with long-term maintenance of said code.
> 
> Maybe a compromise:  include a copy of .perltidyrc with the distribution
> that goes by what a consensus wants or by the general rules laid out in
> Perl Best Practices (spaced settings, use of spaces over tabs, etc). 
> Conversion would be encouraged but voluntary, with the caveat that if
> someone needs to clean up code down the road (bug fixes, enhancements,
> etc) and if the original author isn't able to add it in themselves, it
> could be perltidy'd in order to help the developer (locate and fix the
> issue)|(add relevant enhancement where needed).

Don't get me wrong--I think whatever makes bioperl a better, more
maintainable beast should be what is done.  The bioperl gurus should
absolutely do what is best for them for code maintainability.

Sean


From n.haigh at sheffield.ac.uk  Fri Jun 15 11:17:15 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 15 Jun 2007 16:17:15 +0100
Subject: [Bioperl-l] Perltidy and... SVN and ...Re:  Perltidy
In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>	<467264C8.4020202@sendu.me.uk>
	<46727048.3080904@mail.nih.gov>
	<78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
Message-ID: <4672AD7B.4050109@sheffield.ac.uk>

Chris Fields wrote:
> On Jun 15, 2007, at 5:56 AM, Sean Davis wrote:
> 
>> Sendu Bala wrote:
>>> If its going to be difficult and a hassle, for such an unnecessary  
>>> thing
>>> I'm not sure its worth it. There are more pressing things to be  
>>> done for
>>> Bioperl.
>>>
>>> If I can just run perltidy on the entire package and commit, I'd  
>>> do it.
>>> If that's not appropriate, I won't.
>> I agree with the sentiment noted above.  I'm a bit of an outsider  
>> here,
>> but bioperl is a collaborative project.  Not everyone has the same
>> sentiments about what "correct" style means.  As a programmer, I  
>> really
>> wouldn't want significant changes on the style of my code.  And perl
>> happily puts up with many styles.  I would say leave things as they
>> are--let the individual programmers choose.  It reduces the amount of
>> work of questionable importance and allows the coding style freedom  
>> that
>> perl supports.
>>
>> Just my $.02.
>>
>> Sean
> 
> I tend to run it on modules that need some reformatting  
> (SearchIO::blast comes to mind).  I believe you're correct when this  
> comes down to programming style, but I think this echoes a sentiment  
> (frustration, perhaps) that some of us have with long-term  
> maintenance of said code.
> 
> Maybe a compromise:  include a copy of .perltidyrc with the  
> distribution that goes by what a consensus wants or by the general  
> rules laid out in Perl Best Practices (spaced settings, use of spaces  
> over tabs, etc).  

RE spaces, tabs etc - how well are the different coding styles handled
when displaying in html and via the online browsable cvs?

> Conversion would be encouraged but voluntary, with
> the caveat that if someone needs to clean up code down the road (bug  
> fixes, enhancements, etc) and if the original author isn't able to  
> add it in themselves, it could be perltidy'd in order to help the  
> developer (locate and fix the issue)|(add relevant enhancement where  
> needed).
> 
> chris
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From johnsonm at gmail.com  Fri Jun 15 15:37:26 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Fri, 15 Jun 2007 14:37:26 -0500
Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap
	start and stop coordinates??
In-Reply-To: <E22A8442-E00D-4732-9D80-EE61C75732B7@uiuc.edu>
References: <CED81D34E37D5043A1211565277A51E507E23161@exchkc02.stowers-institute.org>
	<79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu>
	<ebf5eb170705161211m6fb570b5r86ee055299993172@mail.gmail.com>
	<B012903E-7C0F-4E34-9BFE-E551855B6C62@uiuc.edu>
	<ebf5eb170705211348w57c37f18oeb128656c446cff@mail.gmail.com>
	<62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu>
	<ebf5eb170705211421w244933fcu4db8ba748653c090@mail.gmail.com>
	<9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu>
	<a79f6a4b0705211729j3ff17d60v610fab7f5e135303@mail.gmail.com>
	<E22A8442-E00D-4732-9D80-EE61C75732B7@uiuc.edu>
Message-ID: <ebf5eb170706151237x1eeda0e6y728384715cb6a21a@mail.gmail.com>

Patches waiting in Bugzilla (Bug #2299).  Changes:

-Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for
prokaryotic reports (Glimmer2/Glimmer3)
-Bio::Tools::Glimmer now produces features with Fuzzy or Split
locations as appropriate (partial or circular/wraparound predictions)
-Bio::Tools::Glimmer goes through the Glimmer3 .detail file to pull out
sequence lengths
-Bio::Tools::Run::Glimmer passes along the sequence length to
Bio::Tools::Glimmer for Glimmer2

I should probably modify Bio::Tools::Genemark to use
Bio::SeqFeature::Generic features for prokaryotic reports, to be
consistent, but this is more likely to surprise people.  If nobody
screams about the change to Bio::Tools::Glimmer, I'll do it at some
point.

On 5/21/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On May 21, 2007, at 7:29 PM, Torsten Seemann wrote:
>
> >> glimmer2/3 both assume the genome is circular by default (I'm
> >> assuming since Glimmer2/3 are used for bacterial genomes).  Acc. to
> >> the Glimmer3 release notes the detail file has the information in the
> >> header; from the Glimmer3 data used for tests:
> >
> > You beat me to the reply Chris - yes, Glimmer2/3 assume circular
> > chromosome by default. I had forgotten about this in earlier
> > discussions of the new Glimmer parsers as I normally run it in
> > --linear / -L mode (even if I know it is circular) because it is
> > easier to handle, and our sequencer/assembler team usually gets the
> > origin of replication right.
> >
> >> Command:  /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../BCTDNA
> >> Glimmer3.icm Glimmer3
> >
> > I did a double-take here - that's the path to my Glimmer3
> > installation! It took me a couple of minutes to realise that you got
> > it from the bioperl test data I created. D'oh! :-)
>
> Yep, I forgot about that!
>
> >> There are options available for glimmer3 (-L, -X) that specify a
> >> linear sequence or allow ORFs to extend past the end of the sequence
> >> analyzed (the latter assumes a linear sequence).
> >
> > If the -L mode should produce Bio::Location::Split objects, I guess if
> > -X is used
> > it should produce Bio::Location::Fuzzy objects too...
> >
> > --Torsten
>
> True, didn't think about that one.  Def. something to consider adding
> in.
>
> chris
>
>
>


From cjfields at uiuc.edu  Fri Jun 15 16:55:06 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 15:55:06 -0500
Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap
	start and stop coordinates??
In-Reply-To: <ebf5eb170706151237x1eeda0e6y728384715cb6a21a@mail.gmail.com>
References: <CED81D34E37D5043A1211565277A51E507E23161@exchkc02.stowers-institute.org>
	<79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu>
	<ebf5eb170705161211m6fb570b5r86ee055299993172@mail.gmail.com>
	<B012903E-7C0F-4E34-9BFE-E551855B6C62@uiuc.edu>
	<ebf5eb170705211348w57c37f18oeb128656c446cff@mail.gmail.com>
	<62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu>
	<ebf5eb170705211421w244933fcu4db8ba748653c090@mail.gmail.com>
	<9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu>
	<a79f6a4b0705211729j3ff17d60v610fab7f5e135303@mail.gmail.com>
	<E22A8442-E00D-4732-9D80-EE61C75732B7@uiuc.edu>
	<ebf5eb170706151237x1eeda0e6y728384715cb6a21a@mail.gmail.com>
Message-ID: <D09AF2F1-1459-4B6B-A3ED-85CEDE34D7B6@uiuc.edu>

I'll try getting to that tonight.  Been pretty tied up lately...

chris

On Jun 15, 2007, at 2:37 PM, Mark Johnson wrote:

> Patches waiting in Bugzilla (Bug #2299).  Changes:
>
> -Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for
> prokaryotic reports (Glimmer2/Glimmer3)
> -Bio::Tools::Glimmer now produces features with Fuzzy or Split
> locations as appropriate (partial or circular/wraparound predictions)
> -Bio::Tools::Glimmer goes through the Glimmer3 .detail file to pull out
> sequence lengths
> -Bio::Tools::Run::Glimmer passes along the sequence length to
> Bio::Tools::Glimmer for Glimmer2
>
> I should probably modify Bio::Tools::Genemark to use
> Bio::SeqFeature::Generic features for prokaryotic reports, to be
> consistent, but this is more likely to surprise people.  If nobody
> screams about the change to Bio::Tools::Glimmer, I'll do it at some
> point.
>
> On 5/21/07, Chris Fields <cjfields at uiuc.edu> wrote:
>>
>> On May 21, 2007, at 7:29 PM, Torsten Seemann wrote:
>>
>>>> glimmer2/3 both assume the genome is circular by default (I'm
>>>> assuming since Glimmer2/3 are used for bacterial genomes).  Acc. to
>>>> the Glimmer3 release notes the detail file has the information  
>>>> in the
>>>> header; from the Glimmer3 data used for tests:
>>>
>>> You beat me to the reply Chris - yes, Glimmer2/3 assume circular
>>> chromosome by default. I had forgotten about this in earlier
>>> discussions of the new Glimmer parsers as I normally run it in
>>> --linear / -L mode (even if I know it is circular) because it is
>>> easier to handle, and our sequencer/assembler team usually gets the
>>> origin of replication right.
>>>
>>>> Command:  /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../BCTDNA
>>>> Glimmer3.icm Glimmer3
>>>
>>> I did a double-take here - that's the path to my Glimmer3
>>> installation! It took me a couple of minutes to realise that you got
>>> it from the bioperl test data I created. D'oh! :-)
>>
>> Yep, I forgot about that!
>>
>>>> There are options available for glimmer3 (-L, -X) that specify a
>>>> linear sequence or allow ORFs to extend past the end of the  
>>>> sequence
>>>> analyzed (the latter assumes a linear sequence).
>>>
>>> If the -L mode should produce Bio::Location::Split objects, I  
>>> guess if
>>> -X is used
>>> it should produce Bio::Location::Fuzzy objects too...
>>>
>>> --Torsten
>>
>> True, didn't think about that one.  Def. something to consider adding
>> in.
>>
>> chris
>>
>>
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From rvos at interchange.ubc.ca  Fri Jun 15 17:08:17 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Fri, 15 Jun 2007 14:08:17 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
Message-ID: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>

Hi,

I would very much prefer it if bioperl moved to svn. I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and moving it to an OBF repository, but I'd rather not go back to CVS.

Rutger


-----Original Message-----

> Date: Fri Jun 15 07:56:23 PDT 2007
> From: "Chris Fields" <cjfields at uiuc.edu>
> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
> To: "Sendu Bala" <bix at sendu.me.uk>
>
> 
> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
> 
> >>>> ...
> >>> Can we do any sort of massive conversion at some logical timepoint.
> >>> Probably after a branch release or something?  Because it basically
> >>> means we're going to have differences on nearly every line which is
> >>> going to make diff-ing difficult when debugging old/new versions.
> >>> Maybe it is not a problem because we aren't introducing and new  
> >>> bugs!
> >
> > Sorry, can you clarify the problem you envisage? And why would  
> > making a branch release help?
> 
> Maybe the worry is that mass conversion in such a large codebase  
> could lead to hard-to-locate bugs.  Shouldn't occur but who knows w/o  
> trying?
> 
> >> I agree; if we intend on doing this it should be all at once,  
> >> maybe  on a branch dedicated to ensure that code changes don't  
> >> tank tests  (they shouldn't but one never knows).  We would then  
> >> need a script up- and-running that tidies everything up prior to  
> >> commits (though what  happens if perltidy tanks?...).
> >> Sendu, up for it?
> >
> > If its going to be difficult and a hassle, for such an unnecessary  
> > thing I'm not sure its worth it. There are more pressing things to  
> > be done for Bioperl.
> >
> > If I can just run perltidy on the entire package and commit, I'd do  
> > it. If that's not appropriate, I won't.
> 
> The choices aren't necessarily all or nothing.  What about voluntary,  
> recommended use of a perltidy config file included with the  
> distribution, with additional 'caveats'?  See my response to Sean.
> 
> >>>> About svn
> > [snip]
> >> Stepped into that one, didn't I!  I'll look into how much effort  
> >> is  involved and try getting something going in the next month or  
> >> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as  
> >> well but it  might be worth looking into.
> >
> > I'd put this in the unnecessary-but-nice category as well. If it  
> > will be as easy as my ->new change, go ahead. If not, there are  
> > more pressing matters (POD fixing, test script updating and  
> > finishing...).
> 
> A few other open-bio projects have actively discussed a CVS->SVN  
> migration (BioRuby and I think BioPython, though the latter could be  
> wrong).  As I said, "it might be worth looking into" to weigh the  
> pros/cons, get others opinions from others who have made the  
> transition, etc.  We could, as Jason suggested, even set up a tester  
> SVN w/o making it the default codebase (lock it off to a few testers,  
> have CVS commits automatically/manually carry over to SVN, etc).
> 
> I agree with you that it's not feasible to switch over prior to a  
> release and that there are more pressing issues, but it doesn't hurt  
> having an open discussion about it.
> 
> chris
> 


From spiros at lokku.com  Fri Jun 15 17:40:32 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Fri, 15 Jun 2007 22:40:32 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>

On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
> Hi,
>
> I would very much prefer it if bioperl moved to svn. I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and move it to an OBF repository, but I'd rather not go back to CVS.
>
> Rutger
>

I second that, SVN seems like the reasonable choice. I would be more
than happy to help out as well.

Spiros

>
> -----Original Message-----
>
> > Date: Fri Jun 15 07:56:23 PDT 2007
> > From: "Chris Fields" <cjfields at uiuc.edu>
> > Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
> > To: "Sendu Bala" <bix at sendu.me.uk>
> >
> >
> > On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
> >
> > >>>> ...
> > >>> Can we do any sort of massive conversion at some logical timepoint.
> > >>> Probably after a branch release or something?  Because it basically
> > >>> means we're going to have differences on nearly every line which is
> > >>> going to make diff-ing difficult when debugging old/new versions.
> > >>> Maybe it is not a problem because we aren't introducing and new
> > >>> bugs!
> > >
> > > Sorry, can you clarify the problem you envisage? And why would
> > > making a branch release help?
> >
> > Maybe the worry is that mass conversion in such a large codebase
> > could lead to hard-to-locate bugs.  Shouldn't occur but who knows w/o
> > trying?
> >
> > >> I agree; if we intend on doing this it should be all at once,
> > >> maybe  on a branch dedicated to ensure that code changes don't
> > >> tank tests  (they shouldn't but one never knows).  We would then
> > >> need a script up- and-running that tidies everything up prior to
> > >> commits (though what  happens if perltidy tanks?...).
> > >> Sendu, up for it?
> > >
> > > If its going to be difficult and a hassle, for such an unnecessary
> > > thing I'm not sure its worth it. There are more pressing things to
> > > be done for Bioperl.
> > >
> > > If I can just run perltidy on the entire package and commit, I'd do
> > > it. If that's not appropriate, I won't.
> >
> > The choices aren't necessarily all or nothing.  What about voluntary,
> > recommended use of a perltidy config file included with the
> > distribution, with additional 'caveats'?  See my response to Sean.
> >
> > >>>> About svn
> > > [snip]
> > >> Stepped into that one, didn't I!  I'll look into how much effort
> > >> is  involved and try getting something going in the next month or
> > >> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
> > >> well but it  might be worth looking into.
> > >
> > > I'd put this in the unnecessary-but-nice category as well. If it
> > > will be as easy as my ->new change, go ahead. If not, there are
> > > more pressing matters (POD fixing, test script updating and
> > > finishing...).
> >
> > A few other open-bio projects have actively discussed a CVS->SVN
> > migration (BioRuby and I think BioPython, though the latter could be
> > wrong).  As I said, "it might be worth looking into" to weigh the
> > pros/cons, get others opinions from others who have made the
> > transition, etc.  We could, as Jason suggested, even set up a tester
> > SVN w/o making it the default codebase (lock it off to a few testers,
> > have CVS commits automatically/manually carry over to SVN, etc).
> >
> > I agree with you that it's not feasible to switch over prior to a
> > release and that there are more pressing issues, but it doesn't hurt
> > having an open discussion about it.
> >
> > chris
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l


From hlapp at gmx.net  Fri Jun 15 18:10:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 18:10:25 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
Message-ID: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>

So should we set up a sandbox svn repository and those who would like  
to help out

- take shots at migrating bioperl (any current cvs snapshot will do)  
to svn

- you document what you find yourself having to do in trying to make  
it work

- you report back when you think you have a working repository

- we all get a defined amount of time to test to our hearts' content,  
say 2 weeks

- you fix issues that were encountered

- report back when done, followed by retesting for, say 1 week

- iterate previous 2 steps until no issues and no objections to  
migration

- two more weeks of warning period to all developers to commit all  
outstanding changes, or reapply them to a future svn checkout

- pull the trigger by locking down cvs, applying the migration as  
worked out before, and announcing that BioPerl is now on svn

- get free beer at next BOSC (I'll pay if no one else does)

This may not be precisely the plan that needs to be executed, but  
it's probably somewhere along those lines.

If there are volunteers who would like to spearhead this, then power  
to you - I think everyone is in favor and the advantages of svn don't  
need to be debated. The only reason it hasn't happened yet is because  
no one has stepped forward who would have the energy.

I'm sure ChrisD will gladly create the svn sandbox if we have  
volunteers lined up to get going.

	-hilmar

On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:

> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>> Hi,
>>
>> I would very much prefer it if bioperl moved to svn. I'm  
>> considering merging Bio::Phylo (to the extent that that's possible/ 
>> practical) with bioperl and move it to an OBF repository, but I'd  
>> rather not go back to CVS.
>>
>> Rutger
>>
>
> I second that, SVN seems like the reasonable choice. I would be more
> than happy to help out as well.
>
> Spiros
>
>>
>> -----Original Message-----
>>
>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>
>>>
>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>
>>>>>>> ...
>>>>>> Can we do any sort of massive conversion at some logical  
>>>>>> timepoint.
>>>>>> Probably after a branch release or something?  Because it  
>>>>>> basically
>>>>>> means we're going to have differences on nearly every line  
>>>>>> which is
>>>>>> going to make diff-ing difficult when debugging old/new versions.
>>>>>> Maybe it is not a problem because we aren't introducing and new
>>>>>> bugs!
>>>>
>>>> Sorry, can you clarify the problem you envisage? And why would
>>>> making a branch release help?
>>>
>>> Maybe the worry is that mass conversion in such a large codebase
>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows  
>>> w/o
>>> trying?
>>>
>>>>> I agree; if we intend on doing this it should be all at once,
>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>> need a script up- and-running that tidies everything up prior to
>>>>> commits (though what  happens if perltidy tanks?...).
>>>>> Sendu, up for it?
>>>>
>>>> If its going to be difficult and a hassle, for such an unnecessary
>>>> thing I'm not sure its worth it. There are more pressing things to
>>>> be done for Bioperl.
>>>>
>>>> If I can just run perltidy on the entire package and commit, I'd do
>>>> it. If that's not appropriate, I won't.
>>>
>>> The choices aren't necessarily all or nothing.  What about  
>>> voluntary,
>>> recommended use of a perltidy config file included with the
>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>
>>>>>>> About svn
>>>> [snip]
>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>> is  involved and try getting something going in the next month or
>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>> well but it  might be worth looking into.
>>>>
>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>> more pressing matters (POD fixing, test script updating and
>>>> finishing...).
>>>
>>> A few other open-bio projects have actively discussed a CVS->SVN
>>> migration (BioRuby and I think BioPython, though the latter could be
>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>> pros/cons, get others opinions from others who have made the
>>> transition, etc.  We could, as Jason suggested, even set up a tester
>>> SVN w/o making it the default codebase (lock it off to a few  
>>> testers,
>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>
>>> I agree with you that it's not feasible to switch over prior to a
>>> release and that there are more pressing issues, but it doesn't hurt
>>> having an open discussion about it.
>>>
>>> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jason at bioperl.org  Fri Jun 15 18:23:15 2007
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 15 Jun 2007 15:23:15 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <AB7E0918-0EBA-47C9-8A64-FB8709230F2A@bioperl.org>

Sounds like a plan. I'll be curious to see whether we can keep
anonymous CVS working, as I'd like to not have to pull the plug on
that.  There are some threads out on the web about how to do this
with a commit rule on SVN.

Also, can someone who is close enough to all the SVN benefits please  
elaborate how it is going to help _this_ project?
Perhaps you would be willing to put a few words up -- like on (a to  
be created):
http://bioperl.org/wiki/BioPerl:Version_control_changeover

This way, if anonymous CVS is broken and/or developers who haven't
been paying attention come back to commit code and ask why things
changed, we don't have to compose long emails... =)

-jason
On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote:

> So should we set up a sandbox svn repository and those who would like
> to help out
>
> - take shots at migrating bioperl (any current cvs snapshot will do)
> to svn
>
> - you document what you find yourself having to do in trying to make
> it work
>
> - you report back when you think you have a working repository
>
> - we all get a defined amount of time to test to our hearts' content,
> say 2 weeks
>
> - you fix issues that were encountered
>
> - report back when done, followed by retesting for, say 1 week
>
> - iterate previous 2 steps until no issues and no objections to
> migration
>
> - two more weeks of warning period to all developers to commit all
> outstanding changes, or reapply them to a future svn checkout
>
> - pull the trigger by locking down cvs, applying the migration as
> worked out before, and announcing that BioPerl is now on svn
>
> - get free beer at next BOSC (I'll pay if no one else does)
>
> This may not be precisely the plan that needs to be executed, but
> it's probably somewhere along those lines.
>
> If there are volunteers who would like to spearhead this, then power
> to you - I think everyone is in favor and the advantages of svn don't
> need to be debated. The only reason it hasn't happened yet is because
> no one has stepped forward who would have the energy.

>
> I'm sure ChrisD will gladly create the svn sandbox if we have
> volunteers lined up to get going.
>
> 	-hilmar
>
> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
>
>> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>>> Hi,
>>>
>>> I would very much prefer it if bioperl moved to svn. I'm
>>> considering merging Bio::Phylo (to the extent that that's possible/
>>> practical) with bioperl and move it to an OBF repository, but I'd
>>> rather not go back to CVS.
>>>
>>> Rutger
>>>
>>
>> I second that, SVN seems like the reasonable choice. I would be more
>> than happy to help out as well.
>>
>> Spiros
>>
>>>
>>> -----Original Message-----
>>>
>>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>>
>>>>
>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>>
>>>>>>>> ...
>>>>>>> Can we do any sort of massive conversion at some logical
>>>>>>> timepoint.
>>>>>>> Probably after a branch release or something?  Because it
>>>>>>> basically
>>>>>>> means we're going to have differences on nearly every line
>>>>>>> which is
>>>>>>> going to make diff-ing difficult when debugging old/new  
>>>>>>> versions.
>>>>>>> Maybe it is not a problem because we aren't introducing and new
>>>>>>> bugs!
>>>>>
>>>>> Sorry, can you clarify the problem you envisage? And why would
>>>>> making a branch release help?
>>>>
>>>> Maybe the worry is that mass conversion in such a large codebase
>>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows
>>>> w/o
>>>> trying?
>>>>
>>>>>> I agree; if we intend on doing this it should be all at once,
>>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>>> need a script up- and-running that tidies everything up prior to
>>>>>> commits (though what  happens if perltidy tanks?...).
>>>>>> Sendu, up for it?
>>>>>
>>>>> If its going to be difficult and a hassle, for such an unnecessary
>>>>> thing I'm not sure its worth it. There are more pressing things to
>>>>> be done for Bioperl.
>>>>>
>>>>> If I can just run perltidy on the entire package and commit,  
>>>>> I'd do
>>>>> it. If that's not appropriate, I won't.
>>>>
>>>> The choices aren't necessarily all or nothing.  What about
>>>> voluntary,
>>>> recommended use of a perltidy config file included with the
>>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>>
>>>>>>>> About svn
>>>>> [snip]
>>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>>> is  involved and try getting something going in the next month or
>>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>>> well but it  might be worth looking into.
>>>>>
>>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>>> more pressing matters (POD fixing, test script updating and
>>>>> finishing...).
>>>>
>>>> A few other open-bio projects have actively discussed a CVS->SVN
>>>> migration (BioRuby and I think BioPython, though the latter  
>>>> could be
>>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>>> pros/cons, get others opinions from others who have made the
>>>> transition, etc.  We could, as Jason suggested, even set up a  
>>>> tester
>>>> SVN w/o making it the default codebase (lock it off to a few
>>>> testers,
>>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>>
>>>> I agree with you that it's not feasible to switch over prior to a
>>>> release and that there are more pressing issues, but it doesn't  
>>>> hurt
>>>> having an open discussion about it.
>>>>
>>>> chris
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From sheris at eps.berkeley.edu  Fri Jun 15 18:58:12 2007
From: sheris at eps.berkeley.edu (Sheri Simmons)
Date: Fri, 15 Jun 2007 15:58:12 -0700
Subject: [Bioperl-l] seq doesn't validate error
Message-ID: <200706151558.12911.sheris@eps.berkeley.edu>

Hi,
I'm getting an error as follows when I try to reverse complement a sequence 
string stored in a hash of arrays. The storage code is: 

		$nstarthash{$key} = [$sortchecks[0], join("", @nseq),
		join("",@{$seqhash{$key}})];

the sequence of interest is the element at index 1. 

Later, I try to retrieve this string for a subset of keys so I can reverse 
complement it based on input from another hash (%complement):

			my %revcomphash = map { my $read = $_;
			grep $complement{$read} eq 'C', %complement;
			{$_, (Bio::Seq->new(-seq =>$nstarthash{$_}[1]))->revcom->seq()};}
			 keys(%nstarthash); 


I get the following warning (long sequence edited for clarity):

-- -------------------- WARNING ---------------------
MSG: seq doesn't validate, mismatch is 1
---------------------------------------------------

------------- EXCEPTION  -------------
MSG: Attempting to set the sequence to [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] 
which does not look healthy
STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498
STACK toplevel ../quality_wrapper.pl:103

I cannot find any non-allowed characters in the sequence, and the 
de-referencing appears to work correctly. Can anyone help me?
I'm using the latest Bioperl installation (1.5.2) with ActivePerl5.8 on a 
Mepis 6.5 system. 

Thanks
Sheri

---------------------------------------------------------------------
Sheri Simmons
Department of Earth and Planetary Sciences
University of California, Berkeley
Berkeley, CA 94720-4767


From Kevin.M.Brown at asu.edu  Fri Jun 15 19:11:34 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 15 Jun 2007 16:11:34 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <200706151558.12911.sheris@eps.berkeley.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
Message-ID: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>

> I'm getting an error as follows when I try to reverse 
> complement a sequence string stored in a hash of arrays. The 
> storage code is: 
> 
> 		$nstarthash{$key} = [$sortchecks[0], join("", 
> @nseq), 		
> join("",@{$seqhash{$key}})];
> 
> the sequence of interest is the element at index 1. 
> 
> Later, I try to retrieve this string for a subset of keys so 
> I can reverse complement it based on input from another hash 
> (%complement):
> 
> 			my %revcomphash = map { my $read = $_;
> 			grep $complement{$read} eq 'C', %complement;
> 			{$_, (Bio::Seq->new(-seq 
> =>$nstarthash{$_}[1]))->revcom->seq()};}
> 			 keys(%nstarthash); 
> 
> 
> I get the following warning (long sequence edited for clarity):
> 
> -- -------------------- WARNING ---------------------
> MSG: seq doesn't validate, mismatch is 1
> ---------------------------------------------------
> 
> ------------- EXCEPTION  -------------
> MSG: Attempting to set the sequence to 
> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
> which does not look healthy
> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK 
> toplevel ../quality_wrapper.pl:103
> 
> I cannot find any non-allowed characters in the sequence, and 
> the de-referencing appears to work correctly. Can anyone help me?
> I'm using the latest Bioperl installation (1.5.2) with 
> ActivePerl5.8 on a Mepis 6.5 system. 

Try telling the Bio::Seq object what alphabet to use when creating it.
I tend to create them like:

Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')
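
A side note on the quoted map block, offered as a sketch rather than a
diagnosis: the grep there runs as a statement of its own and its result
is discarded, so every key of %nstarthash gets processed, not just the
reads marked 'C'. One way to filter first and then reverse complement is
shown below. The sample data and the revcom_string() helper are made up
for illustration; the helper is a plain-Perl stand-in so the snippet runs
without BioPerl, and it only handles unambiguous A/C/G/T.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Seq's revcom, so this runs without BioPerl.
# Handles plain A/C/G/T (either case) only; no ambiguity codes.
sub revcom_string {
    my ($seq) = @_;
    my $rc = scalar reverse $seq;    # reverse the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;    # complement each base
    return $rc;
}

# Made-up sample data in the same shape as the original post:
# index 1 of each array holds the sequence of interest.
my %nstarthash = (
    read1 => [ 10, 'AACG', 'xxxx' ],
    read2 => [ 20, 'GGAT', 'yyyy' ],
);
my %complement = ( read1 => 'C', read2 => 'U' );

# Filter the keys first (grep), then build key => revcom pairs (map).
my %revcomphash =
    map  { $_ => revcom_string( $nstarthash{$_}[1] ) }
    grep { defined $complement{$_} && $complement{$_} eq 'C' }
    keys %nstarthash;

print "$_ => $revcomphash{$_}\n" for sort keys %revcomphash;
```

With BioPerl loaded, the revcom_string() call could be swapped for the
constructor call above followed by ->revcom->seq.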


From sheris at eps.berkeley.edu  Fri Jun 15 19:53:04 2007
From: sheris at eps.berkeley.edu (Sheri Simmons)
Date: Fri, 15 Jun 2007 16:53:04 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
	<1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
Message-ID: <200706151653.04135.sheris@eps.berkeley.edu>

Thanks for the suggestion, but that still gives the same error as before.
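
For anyone trying to reproduce this, below is a rough way to look for
stray characters in the stored string before it reaches Bio::Seq. The
"mismatch is 1" warning might mean that a digit or some other
non-sequence character ended up in the joined string, but that is a
guess. The allowed set used here (letters plus - . * ? = ~) is an
assumption and has not been checked against PrimarySeq.pm, and
find_suspect_chars() is a hypothetical helper, not part of BioPerl.

```perl
use strict;
use warnings;

# Show non-graphic characters (newlines, spaces, control characters)
# as \xNN escapes so they are visible when printed.
sub render_visible {
    my ($run) = @_;
    return join '',
        map { /\A[[:graph:]]\z/ ? $_ : sprintf( '\\x%02x', ord $_ ) }
        split //, $run;
}

# Hypothetical helper: return each run of characters that falls outside
# an ASSUMED set of allowed sequence symbols (letters plus - . * ? = ~).
sub find_suspect_chars {
    my ($seq) = @_;
    return () unless defined $seq;
    my @runs = $seq =~ /([^A-Za-z\-.*?=~]+)/g;
    return map { render_visible($_) } @runs;
}

# Made-up example: a stray digit and a trailing newline.
my @suspects = find_suspect_chars("ACGT1AC\n");
print "suspect runs: ", join( ', ', @suspects ), "\n" if @suspects;
```

Running the helper on $nstarthash{$key}[1] for the failing key, just
before the Bio::Seq->new call, should show whether anything unexpected
is in the string.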

On Friday 15 June 2007 4:11 pm, Kevin Brown wrote:
> > I'm getting an error as follows when I try to reverse
> > complement a sequence string stored in a hash of arrays. The
> > storage code is:
> >
> > 		$nstarthash{$key} = [$sortchecks[0], join("",
> > @nseq),
> > join("",@{$seqhash{$key}})];
> >
> > the sequence of interest is the element at index 1.
> >
> > Later, I try to retrieve this string for a subset of keys so
> > I can reverse complement it based on input from another hash
> > (%complement):
> >
> > 			my %revcomphash = map { my $read = $_;
> > 			grep $complement{$read} eq 'C', %complement;
> > 			{$_, (Bio::Seq->new(-seq
> > =>$nstarthash{$_}[1]))->revcom->seq()};}
> > 			 keys(%nstarthash);
> >
> >
> > I get the following warning (long sequence edited for clarity):
> >
> > -- -------------------- WARNING ---------------------
> > MSG: seq doesn't validate, mismatch is 1
> > ---------------------------------------------------
> >
> > ------------- EXCEPTION  -------------
> > MSG: Attempting to set the sequence to
> > [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
> > which does not look healthy
> > STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
> > STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
> > STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK
> > toplevel ../quality_wrapper.pl:103
> >
> > I cannot find any non-allowed characters in the sequence, and
> > the de-referencing appears to work correctly. Can anyone help me?
> > I'm using the latest Bioperl installation (1.5.2) with
> > ActivePerl5.8 on a Mepis 6.5 system.
>
> Try telling the Bio::Seq object what alphabet to use when creating it.
> I tend to create them like:
>
> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')

-- 
Sheri Simmons
Department of Earth and Planetary Sciences
University of California, Berkeley
Berkeley, CA 94720-4767


From hlapp at gmx.net  Fri Jun 15 21:27:42 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 21:27:42 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
Message-ID: <EDC569BF-2E4B-4BFC-916A-665CC2FFABAF@gmx.net>

Could you post a ticket to the helpdesk: support at open-bio.org?

	-hilmar

On Jun 15, 2007, at 9:08 PM, George Hartzell wrote:

> Hilmar Lapp writes:
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>
> Free Beer, huh?  Do you deliver?
>
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
>
> thanks!
>
> g.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Fri Jun 15 21:08:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Fri, 15 Jun 2007 21:08:32 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <18035.14352.963113.473274@almost.alerce.com>

Hilmar Lapp writes:
 > So should we set up a sandbox svn repository and those who would like  
 > to help out
 > 
 > - take shots at migrating bioperl (any current cvs snapshot will do)  
 > to svn

Free Beer, huh?  Do you deliver?

Can you package up a tarball of the cvs repository (bzip or gzip would
save some time) itself?

thanks!

g.


From cjfields at uiuc.edu  Fri Jun 15 21:42:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:42:05 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
Message-ID: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>

The browsable CVS has a 'Download tarball' link if that helps.

http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?cvsroot=bioperl

chris

On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:

> Hilmar Lapp writes:
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>
> Free Beer, huh?  Do you deliver?
>
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
>
> thanks!
>
> g.


From cjfields at uiuc.edu  Fri Jun 15 21:50:09 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:50:09 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>

I'll help out to the extent I can w/o having the SVN know-how.  We  
need (as Jason points out) someone who can detail the benefits and  
maybe keep an updated journal on the wiki.

I believe at least one or two of the other Bio* projects have
contemplated moving over to SVN, which may be worth checking out.

chris

On Jun 15, 2007, at 5:10 PM, Hilmar Lapp wrote:

> So should we set up a sandbox svn repository and those who would like
> to help out
>
> - take shots at migrating bioperl (any current cvs snapshot will do)
> to svn
>
> - you document what you find yourself having to do in trying to make
> it work
>
> - you report back when you think you have a working repository
>
> - we all get a defined amount of time to test to our hearts' content,
> say 2 weeks
>
> - you fix issues that were encountered
>
> - report back when done, followed by retesting for, say 1 week
>
> - iterate previous 2 steps until no issues and no objections to
> migration
>
> - two more weeks of warning period to all developers to commit all
> outstanding changes, or reapply them to a future svn checkout
>
> - pull the trigger by locking down cvs, applying the migration as
> worked out before, and announcing that BioPerl is now on svn
>
> - get free beer at next BOSC (I'll pay if no one else does)
>
> This may not be precisely the plan that needs to be executed, but
> it's probably somewhere along those lines.
>
> If there are volunteers who would like to spearhead this, then power
> to you - I think everyone is in favor and the advantages of svn don't
> need to be debated. The only reason it hasn't happened yet is because
> no one has stepped forward who would have the energy.
>
> I'm sure ChrisD will gladly create the svn sandbox if we have
> volunteers lined up to get going.
>
> 	-hilmar
>
> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
>
>> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>>> Hi,
>>>
>>> I would very much prefer it if bioperl moved to svn. I'm
>>> considering merging Bio::Phylo (to the extent that that's possible/
>>> practical) with bioperl and move it to an OBF repository, but I'd
>>> rather not go back to CVS.
>>>
>>> Rutger
>>>
>>
>> I second that, SVN seems like the reasonable choice. I would be more
>> than happy to help out as well.
>>
>> Spiros
>>
>>>
>>> -----Original Message-----
>>>
>>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>>
>>>>
>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>>
>>>>>>>> ...
>>>>>>> Can we do any sort of massive conversion at some logical
>>>>>>> timepoint.
>>>>>>> Probably after a branch release or something?  Because it
>>>>>>> basically
>>>>>>> means we're going to have differences on nearly every line
>>>>>>> which is
>>>>>>> going to make diff-ing difficult when debugging old/new  
>>>>>>> versions.
>>>>>>> Maybe it is not a problem because we aren't introducing and new
>>>>>>> bugs!
>>>>>
>>>>> Sorry, can you clarify the problem you envisage? And why would
>>>>> making a branch release help?
>>>>
>>>> Maybe the worry is that mass conversion in such a large codebase
>>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows
>>>> w/o
>>>> trying?
>>>>
>>>>>> I agree; if we intend on doing this it should be all at once,
>>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>>> need a script up- and-running that tidies everything up prior to
>>>>>> commits (though what  happens if perltidy tanks?...).
>>>>>> Sendu, up for it?
>>>>>
>>>>> If its going to be difficult and a hassle, for such an unnecessary
>>>>> thing I'm not sure its worth it. There are more pressing things to
>>>>> be done for Bioperl.
>>>>>
>>>>> If I can just run perltidy on the entire package and commit,  
>>>>> I'd do
>>>>> it. If that's not appropriate, I won't.
>>>>
>>>> The choices aren't necessarily all or nothing.  What about
>>>> voluntary,
>>>> recommended use of a perltidy config file included with the
>>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>>
>>>>>>>> About svn
>>>>> [snip]
>>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>>> is  involved and try getting something going in the next month or
>>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>>> well but it  might be worth looking into.
>>>>>
>>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>>> more pressing matters (POD fixing, test script updating and
>>>>> finishing...).
>>>>
>>>> A few other open-bio projects have actively discussed a CVS->SVN
>>>> migration (BioRuby and I think BioPython, though the latter  
>>>> could be
>>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>>> pros/cons, get others opinions from others who have made the
>>>> transition, etc.  We could, as Jason suggested, even set up a  
>>>> tester
>>>> SVN w/o making it the default codebase (lock it off to a few
>>>> testers,
>>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>>
>>>> I agree with you that it's not feasible to switch over prior to a
>>>> release and that there are more pressing issues, but it doesn't  
>>>> hurt
>>>> having an open discussion about it.
>>>>
>>>> chris
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Fri Jun 15 22:12:55 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 22:12:55 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
Message-ID: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>

I think he meant the cvs repository itself, containing all the change  
data. -hilmar

On Jun 15, 2007, at 9:42 PM, Chris Fields wrote:

> The browsable CVS has a 'Download tarball' link if that helps.
>
> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?cvsroot=bioperl
>
> chris
>
> On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:
>
>> Hilmar Lapp writes:
>>> So should we set up a sandbox svn repository and those who would  
>>> like
>>> to help out
>>>
>>> - take shots at migrating bioperl (any current cvs snapshot will do)
>>> to svn
>>
>> Free Beer, huh?  Do you deliver?
>>
>> Can you package up a tarball of the cvs repository (bzip or gzip  
>> would
>> save some time) itself?
>>
>> thanks!
>>
>> g.
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Jun 15 22:37:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 21:37:55 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
	<6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
Message-ID: <F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>

Ah, got it.  Sorry.

George, planning on taking this up?

chris

On Jun 15, 2007, at 9:12 PM, Hilmar Lapp wrote:

> I think he meant the cvs repository itself, containing all the  
> change data. -hilmar
>
> On Jun 15, 2007, at 9:42 PM, Chris Fields wrote:
>
>> The browsable CVS has a 'Download tarball' link if that helps.
>>
>> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?cvsroot=bioperl
>>
>> chris
>>
>> On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:
>>
>>> Hilmar Lapp writes:
>>>> So should we set up a sandbox svn repository and those who would  
>>>> like
>>>> to help out
>>>>
>>>> - take shots at migrating bioperl (any current cvs snapshot will  
>>>> do)
>>>> to svn
>>>
>>> Free Beer, huh?  Do you deliver?
>>>
>>> Can you package up a tarball of the cvs repository (bzip or gzip  
>>> would
>>> save some time) itself?
>>>
>>> thanks!
>>>
>>> g.
>>
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Sat Jun 16 04:20:57 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 16 Jun 2007 09:20:57 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
Message-ID: <46739D69.4090204@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George Hartzell wrote:
> Hilmar Lapp writes:
>  > So should we set up a sandbox svn repository and those who would like  
>  > to help out
>  > 
>  > - take shots at migrating bioperl (any current cvs snapshot will do)  
>  > to svn
> 
> Free Beer, huh?  Do you deliver?
> 
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
> 
> thanks!
> 
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Sounds like George might know what he's doing! I have a question about
setting up svn access. I believe access can be done in several ways,
over webdav, over ssh and probably others too. Do you have any knowledge
about the benefits of one over the other? I suppose I'm thinking of what
to implement to allow anonymous read access for users and authenticated
access for developers.

Nath

p.s. if you need any monkeys to do some work I'm happy to help out as
much as possible.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGc51pczuW2jkwy2gRAmi9AJ0XojVdh4ckXoc3bwVSmeNw95cR7QCfV+G9
Lb9NUEe4dkCakQ+Gc7Py98A=
=BG9m
-----END PGP SIGNATURE-----


From rvos at interchange.ubc.ca  Sat Jun 16 06:37:11 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Sat, 16 Jun 2007 03:37:11 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <15232024.1181990231860.JavaMail.myubc2@handel.my.ubc.ca>

I can volunteer some time to help out with this.

Rutger

-----Original Message-----

> Date: Fri Jun 15 15:10:25 PDT 2007
> From: "Hilmar Lapp" <hlapp at gmx.net>
> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
> To: spiros at lokku.com
>
> So should we set up a sandbox svn repository and those who would like  
> to help out
> 
> - take shots at migrating bioperl (any current cvs snapshot will do)  
> to svn
> 
> - you document what you find yourself having to do in trying to make  
> it work
> 
> - you report back when you think you have a working repository
> 
> - we all get a defined amount of time to test to our hearts' content,  
> say 2 weeks
> 
> - you fix issues that were encountered
> 
> - report back when done, followed by retesting for, say 1 week
> 
> - iterate previous 2 steps until no issues and no objections to  
> migration
> 
> - two more weeks of warning period to all developers to commit all  
> outstanding changes, or reapply them to a future svn checkout
> 
> - pull the trigger by locking down cvs, applying the migration as  
> worked out before, and announcing that BioPerl is now on svn
> 
> - get free beer at next BOSC (I'll pay if no one else does)
> 
> This may not be precisely the plan that needs to be executed, but  
> it's probably somewhere along those lines.
> 
> If there are volunteers who would like to spearhead this, then power  
> to you - I think everyone is in favor and the advantages of svn don't  
> need to be debated. The only reason it hasn't happened yet is because  
> no one has stepped forward who would have the energy.
> 
> I'm sure ChrisD will gladly create the svn sandbox if we have  
> volunteers lined up to get going.
> 
> 	-hilmar
> 
> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
> 
> > On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
> >> Hi,
> >>
> >> I would very much prefer it if bioperl moved to svn. I'm  
> >> considering merging Bio::Phylo (to the extent that that's possible/ 
> >> practical) with bioperl and move it to an OBF repository, but I'd  
> >> rather not go back to CVS.
> >>
> >> Rutger
> >>
> >
> > I second that, SVN seems like the reasonable choice. I would be more
> > than happy to help out as well.
> >
> > Spiros
> >
> >>
> >> -----Original Message-----
> >>
> >>> Date: Fri Jun 15 07:56:23 PDT 2007
> >>> From: "Chris Fields" <cjfields at uiuc.edu>
> >>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
> >>> To: "Sendu Bala" <bix at sendu.me.uk>
> >>>
> >>>
> >>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
> >>>
> >>>>>>> ...
> >>>>>> Can we do any sort of massive conversion at some logical  
> >>>>>> timepoint.
> >>>>>> Probably after a branch release or something?  Because it  
> >>>>>> basically
> >>>>>> means we're going to have differences on nearly every line  
> >>>>>> which is
> >>>>>> going to make diff-ing difficult when debugging old/new versions.
> >>>>>> Maybe it is not a problem because we aren't introducing and new
> >>>>>> bugs!
> >>>>
> >>>> Sorry, can you clarify the problem you envisage? And why would
> >>>> making a branch release help?
> >>>
> >>> Maybe the worry is that mass conversion in such a large codebase
> >>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows  
> >>> w/o
> >>> trying?
> >>>
> >>>>> I agree; if we intend on doing this it should be all at once,
> >>>>> maybe  on a branch dedicated to ensure that code changes don't
> >>>>> tank tests  (they shouldn't but one never knows).  We would then
> >>>>> need a script up- and-running that tidies everything up prior to
> >>>>> commits (though what  happens if perltidy tanks?...).
> >>>>> Sendu, up for it?
> >>>>
> >>>> If its going to be difficult and a hassle, for such an unnecessary
> >>>> thing I'm not sure its worth it. There are more pressing things to
> >>>> be done for Bioperl.
> >>>>
> >>>> If I can just run perltidy on the entire package and commit, I'd do
> >>>> it. If that's not appropriate, I won't.
> >>>
> >>> The choices aren't necessarily all or nothing.  What about  
> >>> voluntary,
> >>> recommended use of a perltidy config file included with the
> >>> distribution, with additional 'caveats'?  See my response to Sean.
> >>>
> >>>>>>> About svn
> >>>> [snip]
> >>>>> Stepped into that one, didn't I!  I'll look into how much effort
> >>>>> is  involved and try getting something going in the next month or
> >>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
> >>>>> well but it  might be worth looking into.
> >>>>
> >>>> I'd put this in the unnecessary-but-nice category as well. If it
> >>>> will be as easy as my ->new change, go ahead. If not, there are
> >>>> more pressing matters (POD fixing, test script updating and
> >>>> finishing...).
> >>>
> >>> A few other open-bio projects have actively discussed a CVS->SVN
> >>> migration (BioRuby and I think BioPython, though the latter could be
> >>> wrong).  As I said, "it might be worth looking into" to weigh the
> >>> pros/cons, get others opinions from others who have made the
> >>> transition, etc.  We could, as Jason suggested, even set up a tester
> >>> SVN w/o making it the default codebase (lock it off to a few  
> >>> testers,
> >>> have CVS commits automatically/manually carry over to SVN, etc).
> >>>
> >>> I agree with you that it's not feasible to switch over prior to a
> >>> release and that there are more pressing issues, but it doesn't hurt
> >>> having an open discussion about it.
> >>>
> >>> chris
> >>>
> 
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================


From sdavis2 at mail.nih.gov  Sat Jun 16 07:21:47 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Sat, 16 Jun 2007 07:21:47 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
Message-ID: <4673C7CB.1030709@mail.nih.gov>

Chris Fields wrote:
> I'll help out to the extent I can w/o having the SVN know-how.  We  
> need (as Jason points out) someone who can detail the benefits and  
> maybe keep an updated journal on the wiki.
>
> I believe at least one or two of the other Bio* contemplated moving  
> over to SVN, which may be worth checking out.
>   
The bioconductor project is on SVN.  The project includes over 200 
packages (the equivalent of perl modules) with something around 150-200 
ACTIVE developers.  They also have a build system for several OSes that 
operates on a cron-like system with builds of several versions 
approximately daily.  Their system is running at something like revision 
30,000, so they have significant experience.  If anyone would like 
technical support, I can certainly ask the folks maintaining their site 
if they can give some input.  Let me know if anyone would like a contact 
person.

As for access, the typical access is over http (or https).  Access 
controls can be set up on the server side while allowing anonymous 
access for checkout.  There are many excellent SVN clients for every OS, so
that should not be a problem.

Sean


From cjfields at uiuc.edu  Sat Jun 16 10:02:35 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 16 Jun 2007 09:02:35 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4673C7CB.1030709@mail.nih.gov>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
Message-ID: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>


On Jun 16, 2007, at 6:21 AM, Sean Davis wrote:

> Chris Fields wrote:
>> I'll help out to the extent I can w/o having the SVN know-how.  We
>> need (as Jason points out) someone who can detail the benefits and
>> maybe keep an updated journal on the wiki.
>>
>> I believe at least one or two of the other Bio* contemplated moving
>> over to SVN, which may be worth checking out.
>>
> The bioconductor project is on SVN.  The project includes over 200
> packages (the equivalent of perl modules) with something around  
> 150-200
> ACTIVE developers.  They also have a build system for several OSes  
> that
> operates on a cron-like system with builds of several versions
> approximately daily.  Their system is running at something like  
> revision
> 30,000, so they have significant experience.  If anyone would like
> technical support, I can certainly ask the folks maintaining their  
> site
> if they can give some input.  Let me know if anyone would like a  
> contact
> person.
>
> As for access, the typical access is over http (or https).  Access
> controls can be set up on the server side while allowing anonymous
> access for checkout.  There are many excellent SVN for every OS, so  
> that
> should not be a problem.
>
> Sean

It looks like George Hartzell may be taking a crack at it, with  
Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
could have something testable relatively soon.  After that we'll need  
to work out a few other issues, basically what's on Hilmar's list.

chris


From hlapp at gmx.net  Sat Jun 16 10:40:08 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 16 Jun 2007 10:40:08 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <AB7E0918-0EBA-47C9-8A64-FB8709230F2A@bioperl.org>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<AB7E0918-0EBA-47C9-8A64-FB8709230F2A@bioperl.org>
Message-ID: <51E89347-4AF7-482E-98DB-BE1AA0138A91@gmx.net>

Just as an aside, even if we can't keep anonymous cvs working, I
would think that with apache URL rewriting and a small CGI script
that returns an appropriate page redirect we can, without too much
trouble, keep the hyperlinks that people may have bookmarked functional.
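
A minimal sketch of what such a redirect script might look like, in
Python for brevity. The old ViewCVS prefix is taken from the link
quoted elsewhere in this thread; NEW_BASE and the one-to-one path
mapping are assumptions for illustration only, not a real BioPerl
address or layout.

```python
from urllib.parse import urlsplit

# The old prefix comes from the ViewCVS link quoted in this thread.
# NEW_BASE is a hypothetical placeholder, not a real BioPerl URL.
OLD_PREFIX = "/cgi-bin/viewcvs/viewcvs.cgi/"
NEW_BASE = "http://svn.example.org/viewvc/bioperl/"


def redirect_target(old_url):
    """Map a bookmarked ViewCVS URL to its assumed new location, or None."""
    path = urlsplit(old_url).path
    if not path.startswith(OLD_PREFIX):
        return None
    # Keep the module/file part; CVS-only query strings (cvsroot=...) are dropped.
    return NEW_BASE + path[len(OLD_PREFIX):]


def cgi_response(old_url):
    """Build the raw CGI output: a permanent redirect, or a 404."""
    target = redirect_target(old_url)
    if target is None:
        return ("Status: 404 Not Found\r\n"
                "Content-Type: text/plain\r\n\r\nNot found\n")
    return "Status: 301 Moved Permanently\r\nLocation: " + target + "\r\n\r\n"
```

The apache rewrite rule would only need to hand any request under the
old prefix to this script; the mapping itself stays in one place.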

	-hilmar

On Jun 15, 2007, at 6:23 PM, Jason Stajich wrote:

> Sounds like a plan, I'll be curious to see if we can still get keep  
> anonymous CVS working as I'd like to not have to pull the plug on  
> that.  There are some threads out on the web about how to do this  
> with a commit rule on SVN.
>
> Also, can someone who is close enough to all the SVN benefits  
> please elaborate how it is going to help _this_ project?
> Perhaps you would be willing to put a few words up -- like on (a to  
> be created):
> http://bioperl.org/wiki/BioPerl:Version_control_changeover
>
> This way if anonymous CVS is broken and/or developers who haven't  
> been paying attention come back to commit code ask why things  
> changed we don't have to compose long emails... =)
>
> -jason
> On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote:
>
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>>
>> - you document what you find yourself having to do in trying to make
>> it work
>>
>> - you report back when you think you have a working repository
>>
>> - we all get a defined amount of time to test to our hearts' content,
>> say 2 weeks
>>
>> - you fix issues that were encountered
>>
>> - report back when done, followed by retesting for, say 1 week
>>
>> - iterate previous 2 steps until no issues and no objections to
>> migration
>>
>> - two more weeks of warning period to all developers to commit all
>> outstanding changes, or reapply them to a future svn checkout
>>
>> - pull the trigger by locking down cvs, applying the migration as
>> worked out before, and announcing that BioPerl is now on svn
>>
>> - get free beer at next BOSC (I'll pay if no one else does)
>>
>> This may not be precisely the plan that needs to be executed, but
>> it's probably somewhere along those lines.
>>
>> If there are volunteers who would like to spearhead this, then power
>> to you - I think everyone is in favor and the advantages of svn don't
>> need to be debated. The only reason it hasn't happened yet is because
>> no one has stepped forward who would have the energy.
>
>>
>> I'm sure ChrisD will gladly create the svn sandbox if we have
>> volunteers lined up to get going.
>>
>> 	-hilmar
>>
>> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
>>
>>> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>>>> Hi,
>>>>
>>>> I would very much prefer it if bioperl moved to svn. I'm
>>>> considering merging Bio::Phylo (to the extent that that's possible/
>>>> practical) with bioperl and move it to an OBF repository, but I'd
>>>> rather not go back to CVS.
>>>>
>>>> Rutger
>>>>
>>>
>>> I second that, SVN seems like the reasonable choice. I would be more
>>> than happy to help out as well.
>>>
>>> Spiros
>>>
>>>>
>>>> -----Original Message-----
>>>>
>>>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>>>
>>>>>
>>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>>>
>>>>>>>>> ...
>>>>>>>> Can we do any sort of massive conversion at some logical
>>>>>>>> timepoint.
>>>>>>>> Probably after a branch release or something?  Because it
>>>>>>>> basically
>>>>>>>> means we're going to have differences on nearly every line
>>>>>>>> which is
>>>>>>>> going to make diff-ing difficult when debugging old/new  
>>>>>>>> versions.
>>>>>>>> Maybe it is not a problem because we aren't introducing and new
>>>>>>>> bugs!
>>>>>>
>>>>>> Sorry, can you clarify the problem you envisage? And why would
>>>>>> making a branch release help?
>>>>>
>>>>> Maybe the worry is that mass conversion in such a large codebase
>>>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows
>>>>> w/o
>>>>> trying?
>>>>>
>>>>>>> I agree; if we intend on doing this it should be all at once,
>>>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>>>> need a script up- and-running that tidies everything up prior to
>>>>>>> commits (though what  happens if perltidy tanks?...).
>>>>>>> Sendu, up for it?
>>>>>>
>>>>>> If its going to be difficult and a hassle, for such an  
>>>>>> unnecessary
>>>>>> thing I'm not sure its worth it. There are more pressing  
>>>>>> things to
>>>>>> be done for Bioperl.
>>>>>>
>>>>>> If I can just run perltidy on the entire package and commit,  
>>>>>> I'd do
>>>>>> it. If that's not appropriate, I won't.
>>>>>
>>>>> The choices aren't necessarily all or nothing.  What about
>>>>> voluntary,
>>>>> recommended use of a perltidy config file included with the
>>>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>>>
>>>>>>>>> About svn
>>>>>> [snip]
>>>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>>>> is  involved and try getting something going in the next  
>>>>>>> month or
>>>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>>>> well but it  might be worth looking into.
>>>>>>
>>>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>>>> more pressing matters (POD fixing, test script updating and
>>>>>> finishing...).
>>>>>
>>>>> A few other open-bio projects have actively discussed a CVS->SVN
>>>>> migration (BioRuby and I think BioPython, though the latter  
>>>>> could be
>>>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>>>> pros/cons, get others opinions from others who have made the
>>>>> transition, etc.  We could, as Jason suggested, even set up a  
>>>>> tester
>>>>> SVN w/o making it the default codebase (lock it off to a few
>>>>> testers,
>>>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>>>
>>>>> I agree with you that it's not feasible to switch over prior to a
>>>>> release and that there are more pressing issues, but it doesn't  
>>>>> hurt
>>>>> having an open discussion about it.
>>>>>
>>>>> chris
>>>>>
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Sat Jun 16 10:55:09 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 16 Jun 2007 10:55:09 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4673C7CB.1030709@mail.nih.gov>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
Message-ID: <B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>


On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:

> As for access, the typical access is over http (or https).

We're using svn+ssh here (NESCent) so the password is the same as the  
one you set for your account on the server, and you can use public/ 
private key negotiation for authentication.

I think the ability to not provide a password for every single
interaction is a requirement. Whether that requires svn+ssh or can
be made to work through https too, I don't know. On sf.net I have to
use https for svn and it doesn't ask me for the password each time.
Not sure how this works though, maybe some local caching?

We should not be using http, or any other protocol that sends
unencrypted passwords.
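
As a rough illustration of that policy, a check like the following
could sit in whatever setup script or documentation example we end up
with. This is only a sketch in Python: the set of accepted schemes is
my assumption about which transports to allow, and the host names in
the usage note are placeholders.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: only these URL schemes are treated as
# protecting credentials in transit.
ENCRYPTED_SCHEMES = {"https", "svn+ssh"}


def credentials_protected(repo_url):
    """Return True if the repository URL uses an accepted encrypted transport."""
    return urlsplit(repo_url).scheme.lower() in ENCRYPTED_SCHEMES
```

With this, "svn+ssh://dev@svn.example.org/repo" and
"https://svn.example.org/repo" pass, while
"http://svn.example.org/repo" is rejected.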

>   Access controls can be set up on the server side while allowing  
> anonymous access for checkout.  There are many excellent SVN for  
> every OS, so that should not be a problem.

On Mac OSX the most convenient way I have found is through fink. It  
does ask to install 30 other dependencies, which had me balk at  
first, but me doing it by hand is even worse than fink doing it, so I  
finally gave in and it's really a breeze. I've not had a single issue.

  From a sysadmin perspective, what might be worth keeping in mind is  
that svn is going to store everything in a database (BerkeleyDB I  
think). I.e., there is no such thing anymore as restoring individual  
source code files from backup if one gets accidentally corrupted on  
the server. It seems you have to restore the entire database, i.e.,  
the entire repository. I vaguely recall though that how svn manages  
the repository is actually configurable and that other storage than  
DB is possible too. Don't ask me for the pros and cons of one vs the  
other.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From rvos at interchange.ubc.ca  Sat Jun 16 13:09:18 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Sat, 16 Jun 2007 10:09:18 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>

CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended on it already. For my own purposes I like the integration with Eclipse (through the Subclipse plugin) and Komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)).

For standalone use on OS X I didn't use the fink one, but I forgot where I got it from. It was very easy to set up, though. On Windows there is a really nice standalone client (TortoiseSVN) that integrates with the Explorer so you can see from the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (which seems a requirement).

Rutger


-----Original Message-----

> Date: Sat Jun 16 07:55:09 PDT 2007
> From: "Hilmar Lapp" <hlapp at gmx.net>
> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
> To: "Sean Davis" <sdavis2 at mail.nih.gov>
>
> 
> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
> 
> > As for access, the typical access is over http (or https).
> 
> We're using svn+ssh here (NESCent) so the password is the same as the  
> one you set for your account on the server, and you can use public/ 
> private key negotiation for authentication.
> 
> I think the ability to not provide a password for every single  
> interaction is a requirement. If that requires using svn+ssh or can  
> be made to work through https too I don't know. On sf.net I have to  
> use https for svn and it doesn't ask me for the password each time.  
> Not sure how this works though, maybe some local caching?
> 
> We should not be using http, or whatever other protocol that sends  
> unencrypted passwords.
> 
> >   Access controls can be set up on the server side while allowing  
> > anonymous access for checkout.  There are many excellent SVN for  
> > every OS, so that should not be a problem.
> 
> On Mac OSX the most convenient way I have found is through fink. It  
> does ask to install 30 other dependencies, which had me balk at  
> first, but me doing it by hand is even worse than fink doing it, so I  
> finally gave in and it's really a breeze. I've not had a single issue.
> 
>   From a sysadmin perspective, what might be worth keeping in mind is  
> that svn is going to store everything in a database (BerkeleyDB I  
> think). I.e., there is no such thing anymore as restoring individual  
> source code files from backup if one gets accidentally corrupted on  
> the server. It seems you have to restore the entire database, i.e.,  
> the entire repository. I vaguely recall though that how svn manages  
> the repository is actually configurable and that other storage than  
> DB is possible too. Don't ask me for the pros and cons of one vs the  
> other.
> 
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 


From rvos at interchange.ubc.ca  Sat Jun 16 13:15:45 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Sat, 16 Jun 2007 10:15:45 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca>

A brief word on the topic of perltidy: no. I like what it does, and I sort of follow one of its settings (-syn -sob -b), but if you run it on a whole source tree it'll screw up the diffs. I'm also still worried about it breaking things (though really it shouldn't; it creates a *.bak if something doesn't compile anymore).
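
To make the diff problem concrete, here is a small sketch in Python using the standard difflib module. The two versions of the Perl snippet are made up for illustration and differ only in whitespace and brace placement, roughly the kind of change a formatter makes; they are not real perltidy output.

```python
import difflib

# Two made-up versions of the same Perl snippet; only whitespace and brace
# placement differ, as a code formatter might change them.
before = [
    "sub new {",
    "  my($class,@args) = @_;",
    "  my $self = $class->SUPER::new(@args);",
    "  return $self;",
    "}",
]
after = [
    "sub new",
    "{",
    "    my ( $class, @args ) = @_;",
    "    my $self = $class->SUPER::new(@args);",
    "    return $self;",
    "}",
]

# Line-based unified diff, the same view a cvs/svn diff would give.
diff = list(difflib.unified_diff(before, after, "before", "after", lineterm=""))

# Count added/removed lines, skipping the "---"/"+++" file header lines.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

# With all whitespace squeezed out, the two versions are identical.
same_code = "".join("".join(before).split()) == "".join("".join(after).split())

print(len(changed), "changed lines; identical ignoring whitespace:", same_code)
```

Nearly every line shows up as changed even though the code is the same, which is what would bury real changes when comparing revisions across a tree-wide reformat.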

Rutger


-----Original Message-----

> Date: Sat Jun 16 10:09:18 PDT 2007
> From: "rvos" <rvos at interchange.ubc.ca>
> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
> To: "Hilmar Lapp" <hlapp at gmx.net>, "Sean Davis" <sdavis2 at mail.nih.gov>
>
> CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)).
> 
> For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement).
> 
> Rutger
> 
> 
> -----Original Message-----
> 
> > Date: Sat Jun 16 07:55:09 PDT 2007
> > From: "Hilmar Lapp" <hlapp at gmx.net>
> > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
> > To: "Sean Davis" <sdavis2 at mail.nih.gov>
> >
> > 
> > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
> > 
> > > As for access, the typical access is over http (or https).
> > 
> > We're using svn+ssh here (NESCent) so the password is the same as the  
> > one you set for your account on the server, and you can use public/ 
> > private key negotiation for authentication.
> > 
> > I think the ability to not provide a password for every single  
> > interaction is a requirement. If that requires using svn+ssh or can  
> > be made to work through https too I don't know. On sf.net I have to  
> > use https for svn and it doesn't ask me for the password each time.  
> > Not sure how this works though, maybe some local caching?
> > 
> > We should not be using http, or whatever other protocol that sends  
> > unencrypted passwords.
> > 
> > >   Access controls can be set up on the server side while allowing  
> > > anonymous access for checkout.  There are many excellent SVN for  
> > > every OS, so that should not be a problem.
> > 
> > On Mac OSX the most convenient way I have found is through fink. It  
> > does ask to install 30 other dependencies, which had me balk at  
> > first, but me doing it by hand is even worse than fink doing it, so I  
> > finally gave in and it's really a breeze. I've not had a single issue.
> > 
> >   From a sysadmin perspective, what might be worth keeping in mind is  
> > that svn is going to store everything in a database (BerkeleyDB I  
> > think). I.e., there is no such thing anymore as restoring individual  
> > source code files from backup if one gets accidentally corrupted on  
> > the server. It seems you have to restore the entire database, i.e.,  
> > the entire repository. I vaguely recall though that how svn manages  
> > the repository is actually configurable and that other storage than  
> > DB is possible too. Don't ask me for the pros and cons of one vs the  
> > other.
> > 
> > 	-hilmar
> > -- 
> > ===========================================================
> > : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> > ===========================================================
> > 
> > 
> > 
> > 
> > 
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From george.heller at yahoo.com  Sat Jun 16 13:29:26 2007
From: george.heller at yahoo.com (George Heller)
Date: Sat, 16 Jun 2007 10:29:26 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
Message-ID: <959624.48556.qm@web56502.mail.re3.yahoo.com>

Hi all,
   
  I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. 
   
  Any ideas on the way I can go about doing this?
   
  George

       


From bix at sendu.me.uk  Sat Jun 16 14:21:38 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 16 Jun 2007 19:21:38 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <959624.48556.qm@web56502.mail.re3.yahoo.com>
References: <959624.48556.qm@web56502.mail.re3.yahoo.com>
Message-ID: <46742A32.90305@sendu.me.uk>

George Heller wrote:
> Hi all,
> 
> I am looking at extracting the taxonomy hierarchy for some taxon ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
> 
> Any ideas on the way I can go about doing this?

Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
some kind of looping structure. Most easily a recursing sub.

If you happen to code up something neat and efficient, why not share it 
with us and we could add it to the Taxonomy module(s).
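
A minimal sketch of the recursing sub suggested above. It is written against
a made-up in-memory parent => children table so it runs on its own; the ids
and the all_descendant_ids name are hypothetical. With Bio::DB::Taxonomy the
callback would instead return the ids of the taxa that each_Descendent gives
for a node.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a taxonomy database: parent id => child ids.
# With BioPerl the callback below would ask the database for each
# node's immediate children instead of reading this hash.
my %children_of = (
    1 => [ 2, 3 ],
    2 => [ 4, 5 ],
    3 => [],
    4 => [ 6 ],
    5 => [],
    6 => [],
);

# Recursively collect every descendant id below $id (depth first).
# $get_children is a code ref returning the immediate children of an id.
sub all_descendant_ids {
    my ($id, $get_children) = @_;
    my @found;
    for my $child ($get_children->($id)) {
        push @found, $child, all_descendant_ids($child, $get_children);
    }
    return @found;
}

my @ids = all_descendant_ids(1, sub { @{ $children_of{ $_[0] } || [] } });
print join(",", @ids), "\n";    # prints 2,4,6,5,3
```

Passing the child lookup as a code ref keeps the traversal independent of
where the hierarchy is stored, so the same sub works for a hash, a flat
file or a database handle.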


From cjfields at uiuc.edu  Sat Jun 16 15:23:43 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 16 Jun 2007 14:23:43 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>
Message-ID: <A59B3FA2-6732-4DB2-9C9C-223DFF41D1E9@uiuc.edu>


On Jun 16, 2007, at 9:55 AM, Hilmar Lapp wrote:

>
> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>
>> As for access, the typical access is over http (or https).
>
> We're using svn+ssh here (NESCent) so the password is the same as the
> one you set for your account on the server, and you can use public/
> private key negotiation for authentication.
>
> I think the ability to not provide a password for every single
> interaction is a requirement. If that requires using svn+ssh or can
> be made to work through https too I don't know. On sf.net I have to
> use https for svn and it doesn't ask me for the password each time.
> Not sure how this works though, maybe some local caching?
>
> We should not be using http, or whatever other protocol that sends
> unencrypted passwords.

Agreed; it should be through ssh.

>>   Access controls can be set up on the server side while allowing
>> anonymous access for checkout.  There are many excellent SVN for
>> every OS, so that should not be a problem.
>
> On Mac OSX the most convenient way I have found is through fink. It
> does ask to install 30 other dependencies, which had me balk at
> first, but me doing it by hand is even worse than fink doing it, so I
> finally gave in and it's really a breeze. I've not had a single issue.
>
>   From a sysadmin perspective, what might be worth keeping in mind is
> that svn is going to store everything in a database (BerkeleyDB I
> think). I.e., there is no such thing anymore as restoring individual
> source code files from backup if one gets accidentally corrupted on
> the server. It seems you have to restore the entire database, i.e.,
> the entire repository. I vaguely recall though that how svn manages
> the repository is actually configurable and that other storage than
> DB is possible too. Don't ask me for the pros and cons of one vs the
> other.

MacPorts/DarwinPorts also has subversion, various language bindings,  
cvs2svn, and various perl modules.  There are also a few SVN GUIs  
lingering around (including live folders within Komodo).

chris


From cjfields at uiuc.edu  Sat Jun 16 15:18:06 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 16 Jun 2007 14:18:06 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca>
References: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <1A314D08-8F3C-4A4B-B58D-64AC7952F149@uiuc.edu>

I think it's viable as an option if the code really needs it.  After  
100+ commits some of the code has schizy coding styles, so cleaning  
it up helps.  In those cases having a perltidy config file present  
wouldn't hurt.  However I agree that it shouldn't be applied across  
every module and should be done judiciously (the commit message, for  
instance, should actually state the code was tidied).

chris

PS - Nice to see the ball is rolling on SVN!

On Jun 16, 2007, at 12:15 PM, rvos wrote:

> A brief word on the topic of perltidy: no. I like what it does, and  
> I sort of follow one of its settings (-syn -sob -b), but if you run  
> it on a whole source tree it'll screw up the diffs, and I'm still  
> worried about it breaking things (though really it shouldn't, it  
> creates a *.bak if something doesn't compile anymore).
>
> Rutger
>
>
>
> -----Original Message-----
>
>> Date: Sat Jun 16 10:09:18 PDT 2007
>> From: "rvos" <rvos at interchange.ubc.ca>
>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
>> To: "Hilmar Lapp" <hlapp at gmx.net>, "Sean Davis"  
>> <sdavis2 at mail.nih.gov>
>>
>> CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales  
>> talk has been expended over it already, for my own purpose I like  
>> the integration with eclipse (through subclipse plugin) and  
>> komodo, in addition to the atomic commits (so I can ctrl+c if I  
>> goof up (again)).
>>
>> For standalone use on osx I didn't use the fink one, but I forgot  
>> where I did get it from. It was very easy to set up, though. On  
>> windows there is a really nice standalone one (tortoisesvn) that  
>> integrates with the explorer so you can see on the file icons what  
>> the state of a file is. I know that there's a cvs2svn utility that  
>> converts your revision history (seems a requirement).
>>
>> Rutger
>>
>>
>> -----Original Message-----
>>
>>> Date: Sat Jun 16 07:55:09 PDT 2007
>>> From: "Hilmar Lapp" <hlapp at gmx.net>
>>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
>>> To: "Sean Davis" <sdavis2 at mail.nih.gov>
>>>
>>>
>>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>>>
>>>> As for access, the typical access is over http (or https).
>>>
>>> We're using svn+ssh here (NESCent) so the password is the same as  
>>> the
>>> one you set for your account on the server, and you can use public/
>>> private key negotiation for authentication.
>>>
>>> I think the ability to not provide a password for every single
>>> interaction is a requirement. If that requires using svn+ssh or can
>>> be made to work through https too I don't know. On sf.net I have to
>>> use https for svn and it doesn't ask me for the password each time.
>>> Not sure how this works though, maybe some local caching?
>>>
>>> We should not be using http, or whatever other protocol that sends
>>> unencrypted passwords.
>>>
>>>>   Access controls can be set up on the server side while allowing
>>>> anonymous access for checkout.  There are many excellent SVN for
>>>> every OS, so that should not be a problem.
>>>
>>> On Mac OSX the most convenient way I have found is through fink. It
>>> does ask to install 30 other dependencies, which had me balk at
>>> first, but me doing it by hand is even worse than fink doing it,  
>>> so I
>>> finally gave in and it's really a breeze. I've not had a single  
>>> issue.
>>>
>>>   From a sysadmin perspective, what might be worth keeping in  
>>> mind is
>>> that svn is going to store everything in a database (BerkeleyDB I
>>> think). I.e., there is no such thing anymore as restoring individual
>>> source code files from backup if one gets accidentally corrupted on
>>> the server. It seems you have to restore the entire database, i.e.,
>>> the entire repository. I vaguely recall though that how svn manages
>>> the repository is actually configurable and that other storage than
>>> DB is possible too. Don't ask me for the pros and cons of one vs the
>>> other.
>>>
>>> 	-hilmar
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hartzell at alerce.com  Sat Jun 16 13:47:01 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 16 Jun 2007 10:47:01 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
	<6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
	<F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>
Message-ID: <18036.8725.29073.619527@almost.alerce.com>

Chris Fields writes:
 > Ah, got it.  Sorry.
 > 
 > George, planning on taking this up?

I'm going to take a *peek*.  I just finished (unless someone finds
another issue) moving someone's cvs repository over to svn, so I have
some tools cobbled together and some knowledge in the cache.

I don't have too much idle time at the moment though, so if it gets
gooey I'll just summarize what I learn.  Either way it seems worth a
peek.

I will need the repository itself though.  I'll post a note to
support at open-bio.org.

g.


From jason at bioperl.org  Sat Jun 16 19:54:18 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 16 Jun 2007 16:54:18 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18036.8725.29073.619527@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
	<6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
	<F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>
	<18036.8725.29073.619527@almost.alerce.com>
Message-ID: <6F57475B-715F-49D1-B6D2-F3FD3ACCB728@bioperl.org>

Thanks George.
I'll respond to your support ticket as well but I put up tarballs of  
the repository as of today.

I had thought at one point ChrisD might have setup rsync-able access  
to the whole repository through code.open-bio.org but for now I have  
put up tarballs of most of the CVS dirs from bioperl
http://bioperl.org/uploads/

Just to say I already went through all the steps of running cvs2svn  
myself and had problems gathering back out the branches and all the  
tags when I tried it.  You may want to start with a smaller repository  
like bioperl-network or bioperl-db, as the initial cvs2svn conversion  
script took quite a long time to run on bioperl-live.

Regarding ssh/https:
We have already gone through some of this for blipkit and biojava  
projects.  I think we'll still keep separate anonymous read-only  
(code.open-bio.org) and writeable repositories (dev.open-bio.org) as  
I think we are resisting any webapps on the development server as we  
want that to be as locked down as possible.  For the newly created svn  
repositories that I've been creating/using I just use svn+ssh and  
that worked okay.


-jason

On Jun 16, 2007, at 10:47 AM, George Hartzell wrote:

> Chris Fields writes:
>> Ah, got it.  Sorry.
>>
>> George, planning on taking this up?
>
> I'm going to take a *peek*.  I just finished (unless someone finds
> another issue) moving someone's cvs repository over to svn, so I have
> some tools cobbled together and some knowledge in the cache.
>
> I don't have too much idle time at the moment though, so if it gets
> gooey I'll just summarize what I learn.  Either way it seems worth a
> peek.
>
> I will need the repository itself though.  I'll post a note to
> support at open-bio.org.
>
> g.

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From hartzell at alerce.com  Sat Jun 16 19:56:09 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 16 Jun 2007 16:56:09 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <46739D69.4090204@sheffield.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<46739D69.4090204@sheffield.ac.uk>
Message-ID: <18036.30873.609341.181853@almost.alerce.com>

Nathan S. Haigh writes:
 > [...]
 > Sounds like George might know what he's doing! 

Hey, I've been looking for a Marketing Director.  Want a job?

 > I have a question about
 > setting up svn access. I believe access can be done in several ways,
 > over webdav, over ssh and probably others too. Do you have any knowledge
 > about the benefits of one over the other? I suppose I'm thinking of what
 > to implement to allow anonymous read access for users and authenticated
 > access for developers.

There are two and a half ways to talk to the repository:

  - You can put it behind a web server (e.g. apache) and get at it
    using http/https.  Authentication and authorization happen using
    the normal web server tricks, so as long as you don't do anything
    silly (e.g. don't use basic auth, stick with mod_auth_digest),
    even http connections won't send passwords in the clear.  You can
    define users in .htpasswd files or use any of the fancier setups
    (e.g. sql databases, etc...).

  - You can talk to it via subversion's simple server, svnserve.
    There are two ways you usually talk to svnserve (neither of which
    send passwords in the clear):

      * directly, using a URL like
          svn://svn.example.com/repo/proj/trunk
        when you do this the client either talks directly to a copy of
        svnserve running as a daemon, or possibly to something like
        inetd that'll start an svnserve as necessary.

        In this case, you define authen. and author. info in an
        svnserve.conf file.

      * indirectly, using a URL like
          svn+ssh://svn.example.com/repo/proj/trunk/
        in which case you make an ssh connection to the server machine
        (and authenticate via ssh mechanisms, anything other than a
        key-pair will drive you nuts with repeated password requests)
        and then an svnserve process is started up for you in "tunnel
        mode".  Access control is coarse grained and via OS level access
        permissions.

        Generally in this case you need to give out shell accounts to
        everyone involved, or (tsk, tsk) have them use a common
        account.  There's a cute trick in the svn book that shows how
        to use a shared ssh account but still have all of the changes
        in the repo keep track of the real user.  I've never tried
        it.... 

   - If you're on the same machine as the repo, you can do this
     simple:
        file:///path/to/repo/proj/trunk

The biggest deciding factor is how you want to manage your users and
whether you're already messing around with a web server.  I've
generally worked in small groups and everyone's had ssh access, but
I've set it up the other ways too.

You can even access via multiple paths.  The only trick is that the
repository needs to be writable by whoever's committing, and if
they're running svnserve themselves (file: or svn+ssh:) and things
aren't set up right (all the dirs in the repo need to be group
writable and have the magic bit set so that any new stuff created is
also writable, users umasks and group membership need to be aligned)
then things go fubar.  Google's your friend here, and each of the
OS's/distro's has a standard hack for making this work, usually
involving a wrapper app that takes care of things.

Feel free to ask any particular questions.

Phew,

g.


From jason at bioperl.org  Sat Jun 16 20:17:58 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 16 Jun 2007 17:17:58 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <200706151653.04135.sheris@eps.berkeley.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
	<1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
	<200706151653.04135.sheris@eps.berkeley.edu>
Message-ID: <6A369DE9-943A-4DF1-9DF0-F68E361C8C20@bioperl.org>

The error is clearly saying there must be a symbol or letter in  
your sequence that violates the regexp.
I had modified the code in CVS to actually provide a more informative  
mismatch error in the error message, but this is probably not in the  
release you are using.

Anyways, add this to see what is causing the problem:

print join(",",($nstarthash{$_}[1] =~ /([^$Bio::PrimarySeq::MATCHPATTERN]+)/g)), "\n";
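
A self-contained sketch of the same check, for trying outside the script.
The allowed-character class below is an assumption standing in for
$Bio::PrimarySeq::MATCHPATTERN (check the PrimarySeq.pm you have installed),
the offending_runs name is hypothetical, and the sequence with a trailing
newline is made up to show one kind of character that is easy to miss by eye.

```perl
use strict;
use warnings;

# Assumed stand-in for $Bio::PrimarySeq::MATCHPATTERN; the real value
# lives in your installed Bio/PrimarySeq.pm and may differ.
my $allowed = 'A-Za-z\-\.\*\?=~';

# Return each run of characters that falls outside the allowed class.
sub offending_runs {
    my ($seq) = @_;
    return $seq =~ /([^$allowed]+)/g;
}

# Made-up sequence with a trailing newline, invisible when printed.
my $seq = "GCCCCTGTAATCG\n";

# Print offenders as hex codes so whitespace and control characters show up.
for my $run (offending_runs($seq)) {
    print join(" ", map { sprintf "0x%02X", ord } split //, $run), "\n";
}
```

Printing hex codes rather than the raw characters makes a stray newline,
space or carriage return visible, which a plain print of the match would
hide.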

-jason
On Jun 15, 2007, at 4:53 PM, Sheri Simmons wrote:

> Thanks for the suggestion, but that still gives the same error as  
> before.
>
> On Friday 15 June 2007 4:11 pm, Kevin Brown wrote:
>>> I'm getting an error as follows when I try to reverse
>>> complement a sequence string stored in a hash of arrays. The
>>> storage code is:
>>>
>>> 		$nstarthash{$key} = [$sortchecks[0], join("",
>>> @nseq),
>>> join("",@{$seqhash{$key}})];
>>>
>>> the sequence of interest is the element at index 1.
>>>
>>> Later, I try to retrieve this string for a subset of keys so
>>> I can reverse complement it based on input from another hash
>>> (%complement):
>>>
>>> 			my %revcomphash = map { my $read = $_;
>>> 			grep $complement{$read} eq 'C', %complement;
>>> 			{$_, (Bio::Seq->new(-seq
>>> =>$nstarthash{$_}[1]))->revcom->seq()};}
>>> 			 keys(%nstarthash);
>>>
>>>
>>> I get the following warning (long sequence edited for clarity):
>>>
>>> -- -------------------- WARNING ---------------------
>>> MSG: seq doesn't validate, mismatch is 1
>>> ---------------------------------------------------
>>>
>>> ------------- EXCEPTION  -------------
>>> MSG: Attempting to set the sequence to
>>> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
>>> which does not look healthy
>>> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
>>> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
>>> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK
>>> toplevel ../quality_wrapper.pl:103
>>>
>>> I cannot find any non-allowed characters in the sequence, and
>>> the de-referencing appears to work correctly. Can anyone help me?
>>> I'm using the latest Bioperl installation (1.5.2) with
>>> ActivePerl5.8 on a Mepis 6.5 system.
>>
>> Try telling the Bio::Seq object what alphabet to use when creating  
>> it.
>> I tend to create them like:
>>
>> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')
>
> -- 
> Sheri Simmons
> Department of Earth and Planetary Sciences
> University of California, Berkeley
> Berkeley, CA 94720-4767

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From n.haigh at sheffield.ac.uk  Sun Jun 17 07:45:11 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 17 Jun 2007 12:45:11 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>
References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <46751EC7.8020609@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

rvos wrote:
> CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)).
> 
> For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement).
> 
> Rutger
> 
> 

Just to clarify, subversion is available as command line for windows:
http://subversion.tigris.org/project_packages.html

TortoiseSVN is another svn client with a GUI that integrates into the
shell. I tried setting this up a while back to use ssh (via PUTTY), but
I wasn't successful. This may have been due to me just starting out with
svn or that it was harder to setup in an earlier version of TortoiseSVN.

Does anyone have experience of setting up svn on Windows to use ssh? If
the changeover takes place, I'm happy to write some howto's for setting
up svn clients for Windows.

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGdR7HczuW2jkwy2gRAmgOAJ96wLzVYbjqEPborZTsw6gwU6UitgCfV02v
8xHJvn/Eqf9LePR3Ei0ZaIw=
=t5pN
-----END PGP SIGNATURE-----


From george.heller at yahoo.com  Sun Jun 17 14:41:55 2007
From: george.heller at yahoo.com (George Heller)
Date: Sun, 17 Jun 2007 11:41:55 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46742A32.90305@sendu.me.uk>
Message-ID: <148654.15952.qm@web56511.mail.re3.yahoo.com>

Hi all,
   
  Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. 
   
  Thanks.
  George

Sendu Bala <bix at sendu.me.uk> wrote:
  George Heller wrote:
> Hi all,
> 
> I am looking at extracting the taxonomy hierarchy for some taxon ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
> 
> Any ideas on the way I can go about doing this?

Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
some kind of looping structure. Most easily a recursing sub.

If you happen to code up something neat and efficient, why not share it 
with us and we could add it to the Taxonomy module(s).




From jason at bioperl.org  Sun Jun 17 16:48:05 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sun, 17 Jun 2007 13:48:05 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <148654.15952.qm@web56511.mail.re3.yahoo.com>
References: <148654.15952.qm@web56511.mail.re3.yahoo.com>
Message-ID: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org>

I assume you already figured out how to setup a local taxonomydb?

You just want the extant species/leaves of the tree

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;


-jason
On Jun 17, 2007, at 11:41 AM, George Heller wrote:

> Hi all,
>
>   Can anyone point me to some example that uses the  
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at  
> this, and I am not quite sure how to implement it.
>
>   Thanks.
>   George
>
> Sendu Bala <bix at sendu.me.uk> wrote:
>   George Heller wrote:
>> Hi all,
>>
>> I am looking at extracting the taxonomy hierarchy for some taxon ids.
>> What I plan to do is, for a given taxon id, say 33090, I want to
>> extract all taxon ids that are children of this species. I do not
>> just want the immediate children, but the children's children and so
>> on.
>>
>> Any ideas on the way I can go about doing this?
>
> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
>
> If you happen to code up something neat and efficient, why not  
> share it
> with us and we could add it to the Taxonomy module(s).
>
>
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From aaron.j.mackey at gsk.com  Sun Jun 17 22:35:42 2007
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Sun, 17 Jun 2007 22:35:42 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46742A32.90305@sendu.me.uk>
Message-ID: <OF9A874C93.CFF12016-ON852572FE.000E328D-852572FE.000E463E@gsk.com>

To do so efficiently, you might want to check out:

  http://www.oreillynet.com/pub/a/network/2002/11/27/bioconf.html

-Aaron

bioperl-l-bounces at lists.open-bio.org wrote on 06/16/2007 02:21:38 PM:

> George Heller wrote:
> > Hi all,
> > 
> > I am looking at extracting the taxonomy hierarchy for some taxon ids.
> > What I plan to do is, for a given taxon id, say 33090, I want to
> > extract all taxon ids that are children of this species. I do not
> > just want the immediate children, but the children's children and so
> > on.
> > 
> > Any ideas on the way I can go about doing this?
> 
> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
> 
> If you happen to code up something neat and efficient, why not share it 
> with us and we could add it to the Taxonomy module(s).
> 


From aaron.j.mackey at gsk.com  Sun Jun 17 22:34:12 2007
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Sun, 17 Jun 2007 22:34:12 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>
Message-ID: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>

> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
> 
> > As for access, the typical access is over http (or https).
> 
> We're using svn+ssh here (NESCent)

Let me just note that https is preferable to ssh for those poor slobs 
stuck behind a corporate firewall (svn happily prompts me for my proxy 
server's user/pass, then my https authentication realm's user/pass - all 
then get cached in some .svn/ file that I don't have to worry about again 
until my proxy server password changes once a month ...)

-Aaron


From george.heller at yahoo.com  Mon Jun 18 00:21:45 2007
From: george.heller at yahoo.com (George Heller)
Date: Sun, 17 Jun 2007 21:21:45 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org>
Message-ID: <487845.37410.qm@web56510.mail.re3.yahoo.com>

Thanks. And how can I assign the $node here in the below code, such that I can reference it to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id. 
   
  I have a local db setup, in which I have uploaded data using the load_ncbi_taxonomy.pl script. 
   
  Thanks.
  George
   
  Jason Stajich <jason at bioperl.org> wrote:
    I assume you already figured out how to setup a local taxonomydb?
  

  You just want the extant species/leaves of the tree
  

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

  
  -jason
    On Jun 17, 2007, at 11:41 AM, George Heller wrote:

    Hi all,
  

    Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. 
  

    Thanks.
    George
  

  Sendu Bala <bix at sendu.me.uk> wrote:
    George Heller wrote:
    Hi all,
  

  I am looking at extracting the taxonomy hierarchy for some taxon ids.
  What I plan to do is, for a given taxon id, say 33090, I want to
  extract all taxon ids that are children of this species. I do not
  just want the immediate children, but the children's children and so
  on.
  

  Any ideas on the way I can go about doing this?
  

  Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
  some kind of looping structure. Most easily a recursing sub.
  

  If you happen to code up something neat and efficient, why not share it 
  with us and we could add it to the Taxonomy module(s).
  



    --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/




From bix at sendu.me.uk  Mon Jun 18 06:44:00 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 11:44:00 +0100
Subject: [Bioperl-l] Network tests overhaul
Message-ID: <467661F0.2060703@sendu.me.uk>

When the test suite runs currently, most (the intent is all) tests skip 
if the test would require network (internet) access. This is to avoid 
tests failing not due to bugs in Bioperl code, but due to temporarily 
inaccessible servers. This is also to make running the test suite faster.

To do a complete test you currently have to set BIOPERLDEBUG to true, 
which activates the network test but also increases verbosity. This 
actually causes a problem, since when running the entire test suite the 
additional debug information is more a hindrance than a help, since the 
reams of printed information can hide significant warnings that may also 
get printed. It's also ugly.

The solution is to divorce activation of network tests from the request 
for verbosity. The obvious implementation is to have another environment 
variable, perhaps BIOPERLNETWORK. However, there is an opportunity to do 
something more appropriate. The running of networking tests should be a 
choice given to every end-user installing Bioperl. Debugging 
information, on the other hand, is only of interest to the developer 
working on a specific module under test, so can be left as a 'hidden' 
env var.


I have just committed one possible implementation along these lines.

You say:
perl Build.PL
as normal, and if you seem to have internet access it asks you if you'd 
like to run network tests. The default answer is no. If you answer yes, 
network tests will be enabled.

You can alternatively say:
perl Build.PL --network
and if you seem to have internet access, network tests will be enabled.

Then you run the tests:
./Build test
Any tests written to support the new system will then skip network tests 
if they haven't been enabled.

The only test I've written to support the new system is t/RemoteBlast.t:
./Build test --test_files t/RemoteBlast.t --verbose


Adding support to test scripts consists of the following changes:

+ use Module::Build;
+ my $build = Module::Build->current(get_options => { network => {} });
+ my $do_network_tests = $build->notes('network');

! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests
---
! if (!$do_network_tests) { # skip network tests


I propose adding this support to all test scripts that carry out network 
tests. Does anyone have objections? Does anyone have alternate 
implementations that may be superior?

I specifically suggest we don't use an env var in addition to the above, 
because the multiple ways of doing things could lead to confusion. Which 
takes priority? Did a user really have the networking tests turned on 
when he reported his test results?


The one thing I need help with is identifying which tests attempt to 
access the internet. I think we caught most of them for the 1.5.2 
release, but I think there are more lurking around. Can anyone offer a 
way to systematically find at least the test scripts which access the 
internet, if not the specific tests within?
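
One rough way to get a first list is to scan the test scripts for tell-tale strings. This is only a heuristic sketch: the patterns below are guesses at what a network-using test might mention, not a definitive list, and every match still needs checking by eye. The helper name network_hints is invented for this example.

```perl
use strict;
use warnings;

# Guessed indicators of network use; extend or prune as needed.
my @network_patterns = (
    qr/\bLWP::/,
    qr/\bIO::Socket\b/,
    qr{\b(?:https?|ftp)://},
    qr/\bBio::DB::(?:GenBank|GenPept|EMBL|SwissProt|RefSeq)\b/,
    qr/\bRemoteBlast\b/,
);

# Return the patterns that match the given script text
# (or, in scalar context, how many of them match).
sub network_hints {
    my ($text) = @_;
    return grep { $text =~ $_ } @network_patterns;
}

# Scan every test script under t/ and report the ones with at least one hit.
my @files = glob('t/*.t');
for my $file (sort @files) {
    open my $fh, '<', $file or do { warn "cannot read $file: $!\n"; next };
    my $text = do { local $/; <$fh> };
    close $fh;
    $text = '' unless defined $text;    # empty file
    my @hits = network_hints($text);
    print "$file: ", scalar(@hits), " hint(s)\n" if @hits;
}
```

Run from the top of the distribution, this prints one line per suspicious script. It will miss tests that reach the network indirectly through a module that does not appear by name in the script.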

Cheers,
Sendu.


From bix at sendu.me.uk  Mon Jun 18 06:46:17 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 11:46:17 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID: <46766279.7050202@sendu.me.uk>

Sendu Bala wrote:
> Adding support to test scripts consists of the following changes:
> 
> + use Module::Build;
> + my $build = Module::Build->current(get_options => { network => {} });

That should read:
+ my $build = Module::Build->current();

> + my $do_network_tests = $build->notes('network');


From cjfields at uiuc.edu  Mon Jun 18 07:45:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 06:45:10 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <46766279.7050202@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk>
Message-ID: <C3AD4CC8-4B55-4613-B751-99E18C7A87B5@uiuc.edu>

The idea sounds good, though if we plan on doing this we need to  
update the Test HOWTO as well.

Some modules require only a few (<50% of the total) network tests; I  
think SeqFeature.t may be one, though I'm not sure.  Does this handle  
those cases?

chris

On Jun 18, 2007, at 5:46 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Adding support to test scripts consists of the following changes:
>>
>> + use Module::Build;
>> + my $build = Module::Build->current(get_options => { network =>  
>> {} });
>
> That should read:
> + my $build = Module::Build->current();
>
>> + my $do_network_tests = $build->notes('network');

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Mon Jun 18 07:49:18 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 12:49:18 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <C3AD4CC8-4B55-4613-B751-99E18C7A87B5@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk>
	<C3AD4CC8-4B55-4613-B751-99E18C7A87B5@uiuc.edu>
Message-ID: <4676713E.1000508@sendu.me.uk>

Chris Fields wrote:
> The idea sounds good, though if we plan on doing this we need to update 
> the Test HOWTO as well.
> 
> Some modules require only a few (<50% of the total) network tests; I 
> think SeqFeature.t may be one, though I'm not sure.  Does this handle 
> those cases?

Yes, the system just gives the test script a boolean describing if 
network tests should be run. The script can then do whatever it wants 
with the boolean. Skip all tests, skip no tests, skip just some tests... 
it's a drop-in replacement for the current 'debug' boolean based on 
BIOPERLDEBUG.
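
As a sketch of the "skip just some tests" case, a test script might guard only its network-dependent checks with a Test::More SKIP block. The helper name network_tests_enabled and the placeholder checks are invented for this example. Falling back to false when no Module::Build build has been configured is likewise an assumption, not part of the committed code.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# True only if 'perl Build.PL' recorded a true 'network' note.
# Assumption: "not run from a configured build directory" counts as false.
sub network_tests_enabled {
    my $enabled = eval {
        require Module::Build;
        Module::Build->current->notes('network');
    };
    return $enabled ? 1 : 0;
}

my $do_network_tests = network_tests_enabled();

# Runs regardless of the user's choice.
ok(1, 'offline check');

SKIP: {
    skip('network tests have not been requested', 2) unless $do_network_tests;

    # Placeholders: a real script would query a remote server here.
    ok(1, 'first network-dependent check');
    ok(1, 'second network-dependent check');
}
```

The count passed to skip() has to match the number of tests inside the block so that the plan still adds up when they are skipped.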


From hlapp at gmx.net  Mon Jun 18 08:38:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:38:25 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <487845.37410.qm@web56510.mail.re3.yahoo.com>
References: <487845.37410.qm@web56510.mail.re3.yahoo.com>
Message-ID: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net>

I'm a bit confused - it sounds like you have set up a local BioSQL  
database and loaded the NCBI taxonomy into the database. You can now  
use simple SQL to retrieve all descendants of a node in the tree  
given its NCBI taxonID such as

	SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, taxon n
	WHERE
	    n.ncbi_taxon_id = :taxonID
	AND tn.left_value > n.left_value
	AND tn.right_value < n.right_value
	AND tn.taxon_id = tnm.taxon_id
	AND tnm.name_class = 'scientific name'
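
The left_value/right_value comparison works because the tree is stored as a nested set: every node's interval lies strictly inside the interval of each of its ancestors. A small self-contained Perl illustration, with made-up node names and values rather than real taxonomy data:

```perl
use strict;
use warnings;

# Toy nested-set encoding; the names and numbers are invented for illustration.
my %taxon = (
    root   => { left => 1, right => 10 },
    plants => { left => 2, right => 7  },
    mosses => { left => 3, right => 4  },
    ferns  => { left => 5, right => 6  },
    fungi  => { left => 8, right => 9  },
);

# Return the names of all nodes whose interval lies strictly inside the
# interval of $ancestor, i.e. the same test as the WHERE clause above.
sub descendants_of {
    my ($ancestor) = @_;
    my $n = $taxon{$ancestor} or return;
    my @names = sort grep {
        $taxon{$_}{left} > $n->{left} && $taxon{$_}{right} < $n->{right}
    } keys %taxon;
    return @names;
}

print join(', ', descendants_of('plants')), "\n";   # prints "ferns, mosses"
```

No recursion is needed: one range comparison per node finds descendants at every depth, which is why the SQL version can be a single query.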

BioPerl doesn't have a Taxonomy::biosql module yet (though this would  
seem like a worthwhile thing to add), so you can't use the  
Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the  
NCBI taxonomy database and indexes it, so you can simply use the  
Bio::DB::Taxonomy methods get_taxon and get_all_Descendents on the  
flatfile download to achieve what you wanted to do in fewer than 5  
lines of Perl.

Although the recursive implementation of get_all_Descendents() won't  
be lightning fast, it may still be perfectly fine for your  
application - are you sure it is not?
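
A minimal sketch of the flat-file route, assuming the NCBI dump files nodes.dmp and names.dmp have been downloaded locally (all paths below are placeholders) and that the installed BioPerl provides get_all_Descendents on the database object as discussed in this thread:

```perl
use strict;
use warnings;
use Bio::DB::Taxonomy;

# Placeholder paths: point these at your own copy of the NCBI taxonomy dump.
my $db = Bio::DB::Taxonomy->new(
    -source    => 'flatfile',
    -directory => '/tmp/taxonomy_index',    # where the index files are written
    -nodesfile => '/data/ncbi/nodes.dmp',
    -namesfile => '/data/ncbi/names.dmp',
);

# This is the $node asked about: the taxon object for one NCBI taxon id.
my $node = $db->get_taxon(-taxonid => 33090)
    or die "taxon 33090 not found\n";

# All descendants (children, grandchildren, ...), fetched recursively.
for my $desc ($db->get_all_Descendents($node)) {
    my $name = $desc->scientific_name;
    $name = '' unless defined $name;
    print join("\t", $desc->id, $name), "\n";
}
```

The first run builds the index in the -directory location, so it is slower than later runs.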

	-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

> Thanks. And how can I assign the $node here in the below code, such  
> that I can reference it to a particular taxon id record? I want to  
> retrieve all the descendents from the taxonomy hierarchy, given a  
> particular taxon id.
>
>   I have a local db setup, in which I have uploaded data using the  
> load_ncbi_taxonomy.pl script.
>
>   Thanks.
>   George
>
>   Jason Stajich <jason at bioperl.org> wrote:
>     I assume you already figured out how to setup a local taxonomydb?
>
>
>   You just want the extant species/leaves of the tree
>
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>
>
>   -jason
>     On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
>     Hi all,
>
>
>     Can anyone point me to some example that uses the  
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at  
> this, and I am not quite sure how to implement it.
>
>
>     Thanks.
>     George
>
>
>   Sendu Bala <bix at sendu.me.uk> wrote:
>     George Heller wrote:
>     Hi all,
>
>
>   I am looking at extracting the taxonomy hierarchy for some taxon  
> ids.
>   What I plan to do is, for a given taxon id, say 33090, I want to
>   extract all taxon ids that are children of this species. I do not
>   just want the immediate children, but the children's children and so
>   on.
>
>
>   Any ideas on the way I can go about doing this?
>
>
>   Well, you'll use Bio::DB::Taxonomy presumably, and  
> each_Descendent in
>   some kind of looping structure. Most easily a recursing sub.
>
>
>   If you happen to code up something neat and efficient, why not  
> share it
>   with us and we could add it to the Taxonomy module(s).
>
>
>
>
>
>
>
>
>     --
>   Jason Stajich
>   jason at bioperl.org
>   http://jason.open-bio.org/
>
>
>
>
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Mon Jun 18 08:44:22 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:44:22 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
Message-ID: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>

Just curious - how do you cvs commit then to an external repository?  
Is that open in the firewall?

It is true though that corporations typically will not permit any  
encrypted outgoing traffic through their firewall except https.  
sf.net only supports https for svn, AFAIK.

	-hilmar

On Jun 17, 2007, at 10:34 PM, aaron.j.mackey at gsk.com wrote:

>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>>
>>> As for access, the typical access is over http (or https).
>>
>> We're using svn+ssh here (NESCent)
>
> Let me just note that https is preferable to ssh for those poor slobs
> stuck behind a corporate firewall (svn happily prompts me for my proxy
> server's user/pass, then my https authentication realm's user/pass  
> - all
> then get cached in some .svn/ file that I don't have to worry about  
> again
> until my proxy server password changes once a month ...)
>
> -Aaron
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Mon Jun 18 08:47:56 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:47:56 -0400
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID: <B9BDBD4A-962D-4E83-8151-5D6EA8B69D3B@gmx.net>

Sounds like a great idea to me. -hilmar

On Jun 18, 2007, at 6:44 AM, Sendu Bala wrote:

> When the test suite runs currently, most (the intent is all) tests  
> skip
> if the test would require network (internet) access. This is to avoid
> tests failing not due to bugs in Bioperl code, but due to temporarily
> inaccessible servers. This is also to make running the test suite  
> faster.
>
> To do a complete test you currently have to set BIOPERLDEBUG to true,
> which activates the network test but also increases verbosity. This
> actually causes a problem, since when running the entire test suite  
> the
> additional debug information is more a hindrance than a help, since  
> the
> reams of printed information can hide significant warnings that may  
> also
> get printed. Its also ugly.
>
> The solution is to divorce activation of network tests from the  
> request
> for verbosity. The obvious implementation is to have another  
> environment
> variable, perhaps BIOPERLNETWORK. However, there is an opportunity  
> to do
> something more appropriate. The running of networking tests should  
> be a
> choice given to every end-user installing Bioperl. Debugging
> information, on the other hand, is only of interest to the developer
> working on a specific module under test, so can be left as a 'hidden'
> env var.
>
>
> I have just committed one possible implementation along these lines.
>
> You say:
> perl Build.PL
> as normal, and if you seem to have internet access it asks you if  
> you'd
> like to run network tests. The default answer is no. If you answer  
> yes,
> network tests will be enabled.
>
> You can alternatively say:
> perl Build.PL --network
> and if you seem to have internet access, network tests will be  
> enabled.
>
> Then you run the tests:
> ./Build test
> Any tests written to support the new system will then skip network  
> tests
> if they haven't been enabled.
>
> The only test I've written to support the new system is
> t/RemoteBlast.t:
> ./Build test --test_files t/RemoteBlast.t --verbose
>
>
> Adding support to test scripts consists of the following changes:
>
> + use Module::Build;
> + my $build = Module::Build->current(get_options => { network =>  
> {} });
> + my $do_network_tests = $build->notes('network');
>
> ! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests
> ---
> ! if (!$do_network_tests) { # skip network tests
>
>
> I propose adding this support to all test scripts that carry out  
> network
> tests. Does anyone have objections? Does anyone have alternate
> implementations that may be superior?
>
> I specifically suggest we don't use an env var in addition to the  
> above,
> because the multiple ways of doing things could lead to confusion.  
> Which
> takes priority? Did a user really have the networking tests turned on
> when he reported his test results?
>
>
> The one thing I need help with is identifying which tests attempt to
> access the internet. I think we caught most of them for the 1.5.2
> release, but I think there are more lurking around. Can anyone offer a
> way to systematically find at least the test scripts which access the
> internet, if not the specific tests within?
>
> Cheers,
> Sendu.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 18 08:55:53 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 07:55:53 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
Message-ID: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>

On Jun 18, 2007, at 7:44 AM, Hilmar Lapp wrote:

> Just curious - how do you cvs commit then to an external repository?
> Is that open in the firewall?
>
> It is true though that corporations typically will not permit any
> encrypted outgoing traffic through their firewall except https.
> sf.net only supports https for svn, AFAIK.
>
> 	-hilmar

If so it may be better to allow https, though I don't know how Chris  
D. and others feel about it.

Did we make a decision as to the fate of cvs if we get svn up and  
running?  Keep it around (assuming svn commits would be carried over  
to cvs and vice versa)?  Or see what happens over time?

chris


From sdavis2 at mail.nih.gov  Mon Jun 18 09:05:50 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 18 Jun 2007 09:05:50 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
Message-ID: <4676832E.5080704@mail.nih.gov>

aaron.j.mackey at gsk.com wrote:
>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>>
>>> As for access, the typical access is over http (or https).
>> We're using svn+ssh here (NESCent)
> 
> Let me just note that https is preferable to ssh for those poor slobs 
> stuck behind a corporate firewall (svn happily prompts me for my proxy 
> server's user/pass, then my https authentication realm's user/pass - all 
> then get cached in some .svn/ file that I don't have to worry about again 
> until my proxy server password changes once a month ...)

That would be my suggestion as well (although I added it only
parenthetically).

Sean


From hlapp at gmx.net  Mon Jun 18 09:13:27 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 09:13:27 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
Message-ID: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net>


On Jun 18, 2007, at 8:55 AM, Chris Fields wrote:

> Did we make a decision as to the fate of cvs if we get svn up-and- 
> running?  Keep it around (assuming svn commits would be carried  
> over to cvs and vice versa)?  Or see what happens over time?

Let's not plan for having cvs and svn writable repositories in  
parallel - that would create an administrative nightmare. Once the  
tests complete, there'll be a clean cut-over.

What Jason suggested is to try and continue a read-only (anonymous)  
cvs repository, updated from the svn repository that the developers  
use, aside from an anonymous svn repository mirroring the writable  
one. This would primarily be for maintaining working URLs for those  
folks who http-linked into the anonymous cvs repository. What I added  
earlier is that even if that fails to be feasible, you can achieve  
the goal using some small CGI script and apache redirect to map  
CVS-style links to the anonymous svn repository.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 18 09:31:35 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 08:31:35 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net>
Message-ID: <0E64DBD0-BBE9-411A-A146-70236EF558BB@uiuc.edu>


On Jun 18, 2007, at 8:13 AM, Hilmar Lapp wrote:

>
> On Jun 18, 2007, at 8:55 AM, Chris Fields wrote:
>
>> Did we make a decision as to the fate of cvs if we get svn up-and- 
>> running?  Keep it around (assuming svn commits would be carried  
>> over to cvs and vice versa)?  Or see what happens over time?
>
> Let's not plan for having cvs and svn writable repositories in  
> parallel - that would create an administrative nightmare. Once the  
> tests complete, there'll be a clean cut-over.

My thoughts as well.  Much simpler.

> What Jason suggested is to try and continue a read-only (anonymous)  
> cvs repository, updated from the svn repository that the developers  
> use, aside from an anonymous svn repository mirroring the writable  
> one. This would primarily be for maintaining working URLs for those  
> folks who http-linked into the anonymous cvs repository. What I  
> added earlier is that even if that fails to be feasible, you can  
> achieve the goal using some small CGI script and apache redirect to  
> map CVS-style links to the anonymous svn repository.
>
> 	-hilmar

I like the idea of a read-only cvs or a 'faux' cvs, though the former  
would initially be easier as we already have it available.  We could  
just lock it down at some switchover point to read-only (something I  
think Jason also suggested).

chris


From bix at sendu.me.uk  Mon Jun 18 09:13:33 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 14:13:33 +0100
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
Message-ID: <467684FD.3080300@sendu.me.uk>

Chris Fields wrote:
> 
> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>> If its going to be difficult and a hassle, for such an unnecessary 
>> thing I'm not sure its worth it. There are more pressing things to be 
>> done for Bioperl.
>>
>> If I can just run perltidy on the entire package and commit, I'd do 
>> it. If that's not appropriate, I won't.
> 
> The choices aren't necessarily all or nothing.  What about voluntary, 
> recommended use of a perltidy config file included with the 
> distribution, with additional 'caveats'?

I'm happy with that idea. Why not come up with something and make it 
available for us to try out?


Cheers,
Sendu.


From bix at sendu.me.uk  Mon Jun 18 09:26:36 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 14:26:36 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
Message-ID: <4676880C.9030009@sendu.me.uk>

Chris Fields wrote:
> If so it may be better to allow https, though I don't know how Chris  
> D. and others feel about it.

If it makes no difference to me as an end-user, I won't mind. But I 
won't want to enter my password even once, at the beginning of a 
session. If that's not possible with https, then ssh should be an option 
as well.


Unrelated, but it randomly just occurred to me: what happens to all the 
id lines at the top of modules? Eg:

$Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $

That's a cvs-specific thing, right? Do we delete them all? (Regardless, 
I wish we would, since they caused me no end of hassles during the 1.5.2 
release, doing updates across branches.)


> Did we make a decision as to the fate of cvs if we get svn up-and- 
> running?  Keep it around (assuming svn commits would be carried over  
> to cvs and vice versa)?  Or see what happens over time?

Well, I don't think hard decisions are possible until we know how it's 
going to work in practice. I tried setting up my own svn repository 
once, but didn't keep it and can't remember much about it.

So, I suppose we'll play it by ear and decide things later. Is someone 
out there actively doing something leading toward a demonstration of how 
it will be?


From cjfields at uiuc.edu  Mon Jun 18 09:58:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 08:58:34 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <467684FD.3080300@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
	<467684FD.3080300@sendu.me.uk>
Message-ID: <DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>


On Jun 18, 2007, at 8:13 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>> If its going to be difficult and a hassle, for such an unnecessary
>>> thing I'm not sure its worth it. There are more pressing things  
>>> to be
>>> done for Bioperl.
>>>
>>> If I can just run perltidy on the entire package and commit, I'd do
>>> it. If that's not appropriate, I won't.
>>
>> The choices aren't necessarily all or nothing.  What about voluntary,
>> recommended use of a perltidy config file included with the
>> distribution, with additional 'caveats'?
>
> I'm happy with that idea. Why not come up with something and make it
> available for us to try out?
>
>
> Cheers,
> Sendu.

Will do.  Maybe something that conforms to PBP; there's a PBP  
perltidy config on perlmonks, along with some emacs/vim related bits:

http://www.perlmonks.org/?node_id=516501

chris


From sdavis2 at mail.nih.gov  Mon Jun 18 10:03:35 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 18 Jun 2007 10:03:35 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4676880C.9030009@sendu.me.uk>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<4676880C.9030009@sendu.me.uk>
Message-ID: <467690B7.7090105@mail.nih.gov>

Sendu Bala wrote:
> Chris Fields wrote:
>> If so it may be better to allow https, though I don't know how Chris  
>> D. and others feel about it.
> 
> If it makes no difference to me as an end-user, I won't mind. But I 
> won't want to enter my password even once, at the beginning of a 
> session. If that's not possible with https, then ssh should be an option 
> as well.
> 
> 
> Unrelated, but it randomly just occurred to me: what happens to all the 
> id lines at the top of modules? Eg:
> 
> $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $
> 
> That's a cvs-specific thing, right? Do we delete them all? (Regardless, 
> I wish we would, since they caused me no end of hassles during the 1.5.2 
> release, doing updates across branches.)

See here:

http://svnbook.red-bean.com/en/1.0/ch07s02.html

Check out the section at the bottom having to do with svn:keywords.

Sean


From akarger at CGR.Harvard.edu  Mon Jun 18 10:10:57 2007
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 18 Jun 2007 10:10:57 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <46751EC7.8020609@sheffield.ac.uk>
References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>
	<46751EC7.8020609@sheffield.ac.uk>
Message-ID: <B9182BFF5B004245BABC12956EA6322E04AFA6BC@huls5.nucleus.harvard.edu>

 
> Just to clarify, subversion is available as command line for windows:
> http://subversion.tigris.org/project_packages.html
> 
> TortoiseSVN is another svn client with a GUI that integrates into the
> shell. I tried setting this up a while back to use ssh (via 
> PUTTY), but
> I wasn't successful. This may have been due to me just 
> starting out with
> svn or that it was harder to setup in an earlier version of 
> TortoiseSVN.
> 
> Does anyone have experience of setting up svn on Windows to 
> use ssh? If
> the changeover takes place, I'm happy to write some howto's 
> for setting
> up svn clients for Windows.

Here are some notes I wrote recently. I'm using this with command-line
svn, not TortoiseSVN. I would hope that it would work with Tortoise,
too, but I can't guarantee.

1. Run PuTTYgen (installed with PuTTY, probably in Start
menu->Programs->PuTTY) and follow directions to create a private key
file like C:\someplace\private_key.ppk and a public key. At this point,
you'll pick an ssh password, which is separate from your login password.

2. Get an account with the appropriate .ssh/authorized_keys file on the
host machine. (This is not Windows-specific. By the way, if you change
the lines of the authorized_keys file to start with, e.g., 
	command="svnserve -t -r /main/repos/dir",no-pty ssh-rsa AAAAB... comment
then (a) you're more secure because users can't open a real shell on the
computer, and (b) users don't need to type the repository directory in
their svn co commands.)

3. Set your environment variables: My Computer->Properties, Advanced
tab, click on Environment Variables. In the top half ("User variables
for ..."), click "New" and put in the variable name and value.

3a. Set the SVN_EDITOR environment variable to your favorite editor,
such as vim or emacs, or a full path to some other editor. If it's not
set, then either VISUAL or EDITOR must be set.

3b. Set the SVN_SSH environment variable to run PuTTY's "plink" program,
which is the Windows equivalent of command-line ssh. If you installed
PuTTY in the default location, set it to
	"C:/Program Files/PuTTY/plink.exe"
Note 1: use FORWARD slashes. Note 2: Include the quotation marks in the
environment variable.

4. When you want to start using svn, you'll need to run Pageant (Start
menu->Programs->PuTTY), select "Add Key", browse to your private key
file, and enter the ssh password you chose in step 1 (not your login
password). Pageant will stay running until you quit it or logout, so you
can have multiple svn checkins etc., and you only need to type in your
password once.

5. Now just run command-line svn commands the same way you would on UNIX
(modulo Windows' brain-dead shell).

-Amir Karger


From cjfields at uiuc.edu  Mon Jun 18 10:24:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 09:24:00 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4676880C.9030009@sendu.me.uk>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<4676880C.9030009@sendu.me.uk>
Message-ID: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>

On Jun 18, 2007, at 8:26 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> If so it may be better to allow https, though I don't know how  
>> Chris  D. and others feel about it.
>
> If it makes no difference to me as an end-user, I won't mind. But I  
> won't want to enter my password even once, at the beginning of a  
> session. If that's not possible with https, then ssh should be an  
> option as well.

Aaron pointed out in a related post that https access is the  
preferred option behind a corporate firewall (svn prompts for proxy  
user/pass, then caches it).  Not sure how Jason/Hilmar/Chris D. feel  
about https or supporting both https+ssh.

...

>> Did we make a decision as to the fate of cvs if we get svn up-and-  
>> running?  Keep it around (assuming svn commits would be carried  
>> over  to cvs and vice versa)?  Or see what happens over time?
>
> Well, I don't think hard decisions are possible until we know how  
> its going to work in practice. I tried setting up my own svn  
> repository once, but didn't keep it and can't remember much about it.

Agree; we'll need to work out specifics once we know how things work  
out using cvs2svn.  I think the idea is to test using a smaller  
distribution (maybe network or db) and move up from there.

> So, I suppose we'll play it by ear and decide things later. Is  
> someone out there actively doing something leading toward a  
> demonstration of how it will be?

George Hartzell is going to test it out, I believe, and will post  
something when he can.

chris


From dmessina at wustl.edu  Mon Jun 18 10:54:31 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 18 Jun 2007 09:54:31 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
	<467684FD.3080300@sendu.me.uk>
	<DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>
Message-ID: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu>

[Chris F]
> Will do.  Maybe something that conforms to PBP; there's a PBP
> perltidy config on perlmonks, along with some emacs/vim related bits:
>
> http://www.perlmonks.org/?node_id=516501


FYI, perltidy now has a built-in -pbp flag:

[from perltidy-20070508]
> -pbp, --perl-best-practices
> -pbp is an abbreviation for the parameters in the book Perl Best  
> Practices by Damian Conway:
>
>     -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1  
> -nsfs -nolq
>     -wbb="% + - * / x != == >= <= =~ !~ < > | & =
>           **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x="
> Note that the -st and -se flags make perltidy act as a filter on  
> one file only. These can be overridden with -nst and -nse if  
> necessary.
>
[full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/ 
bin/perltidy]


Dave


From dmessina at wustl.edu  Mon Jun 18 11:04:10 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 18 Jun 2007 10:04:10 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID: <C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>

Awesome, Sendu! Really glad you implemented this.


> Can anyone offer a
> way to systematically find at least the test scripts which access the
> internet, if not the specific tests within?

I think tests would be accessing the net indirectly through a BioPerl  
module (which may also be using indirect access), so it'd be hard to  
come up with a universal glob for that.

However:

	% grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l
	108

	% ls -1 bioperl-live/t | wc -l
	248

Less than half of the test files use BIOPERLDEBUG, so that narrows  
down the possibilities...
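
A rough Perl equivalent of the grep above, as a sketch only: it walks a test directory and splits the files into those that mention a pattern and those that do not, so the "do not" list can be reviewed by hand. The helper name partition_by_pattern and the script name in the usage comment are illustrative assumptions, not part of BioPerl.

```perl
use strict;
use warnings;
use File::Find;

# Split the files under $dir into those whose contents match $regex
# and those that do not. Returns two array refs of sorted paths.
# (partition_by_pattern is a hypothetical helper name.)
sub partition_by_pattern {
    my ($dir, $regex) = @_;
    my (@with, @without);
    find({
        no_chdir => 1,
        wanted   => sub {
            my $path = $File::Find::name;
            return unless -f $path;
            open my $fh, '<', $path or return;    # skip unreadable files
            my $hit = 0;
            while (my $line = <$fh>) {
                if ($line =~ $regex) { $hit = 1; last }
            }
            close $fh;
            if ($hit) { push @with, $path } else { push @without, $path }
        },
    }, $dir);
    return ([sort @with], [sort @without]);
}

# Usage (assumed script name): perl scan_tests.pl bioperl-live/t
if (@ARGV) {
    my ($with, $without) = partition_by_pattern($ARGV[0], qr/BIOPERLDEBUG/);
    printf "%d files mention BIOPERLDEBUG, %d do not\n",
        scalar @$with, scalar @$without;
}
```

The files in the second list are the ones that never check BIOPERLDEBUG at all; some of them may still touch the network, so this only narrows the search.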

Dave


From bix at sendu.me.uk  Mon Jun 18 11:09:19 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 16:09:19 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
Message-ID: <4676A01F.30205@sendu.me.uk>

David Messina wrote:
>> Can anyone offer a
>> way to systematically find at least the test scripts which access the
>> internet, if not the specific tests within?
> 
> I think tests would be accessing the net indirectly through a BioPerl 
> module (which may also be using indirect access), so it'd be hard to 
> come up with a universal glob for that.
> 
> However:
> 
>     % grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l
>     108
> 
>     % ls -1 bioperl-live/t | wc -l
>     248
> 
> Less than half of the test files use BIOPERLDEBUG, so that narrows down 
> the possibilities...

Not necessarily. The problem is that there may be test scripts that have 
never even tried to skip network tests, and therefore don't use 
BIOPERLDEBUG. (Or that chose their own way to decide when to skip.)

I was thinking along the lines of, does anyone know how to monitor 
accesses to the network card (or equivalent), getting information on 
which program (test script) requested the access?


From cjfields at uiuc.edu  Mon Jun 18 11:41:28 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 10:41:28 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
	<467684FD.3080300@sendu.me.uk>
	<DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>
	<67E635BD-FC19-4046-949B-358B671299E6@wustl.edu>
Message-ID: <B3EDFCDD-0F3D-47C8-B3A8-A428F24B265A@uiuc.edu>


On Jun 18, 2007, at 9:54 AM, David Messina wrote:

> [Chris F]
>> Will do.  Maybe something that conforms to PBP; there's a PBP
>> perltidy config on perlmonks, along with some emacs/vim related bits:
>>
>> http://www.perlmonks.org/?node_id=516501
>
>
> FYI, perltidy now has a built-in -pbp flag:
>
> [from perltidy-20070508]
>> -pbp, --perl-best-practices
>> -pbp is an abbreviation for the parameters in the book Perl Best
>> Practices by Damian Conway:
>>
>>     -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1
>> -nsfs -nolq
>>     -wbb="% + - * / x != == >= <= =~ !~ < > | & =
>>           **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x="
>> Note that the -st and -se flags make perltidy act as a filter on
>> one file only. These can be overridden with -nst and -nse if
>> necessary.
>>
> [full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/
> bin/perltidy]
>
>
> Dave

<slaps head>  Makes sense that would eventually be incorporated.

If so, there's no need to include a config (unless we want to stray
from PBP style).  We can just recommend everyone use that setting.

chris


From cjfields at uiuc.edu  Mon Jun 18 12:06:26 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 11:06:26 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676A01F.30205@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
Message-ID: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>


On Jun 18, 2007, at 10:09 AM, Sendu Bala wrote:

> David Messina wrote:
>>> ...
>> Less than half of the test files use BIOPERLDEBUG, so that narrows  
>> down
>> the possibilities...
>
> Not necessarily. The problem is that there may be test scripts that  
> have
> never even tried to skip network tests, and therefore don't use
> BIOPERLDEBUG. (Or that chose their own way to decide when to skip.)
>
> I was thinking along the lines of, does anyone know how to monitor
> accesses to the network card (or equivalent), getting information on
> which program (test script) requested the access?

EUtilities.t uses network tests predominately.  I'll switch over when  
I commit everything from the overhaul.

Couldn't you enable BIOPERLDEBUG, disable network access, then  
iterate through tests checking for those which fail or skip?  I think  
Test::Harness has a way to do this, using execute_tests().
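
One way the idea above might be sketched without Test::Harness: run each test script in a child perl with BIOPERLDEBUG set, then classify it by exit status and by TAP output. This assumes the network has already been made unavailable by some outside means, that the scripts emit ordinary "ok"/"not ok" lines, and a Unix-like system for the list-form pipe open. classify_test is a hypothetical helper, not a Test::Harness function.

```perl
use strict;
use warnings;

# Run one test script in a child perl and classify its result as
# 'fail' (non-zero exit or a "not ok" line that is not a TODO),
# 'skip' (TAP output contains a skip directive), or 'pass'.
# The pairs in %$env are added to the child's environment.
sub classify_test {
    my ($script, $env) = @_;
    local @ENV{ keys %$env } = values %$env;    # restored on return
    open my $out, '-|', $^X, $script
        or return 'fail';
    my ($saw_skip, $saw_fail) = (0, 0);
    while (my $line = <$out>) {
        $saw_skip = 1 if $line =~ /#\s*skip/i;
        $saw_fail = 1 if $line =~ /^not ok/ && $line !~ /#\s*todo/i;
    }
    close $out;                                  # sets $? to the child's status
    return 'fail' if $? != 0 || $saw_fail;
    return $saw_skip ? 'skip' : 'pass';
}

# Usage (assumed script name): perl classify.pl t/*.t
for my $script (@ARGV) {
    printf "%-6s %s\n", classify_test($script, { BIOPERLDEBUG => 1 }), $script;
}
```

Scripts reported as 'fail' or 'skip' with the network down are the candidates for containing network tests.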

chris


From bix at sendu.me.uk  Mon Jun 18 12:34:38 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 17:34:38 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
	<082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
Message-ID: <4676B41E.3050706@sendu.me.uk>

Chris Fields wrote:
> Couldn't you enable BIOPERLDEBUG, disable network access, then iterate 
> through tests checking for those which fail or skip?

Yes, good idea, though my dev machine is also my email/webserver so I'd 
rather come up with an alternate solution than one involving 'disable 
network access'.

Still, that's what I'll probably end up doing. Cheers!


Oh, Chris, Spiros, how goes the Test::More conversion? I might want to 
wait for you to finish, or join in? If you're not going to have time to 
do any more in the next few weeks, can you please update 
http://www.bioperl.org/wiki/TestMoreProgress removing your name (or in 
the opposite case, add your name in)? It's not quite clear to me which 
tests are assigned to whom. Can someone clarify what the markings mean?

Cheers,
Sendu.


From cjfields at uiuc.edu  Mon Jun 18 12:43:31 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 11:43:31 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676B41E.3050706@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
	<082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
	<4676B41E.3050706@sendu.me.uk>
Message-ID: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>


On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> Couldn't you enable BIOPERLDEBUG, disable network access, then  
>> iterate through tests checking for those which fail or skip?
>
> Yes, good idea, though my dev machine is also my email/webserver so  
> I'd rather come up with an alternate solution than one involving  
> 'disable network access'.
>
> Still, that's what I'll probably end up doing. Cheers!
>
>
> Oh, Chris, Spiros, how goes the Test::More conversion? I might want  
> to wait for you to finish, or join in? If you're not going to have  
> time to do any more in the next few weeks, can you please update  
> http://www.bioperl.org/wiki/TestMoreProgress removing your name (or  
> in the opposite case, add your name in)? It's not quite clear to me  
> which tests are assigned to whom. Can someone clarify what the  
> markings mean?
>
> Cheers,
> Sendu.

Not sure how far along Spiros is; I handed it over after I finished
up to the 'Q' tests.  In general the ones struck out have been
converted over, and ones with names next to them have been claimed.  If
you need help I'll probably start back up again to finish them off; we
just need to divvy them up.

chris


From george.heller at yahoo.com  Mon Jun 18 13:07:59 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 10:07:59 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net>
Message-ID: <218165.62089.qm@web56505.mail.re3.yahoo.com>

What exactly is the "node n" in the query below? When I issue this query, it says, 
   
  relation "node" does not exist.
   
  I tried to use the get_all_Descendents method but it looks like in order to do a recursive call it calls the method each_Descendent. This method is not implemented in Bio::DB::Taxonomy. It just has a single line,
   
  shift->throw_not_implemented();
   
  Thanks.
  George.

Hilmar Lapp <hlapp at gmx.net> wrote:
  I'm a bit confused - it sounds like you have set up a local BioSQL 
database and loaded the NCBI taxonomy into the database. You can now 
use simple SQL to retrieve all descendants of a node in the tree 
given its NCBI taxonID such as

SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
WHERE
n.ncbi_taxon_id = :taxonID
AND tn.left_value > n. left_value
AND tn.right_value < n.right_value
AND tn.taxon_id = tnm.taxon_id
AND tn.name_class = 'scientific_name'

BioPerl doesn't have a Taxonomy::biosql module yet (though this would 
seem like a worthwhile thing to add), so you can't use the 
Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does support the flat-file download of the 
NCBI taxonomy database and indexes it, so you can simply use 
Taxonomy::{get_taxon,get_all_Descendents} with the flatfile download 
to achieve what you want in fewer than 5 lines of perl.

Although the recursive implementation of Taxonomy::get_all_Descendents() 
won't be lightning fast, it may still be perfectly fine for your 
application - are you sure it is not?

-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

> Thanks. And how can I assign the $node here in the below code, such 
> that I can reference it to a particular taxon id record? I want to 
> retrieve all the descendents from the taxonomy hierarchy, given a 
> particular taxon id.
>
> I have a local db setup, in which I have uploaded data using the 
> load_ncbi_taxonomy.pl script.
>
> Thanks.
> George
>
> Jason Stajich wrote:
> I assume you already figured out how to setup a local taxonomydb?
>
>
> You just want the extant species/leaves of the tree
>
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>
>
> -jason
> On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
> Hi all,
>
>
> Can anyone point me to some example that uses the 
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at 
> this, and I am not quite sure how to implement it.
>
>
> Thanks.
> George
>
>
> Sendu Bala wrote:
> George Heller wrote:
> Hi all,
>
>
> I am looking at extracting the taxonomy hierarchy for some taxon 
> ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
>
>
> Any ideas on the way I can go about doing this?
>
>
> Well, you'll use Bio::DB::Taxonomy presumably, and 
> each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
>
>
> If you happen to code up something neat and efficient, why not 
> share it
> with us and we could add it to the Taxonomy module(s).
>
>
>
>
>
>
>
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>
>
>
>
>
>

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================




From jason at bioperl.org  Mon Jun 18 13:53:28 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 10:53:28 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com>
References: <218165.62089.qm@web56505.mail.re3.yahoo.com>
Message-ID: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>

It is implemented in the implementing class - DB::Taxonomy is just  
the base class. For example see the flatfile implementation  
Bio::DB::Taxonomy::flatfile

See scripts/taxa/local_taxonomydb_query.PLS for an example that uses it;
the nodes and names files come from the NCBI taxonomy database.

Here is an un-debugged copy+paste for your question that *should* work.

use Bio::DB::Taxonomy;
my $idx_dir = '/tmp';

# paths to the NCBI taxonomy dump files
my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = Bio::DB::Taxonomy->new(-source    => 'flatfile',
                                -nodesfile => $nodesfile,
                                -namesfile => $namesfile,
                                -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
# keep only the leaves (extant species) among all descendants
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
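
For anyone curious what the recursion amounts to, here is a stand-alone sketch of the "recursing sub" approach suggested earlier in the thread. It is not BioPerl's own code: it only assumes that each node answers each_Descendent with its direct children, and Demo::Node is a hypothetical stand-in class so the example runs without BioPerl installed.

```perl
use strict;
use warnings;

# Collect every descendant of $node, depth first, by repeatedly asking
# each node for its direct children via each_Descendent.
sub all_descendants {
    my ($node) = @_;
    my @found;
    for my $child ($node->each_Descendent) {
        push @found, $child, all_descendants($child);
    }
    return @found;
}

# Hypothetical stand-in for a tree node (not a BioPerl class).
package Demo::Node;
sub new {
    my ($class, $id, @kids) = @_;
    return bless { id => $id, kids => [@kids] }, $class;
}
sub id              { $_[0]{id} }
sub each_Descendent { @{ $_[0]{kids} } }
sub is_Leaf         { !@{ $_[0]{kids} } }

package main;

my $tree = Demo::Node->new('root',
    Demo::Node->new('A', Demo::Node->new('A1'), Demo::Node->new('A2')),
    Demo::Node->new('B'));
my @leaves = grep { $_->is_Leaf } all_descendants($tree);
print join(',', map { $_->id } @leaves), "\n";    # prints A1,A2,B
```

With a real taxonomy node the same grep on is_Leaf applies; only the node class changes.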


-jason


--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From hlapp at gmx.net  Mon Jun 18 18:10:00 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:10:00 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<4676880C.9030009@sendu.me.uk>
	<278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>
Message-ID: <989DBD68-896E-4FB9-9413-4A1060E88ABD@gmx.net>

https is working fine for me for sf.net repositories, and I only have  
to enter the password upon first commit (since checkout doesn't even  
need a password).

	-hilmar

On Jun 18, 2007, at 10:24 AM, Chris Fields wrote:

> Not sure how Jason/Hilmar/Chris D. feel about https or supporting  
> both https+ssh

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From george.heller at yahoo.com  Mon Jun 18 18:18:21 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 15:18:21 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>
Message-ID: <904670.24974.qm@web56513.mail.re3.yahoo.com>

I tried running the below mentioned script and I seem to be getting the following error:
   
  Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
Compilation failed in require at my.pl line 7.
BEGIN failed--compilation aborted at my.pl line 7.

  My script looks something like,
   
  #!/usr/bin/perl
  use strict;
  #use warnings;
  use DBI;
  use Bio::Tree::Node;
  use Bio::DB::Taxonomy;
  use Bio::DB::Taxonomy::flatfile;
  my $idx_dir = '/tmp';

  my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
  my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                                 -nodesfile => $nodesfile,
                                 -namesfile => $namesfile,
                                 -directory => $idx_dir);
  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

  foreach my $field (@extant_children) {
     print "$field";
     print "|";
     print "\n";
  }

  And I am running the script using the command,
   
  perl myscript.pl -v --names names.dmp --nodes nodes.dmp
   
  and I have the nodes.dmp and names.dmp files in the current directory.
   
  Thanks,
  George
  



From hlapp at gmx.net  Mon Jun 18 18:27:19 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:27:19 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com>
References: <218165.62089.qm@web56505.mail.re3.yahoo.com>
Message-ID: <DEB0D23B-4FEC-418A-8AAB-FF4CBF4DAF65@gmx.net>


On Jun 18, 2007, at 1:07 PM, George Heller wrote:

> What exactly is the "node n" in the query below. When I issue this  
> query, it says,

Sorry, replace with "taxon". Jason answered the rest.
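
To illustrate what the left_value/right_value comparison in that query does, here is a small pure-Perl sketch of the same nested-set test. The rows and their numbers are made up for the example and do not come from a real BioSQL database; descendant_ids is a hypothetical helper.

```perl
use strict;
use warnings;

# Toy rows shaped like (taxon_id, left_value, right_value); the numbers are
# invented for illustration and do not come from the NCBI taxonomy.
my @taxa = (
    { taxon_id => 1, left_value => 1, right_value => 10 },   # root
    { taxon_id => 2, left_value => 2, right_value => 7  },   # child of 1
    { taxon_id => 3, left_value => 3, right_value => 4  },   # child of 2
    { taxon_id => 4, left_value => 5, right_value => 6  },   # child of 2
    { taxon_id => 5, left_value => 8, right_value => 9  },   # child of 1
);

# Return the ids of all rows nested strictly inside the row with id $id,
# i.e. the same test as  tn.left_value > n.left_value AND
# tn.right_value < n.right_value  in the query.
sub descendant_ids {
    my ($id, @rows) = @_;
    my ($n) = grep { $_->{taxon_id} == $id } @rows;
    return () unless $n;
    return map  { $_->{taxon_id} }
           grep { $_->{left_value}  > $n->{left_value}
               && $_->{right_value} < $n->{right_value} } @rows;
}

print join(',', descendant_ids(2, @taxa)), "\n";   # prints 3,4
```

Because a node's whole subtree sits between its own left and right values, one range comparison finds every descendant without any recursion.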

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 18 18:33:40 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 17:33:40 -0500
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <904670.24974.qm@web56513.mail.re3.yahoo.com>
References: <904670.24974.qm@web56513.mail.re3.yahoo.com>
Message-ID: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>

As the error implies, your local version of perl doesn't seem to support
weak references, which means it doesn't have Scalar::Util (which was
added to core after perl 5.6.1, I think).  Try installing
Scalar::Util to see what happens.
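
A quick way to check this on the machine in question might look like the sketch below. It assumes that a Scalar::Util without weak-reference support dies when weaken is imported or called, which I have not verified for every old release; weaken_available is a hypothetical helper name.

```perl
use strict;
use warnings;

# Return 1 if this perl's Scalar::Util can create weak references, else 0.
sub weaken_available {
    my $ok = eval {
        require Scalar::Util;
        # Expected to die here on installs that lack weaken support.
        Scalar::Util->import(qw(weaken isweak));
        my $target = {};
        my $ref    = $target;
        Scalar::Util::weaken($ref);
        Scalar::Util::isweak($ref);
    };
    return $ok ? 1 : 0;
}

printf "perl %vd, Scalar::Util %s, weaken %s\n",
    $^V,
    (eval { require Scalar::Util; Scalar::Util->VERSION } || 'missing'),
    (weaken_available() ? 'available' : 'NOT available');
```

If this reports weaken as unavailable, reinstalling Scalar::Util (the Scalar-List-Utils distribution on CPAN) with a working C compiler is the usual next thing to try.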

chris

On Jun 18, 2007, at 5:18 PM, George Heller wrote:

> I tried running the below mentioned script and I seem to be getting  
> the following error:
>
>   Weak references are not implemented in the version of perl at
> /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
> BEGIN failed--compilation aborted at
> /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
> Compilation failed in require at my.pl line 7.
> BEGIN failed--compilation aborted at my.pl line 7.
>
>   My script looks something like,
>
>   #!/usr/bin/perl
>   use strict;
>   #use warnings;
>   use DBI;
>   use Bio::Tree::Node;
>   use Bio::DB::Taxonomy;
>   use Bio::DB::Taxonomy::flatfile;
>   my $idx_dir = '/tmp';
>
>   my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
>   my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
>                                  -nodesfile => $nodesfile,
>                                  -namesfile => $namesfile,
>                                  -directory => $idx_dir);
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>   foreach my $field (@extant_children) {
>      print "$field";
>      print "|";
>      print "\n";
>   }
>
>   And I am running the script using the command,
>
>   perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>
>   and I have the nodes.dmp and names.dmp files in the current  
> directory.
>
>   Thanks,
>   George
>
>
> Jason Stajich <jason at bioperl.org> wrote:
>   It is implemented in the implementing class - DB::Taxonomy is  
> just the base class. For example see the flatfile implementation  
> Bio::DB::Taxonomy::flatfile
>
>   See the scripts/taxa/local_taxonomydb_query.PLS for example using  
> it:
>   nodes and names are from NCBI taxonomy database.
>
>
>   Here is an un-debugged copy+paste for your question that *should*  
> work.
>
>
>   use Bio::DB::Taxonomy;
>   my $idx_dir = '/tmp';
>
>   my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
>   my $db = Bio::DB::Taxonomy->new(-source    => 'flatfile',
>                                   -nodesfile => $nodesfile,
>                                   -namesfile => $namesfile,
>                                   -directory => $idx_dir);
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>
>
>
>   -jason
>
>     On Jun 18, 2007, at 10:07 AM, George Heller wrote:
>
>     What exactly is the "node n" in the query below. When I issue  
> this query, it says,
>
>
>     relation "node" does not exist.
>
>
>     I tried to use the get_all_Descendents method but it looks like  
> in order to do a recursive call it calls the method  
> each_Descendent. This method is not implemented in  
> Bio::DB::Taxonomy. It just has a single line,
>
>
>     shift->throw_not_implemented();
>
>
>     Thanks.
>     George.
>
>
>   Hilmar Lapp <hlapp at gmx.net> wrote:
>     I'm a bit confused - it sounds like you have set up a local BioSQL
>   database and loaded the NCBI taxonomy into the database. You can now
>   use simple SQL to retrieve all descendants of a node in the tree
>   given its NCBI taxonID such as
>
>
>   SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
>   WHERE
>   n.ncbi_taxon_id = :taxonID
>   AND tn.left_value > n.left_value
>   AND tn.right_value < n.right_value
>   AND tn.taxon_id = tnm.taxon_id
>   AND tn.name_class = 'scientific_name'
>
>
>   BioPerl doesn't have a Taxonomy::biosql module yet (though this would
>   seem like a worthwhile thing to add), so you can't use the
>   Bio::DB::Taxonomy interface to do this against a BioSQL instance.
>
>
>   However, BioPerl does have support for the flat-file download of the
>   NCBI taxonomy database and indexes it, so you can simply use
>   Taxonomy::{get_taxon,get_all_Descendents} using the flatfile download
>   to achieve what you wanted to do in less than 5 lines of perl.
>
>
>   Although the recursive implementation of Taxonomy::get_all_Descendents()
>   won't be lightning fast, it may still be perfectly fine for your
>   application - are you sure it is not?
>
>
>   -hilmar
>
>
>   On Jun 18, 2007, at 12:21 AM, George Heller wrote:
>
>
>     Thanks. And how can I assign the $node here in the below code, such
>   that I can reference it to a particular taxon id record? I want to
>   retrieve all the descendents from the taxonomy hierarchy, given a
>   particular taxon id.
>
>
>   I have a local db setup, in which I have uploaded data using the
>   load_ncbi_taxonomy.pl script.
>
>
>   Thanks.
>   George
>
>
>   Jason Stajich wrote:
>   I assume you already figured out how to setup a local taxonomydb?
>
>
>
>
>   You just want the extant species/leaves of the tree
>
>
>
>
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>
>
>
>
>
>   -jason
>   On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
>
>   Hi all,
>
>
>
>
>   Can anyone point me to some example that uses the
>   get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
>   this, and I am not quite sure how to implement it.
>
>
>
>
>   Thanks.
>   George
>
>
>
>
>   Sendu Bala wrote:
>   George Heller wrote:
>   Hi all,
>
>
>
>
>   I am looking at extracting the taxonomy hierarchy for some taxon
>   ids.
>   What I plan to do is, for a given taxon id, say 33090, I want to
>   extract all taxon ids that are children of this species. I do not
>   just want the immediate children, but the children's children and so
>   on.
>
>
>
>
>   Any ideas on the way I can go about doing this?
>
>
>
>
>   Well, you'll use Bio::DB::Taxonomy presumably, and
>   each_Descendent in
>   some kind of looping structure. Most easily a recursing sub.
>
>
>
>
>   If you happen to code up something neat and efficient, why not
>   share it
>   with us and we could add it to the Taxonomy module(s).
>
>   --
>   Jason Stajich
>   jason at bioperl.org
>   http://jason.open-bio.org/
>
>   --
>   ===========================================================
>   : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
>   ===========================================================
>
>     --
>   Jason Stajich
>   jason at bioperl.org
>   http://jason.open-bio.org/
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Mon Jun 18 18:50:38 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:50:38 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>
References: <904670.24974.qm@web56513.mail.re3.yahoo.com>
	<707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>
Message-ID: <F433CCB4-781D-480E-8EF5-CD68E70B27B8@gmx.net>

The perl version appears to be 5.8.5 though, so something strange  
appears to be going on too.

George, can you please post the output of

	$ /usr/bin/perl -V
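
Alongside the full configuration dump, it may also help to see which
Scalar::Util this perl actually loads, since the weak-reference error
appears to be raised by that module rather than by the interpreter
itself. A minimal sketch, assuming only that Scalar::Util is somewhere
on @INC (the helper name scalar_util_report is made up for illustration):

```perl
use strict;
use warnings;
use Scalar::Util ();   # load the module without importing anything

# Hypothetical helper: report the interpreter version and which
# Scalar::Util file was loaded, so a stale or pure-Perl copy stands out.
sub scalar_util_report {
    my $version = defined $Scalar::Util::VERSION
                ? $Scalar::Util::VERSION
                : 'unknown';
    return sprintf("perl %s, Scalar::Util %s loaded from %s",
                   $], $version, $INC{'Scalar/Util.pm'});
}

print scalar_util_report(), "\n";
```

If the reported path is under site_perl rather than the core library
directory, an older separately installed copy may be shadowing the one
that shipped with perl.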

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:

> As the error implies, your local version of perl doesn't seem to support
> weak references, which means it doesn't have Scalar::Util (which was
> added to core after perl 5.6.1, I think).  Try installing
> Scalar::Util to see what happens.
>
> chris
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From george.heller at yahoo.com  Mon Jun 18 19:05:42 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 16:05:42 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <F433CCB4-781D-480E-8EF5-CD68E70B27B8@gmx.net>
Message-ID: <706979.34648.qm@web56509.mail.re3.yahoo.com>

This is the output of /usr/bin/perl -V

Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
  Platform:
    osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
    uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.3.4'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
  
Characteristics of this binary (from libperl):
  Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
  Built under linux
  Compiled at Jul 24 2006 18:28:10
  @INC:
    /usr/lib/perl5/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/5.8.5
    /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.5
    /usr/lib/perl5/site_perl/5.8.4
    /usr/lib/perl5/site_perl/5.8.3
    /usr/lib/perl5/site_perl/5.8.2
    /usr/lib/perl5/site_perl/5.8.1
    /usr/lib/perl5/site_perl/5.8.0
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.5
    /usr/lib/perl5/vendor_perl/5.8.4
    /usr/lib/perl5/vendor_perl/5.8.3
    /usr/lib/perl5/vendor_perl/5.8.2
    /usr/lib/perl5/vendor_perl/5.8.1
    /usr/lib/perl5/vendor_perl/5.8.0
    /usr/lib/perl5/vendor_perl
   
  Thanks.
  George


From jason at bioperl.org  Mon Jun 18 19:22:08 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 16:22:08 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <706979.34648.qm@web56509.mail.re3.yahoo.com>
References: <706979.34648.qm@web56509.mail.re3.yahoo.com>
Message-ID: <C93DF7A1-20AC-4474-BBC6-0C2598406EEB@bioperl.org>

Try installing the latest Scalar::Util
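
One way to confirm whether the installed Scalar::Util is the problem,
before and after upgrading, is to try importing weaken inside an eval.
This is only a sketch; the sub name can_weaken is an assumption for
illustration and is not part of BioPerl or Scalar::Util:

```perl
use strict;
use warnings;
use Scalar::Util ();   # loading alone should succeed even without weaken

# Hypothetical helper: returns 1 if Scalar::Util can export weaken(),
# otherwise 0, leaving the reason in $@.
sub can_weaken {
    my $ok = eval {
        Scalar::Util->import('weaken');   # dies if weaken is unsupported
        1;
    };
    return $ok ? 1 : 0;
}

if (can_weaken()) {
    print "weaken is available (Scalar::Util ",
          Scalar::Util->VERSION, ")\n";
} else {
    print "weaken is not available: $@";
}
```

If this reports that weaken is unavailable on 5.8.5, the copy of
Scalar::Util being loaded is the more likely culprit than the perl
version itself.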

On Jun 18, 2007, at 4:05 PM, George Heller wrote:

> This is the output of /usr/bin/perl -V
>
> Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
>   Platform:
>     osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386- 
> linux-thread-multi
>     uname='linux hs20-bc1-4.build.redhat.com  
> 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686  
> i686 i386 gnulinux '
>     config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 - 
> mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost - 
> Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. - 
> Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux - 
> Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads - 
> Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db - 
> Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio - 
> Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/ 
> less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
>     hint=recommended, useposix=true, d_sigaction=define
>     usethreads=define use5005threads=undef useithreads=define  
> usemultiplicity=define
>     useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
>     use64bitint=undef use64bitall=undef uselongdouble=undef
>     usemymalloc=n, bincompat5005=undef
>   Compiler:
>     cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno- 
> strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE - 
> D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
>     optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
>     cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict- 
> aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
>     ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)',  
> gccosandvers=''
>     intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
>     d_longlong=define, longlongsize=8, d_longdbl=define,  
> longdblsize=12
>     ivtype='long', ivsize=4, nvtype='double', nvsize=8,  
> Off_t='off_t', lseeksize=8
>     alignbytes=4, prototype=define
>   Linker and Libraries:
>     ld='gcc', ldflags =' -L/usr/local/lib'
>     libpth=/usr/local/lib /lib /usr/lib
>     libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil - 
> lpthread -lc
>     perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
>     libc=/lib/libc-2.3.4.so, so=so, useshrplib=true,  
> libperl=libperl.so
>     gnulibc_version='2.3.4'
>   Dynamic Linking:
>     dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,- 
> E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
>     cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
>
> Characteristics of this binary (from libperl):
>   Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS  
> USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
>   Built under linux
>   Compiled at Jul 24 2006 18:28:10
>   @INC:
>     /usr/lib/perl5/5.8.5/i386-linux-thread-multi
>     /usr/lib/perl5/5.8.5
>     /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.5
>     /usr/lib/perl5/site_perl/5.8.4
>     /usr/lib/perl5/site_perl/5.8.3
>     /usr/lib/perl5/site_perl/5.8.2
>     /usr/lib/perl5/site_perl/5.8.1
>     /usr/lib/perl5/site_perl/5.8.0
>     /usr/lib/perl5/site_perl
>     /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.5
>     /usr/lib/perl5/vendor_perl/5.8.4
>     /usr/lib/perl5/vendor_perl/5.8.3
>     /usr/lib/perl5/vendor_perl/5.8.2
>     /usr/lib/perl5/vendor_perl/5.8.1
>     /usr/lib/perl5/vendor_perl/5.8.0
>     /usr/lib/perl5/vendor_perl
>
>   Thanks.
>   George
>     .
>
> Hilmar Lapp <hlapp at gmx.net> wrote:
>   The perl version appears to be 5.8.5 though, so something strange
> appears to be going on too.
>
> George, can you please post the output of
>
> $ /usr/bin/perl -V
>
> -hilmar
>
> On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
>
>> As the error implies, your local version of perl doesn't seem to support
>> weak references, which means it doesn't have Scalar::Util (which was
>> added to core after perl 5.6.1, I think). Try installing
>> Scalar::Util to see what happens.
>>
>> chris
>>
>> On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>>
>>> I tried running the below mentioned script and I seem to be getting
>>> the following error:
>>>
>>> Weak references are not implemented in the version of perl at
>>> /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
>>> BEGIN failed--compilation aborted at
>>> /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
>>> Compilation failed in require at my.pl line 7.
>>> BEGIN failed--compilation aborted at my.pl line 7.
>>>
>>> My script looks something like,
>>>
>>> #!/usr/bin/perl
>>> use strict;
>>> #use warnings;
>>> use DBI;
>>> use Bio::Tree::Node;
>>> use Bio::DB::Taxonomy;
>>> use Bio::DB::Taxonomy::flatfile;
>>> my $idx_dir = '/tmp';
>>>
>>> my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
>>> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>>> -nodesfile => $nodesfile,
>>> -namesfile => $namesfile,
>>> -directory => $idx_dir);
>>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>>> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>>
>>> foreach my $field (@extant_children) {
>>> print "$field";
>>> print "|";
>>> print "\n";
>>> }
>>>
>>> And I am running the script using the command,
>>>
>>> perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>>>
>>> and I have the nodes.dmp and names.dmp files in the current
>>> directory.
>>>
>>> Thanks,
>>> George
>>>
>>>
>>> Jason Stajich wrote:
>>> It is implemented in the implementing class - DB::Taxonomy is
>>> just the base class. For example see the flatfile implementation
>>> Bio::DB::Taxonomy::flatfile
>>>
>>> See the scripts/taxa/local_taxonomydb_query.PLS for example using
>>> it:
>>> nodes and names are from NCBI taxonomy database.
>>>
>>>
>>> Here is an un-debugged copy+paste for your question that *should*
>>> work.
>>>
>>>
>>> use Bio::DB::Taxonomy;
>>> my $idx_dir = '/tmp';
>>>
>>> my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
>>> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>>> -nodesfile => $nodesfile,
>>> -namesfile => $namesfile,
>>> -directory => $idx_dir);
>>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>>> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>>
>>>
>>>
>>>
>>> -jason
>>>
>>> On Jun 18, 2007, at 10:07 AM, George Heller wrote:
>>>
>>> What exactly is the "node n" in the query below. When I issue
>>> this query, it says,
>>>
>>>
>>> relation "node" does not exist.
>>>
>>>
>>> I tried to use the get_all_Descendents method but it looks like
>>> in order to do a recursive call it calls the method
>>> each_Descendent. This method is not implemented in
>>> Bio::DB::Taxonomy. It just has a single line,
>>>
>>>
>>> shift->throw_not_implemented();
>>>
>>>
>>> Thanks.
>>> George.
>>>
>>>
>>> Hilmar Lapp wrote:
>>> I'm a bit confused - it sounds like you have set up a local BioSQL
>>> database and loaded the NCBI taxonomy into the database. You can now
>>> use simple SQL to retrieve all descendants of a node in the tree
>>> given its NCBI taxonID such as
>>>
>>>
>>> SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
>>> WHERE
>>> n.ncbi_taxon_id = :taxonID
>>> AND tn.left_value > n.left_value
>>> AND tn.right_value < n.right_value
>>> AND tn.taxon_id = tnm.taxon_id
>>> AND tn.name_class = 'scientific_name'
>>>
>>>
>>> BioPerl doesn't have a Taxonomy::biosql module yet (though this would
>>> seem like a worthwhile thing to add), so you can't use the
>>> Bio::DB::Taxonomy interface to do this against a BioSQL instance.
>>>
>>> However, BioPerl does support the flat-file download of the NCBI
>>> taxonomy database and indexes it, so you can simply use
>>> Taxonomy::{get_taxon,get_all_Descendents} with the flatfile download
>>> to achieve what you wanted in fewer than 5 lines of perl.
>>>
>>> Although the recursive implementation of
>>> Taxonomy::get_all_Descendents() won't be lightning fast, it may still
>>> be perfectly fine for your application - are you sure it is not?
>>>
>>>
>>> -hilmar
>>>
>>>
>>> On Jun 18, 2007, at 12:21 AM, George Heller wrote:
>>>
>>>
>>> Thanks. And how can I assign the $node here in the below code,
>>> such
>>> that I can reference it to a particular taxon id record? I want to
>>> retrieve all the descendents from the taxonomy hierarchy, given a
>>> particular taxon id.
>>>
>>>
>>> I have a local db setup, in which I have uploaded data using the
>>> load_ncbi_taxonomy.pl script.
>>>
>>>
>>> Thanks.
>>> George
>>>
>>>
>>> Jason Stajich wrote:
>>> I assume you already figured out how to setup a local taxonomydb?
>>>
>>>
>>>
>>>
>>> You just want the extant species/leaves of the tree
>>>
>>>
>>>
>>>
>>> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>>
>>>
>>>
>>>
>>>
>>>
>>> -jason
>>> On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>>>
>>>
>>> Hi all,
>>>
>>>
>>>
>>>
>>> Can anyone point me to some example that uses the
>>> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
>>> this, and I am not quite sure how to implement it.
>>>
>>>
>>>
>>>
>>> Thanks.
>>> George
>>>
>>>
>>>
>>>
>>> Sendu Bala wrote:
>>> George Heller wrote:
>>> Hi all,
>>>
>>>
>>>
>>>
>>> I am looking at extracting the taxonomy hierarchy for some taxon
>>> ids.
>>> What I plan to do is, for a given taxon id, say 33090, I want to
>>> extract all taxon ids that are children of this species. I do not
>>> just want the immediate children, but the children's children
>>> and so
>>> on.
>>>
>>>
>>>
>>>
>>> Any ideas on the way I can go about doing this?
>>>
>>>
>>>
>>>
>>> Well, you'll use Bio::DB::Taxonomy presumably, and
>>> each_Descendent in
>>> some kind of looping structure. Most easily a recursing sub.
>>>
>>>
>>>
>>>
>>> If you happen to code up something neat and efficient, why not
>>> share it
>>> with us and we could add it to the Taxonomy module(s).
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>>>
>>>
>>> --
>>> Jason Stajich
>>> jason at bioperl.org
>>> http://jason.open-bio.org/
>>>
>>>
>>>
>>> --
>>> ===========================================================
>>> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>> --
>>> Jason Stajich
>>> jason at bioperl.org
>>> http://jason.open-bio.org/
>>>
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From george.heller at yahoo.com  Mon Jun 18 20:04:00 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:04:00 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <C93DF7A1-20AC-4474-BBC6-0C2598406EEB@bioperl.org>
Message-ID: <424035.72876.qm@web56507.mail.re3.yahoo.com>

Ok, I installed the latest Scalar::Util and the script seems to be working. But I am confused about where exactly I need to look for the descendant taxon ids once the script has run. I did look in the /tmp/ directory, but I couldn't understand much.

  Sorry to bother you; I really appreciate your patience.
   
  Thanks.
  George

Jason Stajich <jason at bioperl.org> wrote:
  Try installing the latest Scalar::Util  




From jason at bioperl.org  Mon Jun 18 20:17:34 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 17:17:34 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <424035.72876.qm@web56507.mail.re3.yahoo.com>
References: <424035.72876.qm@web56507.mail.re3.yahoo.com>
Message-ID: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>

All the children are in the @extant_children array; the /tmp directory
only holds the index files built by the flatfile module.

You get to decide what you want to do with them. In the following
example I print the id, rank, and scientific name to the screen.
Because this is a taxonomy db query you are getting back
Bio::Taxonomy::Taxon objects, so read the documentation for that
module to see what you can do with the object.
I would also suggest spending a little time with the Getting Started
and HOWTO:Trees documentation on the website to get familiar with the
objects and nomenclature.


my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

for my $child ( @extant_children ) {
   print "id is ", $child->id, "\n"; # NCBI taxa id
   print "rank is ", $child->rank, "\n"; # e.g. species
   print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}
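
If it helps to see what the recursion and the leaf filter are doing,
here is a minimal self-contained sketch that uses a toy tree of plain
hashes instead of real taxon objects. The node layout and the sub names
all_descendants and is_leaf are made up for illustration; this is not
how BioPerl implements get_all_Descendents or is_Leaf, only the same
idea in miniature.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy taxonomy: each node is a hash with an id, a name and a list of children.
# This layout is an assumption for illustration, not BioPerl's data model.
my $tree = {
    id => 1, name => 'root', children => [
        { id => 2, name => 'clade A', children => [
            { id => 4, name => 'species A1', children => [] },
            { id => 5, name => 'species A2', children => [] },
        ] },
        { id => 3, name => 'species B', children => [] },
    ],
};

# Return every node below $node (children, grandchildren, ...), depth first.
sub all_descendants {
    my ($node) = @_;
    my @found;
    for my $child ( @{ $node->{children} } ) {
        push @found, $child, all_descendants($child);
    }
    return @found;
}

# A leaf is a node without children.
sub is_leaf {
    my ($node) = @_;
    return !@{ $node->{children} };
}

my @descendants = all_descendants($tree);
my @leaves      = grep { is_leaf($_) } @descendants;

for my $leaf (@leaves) {
    print "id is $leaf->{id}, name is $leaf->{name}\n";
}
```

The point is that the result is just a list held in memory, so anything
you want to keep (ids, names) has to be printed or written out by your
own loop.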

On Jun 18, 2007, at 5:04 PM, George Heller wrote:

> Ok, I installed the latest Scalar::Util and the script seems to
> be working. But I am confused about where exactly I need to look for
> the descendant taxon ids once the script has run. I did look in the
> /tmp/ directory, but I couldn't understand much.
>
>   Sorry to bother you; I really appreciate your patience.
>
>   Thanks.
>   George
>
> Jason Stajich <jason at bioperl.org> wrote:
>   Try installing the latest Scalar::Util
>     On Jun 18, 2007, at 4:05 PM, George Heller wrote:
>
>     This is the output of /usr/bin/perl -V
>
>
>   Summary of my perl5 (revision 5 version 8 subversion 5)  
> configuration:
>     Platform:
>       osname=linux, osvers=2.6.9-22.18.bz155725.elsmp,  
> archname=i386-linux-thread-multi
>       uname='linux hs20-bc1-4.build.redhat.com  
> 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686  
> i686 i386 gnulinux '
>       config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 - 
> mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost - 
> Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. - 
> Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux - 
> Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads - 
> Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db - 
> Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio - 
> Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/ 
> less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
>       hint=recommended, useposix=true, d_sigaction=define
>       usethreads=define use5005threads=undef useithreads=define  
> usemultiplicity=define
>       useperlio=define d_sfio=undef uselargefiles=define  
> usesocks=undef
>       use64bitint=undef use64bitall=undef uselongdouble=undef
>       usemymalloc=n, bincompat5005=undef
>     Compiler:
>       cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING - 
> fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE - 
> D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
>       optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
>       cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict- 
> aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
>       ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)',  
> gccosandvers=''
>       intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
>       d_longlong=define, longlongsize=8, d_longdbl=define,  
> longdblsize=12
>       ivtype='long', ivsize=4, nvtype='double', nvsize=8,  
> Off_t='off_t', lseeksize=8
>       alignbytes=4, prototype=define
>     Linker and Libraries:
>       ld='gcc', ldflags =' -L/usr/local/lib'
>       libpth=/usr/local/lib /lib /usr/lib
>       libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil - 
> lpthread -lc
>       perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
>       libc=/lib/libc-2.3.4.so, so=so, useshrplib=true,  
> libperl=libperl.so
>       gnulibc_version='2.3.4'
>     Dynamic Linking:
>       dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='- 
> Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
>       cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
>
>
>   Characteristics of this binary (from libperl):
>     Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS  
> USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
>     Built under linux
>     Compiled at Jul 24 2006 18:28:10
>     @INC:
>       /usr/lib/perl5/5.8.5/i386-linux-thread-multi
>       /usr/lib/perl5/5.8.5
>       /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
>       /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
>       /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
>       /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
>       /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
>       /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
>       /usr/lib/perl5/site_perl/5.8.5
>       /usr/lib/perl5/site_perl/5.8.4
>       /usr/lib/perl5/site_perl/5.8.3
>       /usr/lib/perl5/site_perl/5.8.2
>       /usr/lib/perl5/site_perl/5.8.1
>       /usr/lib/perl5/site_perl/5.8.0
>       /usr/lib/perl5/site_perl
>       /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
>       /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
>       /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
>       /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
>       /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
>       /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
>       /usr/lib/perl5/vendor_perl/5.8.5
>       /usr/lib/perl5/vendor_perl/5.8.4
>       /usr/lib/perl5/vendor_perl/5.8.3
>       /usr/lib/perl5/vendor_perl/5.8.2
>       /usr/lib/perl5/vendor_perl/5.8.1
>       /usr/lib/perl5/vendor_perl/5.8.0
>       /usr/lib/perl5/vendor_perl
>
>
>     Thanks.
>     George
>       .
>
>
>   Hilmar Lapp <hlapp at gmx.net> wrote:
>     The perl version appears to be 5.8.5 though, so something strange
>   appears to be going on too.
>
>
>   George, can you please post the output of
>
>
>   $ /usr/bin/perl -V
>
>
>   -hilmar
>
>
>   On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
>
>
>     As the error implies, your local version of perl doesn't seem to
>   support weak references, which means it doesn't have Scalar::Util
>   (which was added to core after perl 5.6.1, I think). Try installing
>   Scalar::Util to see what happens.
>
>
>   chris
>
>
>   On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>
>
>     I tried running the below mentioned script and I seem to be  
> getting
>   the following error:
>
>
>   Weak references are not implemented in the version of perl at
>   /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
>   BEGIN failed--compilation aborted at
>   /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
>   Compilation failed in require at my.pl line 7.
>   BEGIN failed--compilation aborted at my.pl line 7.
>
>
>   My script looks something like,
>
>
>   #!/usr/bin/perl
>   use strict;
>   #use warnings;
>   use DBI;
>   use Bio::Tree::Node;
>   use Bio::DB::Taxonomy;
>   use Bio::DB::Taxonomy::flatfile;
>   my $idx_dir = '/tmp';
>
>
>   my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
>   my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>   -nodesfile => $nodesfile,
>   -namesfile => $namesfile,
>   -directory => $idx_dir);
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>
>   foreach my $field (@extant_children) {
>   print "$field";
>   print "|";
>   print "\n";
>   }
>
>
>   And I am running the script using the command,
>
>
>   perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>
>
>   and I have the nodes.dmp and names.dmp files in the current
>   directory.
>
>
>   Thanks,
>   George
>
>
>
>
>   Jason Stajich wrote:
>   It is implemented in the implementing class - DB::Taxonomy is
>   just the base class. For example see the flatfile implementation
>   Bio::DB::Taxonomy::flatfile
>
>
>   See the scripts/taxa/local_taxonomydb_query.PLS for example using
>   it:
>   nodes and names are from NCBI taxonomy database.
>
>
>
>
>   Here is an un-debugged copy+paste for your question that *should*
>   work.
>
>
>
>
>   use Bio::DB::Taxonomy;
>   my $idx_dir = '/tmp';
>
>   my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
>   my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>   -nodesfile => $nodesfile,
>   -namesfile => $namesfile,
>   -directory => $idx_dir);
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>
>
>
>
>
>
>
>   -jason
>
>
>   On Jun 18, 2007, at 10:07 AM, George Heller wrote:
>
>
>   What exactly is the "node n" in the query below. When I issue
>   this query, it says,
>
>
>
>
>   relation "node" does not exist.
>
>
>
>
>   I tried to use the get_all_Descendents method but it looks like
>   in order to do a recursive call it calls the method
>   each_Descendent. This method is not implemented in
>   Bio::DB::Taxonomy. It just has a single line,
>
>
>
>
>   shift->throw_not_implemented();
>
>
>
>
>   Thanks.
>   George.
>
>
>
>
>   Hilmar Lapp wrote:
>   I'm a bit confused - it sounds like you have set up a local BioSQL
>   database and loaded the NCBI taxonomy into the database. You can now
>   use simple SQL to retrieve all descendants of a node in the tree
>   given its NCBI taxonID such as
>
>
>
>
>   SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
>   WHERE
>   n.ncbi_taxon_id = :taxonID
>   AND tn.left_value > n. left_value
>   AND tn.right_value < n.right_value
>   AND tn.taxon_id = tnm.taxon_id
>   AND tn.name_class = 'scientific_name'
>
>
>
>
>   BioPerl doesn't have a Taxonomy::biosql module yet (though this
>   would
>   seem like a worthwhile thing to add), so you can't use the
>   Bio::DB::Taxonomy interface to do this against a BioSQL instance.
>
>
>
>
>   However, BioPerl does have support for the flat-file download of
>   the
>   NCBI taxonomy database and indexes it, so you can simply use
>   Taxonomy::{get_taxon,get_all_Descendants} using the flatfile
>   download
>   to achieve what you wanted to do in a less than 5 lines of perl.
>
>
>
>
>   Although the recursive implementation of
>   Taxonomy::get_all_Descendants
>   () won't be lightning fast, it may still be perfectly fine for your
>   application - are you sure it is not?
>
>
>
>
>   -hilmar
>
>
>
>
>   On Jun 18, 2007, at 12:21 AM, George Heller wrote:
>
>   Thanks. And how can I assign the $node here in the below code, such
>   that I can reference it to a particular taxon id record? I want to
>   retrieve all the descendents from the taxonomy hierarchy, given a
>   particular taxon id.
>
>   I have a local db setup, in which I have uploaded data using the
>   load_ncbi_taxonomy.pl script.
>
>   Thanks.
>   George
>
>   Jason Stajich wrote:
>   I assume you already figured out how to set up a local taxonomydb?
>
>   You just want the extant species/leaves of the tree:
>
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>   -jason
>   On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
>   Hi all,
>
>   Can anyone point me to some example that uses the
>   get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
>   this, and I am not quite sure how to implement it.
>
>   Thanks.
>   George
>
>   Sendu Bala wrote:
>   George Heller wrote:
>   Hi all,
>
>   I am looking at extracting the taxonomy hierarchy for some taxon ids.
>   What I plan to do is, for a given taxon id, say 33090, I want to
>   extract all taxon ids that are children of this species. I do not
>   just want the immediate children, but the children's children and so
>   on.
>
>   Any ideas on the way I can go about doing this?
>
>   Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
>   some kind of looping structure. Most easily a recursing sub.
>
>   If you happen to code up something neat and efficient, why not share
>   it with us and we could add it to the Taxonomy module(s).

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From george.heller at yahoo.com  Mon Jun 18 20:29:31 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:29:31 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>
Message-ID: <369098.81077.qm@web56507.mail.re3.yahoo.com>

But the problem is that I don't really get any output on the screen.
In the /tmp directory I get four files, namely parents, nodes, id2names
and names2id, but I don't know what to make of them. This is what my
script looks like:
   
  #!/usr/bin/perl
  use strict;
  #use warnings;
  use DBI;
  use Bio::Tree::Node;
  use Bio::DB::Taxonomy;
  use Bio::DB::Taxonomy::flatfile;

  my $idx_dir = '/tmp';
  my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
  my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                                 -nodesfile => $nodefile,
                                 -namesfile => $namesfile,
                                 -directory => $idx_dir);
  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

  for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n"; # NCBI taxa id
    print "rank is ", $child->rank, "\n"; # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
  }

Thanks.
  George
  
Jason Stajich <jason at bioperl.org> wrote:
  All the children are in this array.

  You get to decide what you want to do with them. In the following
  example I print the id, rank, and scientific name out to the screen.
  Because this is a taxonomy db query you are getting back
  Bio::Taxonomy::Taxon objects, so read the documentation for this
  module to see what you can do with the object.
  I would also suggest spending a little time with the Getting started
  and HOWTO:Trees documentation on the website to get familiar with the
  objects and nomenclature.

  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

  for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n"; # NCBI taxa id
    print "rank is ", $child->rank, "\n"; # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
  }




From jason at bioperl.org  Mon Jun 18 21:05:43 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 18:05:43 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <369098.81077.qm@web56507.mail.re3.yahoo.com>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
Message-ID: <F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>

The files are indexes, because you are indexing a flatfile. This
speeds up the lookup, so the second time you run the script it doesn't
have to index again.
You don't need to look at the files; they won't make sense to a human!
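
For anyone curious what such an index buys you, here is a minimal sketch in plain Perl. It is not the flatfile module's code or its on-disk format; it only assumes the NCBI nodes.dmp layout (fields separated by tab-pipe-tab, starting with tax id, parent tax id and rank). The tax ids are invented and build_children_index is a hypothetical helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy stand-in for a few nodes.dmp rows: tax_id | parent tax_id | rank | ...
# (the ids below are invented for illustration)
my @rows = (
    "1\t|\t1\t|\tno rank\t|\n",
    "10\t|\t1\t|\tgenus\t|\n",
    "11\t|\t10\t|\tspecies\t|\n",
    "12\t|\t10\t|\tspecies\t|\n",
);

# Hypothetical helper: map each parent id to the list of its child ids.
sub build_children_index {
    my @lines = @_;
    my %children;
    for my $line (@lines) {
        $line =~ s/\t\|\n?\z//;                 # strip the trailing field terminator
        my ($id, $parent) = split /\t\|\t/, $line;
        next if $id == $parent;                 # the root lists itself as parent
        push @{ $children{$parent} }, $id;
    }
    return \%children;
}

my $index = build_children_index(@rows);
print join(',', @{ $index->{10} }), "\n";       # prints 11,12
```

Once a table like this exists, finding the children of a node is a single hash lookup rather than a scan of the whole dump, which is the reason for paying the indexing cost once.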

The reason it isn't printing anything is that someone didn't write
the implementation quite right. This code was overhauled by Sendu
before the last release, and I guess something didn't quite get
connected.

I checked in code that now has Bio::Taxon delegating to a DB handle
for the each_Descendent call.
You can either patch your code or just use the code listed here:
  http://bioperl.org/wiki/Module:Bio::DB::Taxonomy
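
If patching is not convenient, the same result can be had by walking the tree one level at a time. The sketch below is plain Perl over a toy hash so that it runs anywhere; collect_descendents and the %kids table are hypothetical. With BioPerl, the callback could instead wrap a call such as $db->each_Descendent($taxon), assuming a release in which the flatfile backend implements it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy parent => children table (invented ids).
my %kids = (
    A => [ 'B', 'C' ],
    B => [ 'D', 'E' ],
);

# Hypothetical helper: gather every node below $start, given a callback
# that returns the immediate children of a node.
sub collect_descendents {
    my ($children_of, $start) = @_;
    my @found;
    my @todo = ($start);
    while (@todo) {
        my $node     = shift @todo;
        my @children = $children_of->($node);
        push @found, @children;     # record this level
        push @todo,  @children;     # and visit each child later
    }
    return @found;
}

my $children_of = sub { @{ $kids{ $_[0] } || [] } };
my @all    = collect_descendents($children_of, 'A');
my @leaves = grep { !$children_of->($_) } @all;   # nodes without children

print "all: @all\n";
print "leaves: @leaves\n";
```

A queue is used instead of a recursive sub only to keep the example short; a recursive version visits the same nodes in a different order.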

On Jun 18, 2007, at 5:29 PM, George Heller wrote:

> But the problem is that I don't really get any output on the  
> screen. In the /tmp directory I get 4 files namely parents, nodes,  
> id2names and names2id, but I dont know what to make of them. This  
> is what my script looks like,
>
>   #!/usr/bin/perl
>   use strict;
>   #use warnings;
>   use DBI;
>   use Bio::Tree::Node;
>   use Bio::DB::Taxonomy;
>   use Bio::DB::Taxonomy::flatfile;
>
>   my $idx_dir = '/tmp';
>   my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
>   my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
>                                  -nodesfile => $nodefile,
>                                  -namesfile => $namesfile,
>                                  -directory => $idx_dir);
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>   for my $child ( @extant_children ) {
>     print "id is ", $child->id, "\n"; # NCBI taxa id
>     print "rank is ", $child->rank, "\n"; # e.g. species
>     print "scientific name is ", $child->scientific_name, "\n"; # scientific name
>   }
>
> Thanks.
>   George
>
> Jason Stajich <jason at bioperl.org> wrote:
>     All the children are in this array.
>
>
>   You get to decide what you want to do with them. In the following  
> example I print the id, rank, and scientific name out to the screen.
>   Because this is a taxonomy db query you are getting back  
> Bio::Taxonomy::Taxon objects so read the documentation for this  
> module to see what you can do with the object.
>     I would also suggest spending a little time with the Getting  
> started and HOWTO:Trees documentation on the website to get  
> familiar with the objects and nomenclature.
>
>
>
>
>   my @extant_children = grep { $_->is_Leaf } $node- 
> >get_all_Descendents;
>
>
>   for my $child ( @extant_children ) {
>       print "id is ", $child->id, "\n"; # NCBI taxa id
>     print "rank is ", $child->rank, "\n"; # e.g. species
>     print "scientific name is ", $child->scientific_name, "\n"; #  
> scientific name
>   }
>
>
>     On Jun 18, 2007, at 5:04 PM, George Heller wrote:
>
>     Ok, I installed the latest of Scalar::Util and the script seems  
> to be working. But I am confused where exactly I need to look for  
> the descendent taxon ids once the script is run. I did look into  
> the /tmp/ directory, but I couldnt understand much.
>
>
>     Sorry to be bothering, really appreaciate your patience.
>
>
>     Thanks.
>     George
>
>
>   Jason Stajich <jason at bioperl.org> wrote:
>     Try installing the latest Scalar::Util
>       On Jun 18, 2007, at 4:05 PM, George Heller wrote:
>
>
>       This is the output of /usr/bin/perl -V
>
>
>
>
>     Summary of my perl5 (revision 5 version 8 subversion 5)  
> configuration:
>       Platform:
>         osname=linux, osvers=2.6.9-22.18.bz155725.elsmp,  
> archname=i386-linux-thread-multi
>         uname='linux hs20-bc1-4.build.redhat.com  
> 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686  
> i686 i386 gnulinux '
>         config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 - 
> mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost - 
> Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. - 
> Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux - 
> Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads - 
> Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db - 
> Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio - 
> Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/ 
> less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
>         hint=recommended, useposix=true, d_sigaction=define
>         usethreads=define use5005threads=undef useithreads=define  
> usemultiplicity=define
>         useperlio=define d_sfio=undef uselargefiles=define  
> usesocks=undef
>         use64bitint=undef use64bitall=undef uselongdouble=undef
>         usemymalloc=n, bincompat5005=undef
>       Compiler:
>         cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING - 
> fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE - 
> D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
>         optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
>         cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno- 
> strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
>         ccversion='', gccversion='3.4.6 20060404 (Red Hat  
> 3.4.6-2)', gccosandvers=''
>         intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
>         d_longlong=define, longlongsize=8, d_longdbl=define,  
> longdblsize=12
>         ivtype='long', ivsize=4, nvtype='double', nvsize=8,  
> Off_t='off_t', lseeksize=8
>         alignbytes=4, prototype=define
>       Linker and Libraries:
>         ld='gcc', ldflags =' -L/usr/local/lib'
>         libpth=/usr/local/lib /lib /usr/lib
>         libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil - 
> lpthread -lc
>         perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
>         libc=/lib/libc-2.3.4.so, so=so, useshrplib=true,  
> libperl=libperl.so
>         gnulibc_version='2.3.4'
>       Dynamic Linking:
>         dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='- 
> Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
>         cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
>
>
>
>
>     Characteristics of this binary (from libperl):
>       Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS  
> USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
>       Built under linux
>       Compiled at Jul 24 2006 18:28:10
>       @INC:
>         /usr/lib/perl5/5.8.5/i386-linux-thread-multi
>         /usr/lib/perl5/5.8.5
>         /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.5
>         /usr/lib/perl5/site_perl/5.8.4
>         /usr/lib/perl5/site_perl/5.8.3
>         /usr/lib/perl5/site_perl/5.8.2
>         /usr/lib/perl5/site_perl/5.8.1
>         /usr/lib/perl5/site_perl/5.8.0
>         /usr/lib/perl5/site_perl
>         /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.5
>         /usr/lib/perl5/vendor_perl/5.8.4
>         /usr/lib/perl5/vendor_perl/5.8.3
>         /usr/lib/perl5/vendor_perl/5.8.2
>         /usr/lib/perl5/vendor_perl/5.8.1
>         /usr/lib/perl5/vendor_perl/5.8.0
>         /usr/lib/perl5/vendor_perl
>
>
>
>
>       Thanks.
>       George
>         .
>
>
>
>
>     Hilmar Lapp <hlapp at gmx.net> wrote:
>       The perl version appears to be 5.8.5 though, so something  
> strange
>     appears to be going on too.
>
>
>
>
>     George, can you please post the output of
>
>
>
>
>     $ /usr/bin/perl -V
>
>
>
>
>     -hilmar
>
>
>
>
>     On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
>
>
>
>
>       As the error implies your local version of perl doesn't seem  
> support
>     weak references, which means it doesn't have Scalar::Utils  
> (which was
>     added to core after perl 5.6.1, I think). Try installing
>     Scalar::Utils to see what happens.
>
>
>
>
>     chris
>
>
>
>
>     On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>
>
>
>
>       I tried running the below mentioned script and I seem to be  
> getting
>     the following error:
>
>
>
>
>     Weak references are not implemented in the version of perl at /
>     usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
>     BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/ 
> 5.8.5/
>     Bio/Tree/Node.pm line 76.
>     Compilation failed in require at my.pl line 7.
>     BEGIN failed--compilation aborted at my.pl line 7.
>
>
>
>
>     My script looks something like,
>
>
>
>
>     #!/usr/bin/perl
>     use strict;
>     #use warnings;
>     use DBI;
>     use Bio::Tree::Node;
>     use Bio::DB::Taxonomy;
>     use Bio::DB::Taxonomy::flatfile;
>     my $idx_dir = '/tmp';
>
>
>
>
>     my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
>     my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>     -nodesfile => $nodesfile,
>     -namesfile => $namesfile,
>     -directory => $idx_dir);
>     my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>     my @extant_children = grep { $_->is_Leaf } $node-
>       get_all_Descendents;
>
>
>
>
>     foreach $field (@extant_children) {
>     print "$field";
>     print "|";
>     print "\n";
>     }
>
>
>
>
>     And I am running the script using the command,
>
>
>
>
>     perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>
>
>
>
>     and I have the nodes.dmp and names.dmp files in the current
>     directory.
>
>
>
>
>     Thanks,
>     George
>
>
>
>
>
>
>
>
>     Jason Stajich wrote:
>     It is implemented in the implementing class - DB::Taxonomy is
>     just the base class. For example see the flatfile implementation
>     Bio::DB::Taxonomy::flatfile
>
>
>
>
>     See the scripts/taxa/local_taxonomydb_query.PLS for example using
>     it:
>     nodes and names are from NCBI taxonomy database.
>
>
>
>
>
>
>
>
>     Here is an un-debugged copy+paste for your question that *should*
>     work.
>
>
>
>
>
>
>
>
>     use Bio::DB::Taxonomy;
>     my $idx_dir = '/tmp';
>
>     my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
>     my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>     -nodesfile => $nodefile,
>     -namesfile => $namesfile,
>     -directory => $idx_dir);
>     my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>     my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>     -jason
>
>
>
>
>     On Jun 18, 2007, at 10:07 AM, George Heller wrote:
>
>
>
>
>     What exactly is the "node n" in the query below. When I issue
>     this query, it says,
>
>
>
>
>
>
>
>
>     relation "node" does not exist.
>
>
>
>
>
>
>
>
>     I tried to use the get_all_Descendents method but it looks like
>     in order to do a recursive call it calls the method
>     each_Descendent. This method is not implemented in
>     Bio::DB::Taxonomy. It just has a single line,
>
>
>
>
>
>
>
>
>     shift->throw_not_implemented();
>
>
>
>
>
>
>
>
>     Thanks.
>     George.
>
>
>
>
>
>
>
>
>     Hilmar Lapp wrote:
>     I'm a bit confused - it sounds like you have set up a local BioSQL
>     database and loaded the NCBI taxonomy into the database. You can now
>     use simple SQL to retrieve all descendants of a node in the tree
>     given its NCBI taxonID such as
>
>
>
>
>
>
>
>
>     SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
>     WHERE
>     n.ncbi_taxon_id = :taxonID
>     AND tn.left_value > n. left_value
>     AND tn.right_value < n.right_value
>     AND tn.taxon_id = tnm.taxon_id
>     AND tn.name_class = 'scientific_name'
>
>
>
>
>
>
>
>
>     BioPerl doesn't have a Taxonomy::biosql module yet (though this would
>     seem like a worthwhile thing to add), so you can't use the
>     Bio::DB::Taxonomy interface to do this against a BioSQL instance.
>
>
>
>
>
>
>
>
>     However, BioPerl does have support for the flat-file download of the
>     NCBI taxonomy database and indexes it, so you can simply use
>     Taxonomy::{get_taxon,get_all_Descendents} using the flatfile download
>     to achieve what you wanted to do in less than 5 lines of perl.
>
>
>
>
>
>
>
>
>     Although the recursive implementation of
>     Taxonomy::get_all_Descendents() won't be lightning fast, it may still
>     be perfectly fine for your application - are you sure it is not?
>
>
>
>
>
>
>
>
>     -hilmar
>
>
>
>
>
>
>
>
>     On Jun 18, 2007, at 12:21 AM, George Heller wrote:
>
>
>
>
>
>
>
>
>     Thanks. And how can I assign the $node here in the below code,
>     such
>     that I can reference it to a particular taxon id record? I want to
>     retrieve all the descendents from the taxonomy hierarchy, given a
>     particular taxon id.
>
>
>
>
>
>
>
>
>     I have a local db setup, in which I have uploaded data using the
>     load_ncbi_taxonomy.pl script.
>
>
>
>
>
>
>
>
>     Thanks.
>     George
>
>
>
>
>
>
>
>
>     Jason Stajich wrote:
>     I assume you already figured out how to setup a local taxonomydb?
>
>     You just want the extant species/leaves of the tree
>
>     my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>     -jason
>     On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
>
>
>
>
>
>
>
>     Hi all,
>
>     Can anyone point me to some example that uses the
>     get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
>     this, and I am not quite sure how to implement it.
>
>     Thanks.
>     George
>
>     Sendu Bala wrote:
>     George Heller wrote:
>     Hi all,
>
>     I am looking at extracting the taxonomy hierarchy for some taxon ids.
>     What I plan to do is, for a given taxon id, say 33090, I want to
>     extract all taxon ids that are children of this species. I do not
>     just want the immediate children, but the children's children and so
>     on.
>
>     Any ideas on the way I can go about doing this?
>
>     Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
>     some kind of looping structure. Most easily a recursing sub.
>
>     If you happen to code up something neat and efficient, why not share
>     it with us and we could add it to the Taxonomy module(s).
>
>     _______________________________________________
>     Bioperl-l mailing list
>     Bioperl-l at lists.open-bio.org
>     http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>     --
>     Jason Stajich
>     jason at bioperl.org
>     http://jason.open-bio.org/
>
>     --
>     ===========================================================
>     : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
>     ===========================================================
>
>     --
>     Jason Stajich
>     jason at bioperl.org
>     http://jason.open-bio.org/
>
>     Christopher Fields
>     Postdoctoral Researcher
>     Lab of Dr. Robert Switzer
>     Dept of Biochemistry
>     University of Illinois Urbana-Champaign
>
>     --
>     ===========================================================
>     : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
>     ===========================================================
>
>       --
>     Jason Stajich
>     jason at bioperl.org
>     http://jason.open-bio.org/
>
>     --
>   Jason Stajich
>   jason at bioperl.org
>   http://jason.open-bio.org/

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From torsten.seemann at infotech.monash.edu.au  Mon Jun 18 21:21:04 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:21:04 +1000
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676A01F.30205@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
Message-ID: <a79f6a4b0706181821p12a2e138xade9c30895e45068@mail.gmail.com>

Sendu,

> >> Can anyone offer a
> >> way to systematically find at least the test scripts which access the
> >> internet, if not the specific tests within?

Perhaps you could use 'strace' to list network system calls for each
test script, and grep out AF_INET connections?

% strace -e trace=network command_to_test 2>&1 | grep AF_INET

I'm not an strace expert but it might do what you need.
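
As a rough sketch of how that output might be post-processed: the few lines of Perl below flag the scripts whose trace mentions an AF_INET socket. The script names and sample trace lines are invented for illustration, and real strace output differs between versions, so treat the pattern as a starting point only.

```perl
use strict;
use warnings;

# Return true if any line of captured strace output mentions an
# AF_INET or AF_INET6 address, i.e. the command touched the network.
sub uses_network {
    my @lines = @_;
    return scalar grep { /\bAF_INET6?\b/ } @lines;
}

# Hypothetical captured output, keyed by test script name. In practice
# each list could come from something like:
#   my @lines = `strace -f -e trace=network perl $script 2>&1`;
my %trace = (
    't/net.t'   => [ 'socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3',
                     'connect(3, {sa_family=AF_INET, sin_port=htons(80)}, 16) = 0' ],
    't/local.t' => [ 'socket(AF_UNIX, SOCK_STREAM, 0) = 3' ],
);

# Report only the scripts that opened an internet socket.
for my $script (sort keys %trace) {
    print "$script\n" if uses_network(@{ $trace{$script} });
}
```

With the sample data only t/net.t is reported; the AF_UNIX socket in t/local.t is ignored.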

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010


From george.heller at yahoo.com  Mon Jun 18 21:16:10 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 18:16:10 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
Message-ID: <815364.33231.qm@web56512.mail.re3.yahoo.com>

Works perfectly. Thanks so much Jason, Hilmar, Chris. You've been a great help!
   
  Thanks.
  George

Jason Stajich <jason at bioperl.org> wrote:
  The files are indexes because you are indexing a flatfile - this speeds up the lookup so the second time you run the script it doesn't have to index.  You don't need to look at the files, they won't make sense to a human!
  

  The reason it isn't printing anything is that someone didn't write the implementation quite right. This code was overhauled by Sendu before the last release, and I guess something didn't quite get connected.
  

  I checked in code that has the Bio::Taxon delegating now to a DB handle for the each_Descendent call.
  You can either patch your code  or just use the code listed here:
     http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

  
    On Jun 18, 2007, at 5:29 PM, George Heller wrote:

    But the problem is that I don't really get any output on the screen. In the /tmp directory I get 4 files, namely parents, nodes, id2names and names2id, but I don't know what to make of them. This is what my script looks like:
  

    #!/usr/bin/perl
    use strict;
    #use warnings;
    use DBI;
    use Bio::Tree::Node;
    use Bio::DB::Taxonomy;
    use Bio::DB::Taxonomy::flatfile;

    my $idx_dir = '/tmp';
    my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
    my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                                   -nodesfile => $nodefile,
                                   -namesfile => $namesfile,
                                   -directory => $idx_dir);
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

    for my $child ( @extant_children ) {
        print "id is ", $child->id, "\n"; # NCBI taxa id
        print "rank is ", $child->rank, "\n"; # e.g. species
        print "scientific name is ", $child->scientific_name, "\n"; # scientific name
    }
  

  Thanks.
    George
  

  Jason Stajich <jason at bioperl.org> wrote:
      All the children are in this array.  
  

    You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name out to the screen.  
    Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects so read the documentation for this module to see what you can do with the object.
      I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.
  

    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

    for my $child ( @extant_children ) {
        print "id is ", $child->id, "\n"; # NCBI taxa id
        print "rank is ", $child->rank, "\n"; # e.g. species
        print "scientific name is ", $child->scientific_name, "\n"; # scientific name
    }
  

      On Jun 18, 2007, at 5:04 PM, George Heller wrote:
  

      Ok, I installed the latest Scalar::Util and the script seems to be working. But I am confused about where exactly I need to look for the descendent taxon ids once the script is run. I did look in the /tmp/ directory, but I couldn't understand much.
  

      Sorry to be bothering, really appreciate your patience.
  

      Thanks.
      George
  

    Jason Stajich <jason at bioperl.org> wrote:
      Try installing the latest Scalar::Util  
        On Jun 18, 2007, at 4:05 PM, George Heller wrote:
  

        This is the output of /usr/bin/perl -V
  

      Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
        Platform:
          osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
          uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
          config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
          hint=recommended, useposix=true, d_sigaction=define
          usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
          useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
          use64bitint=undef use64bitall=undef uselongdouble=undef
          usemymalloc=n, bincompat5005=undef
        Compiler:
          cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
          optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
          cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
          ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
          intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
          d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
          ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
          alignbytes=4, prototype=define
        Linker and Libraries:
          ld='gcc', ldflags =' -L/usr/local/lib'
          libpth=/usr/local/lib /lib /usr/lib
          libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
          perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
          libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
          gnulibc_version='2.3.4'
        Dynamic Linking:
          dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
          cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
  

      Characteristics of this binary (from libperl):
        Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
        Built under linux
        Compiled at Jul 24 2006 18:28:10
        @INC:
          /usr/lib/perl5/5.8.5/i386-linux-thread-multi
          /usr/lib/perl5/5.8.5
          /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.5
          /usr/lib/perl5/site_perl/5.8.4
          /usr/lib/perl5/site_perl/5.8.3
          /usr/lib/perl5/site_perl/5.8.2
          /usr/lib/perl5/site_perl/5.8.1
          /usr/lib/perl5/site_perl/5.8.0
          /usr/lib/perl5/site_perl
          /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.5
          /usr/lib/perl5/vendor_perl/5.8.4
          /usr/lib/perl5/vendor_perl/5.8.3
          /usr/lib/perl5/vendor_perl/5.8.2
          /usr/lib/perl5/vendor_perl/5.8.1
          /usr/lib/perl5/vendor_perl/5.8.0
          /usr/lib/perl5/vendor_perl
  

        Thanks.
        George
  

      Hilmar Lapp <hlapp at gmx.net> wrote:
        The perl version appears to be 5.8.5 though, so something strange 
      appears to be going on too.
  

      George, can you please post the output of
  

      $ /usr/bin/perl -V
  

      -hilmar
  

      On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
  

        As the error implies, your local version of perl doesn't seem to support
      weak references, which means it doesn't have Scalar::Util (which was
      added to core after perl 5.6.1, I think). Try installing
      Scalar::Util to see what happens.
  

      chris
  

      On Jun 18, 2007, at 5:18 PM, George Heller wrote:
  

        I tried running the below mentioned script and I seem to be getting
      the following error:
  

      Weak references are not implemented in the version of perl at
      /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
      BEGIN failed--compilation aborted at
      /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
      Compilation failed in require at my.pl line 7.
      BEGIN failed--compilation aborted at my.pl line 7.
  

      My script looks something like,
  

      #!/usr/bin/perl
      use strict;
      #use warnings;
      use DBI;
      use Bio::Tree::Node;
      use Bio::DB::Taxonomy;
      use Bio::DB::Taxonomy::flatfile;
      my $idx_dir = '/tmp';
  

      my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
      my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
      -nodesfile => $nodefile,
      -namesfile => $namesfile,
      -directory => $idx_dir);
      my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
      my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

      foreach my $field (@extant_children) {
          print "$field";
          print "|";
          print "\n";
      }
  

      And I am running the script using the command,
  

      perl myscript.pl -v --names names.dmp --nodes nodes.dmp
  

      and I have the nodes.dmp and names.dmp files in the current
      directory.
  

      Thanks,
      George
  

      Jason Stajich wrote:
      It is implemented in the implementing class - DB::Taxonomy is
      just the base class. For example see the flatfile implementation
      Bio::DB::Taxonomy::flatfile
  

      See the scripts/taxa/local_taxonomydb_query.PLS for example using
      it:
      nodes and names are from NCBI taxonomy database.
  

      Here is an un-debugged copy+paste for your question that *should*
      work.
  

      use Bio::DB::Taxonomy;
      my $idx_dir = '/tmp';
  

      my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
      my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
      -nodesfile => $nodefile,
      -namesfile => $namesfile,
      -directory => $idx_dir);
      my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
      my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

      -jason
  

      On Jun 18, 2007, at 10:07 AM, George Heller wrote:
  

      What exactly is the "node n" in the query below. When I issue
      this query, it says,
  

      relation "node" does not exist.
  

      I tried to use the get_all_Descendents method but it looks like
      in order to do a recursive call it calls the method
      each_Descendent. This method is not implemented in
      Bio::DB::Taxonomy. It just has a single line,
  

      shift->throw_not_implemented();
  

      Thanks.
      George.
  

      Hilmar Lapp wrote:
      I'm a bit confused - it sounds like you have set up a local BioSQL
      database and loaded the NCBI taxonomy into the database. You can now
      use simple SQL to retrieve all descendants of a node in the tree
      given its NCBI taxonID such as
  

      SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
      WHERE
      n.ncbi_taxon_id = :taxonID
      AND tn.left_value > n. left_value
      AND tn.right_value < n.right_value
      AND tn.taxon_id = tnm.taxon_id
      AND tn.name_class = 'scientific_name'
  

      BioPerl doesn't have a Taxonomy::biosql module yet (though this would
      seem like a worthwhile thing to add), so you can't use the
      Bio::DB::Taxonomy interface to do this against a BioSQL instance.
  

      However, BioPerl does have support for the flat-file download of the
      NCBI taxonomy database and indexes it, so you can simply use
      Taxonomy::{get_taxon,get_all_Descendents} using the flatfile download
      to achieve what you wanted to do in less than 5 lines of perl.
  

      Although the recursive implementation of
      Taxonomy::get_all_Descendents() won't be lightning fast, it may still
      be perfectly fine for your application - are you sure it is not?
  

      -hilmar
  

      On Jun 18, 2007, at 12:21 AM, George Heller wrote:
  

      Thanks. And how can I assign the $node here in the below code,
      such
      that I can reference it to a particular taxon id record? I want to
      retrieve all the descendents from the taxonomy hierarchy, given a
      particular taxon id.
  

      I have a local db setup, in which I have uploaded data using the
      load_ncbi_taxonomy.pl script.
  

      Thanks.
      George
  

      Jason Stajich wrote:
      I assume you already figured out how to setup a local taxonomydb?
  

      You just want the extant species/leaves of the tree
  

      my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

      -jason
      On Jun 17, 2007, at 11:41 AM, George Heller wrote:
  

      Hi all,
  

      Can anyone point me to some example that uses the
      get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
      this, and I am not quite sure how to implement it.
  

      Thanks.
      George
  

      Sendu Bala wrote:
      George Heller wrote:
      Hi all,
  

      I am looking at extracting the taxonomy hierarchy for some taxon ids.
      What I plan to do is, for a given taxon id, say 33090, I want to
      extract all taxon ids that are children of this species. I do not
      just want the immediate children, but the children's children and so
      on.
  

      Any ideas on the way I can go about doing this?
  

      Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
      some kind of looping structure. Most easily a recursing sub.
  

      If you happen to code up something neat and efficient, why not share
      it with us and we could add it to the Taxonomy module(s).
  

      --
      Jason Stajich
      jason at bioperl.org
      http://jason.open-bio.org/
  

      --
      ===========================================================
      : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
      ===========================================================
  

      --
      Jason Stajich
      jason at bioperl.org
      http://jason.open-bio.org/
  

      Christopher Fields
      Postdoctoral Researcher
      Lab of Dr. Robert Switzer
      Dept of Biochemistry
      University of Illinois Urbana-Champaign
  

      -- 
      ===========================================================
      : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
      ===========================================================
  

        --
      Jason Stajich
      jason at bioperl.org
      http://jason.open-bio.org/
  

      --
    Jason Stajich
    jason at bioperl.org
    http://jason.open-bio.org/
  

    --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/




From torsten.seemann at infotech.monash.edu.au  Mon Jun 18 21:26:41 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:26:41 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To: <a79f6a4b0706121718g4b0ca6a4m97f253b2e2b84059@mail.gmail.com>
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
	<a79f6a4b0706121718g4b0ca6a4m97f253b2e2b84059@mail.gmail.com>
Message-ID: <a79f6a4b0706181826x4ccc4ee5n8ddafa703ad162a3@mail.gmail.com>

(Sean, please reply to the bioperl-l list rather than to me personally
so everyone can read it. I'm reposting it here.)

> > I posted this on the gbrowse list earlier. I'm looking to convert gff
> > data files into xml. Does anyone know of a module written to do this
> > already?
>
> What DTD do you want the XML to conform to?
> eg. ChadoXML, TinySeq XML, TIGR XML ... ?

Hi Torsten,
I'm collaborating with other groups and want web-service compatible
functionality for various tools. Normally the analysis tools I'm using
generate gff output. I'm going to have to wrap this output in XML with
an XSL stylesheet for end-users to view. I haven't done it before and
don't know what DTD to use. bp_seqconvert.pl doesn't accept gff format.
I would imagine the DTD would be quite short as the gff files are very
standard; I just don't have any experience with these DTD
requirements.
--Sean O'Keeffe <limericksean at gmail.com>
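
No DTD was settled on in this exchange, so the following is only a sketch of the wrapping step itself: it turns the nine tab-separated GFF columns into one XML element per feature. The element and attribute names (features, feature, seqid and so on) and the sample line are made up for illustration and do not follow ChadoXML, TinySeq XML or any other published schema.

```perl
use strict;
use warnings;

# Escape the five characters that are special in XML text and attributes.
sub xml_escape {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
                  '"' => '&quot;', "'" => '&apos;');
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Convert one GFF line (nine tab-separated columns) to a <feature> element.
# Returns undef (in scalar context) for comments, blank or malformed lines.
sub gff_line_to_xml {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^\s*(?:#|$)/;
    my @col = split /\t/, $line, 9;
    return unless @col == 9;
    # Hypothetical attribute names for the first eight columns.
    my @name = qw(seqid source type start end score strand phase);
    my $attrs = join ' ',
        map { sprintf '%s="%s"', $name[$_], xml_escape($col[$_]) } 0 .. 7;
    # The ninth column (attributes/group) becomes the element text.
    return sprintf '<feature %s>%s</feature>', $attrs, xml_escape($col[8]);
}

my @gff = (
    "##gff-version 3\n",
    "chr1\tgenscan\texon\t100\t200\t.\t+\t0\tID=exon1;Note=a<b\n",
);
print "<features>\n";
for my $line (@gff) {
    my $xml = gff_line_to_xml($line);
    print "  $xml\n" if defined $xml;
}
print "</features>\n";
```

Keeping the ninth column as plain text avoids guessing at a structure for the attributes; an XSL stylesheet could still split it on semicolons for display.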


From sac at bioperl.org  Tue Jun 19 02:42:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Mon, 18 Jun 2007 23:42:27 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
Message-ID: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>

On 6/16/07, Jason Stajich <jason at bioperl.org> wrote:
> [...]
> Just to say I already went through all the steps of running cvs2svn
> myself and had problems gathering back out the branches and all the
> tags when I tried it.  You may want to start with a smaller repository
> like bioperl-network or bioperl-db, as the initial cvs2svn conversion
> script took quite a long time to run on bioperl-live.

Might this be a good opportunity to investigate partitioning
bioperl-live into sub-repositories? There has been talk in the past of
defining a set of "core" modules separate from other functionally
related groups of modules that would be viewed as optional extensions.
The goal being to help manage growth and simplify releases. There are
currently 892 modules under Bio/.

In addition to simplifying the migration to SVN, it would also have
other benefits. Say some new functionality or a slew of fixes were
added to Bio::Graphics. We could turn around a new Bio::Graphics
release quickly without having to work on getting various other parts
up to snuff that aren't related to graphics (Biblio, DB, PopGen,
Search etc.). Maintenance and releases of the various extensions would
be more parallelizable, orchestrated by separate ring leaders.

Over time, as a set of functionality matures, it would see fewer
updates and there would be less of a need for users to
download/install/test it. This could make bioperl easier to customize,
extend, and grok in general.

Long term, it should ease development and release cycles, but it will
involve a bit of near term bullet-biting. We'd need to get clear on
how to partition things, including modules, tests, docs, installation
logic, etc. and we'd probably need new integration tests to verify
that the subsets continue working together.

What do folks think? Would this SVN-based, re-partitioned bioperl-live
constitute a 2.0 release? Any volunteers to help assemble a roadmap
and milestones? Should I go on dreaming?

Cheers,
Steve


From bix at sendu.me.uk  Tue Jun 19 03:01:05 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 08:01:05 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
Message-ID: <46777F31.7030402@sendu.me.uk>

Jason Stajich wrote:
> The reason it isn't printing anything is someone didn't really write  
> the implementation quite right. This code was overhauled by Sendu  
> before the last release I guess something didn't quite get connected.
> 
> I checked in code that has the Bio::Taxon delegating now to a DB  
> handle for the each_Descendent call.
> You can either patch your code  or just use the code listed here:
>   http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

I've reverted that change.

For some reason the docs for Bio::Taxon::each_Descendent aren't showing 
up on the website, but they state:

---
Note that this method never asks the database for the descendents; it 
will only return objects you have manually set with add_Descendent(), or 
where this was done for you by making a Bio::Tree::Tree with this object 
as an argument to new().

To get the database descendents use 
$taxon->db_handle->each_Descendent($taxon).
---


I also have a note in the Synopsis for the module:

---
# Though be careful with each_Descendent - unless you add_Descendent()
# yourself, you won't get an answer because unlike for ancestor(),
# Bio::Taxon does not ask the database for the answer. You can ask the
# database yourself using the same method:
($human) = $homo->db_handle->each_Descendent($homo);
---


This is quite deliberate and is to prevent Bad Things from happening. 
(Can't exactly remember the reasoning now, but I know it was good.)


From bix at sendu.me.uk  Tue Jun 19 03:41:57 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 08:41:57 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
Message-ID: <467788C5.6070406@sendu.me.uk>

Steve Chervitz wrote:
> Might this be a good opportunity to investigate partitioning
> bioperl-live into sub-repositories? There has been talk in the past of
> defining a set of "core" modules separate from other functionally
> related groups of modules that would be viewed as optional extensions.
> The goal being to help manage growth and simplify releases. There are
> currently 892 modules under Bio/.
> 
> In addition to simplifying the migration to SVN, it would also have
> other benefits. Say some new functionality or a slew of fixes were
> added to Bio::Graphics. We could turn around a new Bio::Graphics
> release quickly without having to work on getting various other parts
> up to snuff that aren't related to graphics (Biblio, DB, PopGen,
> Search etc.). Maintenance and releases of the various extensions would
> be more parallelizable, orchestrated by separate ring leaders.
> 
> Over time, as a set of functionality matures, it would see fewer
> updates and there would be less of a need for users to
> download/install/test it. This could make bioperl easier to customize,
> extend, and grok in general.
> 
> Long term, it should ease development and release cycles

I actually take the opposite view. Breaking things up makes testing and 
releases more difficult.

If one person acts as pumpkin for all the sub-parts, his work-load 
increases almost linearly with the number of sub-parts. If each sub-part 
gets its own pumpkin, where do all these pumpkins come from? It seems to 
me that frequently authors will write modules but inevitably their 
circumstance changes and they can no longer devote the time to look 
after them. Having a single pumpkin and 'forcing' him to make sure 
everything works (regardless of his personal interest in the module) 
seems more reliable than hoping there will be a person interested enough 
in each sub-part to handle its release.

Since all sub-parts will at the least interact with the 'true' core set 
of Bioperl modules, they need to be tested and potentially re-released 
every time the true core is updated. And since some sub-parts will 
interact with other sub-parts, there will need to be coordinated 
joint-testing and release of multiple sub-parts.

What happens when users report problems? We ask them what version 
they're running. Right now '1.5.2' means a specific thing, and it's
trivial for someone to confirm the same problem by installing 1.5.2.
What happens when users have to list out all the versions of all the
sub-parts they have? Who is going to consistently recreate a user's
hodge-podge of versions in order to confirm a bug? Won't the advice
instead be: "update all versions to the latest and get back to us"?

So, as I see it, all sub-parts would best be tested and released with a 
single new version number every time one sub-part is updated 
(significantly). In which case, why have sub-parts at all? Keeping 
things the way they are now means ease of release for the pumpkin and 
ease of installation for end-users (only one install command to issue to 
CPAN). Having 'true' sub-parts (each with its own pumpkin), in my 
fatalistic view, is just going to lead to some useful sub-parts being 
abandoned and never updated, even where updates may be desirable.

Each and every Bio:: module could have been released separately by its 
respective author. As I see it, one of the main values of 'Bioperl' is 
that it's one (reasonably) consistent collection of modules that lowers
the barrier of entry for new Bioinformaticians, giving them extremely 
easy access to a whole host of functionality with a single install.


From hlapp at gmx.net  Tue Jun 19 08:47:02 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 08:47:02 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46777F31.7030402@sendu.me.uk>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
Message-ID: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>

So the real mistake was to write

  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

instead of

  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents($node);

I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the  
database?

If this is correct, can we highlight this in the documentation? It's  
a small difference that everyone failed to spot.

If it is not correct, then maybe we need to revisit the rationale for  
why a Bio::DB::Taxonomy::get_all_Descendents may not query the  
underlying database.

Also, in my reading of Bio::Taxonomy::Taxon it won't use the database  
either for ancestor(). Which would be consistent with its other methods.

I.e., the bottom line is don't use Node or Taxon objects for  
hierarchy queries that you expect to use an underlying database, use  
the Bio::DB::Taxonomy object instead. It makes sense, but is it true?

	-hilmar

On Jun 19, 2007, at 3:01 AM, Sendu Bala wrote:

> Jason Stajich wrote:
>> The reason it isn't printing anything is that someone didn't write
>> the implementation quite right. This code was overhauled by Sendu
>> before the last release; I guess something didn't quite get connected.
>>
>> I checked in code that has the Bio::Taxon delegating now to a DB
>> handle for the each_Descendent call.
>> You can either patch your code  or just use the code listed here:
>>   http://bioperl.org/wiki/Module:Bio::DB::Taxonomy
>
> I've reverted that change.
>
> For some reason the docs for Bio::Taxon::each_Descendent aren't showing
> up on the website, but they state:
>
> ---
> Note that this method never asks the database for the descendents; it
> will only return objects you have manually set with add_Descendent(), or
> where this was done for you by making a Bio::Tree::Tree with this object
> as an argument to new().
>
> To get the database descendents use
> $taxon->db_handle->each_Descendent($taxon).
> ---
>
>
> I also have a note in the Synopsis for the module:
>
> ---
> # Though be careful with each_Descendent - unless you add_Descendent()
> # yourself, you won't get an answer because unlike for ancestor(),
> # Bio::Taxon does not ask the database for the answer. You can ask the
> # database yourself using the same method:
> ($human) = $homo->db_handle->each_Descendent($homo);
> ---
>
>
> This is quite deliberate and is to prevent Bad Things from happening.
> (Can't exactly remember the reasoning now, but I know it was good.)
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From rvos at interchange.ubc.ca  Tue Jun 19 09:05:25 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Tue, 19 Jun 2007 06:05:25 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <15433211.1182258325544.JavaMail.myubc2@brahms.my.ubc.ca>


> Unrelated, but it randomly just occurred to me: what happens to all the 
> id lines at the top of modules? Eg:
> 
> $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $
> 
> That's a cvs-specific thing, right? Do we delete them all? (Regardless, 
> I wish we would, since they caused me no end of hassles during the 1.5.2 
> release, doing updates across branches.)

If you run something like 'svn propset svn:keywords Id' on the file/folder/recursively, svn picks up on the $Id tag. The structure of the resulting string would be a little different, because svn revision numbers are simply auto-increasing integers (afaik) - so any regular expressions that cleverly want to include the revision number in $VERSION would need to be updated.
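
As a rough illustration of the kind of regular-expression update mentioned above, here is a minimal sketch that pulls the revision out of either keyword style. The sub name revision_from_id and the SVN revision number 12345 are made up for the example; the SVN layout assumed is "file revision date time author", and file names containing spaces are not handled.

```perl
use strict;
use warnings;

# Extract a revision string from either a CVS- or an SVN-style $Id keyword.
# (revision_from_id is a hypothetical helper, not part of Bioperl.)
sub revision_from_id {
    my ($id) = @_;
    # CVS: $Id: file,v 1.28 2007/06/14 14:16:10 author Exp $
    return $1 if $id =~ /^\$Id: \S+,v (\d+(?:\.\d+)+) /;
    # SVN: $Id: file 12345 2007-06-14 14:16:10Z author $
    return $1 if $id =~ /^\$Id: \S+ (\d+) \d{4}-\d{2}-\d{2} /;
    return;    # unexpanded or unrecognised keyword
}

my $cvs_id = '$Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $';
my $svn_id = '$Id: bl2seq.pm 12345 2007-06-14 14:16:10Z sendu $';  # invented revision

print revision_from_id($cvs_id), "\n";
print revision_from_id($svn_id), "\n";
```

A module that derives $VERSION from the keyword would need the second pattern (or something like it) once the file is under svn.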


From bix at sendu.me.uk  Tue Jun 19 10:25:26 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 15:25:26 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
	<5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
Message-ID: <4677E756.6050200@sendu.me.uk>

Hilmar Lapp wrote:
> So the real mistake was to write
> 
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
> 
> instead of
> 
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents 
> ($node);
> 
> I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the  
> database?

Yes, the database object methods use the database. I don't even think it 
makes sense to question that. What else would it do?


> If this is correct, can we highlight this in the documentation? It's  
> a small difference that everyone failed to spot.

The documentation for what? I've already clearly pointed out the gotcha 
in Bio::Taxon.


> Also, in my reading of Bio::Taxonomy::Taxon it won't use the database  
> either for ancestor(). Which would be consistent with its other methods.

Bio::Taxonomy::Taxon? It's deprecated. Bio::Taxon is what we're dealing
with, and it /does/ use the db to get the ancestor, unless the ancestor 
is manually set (see below for explanation).


> I.e., the bottom line is don't use Node or Taxon objects for  
> hierarchy queries that you expect to use an underlying database, use  
> the Bio::DB::Taxonomy object instead. It makes sense, but is it true?

Almost. It happens to be true but ideally wouldn't be the case. The 
confusion and problems arise, I guess, because we have two ways to 
access/create hierarchies and both of them are built from the same 
building block (Bio::Taxon objects).

On the one hand we have Bio::DB::Taxonomy and the other we have 
Bio::Tree::Tree.

Tree objects are easy: you have a Taxon object created in memory for 
each and every node in the tree. Each Taxon knows its ancestor and 
descendants by storing references to the relevant Taxon objects in the 
tree. You 'navigate' through the tree by grabbing a Taxon inside it and 
asking the Taxon itself for its ancestor or descendant.

This leaves us with the Taxon object having the methods ancestor() and 
each_Descendent(), which we'll expect to work in other circumstances.

Bio::DB::Taxonomy returns single Taxon objects from the database on 
request. Now we still expect our ancestor() and each_Descendent() 
methods to work, but if things were set up like Bio::Tree::Tree we'd end 
up pulling the entire database into memory because we'd have to create 
all the Taxon objects that are ancestors and descendants, recursively, 
every time we request a single Taxon (which is wasteful in the case of 
Bio::DB::Taxonomy::flatfile and slow/not allowed in the case of 
Bio::DB::Taxonomy::entrez).

The solution? We simply don't create the immediate ancestor or 
descendant Taxon objects of the requested Taxon, and instead implement 
the Taxon methods to ask the database to create them on demand, if they 
don't already exist. Well, that idea is fine (and necessary) for the 
ancestor method, but we run into problems with each_Descendent().

The problem arises when we create Bio::Tree::Tree objects from a Taxon 
we got from the database. Being able to do that is why Bio::Taxon is 
shared between them, as it is a very desirable thing to do: you can 
instantly create a lineage tree for a Taxon of interest and then use all 
the Bio::Tree::Tree methods on it. Unfortunately one of those methods is 
get_nodes() which is implemented using each_Descendent() and 
get_all_Descendents(). If each_Descendent() asked the database for the 
real answer, we'd end up pulling the entire database into the tree.

So my implementation was to not ask the database and just warn people in 
the docs. Ideally it /would/ use the database, because that's what a 
user would expect. Can anyone see an alternate way around the problem?
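
To make the asymmetry described above concrete, here is a toy sketch in plain Perl. It is not the Bio::Taxon implementation: ToyDB, ToyTaxon, ancestor_of, descendents_of and the %PARENT table are all invented for illustration. It only mirrors the behaviour described in this message, where ancestor() asks the database on demand and each_Descendent() returns only what was added in memory.

```perl
use strict;
use warnings;

# Toy stand-in for a taxonomy database: id => parent id (invented data).
my %PARENT = (human => 'homo', neanderthal => 'homo', homo => 'hominidae');

{
    package ToyDB;
    sub new { bless {}, shift }
    # The database object always consults the data.
    sub ancestor_of {
        my ($self, $taxon) = @_;
        my $pid = $PARENT{ $taxon->{id} };
        return defined $pid ? ToyTaxon->new($pid, $self) : undef;
    }
    sub descendents_of {
        my ($self, $taxon) = @_;
        return map { ToyTaxon->new($_, $self) }
               sort grep { $PARENT{$_} eq $taxon->{id} } keys %PARENT;
    }
}

{
    package ToyTaxon;
    sub new {
        my ($class, $id, $db) = @_;
        return bless { id => $id, db => $db, children => [] }, $class;
    }
    # Lazy: ask the database on demand, unless an ancestor was set by hand.
    sub ancestor {
        my ($self, $new) = @_;
        $self->{ancestor} = $new if $new;
        $self->{ancestor} //= $self->{db}->ancestor_of($self) if $self->{db};
        return $self->{ancestor};
    }
    sub add_Descendent {
        my ($self, $child) = @_;
        push @{ $self->{children} }, $child;
        $child->ancestor($self);
        return;
    }
    # Not lazy: returns only what was added in memory, never queries the db.
    sub each_Descendent { @{ $_[0]{children} } }
}

my $db     = ToyDB->new;
my $homo   = ToyTaxon->new('homo', $db);
my @mem    = $homo->each_Descendent;       # nothing was added by hand
my @dbkids = $db->descendents_of($homo);   # ask the database directly

print "ancestor: ", $homo->ancestor->{id}, "\n";
print "in-memory children: ", scalar(@mem), "\n";
print "database children: ", join(', ', map { $_->{id} } @dbkids), "\n";
```

If each_Descendent() in this toy were changed to call descendents_of(), any tree traversal built on it would walk the whole table, which is the problem being described.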


From hlapp at gmx.net  Tue Jun 19 12:14:38 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 12:14:38 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <4677E756.6050200@sendu.me.uk>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
	<5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
	<4677E756.6050200@sendu.me.uk>
Message-ID: <C2348A85-2F44-4AD5-8996-DDA19B79F994@gmx.net>

Sorry I was accidentally looking at an older branch.

Reading through the Taxon module, though, I am more confused than I am
comfortable with.

Here's what I understand of your description of the problem:

- We would like nodes returned from Bio::DB::Taxonomy to use the  
database for all hierarchical queries.

- We would like nodes used in a Bio::Tree::Tree not to use the  
database for any hierarchical query.

What I understand we have is

- Taxon node objects that have a db_handle set will use the database  
for ancestor(), unless it has been set manually (?), but not for  
each_Descendent().

- Taxon node objects that don't have a db_handle set won't use a  
database but will function normally otherwise.

- This is needed to prevent Bio::Tree::Tree methods from pulling the  
entire tree into memory.

If this is correct (I'm not sure it is), it sounds like we want to  
temporarily divorce taxonomy nodes from their database capabilities  
while they are being queried in a tree context?

I'm still trying to understand - if I create a Bio::Tree::Tree from a  
single node, will the tree automatically contain all nodes along the  
lineage of ancestors up to the root? So, even if extracting this  
lineage involved querying a database it would be acceptable, but not  
for querying descendents?

It sounds to me like what is needed is that nodes that get added to a  
tree need to be stripped of their database capabilities. This could  
be achieved by creating a wrapper class that delegates all non- 
hierarchical methods to the wrapped Taxon object, and overriding all  
hierarchical queries to not use a database. I'm not sure I fully  
understand yet though, but the inconsistent behavior will be sure to  
throw people off track.
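
A minimal sketch of what such a wrapper could look like, under stated assumptions: TreeTaxon and ToyTaxon are hypothetical names, not Bioperl classes, and the wrapped object here is a stand-in whose ancestor() dies to show it is never reached. Hierarchy methods are overridden to use in-memory links only; everything else is delegated through AUTOLOAD.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for a database-aware taxon.
    package ToyTaxon;
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub scientific_name { $_[0]{name} }
    sub rank            { $_[0]{rank} }
    sub ancestor        { die "would query the database\n" }
}

{
    # Wrapper: hierarchy methods use in-memory links only; all other
    # methods are delegated to the wrapped object.
    package TreeTaxon;
    our $AUTOLOAD;
    sub new {
        my ($class, $taxon) = @_;
        return bless { taxon => $taxon, ancestor => undef, children => [] }, $class;
    }
    sub ancestor        { $_[0]{ancestor} }
    sub each_Descendent { @{ $_[0]{children} } }
    sub add_Descendent {
        my ($self, $child) = @_;
        push @{ $self->{children} }, $child;
        $child->{ancestor} = $self;
        return scalar @{ $self->{children} };
    }
    sub AUTOLOAD {
        my $self = shift;
        (my $method = $AUTOLOAD) =~ s/.*:://;
        return if $method eq 'DESTROY';
        return $self->{taxon}->$method(@_);
    }
}

my $homo  = TreeTaxon->new(ToyTaxon->new(name => 'Homo', rank => 'genus'));
my $human = TreeTaxon->new(ToyTaxon->new(name => 'Homo sapiens', rank => 'species'));
$homo->add_Descendent($human);

print $human->scientific_name, "\n";            # delegated to the wrapped taxon
print $human->ancestor->scientific_name, "\n";  # in-memory link, no database call
```

The trade-off is that code receiving a wrapped node no longer sees a plain Taxon, so isa() checks against the wrapped class would fail unless the wrapper also handles them.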

	-hilmar

On Jun 19, 2007, at 10:25 AM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> So the real mistake was to write
>>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>> instead of
>>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>>   my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents($node);
>> I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the
>> database?
>
> Yes, the database object methods use the database. I don't even  
> think it makes sense to question that. What else would it do?
>
>
>> If this is correct, can we highlight this in the documentation?  
>> It's  a small difference that everyone failed to spot.
>
> The documentation for what? I've already clearly pointed out the  
> gotcha in Bio::Taxon.
>
>
>> Also, in my reading of Bio::Taxonomy::Taxon it won't use the  
>> database  either for ancestor(). Which would be consistent with  
>> its other methods.
>
> Bio::Taxonomy::Taxon? It's deprecated. Bio::Taxon is what we're
> dealing with, and it /does/ use the db to get the ancestor, unless  
> the ancestor is manually set (see below for explanation).
>
>
>> I.e., the bottom line is don't use Node or Taxon objects for   
>> hierarchy queries that you expect to use an underlying database,  
>> use  the Bio::DB::Taxonomy object instead. It makes sense, but is  
>> it true?
>
> Almost. It happens to be true but ideally wouldn't be the case. The  
> confusion and problems arise, I guess, because we have two ways to  
> access/create hierarchies and both of them are built from the same  
> building block (Bio::Taxon objects).
>
> On the one hand we have Bio::DB::Taxonomy and the other we have  
> Bio::Tree::Tree.
>
> Tree objects are easy: you have a Taxon object created in memory  
> for each and every node in the tree. Each Taxon knows its ancestor  
> and descendants by storing references to the relevant Taxon objects  
> in the tree. You 'navigate' through the tree by grabbing a Taxon  
> inside it and asking the Taxon itself for its ancestor or descendant.
>
> This leaves us with the Taxon object having the methods ancestor()  
> and each_Descendent(), which we'll expect to work in other  
> circumstances.
>
> Bio::DB::Taxonomy returns single Taxon objects from the database on  
> request. Now we still expect our ancestor() and each_Descendent()  
> methods to work, but if things were set up like Bio::Tree::Tree  
> we'd end up pulling the entire database into memory because we'd  
> have to create all the Taxon objects that are ancestors and  
> descendants, recursively, every time we request a single Taxon  
> (which is wasteful in the case of Bio::DB::Taxonomy::flatfile and  
> slow/not allowed in the case of Bio::DB::Taxonomy::entrez).
>
> The solution? We simply don't create the immediate ancestor or  
> descendant Taxon objects of the requested Taxon, and instead  
> implement the Taxon methods to ask the database to create them on  
> demand, if they don't already exist. Well, that idea is fine (and  
> necessary) for the ancestor method, but we run into problems with  
> each_Descendent().
>
> The problem arises when we create Bio::Tree::Tree objects from a  
> Taxon we got from the database. Being able to do that is why  
> Bio::Taxon is shared between them, as it is a very desirable thing  
> to do: you can instantly create a lineage tree for a Taxon of  
> interest and then use all the Bio::Tree::Tree methods on it.  
> Unfortunately one of those methods is get_nodes() which is  
> implemented using each_Descendent() and get_all_Descendents(). If  
> each_Descendent() asked the database for the real answer, we'd end  
> up pulling the entire database into the tree.
>
> So my implementation was to not ask the database and just warn  
> people in the docs. Ideally it /would/ use the database, because  
> that's what a user would expect. Can anyone see an alternate way  
> around the problem?

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cain.cshl at gmail.com  Tue Jun 19 14:41:52 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Tue, 19 Jun 2007 14:41:52 -0400
Subject: [Bioperl-l] [Gmod-gbrowse] is this a bp_genbank2gff3.pl bug?
In-Reply-To: <18039.61086.829726.809888@gargle.gargle.HOWL>
References: <18039.61086.829726.809888@gargle.gargle.HOWL>
Message-ID: <1182278512.2592.42.camel@localhost.localdomain>

Hi Alessandra,

I cc'ed your message to the bioperl and sequence ontology mailing lists,
since your question is relevant to both.

Converting genbank files to GFF3 is excruciatingly difficult; I
generally find that I can use the genbank2gff3 script to get me most of
the way there, but then I need to do some manual fixing to make it
'right'.

I am using bioperl-live, since there have been several fixes to the
script since bioperl 1.5.2 was released, including the most recent fixes
from me today (when I started working on this); I would suggest you use
bioperl-live as well.  I ran the script on chrY.

Most (perhaps all) of the errors fit into a few categories:

  - CDS doesn't have a phase, where the GFF3 spec requires CDSes to have
a phase.  Since it can be a little bit of a hassle to calculate, I
understand why it was left out, but I'll submit a bug report to have
those calculated.  If you are planning on loading the GFF file into
Chado, you can use the --noCDS option to get exons instead of CDSes,
which makes the problem go away (the validator has a bug here though--it
reports the polypeptide derives_from mRNA as invalid, but it is correct;
I'm reporting that directly to the author).  Here's the bioperl bug
report:

  http://bugzilla.open-bio.org/show_bug.cgi?id=2322

  - "invalid type pair" is caused by the genbank file using feature
types in a way that conflicts with the Sequence Ontology.  For example,
it has STS features that are part_of a gene, pseudogenic_region as
part_of pseudogene.  I don't know if there would be an easy way to catch
this in the conversion script.  You may need to fix these by hand.  If
the problems occur for features that you don't care about, you can use
the --filter option to leave them out of the resulting GFF file (for
example, adding '--filter STS' would leave all STS features out of the
file).  Also, if you don't plan on loading these into Chado (which does
require SO-compliance) but instead plan on using a Bio::DB::SeqFeature
database, these errors may not be a problem.

  - "invalid type" is caused by feature types that are not in SOFA
(Sequence Ontology for Feature Annotation), though the terms probably
are in SO.  I thought at one point we discussed allowing any SO type to
appear in the GFF3 type column, but that is not what the spec says now.
I don't see this type of error as causing a problem for either
Bio::DB::SeqFeature or Chado.  Chado allows features to be typed with
anything that is in SO and does not restrict to SOFA.
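
For the missing-phase case in the first item, the calculation itself is short once the CDS segments are in transcription order. The following is only a sketch, not what bp_genbank2gff3.pl does: the sub name cds_phases and the segment lengths are invented, and it assumes the caller has already sorted the segments 5' to 3' on the coding strand.

```perl
use strict;
use warnings;

# Compute GFF3 phases for CDS segments given in transcription order.
# $first_phase is 0 for a CDS that starts on a codon boundary; for a
# GenBank /codon_start=N qualifier it would be N - 1.
sub cds_phases {
    my ($first_phase, @lengths) = @_;
    my @phases;
    my $phase = $first_phase;
    for my $len (@lengths) {
        push @phases, $phase;
        # Bases left over after the last complete codon of this segment...
        my $leftover = ($len - $phase) % 3;
        # ...fix how many bases the next segment needs to finish that codon.
        $phase = (3 - $leftover) % 3;
    }
    return @phases;
}

my @phases = cds_phases(0, 100, 50, 31);   # made-up segment lengths
print join(',', @phases), "\n";
```

Here the first segment leaves one base over, so the second segment starts with phase 2; the second then ends on a codon boundary, so the third has phase 0.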

Scott


On Tue, 2007-06-19 at 16:56 +0200, Alessandra Bilardi wrote:
> Hi all,
> 
> I used bp_genbank2gff3.pl from CVS bioperl to create a GFF3 file from a
> human genbank file. I ran the online validate_gff3 on human.gff, and
> it reports non-unique IDs, so loading it into the gbrowse database fails.
> 
> I attach the error files for hs_ref_chrY.gbk and hs_ref_chr1.gbk, which
> I downloaded from ftp://ftp.ncbi.nih.gov/genomes/H_sapiens
> Elements with non-unique IDs are:
> - CDS or pseudo*exon without mRNA and parent
> - STS with equal start and end
> - tRNA with equal name
> - tRNA with egual name
> 
> If this is a bp_genbank2gff3.pl bug, can you rectify bp_genbank2gff3.pl?
> If I'm mistaken, can you help me?
> 
> Thanks very much for the help in advance,
> 
> Alessandra.
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From sac at bioperl.org  Tue Jun 19 14:54:39 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Tue, 19 Jun 2007 11:54:39 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <467788C5.6070406@sendu.me.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
Message-ID: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>

Valid points, Sendu. I wonder if there might be a best-of-both-worlds
approach here. I would not be advocating for a major slice and dice,
but just identifying a few large, reasonably well established and
encapsulated blocks of functionality that could be managed more
independently and segregating them away from the rest. For example:
DB, Graphics, Search+SearchIO, Tools.

Once per year, we could have a "whole caboodle" release where the core
and all sub parts are tested and released as a group, as we currently
do. Then, updates to the sub parts can occur as-needed but without
necessarily involving updates to other sub parts or the core.

The onus would be on the pumpkin for the sub part release to make sure
it continues to work with the last whole caboodle release. This would
minimize the number of release clashes, since sub part updates would
only be sanctioned relative to the last caboodle release, and it would
ensure that the whole set continues to interoperate.

Perhaps it would be worth experimenting with such an approach so we
can judge it based on actual experience. We could identify one
functional sub part and segregate it out, do a release cycle or two,
along with a sub part release, and decide if this makes things easier
or harder, for devs as well as users. We could always bring it back
into the fold if it doesn't work out.

My fear is that as bioperl continues to grow, the monolithic approach
will become increasingly onerous for a single release pumpkin to
manage, and harder to find someone who feels up to the task. It could
also discourage new developers from diving into the codebase if it
looks too deep. And they are our lifeblood.

A more functionally segregated bioperl codebase could lower the
activation energy needed to recruit release pumpkins and new devs,
leading to more release iterations, fewer bugs, more features, and
more sustainable growth.

When I first discovered Bioperl in 1996, it had three modules. At
~900, I  probably wouldn't have joined ranks as a developer (well, I
probably would, but it would have taken a while to digest it and
become a contributor).

Steve

On 6/19/07, Sendu Bala <bix at sendu.me.uk> wrote:
> Steve Chervitz wrote:
> > Might this be a good opportunity to investigate partitioning
> > bioperl-live into sub-repositories? There has been talk in the past of
> > defining a set of "core" modules separate from other functionally
> > related groups of modules that would be viewed as optional extensions.
> > The goal being to help manage growth and simplify releases. There are
> > currently 892 modules under Bio/.
> >
> > In addition to simplifying the migration to SVN, it would also have
> > other benefits. Say some new functionality or a slew of fixes were
> > added to Bio::Graphics. We could turn around a new Bio::Graphics
> > release quickly without having to work on getting various other parts
> > up to snuff that aren't related to graphics (Biblio, DB, PopGen,
> > Search etc.). Maintenance and releases of the various extensions would
> > be more parallelizable, orchestrated by separate ring leaders.
> >
> > Over time, as a set of functionality matures, it would see fewer
> > updates and there would be less of a need for users to
> > download/install/test it. This could make bioperl easier to customize,
> > extend, and grok in general.
> >
> > Long term, it should ease development and release cycles
>
> I actually take the opposite view. Breaking things up makes testing and
> releases more difficult.
>
> If one person acts as pumpkin for all the sub-parts, his work-load
> increases almost linearly with the number of sub-parts. If each sub-part
> gets its own pumpkin, where do all these pumpkins come from? It seems to
> me that frequently authors will write modules but inevitably their
> circumstance changes and they can no longer devote the time to look
> after them. Having a single pumpkin and 'forcing' him to make sure
> everything works (regardless of his personal interest in the module)
> seems more reliable than hoping there will be a person interested enough
> in each sub-part to handle its release.
>
> Since all sub-parts will at the least interact with the 'true' core set
> of Bioperl modules, they need to be tested and potentially re-released
> every time the true core is updated. And since some sub-parts will
> interact with other sub-parts, there will need to be coordinated
> joint-testing and release of multiple sub-parts.
>
> What happens when users report problems? We ask them what version
> they're running. Right now '1.5.2' means a specific thing, and it's
> trivial for someone to confirm the same problem by installing 1.5.2.
> What happens when users have to list out all the versions of all the
> sub-parts they have? Who is going to consistently recreate a user's
> hodge-podge of versions in order to confirm a bug? Won't the advice
> instead be: "update all versions to the latest and get back to us"?
>
> So, as I see it, all sub-parts would best be tested and released with a
> single new version number every time one sub-part is updated
> (significantly). In which case, why have sub-parts at all? Keeping
> things the way they are now means ease of release for the pumpkin and
> ease of installation for end-users (only one install command to issue to
> CPAN). Having 'true' sub-parts (each with its own pumpkin), in my
> fatalistic view, is just going to lead to some useful sub-parts being
> abandoned and never updated, even where updates may be desirable.
>
> Each and every Bio:: module could have been released separately by its
> respective author. As I see it, one of the main values of 'Bioperl' is
> that it's one (reasonably) consistent collection of modules that lowers
> the barrier of entry for new Bioinformaticians, giving them extremely
> easy access to a whole host of functionality with a single install.
>


From bix at sendu.me.uk  Tue Jun 19 15:13:39 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 20:13:39 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
Message-ID: <46782AE3.2090703@sendu.me.uk>

Steve Chervitz wrote:
> Valid points, Sendu. I wonder if there might be a best-of-both-worlds
> approach here.
[snip]

You haven't convinced me, but I'd go along with the majority decision if 
best-of-both-worlds was picked.


> DB, Graphics, Search+SearchIO, Tools.

I will, however, say that DB interleaves into too many core modules. It
should stay in core. Tools? It's hardly touched anyway, so I don't see
the value of taking it out, what with Bio::Tools::Run already being its
own package. Most Bioperl users probably get Bioperl just to do
something Blast related, so all Blast stuff really ought to stay in core.

Graphics is an obvious choice and I agree. Updated frequently, and has 
its own release needs. It also has some of the trickier dependencies, so 
would make installing core simpler.

I can imagine plucking Search+SearchIO out, and it's something that needs 
regular updating. Another good candidate.


> Perhaps it would be worth experimenting with such an approach so we
> can judge it based on actual experience. We could identify one
> functional sub part and segregate it out, do a release cycle or two,
> along with a sub part release, and decide if this makes things easier
> or harder, for devs as well as users.

Well, we already have the run package. It's a split-off subpart that gets 
updated. The only 'experiment' left to do is finding it its own pumpkin.


From bix at sendu.me.uk  Tue Jun 19 15:48:50 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 20:48:50 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <C2348A85-2F44-4AD5-8996-DDA19B79F994@gmx.net>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
	<5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
	<4677E756.6050200@sendu.me.uk>
	<C2348A85-2F44-4AD5-8996-DDA19B79F994@gmx.net>
Message-ID: <46783322.30309@sendu.me.uk>

Hilmar Lapp wrote:
> Here's what I understand of your description of the problem:
> 
> - We would like nodes returned from Bio::DB::Taxonomy to use the  
> database for all hierarchical queries.
> 
> - We would like nodes used in a Bio::Tree::Tree not to use the  
> database for any hierarchical query.

Correct.


> What I understand that we have is
> 
> - Taxon node objects that have a db_handle set will use the database  
> for ancestor(), unless it has been set manually (?), but not for  
> each_Descendent().
> 
> - Taxon node objects that don't have a db_handle set won't use a  
> database but will function normally otherwise.
> 
> - This is needed to prevent Bio::Tree::Tree methods from pulling the  
> entire tree into memory.

Correct.


> If this is correct (I'm not sure it is), it sounds like we want to  
> temporarily divorce taxonomy nodes from their database capabilities  
> while they are being queried in a tree context?

Yes.


> I'm still trying to understand - if I create a Bio::Tree::Tree from a  
> single node, will the tree automatically contain all nodes along the  
> lineage of ancestors up to the root? So, even if extracting this  
> lineage involved querying a database it would be acceptable, but not  
> for querying descendents?

Yes. Asking the database for all the ancestors up to root only pulls a 
couple of nodes into the tree and is exactly what the user would want to 
happen. But if nodes are allowed to get their descendants from the 
database, when we get the root node from the database, we'd get all the 
root's descendants, and then for each of those we'd get all /their/ 
descendants... that's when the whole db gets sucked in.


> It sounds to me like what is needed is that nodes that get added to a  
> tree need to be stripped of their database capabilities. This could  
> be achieved by creating a wrapper class that delegates all non- 
> hierarchical methods to the wrapped Taxon object, and overriding all  
> hierarchical queries to not use a database. I'm not sure I fully  
> understand yet though, but the inconsistent behavior will be sure to  
> throw people off track.

When we're making a tree from a db Taxon we need db access to find all 
the ancestors; we just don't want to get any descendants outside our 
initiating Taxon's direct lineage.


use Test::More 'no_plan';
use Bio::DB::Taxonomy;
use Bio::Tree::Tree;

my @names = ('Eukaryota', 'Mammalia', 'Primates', 'Homo', 'Homo sapiens');
my @ranks = qw(superkingdom class order genus species);
my $db = Bio::DB::Taxonomy->new(-source => 'list', -names => \@names,
                                                    -ranks => \@ranks);

@names = ('Eukaryota', 'Mammalia', 'Rodentia', 'Mus', 'Mus musculus');
$db->add_lineage(-names => \@names, -ranks => \@ranks);


my $homo = $db->get_taxon(-name => 'Homo');
isa_ok($homo, 'Bio::Taxon'); # PASS

is $homo->ancestor->scientific_name, 'Primates'; # PASS
my @descs = $homo->each_Descendent;
is scalar(@descs), 1; # FAIL, we wanted it to contain the 'Homo sapiens' node


my $lineage = Bio::Tree::Tree->new(-node => $homo);
is $lineage->get_root_node->scientific_name, 'Eukaryota'; # PASS
my @nodes = $lineage->get_nodes;
is scalar(@nodes), 4; # PASS: we didn't pull in Rodentia which would be 8

(on that last test I can't remember if the answer might actually be 5 
because our lineage does contain 'Homo sapiens')


If anyone can figure out how to get all those to pass, please let me know.
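
A minimal sketch of the wrapper idea discussed above, assuming a toy
in-memory "database". The class names ToyDB, DBNode and TreeNode are
hypothetical and are not part of the Bio::DB::Taxonomy or Bio::Taxon API;
the point is only to show that a node can answer descendant queries from
the database on its own, while its tree wrapper follows the database
upwards for ancestors and never downwards for descendants.

```perl
use strict;
use warnings;

# Toy stand-in for a taxonomy database: child name => parent name.
# (Hypothetical; not the Bio::DB::Taxonomy interface.)
package ToyDB;
sub new {
    my ($class, %parent_of) = @_;
    return bless { parent_of => {%parent_of} }, $class;
}
sub parent_name { $_[0]{parent_of}{ $_[1] } }
sub child_names {
    my ($self, $name) = @_;
    my $p    = $self->{parent_of};
    my @kids = grep { $p->{$_} eq $name } keys %$p;
    return sort @kids;
}

# Database-backed node: every hierarchical query goes to the database.
package DBNode;
sub new {
    my ($class, $db, $name) = @_;
    return bless { db => $db, name => $name }, $class;
}
sub name { $_[0]{name} }
sub ancestor {
    my $self   = shift;
    my $parent = $self->{db}->parent_name($self->{name});
    return defined $parent ? DBNode->new($self->{db}, $parent) : undef;
}
sub each_Descendent {
    my $self = shift;
    my @kids = $self->{db}->child_names($self->{name});
    return map { DBNode->new($self->{db}, $_) } @kids;
}

# Wrapper used inside a tree: delegates name() to the wrapped node, may
# ask the database for ancestors, but reports only those descendants that
# were attached while walking the lineage.
# (Parent and child reference each other; a real implementation would
# weaken one side to avoid leaking memory.)
package TreeNode;
sub new {
    my ($class, $node) = @_;
    return bless { node => $node, children => [] }, $class;
}
sub name            { $_[0]{node}->name }
sub each_Descendent { @{ $_[0]{children} } }
sub ancestor {
    my $self = shift;
    return $self->{ancestor} if exists $self->{ancestor};
    my $db_parent = $self->{node}->ancestor;
    my $parent    = $db_parent ? TreeNode->new($db_parent) : undef;
    push @{ $parent->{children} }, $self if $parent;
    return $self->{ancestor} = $parent;
}

package main;

my $db = ToyDB->new(
    'Mammalia'     => 'Eukaryota',
    'Primates'     => 'Mammalia',
    'Homo'         => 'Primates',
    'Homo sapiens' => 'Homo',
    'Rodentia'     => 'Mammalia',
    'Mus'          => 'Rodentia',
    'Mus musculus' => 'Mus',
);

# Walk from a node up to the root, collecting names root-first.
sub lineage_names {
    my ($node) = @_;
    my @names;
    for (my $n = $node; $n; $n = $n->ancestor) {
        unshift @names, $n->name;
    }
    return @names;
}

my $homo    = DBNode->new($db, 'Homo');
my $in_tree = TreeNode->new($homo);
my @lineage = lineage_names($in_tree);
print join(' > ', @lineage), "\n";
```

Outside the tree, $homo->each_Descendent still finds 'Homo sapiens' via
the database. Inside the tree, the wrapped Mammalia node lists only
Primates, so Rodentia and its descendants are never pulled in.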


From cjfields at uiuc.edu  Tue Jun 19 17:15:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 19 Jun 2007 16:15:00 -0500
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
Message-ID: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>


On Jun 19, 2007, at 1:54 PM, Steve Chervitz wrote:

> Valid points, Sendu. I wonder if there might be a best-of-both-worlds
> approach here. I would not be advocating for a major slice and dice,
> but just identifying a few large, reasonably well established and
> encapsulated blocks of functionality that could be managed more
> independently and segregating them away from the rest. For example:
> DB, Graphics, Search+SearchIO, Tools.

There should also be a consensus between the core devs on this; I  
don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing  
their opinions as it will directly impact projects which rely on core  
functionality (GBrowse/GMOD, bioperl-db, etc).  I also agree with  
George that this should be postponed until after svn issues are taken  
care of.

Stating that, I think this is a good idea in general, though we'll  
need to be careful which ones we segregate out as non-core.  I agree  
with your choices; I would add in Bio::Restriction, Bio::Assembly,  
Bio::Structure, and a few more.  As long as the distribution required  
installation of 'core' prior to test runs it shouldn't be too much of  
a problem.

In order for this to work we would need to delineate what defines  
'core' (how broad the definition should be), then identify those  
modules that don't fit and decide what to do with them.  Would we  
want to split the others into separate packages or lump together as a  
bioperl-auxiliary (horrid name, but you get my point)?  Too many  
could be a logistical nightmare, as Sendu has pointed out.

> Once per year, we could have a "whole caboodle" release where the core
> and all sub parts are tested and released as a group, as we currently
> do. Then, updates to the sub parts can occur as-needed but without
> necessarily involving updates to other sub parts or the core.

Sounds fine by me.  Actually, my thought was we could reimplement  
Bundle::BioPerl on CPAN (which Module::Build effectively obsoleted)  
to install all the necessary subpackages in order to emulate an old- 
style 'core' installation, or act as an 'install everything BioPerl- 
related' Bundle.  Regular updates of the subpackages to CPAN should  
just require updating the Bundle (which would update only the  
relevant parts, at least I believe it would).
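
For reference, a CPAN bundle is an ordinary module whose POD carries a
CONTENTS section with one module name per line (optionally followed by a
version and a comment after a dash). A minimal sketch of what such a
bundle might look like follows; the package name Bundle::BioPerlSketch,
the version number and the selection of listed modules are assumptions
for illustration only, not a proposal for the actual contents.

```perl
package Bundle::BioPerlSketch;   # hypothetical name, for illustration

use strict;
use warnings;

our $VERSION = '0.01';           # hypothetical version

=head1 NAME

Bundle::BioPerlSketch - sketch of a bundle pulling in several BioPerl parts

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerlSketch'

=head1 CONTENTS

Bio::Root::Root - stands in for the core distribution

Bio::SearchIO - stands in for a search/report-parsing sub-package

Bio::Graphics - stands in for a graphics sub-package

Bio::Tools::Run::Alignment::Clustalw - stands in for the run package

=cut

1;
```

CPAN.pm reads the CONTENTS section and installs whichever distribution
currently provides each listed module, so a layout like this would not
need to know how the modules end up being partitioned.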

> The onus would be on the pumpkin for the sub part release to make sure
> it continues to work with the last whole caboodle release. This would
> minimize the number of release clashes, since sub part updates would
> only be sanctioned relative to the last caboodle release, and it would
> ensure that the whole set continues to interoperate.
>
> Perhaps it would be worth experimenting with such an approach so we
> can judge it based on actual experience. We could identify one
> functional sub part and segregate it out, do a release cycle or two,
> along with a sub part release, and decide if this makes things easier
> or harder, for devs as well as users. We could always bring it back
> into the fold if it doesn't work out.
>
> My fear is that as bioperl continues to grow, the monolithic approach
> will become increasingly onerous for a single release pumpkin to
> manage, and harder to find someone who feels up to the task. It could
> also discourage new developers from diving into the codebase if it
> looks too deep. And they are our lifeblood.

Agreed!

> A more functionally segregated bioperl codebase could lower the
> activation energy needed to recruit release pumpkins and new devs,
> leading to more release iterations, fewer bugs, more features, and
> more sustainable growth.

'Activation energy.'  Hmm.  Spoken like a true biologist.

> When I first discovered Bioperl in 1996, it had three modules. At
> ~900, I  probably wouldn't have joined ranks as a developer (well, I
> probably would, but it would have taken a while to digest it and
> become a contributor).
>
> Steve

I pretty much agree, though this will require quite a bit more  
discussion.

chris


From hlapp at gmx.net  Tue Jun 19 17:57:54 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 17:57:54 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
Message-ID: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>


On Jun 19, 2007, at 5:15 PM, Chris Fields wrote:

> There should also be a consensus between the core devs on this; I
> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing
> their opinions

The problem I have increasingly had with BioPerl (aside from the fact  
that it's written in Perl ;) is the plethora of dependencies I need  
to install, not the number of modules.

But every time I've been told that that's what Perl is all about, and  
I should shut up and install the bundle. Idiosyncratically I don't  
like bundles that clutter up my hard disk with stuff I'll never use,  
and in this sense if BioPerl is divided into 10 packages I will have  
to think about each one whether I need it, and do a separate CVS  
checkout - and regular update - of each one (though granted, I  
believe there are ways the multiple checkout and update thing can be  
taken care of).

In reality, this may be a rapidly disappearing trait though of those  
who have grown up in a time when they proudly spent all their savings  
to buy that new computer because it had a 20MB hard disk, compared to  
the two 360k floppy drives the previous one had.

So don't ask me, just don't make it too hard for the dinosaurs.

> as it will directly impact projects which rely on core
> functionality (GBrowse/GMOD, bioperl-db, etc).

Well, I hope there are ways to limit that?

> I also agree with George that this should be postponed until after  
> svn issues are taken care of.

I agree entirely. Please don't throw this in the same bin or tie one  
to the other. The migration is neither easier nor faster nor better  
testable with a partitioned BioPerl.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Jun 19 21:48:20 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 19 Jun 2007 20:48:20 -0500
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
Message-ID: <D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>


On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote:

> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote:
>
>> There should also be a consensus between the core devs on this; I
>> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing
>> their opinions
>
> The problem I have increasingly had with BioPerl (aside from the fact
> that it's written in Perl ;) is the plethora of dependencies I need
> to install, not the number of modules.
>
> But every time I've been told that that's what Perl is all about, and
> I should shut up and install the bundle. Idiosyncratically I don't
> like bundles that clutter up my hard disk with stuff I'll never use,
> and in this sense if BioPerl is divided into 10 packages I will have
> to think about each one whether I need it, and do a separate CVS
> checkout - and regular update - of each one (though granted, I
> believe there are ways the multiple checkout and update thing can be
> taken care of).

I agree; the fewer dependencies the better.  We could divide it up  
into a small, focused core package with only a few dependencies, and  
1-3 more containing the focused bits which require the most  
maintenance (Graphics, SearchIO/Tools, etc).  I worry about having  
too many more.

> In reality, this may be a rapidly disappearing trait though of those
> who have grown up in a time when they proudly spent all their savings
> to buy that new computer because it had a 20MB hard disk, compared to
> the two 360k floppy drives the previous one had.
>
> So don't ask me, just don't make it too hard for the dinosaurs.

There would need to be some way of getting an old-style full-blown  
core installation regardless of how many subdistros we would divvy  
core up into.  My thought for CPAN was having Bundle::BioPerl take  
over this but I'm not sure if it's still being used.  Maybe there are  
other ways for svn/cvs.

>> as it will directly impact projects which rely on core
>> functionality (GBrowse/GMOD, bioperl-db, etc).
>
> Well, I hope there are ways to limit that?

I believe so, yes, particularly for bioperl-db.  I would think  
splitting off Bio::Graphics or Bio::DB* will have some effect on  
GBrowse/GFF.

>> I also agree with George that this should be postponed until after
>> svn issues are taken care of.
>
> I agree entirely. Please don't throw this in the same bin or tie one
> to the other. The migration is neither easier nor faster nor better
> testable with a partitioned BioPerl.
>
> 	-hilmar

We def. have to complete transition to subversion first, then think  
about this some more.

chris


From n.haigh at sheffield.ac.uk  Wed Jun 20 02:31:24 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 20 Jun 2007 07:31:24 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
Message-ID: <4678C9BC.10206@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote:
> 
>> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote:
>>
>>> There should also be a consensus between the core devs on this; I
>>> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing
>>> their opinions
>> The problem I have increasingly had with BioPerl (aside from the fact
>> that it's written in Perl ;) is the plethora of dependencies I need
>> to install, not the number of modules.
>>
>> But every time I've been told that that's what Perl is all about, and
>> I should shut up and install the bundle. Idiosyncratically I don't
>> like bundles that clutter up my hard disk with stuff I'll never use,
>> and in this sense if BioPerl is divided into 10 packages I will have
>> to think about each one whether I need it, and do a separate CVS
>> checkout - and regular update - of each one (though granted, I
>> believe there are ways the multiple checkout and update thing can be
>> taken care of).
> 
> I agree; the fewer dependencies the better.  We could divide it up  
> into a small, focused core package with only a few dependencies, and  
> 1-3 more containing the focused bits which require the most  
> maintenance (Graphics, SearchIO/Tools, etc).  I worry about having  
> too many more.
> 
>> In reality, this may be a rapidly disappearing trait though of those
>> who have grown up in a time when they proudly spent all their savings
>> to buy that new computer because it had a 20MB hard disk, compared to
>> the two 360k floppy drives the previous one had.
>>
>> So don't ask me, just don't make it too hard for the dinosaurs.
> 
> There would need to be some way of getting an old-style full-blown  
> core installation regardless of how many subdistros we would divvy  
> core up into.  My thought for CPAN was having Bundle::BioPerl take  
> over this but I'm not sure if it's still being used.  Maybe there are  
> other ways for svn/cvs.

Personally, I think this use of Bundle::BioPerl is more in line with
what CPAN Bundles were meant to do - "a bundle is a collection of
modules that comprise a cohesive unit". Under that definition you could
probably put the whole of Bioperl but I won't go there! When a package
is updated and a new release is made, this should be
installable/updatable via cpan as well as updating the bundle with the
correct version. This way you can get all of Bioperl via the bundle, or
just install the sub-packages on their own.

If the switch over to svn takes place, will all the Bioperl-* projects
move over at the same time? If so, will they go into their own svn
repository or into the same one? Since with svn you can checkout any
subtree of the repository I'm not clear on the pros and cons of either
of these options.

Am I right in thinking that there is a way for cvs to define a "project"
such that when you checkout that "project" it actually checks out
multiple projects behind the scene? I'm sure I've seen this somewhere,
possibly when the project is dependent on some 3rd party code that is
also in cvs. If this is possible, I'm sure it will also be possible with
svn. This could then allow something like the following to happen after
the split up of Bioperl. The following projects could be defined:
bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project"
called "bioperl" would actually checkout the real projects called
bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems
that this ought to be possible, doesn't it?


> 
>>> as it will directly impact projects which rely on core
>>> functionality (GBrowse/GMOD, bioperl-db, etc).
>> Well, I hope there are ways to limit that?
> 
> I believe so, yes, particularly for bioperl-db.  I would think  
> splitting off Bio::Graphics or Bio::DB* will have some effect on  
> GBrowse/GFF.
> 
>>> I also agree with George that this should be postponed until after
>>> svn issues are taken care of.
>> I agree entirely. Please don't throw this in the same bin or tie one
>> to the other. The migration is neither easier nor faster nor better
>> testable with a partitioned BioPerl.
>>
>> 	-hilmar
> 
> We def. have to complete transition to subversion first, then think  
> about this some more.
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeMm7czuW2jkwy2gRAi+CAJ9cNZ70GojV7eviRjdWTFLk/MKYoACg2Ls4
op9sQTZyeK6G6taFhTAPMYc=
=7NRw
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Wed Jun 20 07:46:16 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 07:46:16 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <4678C9BC.10206@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
Message-ID: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote:

> If the switch over to svn takes place, will all the Bioperl-* projects
> move over at the same time?

They are under the same CVSROOT right now. Locking down some sub- 
repositories but not others may be odd or impossible.

> If so, will they go into their own svn repository or into the same  
> one?

Good question, I'm not sure about the pros and cons one way or the  
other either. The fewer repositories the less sysadmin work in fine- 
graining permissions.

	-hilmar

- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFGeRONuV6N2JxL7qsRAoYTAJ9GVuC0j4szCcWTg7yWGoxN3YFucQCgogJ8
Ims4d150lsX0vXtDwGI1lKg=
=K4++
-----END PGP SIGNATURE-----


From n.haigh at sheffield.ac.uk  Wed Jun 20 07:57:22 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 20 Jun 2007 12:57:22 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
Message-ID: <46791622.6080409@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hilmar Lapp wrote:
> 
> On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote:
> 
>> If the switch over to svn takes place, will all the Bioperl-* projects
>> move over at the same time?
> 
> They are under the same CVSROOT right now. Locking down some
> sub-repositories but not others may be odd or impossible.
> 
>> If so, will they go into their own svn repository or into the same one?
> 
> Good question, I'm not sure about the pros and cons one way or the other
> either. The fewer repositories the less sysadmin work in fine-graining
> permissions.
> 
>     -hilmar
> 


I don't think there is any major reason why the following single repos
wouldn't do the trick:

/--
  |-bioperl-live
  |     |--- trunk
  |     |--- branches
  |     |--- tags
  |
  |-bioperl-run
        |--- trunk
        |--- branches
        |--- tags

Any reason why this couldn't be used?

I know some people don't like the idea of the revision number
incrementing for the whole repository if it contains several "projects".
However, revision numbers are really only a way for svn to keep track of
things and a very large revision number shouldn't really "upset" anyone.

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeRYiczuW2jkwy2gRApS5AJsHl73MWZP8aMfOqlLgTYuzpMWmQgCg3VqA
1Vj8BSUnanpdjYYLE6eGanU=
=bOqK
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Wed Jun 20 08:08:33 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 08:08:33 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <46791622.6080409@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
	<46791622.6080409@sheffield.ac.uk>
Message-ID: <DBFDD481-4377-4E7C-A4F6-B1B57A4D0A9F@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote:

> I don't think there is any major reason why the following single repos
> wouldn't do the trick:
>
> /--
>   |-bioperl-live
>   |     |--- trunk
>   |     |--- branches
>   |     |--- tags
>   |
>   |-bioperl-run
>         |--- trunk
>         |--- branches
>         |--- tags
>
> Any reason why this couldn't be used?

That would work fine except that there are several more sub-projects  
(bioperl-db, bioperl-graphics, bioperl-microarray, and a few more).

That should still be fine. I think what needs to be recognized is the  
limitations it puts on permission granularity. If it's all the same  
repository (as is now) then having commit rights to one (subproject)  
will mean commit rights to all. From my perspective that's fine, it  
has worked great so far.

	-hilmar

- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFGeRjFuV6N2JxL7qsRAj3dAJ42r1C8By29DNTUP9Ts0Lf5dOcS9QCgjSE1
hckjT7LBtHcmwGI8B+BKQIM=
=gYfA
-----END PGP SIGNATURE-----


From hartzell at alerce.com  Tue Jun 19 15:53:39 2007
From: hartzell at alerce.com (George Hartzell)
Date: Tue, 19 Jun 2007 12:53:39 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
Message-ID: <18040.13379.217277.992742@almost.alerce.com>

Steve Chervitz writes:
 > On 6/16/07, Jason Stajich <jason at bioperl.org> wrote:
 > > [...]
 > > Just to say I already went through all the steps of running cvs2svn
 > > myself and had problems gathering back out the branches and all the
 > > tags when I tried it.  If you want to start with a smaller repository
 > > like bioperl-network or bioperl-db as the initial cvs2svn conversion
 > > script took quite a long time to run on bioperl-live.
 > 
 > Might this be a good opportunity to investigate partitioning
 > bioperl-live into sub-repositories? [...]

I'd say that the time to do this kind of rearrangement would be
*after* the svn repo's set up.  That way you'll be able to track stuff
back through to the beginning of time.

g.


From sdavis2 at mail.nih.gov  Wed Jun 20 08:44:08 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 20 Jun 2007 08:44:08 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN
	and	...Re:	Perltidy)
In-Reply-To: <DBFDD481-4377-4E7C-A4F6-B1B57A4D0A9F@gmx.net>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>	<4678C9BC.10206@sheffield.ac.uk>	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>	<46791622.6080409@sheffield.ac.uk>
	<DBFDD481-4377-4E7C-A4F6-B1B57A4D0A9F@gmx.net>
Message-ID: <46792118.4030205@mail.nih.gov>

Hilmar Lapp wrote:
> 
> On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote:
> 
>> I don't think there is any major reason why the following single repos
>> wouldn't do the trick:
> 
>> /--
>>   |-bioperl-live
>>   |     |--- trunk
>>   |     |--- branches
>>   |     |--- tags
>>   |
>>   |-bioperl-run
>>         |--- trunk
>>         |--- branches
>>         |--- tags
> 
>> Any reason why this couldn't be used?
> 
> That would work fine except that there are several more sub-projects  
> (bioperl-db, bioperl-graphics, bioperl-microarray, and a few more).
> 
> That should still be fine. I think what needs to be recognized is the  
> limitations it puts on permission granularity. If it's all the same  
> repository (as is now) then having commit rights to one (subproject)  
> will mean commit rights to all. From my perspective that's fine, it  
> has worked great so far.

Actually, I think there are ways of creating per-directory access
control.  See here:

http://svnbook.red-bean.com/en/1.2/svn-book.html#svn.serverconfig.svnserve.auth.general

With Apache-based https access, such access control is relatively
straightforward, it appears.  With the standalone svn server over ssh,
one needs to use "commit hook scripts" to limit access.  But I think it
is possible (admitting that I have not tried to do this...).

Sean


From hartzell at alerce.com  Wed Jun 20 09:23:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 20 Jun 2007 06:23:32 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <4678C9BC.10206@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
Message-ID: <18041.10836.728079.835572@almost.alerce.com>

Nathan S. Haigh writes:
 > [...]
 > If the switch over to svn takes place, will all the Bioperl-* projects
 > move over at the same time? If so, will they go into their own svn
 > repository or into the same one? Since with svn you can checkout any
 > subtree of the repository I'm not clear on the pros and cons of either
 > of these options.

I'm planning to drop the projects from the top of the CVSROOT into a
single svn repository:

    bioperl-ext bioperl-pipeline biodata bioperl-gui
    bioperl-run bioperl-cookbook bioperl-live biosql-schema
    bioperl-corba-client bioperl-microarray html bioperl-corba-server
    bioperl-network task-manager bioperl-das-client bioperl-papers
    xml-html bioperl-db bioperl-pedigree

although that's open to feedback from the core members.

As a progress report, I've built a demo repos with -run, -ext, and
-live in it and asked a couple of folks to take a peek at it.  When
I get a bit further along I'll figure out how to get something for the
public to test.

 > Am I right in thinking that there is a way for cvs to define a "project"
 > such that when you checkout that "project" it actually checks out
 > multiple projects behind the scene? I'm sure I've seen this somewhere,
 > possibly when the project is dependent on some 3rd party code that is
 > also in cvs. If this is possible, I'm sure it will also be possible with
 > svn. This could then allow something like the following to happen after
 > the split up of Bioperl. The following projects could be defined:
 > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project"
 > called "bioperl" would actually checkout the real projects called
 > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems
 > that this ought to be possible, doesn't it?
 > [...]

I don't think that there's any functionality like that in svn.

g.


From hartzell at alerce.com  Wed Jun 20 09:26:04 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 20 Jun 2007 06:26:04 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <46791622.6080409@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
	<46791622.6080409@sheffield.ac.uk>
Message-ID: <18041.10988.375946.833182@almost.alerce.com>

Nathan S. Haigh writes:
 > -----BEGIN PGP SIGNED MESSAGE-----
 > Hash: SHA1
 > 
 > Hilmar Lapp wrote:
 > > 
 > > On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote:
 > > 
 > >> If the switch over to svn takes place, will all the Bioperl-* projects
 > >> move over at the same time?
 > > 
 > > They are under the same CVSROOT right now. Locking down some
 > > sub-repositories but not others may be odd or impossible.
 > > 
 > >> If so, will they go into their own svn repository or into the same one?
 > > 
 > > Good question, I'm not sure about the pros and cons one way or the other
 > > either. The fewer repositories the less sysadmin work in fine-graining
 > > permissions.
 > > 
 > >     -hilmar
 > > 
 > 
 > 
 > I don't think there is any major reason why the following single repos
 > wouldn't do the trick:
 > 
 > /--
 >   |-bioperl-live
 >   |     |--- trunk
 >   |     |--- branches
 >   |     |--- tags
 >   |
 >   |-bioperl-run
 >         |--- trunk
 >         |--- branches
 >         |--- tags
 > 
 > Any reason why this couldn't be used?
 > [...]

That's exactly the way that I'm setting it up.

g.


From n.haigh at sheffield.ac.uk  Wed Jun 20 09:33:33 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 20 Jun 2007 14:33:33 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <18041.10836.728079.835572@almost.alerce.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>	<4678C9BC.10206@sheffield.ac.uk>
	<18041.10836.728079.835572@almost.alerce.com>
Message-ID: <46792CAD.5060700@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George Hartzell wrote:
> Nathan S. Haigh writes:
>  > [...]
>  > If the switch over to svn takes place, will all the Bioperl-* projects
>  > move over at the same time? If so, will they go into their own svn
>  > repository or into the same one? Since with svn you can checkout any
>  > subtree of the repository I'm not clear on the pros and cons of either
>  > of these options.
> 
> I'm planning to drop the projects from the top of the CVSROOT into a
> single svn repository:
> 
>     bioperl-ext bioperl-pipeline biodata bioperl-gui
>     bioperl-run bioperl-cookbook bioperl-live biosql-schema
>     bioperl-corba-client bioperl-microarray html bioperl-corba-server
>     bioperl-network task-manager bioperl-das-client bioperl-papers
>     xml-html bioperl-db bioperl-pedigree
> 
> although that's open to feedback from the core members.
> 
> As a progress report, I've built a demo repos with -run, -ext, and
> -live in it and asked a couple of folks to take a peek at it.  When
> I get a bit further along I'll figure out how to get something for the
> public to test.

Could I take a peek??

> 
>  > Am I right in thinking that there is a way for cvs to define a "project"
>  > such that when you checkout that "project" it actually checks out
>  > multiple projects behind the scene? I'm sure I've seen this somewhere,
>  > possibly when the project is dependent on some 3rd party code that is
>  > also in cvs. If this is possible, I'm sure it will also be possible with
>  > svn. This could then allow something like the following to happen after
>  > the split up of Bioperl. The following projects could be defined:
>  > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project"
>  > called "bioperl" would actually checkout the real projects called
>  > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems
>  > that this ought to be possible, doesn't it?
>  > [...]
> 
> I don't think that there's any functionality like that in svn.


I did come across this which might help:
http://subversion.tigris.org/servlets/ReadMsg?listName=users&msgNo=43561

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeSytczuW2jkwy2gRAnlUAJ4pjhPlYlqOm+M882Ni116MJVzPCwCbB3Su
sWDAmqFhGgtlyeawaIGSV14=
=zeAY
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Wed Jun 20 11:38:20 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 20 Jun 2007 16:38:20 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
Message-ID: <467949EC.9040100@sendu.me.uk>

In considering updating all the test scripts to take advantage of the 
new network option, and/or reimplementing them in Test::More, I thought 
now would be a good time to standardize all the test scripts and reduce 
the possibility of having to alter them all in the future if something 
changes.

For example we could decide on an alternate way of choosing to run 
network tests, or a new way of deciding to output debug information. 
There are also some inconsistencies in the messages produced by tests 
skipping all, and even an unfortunate mistake that has been copy/pasted 
through a lot of test scripts.

My solution is t/lib/BioperlTest.pm (documented with perldoc)

We go from this:

----
use strict;
our $DEBUG;

BEGIN {
   $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;
	
   eval { require Test::More; };
   if( $@ ) {
     use lib 't/lib';
   }
   use Test::More; # the mistake!
	
   use Module::Build;
   my $build = Module::Build->current();
   my $do_network_tests = $build->notes('network');

   eval {
     require IO::String;
     require LWP;
     require LWP::UserAgent;
   };
   if ($@) {
     plan skip_all => 'IO::String or LWP or LWP::UserAgent not installed. '
       . 'This means Bio::Tools::Run::RemoteBlast is not usable. Skipping tests';
   }
   elsif (!$do_network_tests) {
     plan skip_all => 'Network tests have not been requested, skipping all';
   }
   else {
     plan tests => 21;
   }

   #...
}

my $obj = Bio::Object->new(-verbose => $DEBUG);
#...
----

To this:

----
use strict;

BEGIN {
   use lib 't/lib';
   use BioperlTest;

   test_begin(-requires_modules => [qw(IO::String LWP LWP::UserAgent)],
              -requires_networking => 1,
              -tests => 21);

   #...
}

my $obj = Bio::Object->new(-verbose => test_debug());
#...
----


Can anyone identify problems with this approach? Is the interface 
presented by BioperlTest flexible enough that any changes would only be 
additions for new functionality (and therefore all test scripts wouldn't 
need to be altered)? Is BioperlTest missing anything you'd like?

Are there any objections to me updating all tests in this manner? For an 
example, see t/RemoteBlast.t
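
For anyone without the repository to hand, the decision described above
(skip everything if a required module is missing or networking was not
requested) could be sketched roughly as follows. This is an illustration
only, not the actual contents of t/lib/BioperlTest.pm: the helper names
missing_modules and skip_reason, and the -network_enabled argument, are
made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not the real BioperlTest code): return the names
# of the given modules that cannot be loaded.
sub missing_modules {
    my @missing;
    for my $module (@_) {
        # String eval so a module name held in a variable can be required
        push @missing, $module unless eval "require $module; 1";
    }
    return @missing;
}

# Hypothetical helper: return a reason to skip all tests, or undef if the
# tests should run. -network_enabled stands in for whatever mechanism
# records that network tests were requested.
sub skip_reason {
    my (%args) = @_;
    my @missing = missing_modules(@{ $args{-requires_modules} || [] });
    return "The optional module(s) @missing not installed, skipping all"
        if @missing;
    return 'Network tests have not been requested, skipping all'
        if $args{-requires_networking} && !$args{-network_enabled};
    return;
}

my $reason = skip_reason(
    -requires_modules    => [qw(strict No::Such::Module)],
    -requires_networking => 1,
    -network_enabled     => 0,
);
print defined $reason ? "skip: $reason\n" : "run tests\n";
```

Keeping the decision in one place is the point of the proposal: a test
script only states what it requires, and the wording of the skip messages
can change without touching every script.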


Cheers,
Sendu.


From spiros at lokku.com  Wed Jun 20 11:49:48 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Wed, 20 Jun 2007 16:49:48 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
	<082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
	<4676B41E.3050706@sendu.me.uk>
	<4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
Message-ID: <bba689ec0706200849p3d32ffb8wee14bbeb2027e905@mail.gmail.com>

Yep, they are not all done. Some still need to be ported over; I've been
doing some here and there at home. However, the recent email Sendu sent, the
one about abstracting the setup of testing, is actually something I was
thinking about myself, so it might be a better way to tackle the problem.
For one, it would save us from duplicating the same 30 lines of code
across all tests.

As far as network tests are concerned, I've always been an avid hater of
them. I believe they bring more trouble than they are worth, due to the
diversity of setups people have. My way of tackling them was always to
group all the tests that required live access into one file and then
forcibly run just that file, iff needed and not by default. Like I said,
that's just my opinion; I've been bitten by them one time too many.

Spiros

On 6/18/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote:
>
> > Chris Fields wrote:
> >> Couldn't you enable BIOPERLDEBUG, disable network access, then
> >> iterate through tests checking for those which fail or skip?
> >
> > Yes, good idea, though my dev machine is also my email/webserver so
> > I'd rather come up with an alternate solution than one involving
> > 'disable network access'.
> >
> > Still, that's what I'll probably end up doing. Cheers!
> >
> >
> > Oh, Chris, Spiros, how goes the Test::More conversion? I might want
> > to wait for you to finish, or join in? If you're not going to have
> > time to do any more in the next few weeks, can you please update
> > http://www.bioperl.org/wiki/TestMoreProgress removing your name (or
> > in the opposite case, add your name in)? Its not quite clear to me
> > which tests are assigned to whom. Can someone clarify what the
> > markings mean?
> >
> > Cheers,
> > Sendu.
>
> Not sure how far along spiros is; I handed it over after I finished
> up to the 'Q' tests.  In general the ones marked out have been
> converted over, ones with names next to them have been claimed.  If
> you need help I'll prob. start back up again to finish them off; we
> just need to divy them up.
>
> chris
>


From hlapp at gmx.net  Wed Jun 20 12:27:47 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 12:27:47 -0400
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467949EC.9040100@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
Message-ID: <A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>

Very cool! Sounds like a no-brainer to me to adopt this in all the  
tests. -hilmar


-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 20 12:44:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 20 Jun 2007 11:44:01 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>
References: <467949EC.9040100@sendu.me.uk>
	<A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>
Message-ID: <BF4BB95D-4B4F-4336-9FA4-AE7B0C961C96@uiuc.edu>

Agreed!  You've already created an example case so there's something  
to go off of.

I plan on changing some EUtilities tests soon so I'll try  
implementing this, basing off your RemoteBlast.t implementation.   
Seems clear enough on the surface; if I run into problems I'll post.

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From wollenbergk at mail.nih.gov  Wed Jun 20 14:11:04 2007
From: wollenbergk at mail.nih.gov (Wollenberg, Kurt (NIH/NIAID))
Date: Wed, 20 Jun 2007 14:11:04 -0400
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
Message-ID: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>

Greetings:

I am working on a script to take a list of sequence IDs, extract the
sequences from GenPept, and then run a BLAST search for each of the
retrieved sequences. I am having a problem with the sequence retrieval,
where some sequences are found and others are not and it's not obvious to me
why this is. 

For example, using a text file containing the two following IDs as input:
SKG3_YEAST
NEM1_YEAST

My script 

while( <IN> ) {
  chomp;
  my $seqid = $_;
  my $seq_obj = get_sequence( 'genpept', $seqid );
}

will create a sequence object for the first ID, (print "Accession of
",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct accession
number) but for the second I am told

-------------------- WARNING ---------------------
MSG: id (NEM1_YEAST) does not exist
---------------------------------------------------

When I pull up these records using the Entrez cross-database search in my web
browser I find genpept records for both SKG3_YEAST and NEM1_YEAST (using
these search terms). In both records these IDs reside in the same field
("DBSOURCE    swissprot: locus") so I'm mystified why get_sequence finds one
but not the other. Any advice would be greatly appreciated.

Cheers,
Kurt Wollenberg, Ph.D.
Phylogenetics and Sequence Analysis Consultant
Biocomputing Research Consulting Section
Bioinformatics and Scientific IT Program (BSIP)
NIH/NIAID/OTIS
Contractor, Lockheed Martin
http://bioinformatics.niaid.nih.gov

Disclaimer:
The information in this e-mail and any of its attachments is confidential
and may contain sensitive information. It should not be used by anyone who
is not the original intended recipient. If you have received this e-mail in
error please inform the sender and delete it from your mailbox or any other
storage devices. National Institute of Allergy and Infectious Diseases shall
not accept liability for any statements made that are sender's own and not
expressly made on behalf of the NIAID by one of its representatives.


From bosborne11 at verizon.net  Wed Jun 20 14:59:39 2007
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 20 Jun 2007 14:59:39 -0400
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
In-Reply-To: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>
Message-ID: <C29EF15B.EAF7%bosborne11@verizon.net>

Kurt,

I can't answer your question but I wouldn't use Bio::Perl myself, I'd use
Bio::DB::GenPept:

501 ~>perl -e 'use Bio::DB::GenPept; $db = Bio::DB::GenPept->new; $seq =
$db->get_Seq_by_acc('NEM1_YEAST'); print $seq->seq;'
MNALKYFSNHLITTKKQKKINVEVTKNQDLLGPSKEVSNKYTSHSENDCVSEVDQQYDHSSSHLKESDQNQERKNS
VPKKPKALRSILIEKIASILWALLLFLPYYLIIKPLMSLWFVFTFPLSVIERRVKHTDKRNRGSNASENELPVSSS
NINDSSEKTNPKNCNLNTIPEAVEDDLNASDEIILQRDNVKGSLLRAQSVKSRPRSYSKSELSLSNHSSSNTVFGT
KRMGRFLFPKKLIPKSVLNTQKKKKLVIDLDETLIHSASRSTTHSNSSQGHLVEVKFGLSGIRTLYFIHKRPYCDL
FLTKVSKWYDLIIFTASMKEYADPVIDWLESSFPSSFSKRYYRSDCVLRDGVGYIKDLSIVKDSEENGKGSSSSLD
DVIIIDNSPVSYAMNVDNAIQVEGWISDPTDTDLLNLLPFLEAMRYSTDVRNILALKHGEKAFNIN502 ~>

It's true that Bio::Perl is easy-to-use but it's also _very_ limited.

Brian O.




From cjfields at uiuc.edu  Wed Jun 20 16:11:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 20 Jun 2007 15:11:34 -0500
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
In-Reply-To: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>
References: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>
Message-ID: <F9F5A58E-4767-49C4-80F2-DEE3CA474C01@uiuc.edu>

I'm assuming you are using the Bio::Perl exported sub get_sequence().
I am able to reproduce the issue using bioperl-live; it's an odd
issue, as direct use of Bio::DB::GenPept works fine:

use Bio::DB::GenPept;

my $factory = Bio::DB::GenPept->new();

my @accs = qw(SKG3_YEAST NEM1_YEAST);

my $io = $factory->get_Stream_by_acc(\@accs);

while (my $seq = $io->next_seq) {
     print "Accession:",$seq->accession,"\n";
}

chris



Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From sac at bioperl.org  Thu Jun 21 02:32:47 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Wed, 20 Jun 2007 23:32:47 -0700
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <BF4BB95D-4B4F-4336-9FA4-AE7B0C961C96@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>
	<BF4BB95D-4B4F-4336-9FA4-AE7B0C961C96@uiuc.edu>
Message-ID: <8f200b4c0706202332w25a09547k1de20f24466877d9@mail.gmail.com>

Looks like a nice refactor. After it's in place, don't forget to
update the wiki:
http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests

Steve



From staffa at niehs.nih.gov  Thu Jun 21 14:36:12 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Thu, 21 Jun 2007 14:36:12 -0400
Subject: [Bioperl-l] BIO::DB::FASTA  ID
Message-ID: <C2A03D5E.4DE9%staffa@niehs.nih.gov>

This program below returns only  1527 IDs from a fasta file that I have
constructed, which has
mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa
1820
.
It actually does not return the first 3 ids,
nor the 5th, nor 7..36, 38,39,41..44......
The header lines are of variable length and the sequence lines are 80
characters except at the ends when they might be shorter.
Is there some caveat that I am ignoring in my format that breaks
bio::db::fasta?


#!/usr/bin/perl
#
#
#
use strict;
use Bio::DB::Fasta;
use Bio::Tools::SeqWords;
use Bio::Seq;
use Bio::SeqIO;
$|=1;
#
#
my $Dpse_UTR_file_for_T_orthologs =
"/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa";
my $db = Bio::DB::Fasta->new
('/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa',
  -reindex,  -makeid => \&make_my_id);
my @ids = $db->ids;
my $number_in = @ids;
print "number of Dpse IDs = $number_in\n";
foreach my $id (@ids){
print "$id\n";
}
sub make_my_id {
#       parse header line:
#       >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT
    my $line = shift;
#    print "line = $line\n";
    $line =~ />(\w+) /;
    my $ID = $1;
#    print "ID = $ID\n";
    return $ID;
      }

-------------- next part --------------
A non-text attachment was scrubbed...
Name: T_orthologs_Dpse_genes.fa
Type: application/octet-stream
Size: 5033676 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070621/07c354d0/attachment-0002.obj>

From jason at bioperl.org  Thu Jun 21 17:19:14 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 21 Jun 2007 14:19:14 -0700
Subject: [Bioperl-l] BIO::DB::FASTA  ID
In-Reply-To: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
References: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
Message-ID: <F3A92546-08EE-4AD5-BFCE-BF006D153AD7@bioperl.org>

Hey Nick -
I think
a) your IDs are not unique
b) you need to declare the function make_my_id BEFORE your call  
Bio::DB::Fasta->new if you want your function to be used.

$ grep "^>" T_orthologs_Dpse_genes.fa | awk '{print $1}' | sort | uniq | wc -l
1527


-jason
On Jun 21, 2007, at 11:36 AM, Staffa, Nick (NIH/NIEHS) wrote:

> #!/usr/bin/perl
> #
> #
> #
> use strict;
> use Bio::DB::Fasta;
> use Bio::Tools::SeqWords;
> use Bio::Seq;
> use Bio::SeqIO;
> $|=1;
> #
> #
> my $Dpse_UTR_file_for_T_orthologs =
> "/home/staffa/clients/Kari/D_pse_genome/testit/ 
> T_orthologs_Dpse_genes.fa";
> my $db = Bio::DB::Fasta->new
> ('/home/staffa/clients/Kari/D_pse_genome/testit/ 
> T_orthologs_Dpse_genes.fa',
>   -reindex,  -makeid => \&make_my_id);
> my @ids = $db->ids;
> my $number_in = @ids;
> print "number of Dpse IDs = $number_in\n";
> foreach my $id (@ids){
> print "$id\n";
> }
> sub make_my_id {
> #       parse header line:
> #       >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0  
> TTATTTATT
>     my $line = shift;
> #    print "line = $line\n";
>     $line =~ />(\w+) /;
>     my $ID = $1;
> #    print "ID = $ID\n";
>     return $ID;
>       }

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From mkiwala at watson.wustl.edu  Thu Jun 21 17:23:46 2007
From: mkiwala at watson.wustl.edu (Michael Kiwala)
Date: Thu, 21 Jun 2007 16:23:46 -0500
Subject: [Bioperl-l] BIO::DB::FASTA  ID
In-Reply-To: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
References: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
Message-ID: <467AEC62.2040508@watson.wustl.edu>

You only have 1527 unique IDs in the file.

~$ grep '^>' Desktop/T_orthologs_Dpse_genes.fa|cut -d\  -f1|sort -u|wc -l
1527


Change your make_my_id function to make sure the IDs are unique.
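
One way to do that is sketched below. It is only an illustration: the name make_unique_id and the ".1", ".2" suffix scheme are assumptions, not part of Bio::DB::Fasta, and the suffixes depend on the order of records in the file. As far as I can tell, the bare "-reindex," in the posted script also shifts the key/value pairs so that -makeid may never be seen; writing "-reindex => 1" should avoid that.

```perl
use strict;
use warnings;

# Count of how many times each base ID has been seen so far.
my %seen;

# Hypothetical replacement for make_my_id: takes a FASTA header line and
# returns its first word, adding ".1", ".2", ... to repeated IDs.
sub make_unique_id {
    my $line = shift;
    my ($id) = $line =~ /^>?(\S+)/;
    return unless defined $id;
    my $n = $seen{$id}++;
    return $n ? "$id.$n" : $id;
}

# Possible use (assumes BioPerl is installed and $fasta_path is set):
# my $db = Bio::DB::Fasta->new($fasta_path,
#                              -reindex => 1,
#                              -makeid  => \&make_unique_id);
```

With this, the second record headed ">Dpse_GA13134 ..." would be indexed as Dpse_GA13134.1 rather than overwriting the first.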


Staffa, Nick (NIH/NIEHS) wrote:
> This program below returns only  1527 IDs from a fasta file that I have
> constructed, which has
> mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa
> 1820
> .
> It actually does not return the first 3 ids,
> nor the 5th, nor 7..36, 38,39,41..44......
> The header lines are of variable length and the sequence lines are 80
> characters except at the ends when they might be shorter.
> Is there some caveat that I am ignoring in my format that breaks
> bio::db::fasta?
>
>
> #!/usr/bin/perl
> #
> #
> #
> use strict;
> use Bio::DB::Fasta;
> use Bio::Tools::SeqWords;
> use Bio::Seq;
> use Bio::SeqIO;
> $|=1;
> #
> #
> my $Dpse_UTR_file_for_T_orthologs =
> "/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa";
> my $db = Bio::DB::Fasta->new
> ('/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa',
>   -reindex,  -makeid => \&make_my_id);
> my @ids = $db->ids;
> my $number_in = @ids;
> print "number of Dpse IDs = $number_in\n";
> foreach my $id (@ids){
> print "$id\n";
> }
> sub make_my_id {
> #       parse header line:
> #       >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT
>     my $line = shift;
> #    print "line = $line\n";
>     $line =~ />(\w+) /;
>     my $ID = $1;
> #    print "ID = $ID\n";
>     return $ID;
>       }
>   


From bix at sendu.me.uk  Mon Jun 25 09:06:27 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 25 Jun 2007 14:06:27 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467949EC.9040100@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
Message-ID: <467FBDD3.8050009@sendu.me.uk>

Sendu Bala wrote:
> In considering updating all the test scripts to [... use] t/lib/BioperlTest.pm

I'm now in the process of converting all test scripts. In addition to 
those things mentioned previously, BioperlTest now also provides the 
methods test_input_file() and test_output_file().


This:
----
use Bio::Root::IO;
my $output_file = Bio::Root::IO->catfile(qw(t data temp.file));
$obj->new(-file => ">$output_file");

END {
   unlink($output_file);
}

...

$obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file)));
----


Becomes this:
----
my $output_file = test_output_file();
$obj->new(-file => ">$output_file");

...

$obj->new(-file => test_input_file('input.file'));
----


I should think the benefits are obvious, especially for the output 
files: because END blocks are used inconsistently, incorrectly, or not at 
all, some tests leave output data behind on occasion.

test_input_file() is helpful as shorthand, but it also gets rid of 
many tests' usage of Bio::Root::IO. Relying on something you're 
installing and testing in another test script to work in the current 
test script, without testing it in your own test script, seems like a 
no-no to me.


From cjfields at uiuc.edu  Mon Jun 25 09:39:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 08:39:21 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467FBDD3.8050009@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
Message-ID: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>

On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> In considering updating all the test scripts to [... use] t/lib/ 
>> BioperlTest.pm
>
> I'm now in the process of converting all test scripts. In addition to
> those things mentioned previously, BioperlTest now also provides the
> methods test_input_file() and test_output_file().
>
>
> This:
> ----
> use Bio::Root::IO;
> my $output_file = Bio::Root::IO->catfile(qw(t data temp.file));
> $obj->new(-file => ">$output_file");
>
> END {
>    unlink($output_file);
> }
>
> ...
>
> $obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file)));
> ----
>
>
> Becomes this:
> ----
> my $output_file = test_output_file();
> $obj->new(-file => ">$output_file");
>
> ...
>
> $obj->new(-file => test_input_file('input.file'));
> ----
>
>
> I should think the benefits are obvious, especially for the output
> files, which thanks to inconsistency of using END blocks correctly  
> or at
> all, leaves some output data behind on occasion.

Sounds fine by me, though it's a lot of work.  BTW, did we ever  
decide whether to finish up with Test::More conversion?  I haven't  
heard back yet; let me know what you want to do.

> test_input_file() is helpful for the shorthand, but also gets rid of
> many tests' usage of Bio::Root::IO (relying on something you're
> installing and testing in another test script to work in the current
> test script, without testing it in your own test script seems like a
> no-no to me).

Well, in a way isn't that itself a test of the class (whether it  
breaks or not)?  ; >

Do test_input_file() and test_output_file() handle directory
structures in an OS-safe way like catfile()?  For instance, I plan on
adding test data to a new directory similar to Bio::Graphics
(t/data/eutil) to prevent cluttering of the t/data directory.  I could use
'$obj->new(-file => test_input_file('/eutil/input.xml'))' if the base
directory is 't/data' but that may not be cross-platform compatible
with win32 file systems, which may still expect something like
't\data\eutil\input.xml'.

chris


From bix at sendu.me.uk  Mon Jun 25 09:45:23 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 25 Jun 2007 14:45:23 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
Message-ID: <467FC6F3.6080705@sendu.me.uk>

Chris Fields wrote:
> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:
>> I should think the benefits are obvious, especially for the output
>> files, which thanks to inconsistency of using END blocks correctly or at
>> all, leaves some output data behind on occasion.
> 
> Sounds fine by me, though it's a lot of work.  BTW, did we ever decide 
> whether to finish up with Test::More conversion?  I haven't heard back 
> yet; let me know what you want to do.

I'm doing the remaining Test::More conversions at the same time.


> Do test_input_file() and test_output_file() handle directory structures 
> in an OS-safe way like catfile()?  For instance, I plan on adding test 
> data to a new directory similar to Bio::Graphics (t/data/eutil) to 
> prevent cluttering of the t/data directory.  I could use 
> '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the base 
> directory is 't/data' but that may not be cross-platform compatible with 
> win32 file systems, which may still expect something like 
> 't\data\eutil\input.xml'.

It's platform-independent, currently implemented using File::Spec. So 
you'll say:

$obj->new(-file => test_input_file('eutil', 'input.xml'));

It's all documented in the POD of BioperlTest.
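
For readers without the module to hand, here is a minimal sketch of how two such helpers could be written with File::Spec and File::Temp. These are hypothetical stand-ins, not the actual BioperlTest code; the 't/data' base directory is taken from the discussion above and the use of File::Temp is an assumption.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempfile);

# Hypothetical stand-ins, not the real BioperlTest implementation.

# Build a path below t/data from one or more path components,
# using the separator of the current operating system.
sub test_input_file {
    my @parts = @_;
    return File::Spec->catfile('t', 'data', @parts);
}

# Return the name of a fresh temporary file that is removed at exit,
# so no END block is needed in the test script.
sub test_output_file {
    my ($fh, $filename) = tempfile(UNLINK => 1);
    close $fh;
    return $filename;
}
```

Passing the directory and file name as separate arguments, as in test_input_file('eutil', 'input.xml'), is what keeps the call independent of the path separator.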


From cjfields at uiuc.edu  Mon Jun 25 09:49:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 08:49:51 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467FC6F3.6080705@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
	<467FC6F3.6080705@sendu.me.uk>
Message-ID: <679B8E76-C090-4A29-B843-99B5853FE2FB@uiuc.edu>


On Jun 25, 2007, at 8:45 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:
>>> I should think the benefits are obvious, especially for the output
>>> files, which thanks to inconsistency of using END blocks  
>>> correctly or at
>>> all, leaves some output data behind on occasion.
>> Sounds fine by me, though it's a lot of work.  BTW, did we ever  
>> decide whether to finish up with Test::More conversion?  I haven't  
>> heard back yet; let me know what you want to do.
>
> I'm doing the remaining Test::More conversions at the same time.

Okay.  Just didn't want to do any redundant work if it's already  
being/been done.

>> Do test_input_file() and test_output_file() handle directory  
>> structures in an OS-safe way like catfile()?  For instance, I plan  
>> on adding test data to a new directory similar to Bio::Graphics (t/ 
>> data/eutil) to prevent cluttering of the t/data directory.  I  
>> could use '$obj->new(-file => test_input_file('/eutil/ 
>> input.xml'))' if the base directory is 't/data' but that may not  
>> be cross-platform compatible with win32 file systems, which may  
>> still expect something like 't\data\eutil\input.xml'.
>
> It's platform-independent, currently implemented using File::Spec.  
> So you'll say:
>
> $obj->new(-file => test_input_file('eutil', 'input.xml'));
>
> It's all documented in the POD of BioperlTest.

yay!

chris


From mmokrejs at ribosome.natur.cuni.cz  Mon Jun 25 12:06:24 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Mon, 25 Jun 2007 18:06:24 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <467254DD.3010505@mrc-lmb.cam.ac.uk>
References: <466938F6.7050903@ribosome.natur.cuni.cz>	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>	<467178AE.5040905@ribosome.natur.cuni.cz>	<46717990.6040509@ribosome.natur.cuni.cz>
	<467254DD.3010505@mrc-lmb.cam.ac.uk>
Message-ID: <467FE800.4010300@ribosome.natur.cuni.cz>


Dave Howorth wrote:
> Martin MOKREJŠ wrote:
>>>> Also, there is a *huge* amount of documentation and examples on
>>>> the BioPerl website.
>>>>
>>>> http://www.bioperl.org/wiki/HOWTOs
>>> You mean 
>>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
>>>  ? ;-)
>> $ perl embl2picture.pl ~/99.gb | display - Error returned while
>> evaluating value of 'description' option for glyph
>> Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature
>> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl
>> line 141, <GEN0> line 125.
> 
> Hmm an error at line 141 of a 69 line script? Methinks you're not
> actually running the script that's presented on the wiki page you
> quoted. I cut-and-pasted the script and your file and it worked for me
> (at least, it produced an image, along with a bunch of OOPS lines)

Maybe you used the first version of the script?  There are two or more
scripts, I used the very last one.

M.


From cjfields at uiuc.edu  Mon Jun 25 12:48:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 11:48:30 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <467FE7B0.3010904@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
	<CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
	<46723F91.60501@ribosome.natur.cuni.cz>
	<A2212781-75F3-4BB7-967F-1668B682E84E@uiuc.edu>
	<467FE7B0.3010904@ribosome.natur.cuni.cz>
Message-ID: <B9DB370F-FB17-4DEF-9664-37489D84FC05@uiuc.edu>

Martin,

Keep bioperl-related discussion on the bioperl mail list.  The large  
majority of this isn't biopython-related, but maybe some devs there  
can add to this?

On Jun 25, 2007, at 11:05 AM, Martin MOKREJŠ wrote:

...

> Would you please tell me exactly what is wrong with the spacing?

Here's a section of the seq record attached to your previous email:

DEFINITION .
ACCESSION .
VERSION .
SOURCE .
   ORGANISM .

Normally there is a fixed column width for any data present in a  
field, so it would look more like this:

DEFINITION  PYR4 (DIHYDROOROTASE, PYRIMIDIN 4, dihydroorotase); dihydroorotase
            [Arabidopsis thaliana].
ACCESSION   NP_194024
VERSION     NP_194024.1  GI:15235865
DBSOURCE    REFSEQ: accession NM_118422.3
KEYWORDS    .
SOURCE      Arabidopsis thaliana (thale cress)
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons;
            rosids; eurosids II; Brassicales; Brassicaceae; Arabidopsis.

Here's the relevant bit in the latest release notes:

"The second part of each sequence entry record contains the information
appropriate to its keyword, in positions 13 to 80 for keywords and
positions 11 to 80 for the sequence."
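
To make the column rule concrete, here is a small sketch in plain Perl. The helper names are made up for illustration and are not part of BioPerl or ApE; the two-space indent for subkeywords such as ORGANISM is an assumption based on how GenBank records normally look.

```perl
use strict;
use warnings;

# Format one GenBank header line: keyword field in columns 1-12,
# data starting in column 13. $indent is 0 for keywords such as
# ACCESSION and 2 for subkeywords such as ORGANISM (assumed).
sub genbank_header_line {
    my ($keyword, $value, $indent) = @_;
    $indent ||= 0;
    return sprintf '%-12s%s', (' ' x $indent) . $keyword, $value;
}

# True if column 12 is blank and the data begins in column 13.
sub data_starts_in_column_13 {
    my $line = shift;
    return $line =~ /^.{11} \S/ ? 1 : 0;
}

print genbank_header_line('ACCESSION', 'NP_194024'), "\n";
print genbank_header_line('ORGANISM', 'Arabidopsis thaliana', 2), "\n";
```

A line such as "ACCESSION ." from the record above fails the data_starts_in_column_13() check, because the data sits in column 11.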

The bioperl devs try to make our parsers as flexible as possible but  
others may not, so it's something in ApE that should probably be  
fixed.  And as mentioned to you several times in the past on the mail  
list and on bugzilla, don't expect sequence records which stray from  
the standard (in this case, the release notes) to parse correctly in  
all cases.  We can try supporting some that stray from that standard  
but only up to a point.  If it causes additional bugs, headaches, or  
degrades performance it won't be supported.

> ...
> Well, I just copy&pasted the script from the bioperl webpages, I think
> from a tutorial or FAQ, don't remember anymore.

Well, can't help you if you can't point out where the code originated  
from.  We would like to know so it can be corrected.

> ...
> Well, my search for such tools available on Unix to be used in a  
> script,
> non-interactively, completely failed. My last hope except getting  
> improved
> ApE is to use the GenomeDiagram under biopython, but so far my .gb  
> files
> cannot be parsed yet. :(
> Martin

As mentioned previously you will likely have to code for it yourself  
(perl or python) or help debug the relevant biopython code to get it  
working.  We can't/won't do this for you unless/until it's something  
we feel warrants implementation.  Judging by the bug list, we also  
haven't the time nor inclination to code for it.  Sorry but we have  
other priorities besides doing your work for you.

chris


From jesper at krogh.cc  Tue Jun 26 03:05:32 2007
From: jesper at krogh.cc (Jesper Krogh)
Date: Tue, 26 Jun 2007 09:05:32 +0200 (CEST)
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
Message-ID: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>

Hi List.

Trying to parse the embl database, the embl-parser fails on: AB019196
http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196


------------- EXCEPTION: Bio::Root::Exception -------------
MSG: AB019196 seems to have an invalid species classification.
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359
STACK: Bio::SeqIO::embl::_read_EMBL_Species
/usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
STACK: Bio::SeqIO::embl::next_seq
/usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
STACK: -e:1
-----------------------------------------------------------


It seems to be dissatisfied with this:
OS   Acetobacter aceti
OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;
OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.

Thanks.
-- 
Jesper Krogh


From cjfields at uiuc.edu  Tue Jun 26 09:13:50 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 08:13:50 -0500
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
Message-ID: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>

I can verify this using bioperl-live.  Can you file this as a bug?

http://bugzilla.open-bio.org/

chris

On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote:

> Hi List.
>
> Trying to parse the embl database, the embl-parser fails on: AB019196
> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196
>
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: AB019196 seems to have an invalid species classification.
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359
> STACK: Bio::SeqIO::embl::_read_EMBL_Species
> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
> STACK: Bio::SeqIO::embl::next_seq
> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
> STACK: -e:1
> -----------------------------------------------------------
>
>
> It seems to be dissatisfied with this:
> OS   Acetobacter aceti
> OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;
> OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.
>
> Thanks.
> -- 
> Jesper Krogh
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From suji_ramin at yahoo.com  Tue Jun 26 00:58:36 2007
From: suji_ramin at yahoo.com (SujiBala)
Date: Mon, 25 Jun 2007 21:58:36 -0700 (PDT)
Subject: [Bioperl-l] Error in constructing Phylogenetic tree using
	BioPerl
Message-ID: <571051.26423.qm@web51107.mail.re2.yahoo.com>

Hi,
  This is Sujatha from Singapore. I am trying to construct a phylogenetic tree using DNAStatistics and the Kimura method, but I get the following error message. It would be nice if you could help me resolve this problem as soon as possible.
   
  Error message:
    Must supply  a valid Bio::Align::AlignI for the _align parameter  in the distance 
  My program:
  use Bio::AlignIO;
use Bio::Align::DNAStatistics;
use Bio::Tree::DistanceFactory;
# for a dna alignment  can also use ProteinStatistics
@aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
$stats = Bio::Align::DNAStatistics->new;
$mat = $stats->distance( -align  => @aln,-method => 'Kimura');
$dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ');
$tree = $dfactory->make_tree($mat);
   
  I am using clustalw formatted fasta file with more than one sequence 
   

SujiBala




From bartels.stefan at mh-hannover.de  Tue Jun 26 05:26:03 2007
From: bartels.stefan at mh-hannover.de (don esteban)
Date: Tue, 26 Jun 2007 02:26:03 -0700 (PDT)
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To: <BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>
References: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
	<BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>
Message-ID: <11302459.post@talk.nabble.com>


Try setting the proxy configuration in your script:

$ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080";


L Xu wrote:
> 
> I do have an internet connection but do not use a proxy server.
> I tested the network connection with the ping command (below). The ncbi
> website does not respond. Is there any special network setting needed for 
> connecting to the ncbi website?
> Thank you so much.
> 
> C:\>ping www.yahoo.com
> 
> Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:
> 
> Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=360ms TTL=45
> 
> Ping statistics for 69.147.114.210:
>     Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
> Approximate round trip times in milli-seconds:
>     Minimum = 312ms, Maximum = 363ms, Average = 338ms
> 
> C:\>ping www.ncbi.nlm.nih.gov
> 
> Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:
> 
> Request timed out.
> Request timed out.
> Request timed out.
> Request timed out.
> 
> Ping statistics for 130.14.29.110:
>     Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
> 
> 
> 
> = = = Original message = = =
> 
> Judging by the output it looks like you have no network access or can't 
> connect to the server (what remoteblast needs). Make sure you don't need 
> proxy settings.
> 
> To preempt the next question, no, I'm not going to explain what a proxy 
> is. The RemoteBlast docs show how to set them, and Google is a wonderful 
> tool...
> 
> chris
> 
> On Jun 13, 2007, at 7:16 AM, L Xu wrote:
> 
> 
>    ...
> -------------------- WARNING ---------------------
> MSG: <HTML>
> <HEAD><TITLE>An Error Occurred</TITLE></HEAD>
> <BODY>
> <H1>An Error Occurred</H1>
> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
> </BODY>
> </HTML>
> 
> ---------------------------------------------------
> ...
> 
> 
> 



From rahall2 at ualr.edu  Tue Jun 26 09:51:08 2007
From: rahall2 at ualr.edu (Roger Hall)
Date: Tue, 26 Jun 2007 08:51:08 -0500
Subject: [Bioperl-l] Tuesday: ill
Message-ID: <000001c7b7f9$0d029040$4601a8c0@LIBERAL2>

Well I guess I won't be in today after all.
 
Michael, Stephen, and Ames: please call me from the grad office at 10 on
my cell phone (744-8514). 
 
Phil: please go ahead and meet with Tim, and let me know what questions
remain afterwards.
 
Thanks!
 
Roger Hall
Technical Director
MidSouth Bioinformatics Center
University of Arkansas at Little Rock
(501) 569-8074
 

From cjfields at uiuc.edu  Tue Jun 26 10:02:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 09:02:29 -0500
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <4681185D.5030402@cam.ac.uk>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
	<246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
	<4681185D.5030402@cam.ac.uk>
Message-ID: <EC86EE5C-02DF-4E4F-AF25-6E53925CBC1F@uiuc.edu>

I'll try getting to that ASAP (as well as a few bugs).  The problem is  
we have to patch this in 2-3 places (SeqIO::swiss, SeqIO::embl) due  
to repeated code issues, something I'm trying to rectify with a new  
set of parsers.  Just haven't had the time to work on them lately  
unfortunately.

chris

On Jun 26, 2007, at 8:45 AM, Roy Chaudhuri wrote:

> Sorry, replied to this but forgot to cc the list.
>
> It looks like a related problem to bug 2288 that I filed about  
> Bio::SeqIO::swiss - the period after subgen. is what causes the  
> problems since it is interpreted as a separator between nodes. I  
> put a patch in for Bio::SeqIO::swiss that works for me, but I guess  
> it might have side effects.
>
> Roy.
> --
> Dr. Roy Chaudhuri
> Department of Veterinary Medicine
> University of Cambridge, U.K.
>
> Chris Fields wrote:
>> I can verify this using bioperl-live.  Can you file this as a bug?
>> http://bugzilla.open-bio.org/
>> chris
>> On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote:
>>> Hi List.
>>>
>>> Trying to parse the embl database, the embl-parser fails on:  
>>> AB019196
>>> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196
>>>
>>>
>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>> MSG: AB019196 seems to have an invalid species classification.
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/ 
>>> Root.pm:359
>>> STACK: Bio::SeqIO::embl::_read_EMBL_Species
>>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
>>> STACK: Bio::SeqIO::embl::next_seq
>>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
>>> STACK: -e:1
>>> -----------------------------------------------------------
>>>
>>>
>>> It seems to be dissatisfied with this:
>>> OS   Acetobacter aceti
>>> OC   Bacteria; Proteobacteria; Alphaproteobacteria;  
>>> Rhodospirillales;
>>> OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.
>>>
>>> Thanks.
>>> -- 
>>> Jesper Krogh
>>>
>>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From rrc22 at cam.ac.uk  Tue Jun 26 09:45:01 2007
From: rrc22 at cam.ac.uk (Roy Chaudhuri)
Date: Tue, 26 Jun 2007 14:45:01 +0100
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
	<246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
Message-ID: <4681185D.5030402@cam.ac.uk>

Sorry, replied to this but forgot to cc the list.

It looks like a related problem to bug 2288 that I filed about 
Bio::SeqIO::swiss - the period after subgen. is what causes the problems 
since it is interpreted as a separator between nodes. I put a patch in 
for Bio::SeqIO::swiss that works for me, but I guess it might have side 
effects.
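
To illustrate the idea (this is a sketch in plain Perl, not the code in embl.pm or swiss.pm and not the patch attached to bug 2288): if only ';' is treated as a separator and a '.' is stripped only at the very end of the classification, a name such as "Acetobacter subgen. Acetobacter" survives as one node. The function name parse_oc_lines is made up for the example.

```perl
use strict;
use warnings;

# Turn EMBL OC lines into a list of taxonomy nodes.
# Only ';' separates nodes; a '.' is removed only at the very end.
sub parse_oc_lines {
    my @lines = @_;

    # Strip the "OC" line code and join the lines into one string.
    my @chunks;
    for my $line (@lines) {
        (my $chunk = $line) =~ s/^OC\s+//;
        $chunk =~ s/\s+$//;
        push @chunks, $chunk;
    }
    my $text = join ' ', @chunks;
    $text =~ s/\.$//;

    # Split on ';' and trim whitespace around each node.
    my @nodes;
    for my $piece (split /;/, $text) {
        my $node = $piece;
        $node =~ s/^\s+//;
        $node =~ s/\s+$//;
        push @nodes, $node if length $node;
    }
    return @nodes;
}

my @nodes = parse_oc_lines(
    'OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;',
    'OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.',
);
print "$_\n" for @nodes;
```

As noted above, a change like this might have side effects for records that rely on the old splitting behaviour.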

Roy.
--
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.

Chris Fields wrote:
> I can verify this using bioperl-live.  Can you file this as a bug?
> 
> http://bugzilla.open-bio.org/
> 
> chris
> 
> On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote:
> 
>> Hi List.
>>
>> Trying to parse the embl database, the embl-parser fails on: AB019196
>> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196
>>
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: AB019196 seems to have an invalid species classification.
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359
>> STACK: Bio::SeqIO::embl::_read_EMBL_Species
>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
>> STACK: Bio::SeqIO::embl::next_seq
>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
>> STACK: -e:1
>> -----------------------------------------------------------
>>
>>
>> It seems to be dissatisfied with this:
>> OS   Acetobacter aceti
>> OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;
>> OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.
>>
>> Thanks.
>> -- 
>> Jesper Krogh
>>
>>
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 


From bix at sendu.me.uk  Tue Jun 26 10:13:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 26 Jun 2007 15:13:48 +0100
Subject: [Bioperl-l] Error in constructing Phylogenetic tree
	using	BioPerl
In-Reply-To: <571051.26423.qm@web51107.mail.re2.yahoo.com>
References: <571051.26423.qm@web51107.mail.re2.yahoo.com>
Message-ID: <46811F1C.3020307@sendu.me.uk>

SujiBala wrote:
> Hi,
>   This is Sujatha from Singapore. I am trying to construct a phylogenetic tree using DNAStatistics and the Kimura method, but I get the following error message. It would be nice if you could help me resolve this problem as soon as possible.
>    
>   Error message:
>     Must supply  a valid Bio::Align::AlignI for the _align parameter  in the distance 
>   My program:
>   use Bio::AlignIO;
> use Bio::Align::DNAStatistics;
> use Bio::Tree::DistanceFactory;
> # for a dna alignment  can also use ProteinStatistics
> @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
> $stats = Bio::Align::DNAStatistics->new;
> $mat = $stats->distance( -align  => @aln,-method => 'Kimura');

Without looking at the docs for these modules, it is immediately obvious 
that Bio::AlignIO->new() is going to return an instance of Bio::AlignIO 
and not an array of alignments. It is also obvious that the -align => 
parameter for the distance() method can't take an array of anything (but 
probably an array ref?).

Check the documentation and make sure you know what objects you're 
generating and passing around.
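
For reference, a sketch of how the pieces probably fit together. As far as I know, next_aln() on the Bio::AlignIO object returns a single alignment object, and that object (not an array or array ref) is what distance() wants for -align. The wrapper name nj_tree_from_clustalw is made up for the example, and the code assumes BioPerl is installed and that the file really is in ClustalW format.

```perl
use strict;
use warnings;

# Build a neighbour-joining tree from a ClustalW alignment file.
# Assumes BioPerl is installed; the modules are loaded when the sub is called.
sub nj_tree_from_clustalw {
    my ($file) = @_;
    require Bio::AlignIO;
    require Bio::Align::DNAStatistics;
    require Bio::Tree::DistanceFactory;

    # new() returns a stream object, not the alignments themselves.
    my $in = Bio::AlignIO->new(-file => $file, -format => 'clustalw');

    # next_aln() returns one alignment object, or undef when none is left.
    my $aln = $in->next_aln
        or die "No alignment found in $file\n";

    # Pairwise distance matrix from the single alignment object.
    my $stats = Bio::Align::DNAStatistics->new;
    my $mat   = $stats->distance(-align => $aln, -method => 'Kimura');

    # Neighbour-joining tree from the distance matrix.
    my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ');
    return $dfactory->make_tree($mat);
}

# Example call, using the file name from the original post:
# my $tree = nj_tree_from_clustalw('out4.fa');
```

If the file holds several alignments, next_aln() can be called in a loop until it returns undef.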


From schlesi at ebi.ac.uk  Tue Jun 26 10:59:13 2007
From: schlesi at ebi.ac.uk (Felix Schlesinger)
Date: Tue, 26 Jun 2007 15:59:13 +0100
Subject: [Bioperl-l] PAML parser
Message-ID: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>

Hello,

I am trying to use the PAML result parser (BioPerl
Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15.
However on all outputs I have tested no result object is returned
(next_result is undef). This includes the HIV and Lysin datasets
included with PAML.
My code is:

my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir =>
"/.");
my $result = $codemlp->next_result;
foreach my $model ( $result->get_NSSite_results ) {
...

and the error is: Can't call method "get_NSSite_results" on an
undefined value ...

I can include the mlc file if needed. Is this supposed to work? Or do
I have to run paml from bioperl to parse the results?

Thanks
  Felix


From Xianjun.Dong at bccs.uib.no  Tue Jun 26 10:35:17 2007
From: Xianjun.Dong at bccs.uib.no (Xianjun Dong)
Date: Tue, 26 Jun 2007 16:35:17 +0200
Subject: [Bioperl-l] bug for PAML::Baseml
Message-ID: <46812425.8000509@ii.uib.no>

An HTML attachment was scrubbed...
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070626/cb3d8193/attachment-0002.html>

From Xianjun.Dong at bccs.uib.no  Tue Jun 26 11:40:47 2007
From: Xianjun.Dong at bccs.uib.no (Xianjun Dong)
Date: Tue, 26 Jun 2007 17:40:47 +0200
Subject: [Bioperl-l] bug for PAML::Baseml
In-Reply-To: <46812425.8000509@ii.uib.no>
References: <46812425.8000509@ii.uib.no>
Message-ID: <4681337F.1000902@ii.uib.no>

An HTML attachment was scrubbed...
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070626/604ce866/attachment-0002.html>

From hartzell at alerce.com  Tue Jun 26 14:12:04 2007
From: hartzell at alerce.com (George Hartzell)
Date: Tue, 26 Jun 2007 14:12:04 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
Message-ID: <18049.22260.967524.353173@almost.alerce.com>


There don't seem to be any .cvsignore files in the repository, or in
CVSROOT/cvsignore.

Am I missing something, or don't we use them?

g.


From cjfields at uiuc.edu  Tue Jun 26 15:54:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 14:54:25 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18049.22260.967524.353173@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
Message-ID: <74515C87-5553-4AF0-9B83-26F3E71E15C8@uiuc.edu>

Not sure.  You may want to email support at open-bio.org; my guess is  
Chris D or Jason would have an answer.

chris

On Jun 26, 2007, at 1:12 PM, George Hartzell wrote:

>
> There don't seem to be any .cvsignore files in the repository, or in
> CVSROOT/cvsignore.
>
> Am I missing something, or don't we use them?
>
> g.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Tue Jun 26 15:55:21 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 26 Jun 2007 16:55:21 -0300
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18049.22260.967524.353173@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
Message-ID: <E6FC4C83-7C71-4D3D-902A-3DE79E02A57C@gmx.net>

Maybe we've been using the default?

On Jun 26, 2007, at 3:12 PM, George Hartzell wrote:

>
> There don't seem to be any .cvsignore files in the repository, or in
> CVSROOT/cvsignore.
>
> Am I missing something, or don't we use them?
>
> g.
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Tue Jun 26 16:21:30 2007
From: hartzell at alerce.com (George Hartzell)
Date: Tue, 26 Jun 2007 16:21:30 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
Message-ID: <18049.30026.61328.134490@almost.alerce.com>

Chris Fields writes:
 > [...]
 > It looks like George Hartzell may be taking a crack at it, with  
 > Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
 > could have something testable relatively soon.  After that we'll need  
 > to work out a few other issues, basically what's on Hilmar's list.

There's a repository at file:///home/hartzell/bioperl with all of the
component projects in place.

If you have a dev.open-bio.org account and you're in the bioperl
group, you're good to get at it via:

  file:///home/hartzell/bioperl

or 

  svn+ssh://dev.open-bio.org/home/hartzell/bioperl

There are a couple of things to think about:

  - how are we going to provide access.  I *think* that I heard a
    decision to use http:// and https://.  Who gets to set that up?

  - what do we want to do about keywords.  The cvs2svn tool guesses
    and automatically sets the svn:keywords property to Author Date
    Revision and Id on many of the files in the tree.  If it looks
    like it got it right, we can stick with it.  Or, we can disable
    that conversion and I've cribbed a little script that'll grep out
    files using Id and set the svn:keywords property accordingly.

  - what do we want to do about svn:ignore?  I haven't seen any
    .cvsignore files.

Beyond that, how does the repo look?

How are we going to cut over?

Are we going to try to push svn commits to the read-mostly CVS repo,
or just keep it around for history's sake (I lean towards the latter).

g.


From jason at bioperl.org  Tue Jun 26 19:22:20 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:22:20 -0300
Subject: [Bioperl-l] PAML parser
In-Reply-To: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>
References: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>
Message-ID: <D536496C-D716-42DF-B614-DD43C1B13A67@bioperl.org>

Can you make sure you have the latest and greatest version of these  
modules from the CVS repository?  We had to fix things to parse 3.15  
-- I can't tell if this is the problem or something else.
You can also add -verbose => 1 when you initialize the object and it  
may spit out more warnings about whether it is having problems.
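
A minimal sketch of that suggestion, for illustration only. It assumes an
'mlc' file in the current directory, writes the directory argument with the
leading dash that BioPerl named arguments normally take, and checks that a
result came back before calling methods on it:

```perl
use strict;
use warnings;
use Bio::Tools::Phylo::PAML;

# Assumption: 'mlc' lives in the current directory; adjust as needed.
my $parser = Bio::Tools::Phylo::PAML->new(
    -file    => 'mlc',
    -dir     => '.',
    -verbose => 1,    # ask the parser to report what it has trouble with
);

# next_result returns undef when nothing could be parsed, so test it
# before calling any method on the result.
my $result = $parser->next_result;
defined $result
    or die "No result parsed from 'mlc'; check the PAML and BioPerl versions\n";

my @models = $result->get_NSSite_results;
printf "Parsed %d NSsites model(s)\n", scalar @models;
```

With the check in place the script stops with a readable message instead of
"Can't call method ... on an undefined value".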


-jason

On Jun 26, 2007, at 11:59 AM, Felix Schlesinger wrote:

> Hello,
>
> I am trying to use the PAML result parser (BioPerl
> Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15.
> However on all outputs I have tested no result object is returned
> (next_result is undef). This includes the HIV and Lysin datasets
> included with PAML.
> My code is:
>
> my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir =>
> "/.");
> my $result = $codemlp->next_result;
> foreach my $model ( $result->get_NSSite_results ) {
> ...
>
> and the error is: Can't call method "get_NSSite_results" on an
> undefined value ...
>
> I can include the mlc file if needed. Is this supposed to work? Or do
> I have to run paml from bioperl to parse the results?
>
> Thanks
>   Felix

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Tue Jun 26 19:27:05 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:27:05 -0300
Subject: [Bioperl-l] Error in constructing Phylogenetic tree
	using	BioPerl
In-Reply-To: <46811F1C.3020307@sendu.me.uk>
References: <571051.26423.qm@web51107.mail.re2.yahoo.com>
	<46811F1C.3020307@sendu.me.uk>
Message-ID: <A99815DC-0FC2-4019-B0C4-CA8EA713FEB0@bioperl.org>


On Jun 26, 2007, at 11:13 AM, Sendu Bala wrote:

> SujiBala wrote:
>> Hi Hello
>>   This is Sujatha from Singapore. I am trying to construct a phylo
>> tree using DNAStatistics and the Kimura method, but I am getting the
>> following error message. It would be nice if you could help me
>> resolve this problem asap.
>>
>>   Error message
>>     Must supply  a valid Bio::Align::AlignI for the _align  
>> parameter  in the distance
>>   My program
>>   use Bio::AlignIO;
>> use Bio::Align::DNAStatistics;
>> use Bio::Tree::DistanceFactory;
>> # for a dna alignment  can also use ProteinStatistics
>> @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
>> $stats = Bio::Align::DNAStatistics->new;
>> $mat = $stats->distance( -align  => @aln,-method => 'Kimura');
>

yep you want to call next_aln on the Bio::AlignIO object.
I fixed the example code in the HOWTO so it should work properly now;
http://bioperl.org/wiki/HOWTO:Trees#Constructing_Trees
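
For the archive, a sketch of what the corrected script might look like. The
file name and format come from the original post; the neighbor-joining
method and the Newick output to STDOUT are assumptions added here so the
example is complete:

```perl
use strict;
use warnings;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;
use Bio::Tree::DistanceFactory;
use Bio::TreeIO;

# Bio::AlignIO->new returns a stream object, not the alignments themselves.
my $alnio    = Bio::AlignIO->new(-file => 'out4.fa', -format => 'clustalw');
my $stats    = Bio::Align::DNAStatistics->new;
my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ');   # assumed method
my $treeout  = Bio::TreeIO->new(-format => 'newick');              # writes to STDOUT

# next_aln hands back one alignment object at a time; that object is
# what the -align parameter of distance() expects.
while (my $aln = $alnio->next_aln) {
    my $mat  = $stats->distance(-align => $aln, -method => 'Kimura');
    my $tree = $dfactory->make_tree($mat);
    $treeout->write_tree($tree);
}
```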

> Without looking at the docs for these modules, it is immediately  
> obvious
> that Bio::AlignIO->new() is going to return an instance of  
> Bio::AlignIO
> and not an array of alignments. It is also obvious that the -align =>
> parameter for the distance() method can't take an array of anything  
> (but
> probably an array ref?).
>
> Check the documentation and make sure you know what objects you're
> generating and passing around.

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Tue Jun 26 19:29:11 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:29:11 -0300
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <E6FC4C83-7C71-4D3D-902A-3DE79E02A57C@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
	<E6FC4C83-7C71-4D3D-902A-3DE79E02A57C@gmx.net>
Message-ID: <5A8FD8A3-9593-4925-AA74-D4B03CDC1C34@bioperl.org>

We don't have one. I have one on my local machine that defined  
basically *~ and .#* so I never had a problem.

Feel free to propose one if you think it is important; I never really  
thought it was important.

On Jun 26, 2007, at 4:55 PM, Hilmar Lapp wrote:

> Maybe we've been using the default?
>
> On Jun 26, 2007, at 3:12 PM, George Hartzell wrote:
>
>>
>> There don't seem to be any .cvsignore files in the repository, or in
>> CVSROOT/cvsignore.
>>
>> Am I missing something, or don't we use them?
>>
>> g.
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From j_martin at lbl.gov  Tue Jun 26 21:01:29 2007
From: j_martin at lbl.gov (Joel Martin)
Date: Tue, 26 Jun 2007 18:01:29 -0700
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To: <11302459.post@talk.nabble.com>
References: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
	<BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>
	<11302459.post@talk.nabble.com>
Message-ID: <20070627010129.GA8628@eniac.jgi-psf.org>

Hello, 
  The tutorial code snippet is an endless loop, I think it's supposed
to remove the rid.  As the only print statement you added is after the
endless loop, you aren't seeing anything happen.   

Use the code from this instead,

perldoc Bio::Tools::Run::RemoteBlast

  The bptutorial.pl does have a note that it's not useful and to read the pod
for Bio::Tools::Run::RemoteBlast, it's in the next sentences after the code
snippet you used.  

  Though, as it's a tutorial example it might be nice to remove the while
loop .. or at least add the sleep(5) part.
http://www.bioperl.org/wiki/Bptutorial.pl#Running_BLAST_.28using_RemoteBlast.pm.29
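
  For reference, the loop in the RemoteBlast pod looks roughly like the
sketch below. The program, database and input file name are placeholders
for illustration; the parts that matter are the remove_rid calls and the
sleep between polls:

```perl
use strict;
use warnings;
use Bio::Tools::Run::RemoteBlast;

# Placeholder settings: swap in your own program, database and query file.
my $factory = Bio::Tools::Run::RemoteBlast->new(
    -prog       => 'blastp',
    -data       => 'swissprot',
    -expect     => '1e-10',
    -readmethod => 'SearchIO',
);
$factory->submit_blast('amino.fa');

while (my @rids = $factory->each_rid) {
    for my $rid (@rids) {
        my $rc = $factory->retrieve_blast($rid);
        if (!ref $rc) {
            # Not ready yet (0) or failed (< 0): drop failed ids, then wait.
            $factory->remove_rid($rid) if $rc < 0;
            sleep 5;
        }
        else {
            my $result = $rc->next_result;
            print "Query: ", $result->query_name, "\n";
            while (my $hit = $result->next_hit) {
                print "  hit: ", $hit->name, "\n";
            }
            # Without this the outer while loop never terminates.
            $factory->remove_rid($rid);
        }
    }
}
```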

  Aside from that, you may have network issues but www.ncbi.nlm.nih.gov
doesn't respond to ping as far as I can tell. 

Joel


On Tue, Jun 26, 2007 at 02:26:03AM -0700, don esteban wrote:
> 
> Try using the proxy configuration in your script:
> 
> $ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080";
> 
> 
> 
> 
> L Xu wrote:
> > 
> > I do have an internet connection but do not use a proxy server.
> > I tested the network connection with the ping command (below). The NCBI
> > website does not respond. Is there any special network setting needed
> > for connecting to the NCBI website?
> > Thank you so much.
> > 
> > C:\>ping www.yahoo.com
> > 
> > Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:
> > 
> > Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
> > Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
> > Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
> > Reply from 69.147.114.210: bytes=32 time=360ms TTL=45
> > 
> > Ping statistics for 69.147.114.210:
> >     Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
> > Approximate round trip times in milli-seconds:
> >     Minimum = 312ms, Maximum = 363ms, Average = 338ms
> > 
> > C:\>ping www.ncbi.nlm.nih.gov
> > 
> > Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:
> > 
> > Request timed out.
> > Request timed out.
> > Request timed out.
> > Request timed out.
> > 
> > Ping statistics for 130.14.29.110:
> >     Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
> > 
> > 
> > 
> > = = = Original message = = =
> > 
> > Judging by the output it looks like you have no network access or can't 
> > connect to the server (what remoteblast needs). Make sure you don't need 
> > proxy settings.
> > 
> > To preempt the next question, no, I'm not going to explain what a proxy 
> > is. The RemoteBlast docs show how to set them, and Google is a wonderful 
> > tool...
> > 
> > chris
> > 
> > On Jun 13, 2007, at 7:16 AM, L Xu wrote:
> > 
> > 
> >    ...
> > -------------------- WARNING ---------------------
> > MSG: <HTML>
> > <HEAD><TITLE>An Error Occurred</TITLE></HEAD>
> > <BODY>
> > <H1>An Error Occurred</H1>
> > 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
> > </BODY>
> > </HTML>
> > 
> > ---------------------------------------------------
> > ...
> > 
> > 
> > 
> 
> -- 
> View this message in context: http://www.nabble.com/Example-code-in-Bioperl-Tutorial-tf3914295.html#a11302459
> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
> 


From melvinp at pacific.net.sg  Wed Jun 27 01:25:08 2007
From: melvinp at pacific.net.sg (Melvin P)
Date: Wed, 27 Jun 2007 13:25:08 +0800
Subject: [Bioperl-l] finding statistics on AA
Message-ID: <4681F4B4.8010609@pacific.net.sg>

Hi, I am new to BioPerl. I am trying to find out if there is any class 
that I can use for occupancy numbers/occurrence counts, pseudocounts, 
observed frequencies etc. given a few amino acid sequences. For example, 
what is the observed frequency of residue i at position p? My objective 
is to analyze the information content. Thanks.
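
To make the quantities concrete, here is a plain-Perl sketch that does not
rely on any BioPerl class. Everything in it is an assumption for
illustration: column_stats and its arguments are made-up names, the
sequences are taken to be pre-aligned strings of equal length, and
information content is computed as log2(20) minus the Shannon entropy of
the column.

```perl
use strict;
use warnings;

my @ALPHABET = split //, 'ACDEFGHIKLMNPQRSTVWY';

# Counts, (pseudocount-adjusted) frequencies and information content
# for column $pos (0-based) of a set of aligned protein sequences.
sub column_stats {
    my ($seqs, $pos, $pseudo) = @_;
    my %count = map { $_ => 0 } @ALPHABET;
    my $n = 0;
    for my $seq (@$seqs) {
        my $res = uc substr($seq, $pos, 1);
        next unless exists $count{$res};    # skip gaps and unknown symbols
        $count{$res}++;
        $n++;
    }
    my $total = $n + $pseudo * @ALPHABET;
    return unless $total > 0;               # nothing to normalise

    my %freq = map { $_ => ($count{$_} + $pseudo) / $total } @ALPHABET;

    my $entropy = 0;
    for my $f (values %freq) {
        $entropy -= $f * log($f) / log(2) if $f > 0;
    }
    my $info = log(scalar @ALPHABET) / log(2) - $entropy;
    return (\%count, \%freq, $info);
}

my @seqs = qw(ACD ACE AGD AC-);
my ($count, $freq, $info) = column_stats(\@seqs, 1, 0);
printf "C at column 1: count=%d freq=%.2f, information=%.3f bits\n",
    $count->{C}, $freq->{C}, $info;
```

Passing a non-zero third argument adds that pseudocount to every residue
before the frequencies are computed.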


From bix at sendu.me.uk  Wed Jun 27 06:23:58 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 11:23:58 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <467FBDD3.8050009@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
Message-ID: <46823ABE.2080300@sendu.me.uk>

Sendu Bala wrote:
> Sendu Bala wrote:
>> In considering updating all the test scripts to [... use] 
>> t/lib/BioperlTest.pm
> 
> I'm now in the process of converting all test scripts.

And I've now completed that job (for bioperl-live at least), except for 
t/EUtilities.t since I know Chris is working on it.


In addition to converting to Test::More where necessary, I've also made 
all pseudo-TODO blocks real ones. Previously I had advised to use SKIP 
blocks instead since TODO blocks need a Test::Harness upgrade. However I 
think in the next release we ought to make such upgrading compulsory 
(which should be automatic when combined with compulsory usage of 
Module::Build and Test::More in turn: users simply have to update CPAN).
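
For anyone unfamiliar with the difference, a small self-contained sketch of
the two block types in Test::More. The revcom helper and the
RUN_NETWORK_TESTS variable are hypothetical and exist only for this
example; they are not part of BioPerl or BioperlTest:

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Deliberately incomplete helper: only handles A, C, G and T.
sub revcom {
    my ($seq) = @_;
    my $rc = scalar reverse $seq;
    $rc =~ tr/ACGT/TGCA/;
    return $rc;
}

is(revcom('AAC'), 'GTT', 'unambiguous bases');

# TODO: the test runs and is expected to fail; the failure is reported
# but does not count against the test script.
TODO: {
    local $TODO = 'IUPAC ambiguity codes not implemented yet';
    is(revcom('ACR'), 'YGT', 'ambiguity code R complements to Y');
}

# SKIP: the test is not run at all when the precondition is missing.
SKIP: {
    skip 'network tests not enabled', 1 unless $ENV{RUN_NETWORK_TESTS};
    ok(1, 'placeholder for a test that needs network access');
}
```

A TODO test that starts passing is flagged by the harness, which is the
cue to remove the TODO wrapper.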


The conversion to BioperlTest directly led to the discovery and fixing 
of 6 minor bugs, so was certainly not without merit.


No user or developer needs to have BIOPERLDEBUG permanently set to true 
anymore. To run all tests you just have to answer yes to the BioDBGFF 
and networking questions of 'perl Build.PL'. With './Build test' you 
then get clean, easy-to-read output where it is obvious to see that we 
currently have these issues:

t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in 
another thread.

t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, 
t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and 
t/Annotation.t all have TODO tests. If you know about those modules, now 
would be a great time to implement those TODOs!

Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are 
deprecated' warnings.


To debug a particular test you could say:
BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t


I've updated the HOWTO for writing test scripts:
http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests


From cjfields at uiuc.edu  Wed Jun 27 07:55:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 06:55:47 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <46823ABE.2080300@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk>
Message-ID: <DC0F57B9-D733-4C89-9B7A-65E1ADFCFDD2@uiuc.edu>


On Jun 27, 2007, at 5:23 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Sendu Bala wrote:
>>> In considering updating all the test scripts to [... use]
>>> t/lib/BioperlTest.pm
>>
>> I'm now in the process of converting all test scripts.
>
> And I've now completed that job (for bioperl-live at least), except  
> for
> t/EUtilities.t since I know Chris is working on it.

The network tests will be much shorter; the bulk will be transferred  
to a new suite for the backend Bio::Tools::EUtilities parser (which  
will test static files in t/data/eutils, so no dynamic changes).

> In addition to converting to Test::More where necessary, I've also  
> made
> all pseudo-TODO blocks real ones. Previously I had advised to use SKIP
> blocks instead since TODO blocks need a Test::Harness upgrade.  
> However I
> think in the next release we ought to make such upgrading compulsory
> (which should be automatic when combined with compulsory usage of
> Module::Build and Test::More in turn: users simply have to update  
> CPAN).

Sounds good to me, but there may be some grumblings out there.

Having specific TODOs is nice b/c we can test them w/o fails.  Handy.

> The conversion to BioperlTest directly led to the discovery and fixing
> of 6 minor bugs, so was certainly not without merit.
>
>
> No user or developer needs to have BIOPERLDEBUG permanently set to  
> true
> anymore. To run all tests you just have to answer yes to the BioDBGFF
> and networking questions of 'perl Build.PL'. With './Build test' you
> then get clean, easy-to-read output where it is obvious to see that we
> currently have these issues:
>
> t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in
> another thread.
>
> t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t,
> t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and
> t/Annotation.t all have TODO tests. If you know about those  
> modules, now
> would be a great time to implement those TODOs!

The RNA_SearchIO.t is from ERPIN output; there's no easy way to  
generate it beyond having the user supply the info (or having the  
program author change the output).

Will have to look at the others to see what's involved; maybe  
something for the priority list?

> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are
> deprecated' warnings.

I ran into this with XML::Simple data structures recently; there was  
an easy way around it via XML::Simple's ForceArray option.  It has to  
do with attempting to assign data to/from a hash in a specific way  
involving array references (though I can't remember exactly how; I've  
slept since then).
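
A sketch of the XML::Simple side of this, using the ForceArray option of
XMLin. The two XML snippets are made up for illustration. The link to the
warning is my understanding rather than something confirmed for
Bio::SeqIO::entrezgene: an element that is a scalar or hash when it occurs
once becomes an array ref when it repeats, and code that then indexes the
array ref like a hash is what older perls report as a pseudo-hash.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Hypothetical documents: the same element occurring once and twice.
my $one = '<set><item>a</item></set>';
my $two = '<set><item>a</item><item>b</item></set>';

# By default 'item' is a plain scalar for $one but an array ref for $two.
# ForceArray makes it an array ref in both cases, so the calling code
# can always dereference it the same way.
for my $xml ($one, $two) {
    my $data = XMLin($xml, ForceArray => ['item'], KeyAttr => []);
    print join(',', @{ $data->{item} }), "\n";
}
```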

> To debug a particular test you could say:
> BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t
>
>
> I've updated the HOWTO for writing test scripts:
> http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests

Good work!

chris


From schlesi at ebi.ac.uk  Wed Jun 27 07:57:27 2007
From: schlesi at ebi.ac.uk (Felix Schlesinger)
Date: Wed, 27 Jun 2007 12:57:27 +0100
Subject: [Bioperl-l] Selecting columns from alignment
Message-ID: <7317d50c0706270457i1c3d92a8hb124fa663f51b837@mail.gmail.com>

Hi,

is there an elegant way to select columns from an alignment object
fulfilling a certain property (for example less than x gaps)?
Everything I can see from Align::AlignI seems to involve looking at
the individual sequences, creating lots of slices and appending them.
Is there a better way in bioperl or, failing that, does anyone know a
software package with similar functionality (t-coffee has lots of
filters for alignments, but nothing to select columns besides by
position it seems). Ideally this would also return a mapping from old
to new positions in one of the sequences of course.
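
To pin down what is being asked for, here is a plain-Perl sketch that works
on the aligned strings directly rather than on a BioPerl alignment object.
select_columns is a made-up name, rows are assumed to be of equal length,
and '-' and '.' are treated as gap characters:

```perl
use strict;
use warnings;

# Keep the columns that contain at most $max_gaps gap characters.
# Returns the kept column indices (0-based) and the filtered rows.
sub select_columns {
    my ($rows, $max_gaps) = @_;
    my $len = length $rows->[0];
    my @keep;
    for my $col (0 .. $len - 1) {
        my $gaps = grep { substr($_, $col, 1) =~ /[-.]/ } @$rows;
        push @keep, $col if $gaps <= $max_gaps;
    }
    my @filtered = map {
        my $row = $_;
        join '', map { substr($row, $_, 1) } @keep;
    } @$rows;
    return (\@keep, \@filtered);
}

my @rows = qw(AC-GT A--GT ACTG-);
my ($keep, $filtered) = select_columns(\@rows, 1);

# Mapping from old column index to new column index.
my %new_col_for;
@new_col_for{@$keep} = (0 .. $#$keep);

print "$_\n" for @$filtered;
print "old column $_ -> new column $new_col_for{$_}\n" for @$keep;
```

With an alignment object, the rows could be taken from each sequence's
string and the result wrapped back into new sequence objects.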

Thanks
  Felix


From cjfields at uiuc.edu  Wed Jun 27 10:36:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 09:36:41 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>


On Jun 26, 2007, at 3:21 PM, George Hartzell wrote:

> ...
> If you have a dev.open-bio.org account and you're in the bioperl
> group, you're good to get at it via:
>
>   file:///home/hartzell/bioperl
>
> or
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl

I managed to get it working using file://.  Haven't tried svn+ssh yet  
but I've had persistent problems getting ssh to work properly on my  
macbook; not sure why yet but I haven't had time to play around with it.

> There are a couple of things to think about:
>
>   - how are we going to provide access.  I *think* that I heard a
>     decision to use http:// and https://.  Who gets to set that up?

That hasn't been decided yet and will be up to a consensus of the  
core devs, but I think the odds are in favor of allowing https:// but  
against allowing http://.

As for setup that could be anyone with admin privs, though it may be  
best left up to Chris D, Jason, or Mauricio.

>   - what do we want to do about keywords.  The cvs2svn tool guesses
>     and automatically sets the svn:keywords property to Author Date
>     Revision and Id on many of the files in the tree.  If it looks
>     like it got it right, we can stick with it.  Or, we can disable
>     that conversion and I've cribbed a little script that'll grep out
>     files using Id and set the svn:keywords property accordingly.

Probably again a consensus issue, but you can choose one route.  My  
inclination is the former if it's easier.

>   - what do we want to do about svn:ignore?  I haven't seen any
>     .cvsignore files.

Not sure.  I've never used one personally, but (as Jason suggests) if  
you have ideas for one you can propose them, or we can suggest devs  
set up svn:ignore locally.

> Beyond that, how does the repo look?

Seems fine, though a simple 'svn co file:///home/hartzell/bioperl'  
checkout gets everything (all distros, branches, etc).  We need to  
make sure everyone uses

  svn co file:///home/hartzell/bioperl/bioperl-live/trunk /live

or similar if they just want the latest core/db/etc.

We'll also need to start a svn wiki page to show how to get relevant  
distros (similar in style probably to the cvs page, with dev  
information, how to set up ssh keys, https stuff, etc).

> How are we going to cut over?
>
> Are we going to try to push svn commits to the read-mostly CVS repo,
> or just keep it around for history's sake (I lean towards the latter).

I think a clean cut-over.  Everyone would be warned to hold commits  
for a day (lest they be lost), then probably do something in this order:

- switch cvs to read-only except for svn commits
- run a clean cvs2svn
- set up svn as read/write
- set up test commits to cvs via svn
- disable cvs commit messages to bioperl-guts, enable svn commit  
messages in its place.
- push svn commits over to read-only cvs

cvs >>must<< be read-only after that point (no cvs->svn commits),  
with write access only available through svn.  If at some future  
point there is no reason to keep it around or that it is more trouble  
than it's worth, we can make a decision then on cvs's fate.

> g.

chris


From rvos at interchange.ubc.ca  Wed Jun 27 10:23:25 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Wed, 27 Jun 2007 07:23:25 -0700 (PDT)
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
Message-ID: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>

 
> Are we going to try to push svn commits to the read-mostly CVS repo,
> or just keep it around for history's sake (I lean towards the latter).

I'm a little confused - surely once the svn is up and running we'll want *no more* cvs commits? Parallel repositories that each accumulate stuff will be a nightmare. I'm probably just not getting your point.

Rutger


From cjfields at uiuc.edu  Wed Jun 27 11:18:03 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 10:18:03 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>


On Jun 27, 2007, at 9:23 AM, rvos wrote:

>
>> Are we going to try to push svn commits to the read-mostly CVS repo,
>> or just keep it around for history's sake (I lean towards the  
>> latter).
>
> I'm a little confused - surely once the svn is up and running we'll  
> want *no more* cvs commits? Parallel repositories that each  
> accumulate stuff will be a nightmare. I'm probably just not getting  
> your point.
>
> Rutger

Most projects make a clean break with cvs (no more commits) for the  
reasons you point out.  Not sure how the other core devs feel about  
that but I could go for that; it would def. prevent headaches.  We  
could keep cvs for the time being as read-only, with no svn->cvs  
syncing.

There are a few projects which have (as a phase-out plan) old read-only  
cvs repositories available, with an automatic svn->cvs commit  
following every new svn commit.  Not sure how that works, esp. for  
branching/merging and so on which I could see potentially getting hairy.

chris


From cjfields at uiuc.edu  Wed Jun 27 12:05:49 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 11:05:49 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <5EA56270-3427-4995-B3C1-2789229AACF1@uiuc.edu>


On Jun 26, 2007, at 3:21 PM, George Hartzell wrote:

> ...If you have a dev.open-bio.org account and you're in the bioperl
> group, you're good to get at it via:
>
>   file:///home/hartzell/bioperl
>
> or
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl

Did manage to get svn+ssh working (with some password harassment);  
core tests passed enough that I think everything's okay.  If ssh keys  
are set up correctly (mine aren't) it should work fine.

chris


From dmessina at wustl.edu  Wed Jun 27 12:27:32 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 27 Jun 2007 11:27:32 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
Message-ID: <BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>

> [Chris]
>
> I managed to get it working using file://.  Haven't tried svn+ssh yet
> but I've had persistent problems getting ssh to work properly on my
> macbook; not sure why yet but I haven't had time to play around  
> with it.

I just did a checkout and a test commit, both via svn+ssh -- works  
great for me.


>> [George]
>>
>>   - what do we want to do about keywords.  The cvs2svn tool guesses
>>     and automatically sets the svn:keywords property to Author Date
>>     Revision and Id on many of the files in the tree.  If it looks
>>     like it got it right, we can stick with it.  Or, we can disable
>>     that conversion and I've cribbed a little script that'll grep out
>>     files using Id and set the svn:keywords property accordingly.


I would think we would want "Author Date Id Rev URL" set on  
everything, no? So either cvs2svn or your tool (whichever you think  
is better), followed by

	svn propset -R svn:keywords "Author Date Id Rev URL" .

from the root of a working copy would take care of all of the  
existing files in the repository, I think.
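
A rough sketch of the selective approach George mentions (only touching files that already contain an Id keyword), written in Python for illustration. The function names are hypothetical, and the script only builds the `svn propset` command lines rather than running them; none of this is taken from the actual conversion script. Walking the tree also reaches subdirectories, which a bare `*` in the shell does not.

```python
import os
import re
import shlex
import tempfile

KEYWORDS = "Author Date Id Rev URL"
# Matches both the collapsed ($Id$) and the expanded ($Id: ... $) form.
ID_KEYWORD = re.compile(rb"\$Id(:[^$\n]*)?\$")


def files_using_id(root):
    """Yield paths under root whose contents contain an Id keyword."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Subversion's administrative directories; sort for stable output.
        dirnames[:] = sorted(d for d in dirnames if d != ".svn")
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                if ID_KEYWORD.search(handle.read()):
                    yield path


def propset_commands(root, keywords=KEYWORDS):
    """Build (but do not run) one svn propset command per matching file."""
    return [["svn", "propset", "svn:keywords", keywords, path]
            for path in files_using_id(root)]


if __name__ == "__main__":
    # Demonstrate on a throwaway tree instead of a real working copy.
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "Bio"))
        with open(os.path.join(tmp, "Bio", "Seq.pm"), "w") as fh:
            fh.write("# $Id$\n")
        with open(os.path.join(tmp, "README"), "w") as fh:
            fh.write("no keywords here\n")
        for command in propset_commands(tmp):
            print(shlex.join(command))
```

Printing the commands first gives a chance to review the file list before anything is changed in the working copy.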

George knows more about this than I do, but I think you can set up a  
global config file with

	[miscellany]
	enable-auto-props = yes

	[auto-props]
	* = svn:keywords="Author Date Id Rev URL"

to ensure it gets set on any future additions to the repository.


>>   - what do we want to do about svn:ignore?  I haven't seen any
>>     .cvsignore files.
>
> Not sure.  I've never used one personally, but (as Jason suggests) if
> you have ideas for one you can propose them, or we can suggest devs
> set up svn:ignore locally.

I use the default global-ignores

	global-ignores = *.o *.lo *.la #*# .*.rej *.rej .*~ *~ .#* .DS_Store

(again, in my system-wide config file), but I'm not tied to that. I  
do think we should have one, though; individuals can easily override  
any settings in the system-wide config with their own
~/.subversion/config.
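
To make the effect of that pattern list concrete, here is a small sketch that checks file names against it. It uses Python's `fnmatch` as a stand-in for Subversion's own glob matching, which is an assumption; the two may differ in edge cases. The sample file names are made up.

```python
from fnmatch import fnmatchcase

# The global-ignores value quoted above, split into individual glob patterns.
GLOBAL_IGNORES = "*.o *.lo *.la #*# .*.rej *.rej .*~ *~ .#* .DS_Store".split()


def is_ignored(filename, patterns=GLOBAL_IGNORES):
    """Return True if the bare file name matches any ignore pattern."""
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


if __name__ == "__main__":
    for name in ["Seq.pm", "Seq.pm~", "#Seq.pm#", "Build.PL.rej", ".DS_Store"]:
        status = "ignored" if is_ignored(name) else "kept"
        print(f"{name:15} {status}")
```

Editor backups, patch rejects and Finder metadata are filtered out, while ordinary source files are left alone.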


>> Beyond that, how does the repo look?

Looks great, George! Thanks for doing this.


Dave


From hartzell at alerce.com  Wed Jun 27 13:00:53 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 13:00:53 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <18050.38853.526224.791878@almost.alerce.com>

rvos writes:
 >  
 > > Are we going to try to push svn commits to the read-mostly CVS repo,
 > > or just keep it around for history's sake (I lean towards the latter).
 > 
 > I'm a little confused - surely once the svn is up and running we'll
 > want *no more* cvs commits? Parallel repositories that each
 > accumulate stuff will be a nightmare. I'm probably just not getting
 > your point. 

There had been some talk of keeping a CVS repository around as a
read-only mirror of the svn repo, presumably for people whose habits
or setup won't let them use svn.

In theory, each commit to the svn repo can be automagically pushed
down into CVS w/out user intervention; Google will tell you how, but
I've never run anything that way.

g.


From dmessina at wustl.edu  Wed Jun 27 13:27:01 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 27 Jun 2007 12:27:01 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
Message-ID: <99969FC2-479E-408C-AADB-7664EBE937CF@wustl.edu>

> [Chris]
> We'll also need to start a svn wiki page to show how to get relevant
> distros (similar in style probably to the cvs page, with dev
> information, how to set up ssh keys, https stuff, etc).

I cloned the CVS page and have started adapting it for Subversion:

	http://www.bioperl.org/wiki/Using_Subversion

I'll do some more on it later today, but if anyone wants to fiddle  
with it in the interim, please do.


Dave


From n.haigh at sheffield.ac.uk  Wed Jun 27 14:44:16 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 19:44:16 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <46823ABE.2080300@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk>
Message-ID: <4682B000.2050707@sheffield.ac.uk>

Sendu Bala wrote:
> Sendu Bala wrote:
>> Sendu Bala wrote:
>>> In considering updating all the test scripts to [... use] 
>>> t/lib/BioperlTest.pm
>> I'm now in the process of converting all test scripts.
> 
> And I've now completed that job (for bioperl-live at least), except for 
> t/EUtilities.t since I know Chris is working on it.
> 
> 
> In addition to converting to Test::More where necessary, I've also made 
> all pseudo-TODO blocks real ones. Previously I had advised to use SKIP 
> blocks instead since TODO blocks need a Test::Harness upgrade. However I 
> think in the next release we ought to make such upgrading compulsory 
> (which should be automatic when combined with compulsory usage of 
> Module::Build and Test::More in turn: users simply have to update CPAN).
> 
> 
> The conversion to BioperlTest directly led to the discovery and fixing 
> of 6 minor bugs, so was certainly not without merit.
> 
> 
> No user or developer needs to have BIOPERLDEBUG permanently set to true 
> anymore. To run all tests you just have to answer yes to the BioDBGFF 
> and networking questions of 'perl Build.PL'. With './Build test' you 
> then get clean, easy-to-read output where it is obvious to see that we 
> currently have these issues:
> 
> t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in 
> another thread.
> 
> t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, 
> t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and 
> t/Annotation.t all have TODO tests. If you know about those modules, now 
> would be a great time to implement those TODOs!
> 
> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are 
> deprecated' warnings.

Ah, that reminds me!

I recently tried to do an install of the cvs head (a week or two ago) on
a clean installation of Debian 4.0 (etch). During the installation of
dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
Bioperl. I seem to remember this circular dependency cropping up before
- am I correct - and can you remind me how this was "fixed"?

Cheers
Nath


From bix at sendu.me.uk  Wed Jun 27 14:52:01 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 19:52:01 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B000.2050707@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
Message-ID: <4682B1D1.3080206@sendu.me.uk>

Nathan S. Haigh wrote:
> I recently tried to do an install of the cvs head (a week or two ago) on
> a clean installation of Debian 4.0 (etch). During the installation of
> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
> Bioperl. I seem to remember this circular dependency cropping up before
> - am I correct - and can you remind me how this was "fixed"?

Yes, it always happens. It was 'fixed' by being completely ignored by 
me. Installation is guaranteed to fail, but if you really want it, 
trying to install again after you already have Bioperl installed will 
result in success.

Clearly something nicer could be done. Suggestions on a postcard...


From cjfields at uiuc.edu  Wed Jun 27 15:01:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 14:01:01 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B000.2050707@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
Message-ID: <A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>


On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote:

> Sendu Bala wrote:
>> ...
>> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are
>> deprecated' warnings.
>
> Ah, that reminds me!
>
> I recently tried to do an install of the cvs head (a week or two  
> ago) on
> a clean installation of Debian 4.0 (etch). During the installation of
> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
> Bioperl. I seem to remember this circular dependency cropping up  
> before
> - am I correct - and can you remind me how this was "fixed"?
>
> Cheers
> Nath

Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part  
of Bioperl (and he could become a dev).  That would solve it.

chris


From n.haigh at sheffield.ac.uk  Wed Jun 27 15:16:40 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 20:16:40 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
Message-ID: <4682B798.1010409@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> 
> On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote:
> 
>> Sendu Bala wrote:
>>> ...
>>> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are
>>> deprecated' warnings.
>>
>> Ah, that reminds me!
>>
>> I recently tried to do an install of the cvs head (a week or two ago) on
>> a clean installation of Debian 4.0 (etch). During the installation of
>> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
>> Bioperl. I seem to remember this circular dependency cropping up before
>> - am I correct - and can you remind me how this was "fixed"?
>>
>> Cheers
>> Nath
> 
> Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part of
> Bioperl (and he could become a dev).  That would solve it.
> 
> chris

Just to put the feelers out to see what people think.

It seems (to me at least) that Bioperl modules could/should? be released
as individual modules and that "bioperl" would really constitute a
"bundle" of all these modules - in terms of CPAN anyway. Am I correct in
this thinking? The Bio::ASN1::EntrezGene could simply require a
particular module rather than the whole of bioperl - might get out of
the circular dependency theoretically!?

I'm not suggesting moving in this direction, but just wondered what
others thought about this concept?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgreYczuW2jkwy2gRAi5IAJ9/Alq1fktEmAF16DlKcBVcy7d+jQCeIj+X
tOFQUQ7cGJLUITEDw1+QLxc=
=Yc+g
-----END PGP SIGNATURE-----


From cjfields at uiuc.edu  Wed Jun 27 15:31:44 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 14:31:44 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B798.1010409@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
	<4682B798.1010409@sheffield.ac.uk>
Message-ID: <33C76559-4771-4FDC-9EEA-1645BC3C576C@uiuc.edu>


On Jun 27, 2007, at 2:16 PM, Nathan S. Haigh wrote:

> ...
>
> Just to put the feelers out to see what people think.
>
> It seems (to me at least) that Bioperl modules could/should? be  
> released
> as individual modules and that "bioperl" would really constitute a
> "bundle" of all these modules - in terms of CPAN anyway. Am I  
> correct in
> this thinking? The Bio::ASN1::EntrezGene could simply require a
> particular module rather than the whole of bioperl - might get out of
> the circular dependency theoretically!?
>
> I'm not suggesting moving in this direction, but just wondered what
> others thought about this concept?
>
> Nath

Well, Steve suggested splitting some of core into distinct groups,  
which I tend to agree with in some respects (speed up releases for  
those modules, such as SearchIO, DB, Graphics).  The problem we have  
yet to solve is what we consider 'core'.  Is it Bio::Seq and  
related?  Should it include Bio::DB*?  Should it just be Bio::*  
modules with no or very few external dependencies?  And so on...
Probably not a decision we want to make immediately (at least not until
after svn migration, tests finished, maybe a release or two, a beer)...

The Bioperl module dependency that Bio::ASN1::EntrezGene has is  
Bio::Index::AbstractSeq.  You could try a test build of  
Bio::ASN1::EntrezGene to see what happens.

chris


From hlapp at gmx.net  Wed Jun 27 15:49:15 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 16:49:15 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
Message-ID: <E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>


On Jun 27, 2007, at 1:27 PM, David Messina wrote:

> I would think we would want "Author Date Id Rev URL" set on
> everything, no? So either cvs2svn or your tool (whichever you think
> is better), followed by
>
> 	svn propset svn:keywords "Author Date Id Rev URL" *

Shouldn't this be done recursively?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Jun 27 15:50:27 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 16:50:27 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
Message-ID: <E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>


On Jun 27, 2007, at 12:18 PM, Chris Fields wrote:

> Most projects make a clean break with cvs (no more commits) for the
> reasons you point out.  Not sure how the other core devs feel about
> that but I could go for that; it would def. prevent headaches.

There shouldn't be any cvs write support after the cut-over I think.  
I don't see the benefit that would justify the huge headache potential.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 27 16:01:40 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:01:40 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
Message-ID: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>


On Jun 27, 2007, at 2:50 PM, Hilmar Lapp wrote:

>
> On Jun 27, 2007, at 12:18 PM, Chris Fields wrote:
>
>> Most projects make a clean break with cvs (no more commits) for the
>> reasons you point out.  Not sure how the other core devs feel about
>> that but I could go for that; it would def. prevent headaches.
>
> There shouldn't be any cvs write support after the cut-over I  
> think. I don't see the benefit that would justify the huge headache  
> potential.
>
> 	-hilmar

Agreed, so maybe we should set that in stone.  That means no svn->cvs  
syncing post-migration as well, I assume.

Now how about a quick straw poll, what kind of access?  svn+ssh is  
already available, but some (Aaron among them) have indicated they  
would like https as well (not sure how involved it would be to set up).

chris


From hlapp at gmx.net  Wed Jun 27 16:08:40 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 17:08:40 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
Message-ID: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net>


On Jun 27, 2007, at 5:01 PM, Chris Fields wrote:

> That means no svn->cvs syncing post-migration as well, I assume.

That's a bit of a different story. People out there have URL links  
into our anonymous CVS repository. If it's not too troublesome (and  
I tend to think it's not) I'd like to maintain those in working  
order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi  
script that maps between the URL flavors (i.e., that maps a CVS-style  
URL to the equivalent SVN link).
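
The URL-mapping idea could look roughly like the sketch below, given here in Python rather than Perl. Every URL shape in it is a placeholder: the `/cgi/viewcvs.cgi/` prefix, the `svn.example.org` host and the `<module>/trunk/` layout are assumptions for illustration, not the real locations of either repository.

```python
from urllib.parse import urlsplit

# Placeholder values; the real prefixes would depend on how both web
# front ends are actually configured.
CVS_PREFIX = "/cgi/viewcvs.cgi/"
SVN_BASE = "http://svn.example.org/bioperl"


def svn_url_for(cvs_url):
    """Map a CVS-web-style URL to an assumed SVN URL, or None if unknown."""
    path = urlsplit(cvs_url).path  # drops any ?rev=... query string
    if not path.startswith(CVS_PREFIX):
        return None
    module, _, rest = path[len(CVS_PREFIX):].partition("/")
    if not module:
        return None
    # Assumed layout: <module>/trunk/<path within module>
    return f"{SVN_BASE}/{module}/trunk/{rest}"


def redirect_response(cvs_url):
    """Return CGI response headers: a permanent redirect, or a 404."""
    target = svn_url_for(cvs_url)
    if target is None:
        return ("Status: 404 Not Found\r\n"
                "Content-Type: text/plain\r\n\r\n"
                "Unknown CVS URL\n")
    return f"Status: 301 Moved Permanently\r\nLocation: {target}\r\n\r\n"


if __name__ == "__main__":
    old = "http://cvs.example.org/cgi/viewcvs.cgi/bioperl-live/Bio/Seq.pm?rev=1.5"
    print(redirect_response(old), end="")
```

A permanent redirect tells clients and search engines to update their links, which fits a one-way cut-over; CVS revision numbers in the query string have no direct SVN equivalent and are simply dropped here.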

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Wed Jun 27 16:15:10 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 16:15:10 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
Message-ID: <18050.50510.84363.355034@almost.alerce.com>

David Messina writes:
 > > [Chris]
 > >
 > > I managed to get it working using file://.  Haven't tried svn+ssh yet
 > > but I've had persistent problems getting ssh to work properly on my
 > > macbook; not sure why yet but I haven't had time to play around  
 > > with it.
 > 
 > I just did a checkout and a test commit, both via svn+ssh -- works  
 > great for me.

Is there anyone working outside of bioperl-{run,live,ext}?

g.


From bix at sendu.me.uk  Wed Jun 27 16:22:13 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 21:22:13 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B798.1010409@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
	<4682B798.1010409@sheffield.ac.uk>
Message-ID: <4682C6F5.4020406@sendu.me.uk>

Nathan S. Haigh wrote:
> It seems (to me at least) that Bioperl modules could/should? be released
> as individual modules and that "bioperl" would really constitute a
> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
> this thinking? The Bio::ASN1::EntrezGene could simply require a
> particular module rather than the whole of bioperl - might get out of
> the circular dependency theoretically!?

No, it wouldn't. The 'problem' only arises because the user is 
/choosing/ to install both Bioperl and Bio::ASN1::EntrezGene at the same 
time. So even if Bioperl was released as separate modules there would 
still be that 'bundle' and users would still choose to do the same 
thing: install all the Bioperl modules as well as all its /optional/ 
recommended modules. And there lies the problem: Bio::ASN1::EntrezGene 
requires  Bioperl modules, and one Bioperl module requires 
Bio::ASN1::EntrezGene, so the circularity isn't solved.


(FYI:
Bio::ASN1::EntrezGene requires Bio::Index::AbstractSeq
Bio::Index::AbstractSeq requires a couple of Bioperl modules, including 
Bio::Root::Root

Bio::SeqIO::entrezgene requires Bio::ASN1::EntrezGene and a bunch of 
Bioperl modules, including Bio::Root::Root.
)


You only avoid circularity by choosing not to install everything in one 
go. Which is something you can do right now with no problems.


From n.haigh at sheffield.ac.uk  Wed Jun 27 16:24:18 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 21:24:18 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
In-Reply-To: <E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
Message-ID: <4682C772.5070502@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hilmar Lapp wrote:
> On Jun 27, 2007, at 12:18 PM, Chris Fields wrote:
> 
>> Most projects make a clean break with cvs (no more commits) for the
>> reasons you point out.  Not sure how the other core devs feel about
>> that but I could go for that; it would def. prevent headaches.
> 
> There shouldn't be any cvs write support after the cut-over I think.  
> I don't see the benefit that would justify the huge headache potential.
> 
> 	-hilmar

I agree. A clean switch from cvs read/write to svn read/write plus cvs
read only sounds the least problematic!

However, how will links to cvs be dealt with? Links on Bioperl could be
switched over to point to svn, but what about possible links from
external sources? Maybe a more generic approach of redirection could
work? Or a simple warning page stating the fact that we have moved from
cvs to svn and provide a common link to follow?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgsdyczuW2jkwy2gRAtuyAKDIpN0TNX0U7sTuE3i+fj6WFZ1K0QCfcX7Y
81KurFwJlRtYFxSmLZP56Sk=
=pp7b
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Wed Jun 27 16:30:19 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 17:30:19 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>


On Jun 26, 2007, at 5:21 PM, George Hartzell wrote:

>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>

Cool - this works for me.

One thing I notice is that in cvs log you see which version is in  
which branch which is useful to answer user queries that might be a  
version problem. svn log doesn't seem to want to show that. Does  
anyone have ideas for how to do this in svn?
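
One partial answer, sketched under assumptions: in Subversion, branches and tags are ordinary directories, so the branch a change belongs to appears in the changed paths that `svn log -v` lists rather than in a per-file branch table. The helper below classifies such a path, assuming the conventional trunk/branches/tags layout; the function name and the sample branch name are made up for illustration.

```python
def classify(changed_path):
    """Classify a repository path by the trunk/branches/tags convention.

    Returns ("trunk", None), ("branch", name) or ("tag", name),
    or None when the path follows no recognisable layout.
    """
    parts = [part for part in changed_path.split("/") if part]
    for index, part in enumerate(parts):
        if part == "trunk":
            return ("trunk", None)
        if part in ("branches", "tags") and index + 1 < len(parts):
            kind = "branch" if part == "branches" else "tag"
            return (kind, parts[index + 1])
    return None


if __name__ == "__main__":
    for path in ["/bioperl-live/trunk/Bio/Seq.pm",
                 "/bioperl-live/branches/some-branch/Bio/Seq.pm",
                 "/README"]:
        print(path, "->", classify(path))
```

This only says where a given change landed; it does not reproduce the per-file "which revision is on which branch" summary that cvs log prints.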

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Jun 27 16:32:18 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 17:32:18 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <4682C772.5070502@sheffield.ac.uk>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<4682C772.5070502@sheffield.ac.uk>
Message-ID: <D080DC49-A2A4-44E4-9027-A63C1772CD85@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jun 27, 2007, at 5:24 PM, Nathan S. Haigh wrote:

> However, how will links to cvs be dealt with?

Well I said before that probably one can write a couple of lines of  
Perl to write a cgi script that returns the appropriate redirect URL  
with a redirect status code.

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFGgslWuV6N2JxL7qsRAvsTAKDjR18NzWzlj74mCF+diNpe2dLV2ACgn/4Y
f6sJ/ngeKEGpKHgyAHM1DAA=
=8n0E
-----END PGP SIGNATURE-----


From cjfields at uiuc.edu  Wed Jun 27 16:50:11 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:50:11 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>
Message-ID: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu>


On Jun 27, 2007, at 3:30 PM, Hilmar Lapp wrote:

>
> On Jun 26, 2007, at 5:21 PM, George Hartzell wrote:
>
>>
>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>>
>
> Cool - this works for me.
>
> One thing I notice is that in cvs log you see which version is in  
> which branch which is useful to answer user queries that might be a  
> version problem. svn log doesn't seem to want to show that. Does  
> anyone have ideas for how to do this in svn?
>
> 	-hilmar

We prob. should move it to a new directory ASAP which George can  
write to when he needs to update.  cvs is in /home/repository/bioperl,
so maybe something similar, like /home/svn/repository/bioperl?

chris


From cjfields at uiuc.edu  Wed Jun 27 16:51:37 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:51:37 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net>
Message-ID: <4D8CAAD9-4774-47FB-84E0-7FBA50EC377B@uiuc.edu>


On Jun 27, 2007, at 3:08 PM, Hilmar Lapp wrote:

>
> On Jun 27, 2007, at 5:01 PM, Chris Fields wrote:
>
>> That means no svn->cvs syncing post-migration as well, I assume.
>
> That's a bit of a different story. People out there have URL links  
> into our anonymous CVS repository. If it's not too troublesome (and  
> I tend to think it's not) I'd like to maintain those in working  
> order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi  
> script that maps between the URL flavors (i.e., that maps a CVS- 
> style URL to the equivalent SVN link).
>
> 	-hilmar

I'll try getting a wiki page up as a checklist for this, including  
what direction we're heading in, ideas (your list and CGI redirect  
ideas, svn:ignore issues, etc).  Dave has already started on the  
'getting bioperl using svn' wiki page.

If we intend to sync cvs with svn we need to find the right tools or  
at least check for other projects which have done something similar.   
I haven't googled on that yet but I'll attempt to tonight.

chris


From cjfields at uiuc.edu  Wed Jun 27 16:53:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:53:08 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <C2A83EA3.EC27%bosborne11@verizon.net>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
Message-ID: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>

bioperl-run also.  I think the run CVS repo has some binary files, so  
if there are any problems with cvs2svn it'll be there.

chris

On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote:

> George,
>
> bioperl-db and bioperl-network should be included, I think.
>
> Brian O
>
>
> On 6/27/07 4:15 PM, "George Hartzell" <hartzell at alerce.com> wrote:
>
>> David Messina writes:
>>>> [Chris]
>>>>
>>>> I managed to get it working using file://.  Haven't tried svn+ssh
>>>> yet
>>>> but I've had persistent problems getting ssh to work properly on my
>>>> macbook; not sure why yet but I haven't had time to play around
>>>> with it.
>>>
>>> I just did a checkout and a test commit, both via svn+ssh -- works
>>> great for me.
>>
>> Is there anyone working outside of bioperl-{run,live,ext}?
>>
>> g.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Wed Jun 27 17:05:50 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 22:05:50 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682C6F5.4020406@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk>
Message-ID: <4682D12E.3000803@sendu.me.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> It seems (to me at least) that Bioperl modules could/should? be released
>> as individual modules and that "bioperl" would really constitute a
>> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
>> this thinking? The Bio::ASN1::EntrezGene could simply require a
>> particular module rather than the whole of bioperl - might get out of
>> the circular dependency theoretically!?
> 
> No, it wouldn't.
[snip]
> You only avoid circularity by choosing not to install everything in one 
> go.

Errr... I take that back. Since CPAN bundles install things in a certain 
order, you just have to make sure that everything Bio::ASN1::EntrezGene 
needs is installed first, then Bio::ASN1::EntrezGene, then 
Bio::SeqIO::entrezgene.
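
The ordering argument can be checked with a topological sort. The sketch below uses the module relationships from the FYI list earlier in this thread; the distribution-level graph and its names are a simplification added for illustration, and `graphlib` needs Python 3.9 or later.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

# Each key requires the modules in its value set.
MODULE_DEPS = {
    "Bio::Index::AbstractSeq": {"Bio::Root::Root"},
    "Bio::ASN1::EntrezGene": {"Bio::Index::AbstractSeq"},
    "Bio::SeqIO::entrezgene": {"Bio::ASN1::EntrezGene", "Bio::Root::Root"},
}

# The same relationships collapsed to two whole distributions
# (illustrative names): each now appears to need the other.
DIST_DEPS = {
    "bioperl": {"Bio-ASN1-EntrezGene"},
    "Bio-ASN1-EntrezGene": {"bioperl"},
}


def install_order(deps):
    """Return an order where each item follows its requirements,
    or None if the requirements are circular."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


if __name__ == "__main__":
    print("per module:      ", install_order(MODULE_DEPS))
    print("per distribution:", install_order(DIST_DEPS))
```

At module granularity a valid order exists, while at distribution granularity the sort fails, which is the circularity seen when installing everything in one go.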

But the main problem with this approach is that maintenance, 
global-style code improvements and releases become a nightmare. I could, 
perhaps, imagine a scenario where the repository stayed as-is (one 
monolithic collection), but the dist action of Build.PL could be altered 
to generate a release package per module instead of one big release 
package of all modules, as is currently the case.

Is there much value in doing that? Does anyone want me to look into the 
feasibility of such a thing?


From bosborne11 at verizon.net  Wed Jun 27 16:19:47 2007
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 27 Jun 2007 16:19:47 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
In-Reply-To: <18050.50510.84363.355034@almost.alerce.com>
Message-ID: <C2A83EA3.EC27%bosborne11@verizon.net>

George,

bioperl-db and bioperl-network should be included, I think.

Brian O


On 6/27/07 4:15 PM, "George Hartzell" <hartzell at alerce.com> wrote:

> David Messina writes:
>>> [Chris]
>>> 
>>> I managed to get it working using file://.  Haven't tried svn+ssh yet
>>> but I've had persistent problems getting ssh to work properly on my
>>> macbook; not sure why yet but I haven't had time to play around
>>> with it.
>> 
>> I just did a checkout and a test commit, both via svn+ssh -- works
>> great for me.
> 
> Is there anyone working outside of bioperl-{run,live,ext}?
> 
> g.


From n.haigh at sheffield.ac.uk  Wed Jun 27 17:25:53 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 22:25:53 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682D12E.3000803@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
Message-ID: <4682D5E1.2030507@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> It seems (to me at least) that Bioperl modules could/should? be released
>>> as individual modules and that "bioperl" would really constitute a
>>> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
>>> this thinking? The Bio::ASN1::EntrezGene could simply require a
>>> particular module rather than the whole of bioperl - might get out of
>>> the circular dependency theoretically!?
>>
>> No, it wouldn't.
> [snip]
>> You only avoid circularity by choosing not to install everything in
>> one go.
> 
> Errr... I take that back. Since CPAN bundles install things in a certain
> order, you just have to make sure that everything Bio::ASN1::EntrezGene
> needs is installed first, then Bio::ASN1::EntrezGene, then
> Bio::SeqIO::entrezgene.
> 
> But the main problem with this approach is that maintenance,
> global-style code improvements and releases become a nightmare. I could,
> perhaps, imagine a scenario where the repository stayed as-is (one
> monolithic collection), but the dist action of Build.PL could be altered
> to generate a release package per module instead of one big release
> package of all modules, as is currently the case.
> 
> Is there much value in doing that? Does anyone want me to look into the
> feasibility of such a thing?


I think the value would be in other external modules being able to use
bioperl modules with more ease (not sure how many modules have, or
currently depend on bioperl) as they would depend on a single module,
rather than the whole package. However, how would the dependencies of
each module be handled? I'm clearly thinking aloud, but....Maybe this
would tease apart "cliques" of modules that are interdependent? and
could in themselves be shipped as bundles e.g. Bio::Graphics and have a
"master" bioperl bundle that installs all the bioperl modules.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgtXhczuW2jkwy2gRAiftAKDZQGDpaq5saEyE3ZfPyFqli4j+8QCfXbIB
2EZjccEFEzfFlx4H47gzwLk=
=nobl
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Wed Jun 27 17:35:28 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 18:35:28 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
	<4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
Message-ID: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>

Is there a reason not to port every subproject over?

	-hilmar

On Jun 27, 2007, at 5:53 PM, Chris Fields wrote:

> bioperl-run also.  I think the run CVS repo has some binary files, so
> if there are any problems with cvs2svn it'll be there.
>
> chris
>
> On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote:
>
>> George,
>>
>> bioperl-db and bioperl-network should be included, I think.
>>
>> Brian O
>>
>>
>> On 6/27/07 4:15 PM, "George Hartzell" <hartzell at alerce.com> wrote:
>>
>>> David Messina writes:
>>>>> [Chris]
>>>>>
>>>>> I managed to get it working using file://.  Haven't tried svn
>>>>> +ssh yet
>>>>> but I've had persistent problems getting ssh to work properly  
>>>>> on my
>>>>> macbook; not sure why yet but I haven't had time to play around
>>>>> with it.
>>>>
>>>> I just did a checkout and a test commit, both via svn+ssh -- works
>>>> great for me.
>>>
>>> Is there anyone working outside of bioperl-{run,live,ext}?
>>>
>>> g.
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 27 17:36:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 16:36:29 -0500
Subject: [Bioperl-l] Splits again, formerly  Test overhaul complete
In-Reply-To: <4682D12E.3000803@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
Message-ID: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>


On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> It seems (to me at least) that Bioperl modules could/should? be  
>>> released
>>> as individual modules and that "bioperl" would really constitute a
>>> "bundle" of all these modules - in terms of CPAN anyway. Am I  
>>> correct in
>>> this thinking? The Bio::ASN1::EntrezGene could simply require a
>>> particular module rather than the whole of bioperl - might get  
>>> out of
>>> the circular dependency theoretically!?
>> No, it wouldn't.
> [snip]
>> You only avoid circularity by choosing not to install everything  
>> in one go.
>
> Errr... I take that back. Since CPAN bundles install things in a  
> certain order, you just have to make sure that everything  
> Bio::ASN1::EntrezGene needs is installed first, then  
> Bio::ASN1::EntrezGene, then Bio::SeqIO::entrezgene.
>
> But the main problem with this approach is that maintenance, global- 
> style code improvements and releases become a nightmare. I could,  
> perhaps, imagine a scenario where the repository stayed as-is (one  
> monolithic collection), but the dist action of Build.PL could be  
> altered to generate a release package per module instead of one big  
> release package of all modules, as is currently the case.
>
> Is there much value in doing that? Does anyone want me to look into  
> the feasibility of such a thing?

Not for the time being, at least in my opinion.  Too much on our  
plate at this point with svn migration, test conversion, bugzilla  
running over (next point of attack!), etc.  Maybe something to think  
about after, though I like the idea of a few splits to core as Steve  
suggested (SearchIO, Graphics, some LWP-related DB modules).

My (albeit extreme) thought is to have a lean-and-mean set of 'core'  
modules with as few external dependencies as possible, which could  
work around the circular dependency issue in this case:

                dep.on                  dep.on
Bio::Auxiliary -----> ASN1::EntrezGene -----> core
(with EntrezGene)                             (basic SeqIO, Index, DB, etc)
       \---->------>--- dep.on ->----->----->----/

Bioperl auxiliary modules would list core as a required dependency  
along with anything else needed for that particular aux. section  
(i.e. XML parsers, LWP, GD, etc.).  The whole mess, if needed, would  
be installed using Bundle::BioPerl or similar, with no part released  
w/o testing on the whole 'base' to ensure proper interaction.
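
The install order implied by the layout above can be checked with a
topological sort. This is just a sketch: the three names are the labels
from the diagram, not real CPAN distribution names, and install_order
is a made-up helper. Each input pair "A B" tells tsort that A must be
installed before B.

```shell
# Sketch only: names are the diagram labels, not actual distributions.
install_order() {
  tsort <<'EOF'
core ASN1::EntrezGene
ASN1::EntrezGene Bio::Auxiliary
core Bio::Auxiliary
EOF
}

install_order
# prints, one per line: core, ASN1::EntrezGene, Bio::Auxiliary
```

With core free of any dependency on the EntrezGene parser, the graph
has no cycle, so a bundle can list the pieces in exactly this order.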

If a fix needed to be made in one set, make the fix, test against  
bioperl 'base' as a whole, and release when possible.  No need to  
wait for a full-fledged 1.5.3 release.

Maybe wishful thinking...

chris


From cjfields at uiuc.edu  Wed Jun 27 17:44:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 16:44:47 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
	<4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
	<9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>
Message-ID: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>

We should port them all, yes.

chris

On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote:

> Is there a reason not to port every subproject over?
>
> 	-hilmar


From cjfields at uiuc.edu  Wed Jun 27 17:53:02 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 16:53:02 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682D5E1.2030507@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<4682D5E1.2030507@sheffield.ac.uk>
Message-ID: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu>


On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote:

>> ...
>> Is there much value in doing that? Does anyone want me to look  
>> into the
>> feasibility of such a thing?
>
>
> I think the value would be in other external modules being able to use
> bioperl modules with more ease (not sure how many modules have, or
> currently depend on bioperl) as they would depend on a single module,
> rather than the whole package. However, how would the dependencies of
> each module be handled? I'm clearly thinking aloud, but....Maybe this
> would tease apart "cliques" of modules that are interdependent? and
> could in themselves be shipped as bundles e.g. Bio::Graphics and  
> have a
> "master" bioperl bundle that installa all the bioperl modules.

See my response to Sendu, and Steve Chervitz's original post and  
related thread:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315

which pretty much covers the same ground.  I think at most 4-5 split  
'cliques', including core, with the fewest possible dependencies in  
core.  If we do any of this, it prob. should wait until after an svn  
migration and bugzilla bug stomping unless there is a (well-argued)  
advantage to doing it now.

chris


From n.haigh at sheffield.ac.uk  Wed Jun 27 18:07:31 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 23:07:31 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<4682D5E1.2030507@sheffield.ac.uk>
	<1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu>
Message-ID: <4682DFA3.9090100@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> 
> On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote:
> 
>>> ...
>>> Is there much value in doing that? Does anyone want me to look into the
>>> feasibility of such a thing?
>>
>>
>> I think the value would be in other external modules being able to use
>> bioperl modules with more ease (not sure how many modules have, or
>> currently depend on bioperl) as they would depend on a single module,
>> rather than the whole package. However, how would the dependencies of
>> each module be handled? I'm clearly thinking aloud, but....Maybe this
>> would tease apart "cliques" of modules that are interdependent? and
>> could in themselves be shipped as bundles e.g. Bio::Graphics and have a
>> "master" bioperl bundle that installa all the bioperl modules.
> 
> See my response to Sendu, and Steve Chervitz's original post and related
> thread:
> 
> http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315
> 
> which pretty much covers the same ground.  I think at most 4-5 split
> 'cliques', including core, with the fewest possible dependencies in
> core.  If we do any of this, it prob. should wait until after an svn
> migration and bugzilla bug stomping unless there is a (well-argued)
> advantage to doing it now.
> 
> chris


That's fine by me - or should I say, the best way forward - I was really
just thinking aloud :)

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgt+jczuW2jkwy2gRAhPmAKDCgI1BOp/MOQVUQhQGqWaRRfPTaACfTPix
TSi/e8PtYTwpxn6x+ewrjBs=
=7Vp1
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Wed Jun 27 18:43:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 23:43:48 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
Message-ID: <4682E824.1050507@sendu.me.uk>

Chris Fields wrote:
> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote:
>> But the main problem with this approach is that maintenance, global- 
>> style code improvements and releases become a nightmare. I could,  
>> perhaps, imagine a scenario where the repository stayed as-is (one  
>> monolithic collection), but the dist action of Build.PL could be  
>> altered to generate a release package per module instead of one big  
>> release package of all modules, as is currently the case.
>>
>> Is there much value in doing that? Does anyone want me to look into  
>> the feasibility of such a thing?
> 
> Not for the time being, at least in my opinion.  Too much on our  
> plate at this point with svn migration, test conversion, bugzilla  
> running over (next point of attack!), etc.  Maybe something to think  
> about after, though I like the idea of a few splits to core as Steve  
> suggested (SearchIO, Graphics, some LWP-related DB modules).
[snip]
> If a fix needed to be made in one set, make the fix, test against  
> bioperl 'base' as a whole, and release when possible.  No need to  
> wait for a full-fledged 1.5.3 release.

What advantage is there of these defined splits instead of individual 
modules? As I see it you lose some of the potential benefits of breaking 
Bioperl up completely, whilst also suffering the maintenance problems I 
outlined in my objection to Steve's post.

Being able to work on all Bioperl from a single cvs (ne svn)
checkout/archive, whilst distributing it as individual modules on CPAN
seems like the best of both worlds to me. What am I missing?


From hartzell at alerce.com  Wed Jun 27 20:41:01 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 20:41:01 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>
	<9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu>
Message-ID: <18051.925.23313.932916@almost.alerce.com>

Chris Fields writes:
 > [...]
 > We prob. should move it to a new directory ASAP which george can
 > write to when he needs to update.  cvs is in /home/repository/bioperl,
 > so maybe something similar, like /home/svn/repository/bioperl?

I'd be parsimonious (lazy...) and go for /home/svn/bioperl.

g.


From hartzell at alerce.com  Wed Jun 27 20:46:29 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 20:46:29 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
Message-ID: <18051.1253.87485.235496@almost.alerce.com>

Chris Fields writes:
 > [...]
 > Now how about a quick straw poll, what kind of access?  svn+ssh is  
 > already available, but some (Aaron among them) have indicated they  
 > would like https as well (not sure how involved it would be to set up).

What we do here, in large part, depends on what our host machine makes
available to us.

Is there an apache instance that we can use?  Maybe a separate one?

May someone among us configure it, or do we need to ask for help?  (in
other words, does anyone have sudo?)

Is there some reason to not include http: (using Digest authentication
so that passwords aren't passed in the clear?)?  Maybe even go so far
as to ask why bother with https:, it's not like we need to transfer
any data encrypted....

g.


From dmessina at wustl.edu  Wed Jun 27 23:02:25 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 27 Jun 2007 22:02:25 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
Message-ID: <D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>


On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote:

>
> On Jun 27, 2007, at 1:27 PM, David Messina wrote:
>
>> I would think we would want "Author Date Id Rev URL" set on
>> everything, no?. So either cvs2svn or your tool (whichever you think
>> is better), followed by
>>
>> 	svn propset svn:keywords "Author Date Id Rev URL" *
>
> Shouldn't this be done recursively?


Yep, good catch! Thanks, Hilmar.

Should be:

	svn propset --recursive svn:keywords "Author Date Id Rev URL" *
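
A small wrapper makes it easy to preview that command before touching
a working copy. This is a sketch, not something from the thread: the
function name set_keywords and the SVN override variable are made up
for illustration; only "svn propset", "--recursive" and the
svn:keywords value come from the command above.

```shell
# Sketch only. Set SVN=echo to print the command instead of running it.
SVN="${SVN:-svn}"

# set_keywords PATH...  applies the keyword list to the given paths.
set_keywords() {
  "$SVN" propset --recursive svn:keywords "Author Date Id Rev URL" "$@"
}

# Example (run inside a working copy):
#   set_keywords *
```

Note that the shell expands * before svn sees it, so top-level dotfiles
are not matched unless they are listed explicitly.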


From jason at bioperl.org  Wed Jun 27 23:29:09 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 28 Jun 2007 00:29:09 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <18051.1253.87485.235496@almost.alerce.com>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
Message-ID: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>

I think Chris D and I will need to confer a bit on https+svn.  I
don't know when we'll have a good chance to discuss everything.  At
some point this discussion may need to be taken off the bioperl list
and limited to just the interested parties, as we're delving into
hardware geek land.

The repository machine (dev) is a locked-down machine, meaning it
really only runs ssh and few other services; in particular, it does not
run httpd.  We have anonymous CVS (client and through httpd browsing)
running on a separate machine (code) that has the info rsynced over
every 10 or 15 minutes. The foundation websites and mailing lists run
on a third machine (portal).


If we decide to support https we'll need to spend a little time  
deciding how well we can keep it locked down - it will only be https  
not http for example and we may want to see about limiting ssh access  
to everyone if we migrate all OBF projects over to SVN and only  
support https.

Again to re-iterate what I think we would do:
  - SVN read/write will live on 'dev'; _WHEN_ we switch over there will
be no more writes to the CVS repository. It will be available by
ssh+svn and potentially by https+svn
  - SVN read-only will live on 'code', it will be accessible by http+svn
  - CVS read-only will live on 'code', this will only be a sync from  
the SVN to the CVS.  See http://svn2cvs.tigris.org/ for details


As I tried to ask for in the past, would someone also illustrate the  
importance of why _WE_ need to switch to SVN on a wiki page on  
Bioperl so that when someone complains/asks about this in the future  
the arguments are already laid out.  I am basically fine with it, but  
I don't honestly see a compelling reason beyond what has been  
mentioned wrt better integration in IDEs.
http://bioperl.org/wiki/Why_SVN

-jason
On Jun 27, 2007, at 9:46 PM, George Hartzell wrote:

> Chris Fields writes:
>> [...]
>> Now how about a quick straw poll, what kind of access?  svn+ssh is
>> already available, but some (Aaron among them) have indicated they
>> would like https as well (not sure how involved it would be to set  
>> up).
>
> What we do here, in large part, depends on what our host machine makes
> available to us.
>
> Is there an apache instance that we can use?  Maybe a separate one?
>
> May someone among us configure it, or do we need to ask for help?  (in
> other words, does anyone have sudo?)
>
> Is there some reason to not include http: (using Digest authentication
> so that passwords aren't passed in the clear?)?  Maybe even go so far
> as to ask why bother with https:, it's not like we need to transfer
> any data encrypted....
>
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Wed Jun 27 23:51:32 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 28 Jun 2007 00:51:32 -0300
Subject: [Bioperl-l] Splits again
In-Reply-To: <4682E824.1050507@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
Message-ID: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>

Hey guys - I'm wading in a bit late as I haven't had time to keep up
with the whole discussion.

So you are suggesting 800+ individual CPAN modules?  I don't think  
that is a good idea.  Why would you split up Bio::Seq::RichSeq and  
Bio::Seq into two separate packages for example? I think if you  
really want to move away from the monolithic install it has to be  
more logical by function - but I am not that optimistic that this is  
going to actually be easier for people.  Maybe I'm misunderstanding.

What are the arguments for separating things -- to make it so people
aren't scared by the number of modules so they'll code?  It seems
like some people just want it to be installed and run scripts - does
having them install dozens of modules work?  Do we need to consider
how much this would suck for someone who can't use CPAN or
Module::Build to automate dependency tracking and installation?  How
does it work when modules are deprecated?

I'm not sure I have made up my mind on what I'd like to see, but at
some point I think we need to get a clearer idea of what audience we
are trying to serve best.  If we want it to be easy to install, maybe
we should invest time into making OSX double-click installers, RPMs,
and the Windows stuff easily installable.  Or do we want to serve the
developers who aren't using SVN, so that we want to push out releases
of modules ASAP?  I just am not clear on the motivation for some of the
proposed changes.

Also - the main point I wanted to make - Can I suggest we spend a  
little time discussing what it will take to get a stable release for  
the current code as it stands (bioperl-live and bioperl-run)?  It  
seems like we really need to do this first so that we have a stable  
release that can be followed by CVS -> SVN migration, then consider  
major changes to the repository structure and release packaging, and  
potential deprecation and incorporation of other modules.


I assume there is no chance that we'd have a 1.6 candidate by BOSC  
next month?

Will it be productive to schedule a fair amount of time at BOSC
discussing how to partition out the packages into separate
sub-packages after we've done a successful release rather than trying
to change things right now? I realize not everyone will be there but
maybe it will be easier to interact on this then.

I think it will also be time to talk with Lincoln/Scott about how  
Gbrowse is structured and if that is working for them.  There is too  
much code in different places that I think we need to figure out how  
to structure it properly so those packages can be released.  It would  
probably mean moving Bio::Graphics, Bio::DB::GFF and  
Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages  
so they could be released more regularly on par with Gbrowse  
schedules.   Also I think someone needs to figure out Bio::Tools::GFF  
vs Bio::FeatureIO -- what do we want to do?  I don't think we really  
fully support GFF3 that well -- the X2GFF scripts probably need some
more good testing (where X is BLAST, FASTA, Sim4, GenBank, EMBL,
etc.) and/or migration to the proper GFF writing.


-jason
On Jun 27, 2007, at 7:43 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote:
>>> But the main problem with this approach is that maintenance, global-
>>> style code improvements and releases become a nightmare. I could,
>>> perhaps, imagine a scenario where the repository stayed as-is (one
>>> monolithic collection), but the dist action of Build.PL could be
>>> altered to generate a release package per module instead of one big
>>> release package of all modules, as is currently the case.
>>>
>>> Is there much value in doing that? Does anyone want me to look into
>>> the feasibility of such a thing?
>>
>> Not for the time being, at least in my opinion.  Too much on our
>> plate at this point with svn migration, test conversion, bugzilla
>> running over (next point of attack!), etc.  Maybe something to think
>> about after, though I like the idea of a few splits to core as Steve
>> suggested (SearchIO, Graphics, some LWP-related DB modules).
> [snip]
>> If a fix needed to be made in one set, make the fix, test against
>> bioperl 'base' as a whole, and release when possible.  No need to
>> wait for a full-fledged 1.5.3 release.
>
> What advantage is there of these defined splits instead of individual
> modules? As I see it you lose some of the potential benefits of  
> breaking
> Bioperl up completely, whilst also suffering the maintenance  
> problems I
> outlined in my objection to Steve's post.
>
> Being able to work on all Bioperl from a single cvs (ne svn) check  
> out/
> archive, whilst distributing it as individual modules on CPAN seems  
> like
> the best of both worlds to me. What am I missing?
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From chris at bioteam.net  Thu Jun 28 00:08:25 2007
From: chris at bioteam.net (Chris Dagdigian)
Date: Thu, 28 Jun 2007 00:08:25 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
Message-ID: <97A3257B-8E00-48D7-8B7D-51AD728CB8F7@bioteam.net>


My understanding of "https+svn" is that it is actually
WebDAV-over-HTTP, which means that not only would we need to light up
an HTTPD server on the developer box, we'd also have to get a stable
mod_dav module installed (sometimes not trivial) and then we would have
to figure out how to handle the authentication bits. Right now with SSH
we use Unix group permissions to figure out who can write to what
repository -- WebDAV makes this a lot more complicated.

Forcing encryption over https will prevent someone from sniffing a  
developer password which removes the main security issue. The next  
problem is going to be integrating the DAV module with Linux PAM so  
that existing usernames and passwords can be used, -OR- we have to  
set up and maintain an entirely separate set of username and password  
maps for each developer and each SVN project.

I'm not super concerned about this -- BioTeam runs svn internally and  
we expose our SVN for employees both via WebDAV and SVN+SSH - it's  
not that hard to set up.

My biggest concern really has to do with how much extra work this  
will mean for the OBF sysadmin team. If there is an easy way to get a  
stable Apache/DAV/SVN integration going with authentication coming  
from Linux PAM then this is no big deal. If we have to manually  
maintain separate authentication lists then it will be kind of a hassle.

Like Jason mentioned, the OBF currently segregates "stuff" onto three  
different servers with three levels of security:

- dev.open-bio.org -- Developers only, SSH access only (main  
sourcecode repository for OBF)
- portal.open-bio.org -- Websites, Wikis, Blogs, Mailing list servers  
and helpdesk.open-bio.org
- code.open-bio.org -- "Disposable" anonymous access server that we  
can easily burn/wipe/reinstall if it ever gets hacked

Everything else that Jason mentioned is fine and easy to set up (if  
not already running):

  - SVN+SSH for developers
  - Anonymous SVN and Anonymous RSYNC for community access on  
code.open-bio.org
  - svn2cvs for whomever wants it on code.open-bio.org
  - web based SVN code browser installed on http://code.open-bio.org


Regards,
Chris


On Jun 27, 2007, at 11:29 PM, Jason Stajich wrote:

> I think Chris D and I will need to confer a bit on https+svn.  I  
> don't know when we'll have a good chance to discuss everything.  At  
> some point this discussion is may need to be taken off bioperl and  
> just the interested parties as we're delving into hardware geek land.
>
> The repository machine (dev) is a locked down machine meaning it  
> only really runs ssh and not many servers include httpd.  We have  
> anonymous CVS (client and through httpd browsing) running on a  
> separate machine (code) that has the info rsynced over every 10 or  
> 15 minutes. The foundation websites and mailing lists run on a  
> third machine (portal).
>
>
> If we decide to support https we'll need to spend a little time  
> deciding how well we can keep it locked down - it will only be  
> https not http for example and we may want to see about limiting  
> ssh access to everyone if we migrate all OBF projects over to SVN  
> and only support https.
>
> Again to re-iterate what I think we would do:
>  - SVN read/write will live on 'dev', _WHEN_ we switch over no  
> writes to the CVS repository. It will be available by ssh+svn and  
> potentially by https+svn
>  - SVN read-only will live on 'code', it will be accessible by http 
> +svn
>  - CVS read-only will live on 'code', this will only be a sync from  
> the SVN to the CVS.  See http://svn2cvs.tigris.org/ for details
>
>
> As I tried to ask for in the past, would someone also illustrate  
> the importance of why _WE_ need to switch to SVN on a wiki page on  
> Bioperl so that when someone complains/asks about this in the  
> future the arguments are already laid out.  I am basically fine  
> with it, but I don't honestly see a compelling reason beyond what  
> has been mentioned wrt better integration in IDEs.
> http://bioperl.org/wiki/Why_SVN
>
> -jason
> On Jun 27, 2007, at 9:46 PM, George Hartzell wrote:
>
>> Chris Fields writes:
>>> [...]
>>> Now how about a quick straw poll, what kind of access?  svn+ssh is
>>> already available, but some (Aaron among them) have indicated they
>>> would like https as well (not sure how involved it would be to  
>>> set up).
>>
>> What we do here, in large part, depends on what our host machine  
>> makes
>> available to us.
>>
>> Is there an apache instance that we can use?  Maybe a separate one?
>>
>> May someone among us configure it, or do we need to ask for help?   
>> (in
>> other words, does anyone have sudo?)
>>
>> Is there some reason to not include http: (using Digest  
>> authentication
>> so that passwords aren't passed in the clear?)?  Maybe even go so far
>> as to ask why bother with https:, it's not like we need to transfer
>> any data encrypted....
>>
>> g.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>


From cjfields at uiuc.edu  Thu Jun 28 00:18:03 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 23:18:03 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4682E824.1050507@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
Message-ID: <FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>


On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:

> Chris Fields wrote:
> ...
>> If a fix needed to be made in one set, make the fix, test against   
>> bioperl 'base' as a whole, and release when possible.  No need to   
>> wait for a full-fledged 1.5.3 release.
>
> What advantage is there of these defined splits instead of  
> individual modules? As I see it you lose some of the potential  
> benefits of breaking Bioperl up completely, whilst also suffering  
> the maintenance problems I outlined in my objection to Steve's post.
>
> Being able to work on all Bioperl from a single cvs (ne svn) check  
> out/ archive, whilst distributing it as individual modules on CPAN  
> seems like the best of both worlds to me. What am I missing?

Okay, forewarned, but here's my long-winded reasoning.  The short and  
sweet version: I (very) respectfully don't agree with you, at least  
re: the idea we should commit all modules to CPAN independently.  It  
doesn't make any sense to me, but maybe you can elaborate more?   
Maybe I'm misinterpreting what you mean?

Also, I agree with Steve C. that core is anything but a  
representation of a 'core' set of modules, and some sections could  
(should?) be split off into discrete, cohesive units.  We may be  
alone in that camp, though it doesn't seem so (it's popped up more  
than a few times, in one form or another).  If you want an in-depth  
explanation for both opinions, read on (below my sig), or feel free  
to bypass it.  I'll understand.

Finally, all of this should wait until later.  Much later, like after  
a decent release, after svn, etc kind of 'later'.  I think we can  
agree on that.

.
.
.
.
.

Still here?  Okay... each issue (skip as needed):

Individual CPAN modules:

CPAN is not our personal versioning system; it may be if a  
distribution consists of only a few modules, but not when it's one of  
the largest distros present.  If someone wants to update an  
individual bioperl module for a quick bug fix they are more than  
welcome to download it via cvs, svn, or even using a web browser, and  
replace the one they have.  In most cases, it works w/o problems.   
With Module::Build you have even made it easier if a full  
installation is necessary.

I'm trying to reason how one could break up the individual SeqIO/ 
SearchIO/otherIO modules into single module distributions.  They are  
intrinsically tied together (SeqIO::genbank won't work w/o SeqIO,  
which relies on the various interfaces, RootIO, and on down).  How  
would tests be run off CPAN when the modules are distributed  
independently?  Would they also be individually distributed?  What  
would you use to tie all the individual modules together?  How would  
you explain to the CPAN maintainers that you want to split bioperl  
into 990 individual modules, all updated independently, but intend on  
bundling them afterwards anyway?

I'm failing to see the advantages to this approach, but if you can  
find an example where this was done successfully on CPAN or elsewhere  
maybe I could see what you mean.

Splitting up core:

Here are the advantages of a defined split, as Steve and I see them
(off the top of my head).  Some of this probably reiterates  
my previous points, as well as Steve's, so apologies in advance.

- A lean, mean, focused set of bioperl base modules (core) w/o or  
with very few external deps, minimal installation issues, etc.  The  
very basic stuff to get up and running.

- BioPerl bundled modules (Nathan's 'cliques') with defined, focused  
functionality, code, and tests, which add a bit more 'sugar' to the  
base functionality of the core.  If you only care about parsing BLAST  
reports, get SearchIO, which requires core and optionally other  
modules (XML::SAX).  If you want additional DB functionality apart  
from the very basic ones in core, install DB (with its additional  
requirements, including core, DBI, and so on).  Same with Graphics,  
Tools, Tree/Phylo, etc.  We just need to define and limit the number  
of splits.

- Easier to add additional bundled modules.  For instance, I could  
focus all of my RNA work into a discrete set of modules (say, bioperl- 
rna) which I maintain, I ensure works with the latest core code, I  
ensure also plays well with the other children =) , and I distribute  
via CPAN.  Same with EUtilities, which could go into a separated DB- 
related set or stay in core.

- If we want a full-fledged 'install everything', the CPAN Bundle  
system is available.  I think it's easier to use a Bundle for 4-5,  
even 10 groups of modules as opposed to over 900.

- A Bundle or a build file where discrete distributions are listed  
(Bio::SearchIO, etc) wouldn't need to be updated every time a new  
module is added to a distribution.  I suppose this could be  
automated, but why have the additional headache?

- A chance to cut out some cruft.  We all know that particular areas  
need work or a complete overhaul (Restriction, Structure, maybe a few  
others).  Smaller, concentrated sets of modules I believe would be  
easier to maintain, and those that don't get used will eventually fall  
out of favor and may be lost or replaced from the more maintained  
group of modules.  Survival of the fittest.

- We already have had practice; bioperl-db, bioperl-run, bioperl- 
network, and others.  Those that have been routinely maintained and  
enjoy wide use (db, run, network) have survived; others not so much  
(corba-related stuff, microarray, ext, etc., though the code is still  
available if someone else wants to take it up and revive it!).

Disadvantages of a defined split:

- The initial headache of identifying which groups go where,  
coordinating with those who rely on bioperl (GMOD, etc) on how this  
will be set up, so on...

- Separate groups of modules require testing together to ensure  
functionality is consistent and maintained (something I think you  
pointed out previously).

- I think there is an increased possibility of branching.

- Extra headaches for devs, who have to keep track of the various  
critical distributions and make sure they work well together.

- Maybe others, but it's getting late here.  Add more as needed; I'm  
sure there are a number more.


chris


From cjfields at uiuc.edu  Thu Jun 28 01:17:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 00:17:01 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
Message-ID: <671B8432-28DA-47DA-9E0C-66AF0E3D5973@uiuc.edu>

D'oh!  Just when I wanted to go to bed.  It's not fair, you're in  
California...

On Jun 27, 2007, at 10:51 PM, Jason Stajich wrote:

> Hey guys - I'm wading in a bit late as I haven't had time to keep up
> with whole discussion.
>
> So you are suggesting 800+ individual CPAN modules?  I don't think
> that is a good idea.  Why would you split up Bio::Seq::RichSeq and
> Bio::Seq into two separate packages for example? I think if you
> really want to move away from the monolithic install it has to be
> more logical by function - but I am not that optimistic that this is
> going to actually be easier for people.  Maybe I'm misunderstanding.

Okay, so maybe it wasn't just me.

> What are the arguments for separating things -- to make it so people
> aren't scared by the number of modules so they'll code?  It seems
> like some people just want it to be installed and run scripts - does
> having them install dozens of modules work?  Do we need to consider
> how much this would suck if someone can't use CPAN or
> Module::Build to automate dependency tracking and installation?  How
> does it work when modules are deprecated?

What I envision for core is maybe not just one distribution, but a  
cluster of distributions:

base - Bio::Seq; Bio::SeqIO; Bio::AlignIO, some Bio::DB, associated  
modules.  Bare bones, with as few dependencies as possible.
aux - Any Bio::SeqIO, Bio::AlignIO, Bio::DB etc. that requires  
additional modules.
search - Bio::Search and SearchIO
tools - Bio::Tools, Bio::Restriction, maybe DB modules, GFF-related  
stuff?
graphics - Bio::Graphics.  Maybe GMOD-related stuff here?

The last four would list bioperl-core as a dependency themselves  
along with any other modules necessary.  We could also have the core  
Build.PL ask the user if they want to install the other non-base  
distros, and maybe include bioperl-db, bioperl-network, and bioperl- 
run in the loop if requested.
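
To make the dependency idea concrete, here is a minimal sketch of what
the Build.PL for one of the non-base distributions might look like.  It
is only an illustration: the distribution name 'bioperl-search', the
version numbers, and the choice of Bio::Root::Root as the module that
stands in for the base distribution are all assumptions, not anything
that exists in CVS.  Note that Module::Build's 'requires' is keyed by
module name rather than by distribution name.

```perl
# Build.PL for a hypothetical 'bioperl-search' distribution (sketch only).
use strict;
use warnings;
use Module::Build;

my $build = Module::Build->new(
    module_name  => 'Bio::SearchIO',
    dist_name    => 'bioperl-search',   # hypothetical distribution name
    dist_version => '1.5.3',            # placeholder version
    license      => 'perl',

    # Hard dependency on the base distribution, expressed through one of
    # its modules.  The minimum version 0 is a placeholder meaning "any".
    requires => {
        'Bio::Root::Root' => 0,
    },

    # Optional extras: installation proceeds without them.
    recommends => {
        'XML::SAX' => 0,
    },
);

# Writes the ./Build script that the user then runs.
$build->create_build_script;
```

With something like this in place, a CPAN client that honours the
declared prerequisites would pull in the base distribution before
building the search distribution.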

All would be installed as a bundle similar to Bundle::BioPerl, but  
have regular CPAN point releases (1.x.x) independently from one  
another i.e. for bug fixes, with a yearly/biyearly timed full release  
(1.x) of the whole shebang.  Any point release for any 'core'  
distribution would have to be tested against the others prior to  
release.

This is basically following Steve's train of thought, though more  
elaborated:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/ 
focus=15315

> I'm not sure I have made up my mind on what I'd like to see, but at
> some point I think we need to get a clearer idea of what audience we
> are trying to serve best.  If we want it to be easy to install maybe we
> should invest time into making OSX double-click installers, RPMs, and
> the Windows stuff easily installable.  Or do we want to serve the
> developers who aren't using SVN, and so push out releases of
> modules ASAP?  I just am not clear on the motivation for some of the
> proposed changes.

I think regular CPAN releases with updated PPMs hosted via portal  
work fine for the most part, but it would be nice to host RPMs.   
Others (Allen Day, for instance) have donated time to generate RPMs  
but they seem to lag behind a bit more.

The original idea for svn arose from an unrelated thread with Mark  
Johnson discussing something (Glimmer maybe?) and took off from  
there.  I was actually pretty surprised it took on a life of its  
own.  As for the motivation to switch, I haven't specifically used it  
myself, but the large number of responses seem to indicate others  
have and seem happy with it.  Rutger Vos had also indicated he would  
move Bio::Phylo over to the repo if we used svn.  We def. should  
address the issues you bring up (why _WE_ need svn) more succinctly  
but that shouldn't be an issue.

> Also - the main point I wanted to make - Can I suggest we spend a
> little time discussing what it will take to get a stable release for
> the current code as it stands (bioperl-live and bioperl-run)?  It
> seems like we really need to do this first so that we have a stable
> release that can be followed by CVS -> SVN migration, then consider
> major changes to the repository structure and release packaging, and
> potential deprecation and incorporation of other modules.

Agreed.  We prob. need to schedule a good couple of days (or so) to  
squash bugs.

> I assume there is no chance that we'd have a 1.6 candidate by BOSC
> next month?

Um, not likely as nothing has been addressed Feature/Annotation-wise  
(overloads are still there, methods have not been deprecated, etc).   
There was an underlying assumption these would have an effect on GMOD- 
related stuff (I remember reading a post from Scott Cain in the mail  
archive mentioning something along these lines after the 1.5 release  
hubbub).

Maybe a quick 1.5.3 for BOSC, with a 1.6 for fall?

> Will it be productive to schedule a fair amount of time at BOSC
> discussing how to partition out the packages into separate sub-
> packages after we've done a successful release rather than trying to
> change things right now? I realize not everyone will be there but
> maybe it will be easier to interact on this then.

How many are going to be there?  I can't go this year except on my  
own dime (which I don't have many of, student loans and all, sorry),  
though I'll likely be in a new lab by spring which is likely more  
amenable to funding.  If there is a hackathon in the late fall (post- 
sept) I'll make it a point to go regardless.

> I think it will also be time to talk with Lincoln/Scott about how
> Gbrowse is structured and if that is working for them.  There is too
> much code in different places that I think we need to figure out how
> to structure it properly so those packages can be released.  It would
> probably mean moving Bio::Graphics, Bio::DB::GFF and
> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages
> so they could be released more regularly on par with Gbrowse
> schedules.   Also I think someone needs to figure out Bio::Tools::GFF
> vs Bio::FeatureIO -- what do we want to do?  I don't think we really
> fully support GFF3 that well -- the X2GFF scripts probably need some
> more good testing (where X is BLAST, FASTA, Sim4, GenBank, EMBL,
> etc.) and/or migration to the proper GFF writing.
>
>
> -jason

Will Lincoln or Scott be at BOSC?

chris


From dmessina at wustl.edu  Thu Jun 28 01:21:58 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 00:21:58 -0500
Subject: [Bioperl-l] finding statistics on AA
In-Reply-To: <4681F4B4.8010609@pacific.net.sg>
References: <4681F4B4.8010609@pacific.net.sg>
Message-ID: <F57E70E8-BBDA-45CF-B2C7-E05AED04F4C6@wustl.edu>

Hi Melvin,

I don't think BioPerl has any information content-related code. I'm  
not terribly familiar with it myself, but the usual recommendation is  
to look at the EMBOSS package:

	http://en.wikipedia.org/wiki/EMBOSS

Dave


From bix at sendu.me.uk  Thu Jun 28 02:38:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 07:38:48 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
Message-ID: <46835778.5070901@sendu.me.uk>

Jason Stajich wrote:
> So you are suggesting 800+ individual CPAN modules?
> I don't think that is a good idea.  Why would you split up
> Bio::Seq::RichSeq and Bio::Seq into two separate packages for
> example? I think if you really want to move away from the monolithic
> install it has to be more logical by function - but I am not that
> optimistic that this is going to actually be easier for people.
> Maybe I'm misunderstanding.
> 
> What are the arguments for separating things -- to make it so people
>  aren't scared by the number of modules so they'll code?  It seems
> like some people just want it to be installed and run scripts - does
> having them install dozens of modules work?  Do we need to consider
> how much this would suck if someone can't use CPAN or
> Module::Build to automate dependency tracking and installation?  How
> does it work when modules are deprecated?

See my upcoming reply to Chris. Briefly, if the only change is to the
dist action of Build.PL, we can make a single archive of all modules
available to non-CPAN users, and individual modules available to CPAN
users. No problems.


> Also - the main point I wanted to make - Can I suggest we spend a
> little time discussing what it will take to get a stable release for
> the current code as it stands (bioperl-live and bioperl-run)?  It
> seems like we really need to do this first so that we have a stable
> release that can be followed by CVS -> SVN migration, then consider
> major changes to the repository structure and release packaging, and
> potential deprecation and incorporation of other modules.

I'd recommend that a 'stable' release shouldn't happen until we resolve
all the missing tests and bugzilla bugs (because I think the opportunity
should be taken to have it stable both in terms of interface /and/
bugs). Which is a lot of work.


> I assume there is no chance that we'd have a 1.6 candidate by BOSC
> next month?

None.


From bix at sendu.me.uk  Thu Jun 28 03:25:03 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 08:25:03 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
Message-ID: <4683624F.6020402@sendu.me.uk>

Chris Fields wrote:
> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:
>> What advantage is there of these defined splits instead of  
>> individual modules? As I see it you lose some of the potential  
>> benefits of breaking Bioperl up completely, whilst also suffering  
>> the maintenance problems I outlined in my objection to Steve's post.
>>
>> Being able to work on all Bioperl from a single cvs (ne svn) check  
>> out/ archive, whilst distributing it as individual modules on CPAN  
>> seems like the best of both worlds to me. What am I missing?
> 
> Okay, forewarned, but here's my long-winded reasoning.  The short and  
> sweet version: I (very) respectfully don't agree with you, at least  
> re: the idea we should commit all modules to CPAN independently. It  
> doesn't make any sense to me, but maybe you can elaborate more?   
> Maybe I'm misinterpreting what you mean?

The short and sweet version: my proposal has all the benefits of yours, 
but none of the disadvantages. What's not to like?


> Finally, all of this should wait until later.  Much later, like after  
> a decent release, after svn, etc kind of 'later'.  I think we can  
> agree on that.

Hmm, not really. If it can be implemented by a change in just Build.PL 
and ModuleBuildBioperl, it's really independent of everything else. 
That's the beauty of it: the only thing that changes is how things are 
uploaded to and downloaded from CPAN. The only person that normally 
deals with that issue is the pumpkin for a release, and he only cares 
about it at release time.

In fact, if we're going to do it at all it makes sense to try it out on 
a minor release like 1.5.3. We've already got experience of doing it 
split-style from 1.5.2. (And let me tell you: splits at the code-base 
level suck.)


> Individual CPAN modules:
> 
> CPAN is not our personal versioning system; it may be if a  
> distribution consists of only a few modules, but not when it's one of  
> the largest distros present.  If someone wants to update an  
> individual bioperl module for a quick bug fix they are more than  
> welcome to download it via cvs, svn, or even using a web browser, and  
> replace the one they have.

And where is the harm in letting them do it via CPAN as well? In fact, 
there are significant benefits:


> I'm trying to reason how one could break up the individual SeqIO/ 
> SearchIO/otherIO modules into single module distributions.  They are  
> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO,  
> which relies on the various interfaces, RootIO, and on down).  How  
> would tests be run off CPAN when the modules are distributed  
> independently?

Bio::SeqIO::genbank would have a dependency on the latest version of 
Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies.

So when a user wants to get the latest version of Bio::SeqIO::genbank, 
they no longer have to worry about what other modules in its dependency 
hierarchy they should also install.

Instead they just request Bio::SeqIO::genbank which itself ensures you 
have the latest version of all its dependencies before installing itself 
and running its tests.

When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank 
users should have, he could just call './Build dist Bio::SeqIO::genbank' 
which would generate a new package for Bio::SeqIO::genbank suitable for 
uploading to CPAN. No more long release cycles and having to constantly 
tell people to 'use CVS' to get working Bioperl code.
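
As a rough illustration of the dependency-first behaviour described
above, the following sketch computes an install order for one module.
The dependency map is hypothetical and hand-written for the example; it
is not the real BioPerl dependency graph, and a real CPAN client does
this work itself from each distribution's declared prerequisites.

```perl
use strict;
use warnings;

# Hypothetical dependency map (illustrative only).
my %requires = (
    'Bio::SeqIO::genbank' => ['Bio::SeqIO'],
    'Bio::SeqIO'          => ['Bio::Root::IO', 'Bio::Seq'],
    'Bio::Seq'            => ['Bio::Root::Root'],
    'Bio::Root::IO'       => ['Bio::Root::Root'],
    'Bio::Root::Root'     => [],
);

# Return an array ref of modules needed for $module, dependencies first.
# %$seen ensures each module is listed once and guards against cycles.
sub install_order {
    my ($module, $seen, $order) = @_;
    $seen  ||= {};
    $order ||= [];
    return $order if $seen->{$module}++;
    install_order($_, $seen, $order) for @{ $requires{$module} || [] };
    push @$order, $module;
    return $order;
}

my @order = @{ install_order('Bio::SeqIO::genbank') };
print join("\n", @order), "\n";
```

The requested module always comes last, after everything it needs, so
its tests run against an up-to-date set of dependencies.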


> Would they also be individually distributed?  What  
> would you use to tie all the individual modules together?  How would  
> you explain to the CPAN maintainers that you want to split bioperl  
> into 990 individual modules, all updated independently, but intend on  
> bundling them afterwards anyway?

They would be tied together by a CPAN bundle. You don't have to 
'explain' anything to the CPAN maintainers because you're not doing 
anything wrong. In fact, you're using it the way you're supposed to.


> Splitting up core:
> 
> Here are the advantages of a defined split, as Steve and I see them
> (off the top of my head).  Some of this probably reiterates  
> my previous points, as well as Steve's, so apologies in advance.

Below I answer with how it would be with my single-module approach 
compared to the defined splits.


> - A lean, mean, focused set of bioperl base modules (core) w/o or  
> with very few external deps, minimal installation issues, etc.  The  
> very basic stuff to get up and running.

Even leaner, even more focused.


> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused  
> functionality, code, and tests, which add a bit more 'sugar' to the  
> base functionality of the core.  If you only care about parsing BLAST  
> reports, get SearchIO, which requires core and optionally other  
> modules (XML::SAX).  If you want additional DB functionality apart  
> from the very basic ones in core, install DB (with its additional  
> requirements, including core, DBI, and so on).  Same with Graphics,  
> Tools, Tree/Phylo, etc.  We just need to define and limit the number  
> of splits.

The same can be achieved with CPAN bundles for each kind of functional 
grouping you can think of. And since it's just a single text file that 
defines such a grouping, it's easy to change or add new ones as you feel 
like it, as opposed to the rather more permanent and substantial effort 
of creating one of your splits on the code-base level.

Also, the world doesn't have to rely on /our/ ideas of what a useful 
functional split is. If someone just wants to parse Blast results, they 
can just use CPAN to install Bio::SearchIO::blast_pull instead of having 
to install all of SearchIO.
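
For anyone who hasn't seen one, a CPAN bundle is just a module in the
Bundle:: namespace whose POD has a CONTENTS section listing module
names, one per line, each optionally followed by a minimum version and
a " - " comment.  A minimal sketch follows; the bundle name and its
contents are made up for illustration and are not an existing bundle.

```perl
package Bundle::BioPerl::BlastParsing;   # hypothetical bundle name

use strict;
use warnings;

our $VERSION = '0.01';

1;

__END__

=head1 NAME

Bundle::BioPerl::BlastParsing - hypothetical bundle for parsing BLAST reports

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerl::BlastParsing'

=head1 CONTENTS

Bio::SearchIO - the generic search report parser

Bio::SearchIO::blast_pull - lazy BLAST report parser

=cut
```

Changing the grouping means editing the CONTENTS list and uploading a
new version of this one file.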


> - Easier to add additional bundled modules.  For instance, I could  
> focus all of my RNA work into a discrete set of modules (say, bioperl- 
> rna) which I maintain, I ensure works with the latest core code, I  
> ensure also plays well with the other children =) , and I distribute  
> via CPAN.  Same with EUtilities, which could go into a separated DB- 
> related set or stay in core.

And if you lose interest in them? They eventually die because they no 
longer have someone looking after them by default (the pumpkin and other 
devs). Alternatively you could just make a CPAN bundle. One text file! 
Easy! No duplication of modules in CPAN, no new hassle for you or the 
Bioperl 'core' pumpkin to ensure that the latest version of each work 
with each other and other splits.


> - If we want a full-fledged 'install everything', the CPAN Bundle  
> system is available.  I think it's easier to use a Bundle for 4-5,  
> even 10 groups of modules as opposed to over 900.

No, it isn't any easier. It's /equally/ easy to install a bundle of 900 
packages of 900 modules as it is to install 5 packages of 900 modules.

When not installing absolutely everything, but perhaps 'most' things, 
there's the additional benefit that it would be easier to skip a 
particular Bio::module because you didn't want to install its external 
dependencies and weren't that interested in it anyway.


> - A Bundle or a build file where discrete distributions are listed  
> (Bio::SearchIO, etc) wouldn't need to be updated every time a new  
> module is added to a distribution.  I suppose this could be  
> automated, but why have the additional headache?

Yes, it would be automated, and no, it wouldn't at all be any kind of 
additional headache. I'm proposing a fully-automated system that the 
pumpkin wouldn't even have to think about. Much /less/ of a headache 
than dealing with splits. Orders of magnitude easier to deal with.


> - A chance to cut out some cruft.  We all know that particular areas  
> need work or a complete overhaul (Restriction, Structure, maybe a few  
> others).  Smaller, concentrated sets of modules I believe would be  
> easier to maintain, and those that don't get used will eventually fall  
> out of favor and may be lost or replaced from the more maintained  
> group of modules.  Survival of the fittest.

And the smallest, most concentrated set of modules is the individual module.


> - We already have had practice; bioperl-db, bioperl-run, bioperl- 
> network, and others.  Those that have been routinely maintained and  
> enjoy wide use (db, run, network) have survived; others not so much  
> (corba-related stuff, microarray, ext, etc., though the code is still  
> available if someone else wants to take it up and revive it!).

The reason some of these existing splits (microarray, ext) have fallen by 
the way-side? /Because/ they're splits. If they had been part of 
bioperl-live all along, they'd have been kept in a working, compatible 
state and would have been released along with everything else in 1.5.2.


> Disadvantages of a defined split:
> 
> - The initial headache of identifying which groups go where,  
> coordinating with those who rely on bioperl (GMOD, etc) on how this  
> will be set up, so on...

No need to worry about this with individual modules.


> - Separate groups of modules require testing together to ensure  
> functionality is consistent and maintained (something I think you  
> pointed out previously).

No need to worry.


> - I think there is an increased possibility of branching.
> 
> - Extra headaches for devs, who have to keep track of the various  
> critical distributions and make sure they work well together.

No headaches.


From charles-listes+bioperl at plessy.org  Thu Jun 28 03:40:04 2007
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Thu, 28 Jun 2007 16:40:04 +0900
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
Message-ID: <20070628074004.GD6338@kunpuu.plessy.org>

Dear developers,

I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if
it would make sense to call it "bioperl-live" and distribute it in
parallel with the stable 1.4.0 version, if bioperl-live means "the
current developer version".

If I am wrong, can somebody explain to me what bioperl-live exactly refers
to?

Have a nice day,

-- 
Charles Plessy
Debian-med packaging team
Wako, Saitama, Japan


From n.haigh at sheffield.ac.uk  Thu Jun 28 04:23:10 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 09:23:10 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683624F.6020402@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
Message-ID: <46836FEE.5030203@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Chris Fields wrote:
>> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:
>>> What advantage is there of these defined splits instead of 
>>> individual modules? As I see it you lose some of the potential 
>>> benefits of breaking Bioperl up completely, whilst also suffering 
>>> the maintenance problems I outlined in my objection to Steve's post.
>>>
>>> Being able to work on all Bioperl from a single cvs (ne svn) check 
>>> out/ archive, whilst distributing it as individual modules on CPAN 
>>> seems like the best of both worlds to me. What am I missing?
>>
>> Okay, forewarned, but here's my long-winded reasoning.  The short and 
>> sweet version: I (very) respectfully don't agree with you, at least 
>> re: the idea we should commit all modules to CPAN independently. It 
>> doesn't make any sense to me, but maybe you can elaborate more?  
>> Maybe I'm misinterpreting what you mean?
> 
> The short and sweet version: my proposal has all the benefits of yours,
> but none of the disadvantages. What's not to like?
> 
> 
>> Finally, all of this should wait until later.  Much later, like after 
>> a decent release, after svn, etc kind of 'later'.  I think we can 
>> agree on that.
> 
> Hmm, not really. If it can be implemented by a change in just Build.PL
> and ModuleBuildBioperl, it's really independent of everything else.
> That's the beauty of it: the only thing that changes is how things are
> uploaded to and downloaded from CPAN. The only person that normally
> deals with that issue is the pumpkin for a release, and he only cares
> about it at release time.
> 
> In fact, if we're going to do it at all it makes sense to try it out on
> a minor release like 1.5.3. We've already got experience of doing it
> split-style from 1.5.2. (And let me tell you: splits at the code-base
> level suck.)
> 
> 
>> Individual CPAN modules:
>>
>> CPAN is not our personal versioning system; it may be if a 
>> distribution consists of only a few modules, but not when it's one of 
>> the largest distros present.  If someone wants to update an 
>> individual bioperl module for a quick bug fix they are more than 
>> welcome to download it via cvs, svn, or even using a web browser, and 
>> replace the one they have.
> 
> And where is the harm in letting them do it via CPAN as well? In fact,
> there are significant benefits:
> 
> 
>> I'm trying to reason how one could break up the individual SeqIO/
>> SearchIO/otherIO modules into single module distributions.  They are 
>> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, 
>> which relies on the various interfaces, RootIO, and on down).  How 
>> would tests be run off CPAN when the modules are distributed 
>> independently?
> 
> Bio::SeqIO::genbank would have a dependency on the latest version of
> Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies.
> 
> So when a user wants to get the latest version of Bio::SeqIO::genbank,
> they no longer have to worry about what other modules in its dependency
> hierarchy they should also install.
> 
> Instead they just request Bio::SeqIO::genbank which itself ensures you
> have the latest version of all its dependencies before installing itself
> and running its tests.

This was my thinking when I first brought this up at the
beginning/splitting of this thread. Thinking of modules as the
constituent parts of a larger package should make it far easier for
people to define dependencies, and means users only need to install
the parts they require. As Sendu points out, if the user wants to
convert seqs from genbank to fasta they could simply install
Bio::SeqIO::genbank and Bio::SeqIO::fasta and they would get all the
other modules that are the dependencies of Bio::SeqIO::genbank and
Bio::SeqIO::fasta.
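
To make the dependency idea concrete, here is a minimal sketch of what a
per-module Build.PL might look like. It uses plain Module::Build rather
than ModuleBuildBioperl, and the distribution layout and version numbers
are assumptions for illustration only, not anything that exists today:

```perl
# Build.PL for a hypothetical stand-alone Bio::SeqIO::genbank distribution
use strict;
use warnings;
use Module::Build;

my $build = Module::Build->new(
    module_name  => 'Bio::SeqIO::genbank',
    dist_version => '1.5.3',          # hypothetical per-module version
    license      => 'perl',
    requires     => {
        'Bio::SeqIO' => '1.5.3',      # hypothetical minimum version
    },
    build_requires => {
        'Test::More' => 0,            # needed only to run the tests
    },
);

# Writes the ./Build script that the installer then drives
$build->create_build_script;
```

With something like this, a CPAN client would be expected to pull in
Bio::SeqIO (and, in turn, whatever it requires) before building and
testing the genbank module.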

> 
> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank
> users should have, he could just call './Build dist Bio::SeqIO::genbank'
> which would generate a new package for Bio::SeqIO::genbank suitable for
> uploading to CPAN. No more long release cycles and having to constantly
> tell people to 'use CVS' to get working Bioperl code.

However, how would the test suite work out with this? E.g. when someone
installs Bio::SeqIO::genbank they want the tests associated with
Bio::SeqIO::genbank to be run. Would there be tests that would be run
redundantly if, for example, someone installed Bio::SeqIO::genbank and
Bio::SeqIO::fasta?

> 
> 
>> Would they also be individually distributed?  What  would you use to
>> tie all the individual modules together?  How would  you explain to
>> the CPAN maintainers that you want to split bioperl  into 990
>> individual modules, all updated independently, but intend on  bundling
>> them afterwards anyway?
> 
> They would be tied together by a CPAN bundle. You don't have to
> 'explain' anything to the CPAN maintainers because you're not doing
> anything wrong. In fact, you're using it the way you're supposed to.

Yep. Real modules are released as modules, each with their own set of
dependencies. Then use CPAN bundles the way they were supposed to be
used: for distributing a set of CPAN modules that make a coherent set
of functionality. You "could" also bundle in other authors' modules,
e.g. Bio::ASN1::EntrezGene?

> 
> 
>> Splitting up core:
>>
>> As I see it, here are the advantages of a defined split as Steve and 
>> I see it (off the top of my head).  Some of this probably reiterates 
>> my previous points, as well as Steve's, so apologies in advance.
> 
> Below I answer with how it would be with my single-module approach
> compared to the defined splits.
> 
> 
>> - A lean, mean, focused set of bioperl base modules (core) w/o or 
>> with very few external deps, minimal installation issues, etc.  The 
>> very basic stuff to get up and running.
> 
> Even leaner, even more focused.
> 
> 
>> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused 
>> functionality, code, and tests, which add a bit more 'sugar' to the 
>> base functionality of the core.  If you only care about parsing BLAST 
>> reports, get SearchIO, which requires core and optionally other 
>> modules (XML::SAX).  If you want additional DB functionality apart 
>> from the very basic ones in core, install DB (with it's additional 
>> requirements, including core, DBI, and so on).  Same with Graphics, 
>> Tools, Tree/Phylo, etc.  We just need to define and limit the number 
>> of splits.
> 
> The same can be achieved with CPAN bundles for each kind of functional
> grouping you can think of. And since its just a single text file that
> defines such a grouping, its easy to change or add new ones as you feel
> like it, as opposed to the rather more permanent and substantial effort
> of creating one of your splits on the code-base level.
> 
> Also, the world doesn't have to rely on /our/ ideas of what a useful
> functional split is. If someone just wants to parse Blast results, they
> can just use CPAN to install Bio::SearchIO::blast_pull instead of having
> to install all of SearchIO.
> 
> 
>> - Easier to add additional bundled modules.  For instance, I could 
>> focus all of my RNA work into a discrete set of modules (say, bioperl-
>> rna) which I maintain, I ensure works with the latest core code, I 
>> ensure also plays well with the other children =) , and I distribute 
>> via CPAN.  Same with EUtilities, which could go into a separated DB-
>> related set or stay in core.
> 
> And if you lose interest in them? They eventually die because they no
> longer have someone looking after them by default (the pumpkin and other
> devs). Alternatively you could just make a CPAN bundle. One text file!
> Easy! No duplication of modules in CPAN, no new hassle for you or the
> Bioperl 'core' pumpkin to ensure that the latest version of each work
> with each other and other splits.

Hmm, how would module versions be handled? Wouldn't this approach
require each module to have its own independent version number, which
could then be used for building the dependencies? Each new release of
that module would only bump that module's version number.

Bundles can specify the minimum version of a module to be installed,
such that bug fixes to individual modules can be released into CPAN and
would automatically get picked up when installing bundles etc.
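
For reference, a bundle is just a module whose POD has a CONTENTS
section listing module names, each optionally followed by a version
string and a comment. A sketch, where the bundle name and the version
numbers are made up for illustration:

```perl
package Bundle::BioPerl::SeqIOBasics;   # hypothetical bundle name

our $VERSION = '0.01';

1;

__END__

=head1 NAME

Bundle::BioPerl::SeqIOBasics - hypothetical bundle for genbank/fasta conversion

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerl::SeqIOBasics'

=head1 CONTENTS

Bio::SeqIO 1.005 - hypothetical minimum version

Bio::SeqIO::genbank 1.002

Bio::SeqIO::fasta

=cut
```

Changing what the bundle covers would then be a matter of editing the
CONTENTS list and uploading a new version of this one file.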

I'm not quite sure how the current stable/dev releases would work. I
assume bug fixes would have to be made on a branch, e.g. branch 1.6, and
released to CPAN from there. Then when the next stable release is made,
all module versions would be bumped and released to CPAN, with any
modifications to the content of the bundle made at the same time. Is it
possible to have stable and developer release bundles that are able to
specify the minimum stable and developer module versions respectively?


> 
> 
>> - If we want a full-fledged 'install everything', the CPAN Bundle 
>> system is available.  I think it's easier to use a Bundle for 4-5, 
>> even 10 groups of modules as opposed to over 900.
> 
> No, it isn't any easier. Its /equally/ easy to install a bundle of 900
> packages of 900 modules as it is to install 5 packages of 900 modules.
> 
> When not installing absolutely everything, but perhaps 'most' things,
> there's the additional benefit that it would be easier to skip a
> particular Bio::module because you didn't want to install its external
> dependencies and weren't that interested in it anyway.
> 
> 
>> - A Bundle or a build file where discrete distributions are listed 
>> (Bio::SearchIO, etc) wouldn't need to be updated every time a new 
>> module is added to a distribution.  I suppose this could be 
>> automated, but why have the additional headache?
> 
> Yes, it would be automated, and no, it wouldn't at all be any kind of
> additional headache. I'm proposing a fully-automated system that the
> pumpkin wouldn't even have to think about it. Much /less/ of a headache
> than dealing with splits. Orders of magnitude easier to deal with.
> 
> 
>> - A chance to cut out some cruft.  We all know that particular areas 
>> need work or a complete overhaul (Restriction, Structure, maybe a few 
>> others).  Smaller, concentrated sets of modules I believe would be 
>> easier to maintain, and those that don't get use will eventually fall 
>> out of favor and may be lost or replaced from the more maintained 
>> group of modules.  Survival of the fittest.
> 
> And the smallest, most concentrated set of modules is the individual
> module.
> 
> 
>> - We already have had practice; bioperl-db, bioperl-run, bioperl-
>> network, and others.  Those that have been routinely maintained and 
>> enjoy wide use (db, run, network) have survived; others not so much 
>> (corba-related stuff, microarray, ext, etc., though the code is still 
>> available if someone else wants to take it up and revive it!).
> 
> The reason some of these existing splits (micoarray, ext) have fallen by
> the way-side? /Because/ they're splits. If they had been part of
> bioperl-live all along, they'd have been kept in a working, compatible
> state and would have been released along with everything else in 1.5.2
> 
> 
>> Disadvantages of a defined split:
>>
>> - The initial headache of identifying which groups go where, 
>> coordinating with those who rely on bioperl (GMOD, etc) on how this 
>> will be set up, so on...
> 
> No need to worry about this with individual modules.
> 
> 
>> - Separate groups of modules require testing together to ensure 
>> functionality is consistent and maintained (something I think you 
>> pointed out previously).
> 
> No need to worry.

Maybe need to worry about how the tests are run when installing individual
modules etc?

> 
> 
>> - I think an increased possibility of branching is possible.
>>
>> - Extra headaches for devs, who have to keep track of the various 
>> critical distributions and make sure they work well together.
> 
> No headaches.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg2/uczuW2jkwy2gRAlR4AJ44kHIXWWapNVGOIrkFBJdP9rn3vwCdErhT
VkymyXNshguE44/RilEXWDA=
=O5ex
-----END PGP SIGNATURE-----


From n.haigh at sheffield.ac.uk  Thu Jun 28 04:27:54 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 09:27:54 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683624F.6020402@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
Message-ID: <4683710A.9010808@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Chris Fields wrote:
>> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:
>>> What advantage is there of these defined splits instead of 
>>> individual modules? As I see it you lose some of the potential 
>>> benefits of breaking Bioperl up completely, whilst also suffering 
>>> the maintenance problems I outlined in my objection to Steve's post.
>>>
>>> Being able to work on all Bioperl from a single cvs (ne svn) check 
>>> out/ archive, whilst distributing it as individual modules on CPAN 
>>> seems like the best of both worlds to me. What am I missing?
>>
>> Okay, forewarned, but here's my long-winded reasoning.  The short and 
>> sweet version: I (very) respectfully don't agree with you, at least 
>> re: the idea we should commit all modules to CPAN independently. It 
>> doesn't make any sense to me, but maybe you can elaborate more?  
>> Maybe I'm misinterpreting what you mean?
> 
> The short and sweet version: my proposal has all the benefits of yours,
> but none of the disadvantages. What's not to like?
> 
> 
>> Finally, all of this should wait until later.  Much later, like after 
>> a decent release, after svn, etc kind of 'later'.  I think we can 
>> agree on that.
> 
> Hmm, not really. If it can be implemented by a change in just Build.PL
> and ModuleBuildBioperl, its really independent of everything else.
> That's the beauty of it: the only thing that changes is how things are
> uploaded to and downloaded from CPAN. The only person that normally
> deals with that issue is the pumpkin for a release, and he only cares
> about it at release time.
> 
> In fact, if we're going to do it at all it makes sense to try it out on
> a minor release like 1.5.3. We've already got experience of doing it
> split-style from 1.5.2. (And let me tell you: splits at the code-base
> level suck.)
> 
> 
>> Individual CPAN modules:
>>
>> CPAN is not our personal versioning system; it may be if a 
>> distribution consists of only a few modules, but not when it's one of 
>> the largest distros present.  If someone wants to update an 
>> individual bioperl module for a quick bug fix they are more than 
>> welcome to download it via cvs, svn, or even using a web browser, and 
>> replace the one they have.
> 
> And where is the harm in letting them do it via CPAN as well? In fact,
> there are significant benefits:
> 
> 
>> I'm trying to reason how one could break up the individual SeqIO/
>> SearchIO/otherIO modules into single module distributions.  They are 
>> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, 
>> which relies on the various interfaces, RootIO, and on down).  How 
>> would tests be run off CPAN when the modules are distributed 
>> independently?
> 
> Bio::SeqIO::genbank would have a dependency on the latest version of
> Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies.
> 
> So when a user wants to get the latest version of Bio::SeqIO::genbank,
> they no longer have to worry about what other modules in its dependency
> hierarchy they should also install.
> 
> Instead they just request Bio::SeqIO::genbank which itself ensures you
> have the latest version of all its dependencies before installing itself
> and running its tests.
> 
> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank
> users should have, he could just call './Build dist Bio::SeqIO::genbank'
> which would generate a new package for Bio::SeqIO::genbank suitable for
> uploading to CPAN. No more long release cycles and having to constantly
> tell people to 'use CVS' to get working Bioperl code.
> 
> 
>> Would they also be individually distributed?  What  would you use to
>> tie all the individual modules together?  How would  you explain to
>> the CPAN maintainers that you want to split bioperl  into 990
>> individual modules, all updated independently, but intend on  bundling
>> them afterwards anyway?
> 
> They would be tied together by a CPAN bundle. You don't have to
> 'explain' anything to the CPAN maintainers because you're not doing
> anything wrong. In fact, you're using it the way you're supposed to.
> 


The successor to Bundles - may prove interesting:
http://search.cpan.org/~adamk/Task-1.01/lib/Task.pm


> 
>> Splitting up core:
>>
>> As I see it, here are the advantages of a defined split as Steve and 
>> I see it (off the top of my head).  Some of this probably reiterates 
>> my previous points, as well as Steve's, so apologies in advance.
> 
> Below I answer with how it would be with my single-module approach
> compared to the defined splits.
> 
> 
>> - A lean, mean, focused set of bioperl base modules (core) w/o or 
>> with very few external deps, minimal installation issues, etc.  The 
>> very basic stuff to get up and running.
> 
> Even leaner, even more focused.
> 
> 
>> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused 
>> functionality, code, and tests, which add a bit more 'sugar' to the 
>> base functionality of the core.  If you only care about parsing BLAST 
>> reports, get SearchIO, which requires core and optionally other 
>> modules (XML::SAX).  If you want additional DB functionality apart 
>> from the very basic ones in core, install DB (with it's additional 
>> requirements, including core, DBI, and so on).  Same with Graphics, 
>> Tools, Tree/Phylo, etc.  We just need to define and limit the number 
>> of splits.
> 
> The same can be achieved with CPAN bundles for each kind of functional
> grouping you can think of. And since its just a single text file that
> defines such a grouping, its easy to change or add new ones as you feel
> like it, as opposed to the rather more permanent and substantial effort
> of creating one of your splits on the code-base level.
> 
> Also, the world doesn't have to rely on /our/ ideas of what a useful
> functional split is. If someone just wants to parse Blast results, they
> can just use CPAN to install Bio::SearchIO::blast_pull instead of having
> to install all of SearchIO.
> 
> 
>> - Easier to add additional bundled modules.  For instance, I could 
>> focus all of my RNA work into a discrete set of modules (say, bioperl-
>> rna) which I maintain, I ensure works with the latest core code, I 
>> ensure also plays well with the other children =) , and I distribute 
>> via CPAN.  Same with EUtilities, which could go into a separated DB-
>> related set or stay in core.
> 
> And if you lose interest in them? They eventually die because they no
> longer have someone looking after them by default (the pumpkin and other
> devs). Alternatively you could just make a CPAN bundle. One text file!
> Easy! No duplication of modules in CPAN, no new hassle for you or the
> Bioperl 'core' pumpkin to ensure that the latest version of each work
> with each other and other splits.
> 
> 
>> - If we want a full-fledged 'install everything', the CPAN Bundle 
>> system is available.  I think it's easier to use a Bundle for 4-5, 
>> even 10 groups of modules as opposed to over 900.
> 
> No, it isn't any easier. Its /equally/ easy to install a bundle of 900
> packages of 900 modules as it is to install 5 packages of 900 modules.
> 
> When not installing absolutely everything, but perhaps 'most' things,
> there's the additional benefit that it would be easier to skip a
> particular Bio::module because you didn't want to install its external
> dependencies and weren't that interested in it anyway.
> 
> 
>> - A Bundle or a build file where discrete distributions are listed 
>> (Bio::SearchIO, etc) wouldn't need to be updated every time a new 
>> module is added to a distribution.  I suppose this could be 
>> automated, but why have the additional headache?
> 
> Yes, it would be automated, and no, it wouldn't at all be any kind of
> additional headache. I'm proposing a fully-automated system that the
> pumpkin wouldn't even have to think about it. Much /less/ of a headache
> than dealing with splits. Orders of magnitude easier to deal with.
> 
> 
>> - A chance to cut out some cruft.  We all know that particular areas 
>> need work or a complete overhaul (Restriction, Structure, maybe a few 
>> others).  Smaller, concentrated sets of modules I believe would be 
>> easier to maintain, and those that don't get use will eventually fall 
>> out of favor and may be lost or replaced from the more maintained 
>> group of modules.  Survival of the fittest.
> 
> And the smallest, most concentrated set of modules is the individual
> module.
> 
> 
>> - We already have had practice; bioperl-db, bioperl-run, bioperl-
>> network, and others.  Those that have been routinely maintained and 
>> enjoy wide use (db, run, network) have survived; others not so much 
>> (corba-related stuff, microarray, ext, etc., though the code is still 
>> available if someone else wants to take it up and revive it!).
> 
> The reason some of these existing splits (micoarray, ext) have fallen by
> the way-side? /Because/ they're splits. If they had been part of
> bioperl-live all along, they'd have been kept in a working, compatible
> state and would have been released along with everything else in 1.5.2
> 
> 
>> Disadvantages of a defined split:
>>
>> - The initial headache of identifying which groups go where, 
>> coordinating with those who rely on bioperl (GMOD, etc) on how this 
>> will be set up, so on...
> 
> No need to worry about this with individual modules.
> 
> 
>> - Separate groups of modules require testing together to ensure 
>> functionality is consistent and maintained (something I think you 
>> pointed out previously).
> 
> No need to worry.
> 
> 
>> - I think an increased possibility of branching is possible.
>>
>> - Extra headaches for devs, who have to keep track of the various 
>> critical distributions and make sure they work well together.
> 
> No headaches.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg3EKczuW2jkwy2gRAriiAJ47Qz9jTshEXuaG0XMYrUTI0hHqAwCeL45r
r/BykCKbM9lqJM0khARuEms=
=NB4B
-----END PGP SIGNATURE-----


From n.haigh at sheffield.ac.uk  Thu Jun 28 04:51:19 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 09:51:19 +0100
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org>
References: <20070628074004.GD6338@kunpuu.plessy.org>
Message-ID: <46837687.7010101@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Charles Plessy wrote:
> Dear developers,
> 
> I am considering to bring bioperl 1.5.2 in Debian, and I am wondering if
> it would make sense to call it "bioperl-live" and distribute it in
> parallel with the stable 1.4.0 version, if bioperl-live means "the
> current developer version".
> 
> If I am wrong, can somebody explain me what bioperl-live exactly refers
> to ?
> 
> Have a nice day,
> 

bioperl-live really means the HEAD of the cvs repository so is the most
bleeding-edge code available.

Version 1.5.* is the developer release, while 1.4.* is the stable
release. However, there have been few updates to the 1.4.* release, which
means that it is more unstable than the 1.5.* dev release. I think the
consensus was to have more rapid release cycles of the stable branch in
future in order to avoid this. I'm sure there are others more qualified
to expand on/correct this if needs be.

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg3aHczuW2jkwy2gRAo5pAJ95BGqrA5bLwRKNfUQi/HfBnkUJjwCg0mYB
/fHFyYkqAvcmOSxu4djPll0=
=KwVH
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Thu Jun 28 05:11:39 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 10:11:39 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <46836FEE.5030203@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk> <46836FEE.5030203@sheffield.ac.uk>
Message-ID: <46837B4B.7060705@sendu.me.uk>

Nathan S. Haigh wrote:
(Please try and snip more: don't quote whole posts just to reply to 
certain paragraphs)

> Sendu Bala wrote:
>> Chris Fields wrote:
>> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank
>> users should have, he could just call './Build dist Bio::SeqIO::genbank'
>> which would generate a new package for Bio::SeqIO::genbank suitable for
>> uploading to CPAN. No more long release cycles and having to constantly
>> tell people to 'use CVS' to get working Bioperl code.
> 
> However, how would the test suite work out with this? e.g. when someone
> installs Bio::SeqIO::genbank they want to have the tests associated with
> Bio::SeqIO::genbank to be run. Would there be tests that would be run
> redundantly if for example someone installed Bio::SeqIO::genbank and
> Bio::SeqIO::fasta?

We would want to move to a strict test-script-per-module system. But 
that's desirable in any case, as it would greatly ease reaching our goal 
of complete test coverage, and subsequent maintenance of those tests.

The genbank test would only run tests specific to genbank parsing, and 
likewise for fasta. They would both have a dependency on Bio::SeqIO, and 
if that was also recently updated, it would get installed prior to you 
installing genbank (and therefore run its own generic SeqIO tests), but 
wouldn't get installed again (wouldn't run its tests again) when you 
install fasta afterwards.
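
A toy simulation of that install order, just to illustrate the point.
The dependency map is an assumption (Bio::Root::IO stands in for
whatever Bio::SeqIO really needs), and "installing" here only records
which test suites would run:

```perl
use strict;
use warnings;

# Hypothetical dependency map; the real one would come from each
# distribution's own metadata.
my %requires = (
    'Bio::SeqIO::genbank' => ['Bio::SeqIO'],
    'Bio::SeqIO::fasta'   => ['Bio::SeqIO'],
    'Bio::SeqIO'          => ['Bio::Root::IO'],
    'Bio::Root::IO'       => [],
);

my %installed;   # modules already installed (and tested) in this session
my @test_runs;   # order in which test suites were run

sub install {
    my ($module) = @_;
    return if $installed{$module};                  # up to date: no reinstall, no tests
    install($_) for @{ $requires{$module} || [] };  # dependencies go first
    push @test_runs, $module;                       # "run" this module's own tests
    $installed{$module} = 1;
}

install('Bio::SeqIO::genbank');
install('Bio::SeqIO::fasta');

print join("\n", @test_runs), "\n";
```

The generic Bio::SeqIO tests appear once in the output, before the
genbank tests, and are not repeated when fasta is installed.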


On the subject of tests, I'm reminded of another benefit of the 
individual-module approach. Currently if a test fails during a CPAN 
install, nothing gets installed. Users do one of:

# refuse to install at all (strict sys-admins)
# cry and give up (newbies)
# cry and seek help (newbies who really really need Bioperl)
# force install, leaving them in some undefined state because they 
didn't understand the problems (most remaining users)
# force install, happy that the problems are ok (some Bioperl devs)

With a bundle of individual modules you would install virtually all 
Bioperl modules with no problems, and the problems with the remainder 
would be clear to everyone. No one would need to force install since the 
test results would now be meaningful: the thing you're trying to 
install really isn't going to work if the tests are failing. If you 
really needed that particular Bioperl module you could then pay 
particular attention to why it's failing (most likely some problem with 
an external dependency).


>>> Would they also be individually distributed?  What  would you use to
>>> tie all the individual modules together?
>>
>> They would be tied together by a CPAN bundle. You don't have to
>> 'explain' anything to the CPAN maintainers because you're not doing
>> anything wrong. In fact, you're using it the way you're supposed to.
> 
> Yep. real modules are released as modules, each with their own set of
> dependencies. The use CPAN bundles the way there were supposed to be for
> distributing a set of CPAN modules that make a coherent set of
> functionality. You "could" also bundle in other authors modules e.g.
> Bio::ASN1::EntrezGene?

Any bundle featuring Bio::SeqIO::entrezgene would necessarily include 
Bio::ASN1::EntrezGene in the bundle.


> Hmm, how would module versions be handled? Wouldn't this approach
> require each module to have it's own independent version number, which
> could then be used for building the dependencies? Each new release of
> that module would only bump that module's version number.

Yes, that's how it would work. No more global version number.


> Bundles can specify the minimum version of a module to be installed,
> such that bug fixes to individual modules and be released into CPAN and
> would automatically get picked up when installing bundles etc.

Yes.


> I'm not quite sure how the current stable/dev releases would work. I
> assume bug fixes would have to be made on a branch e.g. branch 1.6 and
> released to cpan from there. Then when the next stable release is made,
> all module versions would be bumped and and released to CPAN. With any
> modifications to the content of the bundle to be made. Is it possible to
> have a stable and developer release bundles that are able to specify the
> minimum stable and developer modules versions respectively?

No, the distinction becomes pretty meaningless. We could still do big 
major releases, but modules wouldn't be version-bumped. The big release 
would just be an update of the bundle that specifies the latest version 
of all Bioperl modules.

Remember that bundles only specify the minimum version, not the required 
version: in this brave new world users would end up with the same 
versions of modules whether they installed a 1.8 bundle or a 1.7 bundle.
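
A small sketch of why that is, using the core 'version' module for the 
comparison. The bundle contents and version numbers are invented for 
illustration, and the sketch assumes a fresh install always fetches 
whatever is currently latest on CPAN:

```perl
use strict;
use warnings;
use version;

# Hypothetical data: the latest release on CPAN, and the minimum
# version each bundle names for the module.
my %cpan_latest = ('Bio::SeqIO::genbank' => '1.12');
my %bundle_min  = (
    '1.7' => { 'Bio::SeqIO::genbank' => '1.03' },
    '1.8' => { 'Bio::SeqIO::genbank' => '1.10' },
);

# A fresh install gets the latest release, as long as it meets the
# minimum the bundle asks for.
sub resolve {
    my ($bundle, $module) = @_;
    my $min    = version->parse($bundle_min{$bundle}{$module});
    my $latest = version->parse($cpan_latest{$module});
    die "$module $latest is older than required $min\n" if $latest < $min;
    return $cpan_latest{$module};
}

my %got = map { $_ => resolve($_, 'Bio::SeqIO::genbank') } qw(1.7 1.8);
print "bundle $_ installs $got{$_}\n" for sort keys %got;
```

Both bundles resolve to the same installed version; the minimum only 
matters if what is on offer is older than it.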

The only way to get a true snapshot of 1.7 after it was released would 
be if we took snapshots and archived them, making them available from 
bioperl.org (or by checking out the 1.7 tag from cvs/svn).

I don't see that as a significant problem. You lose the trivial benefit 
of being able to install old snapshots from CPAN. The people who have a 
great need to install old snapshots can find their way to bioperl.org no 
problem.


From bix at sendu.me.uk  Thu Jun 28 04:50:09 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 09:50:09 +0100
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org>
References: <20070628074004.GD6338@kunpuu.plessy.org>
Message-ID: <46837641.8050106@sendu.me.uk>

Charles Plessy wrote:
> I am considering to bring bioperl 1.5.2 in Debian, and I am wondering if
> it would make sense to call it "bioperl-live" and distribute it in
> parallel with the stable 1.4.0 version, if bioperl-live means "the
> current developer version".
> 
> If I am wrong, can somebody explain me what bioperl-live exactly refers
> to ?

bioperl-live is the name of the CVS repository containing what is 
currently considered the 'Core package' or core modules.
http://www.bioperl.org/wiki/Using_CVS

If you want to call it something to distinguish it from stable, call it 
'developer' vs 'stable' or '1.5.2' vs '1.4.0'.

To distinguish them both from the other packages, call them 'core' vs 
'run' etc.


From hlapp at gmx.net  Thu Jun 28 06:31:29 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 28 Jun 2007 07:31:29 -0300
Subject: [Bioperl-l] Splits again
In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
Message-ID: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>


On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote:

> [...] Also - the main point I wanted to make - Can I suggest we  
> spend a
> little time discussing what it will take to get a stable release for
> the current code as it stands (bioperl-live and bioperl-run)?  It
> seems like we really need to do this first so that we have a stable
> release that can be followed by CVS -> SVN migration, then consider
> major changes to the repository structure and release packaging, and
> potential deprecation and incorporation of other modules.

I agree we need to discuss a path towards 1.6, but I think that  
should be kept separate from the cvs->svn migration. Otherwise one  
stalls the other (by stopping people who seem to have the energy and  
motivation right now to do one but not the other) for no really good  
reason.

> I assume there is no chance that we'd have a 1.6 candidate by BOSC
> next month?

I'm not sure that's feasible, but if someone steps up, maybe it is.

>
> Will it be productive to schedule a fair amount of time at BOSC
> discussing how to partition out the packages into separate sub-
> packages after we've done a successful release rather than trying to
> change things right now?

I agree. I also don't think that people are partitioning right now  
(other than the existing partitioning), though maybe I'm mistaken.

> [...]
> It would  probably mean moving Bio::Graphics, Bio::DB::GFF and
> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages
> so they could be released more regularly on par with Gbrowse
> schedules.

Possibly. I'm not fully sure why those modules couldn't also be  
released more often out of the "main trunk" of modules. In Java/ant,  
it'd be relatively easy to write build script filters that select the  
appropriate modules and package them on the fly. I'm not sure whether  
the build tools for Perl can do that too, though.

>   Also I think someone needs to figure out Bio::Tools::GFF
> vs Bio::FeatureIO -- what do we want to do?

I believe FeatureIO has the ontology download tied into it?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Thu Jun 28 06:47:39 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 28 Jun 2007 07:47:39 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
Message-ID: <F2858007-63BC-4E72-B5BD-5420BE39E6D2@gmx.net>


On Jun 28, 2007, at 12:29 AM, Jason Stajich wrote:

> As I tried to ask for in the past, would someone also illustrate the
> importance of why _WE_ need to switch to SVN on a wiki page on
> Bioperl so that when someone complains/asks about this in the future
> the arguments are already laid out.  I am basically fine with it, but
> I don't honestly see a compelling reason beyond what has been
> mentioned wrt better integration in IDEs.
> http://bioperl.org/wiki/Why_SVN

I guess at the end of the day svn is just the system of choice for  
new developers. I've had people tell me who started with svn that cvs  
seems a lot harder to use. The newer projects are all on svn, and  
integrating Bio::Phylo into BioPerl, for example, should not become a  
question of the revision control system.

At the end of the day if being on svn makes it easier for new people  
to contribute it's enough of an argument for me, whether it's  
rational or not.

IMHO, there are two advantages that svn has over cvs. First,  
directories are versioned, have properties, and generally are the  
same class of citizens as files. They can be added, renamed, and  
removed from the repository. In cvs, we all know what a hassle it is  
to rename or even retire directories. Second, svn log gives you the  
commits, i.e., the set of changes that constituted one particular  
commit (and therefore version increase). In cvs that's hard or  
impossible to reconstruct.

Bottom line - I don't think many people if any will question why we  
moved from cvs to svn ...

My $0.02 ...

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Wed Jun 27 20:34:37 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 20:34:37 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
	<4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
	<9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>
	<1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>
Message-ID: <18051.541.684705.567954@almost.alerce.com>

Chris Fields writes:
 > We should port them all, yes.
 > 
 > chris
 > 
 > On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote:
 > 
 > > Is there a reason not to port every subproject over?
 > >
 > > 	-hilmar

They're all there.  At least everything that I found in the CVS repo.
Some of the directories were empty, some had very little content, I
was just mechanical about it.

Here's what I have:

  [hartzell at dev ~]$ svn ls file://`pwd`/bioperl
  biodata/
  bioperl-cookbook/
  bioperl-corba-client/
  bioperl-corba-server/
  bioperl-das-client/
  bioperl-db/
  bioperl-ext/
  bioperl-gui/
  bioperl-live/
  bioperl-microarray/
  bioperl-network/
  bioperl-papers/
  bioperl-pedigree/
  bioperl-pipeline/
  bioperl-run/
  biosql-schema/
  html/
  task-manager/
  xml-html/

I wasn't very clear in my original request, but I was hoping that
someone out there who's familiar with the various out-of-the-way bits
and pieces could take a look at them.  I was afraid that everyone was
just checking out bioperl-live and doing 'make test'.

Someone (chris?) made a point about binary files in bioperl-run.  It'd
be great if someone in the know could check on them.

Also, to the degree that it's possible, look around at various tags
and branches and see if they're what you'd expect.

Thanks!

g.


From bix at sendu.me.uk  Thu Jun 28 08:21:37 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 13:21:37 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <4683A7D1.8070403@sendu.me.uk>

George Hartzell wrote:
> Chris Fields writes:
>  > [...]
>  > It looks like George Hartzell may be taking a crack at it, with  
>  > Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
>  > could have something testable relatively soon.  After that we'll need  
>  > to work out a few other issues, basically what's on Hilmar's list.
> 
> There's a repository on file:///home/hartzell/bioperl with all of the
> component projects in place.
> 
> If you have a dev.open-bio.org account and you're in the bioperl
> group, you're good to get at it via:
> 
>   file:///home/hartzell/bioperl

I'm confused. Presumably that only works whilst logged into 
dev.open-bio.org?


>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl

I just tried:

svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl

on Mac OS X and things seemed to go well, except for this error message 
at the end:


svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
svn: Can't move source to dest
svn: Can't move 
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
to 
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
No such file or directory

I also ended up with only:
bioperl-corba-server    bioperl-db              bioperl-live 
bioperl-network         bioperl-papers          biosql-schema


Am I doing something totally wrong here?


From hartzell at alerce.com  Thu Jun 28 08:32:36 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:32:36 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN
	and	...Re:	Perltidy]
In-Reply-To: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
Message-ID: <18051.43620.481558.447399@almost.alerce.com>

Jason Stajich writes:
 > [...]
 > The repository machine (dev) is a locked-down machine, meaning it  
 > really only runs ssh and few other servers; in particular, no httpd.  We have  
 > anonymous CVS (client and through httpd browsing) running on a  
 > separate machine (code) that has the info rsynced over every 10 or 15  
 > minutes.

A great way to provide a read-only mirror of the repos. for anonymous
users is to have svnsync running out of cron on code.open-bio.org,
configured to pull from the dev.open-bio.org repository.  It might
actually work to have rsync mirror the fsfs-backed repository, but
that's scary-poking-into-the-internals.
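
A minimal sketch of the svnsync setup described above, written in
Python only to make the command sequence explicit.  The mirror path
and the 15-minute interval are assumptions, not settings from the
open-bio machines.  `svnsync init` and `svnsync sync` are the real
subcommands (Subversion 1.4 or later); the mirror repository has to
exist already (svnadmin create) and needs a pre-revprop-change hook
that exits 0 before the init step will work.

```python
# Sketch: command lines for a read-only svnsync mirror.
# SOURCE is the demo repository URL quoted in this thread.
SOURCE = "svn+ssh://dev.open-bio.org/home/hartzell/bioperl"
# MIRROR is a hypothetical location on code.open-bio.org (assumption).
MIRROR = "file:///home/svn-mirror/bioperl"


def mirror_commands(mirror, source):
    """Return the one-time setup command and the repeated sync command."""
    init = ["svnsync", "init", mirror, source]  # run once, by hand
    sync = ["svnsync", "sync", mirror]          # run repeatedly from cron
    return init, sync


def crontab_line(sync, minutes=15):
    """Format a crontab entry that runs the sync every `minutes` minutes."""
    return "*/%d * * * * %s" % (minutes, " ".join(sync))


if __name__ == "__main__":
    init, sync = mirror_commands(MIRROR, SOURCE)
    print(" ".join(init))
    print(crontab_line(sync))
```

Because svnsync replays revisions through the normal repository
layer, it avoids copying a half-written fsfs revision file, which is
the risk with a plain rsync of the repository directory.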

g.


From hartzell at alerce.com  Thu Jun 28 08:43:37 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:43:37 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
Message-ID: <18051.44281.831316.749586@almost.alerce.com>

David Messina writes:
 > 
 > On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote:
 > 
 > >
 > > On Jun 27, 2007, at 1:27 PM, David Messina wrote:
 > >
 > >> I would think we would want "Author Date Id Rev URL" set on
 > >> everything, no?. So either cvs2svn or your tool (whichever you think
 > >> is better), followed by
 > >>
 > >> 	svn propset svn:keywords "Author Date Id Rev URL" *
 > >
 > > Shouldn't this be done recursively?
 > 
 > 
 > Yep, good catch! Thanks, Hilmar.
 > 
 > Should be:
 > 
 > 	svn propset --recursive svn:keywords "Author Date Id Rev URL" *

That's not quite what you want either.  It'll set the keyword
property on all of the files, including things where you probably
don't want expansion to happen (e.g. images, someone said there are
binary wads in bioperl-run, etc...).

The Right Thing To Do is to grub around (grep) for '\$Id:' (and the
others) and set svn:keywords on files that are already using
keywords.  I have a bourne shell hack that'll do this, although it's
painful because it has to run in working directories....
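
For illustration, here is a rough Python sketch of that "grep, then
propset" idea.  It is not the shell hack mentioned above; the keyword
list is the one under discussion in this thread, the function names
are made up for the example, and the binary check (skip anything
containing a NUL byte) is a crude assumption.

```python
import os
import re

# Keywords under discussion in this thread; the final list is undecided.
KEYWORDS = ("Author", "Date", "Id", "Rev", "URL")
# Matches both the unexpanded ($Id$) and expanded ($Id: ... $) forms.
MARKER = re.compile(
    rb"\$(" + "|".join(KEYWORDS).encode() + rb")(:[^$\n]*)?\$"
)


def files_using_keywords(root):
    """Yield (path, sorted keywords found) for text files under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Subversion's administrative directories.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                data = handle.read()
            if b"\0" in data:
                continue  # crude binary check (assumption)
            found = sorted({m.group(1).decode() for m in MARKER.finditer(data)})
            if found:
                yield path, found


def propset_command(path, keywords):
    """Build the svn propset call for one file; nothing is executed here."""
    return ["svn", "propset", "svn:keywords", " ".join(keywords), path]


if __name__ == "__main__":
    for path, found in files_using_keywords("."):
        print(" ".join(propset_command(path, found)))
```

Run from the top of a working copy, this only prints the commands, so
the list can be reviewed before anything is changed.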

Once we settle on a list of keywords to use, I'll take a whack at the
demo repository.

Likewise, you probably DON'T want to use this in your config file:

	  enable-auto-props = yes
	  * = svn:keywords="Author Date Id Rev URL"

since it'll do the same thing.

The Right Thing To Do is a more tedious 

	  *.pl = svn:keywords="Author Date Id Rev URL"
	  *.pm = svn:keywords="Author Date Id Rev URL"
	  *.c = svn:keywords="Author Date Id Rev URL"

A bit of googling will give you a good starting point for the list,
and we should probably maintain a common one somewhere in the repo.
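
To make the difference between the catch-all rule and the
per-extension rules concrete, here is a small Python sketch that
mimics the pattern matching.  It uses Python's fnmatch on the file's
basename as an approximation; Subversion's own matching may differ in
details such as case handling, and `props_for` is a name invented for
this example.

```python
from fnmatch import fnmatchcase

# The per-extension rules from the message above.
AUTO_PROPS = {
    "*.pl": 'svn:keywords="Author Date Id Rev URL"',
    "*.pm": 'svn:keywords="Author Date Id Rev URL"',
    "*.c": 'svn:keywords="Author Date Id Rev URL"',
}


def props_for(filename, rules=AUTO_PROPS):
    """Return the property settings whose pattern matches the basename."""
    base = filename.rsplit("/", 1)[-1]
    return [value for pattern, value in rules.items()
            if fnmatchcase(base, pattern)]


if __name__ == "__main__":
    for name in ("Bio/SeqIO.pm", "t/data/logo.png"):
        print(name, props_for(name))
    # With a catch-all "*" rule, the image would get keywords too.
    print(props_for("t/data/logo.png", {"*": AUTO_PROPS["*.pm"]}))
```

With the three patterns above, a .pm file picks up svn:keywords and a
.png file picks up nothing, which is the behaviour the catch-all rule
loses.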

I don't think that there's a server side way of doing this, short of
running some script via a hook around commit time.

g.


From hartzell at alerce.com  Thu Jun 28 08:54:40 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:54:40 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN
	and	...Re:	Perltidy]
In-Reply-To: <F2858007-63BC-4E72-B5BD-5420BE39E6D2@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
	<F2858007-63BC-4E72-B5BD-5420BE39E6D2@gmx.net>
Message-ID: <18051.44944.982207.37624@almost.alerce.com>

Hilmar Lapp writes:
 > [...]
 > IMHO, there are two advantages that svn has over cvs. First,  
 > directories are versioned, have properties, and generally are the  
 > same class of citizens as files. They can be added, renamed, and  
 > removed from the repository. In cvs, we all know what a hassle it is  
 > to rename or even retire directories. Second, svn log gives you the  
 > commits, i.e., the set of changes that constituted one particular  
 > commit (and therefore version increase). In cvs that's hard or  
 > impossible to reconstruct.

Two more:

  - svn groups changes into revisions, so that they can be considered
    together; CVS versions individual files.
  - subversion tracks renames/moves correctly.
  - subversion commits are atomic, so you never have to worry about
    whether all of your stuff made it into the repos. at the same time
    [if you've never had to un-muck this, count yourself blessed!].
  - svk, which allows disconnected development while still committing
    your work to a repo at natural points along the way (you can
    revert, branch, etc.... to your heart's content).

[yeah, that's 3, err, 4. Math is hard.]

g.


From cjfields at uiuc.edu  Thu Jun 28 09:07:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 08:07:24 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
	<23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>
Message-ID: <01812F01-9409-49FB-9061-330FA52177C1@uiuc.edu>


On Jun 28, 2007, at 5:31 AM, Hilmar Lapp wrote:

>
> On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote:
>
>> ...It
>> seems like we really need to do this first so that we have a stable
>> release that can be followed by CVS -> SVN migration, then consider
>> major changes to the repository structure and release packaging, and
>> potential deprecation and incorporation of other modules.
>
> I agree we need to discuss a path towards 1.6, but I think that
> should be kept separate from the cvs->svn migration. Otherwise one
> stalls the other (by stopping people who seem to have the energy and
> motivation right now to do one but not the other) for no really good
> reason.

It's good to discuss it as long as it doesn't take time and energy  
away from other priorities.

>> I assume there is no chance that we'd have a 1.6 candidate by BOSC
>> next month?
>
> I'm not sure that's feasible, but if someone steps up,
> maybe it is.

Maybe a 1.5.3 and (if we work hard on it) a 1.6 soon after.  Then  
maybe work on partitioning if everyone's up for it and a scheme is  
worked out.

>> Will it be productive to schedule a fair amount of time at BOSC
>> discussing how to partition out the packages into separate sub-
>> packages after we've done a successful release rather than trying to
>> change things right now?
>
> I agree. I also don't think that people are partitioning right now
> (other than the existing partitioning), though maybe I'm mistaken.

The original proposal was based on Steve's idea of splitting up  
core.  I don't think a partition is feasible at this point, at least  
until we put more thought into it  (our energy should be focused  
elsewhere), but it's well worth discussing as a future path.

At this time there are two proposals:

1)  Steve's and my 'split into discrete sections' proposal, where we  
split core into self-sustaining sections with a common core listed as  
a dependency, tying installation of all together with a Bundle or  
similar.

2)  Sendu's 'break everything up' approach where all modules are  
submitted independently to CPAN, with their own tests, dependencies,  
etc.

There are advantages and disadvantages to both approaches.  Not sure  
if CPAN would go for the latter (it's pretty drastic), but I don't  
know for sure.  If you want in on that discussion (in this thread)  
feel free to join in!  The more the merrier!

>> [...]
>> It would  probably mean moving Bio::Graphics, Bio::DB::GFF and
>> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages
>> so they could be released more regularly on par with Gbrowse
>> schedules.
>
> Possibly. I'm not fully sure why those modules couldn't also be
> released more often out of the "main trunk" of modules. In Java/ant,
> it'd be relatively easy to write build script filters that select the
> appropriate modules and package them on the fly. I'm not sure whether
> the build tools for Perl can do that too, though.

Both approaches above would probably use Module::Build to install  
other bioperl dependencies, each of which could have its own  
dependency set, possibly using a Bundle to tie everything together.

>>   Also I think someone needs to figure out Bio::Tools::GFF
>> vs Bio::FeatureIO -- what do we want to do?
>
> I believe FeatureIO has the ontology download tied into it?
>
> 	-hilmar

 From recent posts here and on the gbrowse mail list by Scott and  
Lincoln, it seemed like they were moving away from using Bio::DB::GFF  
and were trying to get users to switch to Bio::DB::SeqFeature.  Maybe  
we should get a more direct response?

chris


From hartzell at alerce.com  Thu Jun 28 09:16:18 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 09:16:18 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683A7D1.8070403@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
Message-ID: <18051.46242.942184.758493@almost.alerce.com>

Sendu Bala writes:
 > George Hartzell wrote:
 > > Chris Fields writes:
 > >  > [...]
 > >  > It looks like George Hartzell may be taking a crack at it, with  
 > >  > Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
 > >  > could have something testable relatively soon.  After that we'll need  
 > >  > to work out a few other issues, basically what's on Hilmar's list.
 > > 
 > > There's a repository on file:///home/hartzell/bioperl with all of the
 > > component projects in place.
 > > 
 > > If you have a dev.open-bio.org account and you're in the bioperl
 > > group, you're good to get at it via:
 > > 
 > >   file:///home/hartzell/bioperl
 > 
 > I'm confused. Presumably that only works whilst logged into 
 > dev.open-bio.org?

Yes, that only works if you're actually on the machine.

 > >   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
 > 
 > I just tried:
 > 
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
 > 
 > on Mac OS X and things seemed to go well, except for this error message 
 > at the end:
 > 
 > 
 > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
 > svn: Can't move source to dest
 > svn: Can't move 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
 > to 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
 > No such file or directory
 > 
 > I also ended up with only:
 > bioperl-corba-server    bioperl-db              bioperl-live 
 > bioperl-network         bioperl-papers          biosql-schema
 > 
 > 
 > Am I doing something totally wrong here?

It looks like you tried to check out the *entire* repository.  It
never occurred to me to try that.  I'll take a look at what you
reported.

g.


From bix at sendu.me.uk  Thu Jun 28 09:20:19 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 14:20:19 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18051.46242.942184.758493@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.46242.942184.758493@almost.alerce.com>
Message-ID: <4683B593.3050108@sendu.me.uk>

George Hartzell wrote:
> Sendu Bala writes:
>> I just tried:
>> 
>> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
[snip]
> It looks like you tried to check out the *entire* repository.

Yes. If you don't want everything, how does one 'browse' the repository
to find out the address of the thing you /do/ want?


> It never occurred to me to try that.  I'll take a look at what you 
> reported.

Cheers.


From bix at sendu.me.uk  Thu Jun 28 09:27:29 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 14:27:29 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18049.22260.967524.353173@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
Message-ID: <4683B741.5020600@sendu.me.uk>

George Hartzell wrote:
> There don't seem to be any .cvsignore files in the repository, or in
> CVSROOT/cvsignore.
> 
> Am I missing something, or don't we use them?

It would be great to have the following files svn:ignored :

In all package roots:
? Build
? MANIFEST
? MANIFEST.SKIP
? META.yml
? _build
? bioperl-*.tar.bz2
? bioperl-*.tar.gz
? bioperl-*.zip
? blib
? cover_db

In any and all directories:
? .DS_Store
? .DAV

In bioperl-live:
? t/BioDBSeqFeature.t
? t/BioDBSeqFeature_BDB.t
? t/BioDBSeqFeature_mysql.t


Can't think of anything else right now.

Thanks for your efforts,
Sendu.


From cjfields at uiuc.edu  Thu Jun 28 09:30:43 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 08:30:43 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683A7D1.8070403@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
Message-ID: <A2B0A715-BEF7-4632-91B3-1A215FBFE3D5@uiuc.edu>


On Jun 28, 2007, at 7:21 AM, Sendu Bala wrote:

>> ...
>>   file:///home/hartzell/bioperl
>
> I'm confused. Presumably that only works whilst logged into
> dev.open-bio.org?

Yes, it's just a tester.

>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>
> I just tried:
>
> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl

Try 'svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- 
live/trunk /mybiodir' to check out the main trunk for core.

chris


From hartzell at alerce.com  Thu Jun 28 09:57:00 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 09:57:00 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683A7D1.8070403@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
Message-ID: <18051.48684.996884.134046@almost.alerce.com>

Sendu Bala writes:
 > [...]
 > I just tried:
 > 
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
 > 
 > on Mac OS X and things seemed to go well, except for this error message 
 > at the end:
 > 
 > 
 > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
 > svn: Can't move source to dest
 > svn: Can't move 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
 > to 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
 > No such file or directory
 > 
 > I also ended up with only:
 > bioperl-corba-server    bioperl-db              bioperl-live 
 > bioperl-network         bioperl-papers          biosql-schema
 > 
 > 
 > Am I doing something totally wrong here?

So, you probably wanted something like

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk

to pick up the head of the bioperl live tree (or
/.../bioperl-run/trunk, etc...).

I just checked out

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/

and it ran to completion and gave me 

   (delicious)[6:50am]~/tmp>>ls bioperl | cat
   biodata
   bioperl-cookbook
   bioperl-corba-client
   bioperl-corba-server
   bioperl-das-client
   bioperl-db
   bioperl-ext
   bioperl-gui
   bioperl-live
   bioperl-microarray
   bioperl-network
   bioperl-papers
   bioperl-pedigree
   bioperl-pipeline
   bioperl-run
   biosql-schema
   html
   task-manager
   xml-html

Can another mac os x user out there give the Great Big Checkout a try
and see if it runs to completion?  Potential problems that come to
mind are:

  - the "mac's are case insensitive, sort of" problem
  - you filled up your disk
  - something else.
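
One way to test the first hypothesis is to look for paths in the
repository that differ only by letter case, since those cannot
coexist on a case-insensitive filesystem.  The Python sketch below
does that for a list of paths, for instance the output of
`svn ls -R` on the URL in question.  The lowercase file name in the
example is purely hypothetical; nothing in this thread confirms that
such a file exists.

```python
from collections import defaultdict


def case_collisions(paths):
    """Group paths that differ only by letter case."""
    groups = defaultdict(list)
    for path in paths:
        groups[path.lower()].append(path)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


if __name__ == "__main__":
    listing = [
        "t/data/HUMBETGLOA.FASTA",
        "t/data/humbetgloa.fasta",  # hypothetical, for illustration only
        "t/data/phredfile.phd",
    ]
    for group in case_collisions(listing):
        print("collision:", ", ".join(group))
```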

g.


From charles-listes+bioperl at plessy.org  Thu Jun 28 09:44:56 2007
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Thu, 28 Jun 2007 22:44:56 +0900
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <46837687.7010101@sheffield.ac.uk>
References: <20070628074004.GD6338@kunpuu.plessy.org>
	<46837687.7010101@sheffield.ac.uk>
Message-ID: <20070628134456.GB14492@kunpuu.plessy.org>

On Thu, Jun 28, 2007 at 09:51:19AM +0100, Nathan S. Haigh wrote:
> 
> Version 1.5.* is the developer release, while the 1.4.* is the stable
> release. However, there have been few updates to the 1.4.* release which
> means that it is more unstable than the 1.5.* dev release. I think the
> consensus, was to have more rapid release cycles of the stable branch in
> future in order to avoid this. I'm sure there are others more qualified
> to expand/correct me on this if need be.

Ok, thank you all for the answers. I think that I will simply upgrade
bioperl to 1.5.2 in Debian testing, and maybe rename it bioperl-core
when I package other components.

Have a nice day,

-- 
Charles Plessy
Debian-Med packaging team
Wako, Saitama, Japan


From bix at sendu.me.uk  Thu Jun 28 10:19:49 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 15:19:49 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18051.48684.996884.134046@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
Message-ID: <4683C385.3050904@sendu.me.uk>

George Hartzell wrote:
> Sendu Bala writes:
>  > [...]
>  > I just tried:
>  > 
>  > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>  > 
>  > on Mac OS X and things seemed to go well, except for this error message 
>  > at the end:
>  > 
>  > 
>  > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
>  > svn: Can't move source to dest
>  > svn: Can't move 
>  > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
>  > to 
>  > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
>  > No such file or directory
>  > 
>  > I also ended up with only:
>  > bioperl-corba-server    bioperl-db              bioperl-live 
>  > bioperl-network         bioperl-papers          biosql-schema

I tried again in the same location and it told me I had to 'svn 
cleanup', which I did. But subsequently it kept complaining about files 
already being there.


> I just checked out
> 
>   svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/
> 
> and it ran to completion
[snip]
> Can another mac os x user out there give the Great Big Checkout a try
> and see if it runs to completion?  Potential problems that come to
> mind are:
> 
>   - the "mac's are case insensitive, sort of" problem
>   - you filled up your disk
>   - something else.

Well, I didn't run out of disc space. After a rm -fr * and trying again 
it failed at exactly the same point, in the same way.

svn co 
svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data

causes this repeatable problem:

[...]
A    data/phredfile.phd
svn: In directory 'data'
svn: Can't move source to dest
svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 
'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory

That is with Mac OS X svn command-line client, version 1.4.4

I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with 
a linux svn command-line client, version 1.2.3.


Cheers,
Sendu.


From dmessina at wustl.edu  Thu Jun 28 11:08:59 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 10:08:59 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18051.44281.831316.749586@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
	<18051.44281.831316.749586@almost.alerce.com>
Message-ID: <F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>

> [George]
> Likewise, you probably DON'T want to use this in your config file:
>
> 	  enable-auto-props = yes
> 	  * = svn:keywords="Author Date Id Rev URL"
>
> since it'll do the same thing.

Ah, so I've been doing it wrong all along then. :) Thanks, George!


> The Right Thing To Do is a more tedious
>
> 	  *.pl = svn:keywords="Author Date Id Rev URL"
> 	  *.pm = svn:keywords="Author Date Id Rev URL"
>   	  *.c = svn:keywords="Author Date Id Rev URL"
>
> A bit of googling will give you a good starting point for the list,
> and we should probably maintain a common one somewhere in the repo.


I've googled around and gathered the following as a possible list for  
our repo. Since I obviously don't know what I'm doing :), of course  
adjust and refine as necessary.

Dave

-------
[auto-props]
# Code formats
*.c          = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cpp        = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.h          = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.java       = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.as         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cgi        = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.js         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/javascript
*.php        = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-php
*.pl         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl; svn:executable
*.pm         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl
*.py         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-python; svn:executable
*.sh         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-sh; svn:executable

# Image formats
*.bmp        = svn:mime-type=image/bmp
*.gif        = svn:mime-type=image/gif
*.ico        = svn:mime-type=image/ico
*.jpeg       = svn:mime-type=image/jpeg
*.jpg        = svn:mime-type=image/jpeg
*.png        = svn:mime-type=image/png
*.tif        = svn:mime-type=image/tiff
*.tiff       = svn:mime-type=image/tiff

# Data formats
*.pdf        = svn:mime-type=application/pdf
*.avi        = svn:mime-type=video/avi
*.doc        = svn:mime-type=application/msword
*.eps        = svn:mime-type=application/postscript
*.gz         = svn:mime-type=application/gzip
*.mov        = svn:mime-type=video/quicktime
*.mp3        = svn:mime-type=audio/mpeg
*.ppt        = svn:mime-type=application/vnd.ms-powerpoint
*.ps         = svn:mime-type=application/postscript
*.psd        = svn:mime-type=application/photoshop
*.rtf        = svn:mime-type=text/rtf
*.swf        = svn:mime-type=application/x-shockwave-flash
*.tgz        = svn:mime-type=application/gzip
*.wav        = svn:mime-type=audio/wav
*.xls        = svn:mime-type=application/vnd.ms-excel
*.zip        = svn:mime-type=application/zip

# Text formats
.htaccess    = svn:mime-type=text/plain
*.css        = svn:mime-type=text/css
*.dtd        = svn:mime-type=text/xml
*.html       = svn:mime-type=text/html
*.ini        = svn:mime-type=text/plain
*.sql        = svn:mime-type=text/x-sql
*.txt        = svn:mime-type=text/plain
*.xhtml      = svn:mime-type=text/xhtml+xml
*.xml        = svn:mime-type=text/xml
*.xsd        = svn:mime-type=text/xml
*.xsl        = svn:mime-type=text/xml
*.xslt       = svn:mime-type=text/xml
*.xul        = svn:mime-type=text/xul
*.yml        = svn:mime-type=text/plain
CHANGES      = svn:mime-type=text/plain
COPYING      = svn:mime-type=text/plain
INSTALL      = svn:mime-type=text/plain
Makefile*    = svn:mime-type=text/plain
README       = svn:mime-type=text/plain
TODO         = svn:mime-type=text/plain
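
A list this long is easy to mistype, so here is a minimal sketch of a
sanity check for it. Everything in it is an assumption for illustration:
the function name check_autoprops is made up, it expects each pattern on
a single line, and it only knows the four properties used in this list
(svn:eol-style, svn:keywords, svn:mime-type, svn:executable). It checks
property names only, not their values.

```shell
#!/bin/sh
# check_autoprops (hypothetical helper): read an [auto-props] fragment on
# stdin and report any property name outside the four used in this list.
# Exits 0 if every name is known, 1 otherwise.
check_autoprops() {
    awk '
        # Skip comments, section headers and blank lines.
        /^[ \t]*#/ || /^[ \t]*\[/ || /^[ \t]*$/ { next }
        {
            eq = index($0, "=")
            if (eq == 0) {
                printf "line %d: no \"=\" found\n", NR
                bad = 1
                next
            }
            # Everything after the first "=" is a ";"-separated property list.
            props = substr($0, eq + 1)
            n = split(props, parts, ";")
            for (i = 1; i <= n; i++) {
                name = parts[i]
                sub(/=.*/, "", name)                 # keep the name, drop the value
                gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim surrounding whitespace
                if (name !~ /^svn:(eol-style|keywords|mime-type|executable)$/) {
                    printf "line %d: unexpected property \"%s\"\n", NR, name
                    bad = 1
                }
            }
        }
        END { exit bad + 0 }
    '
}
```

It could be run as "check_autoprops < ~/.subversion/config" before
sharing a config, though other sections of a real config file would
also be reported, so feeding it just the [auto-props] part is safer.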


From dmessina at wustl.edu  Thu Jun 28 11:11:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 10:11:23 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.46242.942184.758493@almost.alerce.com>
	<4683B593.3050108@sendu.me.uk>
Message-ID: <F55A8B8A-B7B8-4354-85B7-E459B3679E41@wustl.edu>

> [Sendu]
>
> Yes. If you don't want everything, how does one 'browse' the
> repository to find out the address of the thing you /do/ want?

svn ls file:///home/hartzell/bioperl

from a shell on dev.open-bio.org itself (file:// URLs take no remote
hostname), or, from anywhere else,

svn ls svn+ssh://dev.open-bio.org/home/hartzell/bioperl
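
To make the browse-then-checkout idea concrete, here is a rough sketch.
The helper names repo_ls and repo_co and the SVN override variable are
assumptions made up for this example; only the repository URL and the
bioperl-live/tags/release-0-9-2/t/data path come from this thread.

```shell
#!/bin/sh
# Repository root as given in this thread.
REPO="svn+ssh://dev.open-bio.org/home/hartzell/bioperl"
# SVN may be overridden (for example SVN=echo) to preview the commands
# without contacting the server.
SVN="${SVN:-svn}"

# List one level of the repository below the given relative path.
repo_ls() {
    "$SVN" ls "$REPO/$1"
}

# Check out only the given subtree instead of the whole repository.
repo_co() {
    "$SVN" co "$REPO/$1"
}
```

Typical use would be to walk down one level at a time, e.g.
"repo_ls bioperl-live", then "repo_ls bioperl-live/tags", and finally
"repo_co bioperl-live/tags/release-0-9-2/t/data" once the wanted
directory has been found.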


From n.haigh at sheffield.ac.uk  Thu Jun 28 11:13:58 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 16:13:58 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>	<18051.46242.942184.758493@almost.alerce.com>
	<4683B593.3050108@sendu.me.uk>
Message-ID: <4683D036.5060109@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> George Hartzell wrote:
>> Sendu Bala writes:
>>> I just tried:
>>>
>>> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
> [snip]
>> It looks like you tried to check out the *entire* repository.
> 
> Yes. If you don't want everything, how does one 'browse' the repository
> to find out the address of the thing you /do/ want?
> 

You could try:
svn ls

or

svn ls -R

to get a list of directories.

> 
>> It never occurred to me to try that.  I'll take a look at what you
>> reported.
> 
> Cheers.
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg9A2czuW2jkwy2gRAgirAKCnMAg6a7W7RM22O2rOi4vD5w3HPwCePsku
akLhIszoQbRc/aVX3d/Jp7w=
=mlHY
-----END PGP SIGNATURE-----


From cjfields at uiuc.edu  Thu Jun 28 11:20:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 10:20:46 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683C385.3050904@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
Message-ID: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu>

I can replicate the same problem (Mac OS X) with a full checkout:

svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
svn: Can't move source to dest
svn: Can't move 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory

What local (mac) svn version are you using?  I'm running off macports:

svn --version
svn, version 1.4.4 (r25188)
    compiled Jun 16 2007, 23:40:53

chris

On Jun 28, 2007, at 9:19 AM, Sendu Bala wrote:
...

> I tried again in the same location and it told me I had to 'svn
> cleanup', which I did. But subsequently it kept complaining about
> files already being there.
>>
> [snip]
>> Can another mac os x user out there give the Great Big Checkout a try
>> and see if it runs to completion.  Potential problems that come to
>> mind are:
>>
>>   - the "Macs are case insensitive, sort of" problem
>>   - you filled up your disk
>>   - something else.
>
> Well, I didn't run out of disc space. After a rm -fr * and trying
> again it failed at exactly the same point, in the same way.
>
> svn co
> svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
>
> causes this repeatable problem:
>
> [...]
> A    data/phredfile.phd
> svn: In directory 'data'
> svn: Can't move source to dest
> svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to
> 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory
>
> That is with Mac OS X svn command-line client, version 1.4.4
>
> I can get bioperl-live/tags/release-0-9-2/t/data to check out fine
> with a linux svn command-line client, version 1.2.3.
>
>
> Cheers,
> Sendu.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Jun 28 11:37:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 10:37:27 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683624F.6020402@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
Message-ID: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>

On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> ...
>
> The short and sweet version: my proposal has all the benefits of  
> yours, but none of the disadvantages. What's not to like?

The short and sweet version: I'm more convinced after you laid out  
your argument in detail, which would have saved me some typing last  
night, BTW, thanks! ; >

The other core devs need to chip in and we need to openly (candidly)  
discuss it some more (I've added Hilmar to this).  There is also a  
tenable solution that allows both aspects ('cliques' and single mode)  
which might make everybody happy.

Let's say we only want to install Bio::SeqIO::genbank.  The
Bio::SeqIO::genbank Build.PL would only install what was needed (as
you indicated), only Bio::SeqIO::genbank-related tests would run
(along with dependency test, if available), and life would go on.
However, what if we wanted to install everything in
SeqIO/DB/AlignIO/etc?

We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO
modules installed or a select few (maybe a quick 'install all (y/n)?'
followed by a list, which installs them one at a time along with
dependencies), or have the option to specifically denote them as
passed args to SeqIO's Build.PL, something like
'perl Build.PL -install-plugins genbank embl swiss',
'perl Build.PL -install-plugins all', etc.  If a specific module
(Bio::SeqIO::genbank) is installed directly then maybe the
installation q&a's of followed modules could be bypassed when
installing down the dependency tree with additional passed args.

This would, in effect, be a bioperl-specific mini-CPAN within CPAN.   
Nice!

Now, this doesn't address several related issues, such as how we  
handle versioning of the independent modules (should be in a  
controlled manner), what we do about deprecated modules which linger  
about on CPAN, how we deal with PPMs/RPMs/packaging, and so on.  All  
have possible reasonable ways they can be addressed, I believe.   
Also, I think we should still think about doing regular full-scale  
'stable' (1.#) releases (sort of our stamp of approval for that batch  
of modules at that point in time, with a reasonable 'sell-by' date).

Again, it should be seriously discussed among the core devs and the  
bioperl community at large prior to any serious work on it, and it  
would be quite a large-scale project, but possibly worth it.  It can  
only go forward if there is enough momentum behind it.

>> Finally, all of this should wait until later.  Much later, like  
>> after  a decent release, after svn, etc kind of 'later'.  I think  
>> we can  agree on that.
>
> Hmm, not really. If it can be implemented by a change in just  
> Build.PL and ModuleBuildBioperl, it's really independent of
> everything else. That's the beauty of it: the only thing that  
> changes is how things are uploaded to and downloaded from CPAN. The  
> only person that normally deals with that issue is the pumpkin for  
> a release, and he only cares about it at release time.
>
> In fact, if we're going to do it at all it makes sense to try it  
> out on a minor release like 1.5.3. We've already got experience of  
> doing it split-style from 1.5.2. (And let me tell you: splits at  
> the code-base level suck.)

BOSC is coming up, and I would like to focus on getting svn migration  
taken care of ASAP (which is sounding more and more like we plan on  
moving all open-bio over, unless I misread Jason's post?) and  
stomping of bugs (my next priority after EUtilities).  Maybe in the  
interim we should try focusing on bug squashing, get out a quick  
standard dev release (1.5.3) before BOSC, and then a few of us could  
all communicate there via email/text/IM/phone off-list?  Maybe post  
updates via the bioperl blog and list?

> And where is the harm in letting them do it via CPAN as well? In  
> fact, there are significant benefits:
...

I'm already pretty convinced...

> The same can be achieved with CPAN bundles for each kind of  
> functional grouping you can think of. And since it's just a single
> text file that defines such a grouping, its easy to change or add  
> new ones as you feel like it, as opposed to the rather more  
> permanent and substantial effort of creating one of your splits on  
> the code-base level.

... or it could be run right in Module::Build for specific parent  
classes (as I mention above).  Bundling could be instituted for  
something like a standard GBrowse release (Bundle::BioPerl::GBrowse)  
where the functionality might be more spread out (Bio::DB*,  
Bio::Graphics, Bio::FeatureIO, etc).  For a full-scale old-style core  
install, another Bundle (Bundle::BioPerl::Standard).

...

> Yes, it would be automated, and no, it wouldn't at all be any kind  
> of additional headache. I'm proposing a fully-automated system that  
> the pumpkin wouldn't even have to think about it. Much /less/ of a  
> headache than dealing with splits. Orders of magnitude easier to  
> deal with.

The 'headache' would be the initial setup (splitting test, individual  
Build.PL, etc), but this could be done stepwise or section-wise, I  
suppose.
...

> And the smallest, most concentrated set of modules is the  
> individual module.

Well, only if it runs correctly (i.e. has the entire dep. tree  
installed).  But the 'follow' tests would handle that.

> The reason some of these existing splits (microarray, ext) have
> fallen by the way-side? /Because/ they're splits. If they had been  
> part of bioperl-live all along, they'd have been kept in a working,  
> compatible state and would have been released along with everything  
> else in 1.5.2

microarray fell out of favor for other reasons (much faster ways to  
do the same thing via R), though I think it still could be salvaged  
if someone wanted to take it up.

the other bioperl distros (network, db, run, etc) would also  
necessitate following the same path as core, but I guess they could  
be bundled as well.

> ...
> No headaches.

I already have one, sorry!

chris


From n.haigh at sheffield.ac.uk  Thu Jun 28 11:53:52 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 16:53:52 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
Message-ID: <4683D990.8090909@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote:
> 
>> Chris Fields wrote:
>>> ...
>>
>> The short and sweet version: my proposal has all the benefits of
>> yours, but none of the disadvantages. What's not to like?
> 
> The short and sweet version: I'm more convinced after you laid out your
> argument in detail, which would have saved me some typing last night,
> BTW, thanks! ; >
> 
> The other core devs need to chip in and we need to openly (candidly)
> discuss it some more (I've added Hilmar to this).  There is also a
> tenable solution that allows both aspects ('cliques' and single mode)
> which might make everybody happy.

Couldn't "cliques" simply be satisfied with CPAN Bundles?

> 
> Let's say we only want to install Bio::SeqIO::genbank.  The
> Bio::SeqIO::genbank Build.PL would only install what was needed (as you
> indicated), only Bio::SeqIO::genbank-related tests would run (along with
> dependency test, if available), and life would go on.  However, what if
> we wanted to install everything in SeqIO/DB/AlignIO/etc?

I think this might be where Bundles come in for installing these
"cliques" of related modules?

- -- snip --

> 
>> Yes, it would be automated, and no, it wouldn't at all be any kind of
>> additional headache. I'm proposing a fully-automated system that the
>> pumpkin wouldn't even have to think about it. Much /less/ of a
>> headache than dealing with splits. Orders of magnitude easier to deal
>> with.
> 
> The 'headache' would be the initial setup (splitting test, individual
> Build.PL, etc), but this could be done stepwise or section-wise, I suppose.

Yes, I think this is where most of the labour will be. However, setting
the test suite up like this would be beneficial with or without
publishing modules individually.

- -- snip --
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg9mQczuW2jkwy2gRAlfBAKCFP7XUvWXsjycSv0MVGN3Ru40D/wCcDiDg
UKE/Q/wA3gu1Gb7S6rarCQw=
=WQdY
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Thu Jun 28 12:03:54 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 17:03:54 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
Message-ID: <4683DBEA.90005@sendu.me.uk>

Chris Fields wrote:
> On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote:
> Let's say we only want to install Bio::SeqIO::genbank.  The 
> Bio::SeqIO::genbank Build.PL would only install what was needed (as you 
> indicated), only Bio::SeqIO::genbank-related tests would run (along with 
> dependency test, if available), and life would go on.  However, what if 
> we wanted to install everything in SeqIO/DB/AlignIO/etc?
> 
> We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO 
> modules installed or a select few (maybe a quick 'install all (y/n)?' 
> followed by a list, which installs them one at a time along with 
> dependencies), or have the option to specifically denote them as passed 
> args to SeqIO's Build.PL, something like 'perl Build.PL -install-plugins 
> genbank embl swiss', 'perl Build.PL -install-plugins all', etc.  If a 
> specific module (Bio::SeqIO::genbank) is installed directly then maybe 
> the installation q&a's of followed modules could be bypassed when 
> installing down the dependency tree with additional passed args.

I'd probably stay away from something like this. My primary reason 
being, off-the-top-of-my-head I don't see how to get it to work. If 
you're installing Bio::SeqIO for the first time via CPAN you can't ask 
it to install Bio::SeqIO::genbank et al. at the same time because 
Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some circularity.

I also wouldn't want these things to be complicated. There should be 
little in the way of questions to ask during install. Each module's 
Build.PL should be ultra-simple with no advanced logic at all. It should 
just specify things that are absolute requirements. This simplicity 
helps avoid some of the problems we face by distributing the monolithic 
Bioperl.

No, much better for us and for users to provide a Bundle::Bio-SeqIO.


> Now, this doesn't address several related issues, such as how we handle 
> versioning of the independent modules (should be in a controlled 
> manner),

When a module is changed, it gets a version bump. Nothing complicated 
needs to be done. Transparent and obvious, behaving like all other CPAN 
modules would be my choice.


> what we do about deprecated modules which linger about on CPAN,

Delete them from CPAN seems appropriate.


> how we deal with PPMs/RPMs/packaging, and so on.  All have possible 
> reasonable ways they can be addressed, I believe.  Also, I think we 
> should still think about doing regular full-scale 'stable' (1.#) 
> releases (sort of our stamp of approval for that batch of modules at 
> that point in time, with a reasonable 'sell-by' date).

Yes, we can still choose to take a snapshot and announce it to the 
world, but at the module-level nothing special would happen. There would 
just be an updated Bundle::Bioperl-everything (or whatever).


> Again, it should be seriously discussed among the core devs and the 
> bioperl community at large prior to any serious work on it, and it would 
> be quite a large-scale project, but possibly worth it.  It can only go 
> forward if there is enough momentum behind it.

The requirement for this approach is per-module test scripts. Which as I 
identified already, is very desirable anyway so we can hit 100% test 
coverage.

So, regardless of anything else can we all agree that per-module test 
scripts are a good idea and should be worked on? If so, I'll look into 
the feasibility and figure out how much work will be involved.


From cjfields at uiuc.edu  Thu Jun 28 13:17:50 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 12:17:50 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683DBEA.90005@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
Message-ID: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>


On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote:

> ...
> I'd probably stay away from something like this. My primary reason  
> being, off-the-top-of-my-head I don't see how to get it to work. If  
> you're installing Bio::SeqIO for the first time via CPAN you can't  
> ask it to install Bio::SeqIO::genbank et al. at the same time  
> because Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some  
> circularity.

True...

> I also wouldn't want these things to be complicated. There should  
> be little in the way of questions to ask during install. Each  
> module's Build.PL should be ultra-simple with no advanced logic at  
> all. It should just specify things that are absolute requirements.  
> This simplicity helps avoid some of the problems we face by  
> distributing the monolithic Bioperl.
>
> No, much better for us and for users to provide a Bundle::Bio-SeqIO.

I just don't want too much Bundle-itis, as it'll get confusing for
newbies (i.e. Vista-itis, or AdobeCS-itis).  It should be limited to
functional grouping (SeqIO, AlignIO, DB, etc), 'install everything',  
or distribution-specific (GBrowse).

I also think (though Hilmar may veto this) that we should work on  
integrating bioperl-db, network, etc. into this if it goes forward.

Here's a question: how do we plan on handling uploading bioperl  
updates to CPAN via PAUSE?  Do we want to run every single module  
through one pumpkin?  Or do we want to have a core dev group PAUSE  
account?  I can see, for instance, removing everything
EUtilities-related and submitting it independently using my own PAUSE
account, but it would be nice to have it under an umbrella
'bioperl-devs' account instead.

>> Now, this doesn't address several related issues, such as how we  
>> handle versioning of the independent modules (should be in a  
>> controlled manner),
>
> When a module is changed, it gets a version bump. Nothing  
> complicated needs to be done. Transparent and obvious, behaving  
> like all other CPAN modules would be my choice.
>
>> what we do about deprecated modules which linger about on CPAN,
>
> Delete them from CPAN seems appropriate.

I know you can do that via PAUSE, but I think it lingers about on  
search.cpan.org (unless that's been fixed).  This would prob. have to  
be used sparingly.

>> how we deal with PPMs/RPMs/packaging, and so on.  All have  
>> possible reasonable ways they can be addressed, I believe.  Also,  
>> I think we should still think about doing regular full-scale  
>> 'stable' (1.#) releases (sort of our stamp of approval for that
>> batch of modules at that point in time, with a reasonable
>> 'sell-by' date).
>
> Yes, we can still choose to take a snapshot and announce it to the  
> world, but at the module-level nothing special would happen. There  
> would just be an updated Bundle::Bioperl-everything (or whatever).

Right, it would basically be a stamp of certification.

>> Again, it should be seriously discussed among the core devs and  
>> the bioperl community at large prior to any serious work on it,  
>> and it would be quite a large-scale project, but possibly worth  
>> it.  It can only go forward if there is enough momentum behind it.
>
> The requirement for this approach is per-module test scripts. Which  
> as I identified already, is very desirable anyway so we can hit  
> 100% test coverage.
>
> So, regardless of anything else can we all agree that per-module  
> test scripts are a good idea and should be worked on? If so, I'll  
> look into the feasibility and figure out how much work will be  
> involved.

I think so, but the feasibility issue is critical.  Do we want
cvs/svn to be divided up into 900 subdirectories (one for each
module), or do we want to have a similar directory structure as we
have now, but with each module in its own directory?  Or leave
everything as is and generate Build.PL on-the-fly (prob. least
feasible)?

This is where it might be wise to do it piece-meal at first (maybe  
starting with something somewhat segregated like Bio::Tools), then  
progress from there.

chris


From hartzell at alerce.com  Thu Jun 28 13:38:48 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 13:38:48 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
	<18051.44281.831316.749586@almost.alerce.com>
	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
Message-ID: <18051.61992.627473.323346@almost.alerce.com>

David Messina writes:
 > > [George]
 > > Likewise, you probably DON'T want to use this in your config file:
 > >
 > > 	  enable-auto-props = yes
 > > 	  * = svn:keywords="Author Date Id Rev URL"
 > >
 > > since it'll do the same thing.
 > 
 > Ah, so I've been doing it wrong all along then. :) Thanks, George!

It's not *wrong* if it's never done anything to you that you've
regretted.  The right answer depends on your situation....

 > [...]
 > I've googled around and gathered the following as a possible list for  
 > our repo. Since I obviously don't know what I'm doing :), of course  
 > adjust and refine as necessary.
 > 

That's a great starting point.  Do you have write access to the wiki?
Could you link it off of the instructions for using svn?

g.


From hartzell at alerce.com  Thu Jun 28 14:06:50 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 14:06:50 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683C385.3050904@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
Message-ID: <18051.63674.685297.426813@almost.alerce.com>

Sendu Bala writes:
 > [...]
 > I tried again in the same location and it told me I had to 'svn 
 > cleanup', which I did. But subsequently it kept complaining about files 
 > already being there.

You need to do the cleanup because svn exited gracelessly and you
needed to help it get back on its feet.  The cleanup doesn't remove
the stuff that you did get checked out, so it's still there getting in
the way of your new checkout.

 > [...]
 > svn co 
 > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
 > 
 > causes this repeatable problem:
 > 
 > [...]
 > A    data/phredfile.phd
 > svn: In directory 'data'
 > svn: Can't move source to dest
 > svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 
 > 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory
 > 
 > That is with Mac OS X svn command-line client, version 1.4.4
 > 
 > I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with 
 > a linux svn command-line client, version 1.2.3.

I'm not 100% sure what's going on here, but I'm inclined to say "get a
real computer" (and yes, I'm typing this on a mac...).  I have a mac
pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony
the tiger used to say)....

I think that we're having trouble with case sensitivity.  My only
evidence is that I can see where there have been both HUMBETGLOA.FASTA
and HUMBETGLOA.fasta in the tree at various times.  I can't figure out
anything else that's weird about that file.  On the other hand, I
can't see how this would cause the error you're seeing.
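
One way to look for this kind of clash is sketched below. The function
name case_collisions is made up for illustration, and it only compares
names within whatever listing it is given, so it would not notice two
spellings that existed at different revisions.

```shell
#!/bin/sh
# case_collisions (hypothetical helper): read one path per line on stdin
# and print every path whose lower-cased form matches an earlier,
# differently spelled path, together with that earlier path.
case_collisions() {
    awk '
        {
            key = tolower($0)
            if (key in seen && seen[key] != $0) {
                # Print the first spelling once, then each clashing one.
                if (!(key in reported)) {
                    print seen[key]
                    reported[key] = 1
                }
                print $0
            } else if (!(key in seen)) {
                seen[key] = $0
            }
        }
    '
}
```

Fed with a recursive listing, e.g. "svn ls -R <URL> | case_collisions",
any output would name files that cannot coexist in one directory on a
case-insensitive filesystem.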

The experiment would be to grab a usb or firewire disk (or even a
memory stick), partition/format it as case sensitive (or even *unix*)
and try to do

 svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data

into it.  If it works, voila.  If not, I'll keep making stuff up, err,
thinking about it.
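
Before running that experiment it may be worth confirming that the
freshly formatted volume really is case sensitive. A minimal sketch,
assuming a POSIX shell with mktemp available; the function name
fs_is_case_sensitive is made up for illustration:

```shell
#!/bin/sh
# fs_is_case_sensitive DIR (hypothetical helper): exit 0 if the
# filesystem holding DIR distinguishes "probe" from "PROBE", 1 if it
# does not, and 2 if the scratch directory cannot be created.
fs_is_case_sensitive() {
    probe_dir=$(mktemp -d "${1:-.}/casetest.XXXXXX") || return 2
    : > "$probe_dir/probe"
    if [ -e "$probe_dir/PROBE" ]; then
        rc=1    # the upper-case name resolves to the same file
    else
        rc=0    # the two names are distinct
    fi
    rm -rf "$probe_dir"
    return "$rc"
}
```

Called with the mount point of the test disk, a zero exit status means
the checkout can be tried there; a status of 1 means the volume would
hit the same limitation as the default Mac filesystem.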

g.


From dmessina at wustl.edu  Thu Jun 28 14:15:32 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 13:15:32 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu>
Message-ID: <459D9BC0-4FBA-4560-80A8-E6243DE9D9CC@wustl.edu>

Same svn error here on the full checkout.


> What local (mac) svn version are you using?  I'm running off macports:
>
> svn --version
> svn, version 1.4.4 (r25188)
>     compiled Jun 16 2007, 23:40:53

I have svn 1.4.3.

% svn --version
svn, version 1.4.3 (r23084)
    compiled Apr  1 2007, 02:47:14

Copyright (C) 2000-2006 CollabNet.
Subversion is open source software, see http://subversion.tigris.org/
This product includes software developed by CollabNet (http://www.Collab.Net/).

The following repository access (RA) modules are available:

* ra_dav : Module for accessing a repository via WebDAV (DeltaV) protocol.
   - handles 'http' scheme
* ra_svn : Module for accessing a repository using the svn network protocol.
   - handles 'svn' scheme
* ra_local : Module for accessing a repository on local disk.
   - handles 'file' scheme


From cjfields at uiuc.edu  Thu Jun 28 14:54:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 13:54:15 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18051.63674.685297.426813@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
Message-ID: <D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>


On Jun 28, 2007, at 1:06 PM, George Hartzell wrote:

> ...
> I'm not 100% sure what's going on here, but I'm inclined to say "get a
> real computer" (and yes, I'm typing this on a mac...).  I have a mac
> pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony
> the tiger used to say)....

Ouch!  Though it could be worse (**coughwindowscough**).

> I think that we're having trouble with case sensitivity.  My only
> evidence is that I can see where there have been both HUMBETGLOA.FASTA
> and HUMBETGLOA.fasta in the tree at various times.  I can't figure out
> anything else that's weird about that file.  On the other hand, I
> can't see how this would cause the error you're seeing though.

Odd that other branches (including the main trunk) work but that one  
doesn't.

> The experiment would be to grab a usb or firewire disk (or even a
> memory stick), partition/format it as case sensitive (or even *unix*)
> and try to do
>
>  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
>
> into it.  If it works, voila.  If not, I'll keep making stuff up, err,
> thinking about it.
>
> g.

I'll have to figure out why I can't get ssh keys to work locally to  
test it out more (I have a usb drive to test with); just don't have  
time at the moment.

chris


From dmessina at wustl.edu  Thu Jun 28 14:47:04 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 13:47:04 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18051.61992.627473.323346@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
	<18051.44281.831316.749586@almost.alerce.com>
	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
Message-ID: <0027C4E0-26B1-41F3-8FD8-EAB5465CA80E@wustl.edu>

> That's a great starting point.  Do you have write access to the wiki?
> Could you link it off of the instructions for using svn?

Done.

http://www.bioperl.org/wiki/Svn_auto-props

linked from:
http://www.bioperl.org/wiki/Using_Subversion (bottom of page)


From bix at sendu.me.uk  Thu Jun 28 15:19:35 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 20:19:35 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
Message-ID: <468409C7.7020102@sendu.me.uk>

Chris Fields wrote:
> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote:
> Here's a question: how do we plan on handling uploading bioperl  
> updates to CPAN via PAUSE?  Do we want to run every single module  
> through one pumpkin?  Or do we want to have a core dev group PAUSE  
> account?  I can see, for instance, removing everything EUtilities- 
> related and submitting it independently using my own PAUSE account,  
> but it would be nice to have it under an umbrella 'bioperl-devs'  
> account instead.

All Bioperl modules (except the Bundle!) are owned by BIOPERLML on
PAUSE. It's a little awkward since PAUSE is uploader-centric, but see my
notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release

And everything that wants to consider itself part of Bioperl
(and gain the benefit of lots of devs looking after it) should certainly
have BIOPERLML as the primary owner.


> I think so, but the feasibility issue is critical.  Do we want cvs/ 
> svn to be divided up into 900 subdirectories (one for each module),  
> or do we want to have a similar directory structure as we have now,  
> but with each module in it's own directory?  Or leave everything as  
> is and generate Build.PL on-the-fly (prob. least feasible)?

Very definitely the latter. The key benefit of my approach is that the 
organisation stays as is and that a snapshot of the repository remains a 
single directory of modules in Bio so that people don't have to 
'install' Bioperl, they can still just uncompress the archive (or check 
out the package from svn) and point their PERL5LIB to the root dir of 
the package.
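
For anyone who has not used a checkout this way, the idea can be sketched in shell as below. The helper name add_to_perl5lib and the path /path/to/bioperl-live are placeholders for illustration, not anything that ships with Bioperl.

```shell
# Prepend a directory to PERL5LIB without leaving a stray colon
# when PERL5LIB was previously empty or unset.
add_to_perl5lib() {
    PERL5LIB="$1${PERL5LIB:+:$PERL5LIB}"
    export PERL5LIB
}

# Placeholder path: the root of the unpacked archive or svn checkout,
# i.e. the directory that contains Bio/.
add_to_perl5lib /path/to/bioperl-live
```

With the real path filled in, something like `perl -MBio::Seq -e 1` should then pick the module up from the checkout rather than from any installed copy.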

For that reason I very much like the idea of folding the current
split-out packages (run, network etc.) back into the core package so
everything is in one place. Folding them back in should obviously wait
until everything is in place and working with core already.


My proposal obviously wasn't very clear. As far as all other devs are 
concerned, nothing changes at all (except for lots of new improved test 
scripts). The pumpkin will, however, be able to say:

./Build dist

Right now that generates the distribution archives (in different 
compression formats) - one big archive containing everything.
My proposal is simply that instead it generates lots of archives, one 
archive per module. It will also generate some Bundles and whatever else 
might be needed.
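
To make "one archive per module" a little more concrete, here is a sketch of just the naming step: turning a module's path under the repository root into a CPAN-style distribution name (the usual convention of replacing :: with -). The function name dist_name is hypothetical, and this is not how Build.PL does or would do it; it only illustrates the mapping.

```shell
# Map a module path relative to the repository root to a CPAN-style
# distribution name, e.g. Bio/SeqIO/fasta.pm -> Bio-SeqIO-fasta.
dist_name() {
    printf '%s\n' "$1" | sed -e 's/\.pm$//' -e 's|/|-|g'
}

dist_name Bio/SeqIO/fasta.pm
```

Each such name would then get its own versioned archive, alongside whatever Bundles are generated.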

I don't envisage any major difficulties in achieving this. The 
'feasibility' issue I was going to look into was strictly regarding 
doing all the new test scripts.


From hartzell at alerce.com  Thu Jun 28 15:43:38 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 15:43:38 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
Message-ID: <18052.3946.224905.415905@almost.alerce.com>

Chris Fields writes:
 > 
 > On Jun 28, 2007, at 1:06 PM, George Hartzell wrote:
 > 
 > > ...
 > > I'm not 100% sure what's going on here, but I'm inclined to say "get a
 > > real computer" (and yes, I'm typing this on a mac...).  I have a mac
 > > pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony
 > > the tiger used to say)....
 > 
 > Ouch!  Though it could be worse (**coughwindowscough**).
 > 
 > > I think that we're having trouble with case sensitivity.  My only
 > > evidence is that I can see where there have been both HUMBETGLOA.FASTA
 > > and HUMBETGLOA.fasta in the tree at various times.  I can't figure out
 > > anything else that's weird about that file.  On the other hand, I
 > > can't see how this would cause the error you're seeing though.
 > 
 > Odd that other branches (including the main trunk) work but that one  
 > doesn't.
 > 
 > > The experiment would be to grab a usb or firewire disk (or even a
 > > memory stick), partition/format it as case sensitive (or even *unix*)
 > > and try to do
 > >
 > >  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
 > >
 > > into it.  If it works, voila.  If not, I'll keep making stuff up, err,
 > > thinking about it.
 > >
 > > g.
 > 
 > I'll have to figure out why I can't get ssh keys to work locally to  
 > test it out more (I have a usb drive to test with); just don't have  
 > time at the moment.

I just did the experiment, and filename case-insensitivity seems to be
breaking something.

I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.

I reformatted a memory stick to be case sensitive, and a checkout of

  bioperl/bioperl-live/tags/release-0-9-2/t 

into it worked.  Then I tried the same checkout into a directory in my
home dir (the normal, case-insensitive mac filesystem) and got the same
error as above.

I can get a copy of the trunk, so I'm inclined to ask someone to
mention the problem on the wiki and then just ignore it.
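
If someone does write this up for the wiki, a small filter like the following could help spot the offending names. It is a sketch that assumes the input is a plain list of paths, one per line (for example from `svn ls -R` on the tag); the function name find_case_collisions is made up.

```shell
# Read path names on stdin and print each lowercased name that occurs more
# than once, i.e. names that would collide on a case-insensitive filesystem.
# (Exact duplicate lines in the input would be reported too.)
find_case_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

printf '%s\n' t/data/HUMBETGLOA.FASTA t/data/HUMBETGLOA.fasta t/data/test.txt |
    find_case_collisions
# prints: t/data/humbetgloa.fasta
```

An empty result for the trunk and a non-empty one for the tag would fit what the experiment showed, though it would not prove the cause.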

g.


From cjfields at uiuc.edu  Thu Jun 28 16:29:09 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 15:29:09 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <468409C7.7020102@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
Message-ID: <026156F4-4C46-4CC6-82B5-07FC5326A244@uiuc.edu>


On Jun 28, 2007, at 2:19 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote:
>> Here's a question: how do we plan on handling uploading bioperl
>> updates to CPAN via PAUSE?  Do we want to run every single module
>> through one pumpkin?  Or do we want to have a core dev group PAUSE
>> account?  I can see, for instance, removing everything EUtilities-
>> related and submitting it independently using my own PAUSE account,
>> but it would be nice to have it under an umbrella 'bioperl-devs'
>> account instead.
>
> All Bioperl modules (except the Bundle!) are owned by BIOPERLML on
> PAUSE. Its a little akward since PAUSE is uploader-centric, but see my
> notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release
>
> And certainly, everything that wants to consider itself part of  
> Bioperl
> (and gain the benefit of lots of devs looking after it) should  
> certainly
>   have BIOPERLML as the primary owner.

Alrighty then.

>> I think so, but the feasibility issue is critical.  Do we want cvs/
>> svn to be divided up into 900 subdirectories (one for each module),
>> or do we want to have a similar directory structure as we have now,
>> but with each module in it's own directory?  Or leave everything as
>> is and generate Build.PL on-the-fly (prob. least feasible)?
>
> Very definitely the latter. The key benefit of my approach is that the
> organisation stays as is and that a snapshot of the repository  
> remains a
> single directory of modules in Bio so that people don't have to
> 'install' Bioperl, they can still just uncompress the archive (or  
> check
> out the package from svn) and point their PERL5LIB to the root dir of
> the package.

Okay, makes sense.

> For that reason I very much like the idea of folding the current
> split-out packages (run, network etc.) back into the core package so
> everything is one place. Folding them back in should obviously wait
> until everything is in place and working with core already.

I agree, but that's up to Brian, Hilmar, and the others who donated  
the packages (or at least a consensus of core devs).  One thing at a  
time.

> My proposal obviously wasn't very clear. As far as all other devs are
> concerned, nothing changes at all (except for lots of new improved  
> test
> scripts). The pumpkin will, however, be able to say:
>
> ./Build dist
>
> Right now that generates the distribution archives (in different
> compression formats) - one big archive containing everything.
> My proposal is simply that instead it generates lots of archives, one
> archive per module. It will also generate some Bundles and whatever  
> else
> might be needed.

We'll need to define which tests and data go with each module and  
so on.

> I don't envisage any major difficulties in achieving this. The
> 'feasibility' issue I was going to look into was strictly regarding
> doing all the new test scripts.

Okay.  Maybe it's worth doing on a branch as a test run when 1.5.3
is ready to go.  We'll still need to get thoughts on this from other
core devs out there, and it prob. should wait until everybody is
comfortable with the idea.

chris


From dmessina at wustl.edu  Thu Jun 28 18:13:48 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 17:13:48 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
Message-ID: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>

Coming late to this party, I'm replying to snippets from multiple  
emails.


> [Chris]
> what we do about deprecated modules which linger
> about on CPAN

> [Sendu]
> Delete them from CPAN seems appropriate.

I coulda sworn this was frowned upon, but a recent thread suggests  
it's totally kosher.

	http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html


> [Sendu]
> So, regardless of anything else can we all agree that per-module test
> scripts are a good idea and should be worked on?

I agree.


> [Sendu]
> people don't have to
> 'install' Bioperl, they can still just uncompress the archive (or  
> check
> out the package from svn) and point their PERL5LIB to the root dir of
> the package.

Could you elaborate a bit on how this works? How is XS code that  
needs compiling handled? Or the scripts directory? I would love to be  
able to do this.


> [Sendu]
> For that reason I very much like the idea of folding the current
> split-out packages (run, network etc.) back into the core package so
> everything is one place. Folding them back in should obviously wait
> until everything is in place and working with core already.

 From an organizational standpoint, I'm concerned that with ~900  
modules in core right now, adding all of the additional stuff from  
the split-out packages would make for a daunting directory.

But as you said, this is way down the road, so this proposal doesn't  
bear on the other, closer-to-now issues on the table.


> [Chris]
> Okay.  Maybe it's worth doing on a branch  as a test run when 1.5.3
> is ready to go.  We'll still need to get thoughts on this from other
> core devs out there, and it prob. should until everybody is
> comfortable with the idea.

If we go forward with the CPAN split plan, I like the idea of having  
a trial. We can foresee some of the issues that such a change may  
bring, and yet still more no doubt wait for us once we do it.


Dave


From bix at sendu.me.uk  Thu Jun 28 18:59:35 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 23:59:35 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
Message-ID: <46843D57.2080409@sendu.me.uk>

David Messina wrote:
>> people don't have to 'install' Bioperl, they can still just
>> uncompress the archive (or check out the package from svn) and
>> point their PERL5LIB to the root dir of the package.
> 
> Could you elaborate a bit on how this works? How is XS code that 
> needs compiling handled? Or the scripts directory? I would love to be
> able to do this.

I meant for the most part. Core doesn't have any XS code so that's not 
an issue. Scripts can be run manually like any other perl script. When 
you discover something isn't working because of a missing external 
dependency, you just install it. (But that happens very rarely.)

Personally I've /never/ installed Bioperl and used that installed set of 
modules. I've always just pointed my PERL5LIB at the distribution folder 
or my cvs checkout.

Which makes me a strange candidate for advocating all these 
CPAN-specific changes, but there you go ;)


From cjfields at uiuc.edu  Thu Jun 28 19:03:02 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 18:03:02 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
Message-ID: <8B6FBB52-5CCE-4122-876C-B9827C86E46E@uiuc.edu>


On Jun 28, 2007, at 5:13 PM, David Messina wrote:

> Coming late to this party, I'm replying to snippets from multiple  
> emails.
>
>
>> [Chris]
>> what we do about deprecated modules which linger
>> about on CPAN
>
>> [Sendu]
>> Delete them from CPAN seems appropriate.
>
> I coulda sworn this was frowned upon, but a recent thread suggests  
> it's totally kosher.
>
> 	http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html

As long as it doesn't show up somewhere to confuse newbies I'm okay  
with it.

>> [Sendu]
>> people don't have to
>> 'install' Bioperl, they can still just uncompress the archive (or  
>> check
>> out the package from svn) and point their PERL5LIB to the root dir of
>> the package.
>
> Could you elaborate a bit on how this works? How is XS code that  
> needs compiling handled? Or the scripts directory? I would love to  
> be able to do this.

Maybe Sendu can add to this, but the XS code is limited to
bioperl-ext AFAIK.  We could keep that separate until it plays well with
bioperl itself.

Scripts and examples - maybe packaged along with a Bundle?

>> [Sendu]
>> For that reason I very much like the idea of folding the current
>> split-out packages (run, network etc.) back into the core package so
>> everything is one place. Folding them back in should obviously wait
>> until everything is in place and working with core already.
>
> From an organizational standpoint, I'm concerned that with ~900  
> modules in core right now, adding all of the additional stuff from  
> the split-out packages would make for a daunting directory.
>
> But as you said, this is way down the road, so this proposal  
> doesn't bear on the other, closer-to-now issues on the table.

Well, the code in bioperl-db and network complements code in core, so
I agree with Sendu they belong there.  They should be under the same
scrutiny as the rest anyway (code, tests, etc), but won't be bundled
unless there is an 'install everything' Bundle.

>> [Chris]
>> Okay.  Maybe it's worth doing on a branch  as a test run when 1.5.3
>> is ready to go.  We'll still need to get thoughts on this from other
>> core devs out there, and it prob. should until everybody is
>> comfortable with the idea.
>
> If we go forward with the CPAN split plan, I like the idea of  
> having a trial. We can foresee some of the issues that such a  
> change may bring, and yet still more no doubt wait for us once we  
> do it.

That's what branches are for; testing stuff out like this.

chris


From hartzell at alerce.com  Thu Jun 28 19:05:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 19:05:32 -0400
Subject: [Bioperl-l] problem with binary files.
Message-ID: <18052.16060.932502.183552@almost.alerce.com>


Ok, after pointing out the problem with setting the svn:keywords
property on binary files, it turns out that I *did* that.  Worse yet,
I set the svn:eol-style to 'native' on everything, including binary
files, so depending on your platform they're likely to be fubar.

For example, bioperl-run/t/data/H_pylori_J99.glimmer2.icm may or may
not be what you expect it to be, depending on whether your eol-style
matches the server's and whether any conversions were done.

I'll touch up the way that the little tool I'm using calls cvs2svn and
redo the repository.
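
For a working copy that already has the bad properties, the cleanup could look roughly like the sketch below. It only prints the svn commands rather than running them, it uses "contains a NUL byte" as a crude stand-in for "is binary", and the function names are invented for illustration; the real fix is still redoing the conversion.

```shell
# Succeed if the file contains at least one NUL byte (a rough binary heuristic).
has_nul_bytes() {
    total=$(wc -c < "$1")
    stripped=$(tr -d '\000' < "$1" | wc -c)
    test $((total - stripped)) -gt 0
}

# Print (do not run) svn commands that drop the text-only properties from a
# binary file and mark it as binary instead. Prints nothing for text files.
suggest_prop_fix() {
    if has_nul_bytes "$1"; then
        echo "svn propdel svn:eol-style $1"
        echo "svn propdel svn:keywords $1"
        echo "svn propset svn:mime-type application/octet-stream $1"
    fi
}

# Example (path taken from the message above):
#   suggest_prop_fix bioperl-run/t/data/H_pylori_J99.glimmer2.icm
```

Note that removing the properties does not repair a file whose bytes were already altered by an eol conversion; such a file would need to be restored from a good copy.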

g.


From n.haigh at sheffield.ac.uk  Fri Jun 29 02:59:21 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 29 Jun 2007 07:59:21 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
References: <467949EC.9040100@sendu.me.uk>	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>	<4682C6F5.4020406@sendu.me.uk>
	<4682D12E.3000803@sendu.me.uk>	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>	<4682E824.1050507@sendu.me.uk>	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>	<4683624F.6020402@sendu.me.uk>	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
Message-ID: <4684ADC9.8040404@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

- -- split --
>> [Sendu]
>> For that reason I very much like the idea of folding the current
>> split-out packages (run, network etc.) back into the core package so
>> everything is one place. Folding them back in should obviously wait
>> until everything is in place and working with core already.
> 
>  From an organizational standpoint, I'm concerned that with ~900  
> modules in core right now, adding all of the additional stuff from  
> the split-out packages would make for a daunting directory.
> 
> But as you said, this is way down the road, so this proposal doesn't  
> bear on the other, closer-to-now issues on the table.
> 

I don't think this is an issue - it would simply mean everything is
under the same version control hierarchy. And with svn it's Soooooo much
easier to fiddle around with directory structures.

> 
> 
>> [Chris]
>> Okay.  Maybe it's worth doing on a branch  as a test run when 1.5.3
>> is ready to go.  We'll still need to get thoughts on this from other
>> core devs out there, and it prob. should until everybody is
>> comfortable with the idea.
> 
> If we go forward with the CPAN split plan, I like the idea of having  
> a trial. We can foresee some of the issues that such a change may  
> bring, and yet still more no doubt wait for us once we do it.
> 

Under svn it would be easy to make an "svn copy" of run, network etc.
into a branch of live to test this out. Not that this is likely to be a
problem, but since we are looking at the bioperl-* packages being under
the same svn repository, "svn copy" operations are cheap in disk space.

> 
> Dave

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGhK3JczuW2jkwy2gRAtI2AJ4kNrpGY8XMMh9KxOqs+l0PrEVcwgCfVFj6
BCvltmPyWF4ImueYmd7VFAc=
=ktl+
-----END PGP SIGNATURE-----


From n.haigh at sheffield.ac.uk  Fri Jun 29 03:05:33 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 29 Jun 2007 08:05:33 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
 Perltidy]
In-Reply-To: <18051.61992.627473.323346@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
Message-ID: <4684AF3D.5090907@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George Hartzell wrote:

- -- snip --

>  > [...]
>  > I've googled around and gathered the following as a possible list for  
>  > our repo. Since I obviously don't know what I'm doing :), of course  
>  > adjust and refine as necessary.
>  > 
> 
> That's a great starting point.  Do you have write access to the wiki?
> Could you link it off of the instructions for using svn?
> 
> g.

Don't .t files need adding to the auto-props?
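
One way to check a local config for this is sketched below. It assumes the usual auto-props line format of PATTERN = PROPNAME=VALUE[;...], it ignores which section a line sits in and whether enable-auto-props is switched on, and the function name has_autoprop is made up.

```shell
# Succeed if config file $2 has a non-commented line whose key (the text
# before the first '=') is exactly the pattern $1, e.g. '*.t'.
has_autoprop() {
    awk -F'=' -v p="$1" '
        { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key); if (key == p) found = 1 }
        END { exit !found }
    ' "$2"
}

# Example:
#   has_autoprop '*.t' ~/.subversion/config || echo "no auto-props entry for *.t"
```

If the check fails, a line for *.t with the same properties already used for *.pl and *.pm would presumably be the thing to add, but that is a guess until the list is agreed.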

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGhK89czuW2jkwy2gRAnRGAJ0VnBNVBAdQdfUnqPhmvsyQnD/bswCggSHC
/Iivb6Lc4/51bUdrTmRQYlE=
=V+t2
-----END PGP SIGNATURE-----


From sac at bioperl.org  Fri Jun 29 04:25:36 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Fri, 29 Jun 2007 01:25:36 -0700
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
Message-ID: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>

On 6/27/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On Jun 26, 2007, at 3:21 PM, George Hartzell wrote:
>
> > ...
> > If you have a dev.open-bio.org account and you're in the bioperl
> > group, you're good to get at it via:
> >
> >   file:///home/hartzell/bioperl
> >
> > or
> >
> >   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>
> I managed to get it working using file://.  Haven't tried svn+ssh yet
> but I've had persistent problems getting ssh to work properly on my
> macbook; not sure why yet but I haven't had time to play around with it.

Are you using the ssh that comes installed with OSX? If so, I'd
recommend installing openssh from MacPorts. I recall having issues
with the stock version which were resolved by using the more
up-to-date version you can get via MacPorts.

BTW, I haven't been able to check out the new svn repository via
svn+ssh:// because I can't get svn to authenticate with an alternative
username. My username on dev.open-bio.org differs from what it is on
my local machine, so I issue a command such as:

steve at localhost $ svn --username sac checkout svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk

but I get challenged with:
steve at dev.open-bio.org's password:

I also tried putting the --username argument after the subcommand, but
it still wants to use my local username. I can ssh -l sac into the dev
box no problem. Any suggestions?

Steve


From bix at sendu.me.uk  Fri Jun 29 04:52:42 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 29 Jun 2007 09:52:42 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
Message-ID: <4684C85A.5030206@sendu.me.uk>

Steve Chervitz wrote:
> BTW, I haven't been able to check out the new svn repository via
> svn+ssh:// because I can't get svn to authenticate with an alternative
> username. My username on dev.open-bio.org differs from what it is on
> my local machine, so I issue a command such as:
> 
> steve at localhost $ svn --username sac checkout
> svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
> 
> but I get challenged with:
> steve at dev.open-bio.org's password:
> 
> I also tried putting the --username argument after the subcommand, but
> it still wants to use my local username. I can ssh -l sac into the dev
> box no problem. Any suggestions?

Set up your ssh key on the dev machine. I'm also on a machine with the 
wrong username and it works even without attempting to supply the 
correct one.
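
In case the key setup is the sticking point, here is a sketch of the remote half of it: adding a public key to an authorized_keys file without duplicating it. The function name install_pubkey is hypothetical and the paths in the example are the conventional OpenSSH ones, assumed rather than checked on the dev machine; where it is available, ssh-copy-id does much the same job.

```shell
# Append the public key in file $1 to the authorized_keys file $2,
# unless that exact line is already present.
install_pubkey() {
    pubkey_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    if ! grep -qxF -- "$(cat "$pubkey_file")" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    # sshd commonly refuses key files that other users can write to.
    chmod 600 "$auth_file"
}

# Example, run on the remote machine after copying the .pub file over:
#   install_pubkey id_rsa.pub ~/.ssh/authorized_keys
```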

It does, however, show the 'Welcome to the new developer system' message 
2 or 3 times for every svn+ssh action, which freaks me out a little.


From N.Haigh at sheffield.ac.uk  Fri Jun 29 05:32:38 2007
From: N.Haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 29 Jun 2007 10:32:38 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
Message-ID: <1183109558.4684d1b69bcec@webmail.shef.ac.uk>

Quoting Steve Chervitz <sac at bioperl.org>:

-- snip --

> BTW, I haven't been able to check out the new svn repository via
> svn+ssh:// because I can't get svn to authenticate with an alternative
> username. My username on dev.open-bio.org differs from what it is on
> my local machine, so I issue a command such as:
> 
> steve at localhost $ svn --username sac checkout
> svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
> 
> but I get challenged with:
> steve at dev.open-bio.org's password:
> 
> I also tried putting the --username argument after the subcommand, but
> it still wants to use my local username. I can ssh -l sac into the dev
> box no problem. Any suggestions?
> 
> Steve
> 


You could try:
svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
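
The list archive appears to rewrite "@" as " at ", so the URL above is really of the form user@host. A minimal sketch of assembling such a URL follows. The helper name svn_ssh_url is made up for illustration, and "sac" is the remote username from Steve's message:

```shell
# svn_ssh_url USER HOST PATH
# Hypothetical helper (not part of svn): builds an svn+ssh URL that
# carries the remote username.
svn_ssh_url() {
    printf 'svn+ssh://%s@%s%s\n' "$1" "$2" "$3"
}

url=$(svn_ssh_url sac dev.open-bio.org /home/hartzell/bioperl/bioperl-live/trunk)

# Print the checkout command instead of running it, so this is safe to paste.
echo "svn checkout $url"
```

With the username inside the URL, it is handed to ssh as part of the host specification, so svn's own --username option does not need to be involved.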

Nath


From dmessina at wustl.edu  Fri Jun 29 08:28:26 2007
From: dmessina at wustl.edu (David Messina)
Date: Fri, 29 Jun 2007 07:28:26 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
Message-ID: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>

>
> BTW, I haven't been able to check out the new svn repository via
> svn+ssh:// because I can't get svn to authenticate with an alternative
> username.

I have the same issue. I set up a stanza in my ~/.ssh/config:

Host dev.open-bio.org
   User dave_messina

where dave_messina is my dev.open-bio.org username.
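
For anyone who wants to script this, here is a rough sketch that appends such a stanza only if the host is not already listed. The function name add_host_user and the scratch path are illustrative assumptions. Point it at ~/.ssh/config only after checking what is already in there.

```shell
# add_host_user CONFIG HOST USER
# Hypothetical helper: append a Host/User stanza unless one for HOST exists.
add_host_user() {
    config=$1 host=$2 user=$3
    mkdir -p "$(dirname "$config")"
    touch "$config"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if ! grep -qxF "Host $host" "$config"; then
        printf 'Host %s\n   User %s\n' "$host" "$user" >> "$config"
    fi
}

# Try it on a scratch file rather than the real ~/.ssh/config.
scratch=$(mktemp -d)
add_host_user "$scratch/config" dev.open-bio.org dave_messina
add_host_user "$scratch/config" dev.open-bio.org dave_messina   # second call is a no-op
cat "$scratch/config"
```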


From cjfields at uiuc.edu  Fri Jun 29 13:00:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 29 Jun 2007 12:00:27 -0500
Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository]
In-Reply-To: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
Message-ID: <F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>


On Jun 29, 2007, at 7:28 AM, David Messina wrote:

>>
>> BTW, I haven't been able to check out the new svn repository via
>> svn+ssh:// because I can't get svn to authenticate with an  
>> alternative
>> username.
>
> I have the same issue. I set up a stanza in my ~/.ssh/config:
>
> Host dev.open-bio.org
>    User dave_messina
>
> where dave_messina is my dev.open-bio.org username.

I changed to the macports ssh w/o luck.  It appears the key is  
offered up, so maybe the problem is how I have everything set up on  
dev (though I followed everything on the wiki):

....
  Contact 'support at open-bio.org' for
your new login information.
======================================
debug1: Authentications that can continue: publickey,gssapi-with-mic,password
debug1: Next authentication method: publickey
debug1: Offering public key: /Users/cjfields/.ssh/id_dsa
debug2: we sent a publickey packet, wait for reply
debug1: Authentications that can continue: publickey,gssapi-with-mic,password
debug2: we did not send a packet, disable method
debug1: Next authentication method: password

It's odd; I can use passwordless logins for other servers (admittedly  
Mac servers) w/o problems using ssh keys, but dev.open-bio.org always  
prompts for a password regardless.

My feeling is it's something with my local ssh or sshd config; I'll  
try fiddling with it to see what happens.  Anyone have suggestions?   
I've lost enough hair as is; don't want to lose more!

chris


From sac at bioperl.org  Fri Jun 29 13:07:45 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Fri, 29 Jun 2007 10:07:45 -0700
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <1183109558.4684d1b69bcec@webmail.shef.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<1183109558.4684d1b69bcec@webmail.shef.ac.uk>
Message-ID: <8f200b4c0706291007x2b765323n75c9003a47fe7cbb@mail.gmail.com>

On 6/29/07, Nathan S. Haigh <N.Haigh at sheffield.ac.uk> wrote:
> Quoting Steve Chervitz <sac at bioperl.org>:
>
> -- snip --
>
> > BTW, I haven't been able to check out the new svn repository via
> > svn+ssh:// because I can't get svn to authenticate with an alternative
> > username. My username on dev.open-bio.org differs from what it is on
> > my local machine, so I issue a command such as:
> >
> > steve at localhost $ svn --username sac checkout
> > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
> >
> > but I get challenged with:
> > steve at dev.open-bio.org's password:
> >
> > I also tried putting the --username argument after the subcommand, but
> > it still wants to use my local username. I can ssh -l sac into the dev
> > box no problem. Any suggestions?
>
> [...]
> You could try:
> svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk

Bingo. Thanks for the tips, guys.

BTW, setting up ssh keys was not the issue, since my key is already
set up on the dev machine. The svn --username setting appears not to
be operative at the ssh layer. I suspected this might be the case
given that the usage info says:

 $ svn --help co
  --username arg           : specify a username ARG
  --password arg           : specify a password ARG

which seemed insecure. I didn't want to send my password in the clear,
and didn't know whether svn would hand it off to ssh. It wasn't
even sending my username to ssh, so I knew something was wrong. These
args are probably only intended for accessing local svn repositories,
or non-svn+ssh-based checkouts.
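
One more route, offered as a sketch rather than something tested against dev.open-bio.org: Subversion consults the SVN_SSH environment variable to decide which command to run for svn+ssh:// URLs, so the remote login name can be supplied there. The username sac is the one from this thread.

```shell
# Make svn tunnel through ssh with an explicit remote login name.
SVN_SSH="ssh -l sac"
export SVN_SSH
echo "svn will tunnel with: $SVN_SSH"

# A plain URL without a username should then reach the right account, e.g.
#   svn checkout svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
```

This keeps the username out of the working copy's stored URL, at the cost of applying to every svn+ssh host used from that shell.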

BTW, the svn+ssh check out on Mac OS X works for me. I'm using svn and
openssh installed via MacPorts:

$ svn --version
svn, version 1.4.4 (r25188)
   compiled Jun 28 2007, 23:51:53

$ ssh -version
OpenSSH_4.6p1, OpenSSL 0.9.8e 23 Feb 2007

Steve


From hartzell at alerce.com  Fri Jun 29 15:19:31 2007
From: hartzell at alerce.com (George Hartzell)
Date: Fri, 29 Jun 2007 15:19:31 -0400
Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository]
In-Reply-To: <F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
	<F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>
Message-ID: <18053.23363.102371.602742@almost.alerce.com>

Chris Fields writes:
 > 
 > On Jun 29, 2007, at 7:28 AM, David Messina wrote:
 > 
 > >>
 > >> BTW, I haven't been able to check out the new svn repository via
 > >> svn+ssh:// because I can't get svn to authenticate with an  
 > >> alternative
 > >> username.
 > >
 > > I have the same issue. I set up a stanza in my ~/.ssh/config:
 > >
 > > Host dev.open-bio.org
 > >    User dave_messina
 > >
 > > where dave_messina is my dev.open-bio.org username.
 > 
 > I changed to the macports ssh w/o luck.  It appears the key is  
 > offered up, so maybe the problem is how I have everything set up on  
 > dev (though I followed everything on the wiki):

A couple of things to check.

  - make sure that you put your public key in ~/.ssh/authorized_keys2
    (not authorized_keys)

  - make sure that authorized_keys2 is chmod'ed 600 (644 might be
    enough...).

  - make sure that ~/.ssh is chmod'ed 700.

  - make sure that your home directory is 755.

Then see if it works.  You might be able to relax some of those
protections a bit, but ssh's uptight about letting other people mess
with that data.
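
The checklist above can be applied in one go. This is a sketch only: tighten_ssh_perms is an invented name, it assumes the authorized_keys2 filename from the list, and it is meant for the account on the server (the dev machine), not the local one. When sshd runs with StrictModes enabled, it declines key authentication if the home directory or the files under ~/.ssh are writable by group or others, which is why the modes matter.

```shell
# tighten_ssh_perms HOMEDIR
# Hypothetical helper: apply the modes from the checklist and show the result.
tighten_ssh_perms() {
    home=$1
    if [ -d "$home" ]; then chmod 755 "$home"; fi
    if [ -d "$home/.ssh" ]; then chmod 700 "$home/.ssh"; fi
    if [ -f "$home/.ssh/authorized_keys2" ]; then
        chmod 600 "$home/.ssh/authorized_keys2"
    fi
    ls -ld "$home" "$home/.ssh" "$home/.ssh/authorized_keys2"
}

# Demonstrate on a scratch directory with deliberately loose permissions.
demo=$(mktemp -d)
mkdir "$demo/.ssh"
: > "$demo/.ssh/authorized_keys2"
chmod 777 "$demo" "$demo/.ssh"
tighten_ssh_perms "$demo"

# On the server itself the call would be:  tighten_ssh_perms "$HOME"
```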

g.


From dmessina at wustl.edu  Fri Jun 29 18:47:14 2007
From: dmessina at wustl.edu (David Messina)
Date: Fri, 29 Jun 2007 17:47:14 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <4684AF3D.5090907@sheffield.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
Message-ID: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>

> [Nathan]
> Don't .t files need adding to the auto-props?

Yes -- thanks for reminding me. Please feel free to add it to the  
wiki page. I'll be tweaking it some more later on in any case.


Dave


From n.haigh at sheffield.ac.uk  Sat Jun 30 05:55:56 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 30 Jun 2007 10:55:56 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
 Perltidy]
In-Reply-To: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
Message-ID: <468628AC.9060200@sheffield.ac.uk>

David Messina wrote:
>> [Nathan]
>> Don't .t files need adding to the auto-props?
> 
> Yes -- thanks for reminding me. Please feel free to add it to the wiki 
> page. I'll be tweaking it some more later on in any case.
> 
> 
> Dave

I noticed this has already been done. I have just been through the 
t/data dir and added a list of extensions I found (without props). There 
are some files without extensions; how should these be dealt with? There 
seems to be a plethora of file naming styles which means there's a 
pretty long list of non-standard extensions. So at some point someone 
will commit a new data file with a new extension (often describing what 
program created the output or the test for which it's intended) that 
won't be in the auto-props file - can you think of a way around this?
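
One way to keep such a list current is to regenerate it from the directory itself. A small sketch follows. list_extensions is a made-up name, it treats the text after the last dot as the extension, and it lumps names without a dot under "(none)". The demo file names are invented.

```shell
# list_extensions DIR
# Hypothetical helper: count the distinct extensions of regular files in DIR.
list_extensions() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}                                # strip the directory part
        case $name in
            *.*) printf '%s\n' "${name##*.}" ;;      # text after the last dot
            *)   printf '(none)\n' ;;                # no dot in the name
        esac
    done | sort | uniq -c | sort -rn
}

# Demo on a scratch directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/one.fasta" "$demo_dir/two.fasta" "$demo_dir/three.blast" "$demo_dir/README"
list_extensions "$demo_dir"
```

Comparing this output against the patterns in the auto-props list would show which extensions are not yet covered.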

Nath


From cjfields at uiuc.edu  Sat Jun 30 08:48:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 07:48:10 -0500
Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository]
In-Reply-To: <18053.23363.102371.602742@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
	<F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>
	<18053.23363.102371.602742@almost.alerce.com>
Message-ID: <3874B4EE-0119-40BC-8B92-11133A766417@uiuc.edu>


On Jun 29, 2007, at 2:19 PM, George Hartzell wrote:

> Chris Fields writes:
>>
>> On Jun 29, 2007, at 7:28 AM, David Messina wrote:
>>
>>>>
>>>> BTW, I haven't been able to check out the new svn repository via
>>>> svn+ssh:// because I can't get svn to authenticate with an
>>>> alternative
>>>> username.
>>>
>>> I have the same issue. I set up a stanza in my ~/.ssh/config:
>>>
>>> Host dev.open-bio.org
>>>    User dave_messina
>>>
>>> where dave_messina is my dev.open-bio.org username.
>>
>> I changed to the macports ssh w/o luck.  It appears the key is
>> offered up, so maybe the problem is how I have everything set up on
>> dev (though I followed everything on the wiki):
>
> A couple of things to check.
>
>   - make sure that you put your public key in ~/.ssh/authorized_keys2
>     (not authorized_keys)
>
>   - make sure that authorized_keys2 is chmod'ed 600 (644 might be
>     enough...).
>
>   - make sure that ~/.ssh is chmoded 700.
>
>   - make sure that your home directory is 755.
>
> Then see if it works.  You might be able to relax some of those
> protections a bit, but ssh's uptight about letting other people mess
> with that data.
>
> g.

Got it working; it was the permissions on my home dir (the last  
one).  Thanks George!

chris


From dmessina at wustl.edu  Sat Jun 30 11:37:44 2007
From: dmessina at wustl.edu (David Messina)
Date: Sat, 30 Jun 2007 10:37:44 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <468628AC.9060200@sheffield.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
Message-ID: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>

> I have just been through the t/data dir and added a list of  
> extensions I found

Thanks! That's a big help. I'll add prop definitions to those shortly.


>  There are some files without extensions, how should these be dealt  
> with?

If you look in the text files section, there are some files there  
which don't have extensions, e.g. AUTHORS, BUGS. There's also

	Makefile.*

so we have some flexibility in how svn knows to auto-prop a file. I  
haven't read up on the details yet to find out how it handles files  
that match multiple criteria -- it may be dependent simply on the  
order they're defined.
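
For reference, auto-props live in the [auto-props] section of the client-side Subversion config file (usually ~/.subversion/config) and only take effect when enable-auto-props is set to yes under [miscellany]. The sketch below writes a sample to a scratch file. The specific patterns and property values are illustrative guesses, not the contents of the wiki page.

```shell
# Write an example auto-props fragment to a scratch file for inspection.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[miscellany]
enable-auto-props = yes

[auto-props]
# pattern = property=value[;property=value...]
*.pm = svn:eol-style=native
*.t = svn:eol-style=native
*.fasta = svn:eol-style=native
AUTHORS = svn:eol-style=native
Makefile.* = svn:eol-style=native
EOF
cat "$sample"
```

Patterns are matched against file names, not just extensions, which is how entries such as AUTHORS and Makefile.* can sit alongside *.pm.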


> There seems to be a plethora of file naming styles which means  
> there's a pretty long list of non-standard extensions. So at some  
> point someone will commit a new data file with a new extension  
> (often describing what program created the output or the test for  
> which it's intended) that won't be in the auto-props file - can you  
> think of a way around this?

I've been thinking about this a bit. How about this?

- We have just "standard" files and extensions (like *.blast,  
*.fasta) in the auto-props list.

- We manually add props for the files that have nonstandard,  
arbitrary extensions so all the files have now are prop'd.

- At some point we rename those nonstandard files to have standard  
extensions. Especially for the t/data/ files, we'll have to make sure  
to update the tests that rely on them.

- We can have the suggested list of extensions for new files that get  
added. I don't think we need to strictly enforce this just for the  
sake of svn (after all, its primary function of version control will  
work just fine without any properties set), but it would be nice if  
we could try to keep to it mostly.

Many distros come with an /etc/mime.types file which has the list of  
officially registered MIME types. I found a script that will take  
this list and convert it into auto-props format. I don't think we  
need to support *all* of the gazillion filetypes, since our repository
will never see most of them, but we certainly could.


Dave


From dmessina at wustl.edu  Sat Jun 30 12:26:27 2007
From: dmessina at wustl.edu (David Messina)
Date: Sat, 30 Jun 2007 11:26:27 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
	<461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
Message-ID: <D6917C62-FA0C-4261-ACFD-014DEF4D89E6@wustl.edu>


On Jun 30, 2007, at 10:37 AM, David Messina wrote:

> - We manually add props for the files that have nonstandard,
> arbitrary extensions so all the files have now are prop'd.

Er, that should be

- We manually add props for the files that have nonstandard,  
arbitrary extensions so that all the files now in the repository are  
prop'd.


From n.haigh at sheffield.ac.uk  Sat Jun 30 13:25:58 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 30 Jun 2007 18:25:58 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
 Perltidy]
In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
	<461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
Message-ID: <46869226.70203@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

- -- snip --
> 
> 
>> There seems to be a plethora of file naming styles which means there's
>> a pretty long list of non-standard extensions. So at some point
>> someone will commit a new data file with a new extension (often
>> describing what program created the output or the test for which it's
>> intended) that won't be in the auto-props file - can you think of a
>> way around this?
> 
> Ive been thinking about this a bit. How about this?
> 
> - We have just "standard" files and extensions (like *.blast, *.fasta)
> in the auto-props list.

I think the list of seq formats recognised by Bioperl in Bio::SeqIO and
Bio::AlignIO would be a good start. As these are likely to be the ones
that are sensitive to file format recognition and thus could break tests
if renamed.

I think a lot of people have used "." in file names as an alternative to
a space. I think it would be beneficial to use an underscore "_" in
these cases and leave the "." to represent the beginning of the file
extension.

> 
> - We manually add props for the files that have nonstandard, arbitrary
> extensions so all the files that we currently have now are prop'd.
> 
> - At some point we rename those nonstandard files to have standard
> extensions. Especially for the t/data/ files, we'll have to make sure to
> update the tests that rely on them.

Nice and easy with svn :)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGhpHiczuW2jkwy2gRAuZ5AKCnd2MvCsvSn1NemDVMmabnieR2vACg1Qk0
pYVvXwxq0lpiGfM09RQ6A1I=
=3Lhw
-----END PGP SIGNATURE-----


From cjfields at uiuc.edu  Sat Jun 30 15:11:52 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 14:11:52 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <D6917C62-FA0C-4261-ACFD-014DEF4D89E6@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
	<461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
	<D6917C62-FA0C-4261-ACFD-014DEF4D89E6@wustl.edu>
Message-ID: <C274666B-9771-4296-80BB-8DFFB036F29C@uiuc.edu>


On Jun 30, 2007, at 11:26 AM, David Messina wrote:

>
> On Jun 30, 2007, at 10:37 AM, David Messina wrote:
>
>> - We manually add props for the files that have nonstandard,
>> arbitrary extensions so all the files have now are prop'd.
>
> Er, that should be
>
> - We manually add props for the files that have nonstandard,
> arbitrary extensions so that all the files now in the repository are
> prop'd.

Do we need to define every filetype extension, or can there be a  
fallback (eg if it isn't on the list or has no extension it's plain  
text)?

chris


From hlapp at gmx.net  Sat Jun 30 17:26:22 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 17:26:22 -0400
Subject: [Bioperl-l] Splits again
In-Reply-To: <468409C7.7020102@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
Message-ID: <A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>


On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote:

> [...]
> Very definitely the latter. The key benefit of my approach is that  
> the organisation stays as is and that a snapshot of the repository  
> remains a single directory of modules in Bio so that people don't  
> have to 'install' Bioperl, they can still just uncompress the  
> archive (or check out the package from svn) and point their  
> PERL5LIB to the root dir of the package.

I think this is absolutely key to keep in mind. Anything without this  
feature will likely be a non-starter.
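
In practice that workflow is a couple of lines in the shell. A sketch, where /opt/bioperl-live stands in for wherever the archive was unpacked or checked out (the path and the helper name prepend_perl5lib are assumptions):

```shell
# prepend_perl5lib DIR
# Hypothetical helper: put DIR at the front of PERL5LIB unless already listed.
prepend_perl5lib() {
    dir=$1
    case ":${PERL5LIB:-}:" in
        *":$dir:"*) ;;                                    # already present
        *) PERL5LIB="$dir${PERL5LIB:+:$PERL5LIB}" ;;      # prepend
    esac
    export PERL5LIB
}

prepend_perl5lib /opt/bioperl-live
echo "PERL5LIB=$PERL5LIB"

# With a real checkout at that path, a quick check would be:
#   perl -MBio::Seq -e 'print "ok\n"'
```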

I don't really have time to follow the discussion let alone
participate, so really all I can contribute is to offer some
sanity/reality checks (such as the above).

In this sense, I understand a release pumpkin will generate ~900  
packages to upload to CPAN? How much hassle is that compared to what  
uploading a bioperl release means right now?

How brittle is all the Build.PL code that will be needed to automate  
all of this, and how difficult will it be to maintain? For example,  
if someone adds in 10 new modules, what Build.PL-related work will  
need to be done?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Sat Jun 30 17:32:52 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 30 Jun 2007 22:32:52 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
	<A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>
Message-ID: <4686CC04.6000403@sendu.me.uk>

Hilmar Lapp wrote:
> On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote:
> 
>> [...]
>> Very definitely the latter. The key benefit of my approach is that  
>> the organisation stays as is and that a snapshot of the repository  
>> remains a single directory of modules in Bio so that people don't  
>> have to 'install' Bioperl, they can still just uncompress the  
>> archive (or check out the package from svn) and point their  
>> PERL5LIB to the root dir of the package.
[snip]
> In this sense, I understand a release pumpkin will generate ~900  
> packages to upload to CPAN? How much hassle is that compared to what  
> uploading a bioperl release means right now?

I'd have to investigate. I did my uploads using the PAUSE website, which 
for 900 packages would be unfeasible. Will have to see if the process 
can be automated.


> How brittle is all the Build.PL code that will be needed to automate  
> all of this, and how difficult will it be to maintain? For example,  
> if someone adds in 10 new modules, what Build.PL-related work will  
> need to be done?

Well, my plan will be that once the work is done, you won't need to 
touch the Build.PL code again. My intent is that the pumpkin can just 
type one command and not think about anything.

As for the reality, I won't know until I think about it properly and 
experiment.


From hlapp at gmx.net  Sat Jun 30 19:36:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 19:36:45 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18052.3946.224905.415905@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
	<18052.3946.224905.415905@almost.alerce.com>
Message-ID: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>


On Jun 28, 2007, at 3:43 PM, George Hartzell wrote:

> I just did the experiment, and filename-insensitivity seems to be
> breaking something.
>
> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.
>
> I reformatted a memory stick to be case sensitive and co of
>
>   bioperl/bioperl-live/tags/release-0-9-2/t
>
> worked, then I made a directory in my home dir (normal mac thing) and
> got the same error as above.

You picked up a rename of a file from lower case extension to upper  
case extension. Unfortunately, there are several months between  
adding the upper-case and removing the lower-case version.

We can reconstruct what happened with this using svn log on the  
directory (this does not require a checkout):

$ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk/t/data

Searching for HUMBETGLOA yields the following two commits that added  
one and removed the other:

------------------------------------------------------------------------
r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines
Changed paths:
    M /bioperl-live/trunk/t/SearchIO.t
    A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA
    A /bioperl-live/trunk/t/data/cysprot1.FASTA

added tests for FASTA

------------------------------------------------------------------------
r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines
Changed paths:
    A /bioperl-live/trunk/t/data/HUMBETGLOA.fa
    D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta

renaming file to avoid clobbering on windows

Unfortunately, both files are in the tag (again, no checkout required):

$ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta
HUMBETGLOA.FASTA
HUMBETGLOA.fasta

We can remove the offending version from the repository (again,  
without needing a checkout):

$ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta

I did this, and now the tag checks out fine on OSX. Can anyone confirm?
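
Clashes like this can be spotted before anyone on a case-insensitive filesystem trips over them. A sketch: fold a listing to lower case and report names that occur more than once. case_collisions is an invented helper, the sample names are taken from the listing and log output above, and in real use the input would come from svn list.

```shell
# case_collisions
# Hypothetical helper: read names on stdin, print those that collide when
# case is ignored (each collision is reported once, in lower case).
case_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

printf '%s\n' HUMBETGLOA.FASTA HUMBETGLOA.fasta cysprot1.FASTA | case_collisions
# prints: humbetgloa.fasta

# Real use (needs repository access; URL left as a placeholder):
#   svn list -R URL | case_collisions
```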

(BTW the ability to operate on the repository w/o needing a checkout  
is another advantage of svn)
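
For anyone who wants to scan other tags for the same kind of clash, here is a minimal sketch (an illustration only, not something that was run as part of the conversion). It assumes a POSIX shell, and the function name find_case_collisions is made up for this example. It reads one file name per line and prints, in lower case, every name that occurs more than once when case is ignored:

```sh
# find_case_collisions: read file names on stdin, one per line, and print
# (in lower case) each name that appears more than once when case is ignored.
find_case_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Hypothetical usage against a tag (needs repository access):
#   svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | find_case_collisions
```

An empty result would suggest that the directory is safe to check out on a case-insensitive file system; anything printed is a candidate for svn rm.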

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Jun 30 20:40:53 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 19:40:53 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
	<18052.3946.224905.415905@almost.alerce.com>
	<2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
Message-ID: <A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>

Checkout worked for me (Mac OS X) using both:

svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/

so removing the offending file worked (good catch!).  Haven't run a  
full co but probably isn't necessary.

chris

On Jun 30, 2007, at 6:36 PM, Hilmar Lapp wrote:

>
> On Jun 28, 2007, at 3:43 PM, George Hartzell wrote:
>
>> I just did the experiment, and filename-insensitivity seems to be
>> breaking something.
>>
>> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.
>>
>> I reformatted a memory stick to be case sensitive and co of
>>
>>   bioperl/bioperl-live/tags/release-0-9-2/t
>>
>> worked, then I made a directory in my home dir (normal mac thing) and
>> got the same error as above.
>
> You picked up a rename of a file from lower case extension to upper  
> case extension. Unfortunately, there are several months between  
> adding the upper-case and removing the lower-case version.
>
> We can reconstruct what happened with this using svn log on the  
> directory (this does not require a checkout):
>
> $ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk/t/data
>
> Searching for HUMBETGLOA yields the following two commits that  
> added one and removed the other:
>
> ------------------------------------------------------------------------
> r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines
> Changed paths:
>    M /bioperl-live/trunk/t/SearchIO.t
>    A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA
>    A /bioperl-live/trunk/t/data/cysprot1.FASTA
>
> added tests for FASTA
>
> ------------------------------------------------------------------------
> r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines
> Changed paths:
>    A /bioperl-live/trunk/t/data/HUMBETGLOA.fa
>    D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta
>
> renaming file to avoid clobbering on windows
>
> Unfortunately, both files are in the tag (again, no checkout  
> required):
>
> $ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta
> HUMBETGLOA.FASTA
> HUMBETGLOA.fasta
>
> We can remove the offending version from the repository (again,  
> without needing a checkout):
>
> $ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta
>
> I did this, and now the tag checks out fine on OSX. Can anyone  
> confirm?
>
> (BTW the ability to operate on the repository w/o needing a  
> checkout is another advantage of svn)
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hartzell at alerce.com  Sat Jun 30 20:48:06 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 30 Jun 2007 17:48:06 -0700
Subject: [Bioperl-l] Take 2 of the new subversion repository.
Message-ID: <18054.63942.316904.413911@almost.alerce.com>


There's a second cut at the subversion repository.  I've done a better
job of setting svn:keywords and svn:eol-style on various files.  The
defaults were more cautious and I used an auto-props file based on
the wiki version.

  svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2

The old repository's still around as

  svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1

I renamed it so that people wouldn't work with it by mistake.  If, for
some hard-to-imagine reason, you have a working copy that you want to
run against it, you should be able to do an svn switch --relocate on
your working copy and be back in shape.  In fact, it might be a good
time to give it a try....

g.


From hartzell at alerce.com  Sat Jun 30 21:17:18 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 30 Jun 2007 18:17:18 -0700
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
	<18052.3946.224905.415905@almost.alerce.com>
	<2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
	<A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>
Message-ID: <18055.158.30409.808612@almost.alerce.com>

Chris Fields writes:
 > Checkout worked for me (Mac OS X) using both:
 > 
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/
 > 
 > so removing the offending file worked (good catch!).  Haven't run a  
 > full co but probably isn't necessary.
 > [...]

I'll keep a note of that as something to do when I prepare the final
cut of the repository.

g.


From jason at bioperl.org  Sat Jun 30 21:25:30 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 30 Jun 2007 18:25:30 -0700
Subject: [Bioperl-l] Take 2 of the new subversion repository.
In-Reply-To: <18054.63942.316904.413911@almost.alerce.com>
References: <18054.63942.316904.413911@almost.alerce.com>
Message-ID: <D8C71EF7-6E2E-498E-8638-373512ADE3EE@bioperl.org>

Thanks George -
I also did
chgrp -R bioperl /home/hartzell/bioperl_take?
to make sure the group permission was set right.

We may also want to do a chmod g+s on all the dirs in there as well  
so that permissions are preserved when this gets deployed for real.
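
A minimal sketch of that chmod step, assuming a POSIX find and chmod; the function name setgid_dirs is made up for this example. The setgid bit on a directory makes files created in it later inherit the directory's group, which is the effect wanted here:

```sh
# setgid_dirs: set the setgid bit on DIR and on every directory below it,
# so that files created there later inherit the directory's group.
# Regular files are left untouched.
setgid_dirs() {
    find "$1" -type d -exec chmod g+s {} +
}

# Hypothetical usage on the server:
#   setgid_dirs /home/hartzell/bioperl_take2
```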

Anyone who wants to is welcome to make some changes to files and commit  
them, and to make some branches/tags to play around a little bit, since  
we'll likely throw this away and do it again from a locked-down version  
of CVS at some appointed time.

Do you know how to have svn commit messages generate summary emails  
as well?

-j
On Jun 30, 2007, at 5:48 PM, George Hartzell wrote:

>
> There's a second cut at the subversion repository.  I've done a better
> job of setting svn:keywords and svn:eol-style on various files.  The
> defaults were more cautious and I used an auto-props files based on
> the wiki version.
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2
>
> The old repository's still around as
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1
>
> I renamed it so that people wouldn't work with it by mistake.  If, for
> some hard-to-imagine reason, you have a working copy that you want to
> run against it, you should be able to do an svn switch --relocate on
> your working copy and be back in shape.  In fact, it might be a good
> time to give it a try....
>
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From hlapp at gmx.net  Sat Jun 30 22:21:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 22:21:25 -0400
Subject: [Bioperl-l] Take 2 of the new subversion repository.
In-Reply-To: <18054.63942.316904.413911@almost.alerce.com>
References: <18054.63942.316904.413911@almost.alerce.com>
Message-ID: <5F53A433-BAA9-431D-A0C5-5955690D0B73@gmx.net>


On Jun 30, 2007, at 8:48 PM, George Hartzell wrote:

> I renamed it so that people wouldn't work with it by mistake.  If, for
> some hard-to-imagine reason, you have a working copy that you want to
> run against it,

It's not so hard to imagine - checking out the entire repository  
takes a long time.

> you should be able to do an svn switch --relocate on
> your working copy and be back in shape.  In fact, it might be a good
> time to give it a try....

It doesn't work:

svn: The repository at 'svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2' has uuid '31277767-6726-dc11-ab4c-0019e3f901d6', but the WC has '27e854f1-f323-dc11-8c1b-0019e3f901d6'

You can't relocate to a totally new repository (relocating to  
bioperl_take1 does work though).
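
One way to check before trying the relocate is to compare the two UUIDs first. This is only a sketch: it assumes svn info prints a "Repository UUID:" line (as it does in an English locale), and the function names repo_uuid and can_relocate are made up for this example.

```sh
# repo_uuid: read `svn info` output on stdin and print the repository UUID.
repo_uuid() {
    sed -n 's/^Repository UUID: //p'
}

# can_relocate: succeed only if both UUIDs are non-empty and identical.
can_relocate() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Hypothetical usage (needs a working copy and repository access):
#   wc=$(svn info . | repo_uuid)
#   new=$(svn info svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2 | repo_uuid)
#   can_relocate "$wc" "$new" || echo "different repository, a fresh checkout is needed"
```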

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Jun 30 22:39:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 21:39:27 -0500
Subject: [Bioperl-l] Take 2 of the new subversion repository.
In-Reply-To: <D8C71EF7-6E2E-498E-8638-373512ADE3EE@bioperl.org>
References: <18054.63942.316904.413911@almost.alerce.com>
	<D8C71EF7-6E2E-498E-8638-373512ADE3EE@bioperl.org>
Message-ID: <7C6FD6C9-CBED-40D3-BA90-4B34F79E6DE0@uiuc.edu>

There are a few CPAN modules available; here's one:

http://search.cpan.org/~dwheeler/SVN-Notify-2.66/lib/SVN/Notify.pm
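
A rough sketch of how this could be wired into the repository's hooks/post-commit script, assuming the svnnotify command that ships with SVN::Notify is installed on the server. The recipient address is a placeholder and the function name send_commit_mail is made up for this example:

```sh
# Sketch of a hooks/post-commit script. Subversion calls the hook with the
# repository path as $1 and the new revision number as $2.

# Placeholder recipient, not the real commit list address.
NOTIFY_TO='commits@example.org'

# send_commit_mail: run svnnotify (or the command named in $SVNNOTIFY,
# which makes a dry run possible) for one repository path and revision.
send_commit_mail() {
    "${SVNNOTIFY:-svnnotify}" --repos-path "$1" --revision "$2" --to "$NOTIFY_TO"
}

# In the real hook the last line would be:
#   send_commit_mail "$1" "$2"
```

Setting SVNNOTIFY=echo prints the arguments instead of sending mail, which may be handy while the repository is still a throw-away test.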

chris

On Jun 30, 2007, at 8:25 PM, Jason Stajich wrote:

> Thanks George -
> I also did
> chgrp -R bioperl /home/hartzell/bioperl_take?
> to make sure the group permission was set right.
>
> We may also want to do a chmod g+s on all the dirs in there as well
> so that permissions are preserved when this gets deployed for real.
>
> If anyone wants to make some changes to files and commit them, as
> well as make some branches/tags to play around a little bit since
> we'll likely throw this away and do it again from locked down version
> from CVS at some appointed time.
>
> Do you know how to have svn commit messages generate summary emails
> as well?
>
> -j
> On Jun 30, 2007, at 5:48 PM, George Hartzell wrote:
>
>>
>> There's a second cut at the subversion repository.  I've done a  
>> better
>> job of setting svn:keywords and svn:eol-style on various files.  The
>> defaults were more cautious and I used an auto-props files based on
>> the wiki version.
>>
>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2
>>
>> The old repository's still around as
>>
>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1
>>
>> I renamed it so that people wouldn't work with it by mistake.  If, for
>> some hard-to-imagine reason, you have a working copy that you want to
>> run against it, you should be able to do an svn switch --relocate on
>> your working copy and be back in shape.  In fact, it might be a good
>> time to give it a try....
>>
>> g.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sat Jun 30 22:46:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 21:46:05 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4686CC04.6000403@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
	<A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>
	<4686CC04.6000403@sendu.me.uk>
Message-ID: <D10BF6DE-D8A6-448A-8850-A7B13AE54266@uiuc.edu>


On Jun 30, 2007, at 4:32 PM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote:
>>> [...]
>>> Very definitely the latter. The key benefit of my approach is  
>>> that  the organisation stays as is and that a snapshot of the  
>>> repository  remains a single directory of modules in Bio so that  
>>> people don't  have to 'install' Bioperl, they can still just  
>>> uncompress the  archive (or check out the package from svn) and  
>>> point their  PERL5LIB to the root dir of the package.
> [snip]
>> In this sense, I understand a release pumpkin will generate ~900   
>> packages to upload to CPAN? How much hassle is that compared to  
>> what  uploading a bioperl release means right now?
>
> I'd have to investigate. I did my uploads using the PAUSE website,  
> which for 900 packages would be unfeasible. Will have to see if the  
> process can be automated.

Not that they would care one way or another but maybe we should  
contact the CPAN maintainers to get their thoughts.  They might have  
some ideas...

>> How brittle is all the Build.PL code that will be needed to  
>> automate  all of this, and how difficult will it be to maintain?  
>> For example,  if someone adds in 10 new modules, what Build.PL- 
>> related work will  need to be done?
>
> Well, my plan will be that once the work is done, you won't need to  
> touch the Build.PL code again. My intent is that the pumpkin can  
> just type one command and not think about anything.
>
> As for the reality, I won't know until I think about it properly  
> and experiment.

A good experiment for a branch.  I still think this could be  
accomplished step-wise; for instance run a quick test using something  
with a simple dependency tree like Bio::Root::Root (only needs  
RootI), finish up with Bio::Root*, then work down into PrimarySeq,  
Seq, etc.  Submit them to CPAN piecemeal or in batches (all  
Bio::Seq*, so on).

If the Build.PL, etc are to be generated on the fly then maybe there  
should be a simple way of registering or matching tests to modules  
(or vice versa) to ease the pain, particularly for new code.

chris


From hlapp at gmx.net  Sat Jun 30 22:56:04 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 22:56:04 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
	<18052.3946.224905.415905@almost.alerce.com>
	<2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
	<A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>
Message-ID: <E250DB37-E2C1-4F71-A2FE-B64603EB69FD@gmx.net>

It turns out that both files are also present on the release-0-9-3,  
bioperl-1-0-0, bioperl-1-0-alpha, and bioperl-1-0-alpha2-rc tags, so add

$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-3/t/data/HUMBETGLOA.fasta
$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-0/t/data/HUMBETGLOA.fasta
$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-alpha/t/data/HUMBETGLOA.fasta
$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-alpha2-rc/t/data/HUMBETGLOA.fasta

to the post-processing commands.
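
If the list of affected tags grows, the same commands could be generated with a loop. This is a sketch only: the function name print_rm_commands is made up for this example, and it deliberately prints the commands rather than running them, so the output can be reviewed before anything touches the repository.

```sh
# print_rm_commands: print (without running) one `svn rm` command for each
# tag that still carries the lower-case duplicate.
print_rm_commands() {
    base='svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags'
    for tag in release-0-9-3 bioperl-1-0-0 bioperl-1-0-alpha bioperl-1-0-alpha2-rc; do
        printf 'svn rm -m "Removing offending duplicate" %s/%s/t/data/HUMBETGLOA.fasta\n' "$base" "$tag"
    done
}

print_rm_commands
```

Piping the output to sh would run the commands once they look right.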

	-hilmar

On Jun 30, 2007, at 8:40 PM, Chris Fields wrote:

> Checkout worked for me (Mac OS X) using both:
>
> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/
>
> so removing the offending file worked (good catch!).  Haven't run a  
> full co but probably isn't necessary.
>
> chris
>
> On Jun 30, 2007, at 6:36 PM, Hilmar Lapp wrote:
>
>>
>> On Jun 28, 2007, at 3:43 PM, George Hartzell wrote:
>>
>>> I just did the experiment, and filename-insensitivity seems to be
>>> breaking something.
>>>
>>> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.
>>>
>>> I reformatted a memory stick to be case sensitive and co of
>>>
>>>   bioperl/bioperl-live/tags/release-0-9-2/t
>>>
>>> worked, then I made a directory in my home dir (normal mac thing)  
>>> and
>>> got the same error as above.
>>
>> You picked up a rename of a file from lower case extension to  
>> upper case extension. Unfortunately, there are several months  
>> between adding the upper-case and removing the lower-case version.
>>
>> We can reconstruct what happened with this using svn log on the  
>> directory (this does not require a checkout):
>>
>> $ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk/t/data
>>
>> Searching for HUMBETGLOA yields the following two commits that  
>> added one and removed the other:
>>
>> ------------------------------------------------------------------------
>> r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines
>> Changed paths:
>>    M /bioperl-live/trunk/t/SearchIO.t
>>    A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA
>>    A /bioperl-live/trunk/t/data/cysprot1.FASTA
>>
>> added tests for FASTA
>>
>> ------------------------------------------------------------------------
>> r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines
>> Changed paths:
>>    A /bioperl-live/trunk/t/data/HUMBETGLOA.fa
>>    D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta
>>
>> renaming file to avoid clobbering on windows
>>
>> Unfortunately, both files are in the tag (again, no checkout  
>> required):
>>
>> $ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta
>> HUMBETGLOA.FASTA
>> HUMBETGLOA.fasta
>>
>> We can remove the offending version from the repository (again,  
>> without needing a checkout):
>>
>> $ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta
>>
>> I did this, and now the tag checks out fine on OSX. Can anyone  
>> confirm?
>>
>> (BTW the ability to operate on the repository w/o needing a  
>> checkout is another advantage of svn)
>>
>> 	-hilmar
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Fri Jun  1 04:06:04 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 01 Jun 2007 09:06:04 +0100
Subject: [Bioperl-l] ClustalW Score?
In-Reply-To: <1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu>
References: <00e201c7a2de$91f60f50$2d01a8c0@PICO><DFEEDFC9-68C4-4821-846F-69AC9559C70B@bioperl.org><465E9B58.1020403@sendu.me.uk>	<49B6333A-18B9-4B63-80EF-81C57A295494@bioperl.org>
	<1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu>
Message-ID: <465FD36C.5060603@sendu.me.uk>

Kevin Brown wrote:
>> you're right --- it is not really my code, I was just 
>> elaborating Kevin's example --- it would probably need to be 
>> more specific or perhaps the last Score seen is sufficient 
>> for what one is trying to capture?
> 
> I took that code from a pairwise clustal alignment script that I wrote
> to deal with aligning a bunch of short sequences against a long one to
> see where they line up at.  When all of them were fed to Clustal the
> short sequences all ended up aligned to each other and not well aligned
> to the longer sequence.  I only saw one score in the output from the
> pairwise, so that is what I used to find a reasonable value.

Ok, well I've hedged my bets and used both. Now committed to CVS.


From jy at genseq.co.uk  Fri Jun  1 22:39:48 2007
From: jy at genseq.co.uk (Jean-Yves Sireau)
Date: Sat, 2 Jun 2007 10:39:48 +0800
Subject: [Bioperl-l] Genseq
Message-ID: <20070602103948.093d713c@jys.my.regentmarkets.com>

Dear List members,

I would like to let you know of the formation of Genseq Ltd., a
bioinformatics company that will (in time!) offer genome sequencing to
high net worth individuals and bioinformatic analysis of the sequence
data to detect predisposition to illness.  The company's website is
www.genseq.co.uk

Genseq would be willing to sponsor bioperl, whether financially or by
providing resources, notably for any bioperl-related activities in the
Asia Pacific region.  Genseq's bioinformatics team will be based in
Cyberjaya (Malaysia), and we are in particular interested to promote
bioperl in Malaysia.  We are also actively recruiting at the moment
in Malaysia and India.

If there was sufficient demand, we would be willing to organise a
bioperl conference in Cyberjaya at the Cyberview Lodge
(www.cyberview-lodge.com), which would be the ideal place for such a
conference in Malaysia.

Looking forward to your comments, suggestions and proposals.

Best regards
Jean-Yves Sireau

-- 

Jean-Yves Sireau
CEO, Genseq Ltd.
www.genseq.co.uk


From cjfields at uiuc.edu  Sat Jun  2 01:16:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 00:16:05 -0500
Subject: [Bioperl-l] EUtilities overhaul started
Message-ID: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>

To anyone using Bio::DB::EUtilities,

I am in the midst of a major overhaul to the various EUtilities tools  
and to Bio::DB::GenericWebDBI (the latter of which I am forming into  
more or less a test bed for other database interfaces).  I'm about  
80% done at this point, and will likely start committing changes this  
coming week.

The overall interface will change (something I had warned about in  
the Bio::DB::EUtilities POD) but I am hoping it will be more  
intuitive and easier to use in the long run.  I'll describe the  
overall redesign and use in an upcoming HOWTO (as recommended by  
Brian a while back).

If anyone has any suggestions/ideas/flames, please let me know!

Cheers!

chris


From cjfields at uiuc.edu  Sat Jun  2 10:39:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 09:39:25 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <e572b3c70706020628v71b10e7bm34cebfab4954890c@mail.gmail.com>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
	<e572b3c70706020628v71b10e7bm34cebfab4954890c@mail.gmail.com>
Message-ID: <AF243C87-B82E-4C33-939D-2B84B9E41537@uiuc.edu>

Yes, there are a few odd issues, though that's one I've not heard of  
yet.  You might try one of the sub-nucleotide databases (nuccore,  
nucest, nucgss).

I'll try looking into it and (if necessary) pester NCBI about it.   
I'll pass this on to the mail list to see if anyone else knows about  
the problem.

chris

On Jun 2, 2007, at 8:28 AM, Bernd Brandt wrote:

> Hi Chris,
>
> Thanks for your work on EUtilities.
> For a production task, I used EUtilities directly (given your
> announced overhaul). I noticed a recent problem at NCBI (reported two
> weeks ago to NCBI, no reply yet). Possibly you may run into this with
> testing: if you ePOST gi ids to the EU server and then use this set in
> Esearch (using the query key) no results are returned for the
> nucleotide database.
> ESearches like "db=$db%23$QueryKey" typically fail if the $db is
> nucleotide (but work if $db='protein'). The XML output has Count 0 and
> an empty QueryTranslationSet for db=nucleotide only.
> For completeness, I attach a simple test script I used.
>
>
> Best regards,
> Bernd
>
>
> On 6/2/07, Chris Fields <cjfields at uiuc.edu> wrote:
>> To anyone using Bio::DB::EUtilities,
>>
>> I am in the midst of a major overhaul to the various EUtilities tools
>> and to Bio::DB::GenericWebDBI (the latter which I am forming into
>> more or less a test bed for other database interfaces).  I'm about
>> 80% done at this point, and will likely start committing changes this
>> coming week.
>>
>> The overall interface will change (something I had warned about in
>> the Bio::DB::EUtilities POD) but I am hoping it will be more
>> intuitive and easier to use in the long run.  I'll describe the
>> overall redesign and use in an upcoming HOWTO (as recommended by
>> Brian a while back).
>>
>> If anyone has any suggestions/ideas/flames, please let me know!
>>
>> Cheers!
>>
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> <EUsearch.pl>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Jun  3 00:51:57 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 23:51:57 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <e572b3c70706020948l708f14c8q706b65c73617c86d@mail.gmail.com>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
	<e572b3c70706020628v71b10e7bm34cebfab4954890c@mail.gmail.com>
	<AF243C87-B82E-4C33-939D-2B84B9E41537@uiuc.edu>
	<e572b3c70706020948l708f14c8q706b65c73617c86d@mail.gmail.com>
Message-ID: <1A2AF5C4-6A58-4FDD-A4CA-6ABCE30F0D1B@uiuc.edu>

I can confirm this; however it only relates to the use of history  
with esearch and nucleotide (use of the history with other eutils  
seems to work fine); retrieving sequences via efetch is not  
affected.  If I find out anything more I'll post something on the  
mail list.

chris

On Jun 2, 2007, at 11:48 AM, Bernd Brandt wrote:

> I can confirm that using the correct sub-nucleotide database works
> (nuccore in my case).
> This seems to be a quite recent change/bug at NCBI. Until recently,
> db=nucleotide worked. Moreover, EInfo still lists nucleotide as valid
> db.
> It is not optimal to have to choose the sub-database and the searches
> work via the Entrez web-interface. Note that this problem is related
> to the ESearch and db=nucleotide.
>
> bernd
>
> On 6/2/07, Chris Fields <cjfields at uiuc.edu> wrote:
>> Yes, there are a few odd issues, though that's one I've not heard of
>> yet.  You might try one of the sub-nucleotide databases (nuccore,
>> nucest, nucgss).
>>
>> I'll try looking into it and (if necessary) pester NCBI about it.
>> I'll pass this on to the mail list to see if anyone else knows about
>> the problem.
>>
>> chris
>>
>> On Jun 2, 2007, at 8:28 AM, Bernd Brandt wrote:
>>
>> > Hi Chris,
>> >
>> > Thanks for your work on EUtilities.
>> > For a production task, I used EUtilities directly (given your
>> > announced overhaul). I noticed a recent problem at NCBI (reported two
>> > weeks ago to NCBI, no reply yet). Possibly you may run into this with
>> > testing: if you ePOST gi ids to the EU server and then use this set in
>> > Esearch (using the query key) no results are returned for the
>> > nucleotide database.
>> > ESearches like "db=$db%23$QueryKey" typically fail if the $db is
>> > nucleotide (but work if $db='protein'). The XML output has Count 0 and
>> > an empty QueryTranslationSet for db=nucleotide only.
>> > For completeness, I attach a simple test script I used.
>> >
>> >
>> > Best regards,
>> > Bernd
>> >
>> >
>> > On 6/2/07, Chris Fields <cjfields at uiuc.edu> wrote:
>> >> To anyone using Bio::DB::EUtilities,
>> >>
>> >> I am in the midst of a major overhaul to the various EUtilities tools
>> >> and to Bio::DB::GenericWebDBI (the latter of which I am forming into
>> >> more or less a test bed for other database interfaces).  I'm about
>> >> 80% done at this point, and will likely start committing changes this
>> >> coming week.
>> >>
>> >> The overall interface will change (something I had warned about in
>> >> the Bio::DB::EUtilities POD) but I am hoping it will be more
>> >> intuitive and easier to use in the long run.  I'll describe the
>> >> overall redesign and use in an upcoming HOWTO (as recommended by
>> >> Brian a while back).
>> >>
>> >> If anyone has any suggestions/ideas/flames, please let me know!
>> >>
>> >> Cheers!
>> >>
>> >> chris
>> >> _______________________________________________
>> >> Bioperl-l mailing list
>> >> Bioperl-l at lists.open-bio.org
>> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >>
>> >> <EUsearch.pl>
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From basu at pharm.stonybrook.edu  Sun Jun  3 10:44:18 2007
From: basu at pharm.stonybrook.edu (Siddhartha Basu)
Date: Sun, 03 Jun 2007 10:44:18 -0400
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
Message-ID: <web-5961520@pharm.stonybrook.edu>

On Sat, 2 Jun 2007 00:16:05 -0500
  Chris Fields <cjfields at uiuc.edu> wrote:
> To anyone using Bio::DB::EUtilities,
> 
> I am in the midst of a major overhaul to the various EUtilities tools
> and to Bio::DB::GenericWebDBI (the latter of which I am forming into
> more or less a test bed for other database interfaces).  I'm about
> 80% done at this point, and will likely start committing changes this
> coming week.
> 
> The overall interface will change (something I had warned about in
> the Bio::DB::EUtilities POD) but I am hoping it will be more
> intuitive and easier to use in the long run.  I'll describe the
> overall redesign and use in an upcoming HOWTO (as recommended by
> Brian a while back).

Hi Chris,
As a frequent user of EUtilities, I expect this API facelift and the
upcoming HOWTO to be very helpful.
One thing I noticed is that each eutil call (efetch, epost, esearch,
esummary) requires a new Bio::DB::EUtilities object to be
instantiated, and its parameters cannot then be changed at runtime,
e.g. with $eutils->id('ids'). For example:

my $eutils = Bio::DB::EUtilities->new ( -id    => $id,
                                        -eutil => 'esummary',
                                        -db    => 'protein',
                                      );
my $ct = $eutils->get_response->content();

## -- now I cannot do this...
$eutils->id($newid);
$ct = $eutils->get_response->content();

Is the new API going to address something along these lines,
or is there currently any way to reuse the object?
Thanks again for this nice toolkit.

-siddhartha


> 
> If anyone has any suggestions/ideas/flames, please let me know!
> 
> Cheers!
> 
> chris


From cjfields at uiuc.edu  Sun Jun  3 19:52:39 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 3 Jun 2007 18:52:39 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <web-5961520@pharm.stonybrook.edu>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
	<web-5961520@pharm.stonybrook.edu>
Message-ID: <5120BD7B-CA89-46E4-8D6B-6B24C1F93A5E@uiuc.edu>

On Jun 3, 2007, at 9:44 AM, Siddhartha Basu wrote:

> ...
> Hi chris,
> Being a frequent user of EUtilities, hopefully this api facelift  
> and upcoming howto will definitely be more helpful.
> Anyway, one thing i noticed that for each eutil call such as
> efetch,epost,esearch,esummary a new 'Bio::DB::EUtilities' object has
> to be
> instantiated. And thereafter it cannot be set during runtime such as
> $eutils->id('ids'), for example....
>
> my $eutils = Bio::DB::EUtilities->new ( -id => $id,
>                                        -eutil => 'esummary',
>                                        -db => 'protein',
>                                      );
> my $ct = $eutils->get_response->content();
>
> ## -- now i cannot do this...
> $eutils->id($newid);
> my $ct = $eutils->get_response->content();

I'll have to check up on that, though changing id() should work with  
the old API.  It won't matter with the new API (it works fine), but  
it is still troubling...

> Is the new api going to address something along this line or is  
> there currently anyway to reuse
> the object.
> Thanks again for this nice toolkit.
>
> -siddhartha

The old API was based upon the idea of creating discrete user agents  
for each eutil to retrieve data.  The problem with the old interface  
is it attempts to do too much (take care of parameters, set up  
requests, retrieve responses, parse data, etc), and many tasks  
required instantiating a new EUtilities object.  I was never really  
satisfied with it.

The new interface is a composition of three classes: the web user
agent (LWP::UserAgent), a class encapsulating parameter handling, and
a parser class (all of which can be used independently if needed).  When
parameters change, a new request is made 'lazily' (i.e. only when
needed).  Similarly, when data is requested after any parameter
change, a new parser instance is created and the new response is parsed.

With that in mind you can now do the following:
----------------------------------------
my @params = (-eutil => 'esearch',
               -db    => 'protein',
               -term => 'BRCA1',
               -retmax => 100);

my $eutil = Bio::DB::EUtilities->new(@params);

# no need to get response first; get_ids() calls that if needed

my @ids = $eutil->get_ids;

# below changes only those parameters, leaves all others set as before
$eutil->set_parameters(-eutil => 'efetch',
                        -id  => \@ids,
                        -retmode => 'text',
                        -rettype => 'fasta');

# sends streamed content directly to a file
$eutil->get_response(-content_file => 'seqs.fas');

# or to a LWP::UserAgent-supported request callback
$eutil->get_response(-content_cb => \&my_cb);

my @newparams = (-eutil => 'esearch',
               -db    => 'protein',
               -term => 'BRCA2',
               -retmax => 100);

# Resets eutility to passed parameters (or undef)
$eutil->reset_parameters(@newparams);

# retrieve new IDs
my @new_ids = $eutil->get_ids;
----------------------------------------

Note the same eutil object is used for all of the above, so to answer  
your last question, yes, you should be able to create data pipelines  
using the same object if necessary.
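
To illustrate the 'lazy' behaviour described above, here is a minimal
core-Perl sketch.  It is not the Bio::DB::EUtilities code: the LazyClient
package, its fetch callback and its method bodies are hypothetical
stand-ins that only show the caching idea (a request is issued only when
a response is asked for after a parameter change).

```perl
use strict;
use warnings;

# Hypothetical stand-in for a web client; NOT the Bio::DB::EUtilities code.
package LazyClient;

sub new {
    my ($class, %args) = @_;
    my $self = {
        fetch  => $args{fetch},    # coderef that performs the "request"
        params => {},              # current parameter set
        cache  => undef,           # cached response, dropped on change
    };
    return bless $self, $class;
}

# Merge new parameters into the existing ones and invalidate the cache.
sub set_parameters {
    my ($self, %new) = @_;
    @{ $self->{params} }{ keys %new } = values %new;
    $self->{cache} = undef;
    return $self;
}

# Only issue a request when no cached response is available.
sub get_response {
    my ($self) = @_;
    $self->{cache} = $self->{fetch}->( %{ $self->{params} } )
        unless defined $self->{cache};
    return $self->{cache};
}

package main;

my $requests = 0;
my $client   = LazyClient->new(
    fetch => sub { my %p = @_; $requests++; return "result for $p{term}" },
);

$client->set_parameters(db => 'protein', term => 'BRCA1');
print $client->get_response, "\n";    # first call triggers a request
print $client->get_response, "\n";    # second call reuses the cache
$client->set_parameters(term => 'BRCA2');
print $client->get_response, "\n";    # parameter change forces a new request
print "requests made: $requests\n";
```

Parameters that are not mentioned in a later set_parameters() call (here
db) stay as they were, which mirrors the behaviour described for the new
API.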

chris


From sac at bioperl.org  Mon Jun  4 13:56:57 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Mon, 4 Jun 2007 10:56:57 -0700
Subject: [Bioperl-l] question about Bio::Restriction::Analysis
In-Reply-To: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu>
References: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu>
Message-ID: <8f200b4c0706041056o4dbaadfexddf9f82fc33c6da@mail.gmail.com>

Hi Apurva,

I'm cc:ing the list to let others know you have found performance
issues with Bio::Restriction::Analysis. Ideally, we should focus on
addressing those issues rather than fixing a module that is now
deprecated.

But taking a quick look at my Bio::Tools::RestrictionEnzyme module,
I'm not sure why HpaII would give slower performance relative to other
non-ambiguous cutters. This enzyme has a 4-base recognition sequence
CCGG, and if you're feeding it a large CG-rich input sequence, that
could be a factor. To test, you might try using some other 4-base
cutters that aren't CG-rich (TaqI, TasI) or try some other input
sequences. There is no special flag to indicate that the enzyme is
non-ambiguous. The module handles that automatically.
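
A quick way to see whether site density is a factor is to count the
recognition sites before cutting.  The sketch below is plain Perl and
does not use either restriction module; count_sites() and the two toy
sequences are made up for illustration.

```perl
use strict;
use warnings;

# Count non-overlapping matches of a literal recognition site.
sub count_sites {
    my ($seq, $site) = @_;
    my $count = () = $seq =~ /\Q$site\E/gi;
    return $count;
}

# Toy sequences (made up for illustration).
my $cg_rich = 'CCGGCCGGAACCGGTTCCGG';
my $at_rich = 'ATATTCGAAATTAATCGATA';

for my $pair ( [ 'CG-rich', $cg_rich ], [ 'AT-rich', $at_rich ] ) {
    my ($label, $seq) = @$pair;
    printf "%s: %d HpaII (CCGG) sites, %d TaqI (TCGA) sites in %d bp\n",
        $label, count_sites($seq, 'CCGG'), count_sites($seq, 'TCGA'),
        length $seq;
}
```

Both sites are palindromic, so counting on one strand is enough here.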

Good luck,
Steve

On 6/4/07, Apurva Narechania <apurva at cshl.edu> wrote:
> Hi Rob and Steve,
>
> I was hoping you could answer a quick performance question regarding
> the Bio::Restriction::Analysis module. I have found that though this
> module works well, it is considerably slower than the deprecated
> Bio::Tools::RestrictionEnzyme. I see that there are two algorithms
> available to your module, and since I am using HpaII, a non-ambiguous
> enzyme, I thought I might find similar performance to the older,
> deprecated module, but I do not. Is it possible that I am not setting
> the non-ambiguous flag correctly? Does it need to be set in the first
> place?
>
> As far as Bio::Tools::RestrictionEnzyme, though it is faster, I have
> found instances where it is inaccurate, especially in calculating
> fragments of extremely small size 1-5 base pairs, so I would like to
> use your module if possible. It just seems slow to me.
>
> Can you clarify?
>
> I have copied my code below since it is a short, simple script.
>
> Thanks!
> Apurva Narechania
> Ware Lab
> Cold Spring Harbor Labs
>
> ----------
>
> #!/usr/bin/perl
>
> # This program generates a fasta of restriction frags given an
> # input fasta and a restriction cut site
>
> use Getopt::Std;
> use Bio::Seq;
> use Bio::SeqIO;
> use strict;
>
> use Bio::Tools::RestrictionEnzyme;
>
> my %opts = ();
> getopts ('f:', \%opts);
> my $fasta  = $opts{'f'};
>
> # read fasta file
> my $seqin = Bio::SeqIO -> new (-format => 'Fasta', -file => "$fasta");
>
> my $x = 0;
> while (my $sequence_obj = $seqin -> next_seq()){
>      $x++;
>      my $id = $sequence_obj->id();
>
>      print STDERR "$x Working on $id\n";
>
>      # generate the rx object
>      my $ra = new Bio::Tools::RestrictionEnzyme(-NAME=>'HpaII');
>
>      my @frags = $ra->cut_seq($sequence_obj);
>
>      my $counter = 0;
>      foreach my $frag (@frags){
>          $counter++;
>          my $length = length ($frag);
>          print ">$id.$counter length=$length\n$frag\n";
>      }
>
> }
>
>


From anhthu.tieu at gsf.de  Tue Jun  5 04:14:09 2007
From: anhthu.tieu at gsf.de (Tieu, Anh-Thu)
Date: Tue, 5 Jun 2007 10:14:09 +0200
Subject: [Bioperl-l] problems with image maps and IE 6 or higher
Message-ID: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de>

Hi, 

 I have a problem using the bioperl image map function with IE6 and
 higher. It may be a general problem with IE6 rather than with bioperl,
 but as I used bioperl to create my image maps, I thought I would still post it
 here and ask for people's opinions. I wonder whether anyone else has faced the same problem
 and, if so, whether they could share their experiences and solutions.
 
  
<div>
<p><img src="/ggtc/tmp_bilder/19727dab708e1cbf567dd48480febb96.png" usemap="mapnameD064C01" style="border:2px solid #CCCCCC;"/></p>
<map name="mapnameD064C01" id="mapnameD064C01">
<area shape="rect" coords="108,0,608,20" href="javascript:void(0)" onclick="javascript:void(zmenu( 'scale' ));;return false;" title="scale " alt="scale " target="_blank"/>
<area shape="rect" coords="234,44,244,55" href="javascript:void(0)" onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', 'start: ', '', 'stop: 0', '', 'length:  bp', '', 'identity: ', '', 'e-value: ' ));;return false;" title="alignment5 " alt="alignment5 " target="_blank"/>
<area shape="rect" coords="241,57,247,68" href="javascript:void(0)" onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', 'start: ', '', 'stop: 0', '', 'length:  bp', '', 'identity: ', '', 'e-value: ' ));;return false;" title="integration_pt " alt="integration_pt " target="_blank"/>
<area shape="rect" coords="108,70,608,81" href="javascript:void(0)" onclick="javascript:void(zmenu( 'Nphs1                                   ', '', 'ensembl_id: ENSMUSG00000006649', '', 'start: 30168485', '', 'stop: 30195968', '', 'length: 27483 bp' ));;return false;" title="gene " alt="gene " target="_blank"/>
<area shape="rect" coords="108,83,117,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'exon1', '', 'start: 30168485', '', 'stop: 30169003', '', 'length: 518 bp' ));;return false;" title="exon1 " alt="exon1 " target="_blank"/>
<area shape="rect" coords="117,83,119,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'intron1', '', 'start: 30169004', '', 'stop: 30169083', '', 'length: 79 bp ' ));;return false;" title="intron1 " alt="intron1 " target="_blank"/>
<area shape="rect" coords="119,83,123,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'exon2', '', 'start: 30169084', '', 'stop: 30169299', '', 'length: 215 bp' ));;return false;" title="exon2 " alt="exon2 " target="_blank"/>
<area shape="rect" coords="123,83,124,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'intron2', '', 'start: 30169300', '', 'stop: 30169373', '', 'length: 73 bp ' ));;return false;" title="intron2
...
</div>


 This is part of the code I used in my HTML file to display the image map, and it runs beautifully
 with Mozilla 1.7 and the latest Firefox version. In IE6, however, the clickable pop-ups do not appear or work.
 
 I appreciate any help and would like to thank everyone for their help. 
 
 Best regards, 
 
 
 Anh-Thu
________________________________________________________________________
GSF-Forschungszentrum

Ingolstädter Landstr. 1

85764 München-Neuherberg, Germany

Chairman of Supervisory Board: MinDir Dr. Peter Lange

Board of Directors: Prof. Dr. Günther Wess and Dr. Nikolaus Blum

Register of Societies: Amtsgericht München HRB 6466


From lstein at cshl.edu  Tue Jun  5 09:56:57 2007
From: lstein at cshl.edu (Lincoln Stein)
Date: Tue, 5 Jun 2007 09:55:57 -0401
Subject: [Bioperl-l] problems with image maps and IE 6 or higher
In-Reply-To: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de>
References: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de>
Message-ID: <6dce9a0b0706050656n783d27b3u9229f948b2710d90@mail.gmail.com>

Hi Anh-Thu,

Could you send me a snippet of the code that is generating this imagemap? It
looks like you are relying on a javascript library for the zmenu() call, and
it may be that this library is in need of updating.

You might also consider replacing the library with Sheldon McKay's popup
balloon library, located at
http://www.wormbase.org/wiki/index.php/Balloon_Tooltips

Lincoln

On 6/5/07, Tieu, Anh-Thu <anhthu.tieu at gsf.de> wrote:
>
> Hi,
>
> I have a problem using the bioperl image maps function with the IE6 or and
> higher browser. It might be a more general problem with IE6 rather than
> with bioperl,
> but as I used bioperl to create my image maps, I thought I could still
> post this problem
> here and ask for people's opinion. I wondered if anyone else faced the
> same problem and if
> possible if anyone could share their experiences and their solutions.
>
>
> <div>
> <p><img src="/ggtc/tmp_bilder/19727dab708e1cbf567dd48480febb96.png"
> usemap="mapnameD064C01" style="border:2px solid #CCCCCC;"/></p>
> <map name="mapnameD064C01" id="mapnameD064C01">
> <area shape="rect" coords="108,0,608,20" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'scale' ));;return false;" title="scale "
> alt="scale " target="_blank"/>
> <area shape="rect" coords="234,44,244,55" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '',
> 'start: ', '', 'stop: 0', '', 'length:  bp', '', 'identity: ', '', 'e-v
> alue: ' ));;return false;" title="alignment5 " alt="alignment5 "
> target="_blank"/>
> <area shape="rect" coords="241,57,247,68" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '',
> 'start: ', '', 'stop: 0', '', 'length:  bp', '', 'identity: ', '', 'e-v
> alue: ' ));;return false;" title="integration_pt " alt="integration_pt "
> target="_blank"/>
> <area shape="rect" coords="108,70,608,81" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'Nphs1                                   ',
> '', 'ensembl_id: ENSMUSG00000006649', '', 'start: 30168485', '', '
> stop: 30195968', '', 'length: 27483 bp' ));;return false;" title="gene "
> alt="gene " target="_blank"/>
> <area shape="rect" coords="108,83,117,94" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'exon1', '', 'start: 30168485', '', 'stop:
> 30169003', '', 'length: 518 bp' ));;return false;" title="exon1 " a
> lt="exon1 " target="_blank"/>
> <area shape="rect" coords="117,83,119,94" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'intron1', '', 'start: 30169004', '', 'stop:
> 30169083', '', 'length: 79 bp ' ));;return false;" title="intron1
> " alt="intron1 " target="_blank"/>
> <area shape="rect" coords="119,83,123,94" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'exon2', '', 'start: 30169084', '', 'stop:
> 30169299', '', 'length: 215 bp' ));;return false;" title="exon2 " a
> lt="exon2 " target="_blank"/>
> <area shape="rect" coords="123,83,124,94" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'intron2', '', 'start: 30169300', '', 'stop:
> 30169373', '', 'length: 73 bp ' ));;return false;" title="intron2
> ..
> </div>
>
>
> This is part of the code I used in my HTML file to display the image map
> and it really runs beautifully
> with Mozilla 1.7 or the latest Firefox version. However, if used in IE6
> the clickable pop-ups do not appear/ work.
>
> I appreciate any help and would like to thank everyone for their help.
>
> Best regards,
>
>
> Anh-Thu
> ________________________________________________________________________
> GSF-Forschungszentrum
>
> Ingolstädter Landstr. 1
>
> 85764 München-Neuherberg, Germany
>
> Chairman of Supervisory Board: MinDir Dr. Peter Lange
>
> Board of Directors: Prof. Dr. Günther Wess and Dr. Nikolaus Blum
>
> Register of Societies: Amtsgericht München HRB 6466
>
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From cjfields at uiuc.edu  Tue Jun  5 11:28:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 5 Jun 2007 10:28:24 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <46656D64.7010508@ribosome.natur.cuni.cz>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
Message-ID: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>

Martin,

The example file you give in the bioperl bugzilla report has several  
blank annotation lines which may lead to additional problems.  When  
the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM,  
DEFINITION, etc) then it expects there will also be relevant data  
(text descriptions) accompanying it; I assume the BioPython parser  
expects likewise though I may be wrong.

AFAIK the inclusion of field names w/o text isn't GenBank/EMBL- 
compliant.  GenBank records lacking text either have a '.' instead or  
are left out entirely:

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

We could add a fix but you should probably contact the ApE developers  
and request that field names w/o text be left out or have '.' added.

chris

On Jun 5, 2007, at 9:04 AM, Martin MOKREJŠ wrote:

> Ezequiel Panepucci wrote:
>>>     genbank entry = parser.parse(fhandle)
>>
>> there is a space character between "genbank" and "entry".
>> It is a syntax error.
>> I suppose you meant "genbank_entry" ?
>
> Yes, the next command was right and has shown the error. Sorry, I  
> forgot
> to delete the first attempt. ;-)
>
>>>> genbank_entry = parser.parse(fhandle)
> Traceback (most recent call last):
>  File "<stdin>", line 1, in ?
>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py",  
> line 187, in parse
>    self._scanner.feed(handle, self._consumer)
>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",  
> line 360, in feed
>    self._feed_first_line(consumer, self.line)
>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",  
> line 835, in _feed_first_line
>    assert False, \
> AssertionError: Did not recognise the LOCUS line layout:
> LOCUS               6499 bp ds-DNA     linear       02-AUG-2006
>
>>>>
>
> Martin
> _______________________________________________
> BioPython mailing list  -  BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From stewarta at nmrc.navy.mil  Tue Jun  5 11:34:14 2007
From: stewarta at nmrc.navy.mil (Andrew Stewart)
Date: Tue, 5 Jun 2007 11:34:14 -0400
Subject: [Bioperl-l] Setting attributes on a Bio::DB::GFF::Feature object
Message-ID: <95C9F539-A4C4-4B6A-8DA8-079B957BF909@nmrc.navy.mil>

I see bidirectional mutator methods for source, type, strand, etc. in  
the Bio::DB::GFF::Feature documentation but I see that ->attributes  
is only able to get and not set the feature attributes.  Is there no  
way to modify the attributes of a Bio::DB::GFF::Feature live?


--
Andrew Stewart
Research Assistant, Genomics Team
Navy Medical Research Center (NMRC)
Biological Defense Research Directorate (BDRD)
BDRD Annex
12300 Washington Avenue, 2nd Floor
Rockville, MD 20852

email: stewarta at nmrc.navy.mil
phone: 301-231-6700 Ext 270


From cjfields at uiuc.edu  Tue Jun  5 12:07:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 5 Jun 2007 11:07:41 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
	<24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
Message-ID: <D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>

One thing I missed which explains the biopython error: the LOCUS line
is missing the locus identifier (see the NCBI example record link).
This doesn't choke the bioperl parser but it appears to stop the
biopython parser in its tracks (maybe a feature instead of a bug!).

You should try adding a unique identifier (maybe the name of the file  
or record) to the LOCUS line to see if it works:

LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006

The bioperl parser in CVS writes out the correct alphabet when this  
is added:

LOCUS       testfile                6499 bp    ds-DNA  linear   02-AUG-2006

I'll try adding a warning to the bioperl parser for this.
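
As a rough idea of what such a check could look like, here is a
core-Perl heuristic.  It is only a sketch, not the logic of the bioperl
or biopython parsers: locus_name() is a hypothetical helper that treats
the identifier as missing when the first field after LOCUS already looks
like the sequence length.

```perl
use strict;
use warnings;

# Return the locus name from a GenBank LOCUS line, or nothing (undef in
# scalar context) if the first field after "LOCUS" already looks like the
# sequence length, e.g. "6499 bp".
sub locus_name {
    my ($line) = @_;
    return unless $line =~ /^LOCUS\s+(\S+)\s+(\S+)/;
    my ($first, $second) = ($1, $2);
    return if $first =~ /^\d+$/ && $second =~ /^(?:bp|aa)$/i;
    return $first;
}

for my $line (
    'LOCUS               6499 bp ds-DNA     linear       02-AUG-2006',
    'LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006',
) {
    my $name = locus_name($line);
    print defined $name
        ? "locus name: $name\n"
        : "warning: LOCUS line has no identifier\n";
}
```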

chris

On Jun 5, 2007, at 10:28 AM, Chris Fields wrote:

> Martin,
>
> The example file you give in the bioperl bugzilla report has several
> blank annotation lines which may lead to additional problems.  When
> the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM,
> DEFINITION, etc) then it expects there will also be relevant data
> (text descriptions) accompanying it; I assume the BioPython parser
> expects likewise though I may be wrong.
>
> AFAIK the inclusion of field names w/o text isn't GenBank/EMBL-
> compliant.  GenBank records lacking text either have a '.' instead or
> are left out entirely:
>
> http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html
>
> We could add a fix but you should probably contact the ApE developers
> and request that field names w/o text be left out or have '.' added.
>
> chris
>
> On Jun 5, 2007, at 9:04 AM, Martin MOKREJŠ wrote:
>
>> Ezequiel Panepucci wrote:
>>>>     genbank entry = parser.parse(fhandle)
>>>
>>> there is a space character between "genbank" and "entry".
>>> It is a syntax error.
>>> I suppose you meant "genbank_entry" ?
>>
>> Yes, the next command was right and has shown the error. Sorry, I
>> forgot
>> to delete the first attempt. ;-)
>>
>>>>> genbank_entry = parser.parse(fhandle)
>> Traceback (most recent call last):
>>  File "<stdin>", line 1, in ?
>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py",
>> line 187, in parse
>>    self._scanner.feed(handle, self._consumer)
>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>> line 360, in feed
>>    self._feed_first_line(consumer, self.line)
>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>> line 835, in _feed_first_line
>>    assert False, \
>> AssertionError: Did not recognise the LOCUS line layout:
>> LOCUS               6499 bp ds-DNA     linear       02-AUG-2006
>>
>>>>>
>>
>> Martin
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From staffa at niehs.nih.gov  Tue Jun  5 22:00:34 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Tue, 05 Jun 2007 22:00:34 -0400
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <C170E69F.246E%staffa@niehs.nih.gov>
Message-ID: <C28B8D82.51AE%staffa@niehs.nih.gov>

I am wondering whether, if I knew exactly what this error message meant, I could
discern my error.
I don't see much difference between this program and programs that worked.
Can I assume that the call to new worked because an index file exists?
I don't know how the filehandle UTR_TT_GENES gets involved.
Maybe I should use some other module, but I really would like to have
get_Seq_by_id functionality.
The error message:
Dpse ortholog = Dpse_GA17307
fetching GA17307
Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84, <UTR_TT_GENES> line 4.

Relevant code:
#!/usr/bin/perl
#
#
#
use strict;
use Bio::DB::Fasta;
use Bio::Tools::SeqWords;
use Bio::Seq;
use Bio::SeqIO;
#
my $db = Bio::DB::Fasta->new(
    '/home/staffa/clients/Kari/D_pse_genome/testit/TT_orthologs_Dpse_genes.fa',
    -makeid => \&make_my_id);
...
...
...
my $pse_obj = $db->get_Seq_by_id('GA17307');
my $pse_sequence = $pse_obj->seq;


Nick Staffa 
Telephone: 919-316-4569  (NIEHS: 6-4569)
Scientific Computing Support Group
NIEHS Information Technology Support Services Contract
(Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov)
National Institute of Environmental Health Sciences
National Institutes of Health
Research Triangle Park, North Carolina


From jason at bioperl.org  Tue Jun  5 23:12:40 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 5 Jun 2007 20:12:40 -0700
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <C28B8D82.51AE%staffa@niehs.nih.gov>
References: <C28B8D82.51AE%staffa@niehs.nih.gov>
Message-ID: <EC9E4A2E-2C06-4ADE-8317-9E25DDF1C9C4@bioperl.org>

The filehandle is probably not important; Perl just reports it if
there is a filehandle open.

More importantly, what is on line 84?

My guess is you are trying to get a sequence out and it doesn't exist;
some error checking around the lines getting the sequence out would be
helpful.


On Jun 5, 2007, at 7:00 PM, Staffa, Nick (NIH/NIEHS) wrote:

> I am wondering if I knew what this error message exactly meant, if  
> I could
> discern my error.
> I don't see much difference in this program and programs that worked.
> Can I assume that the new worked because an index file exists?
> I don't know how the filehandle UTR_TT_GENES gets involved.
> Maybe I should use some other module, but I really would like to have
> get_Seq_by_id functionality.
>
> The error message:
> Dpse ortholog = Dpse_GA17307
> fetching GA17307
> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl  
> line 84,
> <UTR_TT_GENES> line 4.
>
> Relevant code:
> #!/usr/bin/perl
> #
> #
> #
> use strict;
> use Bio::DB::Fasta;
> use Bio::Tools::SeqWords;
> use Bio::Seq;
> use Bio::SeqIO;
> #
> my $db =
> Bio::DB::Fasta->new('/home/staffa/clients/Kari/D_pse_genome/testit/ 
> TT_orthol
> ogs_Dpse_genes.fa',
>                                 -makeid => \&make_my_id);
> ...
> ...
> ...
> my $pse_obj = $db->get_Seq_by_id('GA17307');
> my $pse_sequence = $pse_obj->seq;
>
>
>
>
> Nick Staffa
> Telephone: 919-316-4569  (NIEHS: 6-4569)
> Scientific Computing Support Group
> NIEHS Information Technology Support Services Contract
> (Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov)
> National Institute of Environmental Health Sciences
> National Institutes of Health
> Research Triangle Park, North Carolina
>
>
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/



From torsten.seemann at infotech.monash.edu.au  Wed Jun  6 02:06:37 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 6 Jun 2007 16:06:37 +1000
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <C28B8D82.51AE%staffa@niehs.nih.gov>
References: <C170E69F.246E%staffa@niehs.nih.gov>
	<C28B8D82.51AE%staffa@niehs.nih.gov>
Message-ID: <a79f6a4b0706052306r16f7ce61y28448c18349ac3f4@mail.gmail.com>

Nick,

> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,

The error makes it pretty clear. You are calling the ->seq method on
an undefined value, ie. $pse_obj.

> my $pse_obj = $db->get_Seq_by_id('GA17307');

# check we got something!
die "sequence not in database" unless $pse_obj;

> my $pse_sequence = $pse_obj->seq;


-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010


From shameer at ncbs.res.in  Wed Jun  6 02:27:42 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Wed, 6 Jun 2007 11:57:42 +0530 (IST)
Subject: [Bioperl-l] Validation of files using BioPerl
Message-ID: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>

Dear All,

How to validate an input file in fasta/PIR/GenPept/PDB format using
Bioperl ? (This is to avoid unnecessary files to be submitted to servers
by new users).   Any module available ?

Many thanks in advance,
-- 
Shameer Khadar


From cjfields at uiuc.edu  Wed Jun  6 08:37:28 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 6 Jun 2007 07:37:28 -0500
Subject: [Bioperl-l] Validation of files using BioPerl
In-Reply-To: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>
References: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>
Message-ID: <39F5F622-0C93-4DC5-B969-491F789FC932@uiuc.edu>

It has been discussed but never coded.  I believe if it passes
through the Bio::SeqIO parser it's generally considered validly
formatted (spacing, balanced quotes), though the parser doesn't specifically
check FT keys and qualifiers for invalid ones, look for missing
annotation, check taxonomy, etc.

As long as the end-of-record mark (//) is present for every record, you
could try splitting the file into chunks (read with 'local $/ = '//';')
and passing each chunk as a filehandle (via IO::String) to a
Bio::SeqIO object wrapped in an eval block (the parser resets $/, so
it should work).  Follow the eval with a check of $@ for caught
errors.  It might get tedious for big sequences...
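
The chunk-and-eval idea can be sketched in core Perl as follows.  This is
an untested-against-BioPerl illustration: validate_records() and the toy
parser are hypothetical, and the real parser is left as a caller-supplied
hook so the sketch runs without BioPerl installed.

```perl
use strict;
use warnings;

# Split flat-file text into records ending in "//" and run a parser on
# each one inside eval, collecting the error for any record that dies.
# $parse is a caller-supplied coderef (hypothetical hook).
sub validate_records {
    my ($text, $parse) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    # Change $/ only while reading, so the parser hook sees the default.
    my @chunks = do { local $/ = "//\n"; <$fh> };
    close $fh;

    my @errors;
    my $n = 0;
    for my $chunk (@chunks) {
        next unless $chunk =~ /\S/;    # skip trailing whitespace
        $n++;
        my $ok = eval { $parse->($chunk); 1 };
        push @errors, "record $n: $@" unless $ok;
    }
    return @errors;
}

# Toy parser standing in for a real one: insists on a LOCUS line.
my $toy_parser = sub {
    my ($chunk) = @_;
    die "no LOCUS line\n" unless $chunk =~ /^LOCUS\s/m;
};

my $text = "LOCUS  rec1\nORIGIN\n//\nDEFINITION  orphan\n//\n";
print for validate_records($text, $toy_parser);
```

With BioPerl available, the hook would presumably wrap a Bio::SeqIO
object reading the chunk through IO::String, as described above, and
call next_seq on it.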

chris

On Jun 6, 2007, at 1:27 AM, Shameer Khadar wrote:

> Dear All,
>
> How to validate an input file in fasta/PIR/GenPept/PDB format using
> Bioperl ? (This is to avoid unnecessary files to be submitted to  
> servers
> by new users).   Any module available ?
>
> Many thanks in advance,
> -- 
> Shameer Khadar
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From staffa at niehs.nih.gov  Wed Jun  6 10:40:49 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Wed, 06 Jun 2007 10:40:49 -0400
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <a79f6a4b0706052306r16f7ce61y28448c18349ac3f4@mail.gmail.com>
Message-ID: <C28C3FB1.4B73%staffa@niehs.nih.gov>

Indeed.
One must know what is actually in his header,
AND 
one must write the appropriate make_id subroutine
AND
one must specify the exact ID.
THEN things might work.
And they did!
THANK YOU
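
For anyone hitting the same problem, this is roughly what a matching
make_id routine can look like.  It is a sketch only: the header layout
(a "Dpse_" prefix in front of the ID) is an assumption based on the
"Dpse_GA17307" in the output above, not the actual file, and the routine
body is hypothetical.  The callback given to -makeid receives the FASTA
header line and returns the ID to index under.

```perl
use strict;
use warnings;

# Hypothetical -makeid callback: strips ">" and an assumed "Dpse_" prefix
# so that the sequence is indexed as plain "GA17307".
sub make_my_id {
    my ($header) = @_;
    my ($id) = $header =~ /^>?\s*(?:Dpse_)?(\S+)/;
    return $id;
}

for my $header ( '>Dpse_GA17307 some description', '>GA10001' ) {
    printf "%-32s => %s\n", $header, make_my_id($header);
}
```

The ID passed to get_Seq_by_id then has to match what this routine
returns, character for character.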


On 6/6/07 2:06 AM, "Torsten Seemann"
<torsten.seemann at infotech.monash.edu.au> wrote:

> Nick,
> 
>> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,
> 
> The error makes it pretty clear. You are calling the ->seq method on
> an undefined value, ie. $pse_obj.
> 
>> my $pse_obj = $db->get_Seq_by_id('GA17307');
> 
> # check we got something!
> die "sequence not in database" unless $pse_obj;
> 
>> my $pse_sequence = $pse_obj->seq;
> 


From jaudall at gmail.com  Wed Jun  6 17:51:33 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Wed, 6 Jun 2007 15:51:33 -0600
Subject: [Bioperl-l] blastxml interation
Message-ID: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>

I was searching in the Deobfuscator under Bio::Search::Result::BlastResult
but there doesn't seem to be a method to extract the iteration number from a
blastxml report.  I can see this number being useful for counting the
number of queries that didn't hit anything, since there are no empty reports in
the blastxml output.  If I'm missing something, I would welcome an example of
how to retrieve the result iteration number.  Thanks in advance for any
suggestions.

Josh


From dmessina at wustl.edu  Wed Jun  6 18:18:26 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 6 Jun 2007 17:18:26 -0500
Subject: [Bioperl-l] blastxml interation
In-Reply-To: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
Message-ID: <CBBAAD1F-563D-4B43-B086-F707989939EA@wustl.edu>

I think you want to look at the hits(), num_hits() and no_hits_found()
methods. There is a private method _next_iteration_index() which
should do what you asked for, but num_hits() looks like the better way.

By the way, hits() and num_hits() are listed on the Deobfuscator as  
having no documentation. This (as the below shows) is incorrect and  
is due to some nonstandard formatting issues which I will correct.  
_next_iteration_index() isn't listed on the Deobfuscator because it's  
a private method.


Hope this helps!
Dave


hits()

This method overrides Bio::Search::Result::GenericResult::hits to take
into account the possibility of multiple iterations, as occurs in
PSI-BLAST reports.
If there are multiple iterations, all 'new' hits for all iterations
are returned.
These are the hits that did not occur in a previous iteration.
See Also: Bio::Search::Result::GenericResult::hits

num_hits()

This method overrides Bio::Search::Result::GenericResult::num_hits to
take into account the possibility of multiple iterations, as occurs in
PSI-BLAST reports.
If there are multiple iterations, calling num_hits() returns the
number of 'new' hits for each iteration. These are the hits that did
not occur in a previous iteration.
See Also: Bio::Search::Result::GenericResult::num_hits

no_hits_found()

  Usage     : $nohits = $blast->no_hits_found( $iteration_number );
  Purpose   : Get boolean indicator indicating whether or not any hits
              were present in the report.
              This is NOT the same as determining the number of hits via
              the hits() method, which will return zero hits if there
              were no hits in the report or if all hits were filtered
              out during the parse.

              Thus, this method can be used to distinguish these
              possibilities for hitless reports generated when filtering.

  Returns   : Boolean
  Argument  : (optional) integer indicating the iteration number
              (PSI-BLAST).
              If iteration number is not specified and this is a
              PSI-BLAST result, then this method will return true only
              if all iterations had no hits found.


From apurva at cshl.edu  Wed Jun  6 19:51:45 2007
From: apurva at cshl.edu (Apurva Narechania)
Date: Wed, 6 Jun 2007 19:51:45 -0400
Subject: [Bioperl-l] non-palindromic issue in Bio::Restriction::Analysis
Message-ID: <3F7C7E33-416A-4141-969A-DDC4716E8A44@cshl.edu>

Hi,

I was hoping you could confirm and give me some feedback on an issue
I think I've found with the Bio::Restriction::Analysis module. I am
using the enzyme AciI, a non-palindromic restriction enzyme with a 5'
C | CGC 3' recognition site. The module should search both the
forward and the reverse complement strings in the case of a
non-palindromic enzyme. I have found that this works only
intermittently. For example, the following sequence:

GAAAAAAACAAAGGAAGAAGCTAGCTAGCAGGGCACGCGGTTTGAGGATGGCTGGTGGCCGACCGCAGGGCG 
CGCGGTTG
GAGGATTGCTGGTGGCCGACCAGATGAAACTCACGCGCGGCTGGGGACAGCTGGAATATTTGGGCGGCGGCG 
GCTGGTAT
TACGGGAAAGGAGAGATAGGGTTTTGGACGGCAGCAGCTGGTATTTGGGCCACCAATTTTGCGCGCCAGTAC 
AGGACACC
GATGCCGCAAATTGCACAATGCCTTTTATGGCGACTGACAGTGCGATGCTATAGGTATGAATTGTCGACTGA 
CAAAGTGA
CACTATTCACATATAAATATAACGAATAACACTCAGTTGGAATATAGACATATGCCGACTCACCATCTGTGG 
CAATGTAT
ACCGACTAACAATTCGATGCTAATTCTCTATTTATAGCGACAGTCGTCAGACACTAATTTGGTGTTGTGGTA 
TAATGCTA
GTGCCTCACCGCTGTAGGTGTTGGTCTACTGGTGC

Should digest into 10 fragments using this enzyme, but the module  
produces only 7. Could you please confirm this behavior, and if  
observed, suggest some possible fixes? This may be a bug in the  
_non_pal_enz method, or may be me overlooking something pretty obvious.
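
An independent cross-check that does not go through
Bio::Restriction::Analysis can be sketched in plain Perl: count the site and
its reverse complement on the given strand, then add one for a linear
molecule. This is only a sanity check under two assumptions (the molecule is
linear, and every site yields a distinct cut); it is not how the module
works internally, and the sub names are made up.

```perl
use strict;
use warnings;

# Reverse-complement a DNA string.
sub revcomp {
    my ($seq) = @_;
    my $rc = scalar reverse $seq;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Count occurrences of $site in $seq, case-insensitively. The lookahead
# lets overlapping sites (e.g. CCGCCGC) be counted separately.
sub count_sites {
    my ($seq, $site) = @_;
    my $count = () = $seq =~ /(?=\Q$site\E)/gi;
    return $count;
}

# A non-palindromic site on the reverse strand shows up on the forward
# strand as its reverse complement, so both patterns are counted.
sub expected_fragments {
    my ($seq, $site) = @_;
    $seq =~ s/\s+//g;                        # tolerate wrapped input
    my $rc_site = revcomp($site);
    my $cuts = count_sites($seq, $site);
    $cuts += count_sites($seq, $rc_site) if uc $rc_site ne uc $site;
    return $cuts + 1;                        # linear molecule: cuts + 1
}

# Usage: perl count_frags.pl CCGC < sequence.txt
if (@ARGV == 1) {
    my $seq = do { local $/; <STDIN> };
    print expected_fragments($seq, $ARGV[0]), " fragments expected\n";
}
```

For AciI the patterns counted are CCGC and GCGG.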

Thanks,
Apurva Narechania.


From cjfields at uiuc.edu  Wed Jun  6 20:51:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 6 Jun 2007 19:51:00 -0500
Subject: [Bioperl-l] blastxml interation
In-Reply-To: <CBBAAD1F-563D-4B43-B086-F707989939EA@wustl.edu>
References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
	<CBBAAD1F-563D-4B43-B086-F707989939EA@wustl.edu>
Message-ID: <B494A9F2-80CE-4761-B67F-127B37358819@uiuc.edu>

Joshua,

Just to make sure there is no confusion, do you mean a  
Bio::Search::Iteration::IterationI-based object?  The iteration tags  
have multiple meanings apparently in BLAST XML output (multiple  
queries, multiple PSI-BLAST iterations).  The current  
SearchIO::blastxml parser returns multiple  
Bio::Search::Result::BlastResult objects based on the iterations, so  
PSI-BLAST output is treated as multiple BLAST reports regardless  
(i.e. no Iteration objects).  This is something I want to rectify but  
it may not be an easy fix.
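
In the meantime, one workaround for counting queries without hits is to
compare the query IDs in the input FASTA file with the query names that
come back with at least one hit. A rough sketch, assuming query_name()
returns the same ID as the first word of the FASTA header (worth checking
on your own reports); the sub names are made up, while next_result(),
query_name() and num_hits() are the usual SearchIO calls.

```perl
use strict;
use warnings;

# IDs (first word after '>') of every query in the FASTA file.
sub fasta_ids {
    my ($fasta) = @_;
    open my $fh, '<', $fasta or die "Cannot open $fasta: $!\n";
    my @ids;
    while (<$fh>) { push @ids, $1 if /^>(\S+)/ }
    close $fh;
    return @ids;
}

# Names of queries that have at least one hit in the BLAST XML report.
sub query_names_with_hits {
    my ($report) = @_;
    require Bio::SearchIO;
    my $in = Bio::SearchIO->new(-format => 'blastxml', -file => $report);
    my %with_hits;
    while (my $result = $in->next_result) {
        $with_hits{ $result->query_name } = 1 if $result->num_hits > 0;
    }
    return sort keys %with_hits;
}

# Queries present in the input but absent from the list of names with hits.
sub queries_without_hits {
    my ($all_ids, $hit_names) = @_;
    my %seen = map { $_ => 1 } @$hit_names;
    return grep { !$seen{$_} } @$all_ids;
}

# Usage: perl nohits.pl queries.fasta report.xml
if (@ARGV == 2) {
    my ($fasta, $report) = @ARGV;
    my @hit  = query_names_with_hits($report);
    my @none = queries_without_hits([ fasta_ids($fasta) ], \@hit);
    print scalar(@none), " queries had no hits\n";
    print "$_\n" for @none;
}
```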

chris



From hlapp at gmx.net  Wed Jun  6 20:45:14 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 6 Jun 2007 20:45:14 -0400
Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db
Message-ID: <DC1816E2-68C0-400C-A777-F6D14DE2B870@gmx.net>

I have added support to BioSQL and bioperl-db for schemas in  
PostgreSQL. A schema in PostgreSQL is more or less a namespace for  
database objects (tables, indexes, views, etc) within a database.

(A database in PostgreSQL is similar to the concept of a user in  
Oracle or MySQL, and therefore for the latter two schemas are  
synonymous with a user. [Not sure I'm still up-to-date on this for  
MySQL, but at least that's what I recall.])

When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts,  
you specify the schema in which BioSQL resides using the --schema  
option.

If you are using bioperl-db as a library, the Bio::DB::BioDB->new()
call also accepts a -schema named parameter, Bio::DB::DBContextI
objects have a $dbc->schema() property for getting and setting the
schema, and Bio::DB::SimpleDBContext->new() accepts a -schema
parameter. You may also add the property to the .bioperldb connection
parameter file (-schema => 'yourschemahere').
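
As an illustration of the library route, a connection might be set up
along these lines. Only -schema is the new part; the host, database name,
user and password are placeholders, and the helper names biosql_params and
open_biosql are made up for this sketch.

```perl
use strict;
use warnings;

# Connection parameters for Bio::DB::BioDB->new(); every value except the
# schema name passed in is a placeholder to be replaced.
sub biosql_params {
    my ($schema) = @_;
    return (
        -database => 'biosql',
        -driver   => 'Pg',
        -host     => 'localhost',
        -dbname   => 'biosql',
        -user     => 'biosql_user',
        -pass     => 'secret',
        -schema   => $schema,      # PostgreSQL schema holding the BioSQL tables
    );
}

sub open_biosql {
    my ($schema) = @_;
    require Bio::DB::BioDB;
    return Bio::DB::BioDB->new(biosql_params($schema));
}

# Usage: perl connect.pl yourschemahere
if (@ARGV == 1) {
    my $db = open_biosql($ARGV[0]);
    print "adaptor factory created for schema $ARGV[0]\n";
}
```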

Thanks to Brian Osborne for being the instigator (and tester, and
for adding the code to load_ncbi_taxonomy.pl - I came too late).

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jaudall at gmail.com  Wed Jun  6 17:41:08 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Wed, 6 Jun 2007 15:41:08 -0600
Subject: [Bioperl-l] blastxml interation number
Message-ID: <52cea20c0706061441n96ce803v9422e8d14461c2bd@mail.gmail.com>

I was searching in the Deobfuscator under
Bio::Search::Result::BlastResult, but there doesn't seem to be a method to
extract the iteration number from a blastxml report.  I can see this number
being very useful for counting the queries that didn't hit anything, since
there are no empty reports in the blastxml output.  If I'm missing
something, I would welcome an example of how to retrieve the result
iteration number; otherwise I'm suggesting that an iteration_count feature
be added to the Result object.  Thanks in advance for any suggestions.

Josh


From holland at ebi.ac.uk  Thu Jun  7 03:33:25 2007
From: holland at ebi.ac.uk (Richard Holland)
Date: Thu, 07 Jun 2007 08:33:25 +0100
Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db
In-Reply-To: <DC1816E2-68C0-400C-A777-F6D14DE2B870@gmx.net>
References: <DC1816E2-68C0-400C-A777-F6D14DE2B870@gmx.net>
Message-ID: <4667B4C5.6070107@ebi.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sounds great.

BioJava users shouldn't need to change anything to get this to work as
PostgreSQL JDBC connection objects already require you to specify a schema.

cheers,
Richard


Hilmar Lapp wrote:
> I have added support to BioSQL and bioperl-db for schemas in PostgreSQL.
> A schema in PostgreSQL is more or less a namespace for database objects
> (tables, indexes, views, etc) within a database.
> 
> (A database in PostgreSQL is similar to the concept of a user in Oracle
> or MySQL, and therefore for the latter two schemas are synonymous with a
> user. [Not sure I'm still up-to-date on this for MySQL, but at least
> that's what I recall.])
> 
> When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts, you
> specify the schema in which BioSQL resides using the --schema option.
> 
> If you are using bioperl-db as a library, the Bio::DB::BioDB->new() call
> also accepts a -schema named parameter, and Bio::DB::DBContextI objects
> have a $dbc->schema() property for getting/setting the schema,
> Bio::DB::SimpleDBContext->new() accepts a -schema parameter, and you may
> also add the property to the .bioperldb connection parameter file
> (-schema => 'yourschemahere').
> 
> Thanks for Brian Osborne for being the instigator (and tester, and for
> adding the code to load_ncbi_taxonomy.pl - I came too late).
> 
>     -hilmar
> --===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
> 
> 
> 
> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGZ7TF4C5LeMEKA/QRApwUAJ48q46iX152pB6Xcc/717Ie8foUTQCgm3ij
W/+0iO/ZsNDn1pLuf5yXbYA=
=asUn
-----END PGP SIGNATURE-----


From mmokrejs at ribosome.natur.cuni.cz  Thu Jun  7 10:26:44 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Thu, 07 Jun 2007 16:26:44 +0200
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
	<24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
	<D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>
Message-ID: <466815A4.9060505@ribosome.natur.cuni.cz>

Hi,

Chris Fields wrote:
> One thing I missed which explains the biopython error: the LOCUS line is 
> missing the locus identifier (see the NCBI example record link).  This 
> doesn't choke the bioperl parser but it appears to stop the biopython 
> parser in its tracks (maybe a feature instead of a bug!).
> 
> You should try adding a unique identifier (maybe the name of the file or 
> record) to the LOCUS line to see if it works:
> 
> LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006
> 
> The bioperl parser in CVS writes out the correct alphabet when this is 
> added:
> 
> LOCUS       testfile                6499 bp    ds-DNA  linear   02-AUG-2006
> 
> I'll try adding a warning to the bioperl parser for this.

I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305 but let me
emphasize the LOCUS line now contains 

LOCUS                      pRL        5428 bp ds-DNA   linear       07-JUN-2007


which still does not comply with the line you have proposed. But it can be
parsed by bioperl-live from cvs. Is it still wrong? Testcase as pRL.gb-new
in the bugzilla record #2305.

Martin

> 
> chris
> 
> On Jun 5, 2007, at 10:28 AM, Chris Fields wrote:
> 
>> Martin,
>>
>> The example file you give in the bioperl bugzilla report has several
>> blank annotation lines which may lead to additional problems.  When
>> the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM,
>> DEFINITION, etc) then it expects there will also be relevant data
>> (text descriptions) accompanying it; I assume the BioPython parser
>> expects likewise though I may be wrong.
>>
>> AFAIK the inclusion of field names w/o text isn't GenBank/EMBL-
>> compliant.  GenBank records lacking text either have a '.' instead or
>> are left out entirely:
>>
>> http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html
>>
>> We could add a fix but you should probably contact the ApE developers
>> and request that field names w/o text be left out or have '.' added.
>>
>> chris
>>
>>> On Jun 5, 2007, at 9:04 AM, Martin MOKREJŠ wrote:
>>
>>> Ezequiel Panepucci wrote:
>>>>>     genbank entry = parser.parse(fhandle)
>>>>
>>>> there is a space character between "genbank" and "entry".
>>>> It is a syntax error.
>>>> I suppose you meant "genbank_entry" ?
>>>
>>> Yes, the next command was right and has shown the error. Sorry, I
>>> forgot
>>> to delete the first attempt. ;-)
>>>
>>>>>> genbank_entry = parser.parse(fhandle)
>>> Traceback (most recent call last):
>>>  File "<stdin>", line 1, in ?
>>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py",
>>> line 187, in parse
>>>    self._scanner.feed(handle, self._consumer)
>>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>>> line 360, in feed
>>>    self._feed_first_line(consumer, self.line)
>>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>>> line 835, in _feed_first_line
>>>    assert False, \
>>> AssertionError: Did not recognise the LOCUS line layout:
>>> LOCUS               6499 bp ds-DNA     linear       02-AUG-2006
>>>
>>>>>>
>>>
>>> Martin
>>> _______________________________________________
>>> BioPython mailing list  -  BioPython at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biopython
>>

-- 
Dr. Martin Mokrejs
Dept. of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs


From cjfields at uiuc.edu  Thu Jun  7 11:31:45 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 7 Jun 2007 10:31:45 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <466815A4.9060505@ribosome.natur.cuni.cz>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
	<24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
	<D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>
	<466815A4.9060505@ribosome.natur.cuni.cz>
Message-ID: <2A403865-F1E8-4D19-8D19-455C22E7C6D9@uiuc.edu>

On Jun 7, 2007, at 9:26 AM, Martin MOKREJŠ wrote:

> Hi,
>
> Chris Fields wrote:
>> One thing I missed which explains the biopython error: the LOCUS  
>> line is missing the locus identifier (see the NCBI example record  
>> link).  This doesn't choke the bioperl parser but it appears to  
>> stop the biopython parser in its tracks (maybe a feature instead  
>> of a bug!).
>> You should try adding a unique identifier (maybe the name of the  
>> file or record) to the LOCUS line to see if it works:
>> LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006
>> The bioperl parser in CVS writes out the correct alphabet when  
>> this is added:
>> LOCUS       testfile                6499 bp    ds-DNA  linear   02-AUG-2006
>> I'll try adding a warning to the bioperl parser for this.
>
> I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305  
> but let me
> emphasize the LOCUS line now contains
> LOCUS                      pRL        5428 bp ds-DNA   linear       07-JUN-2007
>
>
> which still does not comply with the line you have proposed. But it  
> can be
> parsed by bioperl-live from cvs. Is it still wrong? Testcase as  
> pRL.gb-new
> in the bugzilla record #2305.
>
> Martin

That should work.  There isn't a strict uniqueness test (that would
require caching and isn't worth the trouble IMHO), though you do need
a unique accession/locus for each record if you plan on indexing the
records in the future.
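
A quick pre-flight check for the problem in this thread (a LOCUS line
with no name) can be sketched with a loose regex. This is deliberately not
the full column layout from the NCBI release notes, and parse_locus_line
is a made-up name, not a BioPerl function.

```perl
use strict;
use warnings;

# Loose check: "LOCUS", then a name, then a length, then "bp".
# Returns a hash ref with the name and length, or undef if the name
# (or anything else up to "bp") is missing.
sub parse_locus_line {
    my ($line) = @_;
    return unless $line =~ /^LOCUS\s+(\S+)\s+(\d+)\s+bp\b/;
    return { name => $1, length => $2 };
}

# Usage: perl check_locus.pl file.gb [more.gb ...]
# Reports files whose first line lacks a locus name, and names seen twice.
my %seen;
for my $file (@ARGV) {
    open my $fh, '<', $file or die "Cannot open $file: $!\n";
    my $first = <$fh>;
    close $fh;
    my $locus = parse_locus_line(defined $first ? $first : '');
    if (!$locus)                    { print "$file: no locus name\n" }
    elsif ($seen{ $locus->{name} }++) { print "$file: duplicate name $locus->{name}\n" }
}
```

With the three LOCUS lines quoted in this thread, the original one
(no name) is rejected, while the 'testfile' and 'pRL' lines are accepted.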

Parsing GenBank data produced by third-party software is
problematic at best; some programs seem to follow no steadfast rule for
GenBank output, even though the specification is plainly
stated in the NCBI release notes.  My take on that is to have a
stricter (read: follows the release notes) GenBank parser which passes
off the data in the record to default handler methods.  A user could
then replace the defined handlers with their own by subclassing the
default handler class and overriding the methods, or by adding their
own code references directly.
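
To make the idea concrete, here is a toy sketch of such a handler
arrangement. Everything in it (package names, method names, the dispatch
rule) is hypothetical and only illustrates the design; none of it is
existing BioPerl API.

```perl
use strict;
use warnings;

package My::GenBank::Handler;      # hypothetical default handler class

sub new {
    my ($class, %custom) = @_;
    # %custom maps field names to code refs that replace the defaults
    return bless { custom => {%custom}, data => {} }, $class;
}

# Pick a handler for one field: a user code ref first, then a
# handle_FIELD method, then the catch-all.
sub dispatch {
    my ($self, $field, $text) = @_;
    my $handler = $self->{custom}{$field}
               || $self->can("handle_$field")
               || $self->can('handle_default');
    return $self->$handler($text, $field);
}

sub handle_LOCUS {
    my ($self, $text) = @_;
    my ($name) = $text =~ /^(\S+)/;
    return $self->{data}{locus} = $name;
}

sub handle_default {
    my ($self, $text, $field) = @_;
    return $self->{data}{ lc $field } = $text;
}

package My::Lenient::Handler;      # a user's subclass overriding one method
our @ISA = ('My::GenBank::Handler');

sub handle_LOCUS {
    my ($self, $text) = @_;
    # tolerate a missing locus name, as written by some third-party tools
    my $name = $text =~ /^\d+\s+bp\b/ ? 'unnamed' : ($text =~ /^(\S+)/)[0];
    return $self->{data}{locus} = $name;
}

package main;

# Alternative to subclassing: pass a code ref for a single field.
my $handler = My::GenBank::Handler->new(
    DEFINITION => sub { my ($self, $text) = @_; return ucfirst $text },
);
print $handler->dispatch(DEFINITION => 'cloning vector pRL'), "\n";
```

The strict parser itself would stay unchanged; only the object it hands
the fields to differs between users.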

chris

...


From rich at thevillas.eclipse.co.uk  Fri Jun  8 07:00:45 2007
From: rich at thevillas.eclipse.co.uk (richard)
Date: Fri, 08 Jun 2007 12:00:45 +0100
Subject: [Bioperl-l] protparam
Message-ID: <466936DD.8080604@thevillas.eclipse.co.uk>


Hi,

I noticed that in April someone asked whether there was a bioperl mod 
for obtaining protein sequence related properties using protparam.
I have a module that could potentially be submitted to bioperl for this 
purpose. Does anybody have any thoughts on whether it should go in?

Example script and the module are at:

http://81.5.159.173/webshare/ 


Cheers
Rich


From cjfields at uiuc.edu  Fri Jun  8 08:37:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 8 Jun 2007 07:37:27 -0500
Subject: [Bioperl-l] protparam
In-Reply-To: <466936DD.8080604@thevillas.eclipse.co.uk>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
Message-ID: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>

Richard,

We'll gladly add this in, though it'll need to be bioperlized  
(inherit Bio::Root::Root).  We also generally ask for tests but it  
should be easy to write up a quick test suite using any protein seq.

If you can, could you add some bioperl-like POD to the module (i.e.
SYNOPSIS, AUTHOR, DESCRIPTION, etc)?

thanks!

chris



From mmokrejs at ribosome.natur.cuni.cz  Fri Jun  8 07:09:42 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Fri, 08 Jun 2007 13:09:42 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file?
Message-ID: <466938F6.7050903@ribosome.natur.cuni.cz>

Hi,
  how can I convert a GenBank/EMBL-formatted file to a GFF file? The manpage for
Bio::Graphics::FeatureFile does not help me with this. The information is in
the file, so I just want to extract the features into GFF format; the sequence
probably has to be stored somewhere too ...
 Is there a tool that can convert it automatically? ;) This would be great. I
can't make the GFF manually for every file. Other programs draw plasmid maps
automatically from GenBank-formatted input, so how can I do it in bioperl?
Thanks for any help,
Martin


From shameer at ncbs.res.in  Fri Jun  8 10:11:00 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Fri, 8 Jun 2007 19:41:00 +0530 (IST)
Subject: [Bioperl-l] protparam
In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
	<4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
Message-ID: <54411.192.168.1.1.1181311860.squirrel@mail.ncbs.res.in>

Richard,

I was the one who asked for a protparam module in bioperl!
That's good work.

Cheers,
SK



-- 
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From dmessina at wustl.edu  Fri Jun  8 10:58:20 2007
From: dmessina at wustl.edu (David Messina)
Date: Fri, 8 Jun 2007 09:58:20 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <466938F6.7050903@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
Message-ID: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>

Hi Martin,

You're in luck -- the BioPerl core distribution includes two scripts  
for doing just that:

	genbank2gff
	genbank2gff3

Look in the scripts directory of the distro.

Also, there is a *huge* amount of documentation and examples on the  
BioPerl website.

	http://www.bioperl.org/wiki/HOWTOs

Reading those, reading the FAQ, and searching the mailing list  
archives are where I look first when I don't know how to do something  
in BioPerl.


Dave

--
Dave Messina
Senior Analyst, Assembly Group
Genome Sequencing Center
Washington University
St. Louis, MO


From rich at thevillas.eclipse.co.uk  Fri Jun  8 11:51:21 2007
From: rich at thevillas.eclipse.co.uk (richard)
Date: Fri, 08 Jun 2007 16:51:21 +0100
Subject: [Bioperl-l] protparam
In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
	<4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
Message-ID: <46697AF9.2090502@thevillas.eclipse.co.uk>


Hi,

ok, great, that's no problem. I'll add the POD and bioperlize it,

thanks
Rich



From cjfields at uiuc.edu  Fri Jun  8 13:45:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 8 Jun 2007 12:45:17 -0500
Subject: [Bioperl-l] protparam
In-Reply-To: <46697AF9.2090502@thevillas.eclipse.co.uk>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
	<4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
	<46697AF9.2090502@thevillas.eclipse.co.uk>
Message-ID: <AA43E9C9-7064-438A-89A9-12E4B21E4F04@uiuc.edu>

Another issue is namespace.  I suggest Bio::Tools::ProtParam, though  
there may be some others out there.

We can add support for direct Bio::Seq/PrimarySeq input and other  
odds and ends once it's committed.  Good work!

chris



From hlapp at gmx.net  Mon Jun 11 07:30:24 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 11 Jun 2007 07:30:24 -0400
Subject: [Bioperl-l] script to load ITIS taxonomy
Message-ID: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>

Hi all -

I added a script to load the ITIS taxonomy (www.itis.gov) into the  
phylodb module. It is called load_itis_taxonomy.pl and is in the  
scripts/ directory.

It is independent of BioPerl right now (the ITIS download is either a  
MS SQL Server or an Informix dump - no kidding), but I'm hoping that  
at some point support for this can be integrated into Bio::TreeIO.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 11 08:24:50 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 11 Jun 2007 07:24:50 -0500
Subject: [Bioperl-l] script to load ITIS taxonomy
In-Reply-To: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>
References: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>
Message-ID: <99AC6C0F-10DD-4587-AFB3-32BC495CD2BD@uiuc.edu>


On Jun 11, 2007, at 6:30 AM, Hilmar Lapp wrote:

> Hi all -
>
> I added a script to load the ITIS taxonomy (www.itis.gov) into the
> phylodb module. It is called load_itis_taxonomy.pl and is in the
> scripts/ directory.
>
> It is independent of BioPerl right now (the ITIS download is either a
> MS SQL Server or an Informix dump - no kidding), but I'm hoping that
> at some point support for this can be integrated into Bio::TreeIO.
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

I second the TreeIO support.  Anyone up for it?

chris


From ryanx07 at hotmail.com  Mon Jun 11 11:24:31 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Mon, 11 Jun 2007 10:24:31 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F372C263F25DA66E3F2B986B41A0@phx.gbl>

I just started learning BioPerl by reading the BioPerl Tutorial on the 
BioPerl website. I tried the first example on my Windows machine:
use Bio::Perl;
$seq_object = get_sequence('swiss',"ID ROA1_HUMAN");
write_sequence(">roa1.fasta",'fasta',$seq_object);

I got the error as the following:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:153
STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
STACK: t8.pl:7

I cannot figure out what is wrong, and I cannot find a solution on the web. 
Could someone help me please?

Also, this leads to my second question: is there a way to search the archive 
of this list?

Thanks so much


R



From dmessina at wustl.edu  Mon Jun 11 12:34:29 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 11:34:29 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F372C263F25DA66E3F2B986B41A0@phx.gbl>
References: <BAY106-F372C263F25DA66E3F2B986B41A0@phx.gbl>
Message-ID: <25517EA3-7BDA-44AC-BDF3-93A6810D9D63@wustl.edu>

The example code works here, but I'm on OS X. Could you tell us which  
version of Perl and BioPerl you are using, and which operating system?

Are you getting anything in the roa1.fasta file?


> is there a way to search in the archieve of the current list?

http://www.bioperl.org/wiki/Mailing_lists


Dave


From dmessina at wustl.edu  Mon Jun 11 14:48:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 13:48:23 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F39783926A21896CCB15CD9B41A0@phx.gbl>
References: <BAY106-F39783926A21896CCB15CD9B41A0@phx.gbl>
Message-ID: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu>

Hi,

Please use 'Reply All' so everyone on the list can follow the  
discussion.

Try adding the following line after the line that starts with  
$seq_object:

	print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";

And then run the program again. What do you get? Could you post a  
complete printout of what you're doing?


Dave


On Jun 11, 2007, at 11:45 AM, L Xu wrote:
> I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and  
> activeperl 5.8.8.819 Thank you very much.


From johnsonm at gmail.com  Mon Jun 11 20:45:13 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Mon, 11 Jun 2007 19:45:13 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
Message-ID: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>

    This bit in Bio::SeqFeature::Gene::Exon is causing me some
problems trying to extend Bio::Tools::Glimmer to handle 'wraparound'
genes (circular genomes):

sub location {
   my ($self,$value) = @_;

   if(defined($value) && $value->isa('Bio::Location::SplitLocationI')) {
       $self->throw("split or compound location is not allowed ".
                    "for an object of type " . ref($self));
   }
   return $self->SUPER::location($value);
}

    That seems to be there all the way back to the initial revision
(checked in by Hilmar).  I presume it's there because of code like
this ( from the seq() method in Bio::SeqFeature::Generic):

# assumming our seq object is sensible, it should not have to yank
# the entire sequence out here.

my $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());

    That's not going to work too well with a feature that has a
Bio::Location::Split location.  Fixing it up seems straightforward, if
a bit hackish.  Something like:

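my $seq;

# ref() matches the exact class only; subclasses of Bio::Location::Split
# would need an isa() check instead.
if (ref($self->location()) eq 'Bio::Location::Split') {
    my $seqstring = '';
    my @sublocs = $self->location()->sub_Location();

    # Concatenate the sequence under each sub-location, in order.
    foreach my $subloc (@sublocs) {
        $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(),
                                                 $subloc->end())->seq();
    }

    # Assign to the outer $seq; a second 'my' here would shadow it.
    $seq = Bio::Seq->new(
        -id  => $self->{'_gsf_seq'}->display_id(),
        -seq => $seqstring
    );
}
else {
    $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());
}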

    I don't see any companion to trunc() in Bio::PrimarySeqI for
joining sequences.  A join() would be handy, and make the above
cleaner.
    Comments, suggestions, rotten fruit?
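
    For illustration, the joining step can be sketched in plain Perl
without any BioPerl classes.  This is only a sketch under stated
assumptions: join_subsequences is a hypothetical helper (not a BioPerl
method), it works on a bare string rather than a Bio::PrimarySeqI
object, and it assumes 1-based inclusive coordinates on the forward
strand, given in the order they should be joined.

```perl
use strict;
use warnings;

# Hypothetical helper: concatenate 1-based, inclusive [start, end] ranges
# of a sequence string, in the order the ranges are given.
sub join_subsequences {
    my ($sequence, @ranges) = @_;
    my $joined = '';
    for my $range (@ranges) {
        my ($start, $end) = @$range;
        # substr() is 0-based, so shift the start by one.
        $joined .= substr($sequence, $start - 1, $end - $start + 1);
    }
    return $joined;
}

# Toy 16 bp "circular genome" with a gene that wraps around the origin:
# it covers positions 13..16 and then continues at 1..4.
my $genome = 'AAAACCCCGGGGTTTT';
my $gene   = join_subsequences($genome, [13, 16], [1, 4]);
print "$gene\n";   # prints TTTTAAAA
```

    For a wraparound gene the caller lists the piece that runs to the
end of the genome first and the piece that starts at position 1 second.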


From torsten.seemann at infotech.monash.edu.au  Tue Jun 12 02:18:27 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 12 Jun 2007 16:18:27 +1000
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
Message-ID: <a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>

Mark,

> if (ref($self->location()) eq 'Bio::Location::Split')) {
>     my $seqstring;
>     my @sublocs = $self->location()->sub_Location();
>
>     foreach my $subloc (@sublocs) {
>         $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(),
> $subloc->end())->seq();
>     }

Can you use the ->spliced_seq() method to do this?

http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010


From pengchy at yahoo.com.cn  Tue Jun 12 03:00:46 2007
From: pengchy at yahoo.com.cn (Pengcheng Yang)
Date: Tue, 12 Jun 2007 15:00:46 +0800 (CST)
Subject: [Bioperl-l] Can't locate loadable object for module
	TFBS::Ext::pwmsearch
Message-ID: <66745.92089.qm@web15205.mail.cnb.yahoo.com>

hi all,
   
  Today I downloaded the TFBS package from http://forkhead.cgb.ki.se/TFBS/, uncompressed it, and copied all the files in the TFBS and Ext directories to "C:\perl\site\lib", then put Ext under the TFBS directory. When I ran the example script1.pl, I got this error message: 
   
  Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC (@INC contains: C:/perl/site/lib C:/perl/lib .) at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
Compilation failed in require at script1.pl line 3, <DATA> line 206.
BEGIN failed--compilation aborted at script1.pl line 3, <DATA> line 206.
shell returned 2
   
  When I run the list_matrices.pl script, I get the same message. But when I empty the pwmsearch.pm file, I get the following message instead:
   
  TFBS/Ext/pwmsearch.pm did not return a true value at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
Compilation failed in require at script1.pl line 3, <DATA> line 206.
BEGIN failed--compilation aborted at script1.pl line 3, <DATA> line 206.
   
  Has anyone else met the same problem? Is it a bug in the TFBS package?


Best wishes!

Sincerely, Pengcheng
       


From bix at sendu.me.uk  Tue Jun 12 03:32:02 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 12 Jun 2007 08:32:02 +0100
Subject: [Bioperl-l] Can't locate loadable object for
	module	TFBS::Ext::pwmsearch
In-Reply-To: <66745.92089.qm@web15205.mail.cnb.yahoo.com>
References: <66745.92089.qm@web15205.mail.cnb.yahoo.com>
Message-ID: <466E4BF2.7020504@sendu.me.uk>

Pengcheng Yang wrote:
> hi all,
> 
> Today, I download the TFBS package from
> http://forkhead.cgb.ki.se/TFBS/, and uncompress it and copy all the
> files contained in the TFBS and Ext directories to directory
> "C:\perl\site\lib", then put Ext under the TFBS directory. I run the
> example script1.pl, but a wrong message respond:
> 
> Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC

You have to follow the installation instructions in the README file.
Copying the files out is insufficient - you have to 'make'.


From ryanx07 at hotmail.com  Tue Jun 12 07:30:09 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 06:30:09 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu>
Message-ID: <BAY106-F120C708A32F5077BA4DE68B4190@phx.gbl>

Here is the code:

use Bio::Perl;
$seq_object = get_sequence('swiss',"ROA1_HUMAN");
print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";
write_sequence(">roa1.fasta",'fasta',$seq_object);

The output looks the same as before:

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\~Scripts>perl test.pl

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:153
STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
STACK: test.pl:7
-----------------------------------------------------------

Thanks.


>From: David Messina <dmessina at wustl.edu>
>To: L Xu <ryanx07 at hotmail.com>
>CC: BioPerl list <bioperl-l at lists.open-bio.org>
>Subject: Re: [Bioperl-l] basic questions
>Date: Mon, 11 Jun 2007 13:48:23 -0500
>
>Hi,
>
>Please use 'Reply All' so everyone on the list can follow the  discussion.
>
>Try adding the following line after the line that starts with  $seq_object:
>
>	print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";
>
>And then run the program again. What do you get? Could you post a  complete 
>printout of what you're doing?
>
>
>Dave
>
>
>On Jun 11, 2007, at 11:45 AM, L Xu wrote:
>>I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and  
>>activeperl 5.8.8.819 Thank you very much.
>



From pengchy at yahoo.com.cn  Tue Jun 12 10:33:15 2007
From: pengchy at yahoo.com.cn (Pengcheng Yang)
Date: Tue, 12 Jun 2007 22:33:15 +0800 (CST)
Subject: [Bioperl-l] Re: basic questions
In-Reply-To: <BAY106-F120C708A32F5077BA4DE68B4190@phx.gbl>
Message-ID: <936780.8655.qm@web15215.mail.cnb.yahoo.com>


I have the same problem.

I guess that the SwissProt database has some problems!

code:
use Bio::DB::SwissProt;
$sp = new Bio::DB::SwissProt;
$seq = $sp->get_Seq_by_id('KPY1_ECOLI'); 
print ref($seq),"\t",$seq->display_id,"\n"

the message:

------------- EXCEPTION  -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK Bio::SeqIO::swiss::next_seq C:/perl/site/lib/Bio\SeqIO\swiss.pm:180
STACK Bio::DB::WebDBSeqI::get_Seq_by_id C:/perl/site/lib/Bio/DB/WebDBSeqI.pm:154

STACK toplevel t.pl:7

--------------------------------------


--- L Xu <ryanx07 at hotmail.com> wrote:

> Here is the code:
> 
> use Bio::Perl;
> $seq_object = get_sequence('swiss',"ROA1_HUMAN");
> print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";
> write_sequence(">roa1.fasta",'fasta',$seq_object);
> 
> The output looks like the same as the previous version:
> 
> Microsoft Windows XP [Version 5.1.2600]
> (C) Copyright 1985-2001 Microsoft Corp.
> 
> C:\~Scripts>perl test.pl
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: swissprot stream with no ID. Not swissprot in my book
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
> STACK: Bio::SeqIO::swiss::next_seq
> C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_id 
> C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:15
> 3
> STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
> STACK: test.pl:7
> -----------------------------------------------------------
> 
> Thanks.
> 
> 
> 
> 
> 
> >From: David Messina <dmessina at wustl.edu>
> >To: L Xu <ryanx07 at hotmail.com>
> >CC: BioPerl list <bioperl-l at lists.open-bio.org>
> >Subject: Re: [Bioperl-l] basic questions
> >Date: Mon, 11 Jun 2007 13:48:23 -0500
> >
> >Hi,
> >
> >Please use 'Reply All' so everyone on the list can follow the 
> discussion.
> >
> >Try adding the following line after the line that starts with 
> $seq_object:
> >
> >	print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";
> >
> >And then run the program again. What do you get? Could you post a 
> complete 
> >printout of what you're doing?
> >
> >
> >Dave
> >
> >
> >On Jun 11, 2007, at 11:45 AM, L Xu wrote:
> >>I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and  
> >>activeperl 5.8.8.819 Thank you very much.
> >
> 


Best wishes!

Sincerely, Pengcheng




From drummike at gmail.com  Tue Jun 12 11:49:36 2007
From: drummike at gmail.com (Mike Williams)
Date: Tue, 12 Jun 2007 11:49:36 -0400
Subject: [Bioperl-l] Re: basic questions
In-Reply-To: <936780.8655.qm@web15215.mail.cnb.yahoo.com>
References: <BAY106-F120C708A32F5077BA4DE68B4190@phx.gbl>
	<936780.8655.qm@web15215.mail.cnb.yahoo.com>
Message-ID: <bc95ab8d0706120849qc60ee50qf743f4a7342580e1@mail.gmail.com>

On 6/12/07, Pengcheng Yang <pengchy at yahoo.com.cn> wrote:
> I got the same questions.
> I guess that the swissprote database has some problems!
> code:
> use Bio::DB::SwissProt;
> $sp = new Bio::DB::SwissProt;
> $seq = $sp->get_Seq_by_id('KPY1_ECOLI');
> print ref($seq),"\t",$seq->display_id,"\n"
> ------------- EXCEPTION  -------------
> MSG: swissprot stream with no ID. Not swissprot in my book
> STACK toplevel t.pl:7

This is a different problem.  The id was not valid.  If you change
KPY1 to KPYK1 it works fine.

$seq = $sp->get_Seq_by_id('KPYK1_ECOLI');
print ref($seq),"\t",$seq->display_id,"\n"

[mike at Wheatley]$ ./bio_quest2.pl
Bio::Seq::RichSeq       KPYK1_ECOLI

If you got this example from the BioPerl site, would you please post
the URL?  It seems to me this same problem has come up before, but I
could not find it in the archives or on the web site.

Mike


From ryanx07 at hotmail.com  Tue Jun 12 11:42:28 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 10:42:28 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F277321F382D18F01FE6C77B4190@phx.gbl>

I tested another piece of code from the tutorial (my second test on the same 
machine) and got an error again. I don't know what happened; please help.
Thanks so much.

===========================================================Code:
use Bio::Restriction::EnzymeCollection;
my $all_collection = Bio::Restriction::EnzymeCollection;
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection){
   print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
   # prints name, recognition site, overhang
}
=========================================== Results:

C:\~Scripts>perl t9.pl
Can't use string ("Bio::Restriction::EnzymeCollecti") as a HASH ref while "strict refs" in use at C:/Perl/site/lib/Bio/Restriction/EnzymeCollection.pm line 236.


= = = Original message = = =

On Jun 11, 2007, at 11:45 AM, L Xu wrote:

   I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and 
activeperl 5.8.8.819 Thank you very much.



From limericksean at gmail.com  Tue Jun 12 12:04:40 2007
From: limericksean at gmail.com (Sean O'Keeffe)
Date: Tue, 12 Jun 2007 18:04:40 +0200
Subject: [Bioperl-l] gff2xml
Message-ID: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>

Hi all,
I posted this on the gbrowse list earlier. I'm looking to convert gff
data files into xml. Does anyone know of a module written to do this
already?

respect,
sean.


From johnsonm at gmail.com  Tue Jun 12 12:10:45 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 11:10:45 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
Message-ID: <ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>

On 6/12/07, Torsten Seemann <torsten.seemann at infotech.monash.edu.au> wrote:
> Can you use the ->spliced_seq() method to do this?
>
> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11
>
> --
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> --Tel +61 3 9905 9010

    Actually, I'd forgotten about spliced_seq().  That seems like it
will Do The Right Thing.  It's just up to the invoker to call
spliced_seq() instead of seq() as appropriate.
    So, is there any other code that will break if I modify
Bio::SeqFeature::Gene::Exon::location to not throw an exception when
encountering Bio::Location::SplitLocationI?  I'm wondering if it's
just a paranoid check or if it's there to guard against something.  If
the latter, I need to know what code to fix.  I'll dig and look, but
if anybody knows or has an idea, save me some time.  I suppose I can
just change it and see what tests start failing. 8)


From dmessina at wustl.edu  Tue Jun 12 12:11:36 2007
From: dmessina at wustl.edu (David Messina)
Date: Tue, 12 Jun 2007 11:11:36 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F277321F382D18F01FE6C77B4190@phx.gbl>
References: <BAY106-F277321F382D18F01FE6C77B4190@phx.gbl>
Message-ID: <30B8F841-E694-4577-8C15-8703E846CDFE@wustl.edu>

Hmm, it almost looks like you're having an issue with line breaks.

The 'swissprot stream with no ID' error made me think that perhaps  
Perl wasn't seeing the second argument to get_sequence. And then your  
new program has the error 'Can't use string  
("Bio::Restriction::EnzymeCollecti")' where the end of the word is  
cut off.

I don't know how ActivePerl handles Windows vs UNIX line breaks.  Are  
there any example scripts that come with ActivePerl? If there are,  
and they run correctly, perhaps you could look to see how the line  
breaks are done and make sure that your program does it the same way.
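
One way to check which line breaks a script file uses is to count them  
directly. This is a minimal sketch, not a tested diagnosis of the problem  
above: count_line_endings is a hypothetical helper, and the file names in  
the usage comment are placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper: count Windows (CRLF) and Unix (bare LF) line
# endings in a string.
sub count_line_endings {
    my ($text) = @_;
    my $crlf = () = $text =~ /\r\n/g;        # "\r\n" pairs
    my $lf   = () = $text =~ /(?<!\r)\n/g;   # "\n" not preceded by "\r"
    return ($crlf, $lf);
}

# Usage: perl check_eol.pl test.pl   (both file names are placeholders)
if (defined(my $path = shift @ARGV)) {
    # Read in raw mode so that Perl on Windows does not translate CRLF.
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    my $content = do { local $/; <$fh> };
    close($fh);
    $content = '' unless defined $content;
    my ($crlf, $lf) = count_line_endings($content);
    print "$path: $crlf CRLF, $lf bare LF line endings\n";
}
```

A file that reports a mix of both kinds was probably edited on more than  
one platform.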

Other than that, I'm not seeing an obvious answer to your problem --  
anyone else have a suggestion?

Perhaps the easiest thing for you to do would be to reinstall BioPerl  
and make sure that you run the full test suite and that all of the  
tests pass. My guess is that something in your current setup is not  
quite right.

Dave


From cjfields at uiuc.edu  Tue Jun 12 12:42:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 11:42:29 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
Message-ID: <E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>


On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:

> On 6/12/07, Torsten Seemann  
> <torsten.seemann at infotech.monash.edu.au> wrote:
>> Can you use the ->spliced_seq() method to do this?
>>
>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ 
>> SeqFeatureI.html#POD11
>>
>> --
>> --Torsten Seemann
>> --Victorian Bioinformatics Consortium, Monash University
>> --Tel +61 3 9905 9010
>
>     Actually, I'd forgotten about spliced_seq().  That seems like it
> will Do The Right Thing.  It's just up to the invoker to call
> spliced_seq() instead of seq() as appropriate.
>     So, is there any other code that will break if I modify
> Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
> just a paranoid check or if it's there to guard against something.  If
> the latter, I need to know what code to fix.  I'll dig and look, but
> if anybody knows or has an idea, save me some time.  I suppose I can
> just change it and see what tests start failing. 8)

I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to  
describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs  
state that the Exon class is used to specifically describe exons, as  
the name implies.  Exons are primarily eukaryotic in origin, so you  
shouldn't encounter wraparounds, and should not have split locations  
by definition (which likely explains the exception).

Wouldn't a SeqFeature::Generic work just as well using a split location?

chris


From johnsonm at gmail.com  Tue Jun 12 12:59:54 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 11:59:54 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
Message-ID: <ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>

    That's a good point.  Both Bio::Tools::Glimmer and
Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
prokaryotic sequence (multiple exons for eukaryotic).  There are
eukaryotic and prokaryotic versions of both predictor families.  Maybe
the most elegant solution would be to simply modify both modules to
only emit Bio::SeqFeature::Generic features when operating on
prokaryotic mode output?  Fix the data model and the problem goes
away.  8)

On 6/12/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:
>
> > On 6/12/07, Torsten Seemann
> > <torsten.seemann at infotech.monash.edu.au> wrote:
> >> Can you use the ->spliced_seq() method to do this?
> >>
> >> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/
> >> SeqFeatureI.html#POD11
> >>
> >> --
> >> --Torsten Seemann
> >> --Victorian Bioinformatics Consortium, Monash University
> >> --Tel +61 3 9905 9010
> >
> >     Actually, I'd forgotten about spliced_seq().  That seems like it
> > will Do The Right Thing.  It's just up to the invoker to call
> > spliced_seq() instead of seq() as appropriate.
> >     So, is there any other code that will break if I modify
> > Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> > encountering Bio::Location::SplitLocationI?  I'm wondering if it's
> > just a paranoid check or if it's there to guard against something.  If
> > the latter, I need to know what code to fix.  I'll dig and look, but
> > if anybody knows or has an idea, save me some time.  I suppose I can
> > just change it and see what tests start failing. 8)
>
> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to
> describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs
> state that the Exon class is used to specifically describe exons, as
> the name implies.  Exons are primarily eukaryotic in origin, so you
> shouldn't encounter wraparounds, and should not have split locations
> by definition (which likely explains the exception).
>
> Wouldn't a SeqFeature::Generic work just as well using a split location?
>
> chris
>


From ryanx07 at hotmail.com  Tue Jun 12 13:17:18 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 12:17:18 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F19A3F4E0FD58F28A6CD765B4190@phx.gbl>

I reinstalled ActivePerl and BioPerl; ActivePerl is now 5.8.8 build 820.
However, both scripts still generate the same errors on my computer. I tested 
the code on another WinXP computer with the same versions of ActivePerl and 
BioPerl: the SwissProt script worked there, but the restriction enzyme 
script generated the same error.

= = = Original message = = =

Hmm, it almost looks like you're having an issue with line breaks.

The 'swissprot stream with no ID' error made me think that perhaps Perl 
wasn't seeing the second argument to get_sequence. And then your new 
program has the error 'Can't use string 
("Bio::Restriction::EnzymeCollecti")' where the end of the word is cut off.

I don't know how ActivePerl handles Windows vs UNIX line breaks. Are there 
any example scripts that come with ActivePerl? If there are, and they run 
correctly, perhaps you could look to see how the line breaks are done and 
make sure that your program does it the same way.

Other than that, I'm not seeing an obvious answer to your problem -- anyone 
else have a suggestion?

Perhaps the easiest thing for you to do would be to reinstall BioPerl and 
make sure that you run the full test suite and that all of the tests pass. 
My guess is that something in your current setup is not quite right.

Dave



From cjfields at uiuc.edu  Tue Jun 12 13:51:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 12:51:47 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F19A3F4E0FD58F28A6CD765B4190@phx.gbl>
References: <BAY106-F19A3F4E0FD58F28A6CD765B4190@phx.gbl>
Message-ID: <D01CF97A-FE62-4E40-A3DD-FAFD97D8BA45@uiuc.edu>

This is an instance where 'use strict' would have shown the problem  
right away.  You left off your constructor call:

my $all_collection = Bio::Restriction::EnzymeCollection;

should be

my $all_collection = Bio::Restriction::EnzymeCollection->new;
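
To show why the missing constructor call produces that particular  
message, here is a small self-contained sketch. It does not use BioPerl:  
Toy::Restriction::EnzymeCollection is a made-up stand-in class with a  
made-up enzyme list. Without ->new the variable holds only the class name  
as a string, and the first method that treats it as a hash reference dies.  
Perl prints just the first 32 characters of the string in that message,  
which is why the class name looks cut off.

```perl
use strict;
use warnings;

# Made-up stand-in class for illustration; this is not BioPerl code.
package Toy::Restriction::EnzymeCollection;

sub new {
    my ($class) = @_;
    return bless { enzymes => ['EcoRI', 'BamHI'] }, $class;
}

sub each_enzyme {
    my ($self) = @_;
    return @{ $self->{enzymes} };   # dereferences $self as a hash
}

package main;

# Wrong: without ->new this is only the class name, a plain string.
my $not_an_object = 'Toy::Restriction::EnzymeCollection';
my $error = '';
eval { $not_an_object->each_enzyme; 1 } or $error = $@;
print "error: $error";

# Right: call the constructor first, then use the returned object.
my $collection = Toy::Restriction::EnzymeCollection->new;
my @names = $collection->each_enzyme;
print join(',', @names), "\n";   # prints EcoRI,BamHI
```

The error text captured in $error starts the same way as the one in the  
original report, with the class name truncated after "EnzymeCollecti".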

chris

On Jun 12, 2007, at 12:17 PM, L Xu wrote:

> I reinstalled activePerl and BioPerl, now the activePerl is 5.8.8  
> build 820.
> However, both scripts generated the same error with my computer. I  
> tested
> the code in another WinXP computer with the same versions of  
> activePerl and
> BioPerl, the one for the swissprot did work but the restriction enzyme
> generated the same error.
>
> = = = Original message = = =
>
> Hmm, it almost looks like you're having an issue with line breaks.
>
> The 'swissprot stream with no ID' error made me think that perhaps  
> Perl
> wasn't seeing the second argument to get_sequence. And then your new
> program has the error 'Can't use string
> ("Bio::Restriction::EnzymeCollecti")' where the end of the word is  
> cut off.
>
> I don't know how ActivePerl handles Windows vs UNIX line breaks.  
> Are there
> any example scripts that come with ActivePerl? If there are, and  
> they run
> correctly, perhaps you could look to see how the line breaks are  
> done and
> make sure that your program does it the same way.
>
> Other than that, I'm not seeing an obvious answer to your problem  
> -- anyone
> else have a suggestion?
>
> Perhaps the easiest thing for you to do would be to reinstall  
> BioPerl and
> make sure that you run the full test suite and that all of the  
> tests pass.
> My guess is that something in your current setup is not quite right.
>
> Dave
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From ryanx07 at hotmail.com  Tue Jun 12 14:11:15 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 13:11:15 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F317762269B8D57367D89F3B4190@phx.gbl>

Thank you very much. That got the script a bit further, but now I get the 
following error:

C:\~Scripts>perl t9.pl
Can't locate object method "name" via package "Bio::Restriction::EnzymeCollection" at t9.pl line 5, <DATA> line 532.

I checked the documentation, and there is no "name" method for the package. 
Thanks.

= = = Original message = = =

This is an instance where 'use strict' would have shown the problem right 
away. You left off your constructor call:

my $all_collection = Bio::Restriction::EnzymeCollection;

should be

my $all_collection = Bio::Restriction::EnzymeCollection->new;

chris

On Jun 12, 2007, at 12:17 PM, L Xu wrote:


   I reinstalled activePerl and BioPerl, now the activePerl is 5.8.8? build 
820.
However, both scripts generated the same error with my computer. I? tested
the code in another WinXP computer with the same versions of? activePerl and
BioPerl, the one for the swissprot did work but the restriction enzyme
generated the same error.

= = = Original message = = =

Hmm, it almost looks like you're having an issue with line breaks.

The 'swissprot stream with no ID' error made me think that perhaps Perl
wasn't seeing the second argument to get_sequence. And then your new
program has the error 'Can't use string
("Bio::Restriction::EnzymeCollecti")' where the end of the word is cut off.

I don't know how ActivePerl handles Windows vs UNIX line breaks. Are there
any example scripts that come with ActivePerl? If there are, and they run
correctly, perhaps you could look to see how the line breaks are done and
make sure that your program does it the same way.

Other than that, I'm not seeing an obvious answer to your problem -- anyone
else have a suggestion?

Perhaps the easiest thing for you to do would be to reinstall BioPerl and
make sure that you run the full test suite and that all of the tests pass.
My guess is that something in your current setup is not quite right.

Dave

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign



From cjfields at uiuc.edu  Tue Jun 12 14:35:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 13:35:15 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F317762269B8D57367D89F3B4190@phx.gbl>
References: <BAY106-F317762269B8D57367D89F3B4190@phx.gbl>
Message-ID: <287E93E2-1902-4796-971E-B1DCA805D032@uiuc.edu>

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme  
objects, each with its own name().  Using grouped methods like  
'$collection->cutters(6)' will retrieve a new EnzymeCollection  
containing all six-cutters from the original collection.  You should  
use one of the EnzymeCollection accessor methods to retrieve the  
enzyme that you wanted first or iterate through them all.  This works  
for me:

use Bio::Restriction::EnzymeCollection;
my $all_collection = Bio::Restriction::EnzymeCollection->new();
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection->each_enzyme){
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
}

chris

On Jun 12, 2007, at 1:11 PM, L Xu wrote:

> Thank you very much, it did make the script advanced a bit but I  
> got the following error:
>
> C:\~Scripts>perl t9.pl
> Can't locate object method "name" via package "Bio::Restriction::EnzymeCollection" at t9.pl line 5, <DATA> line 532.
>
> I checked the documentation , there is no "name" method for the  
> package. Thanks.


From johnsonm at gmail.com  Tue Jun 12 15:07:57 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 14:07:57 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
	<ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
Message-ID: <ebf5eb170706121207p4ad86a6cr9af85e766168cfbe@mail.gmail.com>

I'll wait a day, and if there is no opinion to the contrary, implement
it this way.

On 6/12/07, Mark Johnson <johnsonm at gmail.com> wrote:
>     That's a good point.  Both Bio::Tools::Glimmer and
> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
> prokaryotic sequence (multiple exons for eukaryotic).  There are
> eukaryotic and prokaryotic versions of both predictor families.  Maybe
> the most elegant solution would be to simply modify both modules to
> only emit Bio::SeqFeature::Generic features when operating on
> prokaryotic mode output?  Fix the data model and the problem goes
> away.  8)


From torsten.seemann at infotech.monash.edu.au  Tue Jun 12 20:18:27 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 13 Jun 2007 10:18:27 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
Message-ID: <a79f6a4b0706121718g4b0ca6a4m97f253b2e2b84059@mail.gmail.com>

Sean

> I posted this on the gbrowse list earlier. I'm looking to convert gff
> data files into xml. Does anyone know of a module written to do this
> already?

What DTD do you want the XML to conform to?
eg. ChadoXML, TinySeq XML, TIGR XML ... ?

What program are you trying to get to load the XML?

BioPerl has some Bio::SeqIO::xxxxx modules for some XML formats that
you could use. There is a script "bp_seqconvert.pl -h" which comes
with BioPerl which may be useful.

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010


From hlapp at gmx.net  Tue Jun 12 20:55:57 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 12 Jun 2007 20:55:57 -0400
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
Message-ID: <0915FAB4-E554-4E65-BA3F-1B916F0F95FC@gmx.net>

I think it was just trying to guard against people trying to do  
stupid things.

I'm actually not sure that representing locations on a circular  
genome using split locations really is the best thing. I'm wondering  
whether one shouldn't rather introduce a CircularLocation object  
(though obviously it isn't the location that's circular...).

Just a thought. In the end, if you have a way to make this work that  
you feel comfortable with, then go for it.

	-hilmar

On Jun 12, 2007, at 12:10 PM, Mark Johnson wrote:

> On 6/12/07, Torsten Seemann  
> <torsten.seemann at infotech.monash.edu.au> wrote:
>> Can you use the ->spliced_seq() method to do this?
>>
>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ 
>> SeqFeatureI.html#POD11
>>
>> --
>> --Torsten Seemann
>> --Victorian Bioinformatics Consortium, Monash University
>> --Tel +61 3 9905 9010
>
>     Actually, I'd forgotten about spliced_seq().  That seems like it
> will Do The Right Thing.  It's just up to the invoker to call
> spliced_seq() instead of seq() as appropriate.
>     So, is there any other code that will break if I modify
> Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
> just a paranoid check or if it's there to guard against something.  If
> the latter, I need to know what code to fix.  I'll dig and look, but
> if anybody knows or has an idea, save me some time.  I suppose I can
> just change it and see what tests start failing. 8)
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Tue Jun 12 20:57:06 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 12 Jun 2007 20:57:06 -0400
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
	<ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
Message-ID: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>

I like that. Don't force a model to do what you want if it doesn't  
really apply anyway.

	-hilmar

On Jun 12, 2007, at 12:59 PM, Mark Johnson wrote:

>     That's a good point.  Both Bio::Tools::Glimmer and
> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
> prokaryotic sequence (multiple exons for eukaryotic).  There are
> eukaryotic and prokaryotic versions of both predictor families.  Maybe
> the most elegant solution would be to simply modify both modules to
> only emit Bio::SeqFeature::Generic features when operating on
> prokaryotic mode output?  Fix the data model and the problem goes
> away.  8)
>
> On 6/12/07, Chris Fields <cjfields at uiuc.edu> wrote:
>>
>> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:
>>
>>> On 6/12/07, Torsten Seemann
>>> <torsten.seemann at infotech.monash.edu.au> wrote:
>>>> Can you use the ->spliced_seq() method to do this?
>>>>
>>>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/
>>>> SeqFeatureI.html#POD11
>>>>
>>>> --
>>>> --Torsten Seemann
>>>> --Victorian Bioinformatics Consortium, Monash University
>>>> --Tel +61 3 9905 9010
>>>
>>>     Actually, I'd forgotten about spliced_seq().  That seems like it
>>> will Do The Right Thing.  It's just up to the invoker to call
>>> spliced_seq() instead of seq() as appropriate.
>>>     So, is there any other code that will break if I modify
>>> Bio::SeqFeature::Gene::Exon::location to not throw an exception when
>>> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
>>> just a paranoid check or if it's there to guard against  
>>> something.  If
>>> the latter, I need to know what code to fix.  I'll dig and look, but
>>> if anybody knows or has an idea, save me some time.  I suppose I can
>>> just change it and see what tests start failing. 8)
>>
>> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to
>> describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs
>> state that the Exon class is used to specifically describe exons, as
>> the name implies.  Exons are primarily eukaryotic in origin, so you
>> shouldn't encounter wraparounds, and should not have split locations
>> by definition (which likely explains the exception).
>>
>> Wouldn't a SeqFeature::Generic work just as well using a split  
>> location?
>>
>> chris
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Jun 12 21:20:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 20:20:41 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
	<ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
	<80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>
Message-ID: <951EB9CA-2066-4CD1-BCD5-4E00232CA507@uiuc.edu>

It will be interesting to see if bioperl handles wrap-around split  
locations via spliced_seq() and other methods.  I can't see why it  
wouldn't but one never knows.  Might be something to add to location  
tests at some point...
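
For reference, a plain-Perl sketch of what a wrap-around feature on a
circular genome amounts to when it is stored as a split location: two
sub-ranges whose sequences are joined in order. It does not call BioPerl at
all; the helper name spliced_string, the toy sequence and the coordinates are
assumptions for illustration, and it says nothing about how spliced_seq()
itself behaves.

```perl
use strict;
use warnings;

# Join the sub-ranges of a split location, in the order given.
# Coordinates are 1-based and inclusive, as in GenBank-style locations.
sub spliced_string {
    my ($genome, @ranges) = @_;
    my $out = '';
    for my $range (@ranges) {
        my ($start, $end) = @$range;
        die "bad range $start..$end\n"
            if $start < 1 || $end > length($genome) || $start > $end;
        $out .= substr($genome, $start - 1, $end - $start + 1);
    }
    return $out;
}

# Toy circular genome of 20 bases; the feature runs from base 17,
# through the origin, to base 4, i.e. join(17..20,1..4).
my $genome  = 'AAAACCCCGGGGTTTTACGT';
my $feature = spliced_string($genome, [17, 20], [1, 4]);
print "$feature\n";    # ACGTAAAA
```

A location test along these lines would compare the string returned for a
split location against the hand-joined pieces.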

chris

On Jun 12, 2007, at 7:57 PM, Hilmar Lapp wrote:

> I like that. Don't force a model to do what you want if it doesn't
> really apply anyway.
>
> 	-hilmar
>
> On Jun 12, 2007, at 12:59 PM, Mark Johnson wrote:
>
>>     That's a good point.  Both Bio::Tools::Glimmer and
>> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
>> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
>> prokaryotic sequence (multiple exons for eukaryotic).  There are
>> eukaryotic and prokaryotic versions of both predictor families.   
>> Maybe
>> the most elegant solution would be to simply modify both modules to
>> only emit Bio::SeqFeature::Generic features when operating on
>> prokaryotic mode output?  Fix the data model and the problem goes
>> away.  8)
>>
>> On 6/12/07, Chris Fields <cjfields at uiuc.edu> wrote:
>>>
>>> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:
>>>
>>>> On 6/12/07, Torsten Seemann
>>>> <torsten.seemann at infotech.monash.edu.au> wrote:
>>>>> Can you use the ->spliced_seq() method to do this?
>>>>>
>>>>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/
>>>>> SeqFeatureI.html#POD11
>>>>>
>>>>> --
>>>>> --Torsten Seemann
>>>>> --Victorian Bioinformatics Consortium, Monash University
>>>>> --Tel +61 3 9905 9010
>>>>
>>>>     Actually, I'd forgotten about spliced_seq().  That seems  
>>>> like it
>>>> will Do The Right Thing.  It's just up to the invoker to call
>>>> spliced_seq() instead of seq() as appropriate.
>>>>     So, is there any other code that will break if I modify
>>>> Bio::SeqFeature::Gene::Exon::location to not throw an exception  
>>>> when
>>>> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
>>>> just a paranoid check or if it's there to guard against
>>>> something.  If
>>>> the latter, I need to know what code to fix.  I'll dig and look,  
>>>> but
>>>> if anybody knows or has an idea, save me some time.  I suppose I  
>>>> can
>>>> just change it and see what tests start failing. 8)
>>>
>>> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to
>>> describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs
>>> state that the Exon class is used to specifically describe exons, as
>>> the name implies.  Exons are primarily eukaryotic in origin, so you
>>> shouldn't encounter wraparounds, and should not have split locations
>>> by definition (which likely explains the exception).
>>>
>>> Wouldn't a SeqFeature::Generic work just as well using a split
>>> location?
>>>
>>> chris
>>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From ryanx07 at hotmail.com  Wed Jun 13 08:16:15 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 07:16:15 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>

Thanks so much, Chris, it works now.
All the code I tested was copied from the Bioperl Tutorial. Why does it have 
such problems? Is it a platform issue, or different versions of BioPerl? 
So far I have tested 6 scripts; three work and three don't.

Here is the problem for the 3rd failed script:
=================================
use strict;
use Bio::Tools::Run::RemoteBlast;
my $remote_blast = Bio::Tools::Run::RemoteBlast->new (
         -prog => 'blastn', -data => 'ecoli', -expect => '1e-10' );
my $r = $remote_blast->submit_blast("d1.fa");
my $rc;
while ( my @rids = $remote_blast->each_rid ) {
    for my $rid ( @rids ) {
       $rc = $remote_blast->retrieve_blast($rid);
    }
}
print "$rc\n"; #I just want to print sth here before parsing the result
=========================================================d1.fa
>example
CCCTTCAGGTACCCCGAGGTAACACGAGACACTCGGGATCTGGGAAGGGGACTGGGGCTTCTTTAAAAGCGCTCAGTTTAAAAAGCTTCTATGCCTGAATAGGTGACCGGAGGCCGGCACC
=========================================================result
C:\>perl t13.pl

-------------------- WARNING ---------------------
MSG: <HTML>
<HEAD><TITLE>An Error Occurred</TITLE></HEAD>
<BODY>
<H1>An Error Occurred</H1>
500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
</BODY>
</HTML>

---------------------------------------------------

-------------------- WARNING ---------------------
MSG: <HTML>
<HEAD><TITLE>An Error Occurred</TITLE></HEAD>
<BODY>
<H1>An Error Occurred</H1>
500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
</BODY>
</HTML>

---------------------------------------------------
Terminating on signal SIGINT(2)

C:\>


Please help me to correct the problem, thanks.


= = = Original message = = =

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme objects, 
each with its own name(). Using grouped methods like 
'$collection->cutters(6)' will retrieve a new EnzymeCollection containing 
all six-cutters from the original collection. You should use one of the 
EnzymeCollection accessor methods to retrieve the enzyme that you wanted 
first or iterate through them all. This works for me:

use Bio::Restriction::EnzymeCollection;
my $all_collection = Bio::Restriction::EnzymeCollection->new();
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection->each_enzyme){
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
}

chris

On Jun 12, 2007, at 1:11 PM, L Xu wrote:


   Thank you very much, it did make the script advanced a bit but I got the 
following error:

C:\~Scripts>perl t9.pl
Can't locate object method "name" via package "Bio::Restriction::EnzymeCollection" at t9.pl line 5, <DATA> line 532.

I checked the documentation , there is no "name" method for the package. 
Thanks.



From cjfields at uiuc.edu  Wed Jun 13 10:41:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 09:41:55 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
References: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
Message-ID: <4F7BE556-BD8C-4378-BDE7-1F31364F49DA@uiuc.edu>

Judging by the output it looks like you have no network access or  
can't connect to the server (what remoteblast needs).  Make sure you  
don't need proxy settings.

To preempt the next question, no, I'm not going to explain what a  
proxy is.  The RemoteBlast docs show how to set them, and Google is a  
wonderful tool...

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:

> ...
> -------------------- WARNING ---------------------
> MSG: <HTML>
> <HEAD><TITLE>An Error Occurred</TITLE></HEAD>
> <BODY>
> <H1>An Error Occurred</H1>
> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
> </BODY>
> </HTML>
>
> ---------------------------------------------------
> ...


From ryanx07 at hotmail.com  Wed Jun 13 11:01:07 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 10:01:07 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID: <BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>

I do have an internet connection but do not use a proxy server.
I tested the network connection with the ping command (below). The NCBI website 
does not respond. Is there any special network setting needed for 
connecting to the NCBI website?
Thank you so much.

C:\>ping www.yahoo.com

Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:

Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
Reply from 69.147.114.210: bytes=32 time=360ms TTL=45

Ping statistics for 69.147.114.210:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 312ms, Maximum = 363ms, Average = 338ms

C:\>ping www.ncbi.nlm.nih.gov

Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:

Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 130.14.29.110:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),


= = = Original message = = =

Judging by the output it looks like you have no network access or can't 
connect to the server (what remoteblast needs). Make sure you don't need 
proxy settings.

To preempt the next question, no, I'm not going to explain what a proxy 
is. The RemoteBlast docs show how to set them, and Google is a wonderful 
tool...

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:


   ...
-------------------- WARNING ---------------------
MSG: <HTML>
<HEAD><TITLE>An Error Occurred</TITLE></HEAD>
<BODY>
<H1>An Error Occurred</H1>
500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
</BODY>
</HTML>

---------------------------------------------------
...



From cjfields at uiuc.edu  Wed Jun 13 12:14:22 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 11:14:22 -0500
Subject: [Bioperl-l] method naming
Message-ID: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>

Some quick questions on method naming.  I couldn't find this on the  
mail list previously and just want some opinions.

1) Is there any preference on how to name a method that returns a  
list of class instances vs. data?  I have seen 'each' (each_Location,  
each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.  
simple (hits, hsps).

2) Do we want methods which return objects to have the object name  
in Title Case (each_Location, get_Seq_by_id, etc) or does it really  
matter?

chris


From dmessina at wustl.edu  Wed Jun 13 12:41:53 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 13 Jun 2007 11:41:53 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>

> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data?  I have seen 'each' (each_Location,
> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
> simple (hits, hsps).

I'd prefer 'get_all' because it's more intuitive to me what the  
method is doing. 'Each' is too programmer-y.


> 2) Do we want have methods which return objects have the object name
> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
> matter?

I like Title Case because it reinforces the notion that what you're  
getting back is a specific object with that name (Seq) rather than  
the generic thing that the name represents (AGTCTGTGATAT, the actual  
sequence as a string).


Dave


From hlapp at gmx.net  Wed Jun 13 13:03:59 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 13:03:59 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>

We set a convention a while back on how to name these. It is  
implemented in the bioperl.lisp file (too bad no one is using emacs  
any more these days - it's a great editor), and in fact we started a  
renaming campaign (not sure when that was) on the SeqI and  
SeqFeatureI classes (you'll still see the old names aliased).

However, we never got to finish the clean up.

The convention was to use get_{ClassName}s, and get_all_{ClassName}s  
if there is a difference to the former (mostly because of  
hierarchical data; for example features can be nested, and  
get_all_SeqFeatures returns them all flattened out, while  
get_SeqFeatures returns only the top objects), and for modifying  
add_{ClassName} and remove_{ClassName}s.

The class name was to be in title case to emphasize the fact that it  
is an array of objects you'd be getting back (and what kind of  
objects). If it is strings or any other scalar type, the name would  
be in lower case.
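
As a sketch of the convention, here is a toy feature class (Toy::Feature is
made up for this example and is not Bio::SeqFeatureI): get_SeqFeatures
returns only the direct sub-features, get_all_SeqFeatures returns the
flattened hierarchy, and add_SeqFeature modifies. The lower-case 'name'
accessor returns a plain string.

```perl
use strict;
use warnings;

package Toy::Feature;

sub new {
    my ($class, $name) = @_;
    return bless { name => $name, subfeatures => [] }, $class;
}

# Lower case: returns a plain string, not an object.
sub name { return $_[0]{name} }

# add_{ClassName}: attach one sub-feature.
sub add_SeqFeature {
    my ($self, $feat) = @_;
    push @{ $self->{subfeatures} }, $feat;
    return $self;
}

# get_{ClassName}s: only the top-level objects held by this feature.
sub get_SeqFeatures { return @{ $_[0]{subfeatures} } }

# get_all_{ClassName}s: the whole hierarchy, flattened depth-first.
sub get_all_SeqFeatures {
    my ($self) = @_;
    return map { ($_, $_->get_all_SeqFeatures) } $self->get_SeqFeatures;
}

package main;

my $gene = Toy::Feature->new('gene');
my $mrna = Toy::Feature->new('mRNA');
$mrna->add_SeqFeature(Toy::Feature->new($_)) for qw(exon1 exon2);
$gene->add_SeqFeature($mrna);

my @top = map { $_->name } $gene->get_SeqFeatures;
my @all = map { $_->name } $gene->get_all_SeqFeatures;
print "top: @top\n";    # top: mRNA
print "all: @all\n";    # all: mRNA exon1 exon2
```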

	-hilmar

On Jun 13, 2007, at 12:14 PM, Chris Fields wrote:

> Some quick questions on method naming.  I couldn't find this on the
> mail list previously and just want some opinions.
>
> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data?  I have seen 'each' (each_Location,
> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
> simple (hits, hsps).
>
> 2) Do we want have methods which return objects have the object name
> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
> matter?
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 13 13:19:43 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 12:19:43 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
Message-ID: <B7E2E5CA-3027-4D25-B9EA-998D2BC59DBB@uiuc.edu>

Sounds good.  I also agree with Dave on the use of 'each', as it's a  
bit ambiguous (seems to imply iteration as opposed to returning a  
whole list).

We probably need to post this somewhere on the wiki for future  
reference; maybe in Advanced BioPerl?  I'll add this in shortly.

chris

On Jun 13, 2007, at 12:03 PM, Hilmar Lapp wrote:

> We set a convention a while back on how to name these. It is  
> implemented in the bioperl.lisp file (too bad no one is using emacs  
> any more these days - it's a great editor), and in fact we started  
> a renaming campaign (not sure when that was) on the SeqI and  
> SeqFeatureI classes (you'll still see the old names aliased).
>
> However, we never got to finish the clean up.
>
> The convention was to use get_{ClassName}s, and get_all_{ClassName}s  
> if there is a difference to the former (mostly because of  
> hierarchical data; for example features can be nested, and  
> get_all_SeqFeatures returns them all flattened out, while  
> get_SeqFeatures returns only the top objects), and for modifying  
> add_{ClassName} and remove_{ClassName}s.
>
> The class name was to be in title case to emphasize the fact that  
> it is an array of object you'd be getting back (and what kind of  
> objects). If it is strings or any other scalar type, the name would  
> be in lower case.
>
> 	-hilmar
>
> On Jun 13, 2007, at 12:14 PM, Chris Fields wrote:
>
>> Some quick questions on method naming.  I couldn't find this on the
>> mail list previously and just want some opinions.
>>
>> 1) Is there any preference on how to name a method that returns a
>> list of class instances vs. data?  I have seen 'each' (each_Location,
>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
>> simple (hits, hsps).
>>
>> 2) Do we want have methods which return objects have the object name
>> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
>> matter?
>>
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Wed Jun 13 14:43:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 13:43:41 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <467036FC.8000505@watson.wustl.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
	<467036FC.8000505@watson.wustl.edu>
Message-ID: <286EE81C-0926-4AAE-9110-02948DFADF36@uiuc.edu>


On Jun 13, 2007, at 1:27 PM, Michael Kiwala wrote:

>
> David Messina wrote:
>>> 1) Is there any preference on how to name a method that returns a
>>> list of class instances vs. data?  I have seen  
>>> 'each' (each_Location,
>>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures)  
>>> vs.
>>> simple (hits, hsps).
>>>
>>
>> I'd prefer 'get_all' because it's more intuitive to me what the   
>> method is doing. 'Each' is too programmer-y.
>>
>>
>>
> When I think 'get_all', I think of a method that returns a list of  
> objects at once. When I think of 'each', I think of a method that  
> returns a scalar but can be called multiple times to iterate over a  
> set of objects.

Yep, hence the ambiguity issue (and my confusion).  I think it was so  
you could both iterate and return a list using this:

for my $obj ($seq->each_Class) {...}
my @objs = $seq->each_Class;

I use 'next' and 'get/get_all' as an iterator and get accessor  
(similar to how it's used in Bio::SearchIO):

while (my $obj = $seq->next_Class) {...}
my @objs = $seq->get_Class; # or get_all_Class for flattened lists

which to me is much clearer.
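
To make the distinction concrete, a small sketch with a made-up
Toy::Container class (not a BioPerl class; next_Item, get_Items and rewind
are hypothetical names following the pattern above): next_Item hands back
one object per call and undef when exhausted, while get_Items returns the
whole list at once.

```perl
use strict;
use warnings;

package Toy::Container;

sub new {
    my ($class, @items) = @_;
    return bless { items => [@items], pos => 0 }, $class;
}

# Iterator: one element per call, undef once the list is exhausted.
sub next_Item {
    my ($self) = @_;
    return if $self->{pos} >= @{ $self->{items} };
    return $self->{items}[ $self->{pos}++ ];
}

# Reset the iterator so the container can be walked again.
sub rewind { $_[0]{pos} = 0; return }

# List accessor: everything at once, independent of the iterator state.
sub get_Items { return @{ $_[0]{items} } }

package main;

my $box = Toy::Container->new(qw(a b c));

my @seen;
while (defined(my $item = $box->next_Item)) {
    push @seen, $item;
}
my @all = $box->get_Items;
print "iterated: @seen\n";    # iterated: a b c
print "listed:   @all\n";     # listed:   a b c
```

The iterator keeps state between calls (hence rewind), whereas the list
accessor can be called any number of times with the same result.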

chris


From mkiwala at watson.wustl.edu  Wed Jun 13 14:27:08 2007
From: mkiwala at watson.wustl.edu (Michael Kiwala)
Date: Wed, 13 Jun 2007 13:27:08 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
Message-ID: <467036FC.8000505@watson.wustl.edu>


David Messina wrote:
>> 1) Is there any preference on how to name a method that returns a
>> list of class instances vs. data?  I have seen 'each' (each_Location,
>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
>> simple (hits, hsps).
>>     
>
> I'd prefer 'get_all' because it's more intuitive to me what the  
> method is doing. 'Each' is too programmer-y.
>
>
>   
When I think 'get_all', I think of a method that returns a list of 
objects at once. When I think of 'each', I think of a method that 
returns a scalar but can be called multiple times to iterate over a set 
of objects.


From sac at bioperl.org  Wed Jun 13 17:17:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Wed, 13 Jun 2007 14:17:27 -0700
Subject: [Bioperl-l] method naming
In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
Message-ID: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>

On 6/13/07, Hilmar Lapp <hlapp at gmx.net> wrote:
> We set a convention a while back on how to name these. It is
> implemented in the bioperl.lisp file (too bad no one is using emacs
> any more these days - it's a great editor),

As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we
could improve the visibility of bioperl.lisp. In truth, I had
forgotten about it, though it turns out I was loading an old version
of it. (Btw, using the latest version of bioperl.lisp with xemacs
21.4.17, I don't get a bioperl menu item, though I can access bioperl
functions via M-x. Suggestions?)

I see bioperl.lisp is mentioned twice parenthetically in the advanced
bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here
would help. While we're at it, maybe we could add a bioperl.vi file to
the distribution (if you can do such things with vi/vim).

On 6/13/07, Chris Fields <cjfields at uiuc.edu> wrote:
> We probably need to post this somewhere on the wiki for future
> reference; maybe in Advanced BioPerl?  I'll add this in shortly.

Another idea: Add a method naming check to the set of audits we
perform on CVS committed code. It could check for agreement with our
conventions and warn if nothing was found (may not be a problem
though).
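
A rough sketch of what such a check could look like follows. The two rules it applies (no leading capital letter, and flagging the 'each_' prefix) are assumptions picked purely for illustration; the real conventions would have to come from bioperl.lisp or the wiki. The function name audit_method_names is likewise hypothetical.

```perl
use strict;
use warnings;

# Return one warning string per sub name in $source that breaks
# the (assumed, illustrative) naming rules.
sub audit_method_names {
    my ($source) = @_;
    my @warnings;
    while ($source =~ /^\s*sub\s+(\w+)/mg) {
        my $name = $1;
        push @warnings, "$name: starts with a capital letter"
            if $name =~ /^[A-Z]/;
        push @warnings, "$name: uses the 'each_' prefix"
            if $name =~ /^each_/;
    }
    return @warnings;
}

# Sample input standing in for a module's source text.
my $code = <<'END';
sub get_all_tags { }
sub each_Location { }
sub Species { }
sub next_hit { }
END

print "$_\n" for audit_method_names($code);
```

A real audit would read each committed file instead of a here-doc, and would probably need a proper parser rather than a regex to cope with POD and strings containing "sub".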

Steve


From arareko at campus.iztacala.unam.mx  Wed Jun 13 18:03:34 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 13 Jun 2007 17:03:34 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
Message-ID: <467069B6.7080003@campus.iztacala.unam.mx>

By the time of the 1.5.2 release, I jumped onto the idea of creating a 
BioPerl template for Komodo. Chris F handed me one he had already made 
but in the end I didn't have enough spare time to get into it. If someone 
wants to give it a try please let ChrisF/me know.

Regards,
Mauricio.

Steve Chervitz wrote:
> On 6/13/07, Hilmar Lapp <hlapp at gmx.net> wrote:
>> We set a convention a while back on how to name these. It is
>> implemented in the bioperl.lisp file (too bad no one is using emacs
>> any more these days - it's a great editor),
> 
> As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we
> could improve the visibility of bioperl.lisp. In truth, I had
> forgotten about it, though it turns out I was loading an old version
> of it. (Btw, using the latest version of bioperl.lisp with xemacs
> 21.4.17, I don't get a bioperl menu item, though I can access bioperl
> functions via M-x. Suggestions?)
> 
> I see bioperl.lisp is mentioned twice parenthetically in the advanced
> bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here
> would help. While we're at it, maybe we could add a bioperl.vi file to
> the distribution (if you can do such things with vi/vim).
> 
> On 6/13/07, Chris Fields <cjfields at uiuc.edu> wrote:
>> We probably need to post this somewhere on the wiki for future
>> reference; maybe in Advanced BioPerl?  I'll add this in shortly.
> 
> Another idea: Add a method naming check to the set of audits we
> perform on CVS committed code. It could check for agreement with our
> conventions and warn if nothing was found (may not be a problem
> though).
> 
> Steve
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From hlapp at gmx.net  Wed Jun 13 18:41:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 18:41:45 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
Message-ID: <FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>


On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:

> using the latest version of bioperl.lisp with xemacs 21.4.17, I  
> don't get a bioperl menu item

I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item  
is showing up just beautifully. (BTW it also has very nice icons for  
various functions - though I always feel guilty for using keystrokes  
instead.)

Is GNU Emacs finally winning this? ;)

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jason at bioperl.org  Wed Jun 13 18:58:51 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 13 Jun 2007 15:58:51 -0700
Subject: [Bioperl-l] method naming
In-Reply-To: <FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
Message-ID: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>

Post your dueling screenshots to the wiki!

I had started a couple of IDE pages on the wiki a while ago:
  http://bioperl.org/wiki/Emacs
  http://bioperl.org/wiki/Emacs_template
  http://bioperl.org/wiki/Vi

If anyone is feeling excited enough to write a few more IDE pages and  
link them into a common article that would be great.

-jason
On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote:

>
> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:
>
>> using the latest version of bioperl.lisp with xemacs 21.4.17, I
>> don't get a bioperl menu item
>
> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item
> is showing up just beautifully. (BTW it also has very nice icons for
> various functions - though I always feel guilty for using keystrokes
> instead.)
>
> Is GNU Emacs finally winning this? ;)
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From cjfields at uiuc.edu  Wed Jun 13 19:08:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 18:08:17 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
	<4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
Message-ID: <E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>

Would probably be worth writing one up for Komodo since Mauricio,  
Sendu, and I use it.

I updated the Advanced BioPerl page with Hilmar's methods suggestions/ 
rules (as well as a few I found dating back a number of years on the  
mail list).  It might be worth a glance in case there are any changes  
needed:

http://www.bioperl.org/wiki/Advanced_BioPerl

chris

On Jun 13, 2007, at 5:58 PM, Jason Stajich wrote:

> Post your dueling screenshots to the wiki!
>
> I had started a couple of IDE pages on the wiki a while ago:
>  http://bioperl.org/wiki/Emacs
>  http://bioperl.org/wiki/Emacs_template
>  http://bioperl.org/wiki/Vi
>
> If anyone is feeling excited enough to write a few more IDE pages  
> and link them into a common article that would be great.
>
> -jason
> On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote:
>
>>
>> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:
>>
>>> using the latest version of bioperl.lisp with xemacs 21.4.17, I
>>> don't get a bioperl menu item
>>
>> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item
>> is showing up just beautifully. (BTW it also has very nice icons for
>> various functions - though I always feel guilty for using keystrokes
>> instead.)
>>
>> Is GNU Emacs finally winning this? ;)
>>
>> 	-hilmar
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Wed Jun 13 19:28:17 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 19:28:17 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
	<4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
	<E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>
Message-ID: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net>

Thanks Chris for doing this - looks great. The only comment that I  
have is that method names should never start with a capital letter.  
If the getter/setter is for a single object (as opposed to a list),  
the name should probably be similar (if not identical) to the class  
being expected and returned, but lower-case.

E.g., $feature->location(), $seq->species() etc
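
As a minimal sketch of that rule: the accessor below holds a single object and is named after that object's class, in lower case. My::Feature and My::Location are hypothetical stand-ins, not the real Bio::SeqFeature or Bio::Location classes.

```perl
use strict;
use warnings;

package My::Location;    # hypothetical stand-in class

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

package My::Feature;     # hypothetical stand-in class

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Combined getter/setter for a single My::Location object:
# lower-case name, matching the class it expects and returns.
sub location {
    my ($self, $value) = @_;
    $self->{location} = $value if defined $value;
    return $self->{location};
}

package main;

my $feature = My::Feature->new;
$feature->location(My::Location->new(start => 1, end => 10));
print ref($feature->location), "\n";
```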

	-hilmar

On Jun 13, 2007, at 7:08 PM, Chris Fields wrote:

> Would probably be worth writing one up for Komodo since Mauricio,  
> Sendu, and I use it.
>
> I updated the Advanced BioPerl page with Hilmar's methods  
> suggestions/rules (as well as a few I found dating back a number of  
> years on the mail list).  It might be worth a glance in case there  
> are any changes needed:
>
> http://www.bioperl.org/wiki/Advanced_BioPerl
>
> chris
>
> On Jun 13, 2007, at 5:58 PM, Jason Stajich wrote:
>
>> Post your dueling screenshots to the wiki!
>>
>> I had started a couple of IDE pages on the wiki a while ago:
>>  http://bioperl.org/wiki/Emacs
>>  http://bioperl.org/wiki/Emacs_template
>>  http://bioperl.org/wiki/Vi
>>
>> If anyone is feeling excited enough to write a few more IDE pages  
>> and link them into a common article that would be great.
>>
>> -jason
>> On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote:
>>
>>>
>>> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:
>>>
>>>> using the latest version of bioperl.lisp with xemacs 21.4.17, I
>>>> don't get a bioperl menu item
>>>
>>> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu  
>>> item
>>> is showing up just beautifully. (BTW it also has very nice icons  
>>> for
>>> various functions - though I always feel guilty for using keystrokes
>>> instead.)
>>>
>>> Is GNU Emacs finally winning this? ;)
>>>
>>> 	-hilmar
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>
>> --
>> Jason Stajich
>> jason at bioperl.org
>> http://jason.open-bio.org/
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 13 19:44:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 18:44:08 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
	<4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
	<E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>
	<06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net>
Message-ID: <91AF2018-EC27-49FD-A4D1-C31C0E73DEFB@uiuc.edu>

Agreed.  We can definitely add that in.

As we edge towards another release we try another round of cleaning  
up.  I wouldn't mind pushing out another 1.5 point release before  
summer's up if possible; most of the tough work was done for v.1.5.2  
by Sendu.

chris

On Jun 13, 2007, at 6:28 PM, Hilmar Lapp wrote:

> Thanks Chris for doing this - looks great. The only comment that I
> have is that method names should never start with a capital letter.
> If the getter/setter is for a single object (as opposed to a list),
> the name should probably be similar (if not identical) to the class
> being expected and returned, but lower-case.
>
> E.g., $feature->location(), $seq->species() etc
>
> 	-hilmar
>
> On Jun 13, 2007, at 7:08 PM, Chris Fields wrote:
>
>> Would probably be worth writing one up for Komodo since Mauricio,
>> Sendu, and I use it.
>>
>> I updated the Advanced BioPerl page with Hilmar's methods
>> suggestions/rules (as well as a few I found dating back a number of
>> years on the mail list).  It might be worth a glance in case there
>> are any changes needed:
>>
>> http://www.bioperl.org/wiki/Advanced_BioPerl
>>
>> chris
...


From johncumbers at gmail.com  Wed Jun 13 20:20:42 2007
From: johncumbers at gmail.com (John Cumbers)
Date: Wed, 13 Jun 2007 20:20:42 -0400
Subject: [Bioperl-l] How can I pull out all instances of a motif from a
	genome sequence and output them as a BED file?
Message-ID: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>

Hello,

I have a simple problem, I'm trying to search a genome sequence for a motif,
I then want to output a BED file to display all the locations of this motif
on the UCSC Genome Browser.  I could not find a script to do this, so I
started to write my own.   I'm new to perl and my code below was my attempt
to read the sequence string and output the index bp of the start of each
motif.  With this I could build the BED file myself, which requires start
and finish base pairs.

For the first motif I can output the start index, but when I try and read
the next one off the sequence it does not work.  Instead I just get an
output of a list of 1's.  I realise that this is more a request for some
simple perl help, but any help much appreciated.

Best wishes,
John


# Turn my FASTA file into a seq object.
$seq_object = read_sequence("Drosophila.Chr3.test.AE014296.fasta");
$sequence_as_a_string = $seq_object->seq();  # turn it into a string
# search $sequence_as_a_string string for motif AAA as example
# if found, return the index that it is found at

while ($sequence_as_a_string =~ m/AAA/g) {
  print "Found '$&'.  Next attempt at character " .
    pos($sequence_as_a_string)+1 . "\n";
}


-- 
John Cumbers,  Graduate Student
Biology and Medicine
Brown University, Box G-W
Providence, Rhode Island, 02912, USA
Tel USA: +1 401 523 8190,  Fax: +1 401 863-2166
UK to USA: 0207 617 7824


From cjfields at uiuc.edu  Wed Jun 13 21:58:37 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 20:58:37 -0500
Subject: [Bioperl-l] How can I pull out all instances of a motif from a
	genome sequence and output them as a BED file?
In-Reply-To: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>
References: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>
Message-ID: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>

This is answered in the FAQ (sorry if the URL wraps, but we don't  
like tinyurls):

http://www.bioperl.org/wiki/FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_.22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F

chris
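
On the "list of 1's" in the quoted code: in Perl, "." and "+" have the same precedence and associate left to right, so the print expression is evaluated as ("Found ... " . pos(...)) + 1, which turns the whole string into a number (0) before adding 1. Putting pos(...) + 1 in parentheses avoids that. Below is a minimal sketch of collecting match coordinates in BED style (0-based start, end exclusive) using Perl's built-in @- and @+ arrays. The function name motif_to_bed and the chromosome label 'chr3L' are assumptions for illustration only, and note that m//g reports non-overlapping matches.

```perl
use strict;
use warnings;

# Return one tab-separated "chrom start end" line per non-overlapping
# match of $motif in $sequence. BED uses a 0-based start and an
# exclusive end, which is exactly what $-[0] and $+[0] provide.
sub motif_to_bed {
    my ($chrom, $sequence, $motif) = @_;
    my @lines;
    while ($sequence =~ /$motif/g) {
        push @lines, join("\t", $chrom, $-[0], $+[0]);
    }
    return @lines;
}

# Toy sequence; 'chr3L' is only a placeholder chromosome name.
print "$_\n" for motif_to_bed('chr3L', 'CCAAACCAAAAAAG', 'AAA');
```

For a real genome the sequence string would come from $seq_object->seq() as in the original post, and the chromosome name has to match what the UCSC browser expects for that assembly.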

On Jun 13, 2007, at 7:20 PM, John Cumbers wrote:

> Hello,
>
> I have a simple problem, I'm trying to search a genome sequence for  
> a motif,
> I then want to output a BED file to display all the locations of  
> this motif
> on the UCSC Genome Browser.  I could not find a script to do this,  
> so I
> started to write my own.   I'm new to perl and my code below was my  
> attempt
> to read the sequence string and output the index bp of the start of  
> each
> motif.  With this I could build the BED file myself, which requires  
> start
> and finish base pairs.
>
> For the first motif I can output the start index, but when I try  
> and read
> the next one off the sequence it does not work.  Instead I just get an
> output of a list of 1's.  I realise that this is more a request for  
> some
> simple perl help, but any help much appreciated.
>
> Best wishes,
> John
>
>
> $seq_object = read_sequence 
> ("Drosophila.Chr3.test.AE014296.fasta");  #turn
> my FASTA file into a seq object.
> $sequence_as_a_string = $seq_object->seq();  #turn it into a string
> # search $sequence_as_a_string  string for motif AAA as example
> # if found, return the index that it is found at
>
> while ($sequence_as_a_string =~ m/AAA/g) {
>   print "Found '$&'.  Next attempt at character " .
> pos($sequence_as_a_string)+1 . "\n";
> }
>
>
>
> -- 
> John Cumbers,  Graduate Student
> Biology and Medicine
> Brown University, Box G-W
> Providence, Rhode Island, 02912, USA
> Tel USA: +1 401 523 8190,  Fax: +1 401 863-2166
> UK to USA: 0207 617 7824

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Thu Jun 14 00:08:04 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 13 Jun 2007 21:08:04 -0700
Subject: [Bioperl-l] wiki bulk update
Message-ID: <992B2C7A-E944-4C69-BDE0-B0B0F6D1274D@bioperl.org>

I did a bulk update of the Module pages for new modules that had
been created since we last set up these pages.
I outlined a little bit of what it requires behind the scenes:

http://bioperl.org/wiki/BioPerl:Module_pages

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From bix at sendu.me.uk  Thu Jun 14 05:35:00 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 10:35:00 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
Message-ID: <46710BC4.3060302@sendu.me.uk>

It is preferable to have ->new syntax over new Object syntax, as 
outlined here: 
http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules

I propose making this syntax change in all Bioperl POD documentation, so 
that the bad syntax is no longer suggested/encouraged. Any objections? 
If not, I'll go ahead and commit the changes.

(affects 907 modules in live)
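
For anyone unsure what the two forms look like side by side, here is a minimal sketch. My::Seq is a hypothetical stand-in for a Bio::* class. The "new Object" form is Perl's indirect object syntax, where the parser has to guess whether "new" is a function or a method; the arrow form leaves nothing to guess.

```perl
use strict;
use warnings;

package My::Seq;    # hypothetical stand-in for a Bio::* class

sub new {
    my ($class, %args) = @_;
    return bless { id => $args{-id} }, $class;
}

sub id { return $_[0]{id} }

package main;

# Discouraged (indirect object syntax):
#   my $seq = new My::Seq(-id => 'example');

# Preferred (explicit method call on the class):
my $seq = My::Seq->new(-id => 'example');
print $seq->id, "\n";
```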


Cheers,
Sendu.


From bix at sendu.me.uk  Thu Jun 14 06:01:02 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 11:01:02 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46710BC4.3060302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
Message-ID: <467111DE.6060800@sendu.me.uk>

Sendu Bala wrote:
> It is preferable to have ->new syntax over new Object syntax, as 
> outlined here: 
> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules 
> 
> 
> I propose making this syntax change in all Bioperl POD documentation,

Actually, I propose making the change to code as well.


From hlapp at gmx.net  Thu Jun 14 08:47:47 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 14 Jun 2007 08:47:47 -0400
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <467111DE.6060800@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk>
Message-ID: <0D7CD74F-DCB3-44F8-9AC7-144B1BD58946@gmx.net>

Sounds fine to me. People do go by working examples, and I've seen  
inconsistent examples leading to confusion on the end of newbies.

	-hilmar

On Jun 14, 2007, at 6:01 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- 
>> oriented_programming_and_modules
>>
>>
>> I propose making this syntax change in all Bioperl POD documentation,
>
> Actually, I propose making the change to code as well.
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Jun 14 08:55:18 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 07:55:18 -0500
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <467111DE.6060800@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk>
Message-ID: <EC0DB8AB-F7C8-423B-9566-34B3FD24B3EC@uiuc.edu>

Sounds fine by me.  I may actually start tackling some of the feature/ 
annotation overloading stuff myself to see what happens (I'll drop a  
notice when that occurs).

chris

On Jun 14, 2007, at 5:01 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- 
>> oriented_programming_and_modules
>>
>>
>> I propose making this syntax change in all Bioperl POD documentation,
>
> Actually, I propose making the change to code as well.


From tanzeem.mb at gmail.com  Thu Jun 14 02:27:19 2007
From: tanzeem.mb at gmail.com (tanzeem)
Date: Wed, 13 Jun 2007 23:27:19 -0700 (PDT)
Subject: [Bioperl-l] Problem working with remoteblast submit method in
	webbrowser.
Message-ID: <11114623.post@talk.nabble.com>


I have a program that uses the BioPerl RemoteBlast module to compare an
amino acid FASTA file with the Swiss-Prot database. The submit_blast()
method works successfully when run from the command line, but when the
program is run from a web browser it returns -1. I was trying to adapt the
code from the RemoteBlast synopsis for my needs.


From bix at sendu.me.uk  Thu Jun 14 11:34:27 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 16:34:27 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46710BC4.3060302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
Message-ID: <46716003.2030302@sendu.me.uk>

Sendu Bala wrote:
> It is preferable to have ->new syntax over new Object syntax, as 
> outlined here: 
> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
> 
> I propose making this syntax change in all Bioperl POD documentation, so 
> that the bad syntax is no longer suggested/encouraged. Any objections? 
> If not, I'll go ahead and commit the changes.
> 
> (affects 907 modules in live)

It was actually 515 modules & test scripts from live, 48 from run, 21
from db and 2 from network.

Now committed. Before and after my changes these were failing:


Failed Test     Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/BioGraphics.t    3   768    38    3  3-5
t/PodSyntax.t      9  2304  2195    9  378 614 660 1023 1197 1512 1558
                                        1932 2106
t/Sopma.t          2   512    16    2  8 15
t/genbank.t        2   512   247    2  122-123


BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
(unintentional?).

Sopma may not be a bug: results from server might have changed.

genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163
-> 1.164 not doing what the new tests expect.

PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
you working on that, or can I fix those errors?

Anyone care to look into those things?

Cheers,
Sendu.


From cjfields at uiuc.edu  Thu Jun 14 12:35:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 11:35:21 -0500
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <AAFC1021-9E3A-4C31-A9B8-4B0046F907A1@uiuc.edu>

The genbank commit was mine so I'll look into it; may be that I  
hadn't finished up the bug work.  If I have time I'll look into  
Sopma as well (unless you get to it first).

chris

On Jun 14, 2007, at 10:34 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object- 
>> oriented_programming_and_modules
>>
>> I propose making this syntax change in all Bioperl POD  
>> documentation, so
>> that the bad syntax is no longer suggested/encouraged. Any  
>> objections?
>> If not, I'll go ahead and commit the changes.
>>
>> (affects 907 modules in live)
>
> It was actually 515 modules & test scripts from live, 48 from run, 21
> from db and 2 from network.
>
> Now committed. Before and after my changes these were failing:
>
>
> Failed Test     Stat Wstat Total Fail  List of Failed
> ---------------------------------------------------------------------- 
> ---------
> t/BioGraphics.t    3   768    38    3  3-5
> t/PodSyntax.t      9  2304  2195    9  378 614 660 1023 1197 1512 1558
>                                         1932 2106
> t/Sopma.t          2   512    16    2  8 15
> t/genbank.t        2   512   247    2  122-123
>
>
> BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
> (unintentional?).
>
> Sopma may not be a bug: results from server might have changed.
>
> genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm  
> 1.163
> -> 1.164 not doing what the new tests expect.
>
> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan,  
> are
> you working on that, or can I fix those errors?
>
> Anyone care to look into those things?
>
> Cheers,
> Sendu.
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Thu Jun 14 12:43:43 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 17:43:43 +0100
Subject: [Bioperl-l] Perltidy
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <4671703F.4010109@sheffield.ac.uk>

I'm just wondering if anyone passes their modules through perltidy in
order for them to have the same look/feel? If so, do you have a
.perltidyrc file? Also, is it worth running the Bioperl modules through it?

Nath


From n.haigh at sheffield.ac.uk  Thu Jun 14 12:36:37 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 17:36:37 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <46716E95.3090604@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as 
>> outlined here: 
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
>>
>> I propose making this syntax change in all Bioperl POD documentation, so 
>> that the bad syntax is no longer suggested/encouraged. Any objections? 
>> If not, I'll go ahead and commit the changes.
>>
>> (affects 907 modules in live)
> 
> It was actually 515 modules & test scripts from live, 48 from run, 21
> from db and 2 from network.
> 
> Now committed. Before and after my changes these were failing:
> 
> 
> Failed Test     Stat Wstat Total Fail  List of Failed
> -------------------------------------------------------------------------------
> t/BioGraphics.t    3   768    38    3  3-5
> t/PodSyntax.t      9  2304  2195    9  378 614 660 1023 1197 1512 1558
>                                         1932 2106
> t/Sopma.t          2   512    16    2  8 15
> t/genbank.t        2   512   247    2  122-123
> 
> 
> BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
> (unintentional?).
> 
> Sopma may not be a bug: results from server might have changed.
> 
> genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163
> -> 1.164 not doing what the new tests expect.
> 
> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
> you working on that, or can I fix those errors?
> 

I can fix these - although I'm still trying to get my new Debian 4.0
system up-to-speed so it might take me a little while! RE the
PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't
installed. However, would it be better to have Test::Pod in t/lib so
that it runs on the user's system during installation or leave it as is?
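
For reference, the "skip if not installed" approach boils down to checking whether the optional module loads before committing to any tests. The sketch below shows only that check; module_available is a hypothetical helper name, and this is not the actual contents of t/PodSyntax.t. In a real test script the result would typically feed Test::More's "plan skip_all => ..." rather than a print.

```perl
use strict;
use warnings;

# True (1) if $module can be loaded on this system, false (0) otherwise.
sub module_available {
    my ($module) = @_;
    my $ok = eval "require $module; 1";
    return $ok ? 1 : 0;
}

if (module_available('Test::Pod')) {
    print "Test::Pod found: POD syntax tests would run\n";
}
else {
    print "Test::Pod missing: POD syntax tests would be skipped\n";
}
```

The trade-off is the one raised above: skipping silently keeps installation painless, while bundling the module would make sure the POD checks always run.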

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGcW6VczuW2jkwy2gRAv3dAKCURgd4F881MhbessKxNh/cPrJu2wCeLwnS
7olroF2e6+4I0biz6fWRmu4=
=s3hK
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Thu Jun 14 13:15:24 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 18:15:24 +0100
Subject: [Bioperl-l] Perltidy
In-Reply-To: <4671703F.4010109@sheffield.ac.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk>
Message-ID: <467177AC.8060104@sendu.me.uk>

Nathan S. Haigh wrote:
> I'm just wondering if anyone passes their modules through perltidy in
> order for them to have the same look/feel? If so, do you have a
> .perltidyrc file? Also, is it worth running the Bioperl modules through it?

I don't use it, but I was contemplating the same thing. Chris uses it 
from time to time and I think we have a similar taste in style.

But we'd have to hammer something out that was agreeable to everyone.


From mmokrejs at ribosome.natur.cuni.cz  Thu Jun 14 13:19:42 2007
From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ)
Date: Thu, 14 Jun 2007 19:19:42 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
Message-ID: <467178AE.5040905@ribosome.natur.cuni.cz>


David Messina wrote:
> Hi Martin,
> 
> You're in luck -- the BioPerl core distribution includes two scripts  
> for doing just that:
> 
> 	genbank2gff

Somehow these scripts were not installed for me on Gentoo, but I have them in the
CVS copy. ;-) Anyway, the one above is not for me: I do not need the GFF database,
or rather, I have no intention of installing that unknown thing; it seems like overkill
for my case. I just want to render a plasmid map.

> 	genbank2gff3

This one seems more promising, but even with a current CVS checkout I get...

$ perl /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl --in stdin --out stdout < ~/99.gb 
# Input: stdin
Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, <FH> line 7.
Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, <FH> line 7.
Can't call method "binomial" on an undefined value at /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl line 675, <FH> line 125.
$
$ bp_seqconvert.pl --from genbank --to embl < ~/IRESite/gb/99.gb 
Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, <STDIN> line 7.
Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, <STDIN> line 7.
ID   unknown; SV 1; circular; unassigned DNA; STD; UNC; 5391 BP.
XX
AC   unknown;
XX
XX
XX
CC   ApEinfo:methylated:0
...

Oh dear, I have just manually edited the files and still they are wrong? Oh no. :(

> 
> Look in the scripts directory of the distro.
> 
> Also, there is a *huge* amount of documentation and examples on the  
> BioPerl website.
> 
> 	http://www.bioperl.org/wiki/HOWTOs

You mean http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File ? ;-)

> 
> Reading those, reading the FAQ, and searching the mailing list  
> archives are where I look first when I don't know how to do something  
> in BioPerl.
> 
> 
> Dave
> 
> --
> Dave Messina
> Senior Analyst, Assembly Group
> Genome Sequencing Center
> Washington University
> St. Louis, MO
> 
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 

-- 
Dr. Martin Mokrejs
Dept. of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 99.gb
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070614/fc6e601a/attachment-0003.pl>

From mmokrejs at ribosome.natur.cuni.cz  Thu Jun 14 13:23:28 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Thu, 14 Jun 2007 19:23:28 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <467178AE.5040905@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
Message-ID: <46717990.6040509@ribosome.natur.cuni.cz>

Martin MOKREJŠ wrote:

>> Also, there is a *huge* amount of documentation and examples on the  
>> BioPerl website.
>>
>>     http://www.bioperl.org/wiki/HOWTOs
> 
> You mean 
> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File 
> ? ;-)

$ perl embl2picture.pl ~/99.gb | display -
Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature Bio::Location::Simple=HASH(0x893ebac): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature Bio::Location::Simple=HASH(0x893e720): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.
$

The plasmid is circular DNA, so why is the diagram linear? ;-)

Martin


From bix at sendu.me.uk  Thu Jun 14 13:03:34 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 18:03:34 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716E95.3090604@sheffield.ac.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<46716E95.3090604@sheffield.ac.uk>
Message-ID: <467174E6.1090001@sendu.me.uk>

Nathan S. Haigh wrote:
>> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
>> you working on that, or can I fix those errors?
> 
> I can fix these - although I'm still trying to get my new Debian 4.0
> system up-to-speed so it might take me a little while! RE the
> PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't
> installed. However, would it be better to have Test::Pod in t/lib so
> that it runs on the user's system during installation or leave it as is?

Leave it as is. Every-day users don't need to check the syntax of the 
pod. In fact, it really only needs to be done once, prior to packaging 
up a new release.
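
For reference, a minimal sketch of the skip-if-missing pattern discussed
above. This is not the actual t/PodSyntax.t: the temporary file and its
contents are stand-ins, and a real pre-release check would loop over the
module tree instead.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Test::More tests => 1;

SKIP: {
    # Skip (rather than fail) when the optional module is missing.
    eval { require Test::Pod; 1 }
        or skip 'Test::Pod not installed', 1;

    # Stand-in for a real module file (hypothetical content).
    my ( $fh, $path ) = tempfile( SUFFIX => '.pm', UNLINK => 1 );
    print {$fh} "=head1 NAME\n\nExample - placeholder module\n\n=cut\n\n1;\n";
    close $fh;

    # Passes when the file's POD parses without errors.
    Test::Pod::pod_file_ok( $path, 'POD syntax is valid' );
}
```

Because the check sits inside a SKIP block, users without Test::Pod see a
skipped test instead of a failure, which matches the "leave it as is"
suggestion.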


From n.haigh at sheffield.ac.uk  Thu Jun 14 13:32:37 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 18:32:37 +0100
Subject: [Bioperl-l] Perltidy
In-Reply-To: <467177AC.8060104@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
Message-ID: <46717BB5.8000706@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> I'm just wondering if anyone passes their modules through perltidy in
>> order for them to have the same look/feel? If so, do you have a
>> .perltidyrc file? Also, is it worth running the Bioperl modules
>> through it?
> 
> I don't use it, but I was contemplating the same thing. Chris uses it
> from time to time and I think we have a similar taste in style.
> 
> But we'd have to hammer something out that was agreeable to everyone.

A starting place may be Perl Best Practices by Damian Conway:
http://www.oreilly.com/catalog/perlbp/


The perltidyrc file can be found here:
http://www.perlmonks.org/?node_id=485885

I also found this nice thread with some ideas, including some code that
causes emacs to auto-perltidy everything you use cperl-mode with. I don't
use emacs myself, but here's the link if anyone is interested:
http://www.perlmonks.org/?node_id=516501

Nath


From johnsonm at gmail.com  Thu Jun 14 13:38:31 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Thu, 14 Jun 2007 12:38:31 -0500
Subject: [Bioperl-l] Perltidy
In-Reply-To: <467177AC.8060104@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
Message-ID: <ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>

    The nice thing about Perl Tidy is that everybody can have their
own config file.  There could be a bioperl default config that gets
applied at checkin time.  Anybody that didn't like it could script
checkouts to get run through their own config.  Diffs might get a
little hairy, but as long as you tidy before diffing, it shouldn't be
too bad.  Speaking of which....coding style is controversial enough,
but since that's already been opened, what about CVS vs Subversion? 8)
 Some of the scripting for this sort of thing might be easier in
Subversion.  Though maybe something like Git would fit the developer
model better (more support for distributed development).
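
A rough sketch of what applying a shared config at checkin time could
look like, using the Perl::Tidy module directly. The option values in
$shared_rc and the tidy_source() helper are hypothetical; the actual
settings would have to be agreed on the list.

```perl
use strict;
use warnings;
use Perl::Tidy;

# Hypothetical project-wide settings, in .perltidyrc syntax.
my $shared_rc = "-l=78\n-i=4\n-ci=4\n";

# Tidy a string of Perl source with the shared settings and return the
# result; die if perltidy reports a problem instead of returning
# half-formatted code.
sub tidy_source {
    my ($source) = @_;
    my ( $tidied, $errors ) = ( '', '' );
    my $failed = Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$tidied,
        stderr      => \$errors,
        perltidyrc  => \$shared_rc,    # used instead of any personal .perltidyrc
    );
    die "perltidy failed: $errors" if $failed;
    return $tidied;
}

print tidy_source("if(\$x){print \"hi\";}\n");
```

Passing perltidyrc explicitly means a personal ~/.perltidyrc does not
leak into what gets checked in, while developers remain free to re-tidy
their own checkouts however they like.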

On 6/14/07, Sendu Bala <bix at sendu.me.uk> wrote:
> Nathan S. Haigh wrote:
> > I'm just wondering if anyone passes their modules through perltidy in
> > order for them to have the same look/feel? If so, do you have a
> > .perltidyrc file? Also, is it worth running the Bioperl modules through it?
>
> I don't use it, but I was contemplating the same thing. Chris uses it
> from time to time and I think we have a similar taste in style.
>
> But we'd have to hammer something out that was agreeable to everyone.


From n.haigh at sheffield.ac.uk  Thu Jun 14 13:39:39 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 18:39:39 +0100
Subject: [Bioperl-l] cvs changes in working copy
Message-ID: <46717D5B.5040108@sheffield.ac.uk>

Not sure if I'm being dense or if it's because I've been working with
svn recently, but - how do I get a list of files that are different in
my working copy compared to the repository?

Cheers
Nath


From cjfields at uiuc.edu  Thu Jun 14 13:46:38 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 12:46:38 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
Message-ID: <CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>

Is 99.gb supposed to be a GenBank file?  And you're loading it into  
embl2picture (which I assume takes EMBL format files)?  Without  
example code we can easily make the wrong assumptions (i.e. that this  
is user error and not a BioPerl problem).

Also, I don't believe the feature plotting scripts plot circular  
chromosomes/plasmids.  If you want this functionality you'll have to  
code it for yourself.

chris

On Jun 14, 2007, at 12:23 PM, Martin MOKREJŠ wrote:

> Martin MOKREJŠ wrote:
>
>>> Also, there is a *huge* amount of documentation and examples on the
>>> BioPerl website.
>>>
>>>     http://www.bioperl.org/wiki/HOWTOs
>>
>> You mean
>> http://www.bioperl.org/wiki/ 
>> HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
>> ? ;-)
>
> $ perl embl2picture.pl ~/99.gb | display -
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature  
> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature  
> Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature  
> Bio::Location::Simple=HASH(0x893ebac): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature  
> Bio::Location::Simple=HASH(0x893e720): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature  
> Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
> $
>
> The plasmid is a circular DNA, why is the diagram in linear? ;-)
>
> Martin
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From arareko at campus.iztacala.unam.mx  Thu Jun 14 13:57:35 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 14 Jun 2007 12:57:35 -0500
Subject: [Bioperl-l] Perltidy
In-Reply-To: <46717BB5.8000706@sheffield.ac.uk>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk> <46717BB5.8000706@sheffield.ac.uk>
Message-ID: <4671818F.5040902@campus.iztacala.unam.mx>

I think a consensus .perltidyrc could be placed in the source distribution.

Mauricio.

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> I'm just wondering if anyone passes their modules through perltidy in
>>> order for them to have the same look/feel? If so, do you have a
>>> .perltidyrc file? Also, is it worth running the Bioperl modules
>>> through it?
>> I don't use it, but I was contemplating the same thing. Chris uses it
>> from time to time and I think we have a similar taste in style.
>>
>> But we'd have to hammer something out that was agreeable to everyone.
> 
> A starting place may be Perl Best Practices by Damian Conway:
> http://www.oreilly.com/catalog/perlbp/
> 
> 
> The perltidyrc file can be found here:
> http://www.perlmonks.org/?node_id=485885
> 
> I also found this nice thread with some ideas, including some code that
> causes emacs to auto-perltidy everything you use cperl-mode with. I don't
> use emacs myself, but here's the link if anyone is interested:
> http://www.perlmonks.org/?node_id=516501
> 
> Nath

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From cjfields at uiuc.edu  Thu Jun 14 14:32:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 13:32:41 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
Message-ID: <BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>

To chip in on this, I only use perltidy when I need to clean bioperl  
code up for debugging (particularly if blocks are hard to see) and  
just use the defaults.  I agree it would be nice to have everything  
tidied up but it'll definitely need to be a consensus config file.

About svn, I like the idea of eventually migrating to using it over  
CVS (I think BioPython and BioJava have plans to but I'm not sure)  
but I don't really know enough to say how feasible/difficult the  
migration path would be.  Anyone know?

chris

On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote:

>     The nice thing about Perl Tidy is that everybody can have their
> own config file.  There could be a bioperl default config that gets
> applied at checkin time.  Anybody that didn't like it could script
> checkouts to get run through their own config.  Diffs might get a
> little hairy, but as long as you tidy before diffing, it shouldn't be
> too bad.  Speaking of which....coding style is controversial enough,
> but since that's already been opened, what about CVS vs Subversion? 8)
>  Some of the scripting for this sort of thing might be easier in
> Subversion.  Though maybe something like Git would fit the developer
> model better (more support for distributed development).
>
> On 6/14/07, Sendu Bala <bix at sendu.me.uk> wrote:
>> Nathan S. Haigh wrote:
>>> I'm just wondering if anyone passes their modules through  
>>> perltidy in
>>> order for them to have the same look/feel? If so, do you have a
>>> .perltidyrc file? Also, is it worth running the Bioperl modules  
>>> through it?
>>
>> I don't use it, but I was contemplating the same thing. Chris uses it
>> from time to time and I think we have a similar taste in style.
>>
>> But we'd have to hammer something out that was agreeable to everyone.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnsonm at gmail.com  Thu Jun 14 14:46:24 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Thu, 14 Jun 2007 13:46:24 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
Message-ID: <ebf5eb170706141146r6e07efffhbb98a6d101c45ccd@mail.gmail.com>

    If there was a default/standard/consensus bioperl perltidy config
file, I would probably use it prior to checkin, on my own, so I could
code in my schizophrenic style without worrying about starting any
format wars.  When I'm fixing or enhancing somebody else's code, I
always try and adapt to whatever style they used, even if it grates on
my nerves.  I'd love to not have to worry about that with Bioperl.  Of
course, nobody will ever agree on a standard, so it's probably a moot
point.  8)

On 6/14/07, Chris Fields <cjfields at uiuc.edu> wrote:
> To chip in on this, I only use perltidy when I need to clean bioperl
> code up for debugging (particularly if blocks are hard to see) and
> just use the defaults.  I agree it would be nice to have everything
> tidied up but it'll definitely need to be a consensus config file.
>
> About svn, I like the idea of eventually migrating to using it over
> CVS (I think BioPython and BioJava have plans to but I'm not sure)
> but I don't really know enough to say how feasible/difficult the
> migration path would be.  Anyone know?
>
> chris


From jason at bioperl.org  Thu Jun 14 15:00:09 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 14 Jun 2007 12:00:09 -0700
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
Message-ID: <CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>


On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:

> To chip in on this, I only use perltidy when I need to clean bioperl
> code up for debugging (particularly if blocks are hard to see) and
> just use the defaults.  I agree it would be nice to have everything
> tidied up but it'll definitely need to be a consensus config file.
>

Can we do any sort of massive conversion at some logical timepoint,
probably after a branch release or something?  Because it basically
means we're going to have differences on nearly every line, which is
going to make diffing difficult when debugging old/new versions.
Maybe it is not a problem because we aren't introducing any new bugs!

> About svn, I like the idea of eventually migrating to using it over
> CVS (I think BioPython and BioJava have plans to but I'm not sure)
> but I don't really know enough to say how feasible/difficult the
> migration path would be.  Anyone know?
>

It's doable but non-trivial.  cvs2svn (python gah!) script exists to  
help in this.  There are pros and cons to converting.   There is a  
fair amount of documentation and other pointers out there that point  
to the CVS server for getting latest code so we'd need to think about  
whether we'd support some sort of backwards compatible SVN -> CVS for  
read-only or what.

Mostly it will need someone to lead the charge - I made a go at doing  
it in the winter, but I really don't have the SVN-foo to make this  
work.  We'd need someone with SVN experience to step up and help.   
You can always try and we can play with the converted repository for  
a while without making it the new code base.

-j

> chris
>
> On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote:
>
>>     The nice thing about Perl Tidy is that everybody can have their
>> own config file.  There could be a bioperl default config that gets
>> applied at checkin time.  Anybody that didn't like it could script
>> checkouts to get run through their own config.  Diffs might get a
>> little hairy, but as long as you tidy before diffing, it shouldn't be
>> too bad.  Speaking of which....coding style is controversial enough,
>> but since that's already been opened, what about CVS vs  
>> Subversion? 8)
>>  Some of the scripting for this sort of thing might be easier in
>> Subversion.  Though maybe something like Git would fit the developer
>> model better (more support for distributed development).
>>
>> On 6/14/07, Sendu Bala <bix at sendu.me.uk> wrote:
>>> Nathan S. Haigh wrote:
>>>> I'm just wondering if anyone passes their modules through
>>>> perltidy in
>>>> order for them to have the same look/feel? If so, do you have a
>>>> .perltidyrc file? Also, is it worth running the Bioperl modules
>>>> through it?
>>>
>>> I don't use it, but I was contemplating the same thing. Chris  
>>> uses it
>>> from time to time and I think we have a similar taste in style.
>>>
>>> But we'd have to hammer something out that was agreeable to  
>>> everyone.
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Thu Jun 14 15:01:27 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 14 Jun 2007 12:01:27 -0700
Subject: [Bioperl-l] cvs changes in working copy
In-Reply-To: <46717D5B.5040108@sheffield.ac.uk>
References: <46717D5B.5040108@sheffield.ac.uk>
Message-ID: <EE64F124-7DA2-4FB1-BE9B-C267126FCF6F@bioperl.org>

cvs update | grep '^M'
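
Note that a plain `cvs update` also merges in whatever has changed in
the repository. The global -n option (as in `cvs -n -q update`) makes
CVS report the same status letters without touching the working copy.
A small sketch of filtering that output in Perl; the modified_files()
helper and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Pick out locally modified files from the status lines that
# `cvs update` prints (one status letter, a space, then the path).
sub modified_files {
    my @lines = @_;
    my @files;
    for my $line (@lines) {
        push @files, $1 if $line =~ /^M (.+)$/;
    }
    return @files;
}

# Hypothetical sample output; real input could come from something like
#   open my $cvs, '-|', 'cvs', '-n', '-q', 'update' or die "cvs: $!";
my @sample = (
    "M Bio/SeqIO/genbank.pm\n",
    "? notes.txt\n",
    "U t/genbank.t\n",
    "M t/PodSyntax.t\n",
);
print "$_\n" for modified_files(@sample);
```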

On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote:

> Not sure if I'm being dense or if it's because I've been working with
> svn recently, but - how do I get a list of files that are different in
> my working copy compared to the repository?
>
> Cheers
> Nath

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From cjfields at uiuc.edu  Thu Jun 14 15:20:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 14:20:46 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
Message-ID: <C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>


On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote:

>
> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:
>
>> To chip in on this, I only use perltidy when I need to clean bioperl
>> code up for debugging (particularly if blocks are hard to see) and
>> just use the defaults.  I agree it would be nice to have everything
>> tidied up but it'll definitely need to be a consensus config file.
>>
>
> Can we do any sort of massive conversion at some logical timepoint,
> probably after a branch release or something?  Because it basically
> means we're going to have differences on nearly every line, which is
> going to make diffing difficult when debugging old/new versions.
> Maybe it is not a problem because we aren't introducing any new bugs!

I agree; if we intend to do this it should be all at once, maybe
on a dedicated branch to ensure that the formatting changes don't tank
tests (they shouldn't, but one never knows).  We would then need a script
up and running that tidies everything up prior to commits (though what
happens if perltidy tanks?...).

Sendu, up for it?

>> About svn, I like the idea of eventually migrating to using it over
>> CVS (I think BioPython and BioJava have plans to but I'm not sure)
>> but I don't really know enough to say how feasible/difficult the
>> migration path would be.  Anyone know?
>>
>
> It's doable but non-trivial.  cvs2svn (python gah!) script exists to
> help in this.  There are pros and cons to converting.   There is a
> fair amount of documentation and other pointers out there that point
> to the CVS server for getting latest code so we'd need to think about
> whether we'd support some sort of backwards compatible SVN -> CVS for
> read-only or what.
>
> Mostly it will need someone to lead the charge - I made a go at doing
> it in the winter, but I really don't have the SVN-foo to make this
> work.  We'd need someone with SVN experience to step up and help.
> You can always try and we can play with the converted repository for
> a while without making it the new code base.
>
> -j

Stepped into that one, didn't I!  I'll look into how much effort is  
involved and try getting something going in the next month or two,  
maybe sooner if time permits.  I'm lacking on SVN-foo as well but it  
might be worth looking into.

chris


From arareko at campus.iztacala.unam.mx  Thu Jun 14 15:50:39 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 14 Jun 2007 14:50:39 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
Message-ID: <46719C0F.5010706@campus.iztacala.unam.mx>

Chris Fields wrote:
> On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote:
> 
>> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:
>>
>>> About svn, I like the idea of eventually migrating to using it over
>>> CVS (I think BioPython and BioJava have plans to but I'm not sure)
>>> but I don't really know enough to say how feasible/difficult the
>>> migration path would be.  Anyone know?
>>>
>> It's doable but non-trivial.  cvs2svn (python gah!) script exists to
>> help in this.  There are pros and cons to converting.   There is a
>> fair amount of documentation and other pointers out there that point
>> to the CVS server for getting latest code so we'd need to think about
>> whether we'd support some sort of backwards compatible SVN -> CVS for
>> read-only or what.
>>
>> Mostly it will need someone to lead the charge - I made a go at doing
>> it in the winter, but I really don't have the SVN-foo to make this
>> work.  We'd need someone with SVN experience to step up and help.
>> You can always try and we can play with the converted repository for
>> a while without making it the new code base.
>>
>> -j
> 
> Stepped into that one, didn't I!  I'll look into how much effort is  
> involved and try getting something going in the next month or two,  
> maybe sooner if time permits.  I'm lacking on SVN-foo as well but it  
> might be worth looking into.
> 
> chris
> 

Chris D has worked with CVS-SVN transitioning for other projects, maybe 
he can shed some light on this.

Mauricio.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From sac at bioperl.org  Thu Jun 14 17:33:39 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Thu, 14 Jun 2007 14:33:39 -0700
Subject: [Bioperl-l] How can I pull out all instances of a motif from a
	genome sequence and output them as a BED file?
In-Reply-To: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>
References: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>
	<5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>
Message-ID: <8f200b4c0706141433i37267774u1dc2193d8508c47b@mail.gmail.com>

This issue was discussed recently here. Check out this thread:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15046/focus=15048

Some of the tools listed in the FAQ item Chris pointed to do not
report where the match occurred, only that a match occurred
(String::Approx, agrep), though some do report match
locations (fuzznuc, fuzzprot; not sure about TFBS).

My Bio::Tools::SeqPattern module does not even perform any matches, it
just encapsulates a regular expression for a nuc or protein motif and
knows how to handle ambiguity code expansion and reverse
complementing. The idea is that you can use this to convert a
biological sequence motif into a string suitable for use in a perl
regex. Adding a match() method to this module would be handy.
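
The list of 1's from the script quoted below is most likely operator
precedence: `.` and `+` bind equally and left to right, so the string
is concatenated first and the result is then used as a number. Wrapping
the arithmetic in parentheses, (pos($sequence_as_a_string) + 1), avoids
that. For BED output, Perl's built-in @- and @+ arrays give the offsets
of each match. A minimal sketch, where motif_to_bed() and the chromosome
name are hypothetical and the motif is assumed to already be a valid
Perl regex:

```perl
use strict;
use warnings;

# Return BED lines (chrom, 0-based start, exclusive end) for every
# occurrence of $motif in $seq, including overlapping occurrences.
sub motif_to_bed {
    my ( $chrom, $seq, $motif ) = @_;
    my @bed;
    # The lookahead consumes no characters, so the regex engine advances
    # one base at a time and overlapping hits are reported too.
    while ( $seq =~ /(?=($motif))/gi ) {
        # $-[1] and $+[1] are the start and end offsets of capture group 1.
        push @bed, join( "\t", $chrom, $-[1], $+[1] );
    }
    return @bed;
}

print "$_\n" for motif_to_bed( 'chr3L', 'GAAAATTAAAC', 'AAA' );
```

For the toy sequence above this reports the overlapping hits at 1-4 and
2-5 as well as the one at 7-10. Perl's 0-based start and exclusive end
offsets line up with BED coordinates, so no +1 adjustment is needed.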

There is an example script for it in examples/tools of the distro (which,
btw, references an obsolete module, so it won't run as is -- I'll fix).

Steve

On 6/13/07, Chris Fields <cjfields at uiuc.edu> wrote:
> This is answered in the FAQ (sorry if the URL wraps, but we don't
> like tinyurls):
>
> http://www.bioperl.org/wiki/
> FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_.
> 22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F
>
> chris
>
> On Jun 13, 2007, at 7:20 PM, John Cumbers wrote:
>
> > Hello,
> >
> > I have a simple problem: I'm trying to search a genome sequence for a
> > motif, and I then want to output a BED file to display all the
> > locations of this motif on the UCSC Genome Browser.  I could not find
> > a script to do this, so I started to write my own.  I'm new to perl,
> > and my code below was my attempt to read the sequence string and
> > output the index (in bp) of the start of each motif.  With this I
> > could build the BED file myself, which requires start and finish base
> > pairs.
> >
> > For the first motif I can output the start index, but when I try to
> > read the next one off the sequence it does not work.  Instead I just
> > get a list of 1's.  I realise that this is more a request for some
> > simple perl help, but any help is much appreciated.
> >
> > Best wishes,
> > John
> >
> >
> > # turn my FASTA file into a seq object
> > $seq_object = read_sequence("Drosophila.Chr3.test.AE014296.fasta");
> > $sequence_as_a_string = $seq_object->seq();  # turn it into a string
> > # search $sequence_as_a_string for motif AAA as an example
> > # if found, return the index that it is found at
> >
> > while ($sequence_as_a_string =~ m/AAA/g) {
> >   print "Found '$&'.  Next attempt at character " .
> >     pos($sequence_as_a_string)+1 . "\n";
> > }
> >
> >
> >
> > --
> > John Cumbers,  Graduate Student
> > Biology and Medicine
> > Brown University, Box G-W
> > Providence, Rhode Island, 02912, USA
> > Tel USA: +1 401 523 8190,  Fax: +1 401 863-2166
> > UK to USA: 0207 617 7824
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>
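
A note on the "list of 1's" in the snippet quoted above: in Perl, "."
and "+" have the same precedence and associate left to right, so the
print statement first glues the message to pos(), then adds 1 to that
string (which numifies to 0), and only then appends the newline. Below
is a minimal sketch of one way to collect motif positions as BED-style
lines. The sub name motif_to_bed, the chromosome label and the toy
sequence are made up for illustration; a plain literal motif is
assumed, so IUPAC ambiguity codes would first need expanding (for
example via Bio::Tools::SeqPattern, as Steve describes).

```perl
use strict;
use warnings;

# Return BED-style lines (chrom, 0-based start, exclusive end) for every
# occurrence of $motif in $sequence, overlapping occurrences included.
sub motif_to_bed {
    my ($chrom, $sequence, $motif) = @_;
    my @lines;
    # The lookahead makes each match zero-width, so pos() is the start of
    # the hit and the next search resumes one base later, not after the hit.
    while ($sequence =~ /(?=\Q$motif\E)/g) {
        my $start = pos($sequence);
        my $end   = $start + length($motif);
        push @lines, join("\t", $chrom, $start, $end);
    }
    return @lines;
}

# Toy sequence and chromosome name, invented for this example.
print "$_\n" for motif_to_bed('chr3L', 'CCAAAGGAAAATT', 'AAA');
```

On the toy sequence this reports hits starting at 2, 7 and 8 (0-based),
the last two overlapping. BED uses 0-based starts and exclusive ends,
which is why no "+1" is needed on the start.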


From hlapp at gmx.net  Thu Jun 14 19:04:11 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 14 Jun 2007 19:04:11 -0400
Subject: [Bioperl-l] cvs changes in working copy
In-Reply-To: <EE64F124-7DA2-4FB1-BE9B-C267126FCF6F@bioperl.org>
References: <46717D5B.5040108@sheffield.ac.uk>
	<EE64F124-7DA2-4FB1-BE9B-C267126FCF6F@bioperl.org>
Message-ID: <3B262E6A-2C90-49FA-BCA1-BF1900C5AC3A@gmx.net>

Actually, that will update your working copy. If you just wanted to
take a peek you would use cvs status:

$ cvs status | grep 'Locally Modified'

	-hilmar

On Jun 14, 2007, at 3:01 PM, Jason Stajich wrote:

> cvs update | grep '^M'
>
> On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote:
>
>> Not sure if I'm being dense or if it's because I've been working with
>> svn recently, but - how do I get a list of files that are  
>> different in
>> my working copy compared to the repository?
>>
>> Cheers
>> Nath
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================
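
For anyone who prefers the short "M file" listing without touching the
working copy, here is a rough sketch in Perl. It relies on the global
cvs options -n (do not change any files) and -q (be quieter); the sub
name locally_modified is made up for this example, and the real
command is only attempted when a CVS/ directory is present.

```perl
use strict;
use warnings;

# Pick out the paths that "cvs update" flags with a leading "M"
# (locally modified); every other status line is ignored.
sub locally_modified {
    return map { /^M\s+(\S.*)$/ ? $1 : () } @_;
}

# Only attempt the real command inside a CVS checkout.
if (-d 'CVS') {
    # -n: report what an update would do without changing any files
    # -q: suppress the per-directory chatter
    chomp(my @report = `cvs -n -q update 2>/dev/null`);
    print "$_\n" for locally_modified(@report);
}
```

The parsing is kept in its own sub so it can be tried on canned output
without a repository at hand.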


From mmokrejs at ribosome.natur.cuni.cz  Fri Jun 15 03:28:17 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Fri, 15 Jun 2007 09:28:17 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
	<CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
Message-ID: <46723F91.60501@ribosome.natur.cuni.cz>

Chris Fields wrote:
> Is 99.gb supposed to be a GenBank file?  And you're loading it into 

Yes, it was attached to the email. ;)

> embl2picture (which I assume takes EMBL format files)?  Without example 
> code we can easily make the wrong assumptions (i.e. that this is user 
> error and not a BioPerl problem).

use constant USAGE =><<END;
Usage: $0 <file>
   Render a GenBank/EMBL entry into drawable form.
   Return as a GIF or PNG image on standard output.
 
   File must be in embl, genbank, or another SeqIO-
   recognized format.  Only the first entry will be
   rendered.
 
Example to try:
   embl2picture.pl factor7.embl | display -
 
END

> 
> Also, I don't believe the feature plotting scripts plot circular 
> chromosomes/plasmids.  If you want this functionality you'll have to 
> code it for yourself.

That's a pity it does not, but at least someone could improve the docs. ;)
Unfortunately I don't have the time to rewrite the code myself now,
I need a working, standalone, already available tool. :(
M.

> 
> chris
> 
> On Jun 14, 2007, at 12:23 PM, Martin MOKREJŠ wrote:
> 
>> Martin MOKREJŠ wrote:
>>
>>>> Also, there is a *huge* amount of documentation and examples on the
>>>> BioPerl website.
>>>>
>>>>     http://www.bioperl.org/wiki/HOWTOs
>>>
>>> You mean
>>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File 
>>>
>>> ? ;-)
>>
>> $ perl embl2picture.pl ~/99.gb | display -
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature 
>> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature 
>> Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature 
>> Bio::Location::Simple=HASH(0x893ebac): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature 
>> Bio::Location::Simple=HASH(0x893e720): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature 
>> Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>> $
>>
>> The plasmid is a circular DNA, so why is the diagram linear? ;-)
>>
>> Martin
>>
>>
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> 
> 

-- 
Dr. Martin Mokrejs
Dept. of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs


From dhoworth at mrc-lmb.cam.ac.uk  Fri Jun 15 04:59:09 2007
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Fri, 15 Jun 2007 09:59:09 +0100
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
Message-ID: <467254DD.3010505@mrc-lmb.cam.ac.uk>

Martin MOKREJŠ wrote:
>>> Also, there is a *huge* amount of documentation and examples on
>>> the BioPerl website.
>>> 
>>> http://www.bioperl.org/wiki/HOWTOs
>> You mean 
>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
>>  ? ;-)
> 
> $ perl embl2picture.pl ~/99.gb | display - Error returned while
> evaluating value of 'description' option for glyph
> Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature
> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl
> line 141, <GEN0> line 125.

Hmm an error at line 141 of a 69 line script? Methinks you're not
actually running the script that's presented on the wiki page you
quoted. I cut-and-pasted the script and your file and it worked for me
(at least, it produced an image, along with a bunch of OOPS lines)

HTH, Dave


From n.haigh at sheffield.ac.uk  Fri Jun 15 06:21:38 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 15 Jun 2007 11:21:38 +0100
Subject: [Bioperl-l] Installation using --install_base
Message-ID: <46726832.7080601@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm setting up a new installation of Debian 4.0 at home and thought I'd
try to install BioPerl as a normal user rather than root. So in CPAN
options I set the --install_base to /home/username/perl and set PERL5LIB
to point to the same place.

Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root
user and ask to install all optional modules, it tries to install them
through CPAN - however it seems to fail because some dependencies don't
seem to want to install in a user directory.

Has anyone else found this or might I be doing something wrong?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGcmgyczuW2jkwy2gRAtgqAKDIv717ciVHr5V+Z1kqPV2a++E8dgCfYr2a
VPt4tEPLW2J+BiKnN3B8aV8=
=c+9z
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Fri Jun 15 06:07:04 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 15 Jun 2007 11:07:04 +0100
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
Message-ID: <467264C8.4020202@sendu.me.uk>

Chris Fields wrote:
> On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote:
> 
>> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:
>>
>>> To chip in on this, I only use perltidy when I need to clean bioperl
>>> code up for debugging (particularly if blocks are hard to see) and
>>> just use the defaults.  I agree it would be nice to have everything
>>> tidied up but it'll definitely need to be a consensus config file.
>>>
>> Can we do any sort of massive conversion at some logical timepoint?
>> Probably after a branch release or something?  Because it basically
>> means we're going to have differences on nearly every line which is
>> going to make diff-ing difficult when debugging old/new versions.
>> Maybe it is not a problem because we aren't introducing any new bugs!

Sorry, can you clarify the problem you envisage? And why would making a 
branch release help?


> I agree; if we intend on doing this it should be all at once, maybe
> on a branch dedicated to ensure that code changes don't tank tests
> (they shouldn't but one never knows).  We would then need a script
> up-and-running that tidies everything up prior to commits (though what
> happens if perltidy tanks?...).
> 
> Sendu, up for it?

If it's going to be difficult and a hassle, for such an unnecessary thing
I'm not sure it's worth it. There are more pressing things to be done for
Bioperl.

If I can just run perltidy on the entire package and commit, I'd do it. 
If that's not appropriate, I won't.


>>> About svn
[snip]
> Stepped into that one, didn't I!  I'll look into how much effort is  
> involved and try getting something going in the next month or two,  
> maybe sooner if time permits.  I'm lacking on SVN-foo as well but it  
> might be worth looking into.

I'd put this in the unnecessary-but-nice category as well. If it will be 
as easy as my ->new change, go ahead. If not, there are more pressing 
matters (POD fixing, test script updating and finishing...).


From bix at sendu.me.uk  Fri Jun 15 06:45:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 15 Jun 2007 11:45:48 +0100
Subject: [Bioperl-l] Installation using --install_base
In-Reply-To: <46726832.7080601@sheffield.ac.uk>
References: <46726832.7080601@sheffield.ac.uk>
Message-ID: <46726DDC.8090202@sendu.me.uk>

Nathan S. Haigh wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> I'm setting up a new installation of Debian 4.0 at home and thought I'd
> try to install BioPerl as a normal user rather than root. So in CPAN
> options I set the --install_base to /home/username/perl and set PERL5LIB
> to point to the same place.
> 
> Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root
> user and ask to install all optional modules, it tries to install them
> through CPAN - however it seems to fail because some dependencies don't
> seem to want to install in a user directory.
> 
> Has anyone else found this or might I be doing something wrong?

You'll need to configure CPAN to install into your user directory. 
Upgrade to the latest version, then go read the docs on the various 
configurable options. I thought I at least mentioned this in the Bioperl 
INSTALL doc. If not, can someone come up with a concise clarification?
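
As a starting point for such a clarification, here is a small sketch
that prints the settings one might try. makepl_arg and mbuildpl_arg
are CPAN.pm configuration variables, and INSTALL_BASE / --install_base
are the MakeMaker and Module::Build options respectively; the sub name
and the 'perl5lib' key are just labels for this example, and
/home/username/perl is the path from the question. One thing worth
checking: with an install base, modules normally land in lib/perl5
under that directory, so PERL5LIB would point there rather than at the
base itself. A reasonably recent ExtUtils::MakeMaker is assumed for
INSTALL_BASE.

```perl
use strict;
use warnings;

# Settings for a per-user install under $base. The first two keys are
# CPAN.pm configuration variables; 'perl5lib' is only a label used here.
sub user_install_settings {
    my ($base) = @_;
    return (
        makepl_arg   => "INSTALL_BASE=$base",      # Makefile.PL distributions
        mbuildpl_arg => "--install_base $base",    # Build.PL distributions
        perl5lib     => "$base/lib/perl5",         # where the modules end up
    );
}

my %settings = user_install_settings('/home/username/perl');

# Lines to type at the cpan> prompt, followed by the shell export.
print qq{o conf $_ "$settings{$_}"\n} for qw(makepl_arg mbuildpl_arg);
print "o conf commit\n";
print "export PERL5LIB=$settings{perl5lib}\n";
```

Dependencies that ignore both options would still try to install
system-wide, which might explain failures like the one described.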


From sdavis2 at mail.nih.gov  Fri Jun 15 06:56:08 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 15 Jun 2007 06:56:08 -0400
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <467264C8.4020202@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
Message-ID: <46727048.3080904@mail.nih.gov>

Sendu Bala wrote:
> If it's going to be difficult and a hassle, for such an unnecessary thing 
> I'm not sure it's worth it. There are more pressing things to be done for 
> Bioperl.
> 
> If I can just run perltidy on the entire package and commit, I'd do it. 
> If that's not appropriate, I won't.

I agree with the sentiment noted above.  I'm a bit of an outsider here,
but bioperl is a collaborative project.  Not everyone has the same
sentiments about what "correct" style means.  As a programmer, I really
wouldn't want significant changes on the style of my code.  And perl
happily puts up with many styles.  I would say leave things as they
are--let the individual programmers choose.  It reduces the amount of
work of questionable importance and allows the coding style freedom that
perl supports.

Just my $.02.

Sean


From cjfields at uiuc.edu  Fri Jun 15 10:05:07 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 09:05:07 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <46723F91.60501@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
	<CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
	<46723F91.60501@ribosome.natur.cuni.cz>
Message-ID: <A2212781-75F3-4BB7-967F-1668B682E84E@uiuc.edu>


On Jun 15, 2007, at 2:28 AM, Martin MOKREJŠ wrote:

> Chris Fields wrote:
>> Is 99.gb supposed to be a GenBank file?  And you're loading it into
>
> Yes, it was attached to the email. ;)

<bring foot to mouth and insert>

Sorry about that.  I notice that '.' was added, but the spacing  
seemed off.  I think bioperl catches that fine but it's something  
Wayne should consider.

>> embl2picture (which I assume takes EMBL format files)?  Without  
>> example
>> code we can easily make the wrong assumptions (i.e. that this is user
>> error and not a BioPerl problem).
>
> use constant USAGE =><<END;
> Usage: $0 <file>
>    Render a GenBank/EMBL entry into drawable form.
>    Return as a GIF or PNG image on standard output.
>
>    File must be in embl, genbank, or another SeqIO-
>    recognized format.  Only the first entry will be
>    rendered.
>
> Example to try:
>    embl2picture.pl factor7.embl | display -
>
> END

Horribly named script (should be seq2picture, since it converts both  
gb/embl).  The use of 'all_tags' makes me think the script version  
you are using is old, as those methods have long since been renamed.   
Dave has it working though, so maybe your version has been updated?   
The 'Use of uninitialized value' warnings are probably from inclusion
of mandatory fields with no data or '.'.

>> Also, I don't believe the feature plotting scripts plot circular
>> chromosomes/plasmids.  If you want this functionality you'll have to
>> code it for yourself.
>
> That's a pity it does not, but at least someone could improve
> the docs. ;)
> Unfortunately I don't have the time to rewrite the code myself now,
> I need a working, standalone, already available tool. :(
> M.

As I said, unless someone shows interest and codes it, it just won't get
done.  We have had very little interest in this, either b/c there are
tools already out there to do this very thing (multitudes of plasmid  
drawing programs, some free like ApE) or that nobody's bothered to  
write it up.

chris
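
To illustrate the renaming issue, here is a rough sketch of a
description callback that asks the object which tag methods it has
before calling them. get_all_tags/get_tag_values are the current
feature method names and all_tags/each_tag_value the older ones; the
sub name describe and the two Local:: stand-in classes are invented so
the example runs without BioPerl. It is not a fix for embl2picture.pl
itself, just a way to avoid dying on objects (such as bare locations)
that carry no tags.

```perl
use strict;
use warnings;

# Describe an object through whichever tag methods it provides, or
# return '' for objects that have none.
sub describe {
    my ($obj) = @_;
    my ($list_tags, $tag_values);
    if ($obj->can('get_all_tags')) {
        ($list_tags, $tag_values) = ('get_all_tags', 'get_tag_values');
    }
    elsif ($obj->can('all_tags')) {
        ($list_tags, $tag_values) = ('all_tags', 'each_tag_value');
    }
    else {
        return '';
    }
    my @names = sort $obj->$list_tags();
    return join '; ',
        map { my $tag = $_; "$tag=" . join(',', $obj->$tag_values($tag)) } @names;
}

package Local::Feature;     # stand-in offering the newer method names
sub new            { return bless { note => ['ori'], gene => ['bla'] }, shift }
sub get_all_tags   { return keys %{ $_[0] } }
sub get_tag_values { return @{ $_[0]{ $_[1] } } }

package Local::Location;    # stand-in with no tag methods at all
sub new { return bless {}, shift }

package main;

print describe(Local::Feature->new), "\n";
print '[', describe(Local::Location->new), "]\n";
```

With the stand-ins above the first call yields "gene=bla; note=ori" and
the second an empty string.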


From cjfields at uiuc.edu  Fri Jun 15 10:22:23 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 09:22:23 -0500
Subject: [Bioperl-l] Perltidy and... SVN and ...Re:  Perltidy
In-Reply-To: <46727048.3080904@mail.nih.gov>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov>
Message-ID: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>


On Jun 15, 2007, at 5:56 AM, Sean Davis wrote:

> Sendu Bala wrote:
>> If it's going to be difficult and a hassle, for such an unnecessary
>> thing I'm not sure it's worth it. There are more pressing things to be
>> done for Bioperl.
>>
>> If I can just run perltidy on the entire package and commit, I'd
>> do it. If that's not appropriate, I won't.
>
> I agree with the sentiment noted above.  I'm a bit of an outsider  
> here,
> but bioperl is a collaborative project.  Not everyone has the same
> sentiments about what "correct" style means.  As a programmer, I  
> really
> wouldn't want significant changes on the style of my code.  And perl
> happily puts up with many styles.  I would say leave things as they
> are--let the individual programmers choose.  It reduces the amount of
> work of questionable importance and allows the coding style freedom  
> that
> perl supports.
>
> Just my $.02.
>
> Sean

I tend to run it on modules that need some reformatting  
(SearchIO::blast comes to mind).  I believe you're correct when this  
comes down to programming style, but I think this echoes a sentiment  
(frustration, perhaps) that some of us have with long-term  
maintenance of said code.

Maybe a compromise:  include a copy of .perltidyrc with the  
distribution that goes by what a consensus wants or by the general  
rules laid out in Perl Best Practices (spaced settings, use of spaces  
over tabs, etc).  Conversion would be encouraged but voluntary, with  
the caveat that if someone needs to clean up code down the road (bug  
fixes, enhancements, etc) and if the original author isn't able to  
add it in themselves, it could be perltidy'd in order to help the  
developer (locate and fix the issue)|(add relevant enhancement where  
needed).

chris
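
To make the compromise a bit more concrete, here is a sketch that runs
Perl::Tidy from a script with one fixed option set. The options are in
the spirit of the Perl Best Practices recommendations and are only a
guess at what a consensus might look like; the sub name tidy_source is
made up. The same options, one per line, could live in a shared
.perltidyrc instead. -npro tells perltidy to ignore any personal
profile so everyone gets the same result.

```perl
use strict;
use warnings;

# Candidate project-wide options; a real consensus may well differ.
my @tidy_options = qw(
    -l=78 -i=4 -ci=4 -se -vt=2 -cti=0
    -pt=1 -bt=1 -sbt=1 -bbt=1 -nsfs -nolq
);

# Return a tidied copy of $source, or undef when Perl::Tidy is missing.
sub tidy_source {
    my ($source) = @_;
    return unless eval { require Perl::Tidy; 1 };
    my $tidied = '';
    Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$tidied,
        argv        => join(' ', '-npro', @tidy_options),
    );
    return $tidied;
}

my $result = tidy_source("if(\$x){print 1}\n");
print defined $result ? $result : "Perl::Tidy is not installed\n";
```

Running it on a copy of a module and diffing against the original
would give a feel for how many lines a mass conversion would touch.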


From cjfields at uiuc.edu  Fri Jun 15 10:56:23 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 09:56:23 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <467264C8.4020202@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
Message-ID: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>


On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:

>>>> ...
>>> Can we do any sort of massive conversion at some logical timepoint?
>>> Probably after a branch release or something?  Because it basically
>>> means we're going to have differences on nearly every line which is
>>> going to make diff-ing difficult when debugging old/new versions.
>>> Maybe it is not a problem because we aren't introducing any new
>>> bugs!
>
> Sorry, can you clarify the problem you envisage? And why would  
> making a branch release help?

Maybe the worry is that mass conversion in such a large codebase  
could lead to hard-to-locate bugs.  Shouldn't occur but who knows w/o  
trying?

>> I agree; if we intend on doing this it should be all at once,
>> maybe on a branch dedicated to ensure that code changes don't
>> tank tests (they shouldn't but one never knows).  We would then
>> need a script up-and-running that tidies everything up prior to
>> commits (though what happens if perltidy tanks?...).
>> Sendu, up for it?
>
> If it's going to be difficult and a hassle, for such an unnecessary
> thing I'm not sure it's worth it. There are more pressing things to
> be done for Bioperl.
>
> If I can just run perltidy on the entire package and commit, I'd do  
> it. If that's not appropriate, I won't.

The choices aren't necessarily all or nothing.  What about voluntary,  
recommended use of a perltidy config file included with the  
distribution, with additional 'caveats'?  See my response to Sean.

>>>> About svn
> [snip]
>> Stepped into that one, didn't I!  I'll look into how much effort  
>> is  involved and try getting something going in the next month or  
>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as  
>> well but it  might be worth looking into.
>
> I'd put this in the unnecessary-but-nice category as well. If it  
> will be as easy as my ->new change, go ahead. If not, there are  
> more pressing matters (POD fixing, test script updating and  
> finishing...).

A few other open-bio projects have actively discussed a CVS->SVN
migration (BioRuby and I think BioPython, though I could be wrong
about the latter).  As I said, "it might be worth looking into" to
weigh the pros/cons, get opinions from others who have made the
transition, etc.  We could, as Jason suggested, even set up a tester
SVN w/o making it the default codebase (lock it off to a few testers,  
have CVS commits automatically/manually carry over to SVN, etc).

I agree with you that it's not feasible to switch over prior to a  
release and that there are more pressing issues, but it doesn't hurt  
having an open discussion about it.

chris


From sdavis2 at mail.nih.gov  Fri Jun 15 11:15:57 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 15 Jun 2007 11:15:57 -0400
Subject: [Bioperl-l] Perltidy and... SVN and ...Re:  Perltidy
In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov>
	<78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
Message-ID: <4672AD2D.2090001@mail.nih.gov>

Chris Fields wrote:
> 
> On Jun 15, 2007, at 5:56 AM, Sean Davis wrote:
> 
>> Sendu Bala wrote:
>>> If it's going to be difficult and a hassle, for such an unnecessary thing
>>> I'm not sure it's worth it. There are more pressing things to be done for
>>> Bioperl.
>>>
>>> If I can just run perltidy on the entire package and commit, I'd do it.
>>> If that's not appropriate, I won't.
>>
>> I agree with the sentiment noted above.  I'm a bit of an outsider here,
>> but bioperl is a collaborative project.  Not everyone has the same
>> sentiments about what "correct" style means.  As a programmer, I really
>> wouldn't want significant changes on the style of my code.  And perl
>> happily puts up with many styles.  I would say leave things as they
>> are--let the individual programmers choose.  It reduces the amount of
>> work of questionable importance and allows the coding style freedom that
>> perl supports.
>>
>> Just my $.02.
>>
>> Sean
> 
> I tend to run it on modules that need some reformatting (SearchIO::blast
> comes to mind).  I believe you're correct when this comes down to
> programming style, but I think this echoes a sentiment (frustration,
> perhaps) that some of us have with long-term maintenance of said code.
> 
> Maybe a compromise:  include a copy of .perltidyrc with the distribution
> that goes by what a consensus wants or by the general rules laid out in
> Perl Best Practices (spaced settings, use of spaces over tabs, etc). 
> Conversion would be encouraged but voluntary, with the caveat that if
> someone needs to clean up code down the road (bug fixes, enhancements,
> etc) and if the original author isn't able to add it in themselves, it
> could be perltidy'd in order to help the developer (locate and fix the
> issue)|(add relevant enhancement where needed).

Don't get me wrong--I think whatever makes bioperl a better, more
maintainable beast should be what is done.  The bioperl gurus should
absolutely do what is best for them for code maintainability.

Sean


From n.haigh at sheffield.ac.uk  Fri Jun 15 11:17:15 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 15 Jun 2007 16:17:15 +0100
Subject: [Bioperl-l] Perltidy and... SVN and ...Re:  Perltidy
In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>	<467264C8.4020202@sendu.me.uk>
	<46727048.3080904@mail.nih.gov>
	<78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
Message-ID: <4672AD7B.4050109@sheffield.ac.uk>

Chris Fields wrote:
> On Jun 15, 2007, at 5:56 AM, Sean Davis wrote:
> 
>> Sendu Bala wrote:
>>> If it's going to be difficult and a hassle, for such an unnecessary
>>> thing I'm not sure it's worth it. There are more pressing things to be
>>> done for Bioperl.
>>>
>>> If I can just run perltidy on the entire package and commit, I'd
>>> do it. If that's not appropriate, I won't.
>> I agree with the sentiment noted above.  I'm a bit of an outsider  
>> here,
>> but bioperl is a collaborative project.  Not everyone has the same
>> sentiments about what "correct" style means.  As a programmer, I  
>> really
>> wouldn't want significant changes on the style of my code.  And perl
>> happily puts up with many styles.  I would say leave things as they
>> are--let the individual programmers choose.  It reduces the amount of
>> work of questionable importance and allows the coding style freedom  
>> that
>> perl supports.
>>
>> Just my $.02.
>>
>> Sean
> 
> I tend to run it on modules that need some reformatting  
> (SearchIO::blast comes to mind).  I believe you're correct when this  
> comes down to programming style, but I think this echoes a sentiment  
> (frustration, perhaps) that some of us have with long-term  
> maintenance of said code.
> 
> Maybe a compromise:  include a copy of .perltidyrc with the  
> distribution that goes by what a consensus wants or by the general  
> rules laid out in Perl Best Practices (spaced settings, use of spaces  
> over tabs, etc).  

RE spaces, tabs etc - how well are the different coding styles handled
when displayed as HTML and via the online browsable cvs?

> Conversion would be encouraged but voluntary, with
> the caveat that if someone needs to clean up code down the road (bug  
> fixes, enhancements, etc) and if the original author isn't able to  
> add it in themselves, it could be perltidy'd in order to help the  
> developer (locate and fix the issue)|(add relevant enhancement where  
> needed).
> 
> chris
> 


From johnsonm at gmail.com  Fri Jun 15 15:37:26 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Fri, 15 Jun 2007 14:37:26 -0500
Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap
	start and stop coordinates??
In-Reply-To: <E22A8442-E00D-4732-9D80-EE61C75732B7@uiuc.edu>
References: <CED81D34E37D5043A1211565277A51E507E23161@exchkc02.stowers-institute.org>
	<79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu>
	<ebf5eb170705161211m6fb570b5r86ee055299993172@mail.gmail.com>
	<B012903E-7C0F-4E34-9BFE-E551855B6C62@uiuc.edu>
	<ebf5eb170705211348w57c37f18oeb128656c446cff@mail.gmail.com>
	<62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu>
	<ebf5eb170705211421w244933fcu4db8ba748653c090@mail.gmail.com>
	<9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu>
	<a79f6a4b0705211729j3ff17d60v610fab7f5e135303@mail.gmail.com>
	<E22A8442-E00D-4732-9D80-EE61C75732B7@uiuc.edu>
Message-ID: <ebf5eb170706151237x1eeda0e6y728384715cb6a21a@mail.gmail.com>

Patches waiting in Bugzilla (Bug #2299).  Changes:

-Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for
prokaryotic reports (Glimmer2/Glimmer3)
-Bio::Tools::Glimmer now produces features with Fuzzy or Split
locations as appropriate (partial or circular/wraparound predictions)
-Bio::Tools::Glimmer goes through the Glimmer3 .detail file to pull out
sequence lengths
-Bio::Tools::Run::Glimmer passes along the sequence length to
Bio::Tools::Glimmer for Glimmer2

I should probably modify Bio::Tools::Genemark to use
Bio::SeqFeature::Generic features for prokaryotic reports, to be
consistent, but this is more likely to surprise people.  If nobody
screams about the change to Bio::Tools::Glimmer, I'll do it at some
point.
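
For anyone wondering why the sequence length is needed at all, here is
a toy sketch of the wraparound case. It is not the code from the
patch; the sub name and the numbers are made up, coordinates are
1-based and inclusive, and only forward-strand predictions are
considered (on the reverse strand start > end would not by itself
imply a wraparound).

```perl
use strict;
use warnings;

# Split a forward-strand prediction on a circular sequence into linear
# pieces. A start greater than the end is taken to mean the prediction
# runs through the origin, so the sequence length closes the first piece.
sub wraparound_pieces {
    my ($start, $end, $seq_length) = @_;
    return ([$start, $end]) if $start <= $end;
    return ([$start, $seq_length], [1, $end]);
}

# A made-up prediction running from 4900 through the origin to 150
# on a 5000 bp circular sequence.
my @pieces = wraparound_pieces(4900, 150, 5000);
print join(',', map { "$_->[0]..$_->[1]" } @pieces), "\n";
```

Each piece would then become one sub-location of a split location;
without the length there is no way to know where the first piece ends.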

On 5/21/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On May 21, 2007, at 7:29 PM, Torsten Seemann wrote:
>
> >> glimmer2/3 both assume the genome is circular by default (I'm
> >> assuming since Glimmer2/3 are used for bacterial genomes).  Acc. to
> >> the Glimmer3 release notes the detail file has the information in the
> >> header; from the Glimmer3 data used for tests:
> >
> > You beat me to the reply Chris - yes, Glimmer2/3 assume circular
> > chromosome by default. I had forgotten about this in earlier
> > discussions of the new Glimmer parsers as I normally run it in
> > --linear / -L mode (even if I know it is circular) because it is
> > easier to handle, and our sequencer/assembler team usually gets the
> > origin of replication right.
> >
> >> Command:  /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../BCTDNA
> >> Glimmer3.icm Glimmer3
> >
> > I did a double-take here - that's the path to my Glimmer3
> > installation! It took me a couple of minutes to realise that you got
> > it from the bioperl test data I created. D'oh! :-)
>
> Yep, I forgot about that!
>
> >> There are options available for glimmer3 (-L, -X) that specify a
> >> linear sequence or allow ORFs to extend past the end of the sequence
> >> analyzed (the latter assumes a linear sequence).
> >
> > If the -L mode should produce Bio::Location::Split objects, I guess if
> > -X is used
> > it should produce Bio::Location::Fuzzy objects too...
> >
> > --Torsten
>
> True, didn't think about that one.  Def. something to consider adding
> in.
>
> chris
>
>
>


From cjfields at uiuc.edu  Fri Jun 15 16:55:06 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 15:55:06 -0500
Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap
	start and stop coordinates??
In-Reply-To: <ebf5eb170706151237x1eeda0e6y728384715cb6a21a@mail.gmail.com>
References: <CED81D34E37D5043A1211565277A51E507E23161@exchkc02.stowers-institute.org>
	<79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu>
	<ebf5eb170705161211m6fb570b5r86ee055299993172@mail.gmail.com>
	<B012903E-7C0F-4E34-9BFE-E551855B6C62@uiuc.edu>
	<ebf5eb170705211348w57c37f18oeb128656c446cff@mail.gmail.com>
	<62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu>
	<ebf5eb170705211421w244933fcu4db8ba748653c090@mail.gmail.com>
	<9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu>
	<a79f6a4b0705211729j3ff17d60v610fab7f5e135303@mail.gmail.com>
	<E22A8442-E00D-4732-9D80-EE61C75732B7@uiuc.edu>
	<ebf5eb170706151237x1eeda0e6y728384715cb6a21a@mail.gmail.com>
Message-ID: <D09AF2F1-1459-4B6B-A3ED-85CEDE34D7B6@uiuc.edu>

I'll try to get to that tonight.  Been pretty tied up lately...

chris

On Jun 15, 2007, at 2:37 PM, Mark Johnson wrote:

> Patches waiting in Bugzilla (Bug #2299).  Changes:
>
> -Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for
> prokaryotic reports (Glimmer2/Glimmer3)
> -Bio::Tools::Glimmer now produces features with Fuzzy or Split
> locations as appropriate (partial or circular/wraparound predictions)
> -Bio::Tools::Glimmer goes through the Glimmer3 .detail file to pull out
> sequence lengths
> -Bio::Tools::Run::Glimmer passes along the sequence length to
> Bio::Tools::Glimmer for Glimmer2
>
> I should probably modify Bio::Tools::Genemark to use
> Bio::SeqFeature::Generic features for prokaryotic reports, to be
> consistent, but this is more likely to surprise people.  If nobody
> screams about the change to Bio::Tools::Glimmer, I'll do it at some
> point.
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From rvos at interchange.ubc.ca  Fri Jun 15 17:08:17 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Fri, 15 Jun 2007 14:08:17 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
Message-ID: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>

Hi,

I would very much prefer it if bioperl moved to svn. I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and moving it to an OBF repository, but I'd rather not go back to CVS.

Rutger


-----Original Message-----

> Date: Fri Jun 15 07:56:23 PDT 2007
> From: "Chris Fields" <cjfields at uiuc.edu>
> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
> To: "Sendu Bala" <bix at sendu.me.uk>
>
> 
> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
> 
> >>>> ...
> >>> Can we do any sort of massive conversion at some logical timepoint?
> >>> Probably after a branch release or something?  Because it basically
> >>> means we're going to have differences on nearly every line which is
> >>> going to make diff-ing difficult when debugging old/new versions.
> >>> Maybe it is not a problem because we aren't introducing any new
> >>> bugs!
> >
> > Sorry, can you clarify the problem you envisage? And why would  
> > making a branch release help?
> 
> Maybe the worry is that mass conversion in such a large codebase  
> could lead to hard-to-locate bugs.  Shouldn't occur but who knows w/o  
> trying?
> 
> >> I agree; if we intend on doing this it should be all at once,
> >> maybe on a dedicated branch, to ensure that code changes don't
> >> tank tests (they shouldn't, but one never knows).  We would then
> >> need a script up and running that tidies everything up prior to
> >> commits (though what happens if perltidy tanks?...).
> >> Sendu, up for it?
> >
> > If it's going to be difficult and a hassle, for such an unnecessary
> > thing I'm not sure it's worth it. There are more pressing things to
> > be done for Bioperl.
> >
> > If I can just run perltidy on the entire package and commit, I'd do  
> > it. If that's not appropriate, I won't.
> 
> The choices aren't necessarily all or nothing.  What about voluntary,  
> recommended use of a perltidy config file included with the  
> distribution, with additional 'caveats'?  See my response to Sean.
> 
> >>>> About svn
> > [snip]
> >> Stepped into that one, didn't I!  I'll look into how much effort  
> >> is  involved and try getting something going in the next month or  
> >> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as  
> >> well but it  might be worth looking into.
> >
> > I'd put this in the unnecessary-but-nice category as well. If it  
> > will be as easy as my ->new change, go ahead. If not, there are  
> > more pressing matters (POD fixing, test script updating and  
> > finishing...).
> 
> A few other open-bio projects have actively discussed a CVS->SVN
> migration (BioRuby and, I think, BioPython, though I could be wrong
> about the latter).  As I said, "it might be worth looking into" to
> weigh the pros/cons, get opinions from others who have made the
> transition, etc.  We could, as Jason suggested, even set up a tester
> SVN w/o making it the default codebase (lock it off to a few testers,
> have CVS commits automatically/manually carry over to SVN, etc).
> 
> I agree with you that it's not feasible to switch over prior to a  
> release and that there are more pressing issues, but it doesn't hurt  
> having an open discussion about it.
> 
> chris
> 


From spiros at lokku.com  Fri Jun 15 17:40:32 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Fri, 15 Jun 2007 22:40:32 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>

On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
> Hi,
>
> I would very much prefer it if bioperl moved to svn. I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and moving it to an OBF repository, but I'd rather not go back to CVS.
>
> Rutger
>

I second that, SVN seems like the reasonable choice. I would be more
than happy to help out as well.

Spiros



From hlapp at gmx.net  Fri Jun 15 18:10:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 18:10:25 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
Message-ID: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>

So should we set up a sandbox svn repository and those who would like  
to help out

- take shots at migrating bioperl (any current cvs snapshot will do)  
to svn

- you document what you find yourself having to do in trying to make  
it work

- you report back when you think you have a working repository

- we all get a defined amount of time to test to our hearts' content,  
say 2 weeks

- you fix issues that were encountered

- report back when done, followed by retesting for, say 1 week

- iterate previous 2 steps until no issues and no objections to  
migration

- two more weeks of warning period to all developers to commit all  
outstanding changes, or reapply them to a future svn checkout

- pull the trigger by locking down cvs, applying the migration as  
worked out before, and announcing that BioPerl is now on svn

- get free beer at next BOSC (I'll pay if no one else does)

This may not be precisely the plan that needs to be executed, but  
it's probably somewhere along those lines.

If there are volunteers who would like to spearhead this, then power  
to you - I think everyone is in favor and the advantages of svn don't  
need to be debated. The only reason it hasn't happened yet is because  
no one has stepped forward who would have the energy.

I'm sure ChrisD will gladly create the svn sandbox if we have  
volunteers lined up to get going.

	-hilmar

On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:

> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>> Hi,
>>
>> I would very much prefer it if bioperl moved to svn. I'm
>> considering merging Bio::Phylo (to the extent that that's
>> possible/practical) with bioperl and moving it to an OBF repository,
>> but I'd rather not go back to CVS.
>>
>> Rutger
>>
>
> I second that, SVN seems like the reasonable choice. I would be more
> than happy to help out as well.
>
> Spiros

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jason at bioperl.org  Fri Jun 15 18:23:15 2007
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 15 Jun 2007 15:23:15 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <AB7E0918-0EBA-47C9-8A64-FB8709230F2A@bioperl.org>

Sounds like a plan. I'll be curious to see if we can still keep
anonymous CVS working, as I'd like to not have to pull the plug on
that.  There are some threads out on the web about how to do this
with a commit rule on SVN.

Also, can someone who is close enough to all the SVN benefits please  
elaborate how it is going to help _this_ project?
Perhaps you would be willing to put a few words up -- like on (a to  
be created):
http://bioperl.org/wiki/BioPerl:Version_control_changeover

This way, if anonymous CVS is broken and/or developers who haven't
been paying attention come back to commit code and ask why things
changed, we don't have to compose long emails... =)

-jason
On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote:

> So should we set up a sandbox svn repository and those who would like
> to help out
>
> - take shots at migrating bioperl (any current cvs snapshot will do)
> to svn
>
> - you document what you find yourself having to do in trying to make
> it work
>
> - you report back when you think you have a working repository
>
> - we all get a defined amount of time to test to our hearts' content,
> say 2 weeks
>
> - you fix issues that were encountered
>
> - report back when done, followed by retesting for, say 1 week
>
> - iterate previous 2 steps until no issues and no objections to
> migration
>
> - two more weeks of warning period to all developers to commit all
> outstanding changes, or reapply them to a future svn checkout
>
> - pull the trigger by locking down cvs, applying the migration as
> worked out before, and announcing that BioPerl is now on svn
>
> - get free beer at next BOSC (I'll pay if no one else does)
>
> This may not be precisely the plan that needs to be executed, but
> it's probably somewhere along those lines.
>
> If there are volunteers who would like to spearhead this, then power
> to you - I think everyone is in favor and the advantages of svn don't
> need to be debated. The only reason it hasn't happened yet is because
> no one has stepped forward who would have the energy.
>
> I'm sure ChrisD will gladly create the svn sandbox if we have
> volunteers lined up to get going.
>
> 	-hilmar

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From sheris at eps.berkeley.edu  Fri Jun 15 18:58:12 2007
From: sheris at eps.berkeley.edu (Sheri Simmons)
Date: Fri, 15 Jun 2007 15:58:12 -0700
Subject: [Bioperl-l] seq doesn't validate error
Message-ID: <200706151558.12911.sheris@eps.berkeley.edu>

Hi,
I'm getting an error as follows when I try to reverse complement a sequence 
string stored in a hash of arrays. The storage code is: 

		$nstarthash{$key} = [$sortchecks[0], join("", @nseq),
		                     join("", @{$seqhash{$key}})];

the sequence of interest is the element at index 1. 

Later, I try to retrieve this string for a subset of keys so I can reverse 
complement it based on input from another hash (%complement):

			my %revcomphash = map { my $read = $_;
			grep $complement{$read} eq 'C', %complement;
			{$_, (Bio::Seq->new(-seq =>$nstarthash{$_}[1]))->revcom->seq()};}
			 keys(%nstarthash); 


I get the following warning (long sequence edited for clarity):

-- -------------------- WARNING ---------------------
MSG: seq doesn't validate, mismatch is 1
---------------------------------------------------

------------- EXCEPTION  -------------
MSG: Attempting to set the sequence to [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] 
which does not look healthy
STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498
STACK toplevel ../quality_wrapper.pl:103

I cannot find any non-allowed characters in the sequence, and the 
de-referencing appears to work correctly. Can anyone help me?
I'm using the latest Bioperl installation (1.5.2) with ActivePerl5.8 on a 
Mepis 6.5 system. 

Thanks
Sheri

---------------------------------------------------------------------
Sheri Simmons
Department of Earth and Planetary Sciences
University of California, Berkeley
Berkeley, CA 94720-4767


From Kevin.M.Brown at asu.edu  Fri Jun 15 19:11:34 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 15 Jun 2007 16:11:34 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <200706151558.12911.sheris@eps.berkeley.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
Message-ID: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>

> I'm getting an error as follows when I try to reverse 
> complement a sequence string stored in a hash of arrays. The 
> storage code is: 
> 
> 		$nstarthash{$key} = [$sortchecks[0], join("", @nseq),
> 		                     join("", @{$seqhash{$key}})];
> 
> the sequence of interest is the element at index 1. 
> 
> Later, I try to retrieve this string for a subset of keys so 
> I can reverse complement it based on input from another hash 
> (%complement):
> 
> 			my %revcomphash = map { my $read = $_;
> 			grep $complement{$read} eq 'C', %complement;
> 			{$_, (Bio::Seq->new(-seq =>$nstarthash{$_}[1]))->revcom->seq()};}
> 			 keys(%nstarthash); 
> 
> 
> I get the following warning (long sequence edited for clarity):
> 
> -- -------------------- WARNING ---------------------
> MSG: seq doesn't validate, mismatch is 1
> ---------------------------------------------------
> 
> ------------- EXCEPTION  -------------
> MSG: Attempting to set the sequence to 
> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
> which does not look healthy
> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498
> STACK toplevel ../quality_wrapper.pl:103
> 
> I cannot find any non-allowed characters in the sequence, and 
> the de-referencing appears to work correctly. Can anyone help me?
> I'm using the latest Bioperl installation (1.5.2) with 
> ActivePerl5.8 on a Mepis 6.5 system. 

Try telling the Bio::Seq object what alphabet to use when creating it.
I tend to create them like:

Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')
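
If the alphabet hint does not help, it might be worth checking the
string for characters that are hard to see when printed, such as a
trailing newline or embedded whitespace. The sketch below is plain Perl
rather than BioPerl. The helper name unexpected_chars, the set of
accepted characters, and the example string are all assumptions for
illustration and are not taken from Bio::PrimarySeq:

```perl
use strict;
use warnings;

# Hypothetical helper: report every character that is not a letter,
# '.', '*' or '-', with its 0-based position and hexadecimal code.
# The accepted set is an assumption, not BioPerl's own pattern.
sub unexpected_chars {
    my ($seq) = @_;
    my @hits;
    while ($seq =~ /([^A-Za-z.*-])/g) {
        # pos() points just past the match, so subtract one.
        push @hits, sprintf('pos %d: 0x%02X', pos($seq) - 1, ord($1));
    }
    return @hits;
}

# Made-up example with a trailing newline that is invisible when printed.
my $raw = "GCCCCTGTAATC\n";
print "$_\n" for unexpected_chars($raw);

# One possible workaround: strip whitespace before building the object.
(my $clean = $raw) =~ s/\s+//g;
print "clean: $clean\n";
```

Running the helper on $nstarthash{$key}[1] just before the
Bio::Seq->new call should show whether anything unexpected is in there.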


From sheris at eps.berkeley.edu  Fri Jun 15 19:53:04 2007
From: sheris at eps.berkeley.edu (Sheri Simmons)
Date: Fri, 15 Jun 2007 16:53:04 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
	<1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
Message-ID: <200706151653.04135.sheris@eps.berkeley.edu>

Thanks for the suggestion, but that still gives the same error as before.

On Friday 15 June 2007 4:11 pm, Kevin Brown wrote:
> > I'm getting an error as follows when I try to reverse
> > complement a sequence string stored in a hash of arrays. The
> > storage code is:
> >
> > 		$nstarthash{$key} = [$sortchecks[0], join("",
> > @nseq),
> > join("",@{$seqhash{$key}})];
> >
> > the sequence of interest is the element at index 1.
> >
> > Later, I try to retrieve this string for a subset of keys so
> > I can reverse complement it based on input from another hash
> > (%complement):
> >
> > 			my %revcomphash = map { my $read = $_;
> > 			grep $complement{$read} eq 'C', %complement;
> > 			{$_, (Bio::Seq->new(-seq
> > =>$nstarthash{$_}[1]))->revcom->seq()};}
> > 			 keys(%nstarthash);
> >
> >
> > I get the following warning (long sequence edited for clarity):
> >
> > -- -------------------- WARNING ---------------------
> > MSG: seq doesn't validate, mismatch is 1
> > ---------------------------------------------------
> >
> > ------------- EXCEPTION  -------------
> > MSG: Attempting to set the sequence to
> > [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
> > which does not look healthy
> > STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
> > STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
> > STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498
> > STACK toplevel ../quality_wrapper.pl:103
> >
> > I cannot find any non-allowed characters in the sequence, and
> > the de-referencing appears to work correctly. Can anyone help me?
> > I'm using the latest Bioperl installation (1.5.2) with
> > ActivePerl5.8 on a Mepis 6.5 system.
>
> Try telling the Bio::Seq object what alphabet to use when creating it.
> I tend to create them like:
>
> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')
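
For reference, a minimal sketch of how the map/grep step quoted above
could be written so that only the reads flagged 'C' are reverse
complemented. As far as I can tell, the grep in the quoted version is
evaluated and its result thrown away, so every key gets processed.
This is plain Perl without Bio::Seq, so it says nothing about the
validation error itself. The data, the 'F' flag and the revcom() helper
are made up for illustration, and revcom() only handles plain ACGT, not
ambiguity codes.

```perl
use strict;
use warnings;

# Made-up data in the same shape as in the post:
# each value is [start, sequence, other sequence].
my %nstarthash = (
    read1 => [ 5, 'ATGC', 'TTTT' ],
    read2 => [ 9, 'GGGA', 'CCCC' ],
);
# Made-up flags; only 'C' is taken from the original post.
my %complement = ( read1 => 'C', read2 => 'F' );

# Hypothetical helper: reverse complement of a plain ACGT string.
sub revcom {
    my ($seq) = @_;
    $seq =~ s/\s+//g;                 # drop stray whitespace first
    my $rc = scalar reverse $seq;     # reverse the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;     # complement each base
    return $rc;
}

# Select the keys first, then build the key => sequence pairs.
my %revcomphash =
    map  { $_ => revcom($nstarthash{$_}[1]) }
    grep { defined $complement{$_} && $complement{$_} eq 'C' }
    keys %nstarthash;

print "$_\t$revcomphash{$_}\n" for sort keys %revcomphash;
```

Doing the grep on the key list before the map keeps the filter and the
transformation separate, which makes each step easier to test.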

-- 
Sheri Simmons
Department of Earth and Planetary Sciences
University of California, Berkeley
Berkeley, CA 94720-4767


From hlapp at gmx.net  Fri Jun 15 21:27:42 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 21:27:42 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
Message-ID: <EDC569BF-2E4B-4BFC-916A-665CC2FFABAF@gmx.net>

Could you post a ticket to the helpdesk (support at open-bio.org)?

	-hilmar

On Jun 15, 2007, at 9:08 PM, George Hartzell wrote:

> Hilmar Lapp writes:
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>
> Free Beer, huh?  Do you deliver?
>
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
>
> thanks!
>
> g.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Fri Jun 15 21:08:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Fri, 15 Jun 2007 21:08:32 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <18035.14352.963113.473274@almost.alerce.com>

Hilmar Lapp writes:
 > So should we set up a sandbox svn repository and those who would like  
 > to help out
 > 
 > - take shots at migrating bioperl (any current cvs snapshot will do)  
 > to svn

Free Beer, huh?  Do you deliver?

Can you package up a tarball of the cvs repository (bzip or gzip would
save some time) itself?

thanks!

g.


From cjfields at uiuc.edu  Fri Jun 15 21:42:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:42:05 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
Message-ID: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>

The browsable CVS has a 'Download tarball' link if that helps.

http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?cvsroot=bioperl

chris

On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:

> Hilmar Lapp writes:
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>
> Free Beer, huh?  Do you deliver?
>
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
>
> thanks!
>
> g.


From cjfields at uiuc.edu  Fri Jun 15 21:50:09 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:50:09 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>

I'll help out to the extent I can w/o having the SVN know-how.  We  
need (as Jason points out) someone who can detail the benefits and  
maybe keep an updated journal on the wiki.

I believe at least one or two of the other Bio* projects have
contemplated moving over to SVN, which may be worth checking out.

chris

On Jun 15, 2007, at 5:10 PM, Hilmar Lapp wrote:

> So should we set up a sandbox svn repository and those who would like
> to help out
>
> - take shots at migrating bioperl (any current cvs snapshot will do)
> to svn
>
> - you document what you find yourself having to do in trying to make
> it work
>
> - you report back when you think you have a working repository
>
> - we all get a defined amount of time to test to our hearts' content,
> say 2 weeks
>
> - you fix issues that were encountered
>
> - report back when done, followed by retesting for, say 1 week
>
> - iterate previous 2 steps until no issues and no objections to
> migration
>
> - two more weeks of warning period to all developers to commit all
> outstanding changes, or reapply them to a future svn checkout
>
> - pull the trigger by locking down cvs, applying the migration as
> worked out before, and announcing that BioPerl is now on svn
>
> - get free beer at next BOSC (I'll pay if no one else does)
>
> This may not be precisely the plan that needs to be executed, but
> it's probably somewhere along those lines.
>
> If there are volunteers who would like to spearhead this, then power
> to you - I think everyone is in favor and the advantages of svn don't
> need to be debated. The only reason it hasn't happened yet is because
> no one has stepped forward who would have the energy.
>
> I'm sure ChrisD will gladly create the svn sandbox if we have
> volunteers lined up to get going.
>
> 	-hilmar
>
> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
>
>> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>>> Hi,
>>>
>>> I would very much prefer it if bioperl moved to svn. I'm
>>> considering merging Bio::Phylo (to the extent that that's possible/
>>> practical) with bioperl and move it to an OBF repository, but I'd
>>> rather not go back to CVS.
>>>
>>> Rutger
>>>
>>
>> I second that, SVN seems like the reasonable choice. I would be more
>> than happy to help out as well.
>>
>> Spiros
>>
>>>
>>> -----Original Message-----
>>>
>>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>>
>>>>
>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>>
>>>>>>>> ...
>>>>>>> Can we do any sort of massive conversion at some logical
>>>>>>> timepoint.
>>>>>>> Probably after a branch release or something?  Because it
>>>>>>> basically
>>>>>>> means we're going to have differences on nearly every line
>>>>>>> which is
>>>>>>> going to make diff-ing difficult when debugging old/new  
>>>>>>> versions.
>>>>>>> Maybe it is not a problem because we aren't introducing any new
>>>>>>> bugs!
>>>>>
>>>>> Sorry, can you clarify the problem you envisage? And why would
>>>>> making a branch release help?
>>>>
>>>> Maybe the worry is that mass conversion in such a large codebase
>>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows
>>>> w/o
>>>> trying?
>>>>
>>>>>> I agree; if we intend on doing this it should be all at once,
>>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>>> need a script up- and-running that tidies everything up prior to
>>>>>> commits (though what  happens if perltidy tanks?...).
>>>>>> Sendu, up for it?
>>>>>
>>>>> If its going to be difficult and a hassle, for such an unnecessary
>>>>> thing I'm not sure its worth it. There are more pressing things to
>>>>> be done for Bioperl.
>>>>>
>>>>> If I can just run perltidy on the entire package and commit,  
>>>>> I'd do
>>>>> it. If that's not appropriate, I won't.
>>>>
>>>> The choices aren't necessarily all or nothing.  What about
>>>> voluntary,
>>>> recommended use of a perltidy config file included with the
>>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>>
>>>>>>>> About svn
>>>>> [snip]
>>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>>> is  involved and try getting something going in the next month or
>>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>>> well but it  might be worth looking into.
>>>>>
>>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>>> more pressing matters (POD fixing, test script updating and
>>>>> finishing...).
>>>>
>>>> A few other open-bio projects have actively discussed a CVS->SVN
>>>> migration (BioRuby and I think BioPython, though the latter  
>>>> could be
>>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>>> pros/cons, get others opinions from others who have made the
>>>> transition, etc.  We could, as Jason suggested, even set up a  
>>>> tester
>>>> SVN w/o making it the default codebase (lock it off to a few
>>>> testers,
>>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>>
>>>> I agree with you that it's not feasible to switch over prior to a
>>>> release and that there are more pressing issues, but it doesn't  
>>>> hurt
>>>> having an open discussion about it.
>>>>
>>>> chris
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Fri Jun 15 22:12:55 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 22:12:55 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
Message-ID: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>

I think he meant the cvs repository itself, containing all the change  
data. -hilmar

On Jun 15, 2007, at 9:42 PM, Chris Fields wrote:

> The browsable CVS has a 'Download tarball' link if that helps.
>
> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?cvsroot=bioperl
>
> chris
>
> On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:
>
>> Hilmar Lapp writes:
>>> So should we set up a sandbox svn repository and those who would  
>>> like
>>> to help out
>>>
>>> - take shots at migrating bioperl (any current cvs snapshot will do)
>>> to svn
>>
>> Free Beer, huh?  Do you deliver?
>>
>> Can you package up a tarball of the cvs repository (bzip or gzip  
>> would
>> save some time) itself?
>>
>> thanks!
>>
>> g.
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Jun 15 22:37:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 21:37:55 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
	<6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
Message-ID: <F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>

Ah, got it.  Sorry.

George, planning on taking this up?

chris

On Jun 15, 2007, at 9:12 PM, Hilmar Lapp wrote:

> I think he meant the cvs repository itself, containing all the  
> change data. -hilmar
>
> On Jun 15, 2007, at 9:42 PM, Chris Fields wrote:
>
>> The browsable CVS has a 'Download tarball' link if that helps.
>>
>> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?cvsroot=bioperl
>>
>> chris
>>
>> On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:
>>
>>> Hilmar Lapp writes:
>>>> So should we set up a sandbox svn repository and those who would  
>>>> like
>>>> to help out
>>>>
>>>> - take shots at migrating bioperl (any current cvs snapshot will  
>>>> do)
>>>> to svn
>>>
>>> Free Beer, huh?  Do you deliver?
>>>
>>> Can you package up a tarball of the cvs repository (bzip or gzip  
>>> would
>>> save some time) itself?
>>>
>>> thanks!
>>>
>>> g.
>>
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Sat Jun 16 04:20:57 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 16 Jun 2007 09:20:57 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
Message-ID: <46739D69.4090204@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George Hartzell wrote:
> Hilmar Lapp writes:
>  > So should we set up a sandbox svn repository and those who would like  
>  > to help out
>  > 
>  > - take shots at migrating bioperl (any current cvs snapshot will do)  
>  > to svn
> 
> Free Beer, huh?  Do you deliver?
> 
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
> 
> thanks!
> 
> g.

Sounds like George might know what he's doing! I have a question about
setting up svn access. I believe access can be done in several ways,
over webdav, over ssh and probably others too. Do you have any knowledge
about the benefits of one over the other? I suppose I'm thinking of what
to implement to allow anonymous read access for users and authenticated
access for developers.

Nath

p.s. if you need any monkeys to do some work I'm happy to help out as
much as possible.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGc51pczuW2jkwy2gRAmi9AJ0XojVdh4ckXoc3bwVSmeNw95cR7QCfV+G9
Lb9NUEe4dkCakQ+Gc7Py98A=
=BG9m
-----END PGP SIGNATURE-----


From rvos at interchange.ubc.ca  Sat Jun 16 06:37:11 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Sat, 16 Jun 2007 03:37:11 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <15232024.1181990231860.JavaMail.myubc2@handel.my.ubc.ca>

I can volunteer some time to help out with this.

Rutger

-----Original Message-----

> Date: Fri Jun 15 15:10:25 PDT 2007
> From: "Hilmar Lapp" <hlapp at gmx.net>
> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
> To: spiros at lokku.com
>
> So should we set up a sandbox svn repository and those who would like  
> to help out
> 
> - take shots at migrating bioperl (any current cvs snapshot will do)  
> to svn
> 
> - you document what you find yourself having to do in trying to make  
> it work
> 
> - you report back when you think you have a working repository
> 
> - we all get a defined amount of time to test to our hearts' content,  
> say 2 weeks
> 
> - you fix issues that were encountered
> 
> - report back when done, followed by retesting for, say 1 week
> 
> - iterate previous 2 steps until no issues and no objections to  
> migration
> 
> - two more weeks of warning period to all developers to commit all  
> outstanding changes, or reapply them to a future svn checkout
> 
> - pull the trigger by locking down cvs, applying the migration as  
> worked out before, and announcing that BioPerl is now on svn
> 
> - get free beer at next BOSC (I'll pay if no one else does)
> 
> This may not be precisely the plan that needs to be executed, but  
> it's probably somewhere along those lines.
> 
> If there are volunteers who would like to spearhead this, then power  
> to you - I think everyone is in favor and the advantages of svn don't  
> need to be debated. The only reason it hasn't happened yet is because  
> no one has stepped forward who would have the energy.
> 
> I'm sure ChrisD will gladly create the svn sandbox if we have  
> volunteers lined up to get going.
> 
> 	-hilmar
> 
> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
> 
> > On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
> >> Hi,
> >>
> >> I would very much prefer it if bioperl moved to svn. I'm  
> >> considering merging Bio::Phylo (to the extent that that's possible/ 
> >> practical) with bioperl and move it to an OBF repository, but I'd  
> >> rather not go back to CVS.
> >>
> >> Rutger
> >>
> >
> > I second that, SVN seems like the reasonable choice. I would be more
> > than happy to help out as well.
> >
> > Spiros
> >
> >>
> >> -----Original Message-----
> >>
> >>> Date: Fri Jun 15 07:56:23 PDT 2007
> >>> From: "Chris Fields" <cjfields at uiuc.edu>
> >>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
> >>> To: "Sendu Bala" <bix at sendu.me.uk>
> >>>
> >>>
> >>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
> >>>
> >>>>>>> ...
> >>>>>> Can we do any sort of massive conversion at some logical  
> >>>>>> timepoint.
> >>>>>> Probably after a branch release or something?  Because it  
> >>>>>> basically
> >>>>>> means we're going to have differences on nearly every line  
> >>>>>> which is
> >>>>>> going to make diff-ing difficult when debugging old/new versions.
> >>>>>> Maybe it is not a problem because we aren't introducing any new
> >>>>>> bugs!
> >>>>
> >>>> Sorry, can you clarify the problem you envisage? And why would
> >>>> making a branch release help?
> >>>
> >>> Maybe the worry is that mass conversion in such a large codebase
> >>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows  
> >>> w/o
> >>> trying?
> >>>
> >>>>> I agree; if we intend on doing this it should be all at once,
> >>>>> maybe  on a branch dedicated to ensure that code changes don't
> >>>>> tank tests  (they shouldn't but one never knows).  We would then
> >>>>> need a script up- and-running that tidies everything up prior to
> >>>>> commits (though what  happens if perltidy tanks?...).
> >>>>> Sendu, up for it?
> >>>>
> >>>> If its going to be difficult and a hassle, for such an unnecessary
> >>>> thing I'm not sure its worth it. There are more pressing things to
> >>>> be done for Bioperl.
> >>>>
> >>>> If I can just run perltidy on the entire package and commit, I'd do
> >>>> it. If that's not appropriate, I won't.
> >>>
> >>> The choices aren't necessarily all or nothing.  What about  
> >>> voluntary,
> >>> recommended use of a perltidy config file included with the
> >>> distribution, with additional 'caveats'?  See my response to Sean.
> >>>
> >>>>>>> About svn
> >>>> [snip]
> >>>>> Stepped into that one, didn't I!  I'll look into how much effort
> >>>>> is  involved and try getting something going in the next month or
> >>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
> >>>>> well but it  might be worth looking into.
> >>>>
> >>>> I'd put this in the unnecessary-but-nice category as well. If it
> >>>> will be as easy as my ->new change, go ahead. If not, there are
> >>>> more pressing matters (POD fixing, test script updating and
> >>>> finishing...).
> >>>
> >>> A few other open-bio projects have actively discussed a CVS->SVN
> >>> migration (BioRuby and I think BioPython, though the latter could be
> >>> wrong).  As I said, "it might be worth looking into" to weigh the
> >>> pros/cons, get others opinions from others who have made the
> >>> transition, etc.  We could, as Jason suggested, even set up a tester
> >>> SVN w/o making it the default codebase (lock it off to a few  
> >>> testers,
> >>> have CVS commits automatically/manually carry over to SVN, etc).
> >>>
> >>> I agree with you that it's not feasible to switch over prior to a
> >>> release and that there are more pressing issues, but it doesn't hurt
> >>> having an open discussion about it.
> >>>
> >>> chris
> >>>
> 
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 


From sdavis2 at mail.nih.gov  Sat Jun 16 07:21:47 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Sat, 16 Jun 2007 07:21:47 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
Message-ID: <4673C7CB.1030709@mail.nih.gov>

Chris Fields wrote:
> I'll help out to the extent I can w/o having the SVN know-how.  We  
> need (as Jason points out) someone who can detail the benefits and  
> maybe keep an updated journal on the wiki.
>
> I believe at least one or two of the other Bio* contemplated moving  
> over to SVN, which may be worth checking out.
>   
The bioconductor project is on SVN.  The project includes over 200 
packages (the equivalent of perl modules) with something around 150-200 
ACTIVE developers.  They also have a build system for several OSes that 
operates on a cron-like system with builds of several versions 
approximately daily.  Their system is running at something like revision 
30,000, so they have significant experience.  If anyone would like 
technical support, I can certainly ask the folks maintaining their site 
if they can give some input.  Let me know if anyone would like a contact 
person.

As for access, the typical access is over http (or https).  Access
controls can be set up on the server side while allowing anonymous
access for checkout.  There are many excellent SVN clients for every
OS, so that should not be a problem.

Sean


From cjfields at uiuc.edu  Sat Jun 16 10:02:35 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 16 Jun 2007 09:02:35 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4673C7CB.1030709@mail.nih.gov>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
Message-ID: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>


On Jun 16, 2007, at 6:21 AM, Sean Davis wrote:

> Chris Fields wrote:
>> I'll help out to the extent I can w/o having the SVN know-how.  We
>> need (as Jason points out) someone who can detail the benefits and
>> maybe keep an updated journal on the wiki.
>>
>> I believe at least one or two of the other Bio* contemplated moving
>> over to SVN, which may be worth checking out.
>>
> The bioconductor project is on SVN.  The project includes over 200
> packages (the equivalent of perl modules) with something around  
> 150-200
> ACTIVE developers.  They also have a build system for several OSes  
> that
> operates on a cron-like system with builds of several versions
> approximately daily.  Their system is running at something like  
> revision
> 30,000, so they have significant experience.  If anyone would like
> technical support, I can certainly ask the folks maintaining their  
> site
> if they can give some input.  Let me know if anyone would like a  
> contact
> person.
>
> As for access, the typical access is over http (or https).  Access
> controls can be set up on the server side while allowing anonymous
> access for checkout.  There are many excellent SVN clients for every
> OS, so that should not be a problem.
>
> Sean

It looks like George Hartzell may be taking a crack at it, with  
Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
could have something testable relatively soon.  After that we'll need  
to work out a few other issues, basically what's on Hilmar's list.

chris


From hlapp at gmx.net  Sat Jun 16 10:40:08 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 16 Jun 2007 10:40:08 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <AB7E0918-0EBA-47C9-8A64-FB8709230F2A@bioperl.org>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<AB7E0918-0EBA-47C9-8A64-FB8709230F2A@bioperl.org>
Message-ID: <51E89347-4AF7-482E-98DB-BE1AA0138A91@gmx.net>

Just as an aside, even if we can't keep anonymous cvs working, I
would think that with Apache URL rewriting and a small CGI script
that returns an appropriate page redirect we can, without too much
trouble, keep the hyperlinks that people may have bookmarked working.
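
A minimal sketch of what such a redirect script could look like, in
plain Perl. It assumes that Apache hands the old viewcvs path to the
script as PATH_INFO. The new base URL is a placeholder (nothing has
been decided about where an svn browser would live), and
redirect_target() is a hypothetical helper, not existing code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder only: the real location of the svn browser is not decided.
my $NEW_BASE = 'http://example.org/svn/bioperl';

# Hypothetical helper: map the path part of an old viewcvs URL to a new URL.
sub redirect_target {
    my ($path_info) = @_;
    $path_info = '' unless defined $path_info;
    $path_info =~ s{^/+}{};                    # drop leading slashes
    # Keep only conservative path characters, so that nothing odd
    # (line breaks in particular) can end up in the Location header.
    $path_info =~ s{[^A-Za-z0-9_./+-]}{}g;
    return $path_info eq '' ? "$NEW_BASE/" : "$NEW_BASE/$path_info";
}

# Emit a permanent redirect as the CGI response.
print "Status: 301 Moved Permanently\r\n";
print 'Location: ', redirect_target($ENV{PATH_INFO}), "\r\n\r\n";
```

A permanent (301) redirect seems like the right choice here, since the
old locations would not be coming back once cvs is locked down.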

	-hilmar

On Jun 15, 2007, at 6:23 PM, Jason Stajich wrote:

> Sounds like a plan, I'll be curious to see if we can still keep
> anonymous CVS working as I'd like to not have to pull the plug on  
> that.  There are some threads out on the web about how to do this  
> with a commit rule on SVN.
>
> Also, can someone who is close enough to all the SVN benefits  
> please elaborate how it is going to help _this_ project?
> Perhaps you would be willing to put a few words up -- like on (a to  
> be created):
> http://bioperl.org/wiki/BioPerl:Version_control_changeover
>
> This way if anonymous CVS is broken and/or developers who haven't  
> been paying attention come back to commit code ask why things  
> changed we don't have to compose long emails... =)
>
> -jason
> On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote:
>
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>>
>> - you document what you find yourself having to do in trying to make
>> it work
>>
>> - you report back when you think you have a working repository
>>
>> - we all get a defined amount of time to test to our hearts' content,
>> say 2 weeks
>>
>> - you fix issues that were encountered
>>
>> - report back when done, followed by retesting for, say 1 week
>>
>> - iterate previous 2 steps until no issues and no objections to
>> migration
>>
>> - two more weeks of warning period to all developers to commit all
>> outstanding changes, or reapply them to a future svn checkout
>>
>> - pull the trigger by locking down cvs, applying the migration as
>> worked out before, and announcing that BioPerl is now on svn
>>
>> - get free beer at next BOSC (I'll pay if no one else does)
>>
>> This may not be precisely the plan that needs to be executed, but
>> it's probably somewhere along those lines.
>>
>> If there are volunteers who would like to spearhead this, then power
>> to you - I think everyone is in favor and the advantages of svn don't
>> need to be debated. The only reason it hasn't happened yet is because
>> no one has stepped forward who would have the energy.
>
>>
>> I'm sure ChrisD will gladly create the svn sandbox if we have
>> volunteers lined up to get going.
>>
>> 	-hilmar
>>
>> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
>>
>>> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>>>> Hi,
>>>>
>>>> I would very much prefer it if bioperl moved to svn. I'm
>>>> considering merging Bio::Phylo (to the extent that that's possible/
>>>> practical) with bioperl and move it to an OBF repository, but I'd
>>>> rather not go back to CVS.
>>>>
>>>> Rutger
>>>>
>>>
>>> I second that, SVN seems like the reasonable choice. I would be more
>>> than happy to help out as well.
>>>
>>> Spiros
>>>
>>>>
>>>> -----Original Message-----
>>>>
>>>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>>>
>>>>>
>>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>>>
>>>>>>>>> ...
>>>>>>>> Can we do any sort of massive conversion at some logical
>>>>>>>> timepoint.
>>>>>>>> Probably after a branch release or something?  Because it
>>>>>>>> basically
>>>>>>>> means we're going to have differences on nearly every line
>>>>>>>> which is
>>>>>>>> going to make diff-ing difficult when debugging old/new  
>>>>>>>> versions.
>>>>>>>> Maybe it is not a problem because we aren't introducing any new
>>>>>>>> bugs!
>>>>>>
>>>>>> Sorry, can you clarify the problem you envisage? And why would
>>>>>> making a branch release help?
>>>>>
>>>>> Maybe the worry is that mass conversion in such a large codebase
>>>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows
>>>>> w/o
>>>>> trying?
>>>>>
>>>>>>> I agree; if we intend on doing this it should be all at once,
>>>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>>>> need a script up- and-running that tidies everything up prior to
>>>>>>> commits (though what  happens if perltidy tanks?...).
>>>>>>> Sendu, up for it?
>>>>>>
>>>>>> If its going to be difficult and a hassle, for such an  
>>>>>> unnecessary
>>>>>> thing I'm not sure its worth it. There are more pressing  
>>>>>> things to
>>>>>> be done for Bioperl.
>>>>>>
>>>>>> If I can just run perltidy on the entire package and commit,  
>>>>>> I'd do
>>>>>> it. If that's not appropriate, I won't.
>>>>>
>>>>> The choices aren't necessarily all or nothing.  What about
>>>>> voluntary,
>>>>> recommended use of a perltidy config file included with the
>>>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>>>
>>>>>>>>> About svn
>>>>>> [snip]
>>>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>>>> is  involved and try getting something going in the next  
>>>>>>> month or
>>>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>>>> well but it  might be worth looking into.
>>>>>>
>>>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>>>> more pressing matters (POD fixing, test script updating and
>>>>>> finishing...).
>>>>>
>>>>> A few other open-bio projects have actively discussed a CVS->SVN
>>>>> migration (BioRuby and I think BioPython, though the latter  
>>>>> could be
>>>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>>>> pros/cons, get others opinions >from others who have made the
>>>>> transition, etc.  We could, as Jason suggested, even set up a  
>>>>> tester
>>>>> SVN w/o making it the default codebase (lock it off to a few
>>>>> testers,
>>>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>>>
>>>>> I agree with you that it's not feasible to switch over prior to a
>>>>> release and that there are more pressing issues, but it doesn't  
>>>>> hurt
>>>>> having an open discussion about it.
>>>>>
>>>>> chris
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Sat Jun 16 10:55:09 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 16 Jun 2007 10:55:09 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4673C7CB.1030709@mail.nih.gov>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
Message-ID: <B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>


On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:

> As for access, the typical access is over http (or https).

We're using svn+ssh here (NESCent) so the password is the same as the  
one you set for your account on the server, and you can use public/ 
private key negotiation for authentication.

I think the ability to not provide a password for every single  
interaction is a requirement. If that requires using svn+ssh or can  
be made to work through https too I don't know. On sf.net I have to  
use https for svn and it doesn't ask me for the password each time.  
Not sure how this works though, maybe some local caching?

We should not be using http, or any other protocol that sends  
unencrypted passwords.

>   Access controls can be set up on the server side while allowing  
> anonymous access for checkout.  There are many excellent SVN for  
> every OS, so that should not be a problem.

On Mac OSX the most convenient way I have found is through fink. It  
does ask to install 30 other dependencies, which had me balk at  
first, but me doing it by hand is even worse than fink doing it, so I  
finally gave in and it's really a breeze. I've not had a single issue.

  From a sysadmin perspective, what might be worth keeping in mind is  
that svn is going to store everything in a database (BerkeleyDB I  
think). I.e., there is no such thing anymore as restoring individual  
source code files from backup if one gets accidentally corrupted on  
the server. It seems you have to restore the entire database, i.e.,  
the entire repository. I vaguely recall though that how svn manages  
the repository is actually configurable and that other storage than  
DB is possible too. Don't ask me for the pros and cons of one vs the  
other.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From rvos at interchange.ubc.ca  Sat Jun 16 13:09:18 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Sat, 16 Jun 2007 10:09:18 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>

CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)).

For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement).

Rutger




From rvos at interchange.ubc.ca  Sat Jun 16 13:15:45 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Sat, 16 Jun 2007 10:15:45 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca>

A brief word on the topic of perltidy: no. I like what it does, and I sort of follow one of its settings (-syn -sob -b), but if you run it on a whole source tree it'll screw up the diffs, and I'm still worried about it breaking things (though really it shouldn't, it creates a *.bak if something doesn't compile anymore).

Rutger




From george.heller at yahoo.com  Sat Jun 16 13:29:26 2007
From: george.heller at yahoo.com (George Heller)
Date: Sat, 16 Jun 2007 10:29:26 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
Message-ID: <959624.48556.qm@web56502.mail.re3.yahoo.com>

Hi all,
   
  I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, I want to extract all taxon ids that are children of this species. I do not just want the immediate children, but the children's children and so on. 
   
  Any ideas on the way I can go about doing this?
   
  George

       


From bix at sendu.me.uk  Sat Jun 16 14:21:38 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 16 Jun 2007 19:21:38 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <959624.48556.qm@web56502.mail.re3.yahoo.com>
References: <959624.48556.qm@web56502.mail.re3.yahoo.com>
Message-ID: <46742A32.90305@sendu.me.uk>

George Heller wrote:
> Hi all,
> 
> I am looking at extracting the taxonomy hierarchy for some taxon ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
> 
> Any ideas on the way I can go about doing this?

Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
some kind of looping structure. Most easily a recursing sub.

If you happen to code up something neat and efficient, why not share it 
with us and we could add it to the Taxonomy module(s).
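
A minimal sketch of such a recursing sub, in plain Perl so that it runs
without BioPerl installed.  The %children_of table and its child IDs are
made-up stand-ins for a taxonomy database, and all_descendants is a
hypothetical name; with Bio::DB::Taxonomy the callback would presumably
wrap each_Descendent instead of a hash lookup.

```perl
use strict;
use warnings;

# Made-up parent => children table standing in for a taxonomy database.
my %children_of = (
    33090 => [101, 102],
    101   => [],
    102   => [201],
    201   => [],
);

# Return every descendant of $id (children, grandchildren, ...), depth first.
# $get_children is a code ref that returns the immediate children of an ID.
sub all_descendants {
    my ($id, $get_children) = @_;
    my @found;
    for my $child ($get_children->($id)) {
        push @found, $child, all_descendants($child, $get_children);
    }
    return @found;
}

my @ids = all_descendants(33090, sub { @{ $children_of{ $_[0] } || [] } });
print join(",", @ids), "\n";    # prints 101,102,201
```

Passing the lookup in as a code ref keeps the recursion independent of
where the hierarchy comes from (flat file, Entrez, or a local hash).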


From cjfields at uiuc.edu  Sat Jun 16 15:23:43 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 16 Jun 2007 14:23:43 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>
Message-ID: <A59B3FA2-6732-4DB2-9C9C-223DFF41D1E9@uiuc.edu>


On Jun 16, 2007, at 9:55 AM, Hilmar Lapp wrote:

>
> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>
>> As for access, the typical access is over http (or https).
>
> We're using svn+ssh here (NESCent) so the password is the same as the
> one you set for your account on the server, and you can use public/
> private key negotiation for authentication.
>
> I think the ability to not provide a password for every single
> interaction is a requirement. If that requires using svn+ssh or can
> be made to work through https too I don't know. On sf.net I have to
> use https for svn and it doesn't ask me for the password each time.
> Not sure how this works though, maybe some local caching?
>
> We should not be using http, or whatever other protocol that sends
> unencrypted passwords.

Agreed; it should be through ssh.

>>   Access controls can be set up on the server side while allowing
>> anonymous access for checkout.  There are many excellent SVN for
>> every OS, so that should not be a problem.
>
> On Mac OSX the most convenient way I have found is through fink. It
> does ask to install 30 other dependencies, which had me balk at
> first, but me doing it by hand is even worse than fink doing it, so I
> finally gave in and it's really a breeze. I've not had a single issue.
>
>   From a sysadmin perspective, what might be worth keeping in mind is
> that svn is going to store everything in a database (BerkeleyDB I
> think). I.e., there is no such thing anymore as restoring individual
> source code files from backup if one gets accidentally corrupted on
> the server. It seems you have to restore the entire database, i.e.,
> the entire repository. I vaguely recall though that how svn manages
> the repository is actually configurable and that other storage than
> DB is possible too. Don't ask me for the pros and cons of one vs the
> other.

MacPorts/DarwinPorts also has subversion, various language bindings,  
cvs2svn, and various perl modules.  There are also a few SVN GUIs  
lingering around (including live folders within Komodo).

chris


From cjfields at uiuc.edu  Sat Jun 16 15:18:06 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 16 Jun 2007 14:18:06 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca>
References: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <1A314D08-8F3C-4A4B-B58D-64AC7952F149@uiuc.edu>

I think it's viable as an option if the code really needs it.  After  
100+ commits some of the code has schizy coding styles, so cleaning  
it up helps.  In those cases having a perltidy config file present  
wouldn't hurt.  However I agree that it shouldn't be applied across  
every module and should be done judiciously (the commit message, for  
instance, should actually state the code was tidied).
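
As a rough idea of what such a config file could look like, here is a
sketch of a .perltidyrc using only the three flags Rutger mentions
(quoted below).  The comments give my understanding of each flag; this
is an assumption about a possible starting point, not an agreed project
style.

```
# .perltidyrc (hypothetical starting point)
-syn    # check syntax of the tidied code
-sob    # swallow optional blank lines
-b      # modify files in place, keeping a .bak backup
```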

chris

PS - Nice to see the ball is rolling on SVN!

On Jun 16, 2007, at 12:15 PM, rvos wrote:

> A brief word on the topic of perltidy: no. I like what it does, and  
> I sort of follow one of its settings (-syn -sob -b), but if you run  
> it on a whole source tree it'll screw up the diffs, and I'm still  
> worried about it breaking things (though really it shouldn't, it  
> creates a *.bak if something doesn't compile anymore).
>
> Rutger

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hartzell at alerce.com  Sat Jun 16 13:47:01 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 16 Jun 2007 10:47:01 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
	<6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
	<F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>
Message-ID: <18036.8725.29073.619527@almost.alerce.com>

Chris Fields writes:
 > Ah, got it.  Sorry.
 > 
 > George, planning on taking this up?

I'm going to take a *peek*.  I just finished (unless someone finds
another issue) moving someone's cvs repository over to svn, so I have
some tools cobbled together and some knowledge in the cache.

I don't have too much idle time at the moment though, so if it gets
gooey I'll just summarize what I learn.  Either way it seems worth a
peek.

I will need the repository itself though.  I'll post a note to
support at open-bio.org.

g.


From jason at bioperl.org  Sat Jun 16 19:54:18 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 16 Jun 2007 16:54:18 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18036.8725.29073.619527@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
	<6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
	<F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>
	<18036.8725.29073.619527@almost.alerce.com>
Message-ID: <6F57475B-715F-49D1-B6D2-F3FD3ACCB728@bioperl.org>

Thanks George.
I'll respond to your support ticket as well but I put up tarballs of  
the repository as of today.

I had thought at one point ChrisD might have set up rsync-able access  
to the whole repository through code.open-bio.org, but for now I have  
put up tarballs of most of the CVS dirs from bioperl:
http://bioperl.org/uploads/

Just to say, I already went through all the steps of running cvs2svn  
myself and had problems getting the branches and all the tags back  
out when I tried it.  You may want to start with a smaller repository  
like bioperl-network or bioperl-db, as the initial cvs2svn conversion  
script took quite a long time to run on bioperl-live.

Regarding ssh/https:
We have already gone through some of this for the blipkit and biojava  
projects.  I think we'll still keep separate anonymous read-only  
(code.open-bio.org) and writeable repositories (dev.open-bio.org), as  
I think we are resisting any webapps on the development server because  
we want that to be as locked down as possible.  For the newly created  
svn repositories that I've been creating/using I just use svn+ssh and  
that worked okay.


-jason


--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From hartzell at alerce.com  Sat Jun 16 19:56:09 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 16 Jun 2007 16:56:09 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <46739D69.4090204@sheffield.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<46739D69.4090204@sheffield.ac.uk>
Message-ID: <18036.30873.609341.181853@almost.alerce.com>

Nathan S. Haigh writes:
 > [...]
 > Sounds like George might know what he's doing! 

Hey, I've been looking for a Marketing Director.  Want a job?

 > I have a question about
 > setting up svn access. I believe access can be done in several ways,
 > over webdav, over ssh and probably others too. Do you have any knowledge
 > about the benefits of one over the other? I suppose I'm thinking of what
 > to implement to allow anonymous read access for users and authenticated
 > access for developers.

There are two and a half ways to talk to the repository:

  - You can put it behind a web server (e.g. apache) and get at it
    using http/https.  Authentication and authorization happen using
    the normal web server tricks, so as long as you don't do anything
    silly (e.g. don't use basic auth, stick with mod_auth_digest),
    even http connections won't send passwords in the clear.  You can
    define users in .htpasswd files or use any of the fancier setups
    (e.g. sql databases, etc...).

  - You can talk to it via subversion's simple server, svnserve.
    There are two ways you usually talk to svnserve (neither of which
    send passwords in the clear):

      * directly, using a URL like
          svn://svn.example.com/repo/proj/trunk
        when you do this the client either talks directly to a copy of
        svnserve running as a daemon, or possibly to something like
        inetd that'll start an svnserve as necessary.

        In this case, you define authentication and authorization info
        in an svnserve.conf file.

      * indirectly, using a URL like
          svn+ssh://svn.example.com/repo/proj/trunk/
        in which case you make an ssh connection to the server machine
        (and authenticate via ssh mechanisms, anything other than a
        key-pair will drive you nuts with repeated password requests)
        and then an svnserve process is started up for you in "tunnel
        mode".  Access control is coarse-grained and via OS-level access
        permissions.

        Generally in this case you need to give out shell accounts to
        everyone involved, or (tsk, tsk) have them use a common
        account.  There's a cute trick in the svn book that shows how
        to use a shared ssh account but still have all of the changes
        in the repo keep track of the real user.  I've never tried
        it.... 

   - If you're on the same machine as the repo, you can do this
     simply:
        file:///path/to/repo/proj/trunk
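
For the direct svnserve case, anonymous read access plus authenticated
write access (what Nathan asked about) might be sketched in the
repository's conf/svnserve.conf roughly like this.  The realm name is
a made-up example, and "passwd" is assumed to be a file next to
svnserve.conf that lists users:

```ini
[general]
# anyone may check out, only authenticated users may commit
anon-access = read
auth-access = write
# file with a [users] section of name = password pairs
password-db = passwd
realm = Example Repository
```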

The biggest deciding factor is how you want to manage your users and
whether you're already messing around with a web server.  I've
generally worked in small groups and everyone's had ssh access, but
I've set it up the other ways too.

You can even access via multiple paths.  The only trick is that the
repository needs to be writable by whoever's committing, and if
they're running svnserve themselves (file: or svn+ssh:) and things
aren't set up right (all the dirs in the repo need to be group
writable and have the magic bit set so that any new stuff created is
also writable, users' umasks and group membership need to be aligned)
then things go fubar.  Google's your friend here, and each of the
OS's/distro's has a standard hack for making this work, usually
involving a wrapper app that takes care of things.
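
To make the permissions part concrete, here is a small sketch that
marks every directory under a path as group-writable with the setgid
bit (which I take to be the "magic bit" above).  It runs against a
throwaway temp directory; $repo, the db subdirectory and open_up_dirs
are hypothetical stand-ins, and a real repository would also need the
right group ownership and a umask of 002 for committers.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Throwaway stand-in for a repository directory.
my $repo = tempdir(CLEANUP => 1);
mkdir "$repo/db" or die "mkdir failed: $!";

# Set mode 2775 (rwxrwsr-x) on every directory below $root, so members of
# the group can write and new entries inherit the directory's group.
sub open_up_dirs {
    my ($root) = @_;
    my @dirs;
    find(sub { push @dirs, $File::Find::name if -d $_ }, $root);
    chmod 02775, @dirs;
    return @dirs;
}

my @changed = open_up_dirs($repo);
printf "%d directories updated\n", scalar @changed;
```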

Feel free to ask any particular questions.

Phew,

g.


From jason at bioperl.org  Sat Jun 16 20:17:58 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 16 Jun 2007 17:17:58 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <200706151653.04135.sheris@eps.berkeley.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
	<1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
	<200706151653.04135.sheris@eps.berkeley.edu>
Message-ID: <6A369DE9-943A-4DF1-9DF0-F68E361C8C20@bioperl.org>

The error is clearly saying there must be a symbol or letter in  
your sequence that violates the regexp.
I had modified the code in CVS to actually provide a more informative  
mismatch error in the error message, but this is probably not in the  
release you are using.

Anyways, add this to see what is causing the problem:

print join(",", ($nstarthash{$_}[1] =~ /([^$Bio::PrimarySeq::MATCHPATTERN]+)/g)), "\n";
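
If the offending characters turn out to be invisible (a stray carriage
return, say), printing them raw will not help much.  A standalone sketch
of the same idea that prints hex codes instead; $allowed is an assumed
stand-in for $Bio::PrimarySeq::MATCHPATTERN (the real value may differ),
and offending_runs and the input string are made up for illustration:

```perl
use strict;
use warnings;

# Assumed stand-in for $Bio::PrimarySeq::MATCHPATTERN.
my $allowed = 'A-Za-z';

# Return each run of disallowed characters as a string of hex codes,
# so that invisible characters become visible in the output.
sub offending_runs {
    my ($seq) = @_;
    my @report;
    while ($seq =~ /([^$allowed]+)/g) {
        my $run   = $1;
        my @codes = map { sprintf('0x%02X', ord($_)) } split(//, $run);
        push @report, join(' ', @codes);
    }
    return @report;
}

# Made-up input: a sequence with an embedded CR/LF and a trailing " 7".
print join(', ', offending_runs("GCCCCTGTAA\r\nTCGC 7")), "\n";
# prints 0x0D 0x0A, 0x20 0x37
```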

-jason
On Jun 15, 2007, at 4:53 PM, Sheri Simmons wrote:

> Thanks for the suggestion, but that still gives the same error as  
> before.
>
> On Friday 15 June 2007 4:11 pm, Kevin Brown wrote:
>>> I'm getting an error as follows when I try to reverse
>>> complement a sequence string stored in a hash of arrays. The
>>> storage code is:
>>>
>>> 		$nstarthash{$key} = [$sortchecks[0], join("",
>>> @nseq),
>>> join("",@{$seqhash{$key}})];
>>>
>>> the sequence of interest is the element at index 1.
>>>
>>> Later, I try to retrieve this string for a subset of keys so
>>> I can reverse complement it based on input from another hash
>>> (%complement):
>>>
>>> 			my %revcomphash = map { my $read = $_;
>>> 			grep $complement{$read} eq 'C', %complement;
>>> 			{$_, (Bio::Seq->new(-seq
>>> =>$nstarthash{$_}[1]))->revcom->seq()};}
>>> 			 keys(%nstarthash);
>>>
>>>
>>> I get the following warning (long sequence edited for clarity):
>>>
>>> -- -------------------- WARNING ---------------------
>>> MSG: seq doesn't validate, mismatch is 1
>>> ---------------------------------------------------
>>>
>>> ------------- EXCEPTION  -------------
>>> MSG: Attempting to set the sequence to
>>> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
>>> which does not look healthy
>>> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
>>> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
>>> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK
>>> toplevel ../quality_wrapper.pl:103
>>>
>>> I cannot find any non-allowed characters in the sequence, and
>>> the de-referencing appears to work correctly. Can anyone help me?
>>> I'm using the latest Bioperl installation (1.5.2) with
>>> ActivePerl5.8 on a Mepis 6.5 system.
>>
>> Try telling the Bio::Seq object what alphabet to use when creating  
>> it.
>> I tend to create them like:
>>
>> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')
>
> -- 
> Sheri Simmons
> Department of Earth and Planetary Sciences
> University of California, Berkeley
> Berkeley, CA 94720-4767
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From n.haigh at sheffield.ac.uk  Sun Jun 17 07:45:11 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 17 Jun 2007 12:45:11 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>
References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <46751EC7.8020609@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

rvos wrote:
> CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)).
> 
> For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement).
> 
> Rutger
> 
> 

Just to clarify, subversion is available as command line for windows:
http://subversion.tigris.org/project_packages.html

TortoiseSVN is another svn client with a GUI that integrates into the
shell. I tried setting this up a while back to use ssh (via PUTTY), but
I wasn't successful. This may have been due to me just starting out with
svn or that it was harder to set up in an earlier version of TortoiseSVN.

Does anyone have experience of setting up svn on Windows to use ssh? If
the changeover takes place, I'm happy to write some howto's for setting
up svn clients for Windows.

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGdR7HczuW2jkwy2gRAmgOAJ96wLzVYbjqEPborZTsw6gwU6UitgCfV02v
8xHJvn/Eqf9LePR3Ei0ZaIw=
=t5pN
-----END PGP SIGNATURE-----


From george.heller at yahoo.com  Sun Jun 17 14:41:55 2007
From: george.heller at yahoo.com (George Heller)
Date: Sun, 17 Jun 2007 11:41:55 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46742A32.90305@sendu.me.uk>
Message-ID: <148654.15952.qm@web56511.mail.re3.yahoo.com>

Hi all,
   
  Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. 
   
  Thanks.
  George

Sendu Bala <bix at sendu.me.uk> wrote:
  George Heller wrote:
> Hi all,
> 
> I am looking at extracting the taxonomy hierarchy for some taxon ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
> 
> Any ideas on the way I can go about doing this?

Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
some kind of looping structure. Most easily a recursing sub.

If you happen to code up something neat and efficient, why not share it 
with us and we could add it to the Taxonomy module(s).




From jason at bioperl.org  Sun Jun 17 16:48:05 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sun, 17 Jun 2007 13:48:05 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <148654.15952.qm@web56511.mail.re3.yahoo.com>
References: <148654.15952.qm@web56511.mail.re3.yahoo.com>
Message-ID: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org>

I assume you already figured out how to set up a local taxonomydb?

You just want the extant species/leaves of the tree

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;


-jason
On Jun 17, 2007, at 11:41 AM, George Heller wrote:

> Hi all,
>
>   Can anyone point me to some example that uses the  
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at  
> this, and I am not quite sure how to implement it.
>
>   Thanks.
>   George
>
> Sendu Bala <bix at sendu.me.uk> wrote:
>   George Heller wrote:
>> Hi all,
>>
>> I am looking at extracting the taxonomy hierarchy for some taxon ids.
>> What I plan to do is, for a given taxon id, say 33090, I want to
>> extract all taxon ids that are children of this species. I do not
>> just want the immediate children, but the children's children and so
>> on.
>>
>> Any ideas on the way I can go about doing this?
>
> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
>
> If you happen to code up something neat and efficient, why not  
> share it
> with us and we could add it to the Taxonomy module(s).
>
>
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From aaron.j.mackey at gsk.com  Sun Jun 17 22:35:42 2007
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Sun, 17 Jun 2007 22:35:42 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46742A32.90305@sendu.me.uk>
Message-ID: <OF9A874C93.CFF12016-ON852572FE.000E328D-852572FE.000E463E@gsk.com>

To do so efficiently, you might want to check out:

  http://www.oreillynet.com/pub/a/network/2002/11/27/bioconf.html

-Aaron

bioperl-l-bounces at lists.open-bio.org wrote on 06/16/2007 02:21:38 PM:

> George Heller wrote:
> > Hi all,
> > 
> > I am looking at extracting the taxonomy hierarchy for some taxon ids.
> > What I plan to do is, for a given taxon id, say 33090, I want to
> > extract all taxon ids that are children of this species. I do not
> > just want the immediate children, but the children's children and so
> > on.
> > 
> > Any ideas on the way I can go about doing this?
> 
> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
> 
> If you happen to code up something neat and efficient, why not share it 
> with us and we could add it to the Taxonomy module(s).
> 


From aaron.j.mackey at gsk.com  Sun Jun 17 22:34:12 2007
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Sun, 17 Jun 2007 22:34:12 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>
Message-ID: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>

> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
> 
> > As for access, the typical access is over http (or https).
> 
> We're using svn+ssh here (NESCent)

Let me just note that https is preferable to ssh for those poor slobs 
stuck behind a corporate firewall (svn happily prompts me for my proxy 
server's user/pass, then my https authentication realm's user/pass - all 
then get cached in some .svn/ file that I don't have to worry about again 
until my proxy server password changes once a month ...)

-Aaron


From george.heller at yahoo.com  Mon Jun 18 00:21:45 2007
From: george.heller at yahoo.com (George Heller)
Date: Sun, 17 Jun 2007 21:21:45 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org>
Message-ID: <487845.37410.qm@web56510.mail.re3.yahoo.com>

Thanks. And how can I assign the $node here in the below code, such that I can reference it to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id. 
   
  I have a local db setup, in which I have uploaded data using the load_ncbi_taxonomy.pl script. 
   
  Thanks.
  George
   
  Jason Stajich <jason at bioperl.org> wrote:
    I assume you already figured out how to setup a local taxonomydb?
  

  You just want the extant species/leaves of the tree
  

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

  
  -jason
    On Jun 17, 2007, at 11:41 AM, George Heller wrote:

    Hi all,
  

    Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. 
  

    Thanks.
    George
  

  Sendu Bala <bix at sendu.me.uk> wrote:
    George Heller wrote:
    Hi all,
  

  I am looking at extracting the taxonomy hierarchy for some taxon ids.
  What I plan to do is, for a given taxon id, say 33090, I want to
  extract all taxon ids that are children of this species. I do not
  just want the immediate children, but the children's children and so
  on.
  

  Any ideas on the way I can go about doing this?
  

  Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
  some kind of looping structure. Most easily a recursing sub.
  

  If you happen to code up something neat and efficient, why not share it 
  with us and we could add it to the Taxonomy module(s).
  



    --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/




From bix at sendu.me.uk  Mon Jun 18 06:44:00 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 11:44:00 +0100
Subject: [Bioperl-l] Network tests overhaul
Message-ID: <467661F0.2060703@sendu.me.uk>

When the test suite runs currently, most (the intent is all) tests skip 
if the test would require network (internet) access. This is to avoid 
tests failing not due to bugs in Bioperl code, but due to temporarily 
inaccessible servers. This is also to make running the test suite faster.

To do a complete test you currently have to set BIOPERLDEBUG to true,
which activates the network tests but also increases verbosity. This
actually causes a problem: when running the entire test suite the
additional debug information is more a hindrance than a help, because
the reams of printed information can hide significant warnings that may
also get printed. It's also ugly.

The solution is to divorce activation of network tests from the request 
for verbosity. The obvious implementation is to have another environment 
variable, perhaps BIOPERLNETWORK. However, there is an opportunity to do 
something more appropriate. The running of networking tests should be a 
choice given to every end-user installing Bioperl. Debugging 
information, on the other hand, is only of interest to the developer 
working on a specific module under test, so can be left as a 'hidden' 
env var.


I have just committed one possible implementation along these lines.

You say:
perl Build.PL
as normal, and if you seem to have internet access it asks you if you'd 
like to run network tests. The default answer is no. If you answer yes, 
network tests will be enabled.

You can alternatively say:
perl Build.PL --network
and if you seem to have internet access, network tests will be enabled.

Then you run the tests:
./Build test
Any tests written to support the new system will then skip network tests 
if they haven't been enabled.

The only test I've written to support the new system is t/RemoteBlast.t:
./Build test --test_files t/RemoteBlast.t --verbose


Adding support to test scripts consists of the following changes:

+ use Module::Build;
+ my $build = Module::Build->current(get_options => { network => {} });
+ my $do_network_tests = $build->notes('network');

! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests
---
! if (!$do_network_tests) { # skip network tests
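
For a script where only some of the tests need the network, the boolean can guard a Test::More SKIP block. A minimal sketch: the flag is hard-coded here as an assumption so that it runs outside a build tree, and the test names are placeholders rather than anything taken from a real Bioperl test script.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Assumption: hard-coded so the sketch runs standalone. In a real test
# script the value would come from Module::Build->current->notes('network').
my $do_network_tests = 0;

# A test that needs no network access always runs.
ok(1 + 1 == 2, 'offline test always runs');

SKIP: {
    # Skip exactly the two tests in this block unless network tests are on.
    skip 'network tests have not been enabled', 2 unless $do_network_tests;

    # Placeholders for tests that would contact a remote server.
    ok(1, 'hypothetical remote query returned');
    ok(1, 'hypothetical remote result parsed');
}
```

The skipped tests still count towards the plan, so the total stays the same whether or not network tests are enabled.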


I propose adding this support to all test scripts that carry out network 
tests. Does anyone have objections? Does anyone have alternate 
implementations that may be superior?

I specifically suggest we don't use an env var in addition to the above, 
because the multiple ways of doing things could lead to confusion. Which 
takes priority? Did a user really have the networking tests turned on 
when he reported his test results?


The one thing I need help with is identifying which tests attempt to 
access the internet. I think we caught most of them for the 1.5.2 
release, but I think there are more lurking around. Can anyone offer a 
way to systematically find at least the test scripts which access the 
internet, if not the specific tests within?

Cheers,
Sendu.


From bix at sendu.me.uk  Mon Jun 18 06:46:17 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 11:46:17 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID: <46766279.7050202@sendu.me.uk>

Sendu Bala wrote:
> Adding support to test scripts consists of the following changes:
> 
> + use Module::Build;
> + my $build = Module::Build->current(get_options => { network => {} });

That should read:
+ my $build = Module::Build->current();

> + my $do_network_tests = $build->notes('network');


From cjfields at uiuc.edu  Mon Jun 18 07:45:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 06:45:10 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <46766279.7050202@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk>
Message-ID: <C3AD4CC8-4B55-4613-B751-99E18C7A87B5@uiuc.edu>

The idea sounds good, though if we plan on doing this we need to  
update the Test HOWTO as well.

Some modules require only a few (<50% of the total) network tests; I  
think SeqFeature.t may be one, though I'm not sure.  Does this handle  
those cases?

chris

On Jun 18, 2007, at 5:46 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Adding support to test scripts consists of the following changes:
>>
>> + use Module::Build;
>> + my $build = Module::Build->current(get_options => { network =>  
>> {} });
>
> That should read:
> + my $build = Module::Build->current();
>
>> + my $do_network_tests = $build->notes('network');

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Mon Jun 18 07:49:18 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 12:49:18 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <C3AD4CC8-4B55-4613-B751-99E18C7A87B5@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk>
	<C3AD4CC8-4B55-4613-B751-99E18C7A87B5@uiuc.edu>
Message-ID: <4676713E.1000508@sendu.me.uk>

Chris Fields wrote:
> The idea sounds good, though if we plan on doing this we need to update 
> the Test HOWTO as well.
> 
> Some modules require only a few (<50% of the total) network tests; I 
> think SeqFeature.t may be one, though I'm not sure.  Does this handle 
> those cases?

Yes, the system just gives the test script a boolean describing if
network tests should be run. The script can then do whatever it wants
with the boolean. Skip all tests, skip no tests, skip just some tests...
It's a drop-in replacement for the current 'debug' boolean based on
BIOPERLDEBUG.


From hlapp at gmx.net  Mon Jun 18 08:38:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:38:25 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <487845.37410.qm@web56510.mail.re3.yahoo.com>
References: <487845.37410.qm@web56510.mail.re3.yahoo.com>
Message-ID: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net>

I'm a bit confused - it sounds like you have set up a local BioSQL  
database and loaded the NCBI taxonomy into the database. You can now  
use simple SQL to retrieve all descendants of a node in the tree  
given its NCBI taxonID such as

	SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, taxon n
	WHERE
	    n.ncbi_taxon_id = :taxonID
	AND tn.left_value > n.left_value
	AND tn.right_value < n.right_value
	AND tn.taxon_id = tnm.taxon_id
	AND tnm.name_class = 'scientific name'
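
The query relies on nested-set numbering: every node carries a left and a right value, and a descendant's pair always lies strictly inside its ancestor's pair. A toy illustration in plain Perl, with made-up ids and values rather than real taxonomy data:

```perl
use strict;
use warnings;

# Hypothetical toy tree with nested-set numbering:
# id => [left_value, right_value]
my %node = (
    1 => [1, 10],    # root
    2 => [2, 7],     # child of 1
    3 => [3, 4],     # child of 2
    4 => [5, 6],     # child of 2
    5 => [8, 9],     # child of 1
);

# A node is a descendant when its interval lies strictly inside the
# interval of the starting node; no recursion is needed.
sub nested_set_descendants {
    my ($nodes, $id) = @_;
    my ($left, $right) = @{ $nodes->{$id} };
    return sort { $a <=> $b }
           grep { $nodes->{$_}[0] > $left && $nodes->{$_}[1] < $right }
           keys %$nodes;
}

my @under_two = nested_set_descendants(\%node, 2);
print join(',', @under_two), "\n";    # prints 3,4
```

This is why the whole subtree comes back from a single SELECT instead of one query per level.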

BioPerl doesn't have a Taxonomy::biosql module yet (though this would  
seem like a worthwhile thing to add), so you can't use the  
Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the
NCBI taxonomy database and indexes it, so you can simply use
Taxonomy::{get_taxon,get_all_Descendents} on the flat-file download
to achieve what you wanted in fewer than 5 lines of Perl.

Although the recursive implementation of Taxonomy::get_all_Descendents()
won't be lightning fast, it may still be perfectly fine for your
application - are you sure it is not?
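
For comparison, the recursive approach can be sketched in plain Perl. This is not the module's actual code, just the general idea: the %children map and its ids are hypothetical stand-ins for whatever the taxonomy database returns for a node's immediate children.

```perl
use strict;
use warnings;

# Hypothetical parent => [children] map standing in for a taxonomy.
my %children = (
    1 => [2, 5],
    2 => [3, 4],
);

# Collect all descendants depth-first: each child is followed by its own
# descendants before the next sibling is visited.
sub all_descendants {
    my ($tree, $id) = @_;
    my @found;
    for my $child (@{ $tree->{$id} || [] }) {
        push @found, $child, all_descendants($tree, $child);
    }
    return @found;
}

my @desc = all_descendants(\%children, 1);

# Leaves are the descendants that have no children of their own.
my @leaves = grep { !@{ $children{$_} || [] } } @desc;

print "descendants: @desc\n";
print "leaves: @leaves\n";
```

Each level costs one lookup per node, which is where the time goes on a tree as large as the NCBI taxonomy.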

	-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

> Thanks. And how can I assign the $node here in the below code, such  
> that I can reference it to a particular taxon id record? I want to  
> retrieve all the descendents from the taxonomy hierarchy, given a  
> particular taxon id.
>
>   I have a local db setup, in which I have uploaded data using the  
> load_ncbi_taxonomy.pl script.
>
>   Thanks.
>   George
>
>   Jason Stajich <jason at bioperl.org> wrote:
>     I assume you already figured out how to setup a local taxonomydb?
>
>
>   You just want the extant species/leaves of the tree
>
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>
>
>   -jason
>     On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
>     Hi all,
>
>
>     Can anyone point me to some example that uses the  
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at  
> this, and I am not quite sure how to implement it.
>
>
>     Thanks.
>     George
>
>
>   Sendu Bala <bix at sendu.me.uk> wrote:
>     George Heller wrote:
>     Hi all,
>
>
>   I am looking at extracting the taxonomy hierarchy for some taxon  
> ids.
>   What I plan to do is, for a given taxon id, say 33090, I want to
>   extract all taxon ids that are children of this species. I do not
>   just want the immediate children, but the children's children and so
>   on.
>
>
>   Any ideas on the way I can go about doing this?
>
>
>   Well, you'll use Bio::DB::Taxonomy presumably, and  
> each_Descendent in
>   some kind of looping structure. Most easily a recursing sub.
>
>
>   If you happen to code up something neat and efficient, why not  
> share it
>   with us and we could add it to the Taxonomy module(s).
>
>
>
>
>
>
>
>
>     --
>   Jason Stajich
>   jason at bioperl.org
>   http://jason.open-bio.org/
>
>
>
>
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Mon Jun 18 08:44:22 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:44:22 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
Message-ID: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>

Just curious - how do you cvs commit then to an external repository?  
Is that open in the firewall?

It is true though that corporations typically will not permit any  
encrypted outgoing traffic through their firewall except https.  
sf.net only supports https for svn, AFAIK.

	-hilmar

On Jun 17, 2007, at 10:34 PM, aaron.j.mackey at gsk.com wrote:

>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>>
>>> As for access, the typical access is over http (or https).
>>
>> We're using svn+ssh here (NESCent)
>
> Let me just note that https is preferable to ssh for those poor slobs
> stuck behind a corporate firewall (svn happily prompts me for my proxy
> server's user/pass, then my https authentication realm's user/pass  
> - all
> then get cached in some .svn/ file that I don't have to worry about  
> again
> until my proxy server password changes once a month ...)
>
> -Aaron
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Mon Jun 18 08:47:56 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:47:56 -0400
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID: <B9BDBD4A-962D-4E83-8151-5D6EA8B69D3B@gmx.net>

Sounds like a great idea to me. -hilmar

On Jun 18, 2007, at 6:44 AM, Sendu Bala wrote:

> When the test suite runs currently, most (the intent is all) tests  
> skip
> if the test would require network (internet) access. This is to avoid
> tests failing not due to bugs in Bioperl code, but due to temporarily
> inaccessible servers. This is also to make running the test suite  
> faster.
>
> To do a complete test you currently have to set BIOPERLDEBUG to true,
> which activates the network test but also increases verbosity. This
> actually causes a problem, since when running the entire test suite  
> the
> additional debug information is more a hindrance than a help, since  
> the
> reams of printed information can hide significant warnings that may  
> also
> get printed. Its also ugly.
>
> The solution is to divorce activation of network tests from the  
> request
> for verbosity. The obvious implementation is to have another  
> environment
> variable, perhaps BIOPERLNETWORK. However, there is an opportunity  
> to do
> something more appropriate. The running of networking tests should  
> be a
> choice given to every end-user installing Bioperl. Debugging
> information, on the other hand, is only of interest to the developer
> working on a specific module under test, so can be left as a 'hidden'
> env var.
>
>
> I have just committed one possible implementation along these lines.
>
> You say:
> perl Build.PL
> as normal, and if you seem to have internet access it asks you if  
> you'd
> like to run network tests. The default answer is no. If you answer  
> yes,
> network tests will be enabled.
>
> You can alternatively say:
> perl Build.PL --network
> and if you seem to have internet access, network tests will be  
> enabled.
>
> Then you run the tests:
> ./Build test
> Any tests written to support the new system will then skip network  
> tests
> if they haven't been enabled.
>
> The only test I've written to support the new system is t/ 
> RemoteBlast.t:
> ./Build test --test_files t/RemoteBlast.t --verbose
>
>
> Adding support to test scripts consists of the following changes:
>
> + use Module::Build;
> + my $build = Module::Build->current(get_options => { network =>  
> {} });
> + my $do_network_tests = $build->notes('network');
>
> ! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests
> ---
> ! if (!$do_network_tests) { # skip network tests
>
>
> I propose adding this support to all test scripts that carry out  
> network
> tests. Does anyone have objections? Does anyone have alternate
> implementations that may be superior?
>
> I specifically suggest we don't use an env var in addition to the  
> above,
> because the multiple ways of doing things could lead to confusion.  
> Which
> takes priority? Did a user really have the networking tests turned on
> when he reported his test results?
>
>
> The one thing I need help with is identifying which tests attempt to
> access the internet. I think we caught most of them for the 1.5.2
> release, but I think there are more lurking around. Can anyone offer a
> way to systematically find at least the test scripts which access the
> internet, if not the specific tests within?
>
> Cheers,
> Sendu.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 18 08:55:53 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 07:55:53 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
Message-ID: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>

On Jun 18, 2007, at 7:44 AM, Hilmar Lapp wrote:

> Just curious - how do you cvs commit then to an external repository?
> Is that open in the firewall?
>
> It is true though that corporations typically will not permit any
> encrypted outgoing traffic through their firewall except https.
> sf.net only supports https for svn, AFAIK.
>
> 	-hilmar

If so it may be better to allow https, though I don't know how Chris  
D. and others feel about it.

Did we make a decision as to the fate of cvs if we get svn up-and- 
running?  Keep it around (assuming svn commits would be carried over  
to cvs and vice versa)?  Or see what happens over time?

chris


From sdavis2 at mail.nih.gov  Mon Jun 18 09:05:50 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 18 Jun 2007 09:05:50 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
Message-ID: <4676832E.5080704@mail.nih.gov>

aaron.j.mackey at gsk.com wrote:
>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>>
>>> As for access, the typical access is over http (or https).
>> We're using svn+ssh here (NESCent)
> 
> Let me just note that https is preferable to ssh for those poor slobs 
> stuck behind a corporate firewall (svn happily prompts me for my proxy 
> server's user/pass, then my https authentication realm's user/pass - all 
> then get cached in some .svn/ file that I don't have to worry about again 
> until my proxy server password changes once a month ...)

That would be my suggestion as well (although I added it only
parenthetically).

Sean


From hlapp at gmx.net  Mon Jun 18 09:13:27 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 09:13:27 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
Message-ID: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net>


On Jun 18, 2007, at 8:55 AM, Chris Fields wrote:

> Did we make a decision as to the fate of cvs if we get svn up-and- 
> running?  Keep it around (assuming svn commits would be carried  
> over to cvs and vice versa)?  Or see what happens over time?

Let's not plan for having cvs and svn writable repositories in  
parallel - that would create an administrative nightmare. Once the  
tests complete, there'll be a clean cut-over.

What Jason suggested is to try and continue a read-only (anonymous)  
cvs repository, updated from the svn repository that the developers  
use, aside from an anonymous svn repository mirroring the writable  
one. This would primarily be for maintaining working URLs for those  
folks who http-linked into the anonymous cvs repository. What I added  
earlier is that even if that fails to be feasible, you can achieve  
the goal using some small CGI script and apache redirect to map CVS- 
style links to the anonymous svn repository.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 18 09:31:35 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 08:31:35 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net>
Message-ID: <0E64DBD0-BBE9-411A-A146-70236EF558BB@uiuc.edu>


On Jun 18, 2007, at 8:13 AM, Hilmar Lapp wrote:

>
> On Jun 18, 2007, at 8:55 AM, Chris Fields wrote:
>
>> Did we make a decision as to the fate of cvs if we get svn up-and- 
>> running?  Keep it around (assuming svn commits would be carried  
>> over to cvs and vice versa)?  Or see what happens over time?
>
> Let's not plan for having cvs and svn writable repositories in  
> parallel - that would create an administrative nightmare. Once the  
> tests complete, there'll be a clean cut-over.

My thoughts as well.  Much simpler.

> What Jason suggested is to try and continue a read-only (anonymous)  
> cvs repository, updated from the svn repository that the developers  
> use, aside from an anonymous svn repository mirroring the writable  
> one. This would primarily be for maintaining working URLs for those  
> folks who http-linked into the anonymous cvs repository. What I  
> added earlier is that even if that fails to be feasible, you can  
> achieve the goal using some small CGI script and apache redirect to  
> map CVS-style links to the anonymous svn repository.
>
> 	-hilmar

I like the idea of a read-only cvs or a 'faux' cvs, though the former  
would initially be easier as we already have it available.  We could  
just lock it down at some switchover point to read-only (something I  
think Jason also suggested).

chris


From bix at sendu.me.uk  Mon Jun 18 09:13:33 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 14:13:33 +0100
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
Message-ID: <467684FD.3080300@sendu.me.uk>

Chris Fields wrote:
> 
> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>> If it's going to be difficult and a hassle, for such an unnecessary 
>> thing I'm not sure it's worth it. There are more pressing things to be 
>> done for Bioperl.
>>
>> If I can just run perltidy on the entire package and commit, I'd do 
>> it. If that's not appropriate, I won't.
> 
> The choices aren't necessarily all or nothing.  What about voluntary, 
> recommended use of a perltidy config file included with the 
> distribution, with additional 'caveats'?

I'm happy with that idea. Why not come up with something and make it 
available for us to try out?


Cheers,
Sendu.


From bix at sendu.me.uk  Mon Jun 18 09:26:36 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 14:26:36 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
Message-ID: <4676880C.9030009@sendu.me.uk>

Chris Fields wrote:
> If so it may be better to allow https, though I don't know how Chris  
> D. and others feel about it.

If it makes no difference to me as an end-user, I won't mind. But I 
won't want to enter my password even once, at the beginning of a 
session. If that's not possible with https, then ssh should be an option 
as well.


Unrelated, but it randomly just occurred to me: what happens to all the 
id lines at the top of modules? Eg:

$Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $

That's a cvs-specific thing, right? Do we delete them all? (Regardless, 
I wish we would, since they caused me no end of hassles during the 1.5.2 
release, doing updates across branches.)


> Did we make a decision as to the fate of cvs if we get svn 
> up-and-running?  Keep it around (assuming svn commits would be carried over  
> to cvs and vice versa)?  Or see what happens over time?

Well, I don't think hard decisions are possible until we know how it's 
going to work in practice. I tried setting up my own svn repository 
once, but didn't keep it and can't remember much about it.

So, I suppose we'll play it by ear and decide things later. Is someone 
out there actively doing something leading toward a demonstration of how 
it will be?


From cjfields at uiuc.edu  Mon Jun 18 09:58:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 08:58:34 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <467684FD.3080300@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
	<467684FD.3080300@sendu.me.uk>
Message-ID: <DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>


On Jun 18, 2007, at 8:13 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>> If it's going to be difficult and a hassle, for such an unnecessary
>>> thing I'm not sure it's worth it. There are more pressing things  
>>> to be
>>> done for Bioperl.
>>>
>>> If I can just run perltidy on the entire package and commit, I'd do
>>> it. If that's not appropriate, I won't.
>>
>> The choices aren't necessarily all or nothing.  What about voluntary,
>> recommended use of a perltidy config file included with the
>> distribution, with additional 'caveats'?
>
> I'm happy with that idea. Why not come up with something and make it
> available for us to try out?
>
>
> Cheers,
> Sendu.

Will do.  Maybe something that conforms to PBP; there's a PBP  
perltidy config on perlmonks, along with some emacs/vim related bits:

http://www.perlmonks.org/?node_id=516501

chris


From sdavis2 at mail.nih.gov  Mon Jun 18 10:03:35 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 18 Jun 2007 10:03:35 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4676880C.9030009@sendu.me.uk>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<4676880C.9030009@sendu.me.uk>
Message-ID: <467690B7.7090105@mail.nih.gov>

Sendu Bala wrote:
> Chris Fields wrote:
>> If so it may be better to allow https, though I don't know how Chris  
>> D. and others feel about it.
> 
> If it makes no difference to me as an end-user, I won't mind. But I 
> won't want to enter my password even once, at the beginning of a 
> session. If that's not possible with https, then ssh should be an option 
> as well.
> 
> 
> Unrelated, but it randomly just occurred to me: what happens to all the 
> id lines at the top of modules? Eg:
> 
> $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $
> 
> That's a cvs-specific thing, right? Do we delete them all? (Regardless, 
> I wish we would, since they caused me no end of hassles during the 1.5.2 
> release, doing updates across branches.)

See here:

http://svnbook.red-bean.com/en/1.0/ch07s02.html

Check out the section at the bottom having to do with svn:keywords.
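
For anyone weighing the "delete them all" option against svn:keywords,
here is a minimal Perl sketch of collapsing the expanded CVS strings
back to a bare $Id$ marker, so that whichever tool ends up managing the
keywords starts from a clean slate. The helper name collapse_id_keyword
is made up for illustration, and whether this step is needed at all
depends on how the conversion handles keywords, which is not settled in
this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl or Subversion): turn an
# expanded CVS keyword such as
#   $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $
# back into the bare, unexpanded form.
sub collapse_id_keyword {
    my ($text) = @_;
    # Match "$Id:" up to the closing "$" on the same line.
    $text =~ s/\$Id:[^\$\n]*\$/\$Id\$/g;
    return $text;
}

# Example on the line quoted above.
print collapse_id_keyword(
    '# $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $'), "\n";
```

Text without an expanded keyword passes through unchanged, so the
helper could be applied to every module without checking first.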

Sean


From akarger at CGR.Harvard.edu  Mon Jun 18 10:10:57 2007
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 18 Jun 2007 10:10:57 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <46751EC7.8020609@sheffield.ac.uk>
References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>
	<46751EC7.8020609@sheffield.ac.uk>
Message-ID: <B9182BFF5B004245BABC12956EA6322E04AFA6BC@huls5.nucleus.harvard.edu>

 
> Just to clarify, subversion is available as command line for windows:
> http://subversion.tigris.org/project_packages.html
> 
> TortoiseSVN is another svn client with a GUI that integrates into the
> shell. I tried setting this up a while back to use ssh (via 
> PUTTY), but
> I wasn't successful. This may have been due to me just 
> starting out with
> svn or that it was harder to setup in an earlier version of 
> TortoiseSVN.
> 
> Does anyone have experience of setting up svn on Windows to 
> use ssh? If
> the changeover takes place, I'm happy to write some howto's 
> for setting
> up svn clients for Windows.

Here are some notes I wrote recently. I'm using this with command-line
svn, not TortoiseSVN. I would hope that it would work with Tortoise,
too, but I can't guarantee.

1. Run PuTTYgen (installed with PuTTY, probably in Start
menu->Programs->PuTTY) and follow directions to create a private key
file like C:\someplace\private_key.ppk and a public key. At this point,
you'll pick an ssh password, which is separate from your login password.

2. Get an account with the appropriate .ssh/authorized_keys file on the
host machine. (This is not Windows-specific. By the way, if you change
the lines of the authorized_keys file to start with, e.g., 
	command="svnserve -t -r /main/repos/dir",no-pty ssh-rsa AAAAB... comment
then (a) you're more secure because users can't open a real shell on the
computer, and (b) users don't need to type the repository directory in
their svn co commands.)

3. Set your environment variables (My Computer->Properties, Advanced
tab, click on Environment Variables). In the top half ("User variables
for ..."), click "New" and put in the variable name and value.

3a. Set the SVN_EDITOR environment variable to your favorite editor,
such as vim or emacs, or a full path to some other editor. If it's not
set, then either VISUAL or EDITOR must be set.

3b. Set the SVN_SSH environment variable to run PuTTY's "plink" program,
which is the Windows equivalent of command-line ssh. If you installed
PuTTY in the default location, set it to "C:/Program
Files/PuTTY/plink.exe". Note 1: use FORWARD slashes. Note 2: Include the
quotation marks in the environment variable.

4. When you want to start using svn, you'll need to run Pageant (Start
menu->Programs->PuTTY), select "Add Key", browse to your private key
file, and enter the ssh password you chose in step 1 (not your login
password). Pageant will stay running until you quit it or logout, so you
can have multiple svn checkins etc., and you only need to type in your
password once.

5. Now just run command-line svn commands the same way you would on UNIX
(modulo Windows' brain-dead shell).

-Amir Karger


From cjfields at uiuc.edu  Mon Jun 18 10:24:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 09:24:00 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4676880C.9030009@sendu.me.uk>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<4676880C.9030009@sendu.me.uk>
Message-ID: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>

On Jun 18, 2007, at 8:26 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> If so it may be better to allow https, though I don't know how  
>> Chris  D. and others feel about it.
>
> If it makes no difference to me as an end-user, I won't mind. But I  
> won't want to enter my password even once, at the beginning of a  
> session. If that's not possible with https, then ssh should be an  
> option as well.

Aaron pointed out in a related post that https access is the  
preferred option behind a corporate firewall (svn prompts for proxy  
user/pass, then caches it).  Not sure how Jason/Hilmar/Chris D. feel  
about https or supporting both https+ssh.

...

>> Did we make a decision as to the fate of cvs if we get svn  
>> up-and-running?  Keep it around (assuming svn commits would be carried  
>> over to cvs and vice versa)?  Or see what happens over time?
>
> Well, I don't think hard decisions are possible until we know how  
> it's going to work in practice. I tried setting up my own svn  
> repository once, but didn't keep it and can't remember much about it.

Agree; we'll need to work out specifics once we know how things work  
out using cvs2svn.  I think the idea is to test using a smaller  
distribution (maybe network or db) and move up from there.

> So, I suppose we'll play it by ear and decide things later. Is  
> someone out there actively doing something leading toward a  
> demonstration of how it will be?

George Hartzell is going to test it out, I believe, and will post  
something when he can.

chris


From dmessina at wustl.edu  Mon Jun 18 10:54:31 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 18 Jun 2007 09:54:31 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
	<467684FD.3080300@sendu.me.uk>
	<DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>
Message-ID: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu>

[Chris F]
> Will do.  Maybe something that conforms to PBP; there's a PBP
> perltidy config on perlmonks, along with some emacs/vim related bits:
>
> http://www.perlmonks.org/?node_id=516501


FYI, perltidy now has a built-in -pbp flag:

[from perltidy-20070508]
> -pbp, --perl-best-practices
> -pbp is an abbreviation for the parameters in the book Perl Best  
> Practices by Damian Conway:
>
>     -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1  
> -nsfs -nolq
>     -wbb="% + - * / x != == >= <= =~ !~ < > | & =
>           **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x="
> Note that the -st and -se flags make perltidy act as a filter on  
> one file only. These can be overridden with -nst and -nse if  
> necessary.
>
[full docs here:
http://search.cpan.org/~shancock/Perl-Tidy-20070508/bin/perltidy]


Dave


From dmessina at wustl.edu  Mon Jun 18 11:04:10 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 18 Jun 2007 10:04:10 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID: <C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>

Awesome, Sendu! Really glad you implemented this.


> Can anyone offer a
> way to systematically find at least the test scripts which access the
> internet, if not the specific tests within?

I think tests would be accessing the net indirectly through a BioPerl  
module (which may also be using indirect access), so it'd be hard to  
come up with a universal glob for that.

However:

	% grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l
	108

	% ls -1 bioperl-live/t | wc -l
	248

Less than half of the test files use BIOPERLDEBUG, so that narrows  
down the possibilities...

Dave


From bix at sendu.me.uk  Mon Jun 18 11:09:19 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 16:09:19 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
Message-ID: <4676A01F.30205@sendu.me.uk>

David Messina wrote:
>> Can anyone offer a
>> way to systematically find at least the test scripts which access the
>> internet, if not the specific tests within?
> 
> I think tests would be accessing the net indirectly through a BioPerl 
> module (which may also be using indirect access), so it'd be hard to 
> come up with a universal glob for that.
> 
> However:
> 
>     % grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l
>     108
> 
>     % ls -1 bioperl-live/t | wc -l
>     248
> 
> Less than half of the test files use BIOPERLDEBUG, so that narrows down 
> the possibilities...

Not necessarily. The problem is that there may be test scripts that have 
never even tried to skip network tests, and therefore don't use 
BIOPERLDEBUG. (Or that chose their own way to decide when to skip.)
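
As a rough first pass at those unguarded scripts, something like the
Perl sketch below could split the .t files into ones that mention
BIOPERLDEBUG and ones that do not but still look as if they touch the
network. The list of hint patterns is purely an assumption for
illustration and would need extending; a script that reaches the net
only indirectly through some other module would still slip through.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical hints that a script may use the network; extend as needed.
my @NET_HINTS = ( qr/\bLWP::/, qr/\bBio::DB::GenBank\b/, qr{\bhttp://} );

# Return (\@guarded, \@suspects) for the *.t files found under $dir:
# guarded  = mentions BIOPERLDEBUG
# suspects = no BIOPERLDEBUG, but matches one of the hints above
sub classify_tests {
    my ($dir) = @_;
    my ( @guarded, @suspects );
    find(
        {   no_chdir => 1,
            wanted   => sub {
                my $file = $_;
                return unless -f $file && $file =~ /\.t\z/;
                open my $fh, '<', $file or die "Cannot read $file: $!";
                my $text = do { local $/; <$fh> };    # slurp whole file
                close $fh;
                $text = '' unless defined $text;
                if ( $text =~ /BIOPERLDEBUG/ ) {
                    push @guarded, $file;
                }
                elsif ( grep { $text =~ $_ } @NET_HINTS ) {
                    push @suspects, $file;
                }
            },
        },
        $dir
    );
    return ( [ sort @guarded ], [ sort @suspects ] );
}

# Usage: perl classify_tests.pl bioperl-live/t
if (@ARGV) {
    my ( $guarded, $suspects ) = classify_tests( $ARGV[0] );
    print "Uses BIOPERLDEBUG:\n";
    print "  $_\n" for @$guarded;
    print "No BIOPERLDEBUG but looks networked:\n";
    print "  $_\n" for @$suspects;
}
```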

I was thinking along the lines of, does anyone know how to monitor 
accesses to the network card (or equivalent), getting information on 
which program (test script) requested the access?


From cjfields at uiuc.edu  Mon Jun 18 11:41:28 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 10:41:28 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
	<467684FD.3080300@sendu.me.uk>
	<DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>
	<67E635BD-FC19-4046-949B-358B671299E6@wustl.edu>
Message-ID: <B3EDFCDD-0F3D-47C8-B3A8-A428F24B265A@uiuc.edu>


On Jun 18, 2007, at 9:54 AM, David Messina wrote:

> [Chris F]
>> Will do.  Maybe something that conforms to PBP; there's a PBP
>> perltidy config on perlmonks, along with some emacs/vim related bits:
>>
>> http://www.perlmonks.org/?node_id=516501
>
>
> FYI, perltidy now has a built-in -pbp flag:
>
> [from perltidy-20070508]
>> -pbp, --perl-best-practices
>> -pbp is an abbreviation for the parameters in the book Perl Best
>> Practices by Damian Conway:
>>
>>     -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1
>> -nsfs -nolq
>>     -wbb="% + - * / x != == >= <= =~ !~ < > | & =
>>           **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x="
>> Note that the -st and -se flags make perltidy act as a filter on
>> one file only. These can be overridden with -nst and -nse if
>> necessary.
>>
> [full docs here:
> http://search.cpan.org/~shancock/Perl-Tidy-20070508/bin/perltidy]
>
>
> Dave

<slaps head>  Makes sense that would eventually be incorporated.

If so there's no need to include a config (unless we want to stray  
from PBP-style).  We can just recommend everyone use that setting.

chris


From cjfields at uiuc.edu  Mon Jun 18 12:06:26 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 11:06:26 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676A01F.30205@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
Message-ID: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>


On Jun 18, 2007, at 10:09 AM, Sendu Bala wrote:

> David Messina wrote:
>>> ...
>> Less than half of the test files use BIOPERLDEBUG, so that narrows  
>> down
>> the possibilities...
>
> Not necessarily. The problem is that there may be test scripts that  
> have
> never even tried to skip network tests, and therefore don't use
> BIOPERLDEBUG. (Or that chose their own way to decide when to skip.)
>
> I was thinking along the lines of, does anyone know how to monitor
> accesses to the network card (or equivalent), getting information on
> which program (test script) requested the access?

EUtilities.t uses network tests predominately.  I'll switch over when  
I commit everything from the overhaul.

Couldn't you enable BIOPERLDEBUG, disable network access, then  
iterate through tests checking for those which fail or skip?  I think  
Test::Harness has a way to do this, using execute_tests().
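
A rough sketch of that idea, assuming Test::Harness's execute_tests(),
which returns a totals hash and a hash of failed scripts keyed by file
name. The function name failing_scripts and the t/*.t glob are only
illustrative, and the network would have to be disabled outside the
script before running it.

```perl
use strict;
use warnings;
use Test::Harness ();

# Run the given test scripts and return the names of those that failed,
# plus the number of scripts that skipped themselves entirely.
# Meant to be run while the machine has no network access.
sub failing_scripts {
    my (@test_files) = @_;
    local $ENV{BIOPERLDEBUG} = 1;    # enable BIOPERLDEBUG as suggested above
    my ( $total, $failed ) =
        Test::Harness::execute_tests( tests => \@test_files );
    return ( [ sort keys %$failed ], $total->{skipped} || 0 );
}

# Usage: perl find_network_tests.pl t/*.t
if (@ARGV) {
    my ( $failed, $skipped ) = failing_scripts(@ARGV);
    print "Failed without network:\n";
    print "  $_\n" for @$failed;
    print "Scripts skipped entirely: $skipped\n";
}
```

Scripts that fail for reasons unrelated to the network would show up
too, so the list is a set of candidates to inspect, not a final answer.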

chris


From bix at sendu.me.uk  Mon Jun 18 12:34:38 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 17:34:38 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
	<082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
Message-ID: <4676B41E.3050706@sendu.me.uk>

Chris Fields wrote:
> Couldn't you enable BIOPERLDEBUG, disable network access, then iterate 
> through tests checking for those which fail or skip?

Yes, good idea, though my dev machine is also my email/webserver so I'd 
rather come up with an alternate solution than one involving 'disable 
network access'.

Still, that's what I'll probably end up doing. Cheers!


Oh, Chris, Spiros, how goes the Test::More conversion? I might want to 
wait for you to finish, or join in? If you're not going to have time to 
do any more in the next few weeks, can you please update 
http://www.bioperl.org/wiki/TestMoreProgress removing your name (or in 
the opposite case, add your name in)? It's not quite clear to me which 
tests are assigned to whom. Can someone clarify what the markings mean?

Cheers,
Sendu.


From cjfields at uiuc.edu  Mon Jun 18 12:43:31 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 11:43:31 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676B41E.3050706@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
	<082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
	<4676B41E.3050706@sendu.me.uk>
Message-ID: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>


On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> Couldn't you enable BIOPERLDEBUG, disable network access, then  
>> iterate through tests checking for those which fail or skip?
>
> Yes, good idea, though my dev machine is also my email/webserver so  
> I'd rather come up with an alternate solution than one involving  
> 'disable network access'.
>
> Still, that's what I'll probably end up doing. Cheers!
>
>
> Oh, Chris, Spiros, how goes the Test::More conversion? I might want  
> to wait for you to finish, or join in? If you're not going to have  
> time to do any more in the next few weeks, can you please update  
> http://www.bioperl.org/wiki/TestMoreProgress removing your name (or  
> in the opposite case, add your name in)? It's not quite clear to me  
> which tests are assigned to whom. Can someone clarify what the  
> markings mean?
>
> Cheers,
> Sendu.

Not sure how far along spiros is; I handed it over after I finished  
up to the 'Q' tests.  In general the ones marked out have been  
converted over, ones with names next to them have been claimed.  If  
you need help I'll prob. start back up again to finish them off; we  
just need to divvy them up.

chris


From george.heller at yahoo.com  Mon Jun 18 13:07:59 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 10:07:59 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net>
Message-ID: <218165.62089.qm@web56505.mail.re3.yahoo.com>

What exactly is the "node n" in the query below? When I issue this query, it says, 
   
  relation "node" does not exist.
   
  I tried to use the get_all_Descendents method but it looks like in order to do a recursive call it calls the method each_Descendent. This method is not implemented in Bio::DB::Taxonomy. It just has a single line,
   
  shift->throw_not_implemented();
   
  Thanks.
  George.

Hilmar Lapp <hlapp at gmx.net> wrote:
  I'm a bit confused - it sounds like you have set up a local BioSQL 
database and loaded the NCBI taxonomy into the database. You can now 
use simple SQL to retrieve all descendants of a node in the tree 
given its NCBI taxonID such as

SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
WHERE
n.ncbi_taxon_id = :taxonID
AND tn.left_value > n.left_value
AND tn.right_value < n.right_value
AND tn.taxon_id = tnm.taxon_id
AND tn.name_class = 'scientific_name'
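
For readers unfamiliar with the left_value/right_value columns: they
encode a nested set, in which every descendant's interval lies strictly
inside its ancestor's. A toy Perl sketch of the same containment test
follows, using made-up ids and bounds, not real taxonomy rows.
Assuming the standard BioSQL schema, where those columns and
ncbi_taxon_id all live in the taxon table, the "node n" alias in the
query presumably stands for a second reference to taxon.

```perl
use strict;
use warnings;

# Toy rows with invented ids and nested-set bounds; not real taxonomy data.
my @taxa = (
    { id => 1, left => 1,  right => 12 },
    { id => 2, left => 2,  right => 9 },
    { id => 3, left => 10, right => 11 },
    { id => 4, left => 3,  right => 4 },
    { id => 5, left => 5,  right => 8 },
    { id => 6, left => 6,  right => 7 },
);

# Ids of all rows whose interval lies strictly inside that of row $id,
# mirroring "tn.left_value > n.left_value AND tn.right_value < n.right_value".
sub nested_set_descendants {
    my ( $rows, $id ) = @_;
    my ($node) = grep { $_->{id} == $id } @$rows;
    return () unless $node;
    return map { $_->{id} }
        grep { $_->{left} > $node->{left} && $_->{right} < $node->{right} }
        @$rows;
}

print join( ',', nested_set_descendants( \@taxa, 2 ) ), "\n";
```

The point of the encoding is that one range comparison finds the whole
subtree, with no recursion needed.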

BioPerl doesn't have a Taxonomy::biosql module yet (though this would 
seem like a worthwhile thing to add), so you can't use the 
Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the 
NCBI taxonomy database and indexes it, so you can simply use 
Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download 
to achieve what you wanted to do in less than 5 lines of perl.

Although the recursive implementation of 
Taxonomy::get_all_Descendants() won't be lightning fast, it may still 
be perfectly fine for your application - are you sure it is not?

-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

> Thanks. And how can I assign the $node here in the below code, such 
> that I can reference it to a particular taxon id record? I want to 
> retrieve all the descendents from the taxonomy hierarchy, given a 
> particular taxon id.
>
> I have a local db setup, in which I have uploaded data using the 
> load_ncbi_taxonomy.pl script.
>
> Thanks.
> George
>
> Jason Stajich wrote:
> I assume you already figured out how to setup a local taxonomydb?
>
>
> You just want the extant species/leaves of the tree
>
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>
>
> -jason
> On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
> Hi all,
>
>
> Can anyone point me to some example that uses the 
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at 
> this, and I am not quite sure how to implement it.
>
>
> Thanks.
> George
>
>
> Sendu Bala wrote:
> George Heller wrote:
> Hi all,
>
>
> I am looking at extracting the taxonomy hierarchy for some taxon 
> ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
>
>
> Any ideas on the way I can go about doing this?
>
>
> Well, you'll use Bio::DB::Taxonomy presumably, and 
> each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
>
>
> If you happen to code up something neat and efficient, why not 
> share it
> with us and we could add it to the Taxonomy module(s).
>
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>
>
>
>
>
>

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================




From jason at bioperl.org  Mon Jun 18 13:53:28 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 10:53:28 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com>
References: <218165.62089.qm@web56505.mail.re3.yahoo.com>
Message-ID: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>

It is implemented in the implementing class - DB::Taxonomy is just  
the base class. For example see the flatfile implementation  
Bio::DB::Taxonomy::flatfile

See scripts/taxa/local_taxonomydb_query.PLS for an example using it;
the nodes and names files are from the NCBI taxonomy database.

Here is an un-debugged copy+paste for your question that *should* work.

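use Bio::DB::Taxonomy;
my $idx_dir = '/tmp';    # directory where the flatfile index is written

# nodes.dmp and names.dmp come from the NCBI taxonomy dump
my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');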
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                                -nodesfile => $nodesfile,
                                -namesfile => $namesfile,
                                -directory => $idx_dir);
  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
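
For anyone curious what a call like get_all_Descendents amounts to,
here is a toy sketch of the recursive walk over a plain hash instead of
a taxonomy database. The ids and the function names all_descendants and
leaf_descendants are invented for illustration and are not BioPerl API.

```perl
use strict;
use warnings;

# Toy parent => [children] map with made-up ids; not real NCBI data.
my %children_of = (
    1 => [ 2, 3 ],
    2 => [ 4, 5 ],
    3 => [],
    4 => [],
    5 => [6],
    6 => [],
);

# Depth-first walk: each child, followed by that child's own descendants.
sub all_descendants {
    my ( $tree, $id ) = @_;
    my @found;
    for my $child ( @{ $tree->{$id} || [] } ) {
        push @found, $child, all_descendants( $tree, $child );
    }
    return @found;
}

# Leaves are the descendants that have no children of their own.
sub leaf_descendants {
    my ( $tree, $id ) = @_;
    return grep { !@{ $tree->{$_} || [] } } all_descendants( $tree, $id );
}

print join( ',', all_descendants( \%children_of, 1 ) ),  "\n";
print join( ',', leaf_descendants( \%children_of, 1 ) ), "\n";
```

The grep in leaf_descendants plays the same role as the is_Leaf filter
in the snippet above.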


-jason

On Jun 18, 2007, at 10:07 AM, George Heller wrote:

> What exactly is the "node n" in the query below? When I issue this  
> query, it says,
>
>   relation "node" does not exist.
>
>   I tried to use the get_all_Descendents method but it looks like  
> in order to do a recursive call it calls the method  
> each_Descendent. This method is not implemented in  
> Bio::DB::Taxonomy. It just has a single line,
>
>   shift->throw_not_implemented();
>
>   Thanks.
>   George.
>
> Hilmar Lapp <hlapp at gmx.net> wrote:
>   I'm a bit confused - it sounds like you have set up a local BioSQL
> database and loaded the NCBI taxonomy into the database. You can now
> use simple SQL to retrieve all descendants of a node in the tree
> given its NCBI taxonID such as
>
> SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
> WHERE
> n.ncbi_taxon_id = :taxonID
> AND tn.left_value > n.left_value
> AND tn.right_value < n.right_value
> AND tn.taxon_id = tnm.taxon_id
> AND tn.name_class = 'scientific_name'
>
> BioPerl doesn't have a Taxonomy::biosql module yet (though this would
> seem like a worthwhile thing to add), so you can't use the
> Bio::DB::Taxonomy interface to do this against a BioSQL instance.
>
> However, BioPerl does have support for the flat-file download of the
> NCBI taxonomy database and indexes it, so you can simply use
> Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download
> to achieve what you wanted to do in less than 5 lines of perl.
>
> Although the recursive implementation of
> Taxonomy::get_all_Descendants() won't be lightning fast, it may still
> be perfectly fine for your application - are you sure it is not?
>
> -hilmar
>
> On Jun 18, 2007, at 12:21 AM, George Heller wrote:
>
>> Thanks. And how can I assign the $node here in the below code, such
>> that I can reference it to a particular taxon id record? I want to
>> retrieve all the descendents from the taxonomy hierarchy, given a
>> particular taxon id.
>>
>> I have a local db setup, in which I have uploaded data using the
>> load_ncbi_taxonomy.pl script.
>>
>> Thanks.
>> George
>>
>> Jason Stajich wrote:
>> I assume you already figured out how to setup a local taxonomydb?
>>
>>
>> You just want the extant species/leaves of the tree
>>
>>
>> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>
>>
>>
>> -jason
>> On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>>
>> Hi all,
>>
>>
>> Can anyone point me to some example that uses the
>> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
>> this, and I am not quite sure how to implement it.
>>
>>
>> Thanks.
>> George
>>
>>
>> Sendu Bala wrote:
>> George Heller wrote:
>> Hi all,
>>
>>
>> I am looking at extracting the taxonomy hierarchy for some taxon
>> ids.
>> What I plan to do is, for a given taxon id, say 33090, I want to
>> extract all taxon ids that are children of this species. I do not
>> just want the immediate children, but the children's children and so
>> on.
>>
>>
>> Any ideas on the way I can go about doing this?
>>
>>
>> Well, you'll use Bio::DB::Taxonomy presumably, and
>> each_Descendent in
>> some kind of looping structure. Most easily a recursing sub.
>>
>>
>> If you happen to code up something neat and efficient, why not
>> share it
>> with us and we could add it to the Taxonomy module(s).
>>
>>
>>
>>
>>
>>
>>
>>
>> --
>> Jason Stajich
>> jason at bioperl.org
>> http://jason.open-bio.org/
>>
>>
>>
>>
>>
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>
>
>
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From hlapp at gmx.net  Mon Jun 18 18:10:00 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:10:00 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<4676880C.9030009@sendu.me.uk>
	<278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>
Message-ID: <989DBD68-896E-4FB9-9413-4A1060E88ABD@gmx.net>

https is working fine for me for sf.net repositories, and I only have  
to enter the password upon first commit (since checkout doesn't even  
need a password).

	-hilmar

On Jun 18, 2007, at 10:24 AM, Chris Fields wrote:

> Not sure how Jason/Hilmar/Chris D. feel about https or supporting  
> both https+ssh

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From george.heller at yahoo.com  Mon Jun 18 18:18:21 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 15:18:21 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>
Message-ID: <904670.24974.qm@web56513.mail.re3.yahoo.com>

I tried running the script below, and I get the following error:

Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
Compilation failed in require at my.pl line 7.
BEGIN failed--compilation aborted at my.pl line 7.

My script looks something like this:

#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;

my $idx_dir = '/tmp';    # directory where the flatfile index is written

my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodesfile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
# keep only the leaves below the starting node
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

foreach my $field (@extant_children) {
    print "$field";
    print "|";
    print "\n";
}

  And I am running the script using the command,
   
  perl myscript.pl -v --names names.dmp --nodes nodes.dmp
   
  and I have the nodes.dmp and names.dmp files in the current directory.
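
A side note on the command line above: the script as posted hard-codes the file names and never reads -v, --names or --nodes. A minimal sketch of how those flags could be parsed with the core Getopt::Long module is below. The parse_args helper and the default file names are assumptions for illustration, not part of the original script, and GetOptionsFromArray needs a reasonably recent Getopt::Long (an older install would call GetOptions on @ARGV instead).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse -v, --names FILE and --nodes FILE from a list of arguments.
# The defaults below are assumptions chosen to match the file names
# mentioned in the thread.
sub parse_args {
    my @argv = @_;
    my %opt = (names => 'names.dmp', nodes => 'nodes.dmp', verbose => 0);
    GetOptionsFromArray(
        \@argv,
        'names=s'   => \$opt{names},
        'nodes=s'   => \$opt{nodes},
        'v|verbose' => \$opt{verbose},
    ) or die "usage: $0 [-v] [--names FILE] [--nodes FILE]\n";
    return \%opt;
}

my $opt = parse_args(@ARGV);
print "names file: $opt->{names}\n";
print "nodes file: $opt->{nodes}\n";
print "verbose:    $opt->{verbose}\n";
```

The two file names could then be passed on as the -namesfile and -nodesfile arguments instead of the hard-coded values.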
   
  Thanks,
  George
  

Jason Stajich <jason at bioperl.org> wrote:
  It is implemented in the implementing class - DB::Taxonomy is just the base class. For example see the flatfile implementation Bio::DB::Taxonomy::flatfile  

  See the scripts/taxa/local_taxonomydb_query.PLS for example using it:
  nodes and names are from NCBI taxonomy database.
  

  Here is an un-debugged copy+paste for your question that *should* work.
  

  use Bio::DB::Taxonomy;
  my $idx_dir = '/tmp';

  my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
  my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                                 -nodesfile => $nodesfile,
                                 -namesfile => $namesfile,
                                 -directory => $idx_dir);
  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

  -jason

    On Jun 18, 2007, at 10:07 AM, George Heller wrote:

    What exactly is the "node n" in the query below. When I issue this query, it says, 
  

    relation "node" does not exist.
  

    I tried to use the get_all_Descendents method but it looks like in order to do a recursive call it calls the method each_Descendent. This method is not implemented in Bio::DB::Taxonomy. It just has a single line,
  

    shift->throw_not_implemented();
  

    Thanks.
    George.
  

  Hilmar Lapp <hlapp at gmx.net> wrote:
    I'm a bit confused - it sounds like you have set up a local BioSQL 
  database and loaded the NCBI taxonomy into the database. You can now 
  use simple SQL to retrieve all descendants of a node in the tree 
  given its NCBI taxonID such as
  

  SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
  WHERE
  n.ncbi_taxon_id = :taxonID
  AND tn.left_value > n. left_value
  AND tn.right_value < n.right_value
  AND tn.taxon_id = tnm.taxon_id
  AND tn.name_class = 'scientific_name'
  

  BioPerl doesn't have a Taxonomy::biosql module yet (though this would 
  seem like a worthwhile thing to add), so you can't use the 
  Bio::DB::Taxonomy interface to do this against a BioSQL instance.
  

  However, BioPerl does have support for the flat-file download of the 
  NCBI taxonomy database and indexes it, so you can simply use 
  Taxonomy::{get_taxon,get_all_Descendents} using the flatfile download 
  to achieve what you wanted to do in fewer than 5 lines of Perl.
  

  Although the recursive implementation of Taxonomy::get_all_Descendents()
  won't be lightning fast, it may still be perfectly fine for your 
  application - are you sure it is not?
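
As a rough illustration of the recursive approach, here is a self-contained sketch that does not use BioPerl at all. It builds a parent-to-children map from lines in the NCBI nodes.dmp layout (tax id, parent tax id, rank, separated by tab-pipe-tab) and walks it recursively. The taxon ids in the toy data are invented, and build_children and all_descendants are hypothetical helper names, not BioPerl methods.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a parent => [children] map from nodes.dmp-style lines.
sub build_children {
    my @lines = @_;
    my %children;
    for my $line (@lines) {
        my ($id, $parent) = split /\s*\|\s*/, $line;
        # skip malformed lines and the root, which is its own parent
        next if !defined $parent || $id eq $parent;
        push @{ $children{$parent} }, $id;
    }
    return \%children;
}

# Recursively collect every descendant of $id (children, grandchildren, ...).
sub all_descendants {
    my ($children, $id) = @_;
    my @found;
    for my $kid (@{ $children->{$id} || [] }) {
        push @found, $kid, all_descendants($children, $kid);
    }
    return @found;
}

# Toy data with made-up ids, in the nodes.dmp field layout.
my @toy = (
    "1\t|\t1\t|\tno rank\t|",
    "10\t|\t1\t|\tkingdom\t|",
    "20\t|\t10\t|\tgenus\t|",
    "21\t|\t20\t|\tspecies\t|",
    "22\t|\t20\t|\tspecies\t|",
    "30\t|\t1\t|\tkingdom\t|",
);

my $children = build_children(@toy);
my @desc     = all_descendants($children, 10);
# a leaf is any node that never appears as a parent
my @leaves   = grep { !exists $children->{$_} } @desc;

print "descendants of 10: @desc\n";
print "leaves below 10:   @leaves\n";
```

For a tree the size of the full NCBI taxonomy the recursion is simple rather than fast, which is the trade-off described above.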
  

  -hilmar
  

  On Jun 18, 2007, at 12:21 AM, George Heller wrote:
  

    Thanks. And how can I assign the $node here in the below code, such 
  that I can reference it to a particular taxon id record? I want to 
  retrieve all the descendents from the taxonomy hierarchy, given a 
  particular taxon id.
  

  I have a local db setup, in which I have uploaded data using the 
  load_ncbi_taxonomy.pl script.
  

  Thanks.
  George
  

  Jason Stajich wrote:
  I assume you already figured out how to setup a local taxonomydb?
  

  You just want the extant species/leaves of the tree
  

  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

  -jason
  On Jun 17, 2007, at 11:41 AM, George Heller wrote:
  

  Hi all,
  

  Can anyone point me to some example that uses the 
  get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at 
  this, and I am not quite sure how to implement it.
  

  Thanks.
  George
  

  Sendu Bala wrote:
  George Heller wrote:
  Hi all,
  

  I am looking at extracting the taxonomy hierarchy for some taxon 
  ids.
  What I plan to do is, for a given taxon id, say 33090, I want to
  extract all taxon ids that are children of this species. I do not
  just want the immediate children, but the children's children and so
  on.
  

  Any ideas on the way I can go about doing this?
  

  Well, you'll use Bio::DB::Taxonomy presumably, and 
  each_Descendent in
  some kind of looping structure. Most easily a recursing sub.
  

  If you happen to code up something neat and efficient, why not 
  share it
  with us and we could add it to the Taxonomy module(s).
  

  --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/
  

  -- 
  ===========================================================
  : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
  ===========================================================
  


    --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/




From hlapp at gmx.net  Mon Jun 18 18:27:19 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:27:19 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com>
References: <218165.62089.qm@web56505.mail.re3.yahoo.com>
Message-ID: <DEB0D23B-4FEC-418A-8AAB-FF4CBF4DAF65@gmx.net>


On Jun 18, 2007, at 1:07 PM, George Heller wrote:

> What exactly is the "node n" in the query below. When I issue this  
> query, it says,

Sorry, replace "node n" with "taxon n". Jason answered the rest.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 18 18:33:40 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 17:33:40 -0500
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <904670.24974.qm@web56513.mail.re3.yahoo.com>
References: <904670.24974.qm@web56513.mail.re3.yahoo.com>
Message-ID: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>

As the error implies, your local version of perl doesn't seem to support
weak references, which means it doesn't have Scalar::Util (which was
added to core after perl 5.6.1, I think).  Try installing
Scalar::Util to see what happens.
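
One way to check this before reinstalling anything is a short script along the lines below. It is only a sketch: the weaken_error helper name is made up here, and all it does is try to weaken a reference with whatever Scalar::Util (the module name has no trailing "s") this perl picks up, then report which file was loaded.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return '' if weak references work, otherwise the error text.
sub weaken_error {
    my $ok = eval {
        require Scalar::Util;
        my $target = {};
        my $ref    = $target;
        Scalar::Util::weaken($ref);    # dies if weaken is unavailable
        die "reference was not weakened\n"
            unless Scalar::Util::isweak($ref);
        1;
    };
    return $ok ? '' : ($@ || 'unknown error');
}

my $err = weaken_error();
print "perl version: $]\n";
print "Scalar::Util: ", ($INC{'Scalar/Util.pm'} || 'not loaded'), "\n";
print $err eq '' ? "weaken: ok\n" : "weaken: FAILED ($err)\n";
```

If the reported path is under site_perl rather than the core library directory, a separately installed copy of the module is the one being loaded.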

chris

On Jun 18, 2007, at 5:18 PM, George Heller wrote:

> I tried running the below mentioned script and I seem to be getting  
> the following error:
>
>   Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
> Compilation failed in require at my.pl line 7.
> BEGIN failed--compilation aborted at my.pl line 7.
>
>   My script looks something like,
>
>   #!/usr/bin/perl
>   use strict;
>   #use warnings;
>   use DBI;
>   use Bio::Tree::Node;
>   use Bio::DB::Taxonomy;
>   use Bio::DB::Taxonomy::flatfile;
>
>   my $idx_dir = '/tmp';
>
>   my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
>   my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
>                                  -nodesfile => $nodesfile,
>                                  -namesfile => $namesfile,
>                                  -directory => $idx_dir);
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>   foreach my $field (@extant_children) {
>       print "$field";
>       print "|";
>       print "\n";
>   }
>
>   And I am running the script using the command,
>
>   perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>
>   and I have the nodes.dmp and names.dmp files in the current  
> directory.
>
>   Thanks,
>   George

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Mon Jun 18 18:50:38 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:50:38 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>
References: <904670.24974.qm@web56513.mail.re3.yahoo.com>
	<707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>
Message-ID: <F433CCB4-781D-480E-8EF5-CD68E70B27B8@gmx.net>

The perl version appears to be 5.8.5 though, so something strange  
appears to be going on too.

George, can you please post the output of

	$ /usr/bin/perl -V

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:

> As the error implies, your local version of perl doesn't seem to support
> weak references, which means it doesn't have Scalar::Util (which was
> added to core after perl 5.6.1, I think).  Try installing
> Scalar::Util to see what happens.
>
> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From george.heller at yahoo.com  Mon Jun 18 19:05:42 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 16:05:42 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <F433CCB4-781D-480E-8EF5-CD68E70B27B8@gmx.net>
Message-ID: <706979.34648.qm@web56509.mail.re3.yahoo.com>

This is the output of /usr/bin/perl -V

Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
  Platform:
    osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
    uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.3.4'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
  
Characteristics of this binary (from libperl):
  Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
  Built under linux
  Compiled at Jul 24 2006 18:28:10
  @INC:
    /usr/lib/perl5/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/5.8.5
    /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.5
    /usr/lib/perl5/site_perl/5.8.4
    /usr/lib/perl5/site_perl/5.8.3
    /usr/lib/perl5/site_perl/5.8.2
    /usr/lib/perl5/site_perl/5.8.1
    /usr/lib/perl5/site_perl/5.8.0
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.5
    /usr/lib/perl5/vendor_perl/5.8.4
    /usr/lib/perl5/vendor_perl/5.8.3
    /usr/lib/perl5/vendor_perl/5.8.2
    /usr/lib/perl5/vendor_perl/5.8.1
    /usr/lib/perl5/vendor_perl/5.8.0
    /usr/lib/perl5/vendor_perl
   
  Thanks.
  George

Hilmar Lapp <hlapp at gmx.net> wrote:
  The perl version appears to be 5.8.5 though, so something strange 
appears to be going on too.

George, can you please post the output of

$ /usr/bin/perl -V

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:

> As the error implies, your local version of perl doesn't seem to support
> weak references, which means it doesn't have Scalar::Util (which was
> added to core after perl 5.6.1, I think). Try installing
> Scalar::Util to see what happens.
>
> chris
>
> On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>
>> I tried running the below mentioned script and I seem to be getting
>> the following error:
>>
>> Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
>> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
>> Compilation failed in require at my.pl line 7.
>> BEGIN failed--compilation aborted at my.pl line 7.
>>
>> My script looks something like,
>>
>> #!/usr/bin/perl
>> use strict;
>> #use warnings;
>> use DBI;
>> use Bio::Tree::Node;
>> use Bio::DB::Taxonomy;
>> use Bio::DB::Taxonomy::flatfile;
>>
>> my $idx_dir = '/tmp';
>>
>> my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
>> my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
>>                                -nodesfile => $nodesfile,
>>                                -namesfile => $namesfile,
>>                                -directory => $idx_dir);
>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>
>> foreach my $field (@extant_children) {
>>     print "$field";
>>     print "|";
>>     print "\n";
>> }
>>
>> And I am running the script using the command,
>>
>> perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>>
>> and I have the nodes.dmp and names.dmp files in the current
>> directory.
>>
>> Thanks,
>> George
>>
>>
>> Jason Stajich wrote:
>> It is implemented in the implementing class - DB::Taxonomy is
>> just the base class. For example see the flatfile implementation
>> Bio::DB::Taxonomy::flatfile
>>
>> See the scripts/taxa/local_taxonomydb_query.PLS for example using
>> it:
>> nodes and names are from NCBI taxonomy database.
>>
>>
>> Here is an un-debugged copy+paste for your question that *should*
>> work.
>>
>>
>> use Bio::DB::Taxonomy;
>> my $idx_dir = '/tmp';
>>
>>
>> my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
>> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>> -nodesfile => $nodesfile,
>> -namesfile => $namesfile,
>> -directory => $idx_dir);
>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>
>>
>>
>>
>> -jason
>>
>> On Jun 18, 2007, at 10:07 AM, George Heller wrote:
>>
>> What exactly is the "node n" in the query below. When I issue
>> this query, it says,
>>
>>
>> relation "node" does not exist.
>>
>>
>> I tried to use the get_all_Descendents method but it looks like
>> in order to do a recursive call it calls the method
>> each_Descendent. This method is not implemented in
>> Bio::DB::Taxonomy. It just has a single line,
>>
>>
>> shift->throw_not_implemented();
>>
>>
>> Thanks.
>> George.
>>
>>
>> Hilmar Lapp wrote:
>> I'm a bit confused - it sounds like you have set up a local 
>> BioSQL
>> database and loaded the NCBI taxonomy into the database. You can 
>> now
>> use simple SQL to retrieve all descendants of a node in the tree
>> given its NCBI taxonID such as
>>
>>
>> SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
>> WHERE
>> n.ncbi_taxon_id = :taxonID
>> AND tn.left_value > n. left_value
>> AND tn.right_value < n.right_value
>> AND tn.taxon_id = tnm.taxon_id
>> AND tn.name_class = 'scientific_name'
>>
>>
>> BioPerl doesn't have a Taxonomy::biosql module yet (though this
>> would
>> seem like a worthwhile thing to add), so you can't use the
>> Bio::DB::Taxonomy interface to do this against a BioSQL instance.
>>
>>
>> However, BioPerl does have support for the flat-file download of the
>> NCBI taxonomy database and indexes it, so you can simply use
>> Taxonomy::{get_taxon,get_all_Descendents} using the flatfile download
>> to achieve what you wanted to do in fewer than 5 lines of perl.
>>
>>
>> Although the recursive implementation of Taxonomy::get_all_Descendents()
>> won't be lightning fast, it may still be perfectly fine for your
>> application - are you sure it is not?
>>
>>
>> -hilmar
>>
>>
>> On Jun 18, 2007, at 12:21 AM, George Heller wrote:
>>
>>
>> Thanks. And how can I assign the $node here in the below code,
>> such
>> that I can reference it to a particular taxon id record? I want to
>> retrieve all the descendents from the taxonomy hierarchy, given a
>> particular taxon id.
>>
>>
>> I have a local db setup, in which I have uploaded data using the
>> load_ncbi_taxonomy.pl script.
>>
>>
>> Thanks.
>> George
>>
>>
>> Jason Stajich wrote:
>> I assume you already figured out how to setup a local taxonomydb?
>>
>>
>>
>>
>> You just want the extant species/leaves of the tree
>>
>>
>>
>>
>> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>
>>
>>
>>
>>
>>
>> -jason
>> On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>>
>>
>> Hi all,
>>
>>
>>
>>
>> Can anyone point me to some example that uses the
>> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
>> this, and I am not quite sure how to implement it.
>>
>>
>>
>>
>> Thanks.
>> George
>>
>>
>>
>>
>> Sendu Bala wrote:
>> George Heller wrote:
>> Hi all,
>>
>>
>>
>>
>> I am looking at extracting the taxonomy hierarchy for some taxon
>> ids.
>> What I plan to do is, for a given taxon id, say 33090, I want to
>> extract all taxon ids that are children of this species. I do not
>> just want the immediate children, but the children's children 
>> and so
>> on.
>>
>>
>>
>>
>> Any ideas on the way I can go about doing this?
>>
>>
>>
>>
>> Well, you'll use Bio::DB::Taxonomy presumably, and
>> each_Descendent in
>> some kind of looping structure. Most easily a recursing sub.
>>
>>
>>
>>
>> If you happen to code up something neat and efficient, why not
>> share it
>> with us and we could add it to the Taxonomy module(s).
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================




From jason at bioperl.org  Mon Jun 18 19:22:08 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 16:22:08 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <706979.34648.qm@web56509.mail.re3.yahoo.com>
References: <706979.34648.qm@web56509.mail.re3.yahoo.com>
Message-ID: <C93DF7A1-20AC-4474-BBC6-0C2598406EEB@bioperl.org>

Try installing the latest Scalar::Util
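
Once it is installed, a quick way to confirm that weak references now work is something like the sketch below. It is only an illustration: the linked_pair helper and the hash keys are made up here, and the assumption is that Bio::Tree::Node wants weaken() for this kind of parent/child back-reference.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);   # this import is what fails on a build without weaken

# Hypothetical helper: build a parent/child pair that point at each other,
# roughly the way tree nodes do.
sub linked_pair {
    my $parent = { name => 'parent' };
    my $child  = { name => 'child', ancestor => $parent };
    $parent->{descendent} = $child;
    weaken($child->{ancestor});       # back-reference no longer keeps the parent alive
    return ($parent, $child);
}

my ($parent, $child) = linked_pair();
print "back-reference is weak: ", (isweak($child->{ancestor}) ? "yes" : "no"), "\n";

# Dropping the only strong reference should free the parent and
# turn the weak back-reference into undef.
undef $parent;
print "parent freed: ", (defined $child->{ancestor} ? "no" : "yes"), "\n";
```

If the use line itself dies with the same "Weak references are not implemented" message, the old Scalar::Util is still the one being loaded.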

On Jun 18, 2007, at 4:05 PM, George Heller wrote:

> This is the output of /usr/bin/perl -V
>
> Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
>   Platform:
>     osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
>     uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
>     config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
>     hint=recommended, useposix=true, d_sigaction=define
>     usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
>     useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
>     use64bitint=undef use64bitall=undef uselongdouble=undef
>     usemymalloc=n, bincompat5005=undef
>   Compiler:
>     cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
>     optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
>     cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
>     ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
>     intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
>     d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
>     ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
>     alignbytes=4, prototype=define
>   Linker and Libraries:
>     ld='gcc', ldflags =' -L/usr/local/lib'
>     libpth=/usr/local/lib /lib /usr/lib
>     libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
>     perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
>     libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
>     gnulibc_version='2.3.4'
>   Dynamic Linking:
>     dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
>     cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
>
> Characteristics of this binary (from libperl):
>   Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
>   Built under linux
>   Compiled at Jul 24 2006 18:28:10
>   @INC:
>     /usr/lib/perl5/5.8.5/i386-linux-thread-multi
>     /usr/lib/perl5/5.8.5
>     /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
>     /usr/lib/perl5/site_perl/5.8.5
>     /usr/lib/perl5/site_perl/5.8.4
>     /usr/lib/perl5/site_perl/5.8.3
>     /usr/lib/perl5/site_perl/5.8.2
>     /usr/lib/perl5/site_perl/5.8.1
>     /usr/lib/perl5/site_perl/5.8.0
>     /usr/lib/perl5/site_perl
>     /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
>     /usr/lib/perl5/vendor_perl/5.8.5
>     /usr/lib/perl5/vendor_perl/5.8.4
>     /usr/lib/perl5/vendor_perl/5.8.3
>     /usr/lib/perl5/vendor_perl/5.8.2
>     /usr/lib/perl5/vendor_perl/5.8.1
>     /usr/lib/perl5/vendor_perl/5.8.0
>     /usr/lib/perl5/vendor_perl
>
>   Thanks.
>   George
>     .
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From george.heller at yahoo.com  Mon Jun 18 20:04:00 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:04:00 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <C93DF7A1-20AC-4474-BBC6-0C2598406EEB@bioperl.org>
Message-ID: <424035.72876.qm@web56507.mail.re3.yahoo.com>

Ok, I installed the latest Scalar::Util and the script seems to be working. But I am confused about where exactly I need to look for the descendent taxon ids once the script has run. I did look in the /tmp/ directory, but I couldn't understand much.

Sorry to bother you; I really appreciate your patience.

Thanks.
George

Jason Stajich <jason at bioperl.org> wrote:
  Try installing the latest Scalar::Util  


---------------------------------
Building a website is a piece of cake. 
Yahoo! Small Business gives you all the tools to get online.


From jason at bioperl.org  Mon Jun 18 20:17:34 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 17:17:34 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <424035.72876.qm@web56507.mail.re3.yahoo.com>
References: <424035.72876.qm@web56507.mail.re3.yahoo.com>
Message-ID: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>

All the children are in the @extant_children array.

You get to decide what you want to do with them. In the following
example I print the id, rank, and scientific name of each one to the screen.
Because this is a taxonomy db query, you get back
Bio::Taxonomy::Taxon objects, so read the documentation for that
module to see what you can do with them.
I would also suggest spending a little time with the "Getting started"
and HOWTO:Trees documentation on the website to get familiar with the
objects and nomenclature.


my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

for my $child ( @extant_children ) {
   print "id is ", $child->id, "\n"; # NCBI taxa id
   print "rank is ", $child->rank, "\n"; # e.g. species
   print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}
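
To make the idea behind get_all_Descendents and the is_Leaf filter concrete, here is a minimal sketch in plain Perl. It is only an illustration, not the BioPerl implementation: the %children_of hash, the ids in it, and the all_descendants and is_leaf helpers are made up for this example and do not come from the NCBI taxonomy or from any BioPerl module.

```perl
use strict;
use warnings;

# Hypothetical toy tree: parent id => list of child ids (not real NCBI data).
my %children_of = (
    1 => [ 2, 3 ],
    2 => [ 4, 5 ],
    3 => [],
    4 => [],
    5 => [ 6 ],
    6 => [],
);

# Return every descendant of $id (children, grandchildren, ...), depth first.
sub all_descendants {
    my ($id) = @_;
    my @found;
    for my $child ( @{ $children_of{$id} || [] } ) {
        push @found, $child, all_descendants($child);
    }
    return @found;
}

# A leaf is a node with no children; this plays the role of is_Leaf.
sub is_leaf {
    my ($id) = @_;
    return !@{ $children_of{$id} || [] };
}

my @descendants = all_descendants(1);
my @leaves      = grep { is_leaf($_) } @descendants;

print "descendants: @descendants\n";
print "leaves: @leaves\n";
```

The recursion visits each child and then that child's own descendants, which is the "children's children and so on" behaviour asked about earlier in the thread; the grep then keeps only the nodes that have no children.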

On Jun 18, 2007, at 5:04 PM, George Heller wrote:

> Ok, I installed the latest of Scalar::Util and the script seems to  
> be working. But I am confused where exactly I need to look for the  
> descendent taxon ids once the script is run. I did look into the / 
> tmp/ directory, but I couldnt understand much.
>
>   Sorry to be bothering, really appreaciate your patience.
>
>   Thanks.
>   George
>
> Jason Stajich <jason at bioperl.org> wrote:
>   Try installing the latest Scalar::Util
>     On Jun 18, 2007, at 4:05 PM, George Heller wrote:
>
>     This is the output of /usr/bin/perl -V
>
>
>   Summary of my perl5 (revision 5 version 8 subversion 5)  
> configuration:
>     Platform:
>       osname=linux, osvers=2.6.9-22.18.bz155725.elsmp,  
> archname=i386-linux-thread-multi
>       uname='linux hs20-bc1-4.build.redhat.com  
> 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686  
> i686 i386 gnulinux '
>       config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 - 
> mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost - 
> Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. - 
> Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux - 
> Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads - 
> Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db - 
> Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio - 
> Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/ 
> less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
>       hint=recommended, useposix=true, d_sigaction=define
>       usethreads=define use5005threads=undef useithreads=define  
> usemultiplicity=define
>       useperlio=define d_sfio=undef uselargefiles=define  
> usesocks=undef
>       use64bitint=undef use64bitall=undef uselongdouble=undef
>       usemymalloc=n, bincompat5005=undef
>     Compiler:
>       cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING - 
> fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE - 
> D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
>       optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
>       cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict- 
> aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
>       ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)',  
> gccosandvers=''
>       intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
>       d_longlong=define, longlongsize=8, d_longdbl=define,  
> longdblsize=12
>       ivtype='long', ivsize=4, nvtype='double', nvsize=8,  
> Off_t='off_t', lseeksize=8
>       alignbytes=4, prototype=define
>     Linker and Libraries:
>       ld='gcc', ldflags =' -L/usr/local/lib'
>       libpth=/usr/local/lib /lib /usr/lib
>       libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil - 
> lpthread -lc
>       perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
>       libc=/lib/libc-2.3.4.so, so=so, useshrplib=true,  
> libperl=libperl.so
>       gnulibc_version='2.3.4'
>     Dynamic Linking:
>       dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='- 
> Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
>       cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
>
>
>   Characteristics of this binary (from libperl):
>     Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS  
> USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
>     Built under linux
>     Compiled at Jul 24 2006 18:28:10
>     @INC:
>       /usr/lib/perl5/5.8.5/i386-linux-thread-multi
>       /usr/lib/perl5/5.8.5
>       /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
>       /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
>       /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
>       /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
>       /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
>       /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
>       /usr/lib/perl5/site_perl/5.8.5
>       /usr/lib/perl5/site_perl/5.8.4
>       /usr/lib/perl5/site_perl/5.8.3
>       /usr/lib/perl5/site_perl/5.8.2
>       /usr/lib/perl5/site_perl/5.8.1
>       /usr/lib/perl5/site_perl/5.8.0
>       /usr/lib/perl5/site_perl
>       /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
>       /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
>       /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
>       /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
>       /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
>       /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
>       /usr/lib/perl5/vendor_perl/5.8.5
>       /usr/lib/perl5/vendor_perl/5.8.4
>       /usr/lib/perl5/vendor_perl/5.8.3
>       /usr/lib/perl5/vendor_perl/5.8.2
>       /usr/lib/perl5/vendor_perl/5.8.1
>       /usr/lib/perl5/vendor_perl/5.8.0
>       /usr/lib/perl5/vendor_perl
>
>
>     Thanks.
>     George
>       .
>
>
>   Hilmar Lapp <hlapp at gmx.net> wrote:
>     The perl version appears to be 5.8.5 though, so something strange
>   appears to be going on too.
>
>
>   George, can you please post the output of
>
>
>   $ /usr/bin/perl -V
>
>
>   -hilmar
>
>
>   On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
>
>
>     As the error implies your local version of perl doesn't seem  
> support
>   weak references, which means it doesn't have Scalar::Utils (which  
> was
>   added to core after perl 5.6.1, I think). Try installing
>   Scalar::Utils to see what happens.
>
>
>   chris
>
>
>   On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>
>
>     I tried running the below mentioned script and I seem to be  
> getting
>   the following error:
>
>
>   Weak references are not implemented in the version of perl at /
>   usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
>   BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/
>   Bio/Tree/Node.pm line 76.
>   Compilation failed in require at my.pl line 7.
>   BEGIN failed--compilation aborted at my.pl line 7.
>
>
>   My script looks something like,
>
>
>   #!/usr/bin/perl
>   use strict;
>   #use warnings;
>   use DBI;
>   use Bio::Tree::Node;
>   use Bio::DB::Taxonomy;
>   use Bio::DB::Taxonomy::flatfile;
>   my $idx_dir = '/tmp';
>
>
>   my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
>   my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>   -nodesfile => $nodesfile,
>   -namesfile => $namesfile,
>   -directory => $idx_dir);
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $node-
>     get_all_Descendents;
>
>
>   foreach $field (@extant_children) {
>   print "$field";
>   print "|";
>   print "\n";
>   }
>
>
>   And I am running the script using the command,
>
>
>   perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>
>
>   and I have the nodes.dmp and names.dmp files in the current
>   directory.
>
>
>   Thanks,
>   George
>
>
>
>
>   Jason Stajich wrote:
>   It is implemented in the implementing class - DB::Taxonomy is
>   just the base class. For example see the flatfile implementation
>   Bio::DB::Taxonomy::flatfile
>
>
>   See the scripts/taxa/local_taxonomydb_query.PLS for example using
>   it:
>   nodes and names are from NCBI taxonomy database.
>
>
>
>
>   Here is an un-debugged copy+paste for your question that *should*
>   work.
>
>
>
>
>   use Bio::DB::Taxonomy
>   my $idx_dir = '/tmp';
>
>
>
>
>   my ($nodefile,$namesfile) = ('nodes.dmp,'names.dmp');
>   my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>   -nodesfile => $nodesfile,
>   -namesfile => $namesfile,
>   -directory => $idx_dir);
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $node-
>     get_all_Descendents;
>
>
>
>
>
>
>
>
>   -jason
>
>
>   On Jun 18, 2007, at 10:07 AM, George Heller wrote:
>
>
>   What exactly is the "node n" in the query below. When I issue
>   this query, it says,
>
>
>
>
>   relation "node" does not exist.
>
>
>
>
>   I tried to use the get_all_Descendents method but it looks like
>   in order to do a recursive call it calls the method
>   each_Descendent. This method is not implemented in
>   Bio::DB::Taxonomy. It just has a single line,
>
>
>
>
>   shift->throw_not_implemented();
>
>
>
>
>   Thanks.
>   George.
>
>
>
>
>   Hilmar Lapp wrote:
>   I'm a bit confused - it sounds like you have set up a local
>   BioSQL
>   database and loaded the NCBI taxonomy into the database. You can
>   now
>   use simple SQL to retrieve all descendants of a node in the tree
>   given its NCBI taxonID such as
>
>
>
>
>   SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
>   WHERE
>   n.ncbi_taxon_id = :taxonID
>   AND tn.left_value > n. left_value
>   AND tn.right_value < n.right_value
>   AND tn.taxon_id = tnm.taxon_id
>   AND tn.name_class = 'scientific_name'
>
>
>
>
>   BioPerl doesn't have a Taxonomy::biosql module yet (though this
>   would
>   seem like a worthwhile thing to add), so you can't use the
>   Bio::DB::Taxonomy interface to do this against a BioSQL instance.
>
>
>
>
>   However, BioPerl does have support for the flat-file download of
>   the
>   NCBI taxonomy database and indexes it, so you can simply use
>   Taxonomy::{get_taxon,get_all_Descendants} using the flatfile
>   download
>   to achieve what you wanted to do in a less than 5 lines of perl.
>
>
>
>
>   Although the recursive implementation of
>   Taxonomy::get_all_Descendants
>   () won't be lightning fast, it may still be perfectly fine for your
>   application - are you sure it is not?
>
>
>
>
>   -hilmar
>
>
>
>
>   On Jun 18, 2007, at 12:21 AM, George Heller wrote:
>
>
>
>
>   Thanks. And how can I assign the $node here in the below code,
>   such
>   that I can reference it to a particular taxon id record? I want to
>   retrieve all the descendents from the taxonomy hierarchy, given a
>   particular taxon id.
>
>
>
>
>   I have a local db setup, in which I have uploaded data using the
>   load_ncbi_taxonomy.pl script.
>
>
>
>
>   Thanks.
>   George
>
>
>
>
>   Jason Stajich wrote:
>   I assume you already figured out how to setup a local taxonomydb?
>
>
>
>
>
>
>
>
>   You just want the extant species/leaves of the tree
>
>
>
>
>
>
>
>
>   my @extant_children = grep { $_->is_Leaf } $node-
>     get_all_Descedents;
>
>
>
>
>
>
>
>
>
>
>
>
>   -jason
>   On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
>
>
>
>   Hi all,
>
>
>
>
>
>
>
>
>   Can anyone point me to some example that uses the
>   get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
>   this, and I am not quite sure how to implement it.
>
>
>
>
>
>
>
>
>   Thanks.
>   George
>
>
>
>
>
>
>
>
>   Sendu Bala wrote:
>   George Heller wrote:
>   Hi all,
>
>
>
>
>
>
>
>
>   I am looking at extracting the taxonomy hierarchy for some taxon
>   ids.
>   What I plan to do is, for a given taxon id, say 33090, I want to
>   extract all taxon ids that are children of this species. I do not
>   just want the immediate children, but the children's children
>   and so
>   on.
>
>
>
>
>
>
>
>
>   Any ideas on the way I can go about doing this?
>
>
>
>
>
>
>
>
>   Well, you'll use Bio::DB::Taxonomy presumably, and
>   each_Descendent in
>   some kind of looping structure. Most easily a recursing sub.
>
>
>
>
>
>
>
>
>   If you happen to code up something neat and efficient, why not
>   share it
>   with us and we could add it to the Taxonomy module(s).
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From george.heller at yahoo.com  Mon Jun 18 20:29:31 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:29:31 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>
Message-ID: <369098.81077.qm@web56507.mail.re3.yahoo.com>

But the problem is that I don't really get any output on the screen. In the /tmp directory I get 4 files, namely parents, nodes, id2names and names2id, but I don't know what to make of them. This is what my script looks like:
   
#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;
my $idx_dir = '/tmp';
my $nodefile;
my $namesfile;

my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

for my $child ( @extant_children ) {
  print "id is ", $child->id, "\n"; # NCBI taxa id
  print "rank is ", $child->rank, "\n"; # e.g. species
  print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}

Thanks.
  George
  
Jason Stajich <jason at bioperl.org> wrote:
    All the children are in this array.  
  

  You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name out to the screen.  
  Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects so read the documentation for this module to see what you can do with the object.
    I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.
  

  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

  for my $child ( @extant_children ) {
      print "id is ", $child->id, "\n"; # NCBI taxa id
    print "rank is ", $child->rank, "\n"; # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
  }


    On Jun 18, 2007, at 5:04 PM, George Heller wrote:

    Ok, I installed the latest of Scalar::Util and the script seems to be working. But I am confused where exactly I need to look for the descendent taxon ids once the script is run. I did look into the /tmp/ directory, but I couldnt understand much. 
  

    Sorry to be bothering, really appreaciate your patience.
  

    Thanks.
    George
  

  Jason Stajich <jason at bioperl.org> wrote:
    Try installing the latest Scalar::Util  
      On Jun 18, 2007, at 4:05 PM, George Heller wrote:
  

      This is the output of /usr/bin/perl -V
  

    Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
      Platform:
        osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
        uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
        config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
        hint=recommended, useposix=true, d_sigaction=define
        usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
        useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
        use64bitint=undef use64bitall=undef uselongdouble=undef
        usemymalloc=n, bincompat5005=undef
      Compiler:
        cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
        optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
        cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
        ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
        intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
        d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
        ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
        alignbytes=4, prototype=define
      Linker and Libraries:
        ld='gcc', ldflags =' -L/usr/local/lib'
        libpth=/usr/local/lib /lib /usr/lib
        libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
        perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
        libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
        gnulibc_version='2.3.4'
      Dynamic Linking:
        dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
        cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
  

    Characteristics of this binary (from libperl):
      Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
      Built under linux
      Compiled at Jul 24 2006 18:28:10
      @INC:
        /usr/lib/perl5/5.8.5/i386-linux-thread-multi
        /usr/lib/perl5/5.8.5
        /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
        /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
        /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
        /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
        /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
        /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
        /usr/lib/perl5/site_perl/5.8.5
        /usr/lib/perl5/site_perl/5.8.4
        /usr/lib/perl5/site_perl/5.8.3
        /usr/lib/perl5/site_perl/5.8.2
        /usr/lib/perl5/site_perl/5.8.1
        /usr/lib/perl5/site_perl/5.8.0
        /usr/lib/perl5/site_perl
        /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
        /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
        /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
        /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
        /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
        /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
        /usr/lib/perl5/vendor_perl/5.8.5
        /usr/lib/perl5/vendor_perl/5.8.4
        /usr/lib/perl5/vendor_perl/5.8.3
        /usr/lib/perl5/vendor_perl/5.8.2
        /usr/lib/perl5/vendor_perl/5.8.1
        /usr/lib/perl5/vendor_perl/5.8.0
        /usr/lib/perl5/vendor_perl
  

      Thanks.
      George
        .
  

    Hilmar Lapp <hlapp at gmx.net> wrote:
      The perl version appears to be 5.8.5 though, so something strange 
    appears to be going on too.
  

    George, can you please post the output of
  

    $ /usr/bin/perl -V
  

    -hilmar
  

    On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
  

      As the error implies your local version of perl doesn't seem support
    weak references, which means it doesn't have Scalar::Utils (which was
    added to core after perl 5.6.1, I think). Try installing
    Scalar::Utils to see what happens.
  

    chris
  

    On Jun 18, 2007, at 5:18 PM, George Heller wrote:
  

      I tried running the below mentioned script and I seem to be getting
    the following error:
  

    Weak references are not implemented in the version of perl at /
    usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
    BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/
    Bio/Tree/Node.pm line 76.
    Compilation failed in require at my.pl line 7.
    BEGIN failed--compilation aborted at my.pl line 7.
  

    My script looks something like,
  

    #!/usr/bin/perl
    use strict;
    #use warnings;
    use DBI;
    use Bio::Tree::Node;
    use Bio::DB::Taxonomy;
    use Bio::DB::Taxonomy::flatfile;
    my $idx_dir = '/tmp';
  

    my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
    my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
    -nodesfile => $nodesfile,
    -namesfile => $namesfile,
    -directory => $idx_dir);
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    my @extant_children = grep { $_->is_Leaf } $node-
      get_all_Descendents;
  

    foreach $field (@extant_children) {
    print "$field";
    print "|";
    print "\n";
    }
  

    And I am running the script using the command,
  

    perl myscript.pl -v --names names.dmp --nodes nodes.dmp
  

    and I have the nodes.dmp and names.dmp files in the current
    directory.
  

    Thanks,
    George
  

    Jason Stajich wrote:
    It is implemented in the implementing class - DB::Taxonomy is
    just the base class. For example see the flatfile implementation
    Bio::DB::Taxonomy::flatfile
  

    See the scripts/taxa/local_taxonomydb_query.PLS for example using
    it:
    nodes and names are from NCBI taxonomy database.
  

    Here is an un-debugged copy+paste for your question that *should*
    work.
  

    use Bio::DB::Taxonomy
    my $idx_dir = '/tmp';
  

    my ($nodefile,$namesfile) = ('nodes.dmp,'names.dmp');
    my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
    -nodesfile => $nodesfile,
    -namesfile => $namesfile,
    -directory => $idx_dir);
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    my @extant_children = grep { $_->is_Leaf } $node-
      get_all_Descendents;
  

    -jason
  

    On Jun 18, 2007, at 10:07 AM, George Heller wrote:
  

    What exactly is the "node n" in the query below. When I issue
    this query, it says,
  

    relation "node" does not exist.
  

    I tried to use the get_all_Descendents method but it looks like
    in order to do a recursive call it calls the method
    each_Descendent. This method is not implemented in
    Bio::DB::Taxonomy. It just has a single line,
  

    shift->throw_not_implemented();
  

    Thanks.
    George.
  

    Hilmar Lapp wrote:
    I'm a bit confused - it sounds like you have set up a local 
    BioSQL
    database and loaded the NCBI taxonomy into the database. You can 
    now
    use simple SQL to retrieve all descendants of a node in the tree
    given its NCBI taxonID such as
  

    SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
    WHERE
    n.ncbi_taxon_id = :taxonID
    AND tn.left_value > n. left_value
    AND tn.right_value < n.right_value
    AND tn.taxon_id = tnm.taxon_id
    AND tn.name_class = 'scientific_name'
  

    BioPerl doesn't have a Taxonomy::biosql module yet (though this
    would
    seem like a worthwhile thing to add), so you can't use the
    Bio::DB::Taxonomy interface to do this against a BioSQL instance.
  

    However, BioPerl does have support for the flat-file download of 
    the
    NCBI taxonomy database and indexes it, so you can simply use
    Taxonomy::{get_taxon,get_all_Descendants} using the flatfile
    download
    to achieve what you wanted to do in a less than 5 lines of perl.
  

    Although the recursive implementation of
    Taxonomy::get_all_Descendants
    () won't be lightning fast, it may still be perfectly fine for your
    application - are you sure it is not?
  

    -hilmar
  

    On Jun 18, 2007, at 12:21 AM, George Heller wrote:
  

    Thanks. And how can I assign the $node here in the below code,
    such
    that I can reference it to a particular taxon id record? I want to
    retrieve all the descendents from the taxonomy hierarchy, given a
    particular taxon id.
  

    I have a local db setup, in which I have uploaded data using the
    load_ncbi_taxonomy.pl script.
  

    Thanks.
    George
  

    Jason Stajich wrote:
    I assume you already figured out how to setup a local taxonomydb?
  

    You just want the extant species/leaves of the tree
  

    my @extant_children = grep { $_->is_Leaf } $node-
      get_all_Descedents;
  

    -jason
    On Jun 17, 2007, at 11:41 AM, George Heller wrote:
  

    Hi all,
  

    Can anyone point me to some example that uses the
    get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
    this, and I am not quite sure how to implement it.
  

    Thanks.
    George
  

    Sendu Bala wrote:
    George Heller wrote:
    Hi all,
  

    I am looking at extracting the taxonomy hierarchy for some taxon
    ids.
    What I plan to do is, for a given taxon id, say 33090, I want to
    extract all taxon ids that are children of this species. I do not
    just want the immediate children, but the children's children 
    and so
    on.
  

    Any ideas on the way I can go about doing this?
  

    Well, you'll use Bio::DB::Taxonomy presumably, and
    each_Descendent in
    some kind of looping structure. Most easily a recursing sub.
  

    If you happen to code up something neat and efficient, why not
    share it
    with us and we could add it to the Taxonomy module(s).
  



From jason at bioperl.org  Mon Jun 18 21:05:43 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 18:05:43 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <369098.81077.qm@web56507.mail.re3.yahoo.com>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
Message-ID: <F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>

The files are indexes because you are indexing a flatfile - this  
speeds up the lookup so the second time you run the script it doesn't  
have to index.
You don't need to look at the files, they won't make sense to a human!
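
As a rough illustration of what such an index buys you (this is a
plain-Perl sketch, not the actual Bio::DB::Taxonomy::flatfile code;
build_index and the in-memory hashes are made-up stand-ins for the
on-disk index files), one pass over nodes.dmp-style lines can record
each node's parent, rank and children, so later lookups never have to
rescan the dump:

```perl
use strict;
use warnings;

# Hypothetical helper: index nodes.dmp-style lines in memory.
# Fields in the NCBI dump are separated by "\t|\t" and each line
# ends with "\t|"; the first three are id, parent id and rank.
sub build_index {
    my @lines = @_;
    my (%parent, %rank, %children);
    for my $line (@lines) {
        chomp(my $copy = $line);
        $copy =~ s/\t\|$//;                  # drop the line terminator
        my ($id, $parent_id, $rank) = split /\t\|\t/, $copy;
        $parent{$id} = $parent_id;
        $rank{$id}   = $rank;
        # the root lists itself as its own parent; skip that edge
        push @{ $children{$parent_id} }, $id unless $id == $parent_id;
    }
    return { parent => \%parent, rank => \%rank, children => \%children };
}

# A few invented rows in the same layout (not real taxonomy data).
my @demo = (
    "1\t|\t1\t|\tno rank\t|",
    "10\t|\t1\t|\tkingdom\t|",
    "20\t|\t10\t|\tgenus\t|",
    "21\t|\t20\t|\tspecies\t|",
    "22\t|\t20\t|\tspecies\t|",
);

my $index = build_index(@demo);
print "children of 20: @{ $index->{children}{20} }\n";
print "rank of 21: $index->{rank}{21}\n";
```

The real module keeps this kind of lookup table on disk, which is why
the files show up in /tmp and why the second run starts faster.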

The reason it isn't printing anything is that someone didn't really
write the implementation quite right. This code was overhauled by Sendu
before the last release, and I guess something didn't quite get connected.

I checked in code that now has Bio::Taxon delegate to a DB handle for
the each_Descendent call.
You can either patch your code or just use the code listed here:
  http://bioperl.org/wiki/Module:Bio::DB::Taxonomy
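
To show the idea behind that fix (a toy sketch only: ToyDB and ToyTaxon
are invented names, and this is not the code that went into CVS), the
node keeps a reference to the database that produced it and asks that
handle for its children, while get_all_Descendents simply recurses over
each_Descendent:

```perl
use strict;
use warnings;

package ToyDB;        # hypothetical stand-in for a taxonomy database

sub new {
    my ($class, %children_of) = @_;
    return bless { children_of => {%children_of} }, $class;
}

sub get_node {
    my ($self, $id) = @_;
    return ToyTaxon->new($id, $self);
}

# The database knows the hierarchy, so it answers the question.
sub each_Descendent {
    my ($self, $node) = @_;
    my $ids = $self->{children_of}{ $node->id } || [];
    return map { $self->get_node($_) } @$ids;
}

package ToyTaxon;     # hypothetical stand-in for a taxon node

sub new {
    my ($class, $id, $db) = @_;
    return bless { id => $id, db => $db }, $class;
}

sub id { $_[0]{id} }

# Delegate to the DB handle instead of relying on local state.
sub each_Descendent {
    my ($self) = @_;
    return $self->{db}->each_Descendent($self);
}

sub is_Leaf {
    my ($self) = @_;
    my @kids = $self->each_Descendent;
    return @kids ? 0 : 1;
}

# Recurse over the direct children to collect the whole subtree.
sub get_all_Descendents {
    my ($self) = @_;
    my @all;
    for my $child ($self->each_Descendent) {
        push @all, $child, $child->get_all_Descendents;
    }
    return @all;
}

package main;

my $db     = ToyDB->new(1 => [2, 3], 2 => [4, 5]);
my $node   = $db->get_node(1);
my @leaves = grep { $_->is_Leaf } $node->get_all_Descendents;
print join(',', map { $_->id } @leaves), "\n";
```

Without the delegation step each_Descendent has nothing to return, so
the grep over get_all_Descendents silently yields an empty list, which
matches the "no output" symptom.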

On Jun 18, 2007, at 5:29 PM, George Heller wrote:

> But the problem is that I don't really get any output on the
> screen. In the /tmp directory I get 4 files, namely parents, nodes,
> id2names and names2id, but I don't know what to make of them. This
> is what my script looks like:
>
> #!/usr/bin/perl
> use strict;
> #use warnings;
> use DBI;
> use Bio::Tree::Node;
> use Bio::DB::Taxonomy;
> use Bio::DB::Taxonomy::flatfile;
>
> my $idx_dir = '/tmp';
> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
> my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
>                                -nodesfile => $nodefile,
>                                -namesfile => $namesfile,
>                                -directory => $idx_dir);
> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> for my $child ( @extant_children ) {
>   print "id is ", $child->id, "\n"; # NCBI taxa id
>   print "rank is ", $child->rank, "\n"; # e.g. species
>   print "scientific name is ", $child->scientific_name, "\n"; # scientific name
> }
>
> Thanks.
>   George
>
> Jason Stajich <jason at bioperl.org> wrote:
>     All the children are in this array.
>
>
>   You get to decide what you want to do with them. In the following  
> example I print the id, rank, and scientific name out to the screen.
>   Because this is a taxonomy db query you are getting back  
> Bio::Taxonomy::Taxon objects so read the documentation for this  
> module to see what you can do with the object.
>     I would also suggest spending a little time with the Getting  
> started and HOWTO:Trees documentation on the website to get  
> familiar with the objects and nomenclature.
>
>
>
>
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>   for my $child ( @extant_children ) {
>     print "id is ", $child->id, "\n"; # NCBI taxa id
>     print "rank is ", $child->rank, "\n"; # e.g. species
>     print "scientific name is ", $child->scientific_name, "\n"; # scientific name
>   }
>
>
>     On Jun 18, 2007, at 5:04 PM, George Heller wrote:
>
>     Ok, I installed the latest Scalar::Util and the script seems
> to be working. But I am confused about where exactly I need to look for
> the descendent taxon ids once the script is run. I did look into
> the /tmp/ directory, but I couldn't understand much.
>
>
>     Sorry to be bothering, really appreciate your patience.
>
>
>     Thanks.
>     George
>
>
>   Jason Stajich <jason at bioperl.org> wrote:
>     Try installing the latest Scalar::Util
>       On Jun 18, 2007, at 4:05 PM, George Heller wrote:
>
>
>       This is the output of /usr/bin/perl -V
>
>
>
>
>     Summary of my perl5 (revision 5 version 8 subversion 5)  
> configuration:
>       Platform:
>         osname=linux, osvers=2.6.9-22.18.bz155725.elsmp,  
> archname=i386-linux-thread-multi
>         uname='linux hs20-bc1-4.build.redhat.com  
> 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686  
> i686 i386 gnulinux '
>         config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 - 
> mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost - 
> Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. - 
> Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux - 
> Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads - 
> Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db - 
> Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio - 
> Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/ 
> less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
>         hint=recommended, useposix=true, d_sigaction=define
>         usethreads=define use5005threads=undef useithreads=define  
> usemultiplicity=define
>         useperlio=define d_sfio=undef uselargefiles=define  
> usesocks=undef
>         use64bitint=undef use64bitall=undef uselongdouble=undef
>         usemymalloc=n, bincompat5005=undef
>       Compiler:
>         cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING - 
> fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE - 
> D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
>         optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
>         cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno- 
> strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
>         ccversion='', gccversion='3.4.6 20060404 (Red Hat  
> 3.4.6-2)', gccosandvers=''
>         intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
>         d_longlong=define, longlongsize=8, d_longdbl=define,  
> longdblsize=12
>         ivtype='long', ivsize=4, nvtype='double', nvsize=8,  
> Off_t='off_t', lseeksize=8
>         alignbytes=4, prototype=define
>       Linker and Libraries:
>         ld='gcc', ldflags =' -L/usr/local/lib'
>         libpth=/usr/local/lib /lib /usr/lib
>         libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil - 
> lpthread -lc
>         perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
>         libc=/lib/libc-2.3.4.so, so=so, useshrplib=true,  
> libperl=libperl.so
>         gnulibc_version='2.3.4'
>       Dynamic Linking:
>         dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='- 
> Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
>         cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
>
>
>
>
>     Characteristics of this binary (from libperl):
>       Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS  
> USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
>       Built under linux
>       Compiled at Jul 24 2006 18:28:10
>       @INC:
>         /usr/lib/perl5/5.8.5/i386-linux-thread-multi
>         /usr/lib/perl5/5.8.5
>         /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.5
>         /usr/lib/perl5/site_perl/5.8.4
>         /usr/lib/perl5/site_perl/5.8.3
>         /usr/lib/perl5/site_perl/5.8.2
>         /usr/lib/perl5/site_perl/5.8.1
>         /usr/lib/perl5/site_perl/5.8.0
>         /usr/lib/perl5/site_perl
>         /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.5
>         /usr/lib/perl5/vendor_perl/5.8.4
>         /usr/lib/perl5/vendor_perl/5.8.3
>         /usr/lib/perl5/vendor_perl/5.8.2
>         /usr/lib/perl5/vendor_perl/5.8.1
>         /usr/lib/perl5/vendor_perl/5.8.0
>         /usr/lib/perl5/vendor_perl
>
>
>
>
>       Thanks.
>       George
>         .
>
>
>
>
>     Hilmar Lapp <hlapp at gmx.net> wrote:
>       The perl version appears to be 5.8.5 though, so something  
> strange
>     appears to be going on too.
>
>
>
>
>     George, can you please post the output of
>
>
>
>
>     $ /usr/bin/perl -V
>
>
>
>
>     -hilmar
>
>
>
>
>     On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
>
>
>
>
>     As the error implies, your local version of perl doesn't seem to
>     support weak references, which means it doesn't have Scalar::Util
>     (which was added to core after perl 5.6.1, I think). Try installing
>     Scalar::Util to see what happens.
>
>
>
>
>     chris
>
>
>
>
>     On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>
>
>
>
>       I tried running the below mentioned script and I seem to be  
> getting
>     the following error:
>
>
>
>
>     Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
>     BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
>     Compilation failed in require at my.pl line 7.
>     BEGIN failed--compilation aborted at my.pl line 7.
>
>
>
>
>     My script looks something like,
>
>
>
>
>     #!/usr/bin/perl
>     use strict;
>     #use warnings;
>     use DBI;
>     use Bio::Tree::Node;
>     use Bio::DB::Taxonomy;
>     use Bio::DB::Taxonomy::flatfile;
>     my $idx_dir = '/tmp';
>
>     my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
>     my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>                                    -nodesfile => $nodefile,
>                                    -namesfile => $namesfile,
>                                    -directory => $idx_dir);
>     my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>     my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>     # print each leaf object (stringified), one per line
>     foreach my $field (@extant_children) {
>         print "$field";
>         print "|";
>         print "\n";
>     }
>
>
>
>
>     And I am running the script using the command,
>
>
>
>
>     perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>
>
>
>
>     and I have the nodes.dmp and names.dmp files in the current
>     directory.
>
>
>
>
>     Thanks,
>     George
>
>
>
>
>
>
>
>
>     Jason Stajich wrote:
>     It is implemented in the implementing class - DB::Taxonomy is
>     just the base class. For example see the flatfile implementation
>     Bio::DB::Taxonomy::flatfile
>
>
>
>
>     See the scripts/taxa/local_taxonomydb_query.PLS for example using
>     it:
>     nodes and names are from NCBI taxonomy database.
>
>
>
>
>
>
>
>
>     Here is an un-debugged copy+paste for your question that *should*
>     work.
>
>
>
>
>
>
>
>
>     use Bio::DB::Taxonomy;
>     my $idx_dir = '/tmp';
>
>     my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
>     my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>                                    -nodesfile => $nodefile,
>                                    -namesfile => $namesfile,
>                                    -directory => $idx_dir);
>     my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>     my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>     -jason
>
>
>
>
>     On Jun 18, 2007, at 10:07 AM, George Heller wrote:
>
>
>
>
>     What exactly is the "node n" in the query below. When I issue
>     this query, it says,
>
>
>
>
>
>
>
>
>     relation "node" does not exist.
>
>
>
>
>
>
>
>
>     I tried to use the get_all_Descendents method but it looks like
>     in order to do a recursive call it calls the method
>     each_Descendent. This method is not implemented in
>     Bio::DB::Taxonomy. It just has a single line,
>
>
>
>
>
>
>
>
>     shift->throw_not_implemented();
>
>
>
>
>
>
>
>
>     Thanks.
>     George.
>
>
>
>
>
>
>
>
>     Hilmar Lapp wrote:
>     I'm a bit confused - it sounds like you have set up a local BioSQL
>     database and loaded the NCBI taxonomy into the database. You can now
>     use simple SQL to retrieve all descendants of a node in the tree
>     given its NCBI taxonID such as
>
>     SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
>     WHERE
>     n.ncbi_taxon_id = :taxonID
>     AND tn.left_value > n.left_value
>     AND tn.right_value < n.right_value
>     AND tn.taxon_id = tnm.taxon_id
>     AND tn.name_class = 'scientific_name'
>
>     BioPerl doesn't have a Taxonomy::biosql module yet (though this would
>     seem like a worthwhile thing to add), so you can't use the
>     Bio::DB::Taxonomy interface to do this against a BioSQL instance.
>
>     However, BioPerl does have support for the flat-file download of the
>     NCBI taxonomy database and indexes it, so you can simply use
>     Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download
>     to achieve what you wanted to do in less than 5 lines of perl.
>
>     Although the recursive implementation of Taxonomy::get_all_Descendants()
>     won't be lightning fast, it may still be perfectly fine for your
>     application - are you sure it is not?
>
>
>
>
>
>
>
>
>     -hilmar
>
>
>
>
>
>
>
>
>     On Jun 18, 2007, at 12:21 AM, George Heller wrote:
>
>
>
>
>
>
>
>
>     Thanks. And how can I assign the $node here in the below code, such
>     that I can reference it to a particular taxon id record? I want to
>     retrieve all the descendents from the taxonomy hierarchy, given a
>     particular taxon id.
>
>
>
>
>
>
>
>
>     I have a local db setup, in which I have uploaded data using the
>     load_ncbi_taxonomy.pl script.
>
>
>
>
>
>
>
>
>     Thanks.
>     George
>
>
>
>
>
>
>
>
>     Jason Stajich wrote:
>     I assume you already figured out how to setup a local taxonomydb?
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>     You just want the extant species/leaves of the tree
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>     my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>     -jason
>     On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
>
>
>
>
>
>
>
>     Hi all,
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>     Can anyone point me to some example that uses the
>     get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
>     this, and I am not quite sure how to implement it.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>     Thanks.
>     George
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>     Sendu Bala wrote:
>     George Heller wrote:
>     Hi all,
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>     I am looking at extracting the taxonomy hierarchy for some taxon ids.
>     What I plan to do is, for a given taxon id, say 33090, I want to
>     extract all taxon ids that are children of this species. I do not
>     just want the immediate children, but the children's children and
>     so on.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>     Any ideas on the way I can go about doing this?
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>     Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
>     some kind of looping structure. Most easily a recursing sub.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>     If you happen to code up something neat and efficient, why not share
>     it with us and we could add it to the Taxonomy module(s).

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From torsten.seemann at infotech.monash.edu.au  Mon Jun 18 21:21:04 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:21:04 +1000
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676A01F.30205@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
Message-ID: <a79f6a4b0706181821p12a2e138xade9c30895e45068@mail.gmail.com>

Sendu,

> >> Can anyone offer a
> >> way to systematically find at least the test scripts which access the
> >> internet, if not the specific tests within?

Perhaps you could use 'strace' to list network system calls for each
test script, and grep out AF_INET connections?

% strace -e trace=network command_to_test 2>&1 | grep AF_INET

I'm not an strace expert but it might do what you need.
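
If it helps, the filtering step could also be done in Perl so the
results can be collected per test script. This is only a sketch:
inet_connects is a made-up helper, and the regex assumes strace prints
connect() calls in its usual sa_family=AF_INET, sin_port=htons(...),
sin_addr=inet_addr("...") form, which may differ between strace
versions.

```perl
use strict;
use warnings;

# Hypothetical helper: pull "address:port" pairs out of strace output
# lines that describe AF_INET connect() calls.
sub inet_connects {
    my @lines = @_;
    my @found;
    for my $line (@lines) {
        next unless $line =~ /\bconnect\(/ && $line =~ /sa_family=AF_INET\b/;
        if ($line =~ /sin_port=htons\((\d+)\).*?sin_addr=inet_addr\("([^"]+)"\)/) {
            push @found, "$2:$1";
        }
    }
    return @found;
}

# Invented sample lines in the assumed format (not from a real run).
my @sample = (
    'socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3',
    'connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("192.0.2.10")}, 16) = 0',
    'connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)',
);

print "$_\n" for inet_connects(@sample);
```

A test script that yields an empty list here probably never opened an
internet connection, though local name-service lookups and forked
children could still hide some traffic.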

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010


From george.heller at yahoo.com  Mon Jun 18 21:16:10 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 18:16:10 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
Message-ID: <815364.33231.qm@web56512.mail.re3.yahoo.com>

Works perfectly. Thanks so much Jason, Hilmar, Chris. You've been a great help!
   
  Thanks.
  George

Jason Stajich <jason at bioperl.org> wrote:
  The files are indexes because you are indexing a flatfile - this speeds up the lookup so the second time you run the script it doesn't have to index.  You don't need to look at the files, they won't make sense to a human!
  

  The reason it isn't printing anything is that someone didn't really write the implementation quite right. This code was overhauled by Sendu before the last release, and I guess something didn't quite get connected.
  

  I checked in code that now has Bio::Taxon delegate to a DB handle for the each_Descendent call.
  You can either patch your code or just use the code listed here:
     http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

  
    On Jun 18, 2007, at 5:29 PM, George Heller wrote:

    But the problem is that I don't really get any output on the screen. In the /tmp directory I get 4 files, namely parents, nodes, id2names and names2id, but I don't know what to make of them. This is what my script looks like:
  

    #!/usr/bin/perl
    use strict;
    #use warnings;
    use DBI;
    use Bio::Tree::Node;
    use Bio::DB::Taxonomy;
    use Bio::DB::Taxonomy::flatfile;

    my $idx_dir = '/tmp';
    my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
    my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                                   -nodesfile => $nodefile,
                                   -namesfile => $namesfile,
                                   -directory => $idx_dir);
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

    for my $child ( @extant_children ) {
      print "id is ", $child->id, "\n"; # NCBI taxa id
      print "rank is ", $child->rank, "\n"; # e.g. species
      print "scientific name is ", $child->scientific_name, "\n"; # scientific name
    }
  

  Thanks.
    George
  

  Jason Stajich <jason at bioperl.org> wrote:
      All the children are in this array.  
  

    You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name out to the screen.  
    Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects so read the documentation for this module to see what you can do with the object.
      I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.
  

    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

    for my $child ( @extant_children ) {
      print "id is ", $child->id, "\n"; # NCBI taxa id
      print "rank is ", $child->rank, "\n"; # e.g. species
      print "scientific name is ", $child->scientific_name, "\n"; # scientific name
    }
  

      On Jun 18, 2007, at 5:04 PM, George Heller wrote:
  

      Ok, I installed the latest Scalar::Util and the script seems to be working. But I am confused about where exactly I need to look for the descendent taxon ids once the script is run. I did look into the /tmp/ directory, but I couldn't understand much.
  

      Sorry to be bothering, really appreciate your patience.
  

      Thanks.
      George
  

    Jason Stajich <jason at bioperl.org> wrote:
      Try installing the latest Scalar::Util  
        On Jun 18, 2007, at 4:05 PM, George Heller wrote:
  

        This is the output of /usr/bin/perl -V
  

      Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
        Platform:
          osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
          uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
          config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
          hint=recommended, useposix=true, d_sigaction=define
          usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
          useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
          use64bitint=undef use64bitall=undef uselongdouble=undef
          usemymalloc=n, bincompat5005=undef
        Compiler:
          cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
          optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
          cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
          ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
          intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
          d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
          ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
          alignbytes=4, prototype=define
        Linker and Libraries:
          ld='gcc', ldflags =' -L/usr/local/lib'
          libpth=/usr/local/lib /lib /usr/lib
          libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
          perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
          libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
          gnulibc_version='2.3.4'
        Dynamic Linking:
          dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
          cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
  

      Characteristics of this binary (from libperl):
        Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
        Built under linux
        Compiled at Jul 24 2006 18:28:10
        @INC:
          /usr/lib/perl5/5.8.5/i386-linux-thread-multi
          /usr/lib/perl5/5.8.5
          /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.5
          /usr/lib/perl5/site_perl/5.8.4
          /usr/lib/perl5/site_perl/5.8.3
          /usr/lib/perl5/site_perl/5.8.2
          /usr/lib/perl5/site_perl/5.8.1
          /usr/lib/perl5/site_perl/5.8.0
          /usr/lib/perl5/site_perl
          /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.5
          /usr/lib/perl5/vendor_perl/5.8.4
          /usr/lib/perl5/vendor_perl/5.8.3
          /usr/lib/perl5/vendor_perl/5.8.2
          /usr/lib/perl5/vendor_perl/5.8.1
          /usr/lib/perl5/vendor_perl/5.8.0
          /usr/lib/perl5/vendor_perl
  

        Thanks.
        George
          .
  

      Hilmar Lapp <hlapp at gmx.net> wrote:
        The perl version appears to be 5.8.5 though, so something strange 
      appears to be going on too.
  

      George, can you please post the output of
  

      $ /usr/bin/perl -V
  

      -hilmar
  

      On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
  

      As the error implies, your local version of perl doesn't seem to
      support weak references, which means it doesn't have Scalar::Util
      (which was added to core after perl 5.6.1, I think). Try installing
      Scalar::Util to see what happens.
  

      chris
  

      On Jun 18, 2007, at 5:18 PM, George Heller wrote:
  

        I tried running the below mentioned script and I seem to be getting
      the following error:
  

      Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
      BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
      Compilation failed in require at my.pl line 7.
      BEGIN failed--compilation aborted at my.pl line 7.
  

      My script looks something like,
  

      #!/usr/bin/perl
      use strict;
      #use warnings;
      use DBI;
      use Bio::Tree::Node;
      use Bio::DB::Taxonomy;
      use Bio::DB::Taxonomy::flatfile;
      my $idx_dir = '/tmp';

      my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
      my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
                                     -nodesfile => $nodefile,
                                     -namesfile => $namesfile,
                                     -directory => $idx_dir);
      my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
      my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

      # print each leaf object (stringified), one per line
      foreach my $field (@extant_children) {
          print "$field";
          print "|";
          print "\n";
      }
  

      And I am running the script using the command,
  

      perl myscript.pl -v --names names.dmp --nodes nodes.dmp
  

      and I have the nodes.dmp and names.dmp files in the current
      directory.
  

      Thanks,
      George
  

      Jason Stajich wrote:
      It is implemented in the implementing class - DB::Taxonomy is
      just the base class. For example see the flatfile implementation
      Bio::DB::Taxonomy::flatfile
  

      See the scripts/taxa/local_taxonomydb_query.PLS for example using
      it:
      nodes and names are from NCBI taxonomy database.
  

      Here is an un-debugged copy+paste for your question that *should*
      work.
  

      use Bio::DB::Taxonomy;
      my $idx_dir = '/tmp';

      my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
      my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
                                     -nodesfile => $nodefile,
                                     -namesfile => $namesfile,
                                     -directory => $idx_dir);
      my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
      my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

      -jason
  

      On Jun 18, 2007, at 10:07 AM, George Heller wrote:
  

      What exactly is the "node n" in the query below. When I issue
      this query, it says,
  

      relation "node" does not exist.
  

      I tried to use the get_all_Descendents method but it looks like
      in order to do a recursive call it calls the method
      each_Descendent. This method is not implemented in
      Bio::DB::Taxonomy. It just has a single line,
  

      shift->throw_not_implemented();
  

      Thanks.
      George.
  

      Hilmar Lapp wrote:
      I'm a bit confused - it sounds like you have set up a local BioSQL
      database and loaded the NCBI taxonomy into the database. You can now
      use simple SQL to retrieve all descendants of a node in the tree
      given its NCBI taxonID such as
  

      SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
      WHERE
      n.ncbi_taxon_id = :taxonID
      AND tn.left_value > n.left_value
      AND tn.right_value < n.right_value
      AND tn.taxon_id = tnm.taxon_id
      AND tn.name_class = 'scientific_name'
  

      BioPerl doesn't have a Taxonomy::biosql module yet (though this would
      seem like a worthwhile thing to add), so you can't use the
      Bio::DB::Taxonomy interface to do this against a BioSQL instance.
  

      However, BioPerl does have support for the flat-file download of the
      NCBI taxonomy database and indexes it, so you can simply use
      Taxonomy::{get_taxon,get_all_Descendents} using the flatfile download
      to achieve what you wanted to do in less than 5 lines of Perl.
  

      Although the recursive implementation of
      Taxonomy::get_all_Descendents() won't be lightning fast, it may still
      be perfectly fine for your application - are you sure it is not?
  

      -hilmar
  

      On Jun 18, 2007, at 12:21 AM, George Heller wrote:
  

      Thanks. And how can I assign the $node here in the below code, such
      that I can reference it to a particular taxon id record? I want to
      retrieve all the descendents from the taxonomy hierarchy, given a
      particular taxon id.
  

      I have a local db setup, in which I have uploaded data using the
      load_ncbi_taxonomy.pl script.
  

      Thanks.
      George
  

      Jason Stajich wrote:
      I assume you already figured out how to setup a local taxonomydb?
  

      You just want the extant species/leaves of the tree
  

      my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

      -jason
      On Jun 17, 2007, at 11:41 AM, George Heller wrote:
  

      Hi all,
  

      Can anyone point me to some example that uses the
      get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
      this, and I am not quite sure how to implement it.
  

      Thanks.
      George
  

      Sendu Bala wrote:
      George Heller wrote:
      Hi all,
  

      I am looking at extracting the taxonomy hierarchy for some taxon ids.
      What I plan to do is, for a given taxon id, say 33090, I want to
      extract all taxon ids that are children of this species. I do not
      just want the immediate children, but the children's children and so
      on.
  

      Any ideas on the way I can go about doing this?
  

      Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
      some kind of looping structure. Most easily a recursing sub.
  

      If you happen to code up something neat and efficient, why not share
      it with us and we could add it to the Taxonomy module(s).
  


      --
      Jason Stajich
      jason at bioperl.org
      http://jason.open-bio.org/
  


      --
      ===========================================================
      : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
      ===========================================================
  


      Christopher Fields
      Postdoctoral Researcher
      Lab of Dr. Robert Switzer
      Dept of Biochemistry
      University of Illinois Urbana-Champaign
  



From torsten.seemann at infotech.monash.edu.au  Mon Jun 18 21:26:41 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:26:41 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To: <a79f6a4b0706121718g4b0ca6a4m97f253b2e2b84059@mail.gmail.com>
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
	<a79f6a4b0706121718g4b0ca6a4m97f253b2e2b84059@mail.gmail.com>
Message-ID: <a79f6a4b0706181826x4ccc4ee5n8ddafa703ad162a3@mail.gmail.com>

(Sean, please reply to the bioperl-l list rather than to me personally
so everyone can read it. I'm reposting it here.)

> > I posted this on the gbrowse list earlier. I'm looking to convert gff
> > data files into xml. Does anyone know of a module written to do this
> > already?
>
> What DTD do you want the XML to conform to?
> eg. ChadoXML, TinySeq XML, TIGR XML ... ?

Hi Torsten,
I'm collaborating with other groups and want web-service compatible
functionality for various tools. Normally the analysis tools I'm using
generate gff output. I'm going to have to wrap this output in XML with
an XSL stylesheet for end-users to view. I haven't done it before and
don't know what DTD to use. bp_seqconvert.pl doesn't accept gff format.
I would imagine the DTD would be quite short as the gff files are very
standard, I just don't have any experience with these DTD
requirements.
--Sean O'Keeffe <limericksean at gmail.com>
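
Since no DTD has been settled on in this thread, the following is only a
minimal sketch of the idea: each of the nine tab-separated GFF columns is
wrapped in an element of its own. The element names (feature, seqid,
source, ...) and the helper names are assumptions made for illustration,
not part of any existing schema or BioPerl module.

```perl
use strict;
use warnings;

# Escape the five characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    $text =~ s/'/&apos;/g;
    return $text;
}

# Hypothetical element names, one per GFF column, in column order.
my @COLUMNS = qw(seqid source type start end score strand phase attributes);

# Turn one GFF feature line into a <feature> element.
# Returns undef for comment lines and lines without nine columns.
sub gff_line_to_xml {
    my ($line) = @_;
    chomp $line;
    return undef if $line =~ /^#/;
    my @fields = split /\t/, $line, -1;
    return undef unless @fields == @COLUMNS;
    my $xml = '<feature>';
    for my $i (0 .. $#COLUMNS) {
        $xml .= sprintf '<%s>%s</%s>',
            $COLUMNS[$i], xml_escape($fields[$i]), $COLUMNS[$i];
    }
    return $xml . '</feature>';
}

my $example = join "\t",
    'chr1', 'test', 'gene', 100, 200, '.', '+', '.', 'ID=g1;Note=a&b';
print gff_line_to_xml($example), "\n";
```

An XSL stylesheet could then be written against whatever element names
are finally chosen; the attributes column is left unparsed here.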


From sac at bioperl.org  Tue Jun 19 02:42:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Mon, 18 Jun 2007 23:42:27 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
Message-ID: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>

On 6/16/07, Jason Stajich <jason at bioperl.org> wrote:
> [...]
> Just to say I already went through all the steps of running cvs2svn
> myself and had problems gathering back out the branches and all the
> tags when I tried it.  You may want to start with a smaller repository
> like bioperl-network or bioperl-db, as the initial cvs2svn conversion
> script took quite a long time to run on bioperl-live.

Might this be a good opportunity to investigate partitioning
bioperl-live into sub-repositories? There has been talk in the past of
defining a set of "core" modules separate from other functionally
related groups of modules that would be viewed as optional extensions.
The goal being to help manage growth and simplify releases. There are
currently 892 modules under Bio/.

In addition to simplifying the migration to SVN, it would also have
other benefits. Say some new functionality or a slew of fixes were
added to Bio::Graphics. We could turn around a new Bio::Graphics
release quickly without having to work on getting various other parts
up to snuff that aren't related to graphics (Biblio, DB, PopGen,
Search etc.). Maintenance and releases of the various extensions would
be more parallelizable, orchestrated by separate ring leaders.

Over time, as a set of functionality matures, it would see fewer
updates and there would be less of a need for users to
download/install/test it. This could make bioperl easier to customize,
extend, and grok in general.

Long term, it should ease development and release cycles, but it will
involve a bit of near term bullet-biting. We'd need to get clear on
how to partition things, including modules, tests, docs, installation
logic, etc. and we'd probably need new integration tests to verify
that the subsets continue working together.

What do folks think? Would this SVN-based, re-partitioned bioperl-live
constitute a 2.0 release? Any volunteers to help assemble a roadmap
and milestones? Should I go on dreaming?

Cheers,
Steve


From bix at sendu.me.uk  Tue Jun 19 03:01:05 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 08:01:05 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
Message-ID: <46777F31.7030402@sendu.me.uk>

Jason Stajich wrote:
> The reason it isn't printing anything is someone didn't really write  
> the implementation quite right. This code was overhauled by Sendu  
> before the last release I guess something didn't quite get connected.
> 
> I checked in code that has the Bio::Taxon delegating now to a DB  
> handle for the each_Descendent call.
> You can either patch your code  or just use the code listed here:
>   http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

I've reverted that change.

For some reason the docs for Bio::Taxon::each_Descendent aren't showing 
up on the website, but they state:

---
Note that this method never asks the database for the descendents; it 
will only return objects you have manually set with add_Descendent(), or 
where this was done for you by making a Bio::Tree::Tree with this object 
as an argument to new().

To get the database descendents use 
$taxon->db_handle->each_Descendent($taxon).
---


I also have a note in the Synopsis for the module:

---
# Though be careful with each_Descendent - unless you add_Descendent()
# yourself, you won't get an answer because unlike for ancestor(),
# Bio::Taxon does not ask the database for the answer. You can ask the
# database yourself using the same method:
($human) = $homo->db_handle->each_Descendent($homo);
---


This is quite deliberate and is to prevent Bad Things from happening. 
(Can't exactly remember the reasoning now, but I know it was good.)


From bix at sendu.me.uk  Tue Jun 19 03:41:57 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 08:41:57 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
Message-ID: <467788C5.6070406@sendu.me.uk>

Steve Chervitz wrote:
> Might this be a good opportunity to investigate partitioning
> bioperl-live into sub-repositories? There has been talk in the past of
> defining a set of "core" modules separate from other functionally
> related groups of modules that would be viewed as optional extensions.
> The goal being to help manage growth and simplify releases. There are
> currently 892 modules under Bio/.
> 
> In addition to simplifying the migration to SVN, it would also have
> other benefits. Say some new functionality or a slew of fixes were
> added to Bio::Graphics. We could turn around a new Bio::Graphics
> release quickly without having to work on getting various other parts
> up to snuff that aren't related to graphics (Biblio, DB, PopGen,
> Search etc.). Maintenance and releases of the various extensions would
> be more parallelizable, orchestrated by separate ring leaders.
> 
> Over time, as a set of functionality matures, it would see fewer
> updates and there would be less of a need for users to
> download/install/test it. This could make bioperl easier to customize,
> extend, and grok in general.
> 
> Long term, it should ease development and release cycles

I actually take the opposite view. Breaking things up makes testing and 
releases more difficult.

If one person acts as pumpkin for all the sub-parts, his work-load 
increases almost linearly with the number of sub-parts. If each sub-part 
gets its own pumpkin, where do all these pumpkins come from? It seems to 
me that frequently authors will write modules but inevitably their 
circumstances change and they can no longer devote the time to look 
after them. Having a single pumpkin and 'forcing' him to make sure 
everything works (regardless of his personal interest in the module) 
seems more reliable than hoping there will be a person interested enough 
in each sub-part to handle its release.

Since all sub-parts will at the least interact with the 'true' core set 
of Bioperl modules, they need to be tested and potentially re-released 
every time the true core is updated. And since some sub-parts will 
interact with other sub-parts, there will need to be coordinated 
joint-testing and release of multiple sub-parts.

What happens when users report problems? We ask them what version 
they're running. Right now '1.5.2' means a specific thing, and it's
trivial for someone to confirm the same problem by installing 1.5.2.
What happens when users have to list out all the versions of all the
sub-parts they have? Who is going to consistently recreate a user's
hodge-podge of versions in order to confirm a bug? Won't the advice
instead be: "update all versions to the latest and get back to us"?

So, as I see it, all sub-parts would best be tested and released with a 
single new version number every time one sub-part is updated 
(significantly). In which case, why have sub-parts at all? Keeping 
things the way they are now means ease of release for the pumpkin and 
ease of installation for end-users (only one install command to issue to 
CPAN). Having 'true' sub-parts (each with its own pumpkin), in my 
fatalistic view, is just going to lead to some useful sub-parts being 
abandoned and never updated, even where updates may be desirable.

Each and every Bio:: module could have been released separately by its 
respective author. As I see it, one of the main values of 'Bioperl' is 
that it's one (reasonably) consistent collection of modules that lowers 
the barrier of entry for new Bioinformaticians, giving them extremely 
easy access to a whole host of functionality with a single install.


From hlapp at gmx.net  Tue Jun 19 08:47:02 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 08:47:02 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46777F31.7030402@sendu.me.uk>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
Message-ID: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>

So the real mistake was to write

  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

instead of

  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents($node);

I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the  
database?

If this is correct, can we highlight this in the documentation? It's  
a small difference that everyone failed to spot.

If it is not correct, then maybe we need to revisit the rationale for  
why a Bio::DB::Taxonomy::get_all_Descendents may not query the  
underlying database.

Also, in my reading of Bio::Taxonomy::Taxon it won't use the database  
either for ancestor(). Which would be consistent with its other methods.

I.e., the bottom line is don't use Node or Taxon objects for  
hierarchy queries that you expect to use an underlying database, use  
the Bio::DB::Taxonomy object instead. It makes sense, but is it true?

	-hilmar

On Jun 19, 2007, at 3:01 AM, Sendu Bala wrote:

> Jason Stajich wrote:
>> The reason it isn't printing anything is someone didn't really write
>> the implementation quite right. This code was overhauled by Sendu
>> before the last release I guess something didn't quite get connected.
>>
>> I checked in code that has the Bio::Taxon delegating now to a DB
>> handle for the each_Descendent call.
>> You can either patch your code  or just use the code listed here:
>>   http://bioperl.org/wiki/Module:Bio::DB::Taxonomy
>
> I've reverted that change.
>
> For some reason the docs for Bio::Taxon::each_Descendent aren't showing
> up on the website, but they state:
>
> ---
> Note that this method never asks the database for the descendents; it
> will only return objects you have manually set with add_Descendent(), or
> where this was done for you by making a Bio::Tree::Tree with this object
> as an argument to new().
>
> To get the database descendents use
> $taxon->db_handle->each_Descendent($taxon).
> ---
>
>
> I also have a note in the Synopsis for the module:
>
> ---
> # Though be careful with each_Descendent - unless you add_Descendent()
> # yourself, you won't get an answer because unlike for ancestor(),
> # Bio::Taxon does not ask the database for the answer. You can ask the
> # database yourself using the same method:
> ($human) = $homo->db_handle->each_Descendent($homo);
> ---
>
>
> This is quite deliberate and is to prevent Bad Things from happening.
> (Can't exactly remember the reasoning now, but I know it was good.)

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From rvos at interchange.ubc.ca  Tue Jun 19 09:05:25 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Tue, 19 Jun 2007 06:05:25 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <15433211.1182258325544.JavaMail.myubc2@brahms.my.ubc.ca>


> Unrelated, but it randomly just occurred to me: what happens to all the 
> id lines at the top of modules? Eg:
> 
> $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $
> 
> That's a cvs-specific thing, right? Do we delete them all? (Regardless, 
> I wish we would, since they caused me no end of hassles during the 1.5.2 
> release, doing updates across branches.)

If you run something like 'svn propset svn:keywords Id' on the
file/folder/recursively, svn picks up on the $Id tag. The structure of
the resulting string would be a little different, because svn revision
numbers are simply auto-increasing integers (afaik) - so any regular
expressions that cleverly want to include the revision number in
$VERSION would need to be updated.
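
To illustrate the difference in string structure, here is a rough sketch
of a pattern that accepts both forms. The CVS string is the one quoted
above; the SVN string follows the layout svn uses when it expands Id
(file name, revision, date, time with a trailing Z, author), but the
revision number 12345 and the function name revision_from_id are made up
for the example.

```perl
use strict;
use warnings;

# Extract the revision from a CVS- or SVN-style expanded Id keyword.
# Returns undef if the keyword is unexpanded or not recognised.
sub revision_from_id {
    my ($id) = @_;
    # CVS: $Id: file,v 1.28 2007/06/14 14:16:10 author Exp $
    return $1 if $id =~ /^\$Id: \S+,v (\d+(?:\.\d+)+) /;
    # SVN: $Id: file 12345 2007-06-14 14:16:10Z author $
    return $1 if $id =~ /^\$Id: \S+ (\d+) \d{4}-\d{2}-\d{2} /;
    return undef;
}

# Single quotes stop Perl from trying to interpolate "$Id".
my $cvs = '$Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $';
my $svn = '$Id: bl2seq.pm 12345 2007-06-14 14:16:10Z sendu $';

print revision_from_id($cvs), "\n";
print revision_from_id($svn), "\n";
```

A module that derives $VERSION from the keyword would need the second
pattern (or something like it) once the repository moves to svn.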


From bix at sendu.me.uk  Tue Jun 19 10:25:26 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 15:25:26 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
	<5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
Message-ID: <4677E756.6050200@sendu.me.uk>

Hilmar Lapp wrote:
> So the real mistake was to write
> 
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
> 
> instead of
> 
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents($node);
> 
> I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the  
> database?

Yes, the database object methods use the database. I don't even think it 
makes sense to question that. What else would it do?


> If this is correct, can we highlight this in the documentation? It's  
> a small difference that everyone failed to spot.

The documentation for what? I've already clearly pointed out the gotcha 
in Bio::Taxon.


> Also, in my reading of Bio::Taxonomy::Taxon it won't use the database  
> either for ancestor(). Which would be consistent with its other methods.

Bio::Taxonomy::Taxon? It's deprecated. Bio::Taxon is what we're dealing 
with, and it /does/ use the db to get the ancestor, unless the ancestor 
is manually set (see below for explanation).


> I.e., the bottom line is don't use Node or Taxon objects for  
> hierarchy queries that you expect to use an underlying database, use  
> the Bio::DB::Taxonomy object instead. It makes sense, but is it true?

Almost. It happens to be true but ideally wouldn't be the case. The 
confusion and problems arise, I guess, because we have two ways to 
access/create hierarchies and both of them are built from the same 
building block (Bio::Taxon objects).

On the one hand we have Bio::DB::Taxonomy and the other we have 
Bio::Tree::Tree.

Tree objects are easy: you have a Taxon object created in memory for 
each and every node in the tree. Each Taxon knows its ancestor and 
descendants by storing references to the relevant Taxon objects in the 
tree. You 'navigate' through the tree by grabbing a Taxon inside it and 
asking the Taxon itself for its ancestor or descendant.

This leaves us with the Taxon object having the methods ancestor() and 
each_Descendent(), which we'll expect to work in other circumstances.

Bio::DB::Taxonomy returns single Taxon objects from the database on 
request. Now we still expect our ancestor() and each_Descendent() 
methods to work, but if things were set up like Bio::Tree::Tree we'd end 
up pulling the entire database into memory because we'd have to create 
all the Taxon objects that are ancestors and descendants, recursively, 
every time we request a single Taxon (which is wasteful in the case of 
Bio::DB::Taxonomy::flatfile and slow/not allowed in the case of 
Bio::DB::Taxonomy::entrez).

The solution? We simply don't create the immediate ancestor or 
descendant Taxon objects of the requested Taxon, and instead implement 
the Taxon methods to ask the database to create them on demand, if they 
don't already exist. Well, that idea is fine (and necessary) for the 
ancestor method, but we run into problems with each_Descendent().

The problem arises when we create Bio::Tree::Tree objects from a Taxon 
we got from the database. Being able to do that is why Bio::Taxon is 
shared between them, as it is a very desirable thing to do: you can 
instantly create a lineage tree for a Taxon of interest and then use all 
the Bio::Tree::Tree methods on it. Unfortunately one of those methods is 
get_nodes() which is implemented using each_Descendent() and 
get_all_Descendents(). If each_Descendent() asked the database for the 
real answer, we'd end up pulling the entire database into the tree.

So my implementation was to not ask the database and just warn people in 
the docs. Ideally it /would/ use the database, because that's what a 
user would expect. Can anyone see an alternate way around the problem?
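
The asymmetry described above can be sketched with toy classes. ToyDB
and ToyTaxon are hypothetical stand-ins written only for this
illustration; they are not the Bio::DB::Taxonomy or Bio::Taxon code and
only mimic the behaviour as described in this message: ancestor() falls
back to the database, each_Descendent() on the taxon never does.

```perl
use strict;
use warnings;

# Toy database: maps child id => parent id.
package ToyDB;
sub new {
    my ($class, %parent_of) = @_;
    return bless { parent_of => {%parent_of} }, $class;
}
sub get_taxon {
    my ($self, $id) = @_;
    return ToyTaxon->new(id => $id, db_handle => $self);
}
sub ancestor {
    my ($self, $taxon) = @_;
    my $parent_id = $self->{parent_of}{ $taxon->id };
    return defined $parent_id ? $self->get_taxon($parent_id) : undef;
}
sub each_Descendent {
    my ($self, $taxon) = @_;
    my @child_ids = grep { $self->{parent_of}{$_} eq $taxon->id }
                    sort keys %{ $self->{parent_of} };
    return map { $self->get_taxon($_) } @child_ids;
}

package ToyTaxon;
sub new {
    my ($class, %args) = @_;
    return bless { %args, children => [] }, $class;
}
sub id        { $_[0]{id} }
sub db_handle { $_[0]{db_handle} }
# Lazy: ask the database on first use, unless an ancestor was set by hand.
sub ancestor {
    my ($self) = @_;
    if (!exists $self->{ancestor} && $self->db_handle) {
        $self->{ancestor} = $self->db_handle->ancestor($self);
    }
    return $self->{ancestor};
}
# Not lazy: returns only what add_Descendent() stored, never queries.
sub add_Descendent  { push @{ $_[0]{children} }, $_[1]; return }
sub each_Descendent { @{ $_[0]{children} } }

package main;
my $db   = ToyDB->new(2 => 1, 3 => 1, 4 => 2);
my $node = $db->get_taxon(1);

my @via_taxon = $node->each_Descendent;        # nothing was added by hand
my @via_db    = $db->each_Descendent($node);   # the database knows 2 and 3
printf "via taxon: %d, via db: %d\n", scalar @via_taxon, scalar @via_db;
printf "ancestor of 4: %s\n", $db->get_taxon(4)->ancestor->id;
```

Because the taxon-level each_Descendent() stays local, a tree built
around such an object cannot accidentally pull in every descendant from
the database, which is the trade-off described above.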


From hlapp at gmx.net  Tue Jun 19 12:14:38 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 12:14:38 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <4677E756.6050200@sendu.me.uk>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
	<5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
	<4677E756.6050200@sendu.me.uk>
Message-ID: <C2348A85-2F44-4AD5-8996-DDA19B79F994@gmx.net>

Sorry, I was accidentally looking at an older branch.

Reading through the Taxon module, though, I get more confused than I am
comfortable with.

Here's what I understand of your description of the problem:

- We would like nodes returned from Bio::DB::Taxonomy to use the  
database for all hierarchical queries.

- We would like nodes used in a Bio::Tree::Tree not to use the  
database for any hierarchical query.

What I understand we have is

- Taxon node objects that have a db_handle set will use the database  
for ancestor(), unless it has been set manually (?), but not for  
each_Descendent().

- Taxon node objects that don't have a db_handle set won't use a  
database but will function normally otherwise.

- This is needed to prevent Bio::Tree::Tree methods from pulling the  
entire tree into memory.

If this is correct (I'm not sure it is), it sounds like we want to  
temporarily divorce taxonomy nodes from their database capabilities  
while they are being queried in a tree context?

I'm still trying to understand - if I create a Bio::Tree::Tree from a  
single node, will the tree automatically contain all nodes along the  
lineage of ancestors up to the root? So, even if extracting this  
lineage involved querying a database it would be acceptable, but not  
for querying descendents?

It sounds to me like what is needed is that nodes that get added to a  
tree need to be stripped of their database capabilities. This could  
be achieved by creating a wrapper class that delegates all non- 
hierarchical methods to the wrapped Taxon object, and overriding all  
hierarchical queries to not use a database. I'm not sure I fully  
understand yet though, but the inconsistent behavior will be sure to  
throw people off track.
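
A rough sketch of such a wrapper, under the assumption that delegation
is done with AUTOLOAD. TreeTaxon and ToyTaxon are hypothetical names
invented for this example and do not exist in BioPerl; ToyTaxon stands
in for a database-backed taxon whose hierarchy methods would hit the
database.

```perl
use strict;
use warnings;

# Stand-in for a database-backed taxon (hypothetical).
package ToyTaxon;
sub new  { my ($class, %args) = @_; return bless {%args}, $class }
sub id   { $_[0]{id} }
sub rank { $_[0]{rank} }
sub ancestor        { die "would query the database\n" }
sub each_Descendent { die "would query the database\n" }

# Wrapper for use inside a tree: the hierarchy is kept locally and
# every non-hierarchical method is passed on to the wrapped object.
package TreeTaxon;
our $AUTOLOAD;
sub new {
    my ($class, $taxon) = @_;
    return bless { taxon => $taxon, ancestor => undef, children => [] },
        $class;
}
sub ancestor        { $_[0]{ancestor} }
sub each_Descendent { @{ $_[0]{children} } }
sub add_Descendent {
    my ($self, $child) = @_;
    $child->{ancestor} = $self;
    push @{ $self->{children} }, $child;
    return scalar @{ $self->{children} };
}
sub AUTOLOAD {
    my ($self, @args) = @_;
    (my $method = $AUTOLOAD) =~ s/.*:://;
    return if $method eq 'DESTROY';    # nothing to clean up
    return $self->{taxon}->$method(@args);
}

package main;
my $root = TreeTaxon->new(ToyTaxon->new(id => 1, rank => 'kingdom'));
my $leaf = TreeTaxon->new(ToyTaxon->new(id => 2, rank => 'species'));
$root->add_Descendent($leaf);

# rank() and id() are delegated; ancestor() is answered locally.
print $leaf->rank, ' ', $leaf->ancestor->id, "\n";
```

The parent and child references here form a cycle, which a real
implementation would probably want to break with a weak reference.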

	-hilmar

On Jun 19, 2007, at 10:25 AM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> So the real mistake was to write
>>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>> instead of
>>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>>   my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents($node);
>> I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the
>> database?
>
> Yes, the database object methods use the database. I don't even  
> think it makes sense to question that. What else would it do?
>
>
>> If this is correct, can we highlight this in the documentation?  
>> It's  a small difference that everyone failed to spot.
>
> The documentation for what? I've already clearly pointed out the  
> gotcha in Bio::Taxon.
>
>
>> Also, in my reading of Bio::Taxonomy::Taxon it won't use the  
>> database  either for ancestor(). Which would be consistent with  
>> its other methods.
>
> Bio::Taxonomy::Taxon? It's deprecated. Bio::Taxon is what we're  
> dealing with, and it /does/ use the db to get the ancestor, unless  
> the ancestor is manually set (see below for explanation).
>
>
>> I.e., the bottom line is don't use Node or Taxon objects for   
>> hierarchy queries that you expect to use an underlying database,  
>> use  the Bio::DB::Taxonomy object instead. It makes sense, but is  
>> it true?
>
> Almost. It happens to be true but ideally wouldn't be the case. The  
> confusion and problems arise, I guess, because we have two ways to  
> access/create hierarchies and both of them are built from the same  
> building block (Bio::Taxon objects).
>
> On the one hand we have Bio::DB::Taxonomy and the other we have  
> Bio::Tree::Tree.
>
> Tree objects are easy: you have a Taxon object created in memory  
> for each and every node in the tree. Each Taxon knows its ancestor  
> and descendants by storing references to the relevant Taxon objects  
> in the tree. You 'navigate' through the tree by grabbing a Taxon  
> inside it and asking the Taxon itself for its ancestor or descendant.
>
> This leaves us with the Taxon object having the methods ancestor()  
> and each_Descendent(), which we'll expect to work in other  
> circumstances.
>
> Bio::DB::Taxonomy returns single Taxon objects from the database on  
> request. Now we still expect our ancestor() and each_Descendent()  
> methods to work, but if things were set up like Bio::Tree::Tree  
> we'd end up pulling the entire database into memory because we'd  
> have to create all the Taxon objects that are ancestors and  
> descendants, recursively, every time we request a single Taxon  
> (which is wasteful in the case of Bio::DB::Taxonomy::flatfile and  
> slow/not allowed in the case of Bio::DB::Taxonomy::entrez).
>
> The solution? We simply don't create the immediate ancestor or  
> descendant Taxon objects of the requested Taxon, and instead  
> implement the Taxon methods to ask the database to create them on  
> demand, if they don't already exist. Well, that idea is fine (and  
> necessary) for the ancestor method, but we run into problems with  
> each_Descendent().
>
> The problem arises when we create Bio::Tree::Tree objects from a  
> Taxon we got from the database. Being able to do that is why  
> Bio::Taxon is shared between them, as it is a very desirable thing  
> to do: you can instantly create a lineage tree for a Taxon of  
> interest and then use all the Bio::Tree::Tree methods on it.  
> Unfortunately one of those methods is get_nodes() which is  
> implemented using each_Descendent() and get_all_Descendents(). If  
> each_Descendent() asked the database for the real answer, we'd end  
> up pulling the entire database into the tree.
>
> So my implementation was to not ask the database and just warn  
> people in the docs. Ideally it /would/ use the database, because  
> that's what a user would expect. Can anyone see an alternate way  
> around the problem?

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cain.cshl at gmail.com  Tue Jun 19 14:41:52 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Tue, 19 Jun 2007 14:41:52 -0400
Subject: [Bioperl-l] [Gmod-gbrowse] is this a bp_genbank2gff3.pl bug?
In-Reply-To: <18039.61086.829726.809888@gargle.gargle.HOWL>
References: <18039.61086.829726.809888@gargle.gargle.HOWL>
Message-ID: <1182278512.2592.42.camel@localhost.localdomain>

Hi Alessandra,

I cc'ed your message to the bioperl and sequence ontology mailing lists,
since your question is relevant to both.

Converting genbank files to GFF3 is excruciatingly difficult; I
generally find that I can use the genbank2gff3 script to get me most of
the way there, but then I need to do some manual fixing to make it
'right'.

I am using bioperl-live, since there have been several fixes to the
script since bioperl 1.5.2 was released, including the most recent fixes
from me today (when I started working on this); I would suggest you use
bioperl-live as well.  I ran the script on chrY.

Most (perhaps all) of the errors fit into a few categories:

  - CDS doesn't have a phase, where the GFF3 spec requires CDSes to have
a phase.  Since it can be a little bit of a hassle to calculate, I
understand why it was left out, but I'll submit a bug report to have
those calculated.  If you are planning on loading the GFF file into
Chado, you can use the --noCDS option to get exons instead of CDSes,
which makes the problem go away (the validator has a bug here though--it
reports the polypeptide derives_from mRNA as invalid, but it is correct;
I'm reporting that directly to the author).  Here's the bioperl bug
report:

  http://bugzilla.open-bio.org/show_bug.cgi?id=2322

  - "invalid type pair" is caused by the genbank file using feature
types in a way that conflicts with the Sequence Ontology.  For example,
it has STS features that are part_of a gene, pseudogenic_region as
part_of pseudogene.  I don't know if there would be an easy way to catch
this in the conversion script.  You may need to fix these by hand.  If
the problems occur for features that you don't care about, you can use
the --filter option to leave them out of the resulting GFF file (for
example, adding '--filter STS' would leave all STS features out of the
file).  Also, if you don't plan on loading these into Chado (which does
require SO-compliance) but instead plan on using a Bio::DB::SeqFeature
database, these errors may not be a problem.

  - "invalid type" is caused by feature types that are not in SOFA
(Sequence Ontology for Feature Annotation), though the terms probably
are in SO.  I thought at one point we discussed allowing any SO type to
appear in the GFF3 type column, but that is not what the spec says now.
I don't see this type of error as causing a problem for either
Bio::DB::SeqFeature or Chado.  Chado allows features to be typed with
anything that is in SO and does not restrict to SOFA.
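
For the missing-phase case in the first category above, the calculation
itself is small. Here is a minimal sketch in plain Perl; it is not what
bp_genbank2gff3.pl does today. cds_phases() is a hypothetical helper, and
it assumes the CDS segments of one transcript are passed in 5' to 3' order
(descending coordinates on the minus strand) and that the first segment
starts on a codon boundary unless a different starting phase is given.

```perl
use strict;
use warnings;

# Compute the GFF3 phase for each CDS segment of one transcript.
# $segments    : array ref of [start, end] pairs (1-based, inclusive),
#                ordered 5' to 3' along the transcript.
# $first_phase : phase of the first segment; defaults to 0 (assumption:
#                the CDS is not 5'-partial).
sub cds_phases {
    my ($segments, $first_phase) = @_;
    my $phase = defined $first_phase ? $first_phase : 0;
    my @phases;
    for my $seg (@$segments) {
        my ($start, $end) = @$seg;
        push @phases, $phase;
        my $len = $end - $start + 1;
        # After skipping $phase bases and reading whole codons, the leftover
        # bases decide how many the next segment must give up.
        $phase = (3 - (($len - $phase) % 3)) % 3;
    }
    return @phases;
}

my @phases = cds_phases([[1, 10], [21, 30], [41, 50]]);
print "@phases\n";    # prints "0 2 1"
```

The phase is the number of bases to skip at the start of a segment to reach
the next full codon, so each segment's value depends only on the lengths of
the segments before it.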

Scott


On Tue, 2007-06-19 at 16:56 +0200, Alessandra Bilardi wrote:
> Hi all,
> 
> I used bp_genbank2gff3.pl from CVS bioperl to create a gff3 file from a
> human genbank file. I ran the online validate_gff3 on human.gff and it
> reports non-unique ids, so inserting it into the gbrowse database fails.
> 
> I attach the error files for hs_ref_chrY.gbk and hs_ref_chr1.gbk, which
> I downloaded from ftp://ftp.ncbi.nih.gov/genomes/H_sapiens
> Elements with non-unique ids are:
> - CDS or pseudo*exon without mRNA and parent 
> - STS with equal start and end
> - tRNA with equal name
> 
> If this is a bp_genbank2gff3.pl bug, can you fix bp_genbank2gff3.pl?
> If I'm mistaken, can you help me?
> 
> Thanks very much for the help in advance,
> 
> Alessandra.
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From sac at bioperl.org  Tue Jun 19 14:54:39 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Tue, 19 Jun 2007 11:54:39 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <467788C5.6070406@sendu.me.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
Message-ID: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>

Valid points, Sendu. I wonder if there might be a best-of-both-worlds
approach here. I would not be advocating for a major slice and dice,
but just identifying a few large, reasonably well established and
encapsulated blocks of functionality that could be managed more
independently and segregating them away from the rest. For example:
DB, Graphics, Search+SearchIO, Tools.

Once per year, we could have a "whole caboodle" release where the core
and all sub parts are tested and released as a group, as we currently
do. Then, updates to the sub parts can occur as-needed but without
necessarily involving updates to other sub parts or the core.

The onus would be on the pumpkin for the sub part release to make sure
it continues to work with the last whole caboodle release. This would
minimize the number of release clashes, since sub part updates would
only be sanctioned relative to the last caboodle release, and it would
ensure that the whole set continues to interoperate.

Perhaps it would be worth experimenting with such an approach so we
can judge it based on actual experience. We could identify one
functional sub part and segregate it out, do a release cycle or two,
along with a sub part release, and decide if this makes things easier
or harder, for devs as well as users. We could always bring it back
into the fold if it doesn't work out.

My fear is that as bioperl continues to grow, the monolithic approach
will become increasingly onerous for a single release pumpkin to
manage, and harder to find someone who feels up to the task. It could
also discourage new developers from diving into the codebase if it
looks too deep. And they are our lifeblood.

A more functionally segregated bioperl codebase could lower the
activation energy needed to recruit release pumpkins and new devs,
leading to more release iterations, fewer bugs, more features, and
more sustainable growth.

When I first discovered Bioperl in 1996, it had three modules. At
~900, I  probably wouldn't have joined ranks as a developer (well, I
probably would, but it would have taken a while to digest it and
become a contributor).

Steve

On 6/19/07, Sendu Bala <bix at sendu.me.uk> wrote:
> Steve Chervitz wrote:
> > Might this be a good opportunity to investigate partitioning
> > bioperl-live into sub-repositories? There has been talk in the past of
> > defining a set of "core" modules separate from other functionally
> > related groups of modules that would be viewed as optional extensions.
> > The goal being to help manage growth and simplify releases. There are
> > currently 892 modules under Bio/.
> >
> > In addition to simplifying the migration to SVN, it would also have
> > other benefits. Say some new functionality or a slew of fixes were
> > added to Bio::Graphics. We could turn around a new Bio::Graphics
> > release quickly without having to work on getting various other parts
> > up to snuff that aren't related to graphics (Biblio, DB, PopGen,
> > Search etc.). Maintenance and releases of the various extensions would
> > be more parallelizable, orchestrated by separate ring leaders.
> >
> > Over time, as a set of functionality matures, it would see fewer
> > updates and there would be less of a need for users to
> > download/install/test it. This could make bioperl easier to customize,
> > extend, and grok in general.
> >
> > Long term, it should ease development and release cycles
>
> I actually take the opposite view. Breaking things up makes testing and
> releases more difficult.
>
> If one person acts as pumpkin for all the sub-parts, his work-load
> increases almost linearly with the number of sub-parts. If each sub-part
> gets its own pumpkin, where do all these pumpkins come from? It seems to
> me that frequently authors will write modules but inevitably their
> circumstance changes and they can no longer devote the time to look
> after them. Having a single pumpkin and 'forcing' him to make sure
> everything works (regardless of his personal interest in the module)
> seems more reliable than hoping there will be a person interested enough
> in each sub-part to handle its release.
>
> Since all sub-parts will at the least interact with the 'true' core set
> of Bioperl modules, they need to be tested and potentially re-released
> every time the true core is updated. And since some sub-parts will
> interact with other sub-parts, there will need to be coordinated
> joint-testing and release of multiple sub-parts.
>
> What happens when users report problems? We ask them what version
> they're running. Right now '1.5.2' means a specific thing, and it's
> trivial for someone to confirm the same problem by installing 1.5.2.
> What happens when users have to list out all the versions of all the
> sub-parts they have? Who is going to consistently recreate a user's
> hodge-podge of versions in order to confirm a bug? Won't the advice
> instead be: "update all versions to the latest and get back to us"?
>
> So, as I see it, all sub-parts would best be tested and released with a
> single new version number every time one sub-part is updated
> (significantly). In which case, why have sub-parts at all? Keeping
> things the way they are now means ease of release for the pumpkin and
> ease of installation for end-users (only one install command to issue to
> CPAN). Having 'true' sub-parts (each with its own pumpkin), in my
> fatalistic view, is just going to lead to some useful sub-parts being
> abandoned and never updated, even where updates may be desirable.
>
> Each and every Bio:: module could have been released separately by its
> respective author. As I see it, one of the main values of 'Bioperl' is
> that it's one (reasonably) consistent collection of modules that lowers
> the barrier of entry for new Bioinformaticians, giving them extremely
> easy access to a whole host of functionality with a single install.
>


From bix at sendu.me.uk  Tue Jun 19 15:13:39 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 20:13:39 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
Message-ID: <46782AE3.2090703@sendu.me.uk>

Steve Chervitz wrote:
> Valid points, Sendu. I wonder if there might be a best-of-both-worlds
> approach here.
[snip]

You haven't convinced me, but I'd go along with the majority decision if 
best-of-both-worlds was picked.


> DB, Graphics, Search+SearchIO, Tools.

I will, however, say that DB interleaves into too many core modules. It 
should stay in core. Tools? It's hardly touched anyway, so I don't see 
the value of taking it out, what with Bio::Tools::Run already being its 
own package. Most Bioperl users probably get Bioperl just to do 
something Blast related, so all Blast stuff really ought to stay in core.

Graphics is an obvious choice and I agree. Updated frequently, and has 
its own release needs. It also has some of the trickier dependencies, so 
would make installing core simpler.

I can imagine plucking Search+SearchIO out, and it's something that needs 
regular updating. Another good candidate.


> Perhaps it would be worth experimenting with such an approach so we
> can judge it based on actual experience. We could identify one
> functional sub part and segregate it out, do a release cycle or two,
> along with a sub part release, and decide if this makes things easier
> or harder, for devs as well as users.

Well, we already have the run package. It's a split-off subpart that gets 
updated. The only 'experiment' left to do is finding it its own pumpkin.


From bix at sendu.me.uk  Tue Jun 19 15:48:50 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 20:48:50 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <C2348A85-2F44-4AD5-8996-DDA19B79F994@gmx.net>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
	<5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
	<4677E756.6050200@sendu.me.uk>
	<C2348A85-2F44-4AD5-8996-DDA19B79F994@gmx.net>
Message-ID: <46783322.30309@sendu.me.uk>

Hilmar Lapp wrote:
> Here's what I understand of your description of the problem:
> 
> - We would like nodes returned from Bio::DB::Taxonomy to use the  
> database for all hierarchical queries.
> 
> - We would like nodes used in a Bio::Tree::Tree not to use the  
> database for any hierarchical query.

Correct.


> What I understand that we have is
> 
> - Taxon node objects that have a db_handle set will use the database  
> for ancestor(), unless it has been set manually (?), but not for  
> each_Descendent().
> 
> - Taxon node objects that don't have a db_handle set won't use a  
> database but will function normally otherwise.
> 
> - This is needed to prevent Bio::Tree::Tree methods from pulling the  
> entire tree into memory.

Correct.


> If this is correct (I'm not sure it is), it sounds like we want to  
> temporarily divorce taxonomy nodes from their database capabilities  
> while they are being queried in a tree context?

Yes.


> I'm still trying to understand - if I create a Bio::Tree::Tree from a  
> single node, will the tree automatically contain all nodes along the  
> lineage of ancestors up to the root? So, even if extracting this  
> lineage involved querying a database it would be acceptable, but not  
> for querying descendents?

Yes. Asking the database for all the ancestors up to root only pulls a 
couple of nodes into the tree and is exactly what the user would want to 
happen. But if nodes are allowed to get their descendants from the 
database, when we get the root node from the database, we'd get all the 
root's descendants, and then for each of those we'd get all /their/ 
descendants... that's when the whole db gets sucked in.


> It sounds to me like what is needed is that nodes that get added to a  
> tree need to be stripped of their database capabilities. This could  
> be achieved by creating a wrapper class that delegates all non- 
> hierarchical methods to the wrapped Taxon object, and overriding all  
> hierarchical queries to not use a database. I'm not sure I fully  
> understand yet though, but the inconsistent behavior will be sure to  
> throw people off track.

When we're making a tree from a db Taxon we need db access to find all 
the ancestors; we just don't want to get any descendants outside our 
initiating Taxon's direct lineage.


use Test::More tests => 5;
use Bio::DB::Taxonomy;
use Bio::Tree::Tree;

# Two lineages that share 'Eukaryota' and 'Mammalia', loaded into a 'list' db
my @names = ('Eukaryota', 'Mammalia', 'Primates', 'Homo', 'Homo sapiens');
my @ranks = qw(superkingdom class order genus species);
my $db = Bio::DB::Taxonomy->new(-source => 'list', -names => \@names,
                                                    -ranks => \@ranks);

@names = ('Eukaryota', 'Mammalia', 'Rodentia', 'Mus', 'Mus musculus');
$db->add_lineage(-names => \@names, -ranks => \@ranks);


# A node straight from the database should answer hierarchical queries
my $homo = $db->get_taxon(-name => 'Homo');
isa_ok($homo, 'Bio::Taxon'); # PASS

is $homo->ancestor->scientific_name, 'Primates'; # PASS
my @descs = $homo->each_Descendent;
is scalar(@descs), 1; # FAIL, we wanted it to contain the 'Homo sapiens' node


# A lineage tree built from that node must not pull in other branches
my $lineage = Bio::Tree::Tree->new(-node => $homo);
is $lineage->get_root_node->scientific_name, 'Eukaryota'; # PASS
my @nodes = $lineage->get_nodes;
is scalar(@nodes), 4; # PASS: we didn't pull in Rodentia which would be 8

(on that last test I can't remember if the answer might actually be 5 
because our lineage does contain 'Homo sapiens')


If anyone can figure out how to get all those to pass, please let me know.
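
One possible direction, following Hilmar's wrapper suggestion, sketched
with toy classes only. ToyDB, ToyTaxon, LineageNode, lineage_nodes() and
count_nodes() are all hypothetical names made up for illustration; none of
this is the Bio::Taxon or Bio::Tree::Tree API. The idea is that nodes from
the database answer every hierarchical query from the database, while a
tree wraps them in a node that only knows the children attached to it.

```perl
use strict;
use warnings;

# Toy stand-in for a taxonomy database: maps child name => parent name.
package ToyDB;
sub new {
    my ($class, %parent_of) = @_;
    return bless { parent_of => {%parent_of} }, $class;
}
sub parent_name { my ($self, $name) = @_; return $self->{parent_of}{$name} }
sub child_names {
    my ($self, $name) = @_;
    my $p    = $self->{parent_of};
    my @kids = grep { $p->{$_} eq $name } keys %$p;
    return sort @kids;
}

# Node as handed out by the database: every hierarchical query asks the db.
package ToyTaxon;
sub new {
    my ($class, $db, $name) = @_;
    return bless { db => $db, name => $name }, $class;
}
sub scientific_name { my ($self) = @_; return $self->{name} }
sub ancestor {
    my ($self) = @_;
    my $parent = $self->{db}->parent_name($self->{name});
    return defined $parent ? ToyTaxon->new($self->{db}, $parent) : undef;
}
sub each_Descendent {
    my ($self) = @_;
    return map { ToyTaxon->new($self->{db}, $_) }
           $self->{db}->child_names($self->{name});
}

# Wrapper used inside a lineage tree: descendants are only the nodes that
# were attached explicitly; every other method goes to the wrapped taxon.
package LineageNode;
our $AUTOLOAD;
sub new {
    my ($class, $taxon) = @_;
    return bless { taxon => $taxon, children => [] }, $class;
}
sub add_Descendent  { my ($self, $child) = @_; push @{ $self->{children} }, $child; return }
sub each_Descendent { my ($self) = @_; return @{ $self->{children} } }
sub DESTROY { }
sub AUTOLOAD {
    my $self = shift;
    (my $method = $AUTOLOAD) =~ s/.*:://;
    return $self->{taxon}->$method(@_);
}

package main;

# Walk up from a db taxon to the root (db access is fine here), wrapping
# each node and linking it only to the child we came from.
sub lineage_nodes {
    my ($taxon) = @_;
    my (@chain, $child);
    while ($taxon) {
        my $node = LineageNode->new($taxon);
        $node->add_Descendent($child) if $child;
        unshift @chain, $node;
        ($child, $taxon) = ($node, $taxon->ancestor);
    }
    return @chain;    # root first
}

# Count a node plus everything reachable through each_Descendent().
sub count_nodes {
    my ($node) = @_;
    my $n = 1;
    $n += count_nodes($_) for $node->each_Descendent;
    return $n;
}

my $db = ToyDB->new(
    'Mammalia'     => 'Eukaryota', 'Primates' => 'Mammalia', 'Homo' => 'Primates',
    'Homo sapiens' => 'Homo',      'Rodentia' => 'Mammalia', 'Mus'  => 'Rodentia',
    'Mus musculus' => 'Mus',
);
my $homo  = ToyTaxon->new($db, 'Homo');
my @chain = lineage_nodes($homo);
print count_nodes($chain[0]), "\n";                           # prints 4
print count_nodes(ToyTaxon->new($db, 'Eukaryota')), "\n";     # prints 8
```

In this toy version the db node still reports 'Homo sapiens' as a
descendant of 'Homo', while the lineage built from it stops at 4 nodes
instead of the 8 you get by recursing through the database. Whether the
real classes can be split this way is a separate question.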


From cjfields at uiuc.edu  Tue Jun 19 17:15:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 19 Jun 2007 16:15:00 -0500
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
Message-ID: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>


On Jun 19, 2007, at 1:54 PM, Steve Chervitz wrote:

> Valid points, Sendu. I wonder if there might be a best-of-both-worlds
> approach here. I would not be advocating for a major slice and dice,
> but just identifying a few large, reasonably well established and
> encapsulated blocks of functionality that could be managed more
> independently and segregating them away from the rest. For example:
> DB, Graphics, Search+SearchIO, Tools.

There should also be a consensus between the core devs on this; I  
don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing  
their opinions as it will directly impact projects which rely on core  
functionality (GBrowse/GMOD, bioperl-db, etc).  I also agree with  
George that this should be postponed until after svn issues are taken  
care of.

Stating that, I think this is a good idea in general, though we'll  
need to be careful which ones we segregate out as non-core.  I agree  
with your choices; I would add in Bio::Restriction, Bio::Assembly,  
Bio::Structure, and a few more.  As long as the distribution required  
installation of 'core' prior to test runs it shouldn't be too much of  
a problem.

In order for this to work we would need to delineate what defines  
'core' (how broad the definition should be), then identify those  
modules that don't fit and decide what to do with them.  Would we  
want to split the others into separate packages or lump together as a  
bioperl-auxiliary (horrid name, but you get my point)?  Too many  
could be a logistical nightmare, as Sendu has pointed out.

> Once per year, we could have a "whole caboodle" release where the core
> and all sub parts are tested and released as a group, as we currently
> do. Then, updates to the sub parts can occur as-needed but without
> necessarily involving updates to other sub parts or the core.

Sounds fine by me.  Actually, my thought was we could reimplement  
Bundle::BioPerl on CPAN (which Module::Build effectively obsoleted)  
to install all the necessary subpackages in order to emulate an old- 
style 'core' installation, or act as an 'install everything BioPerl- 
related' Bundle.  Regular updates of the subpackages to CPAN should  
just require updating the Bundle (which would update only the  
relevant parts, at least I believe it would).
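
To make the Bundle idea concrete, here is roughly what such a file could
look like. This is only a sketch: Bundle::BioPerlAll is a made-up name, and
which modules would end up in which split-off distribution is an assumption
(the modules listed exist today, the distributions do not). CPAN reads the
CONTENTS section and installs whatever provides each listed module.

```perl
package Bundle::BioPerlAll;    # hypothetical name, not an existing CPAN bundle

use strict;
use warnings;

our $VERSION = '0.01';

=head1 NAME

Bundle::BioPerlAll - hypothetical bundle: core plus the split-off parts

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerlAll'

=head1 CONTENTS

Bio::Root::Root      - stands in for the core distribution

Bio::Graphics::Panel - stands in for a split-off graphics distribution

Bio::SearchIO        - stands in for a split-off Search+SearchIO distribution

=cut

1;
```

Releasing a sub part would then mean a new upload of that distribution,
and users who installed through the bundle would pick it up by installing
the bundle again.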

> The onus would be on the pumpkin for the sub part release to make sure
> it continues to work with the last whole caboodle release. This would
> minimize the number of release clashes, since sub part updates would
> only be sanctioned relative to the last caboodle release, and it would
> ensure that the whole set continues to interoperate.
>
> Perhaps it would be worth experimenting with such an approach so we
> can judge it based on actual experience. We could identify one
> functional sub part and segregate it out, do a release cycle or two,
> along with a sub part release, and decide if this makes things easier
> or harder, for devs as well as users. We could always bring it back
> into the fold if it doesn't work out.
>
> My fear is that as bioperl continues to grow, the monolithic approach
> will become increasingly onerous for a single release pumpkin to
> manage, and harder to find someone who feels up to the task. It could
> also discourage new developers from diving into the codebase if it
> looks too deep. And they are our lifeblood.

Agreed!

> A more functionally segregated bioperl codebase could lower the
> activation energy needed to recruit release pumpkins and new devs,
> leading to more release iterations, fewer bugs, more features, and
> more sustainable growth.

'Activation energy.'  Hmm.  Spoken like a true biologist.

> When I first discovered Bioperl in 1996, it had three modules. At
> ~900, I  probably wouldn't have joined ranks as a developer (well, I
> probably would, but it would have taken a while to digest it and
> become a contributor).
>
> Steve

I pretty much agree, though this will require quite a bit more  
discussion.

chris


From hlapp at gmx.net  Tue Jun 19 17:57:54 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 17:57:54 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
Message-ID: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>


On Jun 19, 2007, at 5:15 PM, Chris Fields wrote:

> There should also be a consensus between the core devs on this; I
> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing
> their opinions

The problem I have increasingly had with BioPerl (aside from the fact  
that it's written in Perl ;) is the plethora of dependencies I need  
to install, not the number of modules.

But every time I've been told that that's what Perl is all about, and  
I should shut up and install the bundle. Idiosyncratically I don't  
like bundles that clutter up my hard disk with stuff I'll never use,  
and in this sense if BioPerl is divided into 10 packages I will have  
to think about each one whether I need it, and do a separate CVS  
checkout - and regular update - of each one (though granted, I  
believe there are ways the multiple checkout and update thing can be  
taken care of).

In reality, this may be a rapidly disappearing trait though of those  
who have grown up in a time when they proudly spent all their savings  
to buy that new computer because it had a 20MB hard disk, compared to  
the two 360k floppy drives the previous one had.

So don't ask me, just don't make it too hard for the dinosaurs.

> as it will directly impact projects which rely on core
> functionality (GBrowse/GMOD, bioperl-db, etc).

Well, I hope there are ways to limit that?

> I also agree with George that this should be postponed until after  
> svn issues are taken care of.

I agree entirely. Please don't throw this in the same bin or tie one  
to the other. The migration is neither easier nor faster nor better  
testable with a partitioned BioPerl.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Jun 19 21:48:20 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 19 Jun 2007 20:48:20 -0500
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
Message-ID: <D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>


On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote:

> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote:
>
>> There should also be a consensus between the core devs on this; I
>> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing
>> their opinions
>
> The problem I have increasingly had with BioPerl (aside from the fact
> that it's written in Perl ;) is the plethora of dependencies I need
> to install, not the number of modules.
>
> But every time I've been told that that's what Perl is all about, and
> I should shut up and install the bundle. Idiosyncratically I don't
> like bundles that clutter up my hard disk with stuff I'll never use,
> and in this sense if BioPerl is divided into 10 packages I will have
> to think about each one whether I need it, and do a separate CVS
> checkout - and regular update - of each one (though granted, I
> believe there are ways the multiple checkout and update thing can be
> taken care of).

I agree; the fewer dependencies the better.  We could divide it up  
into a small, focused core package with only a few dependencies, and  
1-3 more containing the focused bits which require the most  
maintenance (Graphics, SearchIO/Tools, etc).  I worry about having  
too many more.

> In reality, this may be a rapidly disappearing trait though of those
> who have grown up in a time when they proudly spent all their savings
> to buy that new computer because it had a 20MB hard disk, compared to
> the two 360k floppy drives the previous one had.
>
> So don't ask me, just don't make it too hard for the dinosaurs.

There would need to be some way of getting an old-style full-blown  
core installation regardless of how many subdistros we would divvy  
core up into.  My thought for CPAN was having Bundle::BioPerl take  
over this but I'm not sure if it's still being used.  Maybe there are  
other ways for svn/cvs.

>> as it will directly impact projects which rely on core
>> functionality (GBrowse/GMOD, bioperl-db, etc).
>
> Well, I hope there are ways to limit that?

I believe so, yes, particularly for bioperl-db.  I would think  
splitting off Bio::Graphics or Bio::DB* will have some effect on  
GBrowse/GFF.

>> I also agree with George that this should be postponed until after
>> svn issues are taken care of.
>
> I agree entirely. Please don't throw this in the same bin or tie one
> to the other. The migration is neither easier nor faster nor better
> testable with a partitioned BioPerl.
>
> 	-hilmar

We def. have to complete transition to subversion first, then think  
about this some more.

chris


From n.haigh at sheffield.ac.uk  Wed Jun 20 02:31:24 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 20 Jun 2007 07:31:24 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
Message-ID: <4678C9BC.10206@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote:
> 
>> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote:
>>
>>> There should also be a consensus between the core devs on this; I
>>> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing
>>> their opinions
>> The problem I have increasingly had with BioPerl (aside from the fact
>> that it's written in Perl ;) is the plethora of dependencies I need
>> to install, not the number of modules.
>>
>> But every time I've been told that that's what Perl is all about, and
>> I should shut up and install the bundle. Idiosyncratically I don't
>> like bundles that clutter up my hard disk with stuff I'll never use,
>> and in this sense if BioPerl is divided into 10 packages I will have
>> to think about each one whether I need it, and do a separate CVS
>> checkout - and regular update - of each one (though granted, I
>> believe there are ways the multiple checkout and update thing can be
>> taken care of).
> 
> I agree; the fewer dependencies the better.  We could divide it up  
> into a small, focused core package with only a few dependencies, and  
> 1-3 more containing the focused bits which require the most  
> maintenance (Graphics, SearchIO/Tools, etc).  I worry about having  
> too many more.
> 
>> In reality, this may be a rapidly disappearing trait though of those
>> who have grown up in a time when they proudly spent all their savings
>> to buy that new computer because it had a 20MB hard disk, compared to
>> the two 360k floppy drives the previous one had.
>>
>> So don't ask me, just don't make it too hard for the dinosaurs.
> 
> There would need to be some way of getting an old-style full-blown  
> core installation regardless of how many subdistros we would divvy  
> core up into.  My thought for CPAN was having Bundle::BioPerl take  
> over this but I'm not sure if it's still being used.  Maybe there are  
> other ways for svn/cvs.

Personally, I think this use of Bundle::Bioperl is more in line with
what CPAN Bundles were meant to do - "a bundle is a collection of
modules that comprise a cohesive unit". Under that definition you could
probably put the whole of Bioperl but I won't go there! When a package
is updated and a new release is made, this should be
installable/updatable via cpan as well as updating the bundle with the
correct version. This way you can get all of Bioperl via the bundle, or
just install the sub-packages on their own.

If the switch over to svn takes place, will all the Bioperl-* projects
move over at the same time? If so, will they go into their own svn
repository or into the same one? Since with svn you can checkout any
subtree of the repository I'm not clear on the pros and cons of either
of these options.

Am I right in thinking that there is a way for cvs to define a "project"
such that when you checkout that "project" it actually checks out
multiple projects behind the scene? I'm sure I've seen this somewhere,
possibly when the project is dependent on some 3rd party code that is
also in cvs. If this is possible, I'm sure it will also be possible with
svn. This could then allow something like the following to happen after
the split up of Bioperl. The following projects could be defined:
bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project"
called "bioperl" would actually checkout the real projects called
bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems
that this ought to be possible, doesn't it?


> 
>>> as it will directly impact projects which rely on core
>>> functionality (GBrowse/GMOD, bioperl-db, etc).
>> Well, I hope there are ways to limit that?
> 
> I believe so, yes, particularly for bioperl-db.  I would think  
> splitting off Bio::Graphics or Bio::DB* will have some effect on  
> GBrowse/GFF.
> 
>>> I also agree with George that this should be postponed until after
>>> svn issues are taken care of.
>> I agree entirely. Please don't throw this in the same bin or tie one
>> to the other. The migration is neither easier nor faster nor better
>> testable with a partitioned BioPerl.
>>
>> 	-hilmar
> 
> We def. have to complete transition to subversion first, then think  
> about this some more.
> 
> chris

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeMm7czuW2jkwy2gRAi+CAJ9cNZ70GojV7eviRjdWTFLk/MKYoACg2Ls4
op9sQTZyeK6G6taFhTAPMYc=
=7NRw
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Wed Jun 20 07:46:16 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 07:46:16 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <4678C9BC.10206@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
Message-ID: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote:

> If the switch over to svn takes place, will all the Bioperl-* projects
> move over at the same time?

They are under the same CVSROOT right now. Locking down some sub- 
repositories but not others may be odd or impossible.

> If so, will they go into their own svn repository or into the same  
> one?

Good question, I'm not sure about the pros and cons one way or the  
other either. The fewer repositories the less sysadmin work in fine- 
graining permissions.

	-hilmar

- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFGeRONuV6N2JxL7qsRAoYTAJ9GVuC0j4szCcWTg7yWGoxN3YFucQCgogJ8
Ims4d150lsX0vXtDwGI1lKg=
=K4++
-----END PGP SIGNATURE-----


From n.haigh at sheffield.ac.uk  Wed Jun 20 07:57:22 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 20 Jun 2007 12:57:22 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
Message-ID: <46791622.6080409@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hilmar Lapp wrote:
> 
> On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote:
> 
>> If the switch over to svn takes place, will all the Bioperl-* projects
>> move over at the same time?
> 
> They are under the same CVSROOT right now. Locking down some
> sub-repositories but not others may be odd or impossible.
> 
>> If so, will they go into their own svn repository or into the same one?
> 
> Good question, I'm not sure about the pros and cons one way or the other
> either. The fewer repositories the less sysadmin work in fine-graining
> permissions.
> 
>     -hilmar
> 


I don't think there is any major reason why the following single repos
wouldn't do the trick:

/--
  |-bioperl-live
  |     |--- trunk
  |     |--- branches
  |     |--- tags
  |
  |-bioperl-run
        |--- trunk
        |--- branches
        |--- tags

Any reason why this couldn't be used?

I know some people don't like the idea of the revision number
incrementing for the whole repository if it contains several "projects".
However, revision numbers are really only a way for svn to keep track of
things and a very large revision number shouldn't really "upset" anyone.

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeRYiczuW2jkwy2gRApS5AJsHl73MWZP8aMfOqlLgTYuzpMWmQgCg3VqA
1Vj8BSUnanpdjYYLE6eGanU=
=bOqK
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Wed Jun 20 08:08:33 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 08:08:33 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <46791622.6080409@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
	<46791622.6080409@sheffield.ac.uk>
Message-ID: <DBFDD481-4377-4E7C-A4F6-B1B57A4D0A9F@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote:

> I don't think there is any major reason why the following single repos
> wouldn't do the trick:
>
> /--
>   |-bioperl-live
>   |     |--- trunk
>   |     |--- branches
>   |     |--- tags
>   |
>   |-bioperl-run
>         |--- trunk
>         |--- branches
>         |--- tags
>
> Any reason why this couldn't be used?

That would work fine except that there are several more sub-projects  
(bioperl-db, bioperl-graphics, bioperl-microarray, and a few more).

That should still be fine. I think what needs to be recognized is the  
limitations it puts on permission granularity. If it's all the same  
repository (as is now) then having commit rights to one (subproject)  
will mean commit rights to all. From my perspective that's fine, it  
has worked great so far.

	-hilmar

- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFGeRjFuV6N2JxL7qsRAj3dAJ42r1C8By29DNTUP9Ts0Lf5dOcS9QCgjSE1
hckjT7LBtHcmwGI8B+BKQIM=
=gYfA
-----END PGP SIGNATURE-----


From hartzell at alerce.com  Tue Jun 19 15:53:39 2007
From: hartzell at alerce.com (George Hartzell)
Date: Tue, 19 Jun 2007 12:53:39 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
Message-ID: <18040.13379.217277.992742@almost.alerce.com>

Steve Chervitz writes:
 > On 6/16/07, Jason Stajich <jason at bioperl.org> wrote:
 > > [...]
 > > Just to say I already went through all the steps of running cvs2svn
 > > myself and had problems gathering back out the branches and all the
 > tags when I tried it.  You may want to start with a smaller repository
 > like bioperl-network or bioperl-db, as the initial cvs2svn conversion
 > script took quite a long time to run on bioperl-live.
 > 
 > Might this be a good opportunity to investigate partitioning
 > bioperl-live into sub-repositories? [...]

I'd say that the time to do this kind of rearrangement would be
*after* the svn repo's set up.  That way you'll be able to track stuff
back through to the beginning of time.

g.


From sdavis2 at mail.nih.gov  Wed Jun 20 08:44:08 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 20 Jun 2007 08:44:08 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN
	and	...Re:	Perltidy)
In-Reply-To: <DBFDD481-4377-4E7C-A4F6-B1B57A4D0A9F@gmx.net>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>	<4678C9BC.10206@sheffield.ac.uk>	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>	<46791622.6080409@sheffield.ac.uk>
	<DBFDD481-4377-4E7C-A4F6-B1B57A4D0A9F@gmx.net>
Message-ID: <46792118.4030205@mail.nih.gov>

Hilmar Lapp wrote:
> 
> On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote:
> 
>> I don't think there is any major reason why the following single repos
>> wouldn't do the trick:
> 
>> /--
>>   |-bioperl-live
>>   |     |--- trunk
>>   |     |--- branches
>>   |     |--- tags
>>   |
>>   |-bioperl-run
>>         |--- trunk
>>         |--- branches
>>         |--- tags
> 
>> Any reason why this couldn't be used?
> 
> That would work fine except that there are several more sub-projects  
> (bioperl-db, bioperl-graphics, bioperl-microarray, and a few more).
> 
> That should still be fine. I think what needs to be recognized is the  
> limitations it puts on permission granularity. If it's all the same  
> repository (as is now) then having commit rights to one (subproject)  
> will mean commit rights to all. From my perspective that's fine, it  
> has worked great so far.

Actually, I think there are ways of creating per-directory access
control.  See here:

http://svnbook.red-bean.com/en/1.2/svn-book.html#svn.serverconfig.svnserve.auth.general

With Apache-based https access, such access control is relatively
straightforward, it appears.  With the standalone svn server over ssh,
one needs to use "commit hook scripts" to limit access.  But I think it
is possible (admitting that I have not tried to do this...).

Sean


From hartzell at alerce.com  Wed Jun 20 09:23:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 20 Jun 2007 06:23:32 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <4678C9BC.10206@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
Message-ID: <18041.10836.728079.835572@almost.alerce.com>

Nathan S. Haigh writes:
 > [...]
 > If the switch over to svn takes place, will all the Bioperl-* projects
 > move over at the same time? If so, will they go into their own svn
 > repository or into the same one? Since with svn you can checkout any
 > subtree of the repository I'm not clear on the pros and cons of either
 > of these options.

I'm planning to drop the projects from the top of the CVSROOT into a
single svn repository:

    bioperl-ext bioperl-pipeline biodata bioperl-gui
    bioperl-run bioperl-cookbook bioperl-live biosql-schema
    bioperl-corba-client bioperl-microarray html bioperl-corba-server
    bioperl-network task-manager bioperl-das-client bioperl-papers
    xml-html bioperl-db bioperl-pedigree

although that's open to feedback from the core members.

As a progress report, I've built a demo repos with -run, -ext, and
-live in it and asked a couple of folks to take a peek at it.  When
I get a bit further along I'll figure out how to get something for the
public to test.

 > Am I right in thinking that there is a way for cvs to define a "project"
 > such that when you checkout that "project" it actually checks out
 > multiple projects behind the scenes? I'm sure I've seen this somewhere,
 > possibly when the project is dependent on some 3rd party code that is
 > also in cvs. If this is possible, I'm sure it will also be possible with
 > svn. This could then allow something like the following to happen after
 > the split up of Bioperl. The following projects could be defined:
 > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project"
 > called "bioperl" would actually checkout the real projects called
 > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems
 > that this ought to be possible, doesn't it?
 > [...]

I don't think that there's any functionality like that in svn.

g.


From hartzell at alerce.com  Wed Jun 20 09:26:04 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 20 Jun 2007 06:26:04 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <46791622.6080409@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
	<46791622.6080409@sheffield.ac.uk>
Message-ID: <18041.10988.375946.833182@almost.alerce.com>

Nathan S. Haigh writes:
 > -----BEGIN PGP SIGNED MESSAGE-----
 > Hash: SHA1
 > 
 > Hilmar Lapp wrote:
 > > 
 > > On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote:
 > > 
 > >> If the switch over to svn takes place, will all the Bioperl-* projects
 > >> move over at the same time?
 > > 
 > > They are under the same CVSROOT right now. Locking down some
 > > sub-repositories but not others may be odd or impossible.
 > > 
 > >> If so, will they go into their own svn repository or into the same one?
 > > 
 > > Good question, I'm not sure about the pros and cons one way or the other
 > > either. The fewer repositories the less sysadmin work in fine-graining
 > > permissions.
 > > 
 > >     -hilmar
 > > 
 > 
 > 
 > I don't think there is any major reason why the following single repos
 > wouldn't do the trick:
 > 
 > /--
 >   |-bioperl-live
 >   |     |--- trunk
 >   |     |--- branches
 >   |     |--- tags
 >   |
 >   |-bioperl-run
 >         |--- trunk
 >         |--- branches
 >         |--- tags
 > 
 > Any reason why this couldn't be used?
 > [...]

That's exactly the way that I'm setting it up.

g.


From n.haigh at sheffield.ac.uk  Wed Jun 20 09:33:33 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 20 Jun 2007 14:33:33 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <18041.10836.728079.835572@almost.alerce.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>	<4678C9BC.10206@sheffield.ac.uk>
	<18041.10836.728079.835572@almost.alerce.com>
Message-ID: <46792CAD.5060700@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George Hartzell wrote:
> Nathan S. Haigh writes:
>  > [...]
>  > If the switch over to svn takes place, will all the Bioperl-* projects
>  > move over at the same time? If so, will they go into their own svn
>  > repository or into the same one? Since with svn you can checkout any
>  > subtree of the repository I'm not clear on the pros and cons of either
>  > of these options.
> 
> I'm planning to drop the projects from the top of the CVSROOT into a
> single svn repository:
> 
>     bioperl-ext bioperl-pipeline biodata bioperl-gui
>     bioperl-run bioperl-cookbook bioperl-live biosql-schema
>     bioperl-corba-client bioperl-microarray html bioperl-corba-server
>     bioperl-network task-manager bioperl-das-client bioperl-papers
>     xml-html bioperl-db bioperl-pedigree
> 
> although that's open to feedback from the core members.
> 
> As a progress report, I've built a demo repos with -run, -ext, and
> -live in it and asked a couple of folks to take a peek at it.  When
> I get a bit further along I'll figure out how to get something for the
> public to test.

Could I take a peek??

> 
>  > Am I right in thinking that there is a way for cvs to define a "project"
>  > such that when you checkout that "project" it actually checks out
>  > multiple projects behind the scenes? I'm sure I've seen this somewhere,
>  > possibly when the project is dependent on some 3rd party code that is
>  > also in cvs. If this is possible, I'm sure it will also be possible with
>  > svn. This could then allow something like the following to happen after
>  > the split up of Bioperl. The following projects could be defined:
>  > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project"
>  > called "bioperl" would actually checkout the real projects called
>  > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems
>  > that this ought to be possible, doesn't it?
>  > [...]
> 
> I don't think that there's any functionality like that in svn.


I did come across this which might help:
http://subversion.tigris.org/servlets/ReadMsg?listName=users&msgNo=43561

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeSytczuW2jkwy2gRAnlUAJ4pjhPlYlqOm+M882Ni116MJVzPCwCbB3Su
sWDAmqFhGgtlyeawaIGSV14=
=zeAY
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Wed Jun 20 11:38:20 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 20 Jun 2007 16:38:20 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
Message-ID: <467949EC.9040100@sendu.me.uk>

In considering updating all the test scripts to take advantage of the 
new network option, and/or reimplementing them in Test::More, I thought 
now would be a good time to standardize all the test scripts and reduce 
the possibility of having to alter them all in the future if something 
changes.

For example we could decide on an alternate way of choosing to run 
network tests, or a new way of deciding to output debug information. 
There are also some inconsistencies in the messages produced by tests 
skipping all, and even an unfortunate mistake that has been copy/pasted 
through a lot of test scripts.

My solution is t/lib/BioperlTest.pm (documented with perldoc)

We go from this:

----
use strict;
our $DEBUG;

BEGIN {
   $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;
	
   eval { require Test::More; };
   if( $@ ) {
     use lib 't/lib';
   }
   use Test::More; # the mistake!
	
   use Module::Build;
   my $build = Module::Build->current();
   my $do_network_tests = $build->notes('network');

   eval {
     require IO::String;
     require LWP;
     require LWP::UserAgent;
   };
   if ($@) {
     plan skip_all => 'IO::String or LWP or LWP::UserAgent not installed.
This means Bio::Tools::Run::RemoteBlast is not usable. Skipping tests';
   }
   elsif (!$do_network_tests) {
     plan skip_all => 'Network tests have not been requested, skipping
all';
   }
   else {
     plan tests => 21;
   }

   #...
}

my $obj = Bio::Object->new(-verbose => $DEBUG);
#...
----

To this:

----
use strict;

BEGIN {
   use lib 't/lib';
   use BioperlTest;

   test_begin(-requires_modules => [qw(IO::String LWP LWP::UserAgent)],
              -requires_networking => 1,
              -tests => 21);

   #...
}

my $obj = Bio::Object->new(-verbose => test_debug());
#...
----
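
As a rough illustration of what a helper behind a call like test_begin()
has to decide, here is a minimal sketch. It is not the code in
t/lib/BioperlTest.pm: the function name decide_plan and the
-network_enabled argument are made up for this example, and the real
module presumably reads the network setting from Module::Build and calls
Test::More's plan() itself.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the real BioperlTest::test_begin().
# Returns a two-element list in the shape Test::More's plan() accepts:
# (skip_all => $reason) or (tests => $count).
sub decide_plan {
    my %args = @_;

    # Collect every required module that cannot be loaded.
    my @missing;
    for my $module (@{ $args{-requires_modules} || [] }) {
        # String eval so the module name is resolved like a bareword.
        # Module names here come from the test script itself, not user input.
        eval "require $module; 1" or push @missing, $module;
    }
    return (skip_all => "Required modules not installed: @missing")
        if @missing;

    # -network_enabled is an assumption standing in for the build setting.
    return (skip_all => 'Network tests have not been requested')
        if $args{-requires_networking} && !$args{-network_enabled};

    return (tests => $args{-tests});
}

my @plan = decide_plan(
    -requires_modules    => [qw(File::Spec Hypothetical::Missing::Module)],
    -requires_networking => 1,
    -network_enabled     => 0,
    -tests               => 21,
);
print "@plan\n";
```

Keeping the decision separate from the call to plan() is just a choice
made for this sketch, so the logic can be exercised without ending the
test script; a wrapper could hand the returned pair straight to plan().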


Can anyone identify problems with this approach? Is the interface 
presented by BioperlTest flexible enough that any changes would only be 
additions for new functionality (and therefore all test scripts wouldn't 
need to be altered)? Is BioperlTest missing anything you'd like?

Are there any objections to me updating all tests in this manner? For an 
example, see t/RemoteBlast.t
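
One way to read the line marked "the mistake!" in the first listing: use
statements run at compile time, so neither the use lib 't/lib' inside the
if block nor the use Test::More that follows waits for the eval to
finish. This is only a guess at what was meant. The sketch below shows
the general Perl behaviour with made-up directory names, next to a
run-time alternative built from require and import.

```perl
use strict;
use warnings;

if (0) {
    # This branch never runs, but `use` is executed while the file is
    # being compiled, so the (made-up) directory is added to @INC anyway.
    use lib 'made-up/compile-time-dir';
}

my $compile_time_added = grep { $_ eq 'made-up/compile-time-dir' } @INC;

if (0) {
    # require and import happen at run time, so the condition is respected
    # and this (made-up) directory is never added.
    require lib;
    lib->import('made-up/run-time-dir');
}

my $run_time_added = grep { $_ eq 'made-up/run-time-dir' } @INC;

print "use lib inside if (0):     ",
    ($compile_time_added ? "added" : "not added"), "\n";
print "lib->import inside if (0): ",
    ($run_time_added ? "added" : "not added"), "\n";
```

If the intent of the old scripts was to fall back to a bundled Test::More
only when none is installed, the require/import form inside a BEGIN block
would be one way to get that behaviour.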


Cheers,
Sendu.


From spiros at lokku.com  Wed Jun 20 11:49:48 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Wed, 20 Jun 2007 16:49:48 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
	<082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
	<4676B41E.3050706@sendu.me.uk>
	<4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
Message-ID: <bba689ec0706200849p3d32ffb8wee14bbeb2027e905@mail.gmail.com>

Yep, they are not all done. Some still need to be ported over; I'm doing
some here and there at home. However, the recent email Sendu sent, the
one about abstracting the setup of testing, is actually something I was
thinking about myself, so it might be a better way to tackle the problem.
For one, it would save us from duplicating the same 30 lines of code
across all tests.

As far as network tests are concerned, I've always been an avid hater of
them. I believe they bring more trouble than they are worth, due to the
diversity of setups people have. My way of tackling them was always to
group all the tests that required live access into one file and then run
that file explicitly, only when needed and not by default. Like I said,
that's just my opinion; I've been bitten by them one time too many.

Spiros

On 6/18/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote:
>
> > Chris Fields wrote:
> >> Couldn't you enable BIOPERLDEBUG, disable network access, then
> >> iterate through tests checking for those which fail or skip?
> >
> > Yes, good idea, though my dev machine is also my email/webserver so
> > I'd rather come up with an alternate solution than one involving
> > 'disable network access'.
> >
> > Still, that's what I'll probably end up doing. Cheers!
> >
> >
> > Oh, Chris, Spiros, how goes the Test::More conversion? I might want
> > to wait for you to finish, or join in? If you're not going to have
> > time to do any more in the next few weeks, can you please update
> > http://www.bioperl.org/wiki/TestMoreProgress removing your name (or
> > in the opposite case, add your name in)? It's not quite clear to me
> > which tests are assigned to whom. Can someone clarify what the
> > markings mean?
> >
> > Cheers,
> > Sendu.
>
> Not sure how far along Spiros is; I handed it over after I finished
> up to the 'Q' tests.  In general the ones marked out have been
> converted over, ones with names next to them have been claimed.  If
> you need help I'll prob. start back up again to finish them off; we
> just need to divvy them up.
>
> chris
>


From hlapp at gmx.net  Wed Jun 20 12:27:47 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 12:27:47 -0400
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467949EC.9040100@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
Message-ID: <A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>

Very cool! Sounds like a no-brainer to me to adopt this in all the  
tests. -hilmar

On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote:

> In considering updating all the test scripts to take advantage of the
> new network option, and/or reimplementing them in Test::More, I  
> thought
> now would be a good time to standardize all the test scripts and  
> reduce
> the possibility of having to alter them all in the future if something
> changes.
>
> For example we could decide on an alternate way of choosing to run
> network tests, or a new way of deciding to output debug information.
> There are also some inconsistencies in the messages produced by tests
> skipping all, and even an unfortunate mistake that has been copy/ 
> pasted
> through a lot of test scripts.
>
> My solution is t/lib/BioperlTest.pm (documented with perldoc)
>
> We go from this:
>
> ----
> use strict;
> our $DEBUG;
>
> BEGIN {
>    $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;
> 	
>    eval { require Test::More; };
>    if( $@ ) {
>      use lib 't/lib';
>    }
>    use Test::More; # the mistake!
> 	
>    use Module::Build;
>    my $build = Module::Build->current();
>    my $do_network_tests = $build->notes('network');
>
>    eval {
>      require IO::String;
>      require LWP;
>      require LWP::UserAgent;
>    };
>    if ($@) {
>      plan skip_all => 'IO::String or LWP or LWP::UserAgent not
> installed.
> This means Bio::Tools::Run::RemoteBlast is not usable. Skipping  
> tests';
>    }
>    elsif (!$do_network_tests) {
>      plan skip_all => 'Network tests have not been requested, skipping
> all';
>    }
>    else {
>      plan tests => 21;
>    }
>
>    #...
> }
>
> my $obj = Bio::Object->new(-verbose => $DEBUG);
> #...
> ----
>
> To this:
>
> ----
> use strict;
>
> BEGIN {
>    use lib 't/lib';
>    use BioperlTest;
>
>    test_begin(-requires_modules => [qw(IO::String LWP  
> LWP::UserAgent)],
>               -requires_networking => 1,
>               -tests => 21);
>
>    #...
> }
>
> my $obj = Bio::Object->new(-verbose => test_debug());
> #...
> ----
>
>
> Can anyone identify problems with this approach? Is the interface
> presented by BioperlTest flexible enough that any changes would  
> only be
> additions for new functionality (and therefore all test scripts  
> wouldn't
> need to be altered)? Is BioperlTest missing anything you'd like?
>
> Are there any objections to me updating all tests in this manner?  
> For an
> example, see t/RemoteBlast.t
>
>
> Cheers,
> Sendu.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 20 12:44:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 20 Jun 2007 11:44:01 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>
References: <467949EC.9040100@sendu.me.uk>
	<A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>
Message-ID: <BF4BB95D-4B4F-4336-9FA4-AE7B0C961C96@uiuc.edu>

Agreed!  You've already created an example case so there's something  
to go off of.

I plan on changing some EUtilities tests soon so I'll try  
implementing this, basing off your RemoteBlast.t implementation.   
Seems clear enough on the surface; if I run into problems I'll post.

chris

On Jun 20, 2007, at 11:27 AM, Hilmar Lapp wrote:

> Very cool! Sounds like a no-brainer to me to adopt this in all the
> tests. -hilmar
>
> On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote:
>
>> In considering updating all the test scripts to take advantage of the
>> new network option, and/or reimplementing them in Test::More, I
>> thought
>> now would be a good time to standardize all the test scripts and
>> reduce
>> the possibility of having to alter them all in the future if  
>> something
>> changes.
>>
>> For example we could decide on an alternate way of choosing to run
>> network tests, or a new way of deciding to output debug information.
>> There are also some inconsistencies in the messages produced by tests
>> skipping all, and even an unfortunate mistake that has been copy/
>> pasted
>> through a lot of test scripts.
>>
>> My solution is t/lib/BioperlTest.pm (documented with perldoc)
>>
>> We go from this:
>>
>> ----
>> use strict;
>> our $DEBUG;
>>
>> BEGIN {
>>    $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;
>> 	
>>    eval { require Test::More; };
>>    if( $@ ) {
>>      use lib 't/lib';
>>    }
>>    use Test::More; # the mistake!
>> 	
>>    use Module::Build;
>>    my $build = Module::Build->current();
>>    my $do_network_tests = $build->notes('network');
>>
>>    eval {
>>      require IO::String;
>>      require LWP;
>>      require LWP::UserAgent;
>>    };
>>    if ($@) {
>>      plan skip_all => 'IO::String or LWP or LWP::UserAgent not
>> installed.
>> This means Bio::Tools::Run::RemoteBlast is not usable. Skipping
>> tests';
>>    }
>>    elsif (!$do_network_tests) {
>>      plan skip_all => 'Network tests have not been requested,  
>> skipping
>> all';
>>    }
>>    else {
>>      plan tests => 21;
>>    }
>>
>>    #...
>> }
>>
>> my $obj = Bio::Object->new(-verbose => $DEBUG);
>> #...
>> ----
>>
>> To this:
>>
>> ----
>> use strict;
>>
>> BEGIN {
>>    use lib 't/lib';
>>    use BioperlTest;
>>
>>    test_begin(-requires_modules => [qw(IO::String LWP
>> LWP::UserAgent)],
>>               -requires_networking => 1,
>>               -tests => 21);
>>
>>    #...
>> }
>>
>> my $obj = Bio::Object->new(-verbose => test_debug());
>> #...
>> ----
>>
>>
>> Can anyone identify problems with this approach? Is the interface
>> presented by BioperlTest flexible enough that any changes would
>> only be
>> additions for new functionality (and therefore all test scripts
>> wouldn't
>> need to be altered)? Is BioperlTest missing anything you'd like?
>>
>> Are there any objections to me updating all tests in this manner?
>> For an
>> example, see t/RemoteBlast.t
>>
>>
>> Cheers,
>> Sendu.
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From wollenbergk at mail.nih.gov  Wed Jun 20 14:11:04 2007
From: wollenbergk at mail.nih.gov (Wollenberg, Kurt (NIH/NIAID))
Date: Wed, 20 Jun 2007 14:11:04 -0400
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
Message-ID: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>

Greetings:

I am working on a script to take a list of sequence IDs, extract the
sequences from GenPept, and then run a BLAST search for each of the
retrieved sequences. I am having a problem with the sequence retrieval,
where some sequences are found and others are not and it's not obvious to me
why this is. 

For example, using a text file containing the two following IDs as input:
SKG3_YEAST
NEM1_YEAST

My script 

while( <IN> ) {
  chomp;
  my $seqid = $_;
  my $seq_obj = get_sequence( 'genpept', $seqid );
}

will create a sequence object for the first ID, (print "Accession of
",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct accession
number) but for the second I am told

-------------------- WARNING ---------------------
MSG: id (NEM1_YEAST) does not exist
---------------------------------------------------

When I pull up these records using the Entrez cross-database search in my web
browser I find genpept records for both SKG3_YEAST and NEM1_YEAST (using
these search terms). In both records these IDs reside in the same field
("DBSOURCE    swissprot: locus") so I'm mystified why get_sequence finds one
but not the other. Any advice would be greatly appreciated.

Cheers,
Kurt Wollenberg, Ph.D.
Phylogenetics and Sequence Analysis Consultant
Biocomputing Research Consulting Section
Bioinformatics and Scientific IT Program (BSIP)
NIH/NIAID/OTIS
Contractor, Lockheed Martin
http://bioinformatics.niaid.nih.gov

Disclaimer:
The information in this e-mail and any of its attachments is confidential
and may contain sensitive information. It should not be used by anyone who
is not the original intended recipient. If you have received this e-mail in
error please inform the sender and delete it from your mailbox or any other
storage devices. National Institute of Allergy and Infectious Diseases shall
not accept liability for any statements made that are sender's own and not
expressly made on behalf of the NIAID by one of its representatives.


From bosborne11 at verizon.net  Wed Jun 20 14:59:39 2007
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 20 Jun 2007 14:59:39 -0400
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
In-Reply-To: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>
Message-ID: <C29EF15B.EAF7%bosborne11@verizon.net>

Kurt,

I can't answer your question but I wouldn't use Bio::Perl myself, I'd use
Bio::DB::GenPept:

501 ~>perl -e 'use Bio::DB::GenPept; $db = Bio::DB::GenPept->new; $seq =
$db->get_Seq_by_acc("NEM1_YEAST"); print $seq->seq;'
MNALKYFSNHLITTKKQKKINVEVTKNQDLLGPSKEVSNKYTSHSENDCVSEVDQQYDHSSSHLKESDQNQERKNS
VPKKPKALRSILIEKIASILWALLLFLPYYLIIKPLMSLWFVFTFPLSVIERRVKHTDKRNRGSNASENELPVSSS
NINDSSEKTNPKNCNLNTIPEAVEDDLNASDEIILQRDNVKGSLLRAQSVKSRPRSYSKSELSLSNHSSSNTVFGT
KRMGRFLFPKKLIPKSVLNTQKKKKLVIDLDETLIHSASRSTTHSNSSQGHLVEVKFGLSGIRTLYFIHKRPYCDL
FLTKVSKWYDLIIFTASMKEYADPVIDWLESSFPSSFSKRYYRSDCVLRDGVGYIKDLSIVKDSEENGKGSSSSLD
DVIIIDNSPVSYAMNVDNAIQVEGWISDPTDTDLLNLLPFLEAMRYSTDVRNILALKHGEKAFNIN
502 ~>

It's true that Bio::Perl is easy-to-use but it's also _very_ limited.

Brian O.
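
When looping over a list of IDs as in the original script, it can help to
trap each failed lookup and carry on, then report the misses at the end.
A minimal sketch follows. fetch_all, the StubDB class, and the two IDs
are invented for this example so that it runs without BioPerl or network
access. With BioPerl installed one would pass Bio::DB::GenPept->new
instead of the stub, on the assumption that a failed get_Seq_by_acc
either throws an exception or returns undef.

```perl
use strict;
use warnings;

# Try each ID in turn and collect failures instead of stopping at the
# first one. $db is any object with a get_Seq_by_acc() method.
sub fetch_all {
    my ($db, @ids) = @_;
    my (%found, @missing);
    for my $id (@ids) {
        # eval traps an exception from the lookup; $seq is then undef.
        my $seq = eval { $db->get_Seq_by_acc($id) };
        if (defined $seq) { $found{$id} = $seq }
        else              { push @missing, $id }
    }
    return (\%found, \@missing);
}

# Stand-in database for demonstration only (hypothetical, no network).
package StubDB;
sub new { return bless {}, shift }
sub get_Seq_by_acc {
    my ($self, $id) = @_;
    die "acc ($id) does not exist\n" unless $id eq 'KNOWN_ID';
    return "sequence object for $id";
}

package main;

my ($found, $missing) = fetch_all(StubDB->new, qw(KNOWN_ID UNKNOWN_ID));
print "found:   @{[ sort keys %$found ]}\n";
print "missing: @$missing\n";
```

Collecting the misses in one list makes it easier to see whether the
failures share a pattern, or to retry them with a different lookup.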


On 6/20/07 2:11 PM, "Wollenberg, Kurt (NIH/NIAID)"
<wollenbergk at mail.nih.gov> wrote:

> Greetings:
> 
> I am working on a script to take a list of sequence IDs, extract the
> sequences from GenPept, and then run a BLAST search for each of the
> retrieved sequences. I am having a problem with the sequence retrieval,
> where some sequences are found and others are not and it's not obvious to me
> why this is. 
> 
> For example, using a text file containing the two following IDs as input:
> SKG3_YEAST
> NEM1_YEAST
> 
> My script 
> 
> while( <IN> ) {
>   chomp;
>   my $seqid = $_;
>   my $seq_obj = get_sequence( 'genpept', $seqid );
> }
> 
> will create a sequence object for the first ID, (print "Accession of
> ",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct accession
> number) but for the second I am told
> 
> -------------------- WARNING ---------------------
> MSG: id (NEM1_YEAST) does not exist
> ---------------------------------------------------
> 
> When I pull up these records using the Entrez cross-database search in my web
> browser I find genpept records for both SKG3_YEAST and NEM1_YEAST (using
> these search terms). In both records these IDs reside in the same field
> ("DBSOURCE    swissprot: locus") so I'm mystified why get_sequence finds one
> but not the other. Any advice would be greatly appreciated.
> 
> Cheers,
> Kurt Wollenberg, Ph.D.
> Phylogenetics and Sequence Analysis Consultant
> Biocomputing Research Consulting Section
> Bioinformatics and Scientific IT Program (BSIP)
> NIH/NIAID/OTIS
> Contractor, Lockheed Martin
> http://bioinformatics.niaid.nih.gov
> 
> Disclaimer:
> The information in this e-mail and any of its attachments is confidential
> and may contain sensitive information. It should not be used by anyone who
> is not the original intended recipient. If you have received this e-mail in
> error please inform the sender and delete it from your mailbox or any other
> storage devices. National Institute of Allergy and Infectious Diseases shall
> not accept liability for any statements made that are sender's own and not
> expressly made on behalf of the NIAID by one of its representatives.
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Wed Jun 20 16:11:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 20 Jun 2007 15:11:34 -0500
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
In-Reply-To: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>
References: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>
Message-ID: <F9F5A58E-4767-49C4-80F2-DEE3CA474C01@uiuc.edu>

I'm assuming you are using the Bio::Perl exported sub get_sequence().
I am able to reproduce the issue using bioperl-live; it's an odd
issue as direct use of Bio::DB::GenPept works fine:

use Bio::DB::GenPept;

my $factory = Bio::DB::GenPept->new();

my @accs = qw(SKG3_YEAST NEM1_YEAST);

my $io = $factory->get_Stream_by_acc(\@accs);

while (my $seq = $io->next_seq) {
     print "Accession:",$seq->accession,"\n";
}

chris


On Jun 20, 2007, at 1:11 PM, Wollenberg, Kurt (NIH/NIAID) wrote:

> Greetings:
>
> I am working on a script to take a list of sequence IDs, extract the
> sequences from GenPept, and then run a BLAST search for each of the
> retrieved sequences. I am having a problem with the sequence  
> retrieval,
> where some sequences are found and others are not and it's not  
> obvious to me
> why this is.
>
> For example, using a text file containing the two following IDs as  
> input:
> SKG3_YEAST
> NEM1_YEAST
>
> My script
>
> while( <IN> ) {
>   chomp;
>   my $seqid = $_;
>   my $seq_obj = get_sequence( 'genpept', $seqid );
> }
>
> will create a sequence object for the first ID, (print "Accession of
> ",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct  
> accession
> number) but for the second I am told
>
> -------------------- WARNING ---------------------
> MSG: id (NEM1_YEAST) does not exist
> ---------------------------------------------------
>
> When I pull up these records using the Entrez cross-databse search  
> in my web
> browser I find genpept records for both SKG3_YEAST and NEM1_YEAST  
> (using
> these search terms). In both records these IDs reside in the same  
> field
> ("DBSOURCE    swissprot: locus") so I'm mystified why get_sequence  
> finds one
> but not the other. Any advice would be greatly appreciated.
>
> Cheers,
> Kurt Wollenberg, Ph.D.
> Phylogenetics and Sequence Analysis Consultant
> Biocomputing Research Consulting Section
> Bioinformatics and Scientific IT Program (BSIP)
> NIH/NIAID/OTIS
> Contractor, Lockheed Martin
> http://bioinformatics.niaid.nih.gov
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From sac at bioperl.org  Thu Jun 21 02:32:47 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Wed, 20 Jun 2007 23:32:47 -0700
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <BF4BB95D-4B4F-4336-9FA4-AE7B0C961C96@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>
	<BF4BB95D-4B4F-4336-9FA4-AE7B0C961C96@uiuc.edu>
Message-ID: <8f200b4c0706202332w25a09547k1de20f24466877d9@mail.gmail.com>

Looks like a nice refactor. After it's in place, don't forget to
update the wiki:
http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests

Steve

On 6/20/07, Chris Fields <cjfields at uiuc.edu> wrote:
> Agreed!  You've already created an example case so there's something
> to go off of.
>
> I plan on changing some EUtilities tests soon so I'll try
> implementing this, basing off your RemoteBlast.t implementation.
> Seems clear enough on the surface; if I run into problems I'll post.
>
> chris
>
> On Jun 20, 2007, at 11:27 AM, Hilmar Lapp wrote:
>
> > Very cool! Sounds like a no-brainer to me to adopt this in all the
> > tests. -hilmar
> >
> > On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote:
> >
> >> In considering updating all the test scripts to take advantage of the
> >> new network option, and/or reimplementing them in Test::More, I
> >> thought
> >> now would be a good time to standardize all the test scripts and
> >> reduce
> >> the possibility of having to alter them all in the future if
> >> something
> >> changes.
> >>
> >> For example we could decide on an alternate way of choosing to run
> >> network tests, or a new way of deciding to output debug information.
> >> There are also some inconsistencies in the messages produced by tests
> >> skipping all, and even an unfortunate mistake that has been copy/
> >> pasted
> >> through a lot of test scripts.
> >>
> >> My solution is t/lib/BioperlTest.pm (documented with perldoc)
> >>
> >> We go from this:
> >>
> >> ----
> >> use strict;
> >> our $DEBUG;
> >>
> >> BEGIN {
> >>    $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;
> >>
> >>    eval { require Test::More; };
> >>    if( $@ ) {
> >>      use lib 't/lib';
> >>    }
> >>    use Test::More; # the mistake!
> >>
> >>    use Module::Build;
> >>    my $build = Module::Build->current();
> >>    my $do_network_tests = $build->notes('network');
> >>
> >>    eval {
> >>      require IO::String;
> >>      require LWP;
> >>      require LWP::UserAgent;
> >>    };
> >>    if ($@) {
> >>      plan skip_all => 'IO::String or LWP or LWP::UserAgent not
> >> installed.
> >> This means Bio::Tools::Run::RemoteBlast is not usable. Skipping
> >> tests';
> >>    }
> >>    elsif (!$do_network_tests) {
> >>      plan skip_all => 'Network tests have not been requested,
> >> skipping
> >> all';
> >>    }
> >>    else {
> >>      plan tests => 21;
> >>    }
> >>
> >>    #...
> >> }
> >>
> >> my $obj = Bio::Object->new(-verbose => $DEBUG);
> >> #...
> >> ----
> >>
> >> To this:
> >>
> >> ----
> >> use strict;
> >>
> >> BEGIN {
> >>    use lib 't/lib';
> >>    use BioperlTest;
> >>
> >>    test_begin(-requires_modules => [qw(IO::String LWP
> >> LWP::UserAgent)],
> >>               -requires_networking => 1,
> >>               -tests => 21);
> >>
> >>    #...
> >> }
> >>
> >> my $obj = Bio::Object->new(-verbose => test_debug());
> >> #...
> >> ----
> >>
> >>
> >> Can anyone identify problems with this approach? Is the interface
> >> presented by BioperlTest flexible enough that any changes would
> >> only be
> >> additions for new functionality (and therefore all test scripts
> >> wouldn't
> >> need to be altered)? Is BioperlTest missing anything you'd like?
> >>
> >> Are there any objections to me updating all tests in this manner?
> >> For an
> >> example, see t/RemoteBlast.t
> >>
> >>
> >> Cheers,
> >> Sendu.
> >
> > --
> > ===========================================================
> > : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> > ===========================================================
> >
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>


From staffa at niehs.nih.gov  Thu Jun 21 14:36:12 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Thu, 21 Jun 2007 14:36:12 -0400
Subject: [Bioperl-l] BIO::DB::FASTA  ID
Message-ID: <C2A03D5E.4DE9%staffa@niehs.nih.gov>

This program below returns only  1527 IDs from a fasta file that I have
constructed, which has
mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa
1820
.
It actually does not return the first 3 ids,
nor the 5th, nor 7..36, 38,39,41..44......
The header lines are of variable length and the sequence lines are 80
characters except at the ends when they might be shorter.
Is there some caveat that I am ignoring in my format that breaks
bio::db::fasta?


#!/usr/bin/perl
#
#
#
use strict;
use Bio::DB::Fasta;
use Bio::Tools::SeqWords;
use Bio::Seq;
use Bio::SeqIO;
$|=1;
#
#
my $Dpse_UTR_file_for_T_orthologs =
"/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa";
my $db = Bio::DB::Fasta->new
('/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa',
  -reindex,  -makeid => \&make_my_id);
my @ids = $db->ids;
my $number_in = @ids;
print "number of Dpse IDs = $number_in\n";
foreach my $id (@ids){
print "$id\n";
}
sub make_my_id {
#       parse header line:
#       >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT
    my $line = shift;
#    print "line = $line\n";
    $line =~ />(\w+) /;
    my $ID = $1;
#    print "ID = $ID\n";
    return $ID;
      }

-------------- next part --------------
A non-text attachment was scrubbed...
Name: T_orthologs_Dpse_genes.fa
Type: application/octet-stream
Size: 5033676 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070621/07c354d0/attachment-0003.obj>

From jason at bioperl.org  Thu Jun 21 17:19:14 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 21 Jun 2007 14:19:14 -0700
Subject: [Bioperl-l] BIO::DB::FASTA  ID
In-Reply-To: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
References: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
Message-ID: <F3A92546-08EE-4AD5-BFCE-BF006D153AD7@bioperl.org>

Hey Nick -
I think
a) your IDs are not unique
b) you need to declare the function make_my_id BEFORE you call  
Bio::DB::Fasta->new if you want your function to be used.

$ grep "^>" T_orthologs_Dpse_genes.fa | awk '{print $1}' | sort |  
uniq | wc -l
1527


-jason
On Jun 21, 2007, at 11:36 AM, Staffa, Nick (NIH/NIEHS) wrote:

> #!/usr/bin/perl
> #
> #
> #
> use strict;
> use Bio::DB::Fasta;
> use Bio::Tools::SeqWords;
> use Bio::Seq;
> use Bio::SeqIO;
> $|=1;
> #
> #
> my $Dpse_UTR_file_for_T_orthologs =
> "/home/staffa/clients/Kari/D_pse_genome/testit/ 
> T_orthologs_Dpse_genes.fa";
> my $db = Bio::DB::Fasta->new
> ('/home/staffa/clients/Kari/D_pse_genome/testit/ 
> T_orthologs_Dpse_genes.fa',
>   -reindex,  -makeid => \&make_my_id);
> my @ids = $db->ids;
> my $number_in = @ids;
> print "number of Dpse IDs = $number_in\n";
> foreach my $id (@ids){
> print "$id\n";
> }
> sub make_my_id {
> #       parse header line:
> #       >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0  
> TTATTTATT
>     my $line = shift;
> #    print "line = $line\n";
>     $line =~ />(\w+) /;
>     my $ID = $1;
> #    print "ID = $ID\n";
>     return $ID;
>       }

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From mkiwala at watson.wustl.edu  Thu Jun 21 17:23:46 2007
From: mkiwala at watson.wustl.edu (Michael Kiwala)
Date: Thu, 21 Jun 2007 16:23:46 -0500
Subject: [Bioperl-l] BIO::DB::FASTA  ID
In-Reply-To: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
References: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
Message-ID: <467AEC62.2040508@watson.wustl.edu>

You only have 1527 unique id's in the file.

~$ grep '^>' Desktop/T_orthologs_Dpse_genes.fa|cut -d\  -f1|sort -u|wc -l
1527


Change your make_id function to make sure the id's are unique.
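
One way to do that, as a rough sketch only: keep a count of every ID seen so
far and append the count to repeats. The helper name make_unique_id and the
sample header lines are made up for illustration, and I have not checked how
often the Bio::DB::Fasta indexer calls a -makeid callback, so treat the
stateful counter as an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: derive an ID from a FASTA header line and make
# repeated IDs unique by appending a running counter.
my %seen;
sub make_unique_id {
    my ($line) = @_;
    my ($id) = $line =~ /^>(\S+)/ or return;
    my $n = ++$seen{$id};
    return $n == 1 ? $id : "$id.$n";
}

# Made-up header lines in the style of the file discussed in this thread.
my @headers = (
    '>Dpse_GA13134 CG14636 NO UTR',
    '>Dpse_GA13134 CG14636 3prime UTR',
    '>Dpse_GA10001 CG00001 NO UTR',
);
my @ids = map { make_unique_id($_) } @headers;
print "$_\n" for @ids;
```

Separately, the original call passes "-reindex, -makeid => ...". If the
constructor reads its options as key/value pairs (I believe it does, but
check the POD), -reindex probably needs an explicit value such as
"-reindex => 1" for -makeid to be picked up at all.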


Staffa, Nick (NIH/NIEHS) wrote:
> This program below returns only  1527 IDs from a fasta file that I have
> constructed, which has
> mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa
> 1820
> .
> It actually does not return the first 3 ids,
> nor the 5th, nor 7..36, 38,39,41..44......
> The header lines are of variable length and the sequence lines are 80
> characters except at the ends when they might be shorter.
> Is there some caveat that I am ignoring in my format that breaks
> bio::db::fasta?
>
>
> #!/usr/bin/perl
> #
> #
> #
> use strict;
> use Bio::DB::Fasta;
> use Bio::Tools::SeqWords;
> use Bio::Seq;
> use Bio::SeqIO;
> $|=1;
> #
> #
> my $Dpse_UTR_file_for_T_orthologs =
> "/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa";
> my $db = Bio::DB::Fasta->new
> ('/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa',
>   -reindex,  -makeid => \&make_my_id);
> my @ids = $db->ids;
> my $number_in = @ids;
> print "number of Dpse IDs = $number_in\n";
> foreach my $id (@ids){
> print "$id\n";
> }
> sub make_my_id {
> #       parse header line:
> #       >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT
>     my $line = shift;
> #    print "line = $line\n";
>     $line =~ />(\w+) /;
>     my $ID = $1;
> #    print "ID = $ID\n";
>     return $ID;
>       }
>   


From bix at sendu.me.uk  Mon Jun 25 09:06:27 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 25 Jun 2007 14:06:27 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467949EC.9040100@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
Message-ID: <467FBDD3.8050009@sendu.me.uk>

Sendu Bala wrote:
> In considering updating all the test scripts to [... use] t/lib/BioperlTest.pm

I'm now in the process of converting all test scripts. In addition to 
those things mentioned previously, BioperlTest now also provides the 
methods test_input_file() and test_output_file().


This:
----
use Bio::Root::IO;
my $output_file = Bio::Root::IO->catfile(qw(t data temp.file));
$obj->new(-file => ">$output_file");

END {
   unlink($output_file);
}

...

$obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file)));
----


Becomes this:
----
my $output_file = test_output_file();
$obj->new(-file => ">$output_file");

...

$obj->new(-file => test_input_file('input.file'));
----


I should think the benefits are obvious, especially for the output 
files: because END blocks are used inconsistently, or not at all, some 
output data is left behind on occasion.

test_input_file() is helpful for the shorthand, but also gets rid of 
many tests' usage of Bio::Root::IO (relying on something you're 
installing and testing in another test script to work in the current 
test script, without testing it in your own test script seems like a 
no-no to me).
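
For anyone curious how automatic cleanup of output files can work without an
END block, here is a minimal sketch using the core File::Temp module. It is
not the actual BioperlTest code; the name my_test_output_file is hypothetical
and the real test_output_file() may well be implemented differently.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for test_output_file(): returns the name of a fresh
# temporary file that File::Temp deletes when the program exits.
sub my_test_output_file {
    my ($fh, $filename) = tempfile(UNLINK => 1);
    close $fh;
    return $filename;
}

my $output_file = my_test_output_file();
open my $out, '>', $output_file or die "Cannot write $output_file: $!";
print {$out} "scratch data\n";
close $out;
print "wrote ", -s $output_file, " bytes to a file that is removed at exit\n";
```

The point of the design is that the caller never has to remember to unlink
anything, which is where the hand-written END blocks tend to go wrong.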


From cjfields at uiuc.edu  Mon Jun 25 09:39:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 08:39:21 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467FBDD3.8050009@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
Message-ID: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>

On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> In considering updating all the test scripts to [... use] t/lib/ 
>> BioperlTest.pm
>
> I'm now in the process of converting all test scripts. In addition to
> those things mentioned previously, BioperlTest now also provides the
> methods test_input_file() and test_output_file().
>
>
> This:
> ----
> use Bio::Root::IO;
> my $output_file = Bio::Root::IO->catfile(qw(t data temp.file));
> $obj->new(-file => ">$output_file");
>
> END {
>    unlink($output_file);
> }
>
> ...
>
> $obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file)));
> ----
>
>
> Becomes this:
> ----
> my $output_file = test_output_file();
> $obj->new(-file => ">$output_file");
>
> ...
>
> $obj->new(-file => test_input_file('input.file'));
> ----
>
>
> I should think the benefits are obvious, especially for the output
> files, which thanks to inconsistency of using END blocks correctly  
> or at
> all, leaves some output data behind on occasion.

Sounds fine by me, though it's a lot of work.  BTW, did we ever  
decide whether to finish up with Test::More conversion?  I haven't  
heard back yet; let me know what you want to do.

> test_input_file() is helpful for the shorthand, but also gets rid of
> many tests' usage of Bio::Root::IO (relying on something you're
> installing and testing in another test script to work in the current
> test script, without testing it in your own test script seems like a
> no-no to me).

Well, in a way isn't that itself a test of the class (whether it  
breaks or not)?  ; >

Do test_input_file() and test_output_file() handle directory  
structures in an OS-safe way like catfile()?  For instance, I plan on  
adding test data to a new directory similar to Bio::Graphics (t/data/ 
eutil) to prevent cluttering of the t/data directory.  I could use  
'$obj->new(-file => test_input_file('/eutil/input.xml'))' if the base  
directory is 't/data' but that may not be cross-platform compatible  
with win32 file systems, which may still expect something like 't\data 
\eutil\input.xml'.

chris


From bix at sendu.me.uk  Mon Jun 25 09:45:23 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 25 Jun 2007 14:45:23 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
Message-ID: <467FC6F3.6080705@sendu.me.uk>

Chris Fields wrote:
> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:
>> I should think the benefits are obvious, especially for the output
>> files, which thanks to inconsistency of using END blocks correctly or at
>> all, leaves some output data behind on occasion.
> 
> Sounds fine by me, though it's a lot of work.  BTW, did we ever decide 
> whether to finish up with Test::More conversion?  I haven't heard back 
> yet; let me know what you want to do.

I'm doing the remaining Test::More conversions at the same time.


> Do test_input_file() and test_output_file() handle directory structures 
> in an OS-safe way like catfile()?  For instance, I plan on adding test 
> data to a new directory similar to Bio::Graphics (t/data/eutil) to 
> prevent cluttering of the t/data directory.  I could use 
> '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the base 
> directory is 't/data' but that may not be cross-platform compatible with 
> win32 file systems, which may still expect something like 
> 't\data\eutil\input.xml'.

It's platform-independent, currently implemented using File::Spec. So 
you'll say:

$obj->new(-file => test_input_file('eutil', 'input.xml'));

It's all documented in the POD of BioperlTest.
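
To illustrate why passing the path components separately is portable, here is
a small sketch built on File::Spec. The sub name my_test_input_file is
hypothetical and this is not the BioperlTest source, just the general idea
under the assumption that the base directory is t/data.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical stand-in for test_input_file(): joins path components under
# t/data using the separator of the current platform.
sub my_test_input_file {
    my @parts = @_;
    return File::Spec->catfile('t', 'data', @parts);
}

print my_test_input_file('eutil', 'input.xml'), "\n";
```

Because the caller never writes a "/" or "\" itself, the same test script
yields a valid path on Unix and on win32.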


From cjfields at uiuc.edu  Mon Jun 25 09:49:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 08:49:51 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467FC6F3.6080705@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
	<467FC6F3.6080705@sendu.me.uk>
Message-ID: <679B8E76-C090-4A29-B843-99B5853FE2FB@uiuc.edu>


On Jun 25, 2007, at 8:45 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:
>>> I should think the benefits are obvious, especially for the output
>>> files, which thanks to inconsistency of using END blocks  
>>> correctly or at
>>> all, leaves some output data behind on occasion.
>> Sounds fine by me, though it's a lot of work.  BTW, did we ever  
>> decide whether to finish up with Test::More conversion?  I haven't  
>> heard back yet; let me know what you want to do.
>
> I'm doing the remaining Test::More conversions at the same time.

Okay.  Just didn't want to do any redundant work if it's already  
being/been done.

>> Do test_input_file() and test_output_file() handle directory  
>> structures in an OS-safe way like catfile()?  For instance, I plan  
>> on adding test data to a new directory similar to Bio::Graphics (t/ 
>> data/eutil) to prevent cluttering of the t/data directory.  I  
>> could use '$obj->new(-file => test_input_file('/eutil/ 
>> input.xml'))' if the base directory is 't/data' but that may not  
>> be cross-platform compatible with win32 file systems, which may  
>> still expect something like 't\data\eutil\input.xml'.
>
> Its platform-independent, currently implemented using File::Spec.  
> So you'll say:
>
> $obj->new(-file => test_input_file('eutil', 'input.xml'));
>
> Its all documented in the POD of BioperlTest.

yay!

chris


From mmokrejs at ribosome.natur.cuni.cz  Mon Jun 25 12:06:24 2007
From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ)
Date: Mon, 25 Jun 2007 18:06:24 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <467254DD.3010505@mrc-lmb.cam.ac.uk>
References: <466938F6.7050903@ribosome.natur.cuni.cz>	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>	<467178AE.5040905@ribosome.natur.cuni.cz>	<46717990.6040509@ribosome.natur.cuni.cz>
	<467254DD.3010505@mrc-lmb.cam.ac.uk>
Message-ID: <467FE800.4010300@ribosome.natur.cuni.cz>


Dave Howorth wrote:
> Martin MOKREJŠ wrote:
>>>> Also, there is a *huge* amount of documentation and examples on
>>>> the BioPerl website.
>>>>
>>>> http://www.bioperl.org/wiki/HOWTOs
>>> You mean 
>>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
>>>  ? ;-)
>> $ perl embl2picture.pl ~/99.gb | display - Error returned while
>> evaluating value of 'description' option for glyph
>> Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature
>> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl
>> line 141, <GEN0> line 125.
> 
> Hmm an error at line 141 of a 69 line script? Methinks you're not
> actually running the script that's presented on the wiki page you
> quoted. I cut-and-pasted the script and your file and it worked for me
> (at least, it produced an image, along with a bunch of OOPS lines)

Maybe you used the first version of the script?  There are two or more
scripts, I used the very last one.

M.


From cjfields at uiuc.edu  Mon Jun 25 12:48:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 11:48:30 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <467FE7B0.3010904@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
	<CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
	<46723F91.60501@ribosome.natur.cuni.cz>
	<A2212781-75F3-4BB7-967F-1668B682E84E@uiuc.edu>
	<467FE7B0.3010904@ribosome.natur.cuni.cz>
Message-ID: <B9DB370F-FB17-4DEF-9664-37489D84FC05@uiuc.edu>

Martin,

Keep bioperl-related discussion on the bioperl mail list.  The large  
majority of this isn't biopython-related, but maybe some devs there  
can add to this?

On Jun 25, 2007, at 11:05 AM, Martin MOKREJŠ wrote:

...

> Would you please tell me exactly what is wrong with the spacing?

Here's a section of the seq record attached to your previous email:

DEFINITION .
ACCESSION .
VERSION .
SOURCE .
   ORGANISM .

Normally there is a fixed column width for any data present in a  
field, so it would look more like this:

DEFINITION  PYR4 (DIHYDROOROTASE, PYRIMIDIN 4, dihydroorotase); dihydroorotase
            [Arabidopsis thaliana].
ACCESSION   NP_194024
VERSION     NP_194024.1  GI:15235865
DBSOURCE    REFSEQ: accession NM_118422.3
KEYWORDS    .
SOURCE      Arabidopsis thaliana (thale cress)
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons;
            rosids; eurosids II; Brassicales; Brassicaceae; Arabidopsis.

Here's the relevant bit in the latest release notes:

"The second part of each sequence entry record contains the information
appropriate to its keyword, in positions 13 to 80 for keywords and
positions 11 to 80 for the sequence."

The bioperl devs try to make our parsers as flexible as possible but  
others may not, so it's something in ApE that should probably be  
fixed.  And as mentioned to you several times in the past on the mail  
list and on bugzilla, don't expect sequence records which sway from  
the standard (in this case, the release notes) to parse correctly in  
all cases.  We can try supporting some that sway from that standard  
but only up to a point.  If it causes additional bugs, headaches, or  
degrades performance it won't be supported.
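
If you end up writing or patching a tool that emits these header lines, the
column rule quoted above is easy to honor with sprintf. This is only a
sketch: genbank_line is a made-up helper, and it covers just the "value
starts in column 13" part, not line wrapping or the rest of the format.

```perl
use strict;
use warnings;

# Hypothetical helper: pad the keyword to 12 columns so the value starts in
# column 13, as the release-notes excerpt in this message describes.
sub genbank_line {
    my ($keyword, $value) = @_;
    return sprintf("%-12s%s", $keyword, $value);
}

print genbank_line('ACCESSION', 'NP_194024'), "\n";
print genbank_line('  ORGANISM', 'Arabidopsis thaliana'), "\n";
```

A record written as "ACCESSION ." with a single space puts the value in
column 11, which is the kind of thing a strict parser will trip over.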

> ...
> Well, I just copy&pasted the script from the bioperl webpages, I think
> from a tutorial or FAQ, don't remember anymore.

Well, can't help you if you can't point out where the code originated  
from.  We would like to know so it can be corrected.

> ...
> Well, my search for such tools available on Unix to be used in a  
> script,
> non-interactively, completely failed. My last hope except getting  
> improved
> ApE is to use the GenomeDiagram under biopython, but so far my .gb  
> files
> cannot be parsed yet. :(
> Martin

As mentioned previously you will likely have to code for it yourself  
(perl or python) or help debug the relevant biopython code to get it  
working.  We can't/won't do this for you unless/until it's something  
we feel warrants implementation.  Judging by the bug list, we also  
haven't the time nor inclination to code for it.  Sorry but we have  
other priorities besides doing your work for you.

chris


From jesper at krogh.cc  Tue Jun 26 03:05:32 2007
From: jesper at krogh.cc (Jesper Krogh)
Date: Tue, 26 Jun 2007 09:05:32 +0200 (CEST)
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
Message-ID: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>

Hi List.

Trying to parse the embl database, the embl-parser fails on: AB019196
http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196


------------- EXCEPTION: Bio::Root::Exception -------------
MSG: AB019196 seems to have an invalid species classification.
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359
STACK: Bio::SeqIO::embl::_read_EMBL_Species
/usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
STACK: Bio::SeqIO::embl::next_seq
/usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
STACK: -e:1
-----------------------------------------------------------


It seems to be dissatisfied with this:
OS   Acetobacter aceti
OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;
OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.

Thanks.
-- 
Jesper Krogh


From cjfields at uiuc.edu  Tue Jun 26 09:13:50 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 08:13:50 -0500
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
Message-ID: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>

I can verify this using bioperl-live.  Can you file this as a bug?

http://bugzilla.open-bio.org/

chris

On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote:

> Hi List.
>
> Trying to parse the embl database, the embl-parser fails on: AB019196
> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196
>
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: AB019196 seems to have an invalid species classification.
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359
> STACK: Bio::SeqIO::embl::_read_EMBL_Species
> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
> STACK: Bio::SeqIO::embl::next_seq
> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
> STACK: -e:1
> -----------------------------------------------------------
>
>
> It seems to be dissatisfied with this:
> OS   Acetobacter aceti
> OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;
> OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.
>
> Thanks.
> -- 
> Jesper Krogh

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From suji_ramin at yahoo.com  Tue Jun 26 00:58:36 2007
From: suji_ramin at yahoo.com (SujiBala)
Date: Mon, 25 Jun 2007 21:58:36 -0700 (PDT)
Subject: [Bioperl-l] Error in constructing Phylogenetic tree using
	BioPerl
Message-ID: <571051.26423.qm@web51107.mail.re2.yahoo.com>

Hi,
  This is Sujatha from Singapore. I am trying to construct a phylogenetic tree using DNAStatistics and the Kimura method, but I am getting the following error message. It would be nice if you could help me resolve this problem as soon as possible. 
   
  Error message
    Must supply  a valid Bio::Align::AlignI for the _align parameter  in the distance 
  My program
  use Bio::AlignIO;
use Bio::Align::DNAStatistics;
use Bio::Tree::DistanceFactory;
# for a dna alignment  can also use ProteinStatistics
@aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
$stats = Bio::Align::DNAStatistics->new;
$mat = $stats->distance( -align  => @aln,-method => 'Kimura');
$dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ');
$tree = $dfactory->make_tree($mat);
   
  I am using a clustalw-formatted fasta file with more than one sequence.
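
A likely cause, offered as a sketch rather than a tested fix: Bio::AlignIO->new
returns a stream object, not an alignment, so the -align parameter never
receives a Bio::Align::AlignI. Calling next_aln on the stream should yield the
alignment that distance() expects. The wrapper name build_nj_tree is made up
for illustration; the BioPerl calls are the same ones used in the program
above plus next_aln.

```perl
use strict;
use warnings;

# Sketch only: BioPerl is loaded at run time inside the sub, so this file
# still compiles on a machine without BioPerl installed.
sub build_nj_tree {
    my ($file) = @_;
    require Bio::AlignIO;
    require Bio::Align::DNAStatistics;
    require Bio::Tree::DistanceFactory;

    # Bio::AlignIO->new returns a stream; next_aln pulls one alignment out.
    my $in  = Bio::AlignIO->new(-file => $file, -format => 'clustalw');
    my $aln = $in->next_aln or die "No alignment found in $file\n";

    my $stats = Bio::Align::DNAStatistics->new;
    my $mat   = $stats->distance(-align => $aln, -method => 'Kimura');

    my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ');
    return $dfactory->make_tree($mat);
}

# Usage, with the file name from the message above:
# my $tree = build_nj_tree('out4.fa');
```

Note the scalar $aln in place of the array @aln: passing an array into a
named-parameter list flattens it and can shift the remaining arguments.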
   

SujiBala




From bartels.stefan at mh-hannover.de  Tue Jun 26 05:26:03 2007
From: bartels.stefan at mh-hannover.de (don esteban)
Date: Tue, 26 Jun 2007 02:26:03 -0700 (PDT)
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To: <BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>
References: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
	<BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>
Message-ID: <11302459.post@talk.nabble.com>


Try setting the proxy configuration in your script:

$ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080";


L Xu wrote:
> 
> I do have the internet connection bu not use the proxy server.
> I tested the network connection with ping command (below). The ncbi
> website 
> does not response. Is there any special network setting needed for 
> connecting the ncbi website?
> Thank you so much.
> 
> C:\>ping www.yahoo.com
> 
> Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:
> 
> Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=360ms TTL=45
> 
> Ping statistics for 69.147.114.210:
>     Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
> Approximate round trip times in milli-seconds:
>     Minimum = 312ms, Maximum = 363ms, Average = 338ms
> 
> C:\>ping www.ncbi.nlm.nih.gov
> 
> Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:
> 
> Request timed out.
> Request timed out.
> Request timed out.
> Request timed out.
> 
> Ping statistics for 130.14.29.110:
>     Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
> 
> 
> 
> = = = Original message = = =
> 
> Judging by the output it looks like you have no network access or? can't 
> connect to the server (what remoteblast needs).? Make sure you? don't need 
> proxy settings.
> 
> To preempt the next question, no, I'm not going to explain what a? proxy 
> is.? The RemoteBlast docs show how to set them, and Google is a? wonderful 
> tool...
> 
> chris
> 
> On Jun 13, 2007, at 7:16 AM, L Xu wrote:
> 
> 
>    ...
> -------------------- WARNING ---------------------
> MSG: <HTML>
> <HEAD><TITLE>An Error Occurred</TITLE></HEAD>
> <BODY>
> <H1>An Error Occurred</H1>
> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
> </BODY>
> </HTML>
> 
> ---------------------------------------------------
> ...
> 

-- 
View this message in context: http://www.nabble.com/Example-code-in-Bioperl-Tutorial-tf3914295.html#a11302459
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From rahall2 at ualr.edu  Tue Jun 26 09:51:08 2007
From: rahall2 at ualr.edu (Roger Hall)
Date: Tue, 26 Jun 2007 08:51:08 -0500
Subject: [Bioperl-l] Tuesday: ill
Message-ID: <000001c7b7f9$0d029040$4601a8c0@LIBERAL2>

Well I guess I won't be in today after all.
 
Michael, Stephen, and Ames: please call me from the grad office at 10 on
my cell phone (744-8514). 
 
Phil: please go ahead and meet with Tim, and let me know what questions
remain afterwards.
 
Thanks!
 
Roger Hall
Technical Director
MidSouth Bioinformatics Center
University of Arkansas at Little Rock
(501) 569-8074
 

From cjfields at uiuc.edu  Tue Jun 26 10:02:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 09:02:29 -0500
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <4681185D.5030402@cam.ac.uk>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
	<246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
	<4681185D.5030402@cam.ac.uk>
Message-ID: <EC86EE5C-02DF-4E4F-AF25-6E53925CBC1F@uiuc.edu>

I'll try to get to that ASAP (as well as a few bugs).  The problem is
we have to patch this in 2-3 places (SeqIO::swiss, SeqIO::embl) due
to duplicated code, something I'm trying to rectify with a new
set of parsers.  I just haven't had the time to work on them lately,
unfortunately.

chris

On Jun 26, 2007, at 8:45 AM, Roy Chaudhuri wrote:

> Sorry, replied to this but forgot to cc the list.
>
> It looks like a related problem to bug 2288 that I filed about  
> Bio::SeqIO::swiss - the period after subgen. is what causes the  
> problems since it is interpreted as a seperator between nodes. I  
> put a patch in for Bio::SeqIO::swiss that works for me, but I guess  
> it might have side effects.
>
> Roy.
> --
> Dr. Roy Chaudhuri
> Department of Veterinary Medicine
> University of Cambridge, U.K.
>
> Chris Fields wrote:
>> I can verify this using bioperl-live.  Can you file this as a bug?
>> http://bugzilla.open-bio.org/
>> chris
>> On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote:
>>> Hi List.
>>>
>>> Trying to parse the embl database, the embl-parser fails on:  
>>> AB019196
>>> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196
>>>
>>>
>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>> MSG: AB019196 seems to have an invalid species classification.
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/ 
>>> Root.pm:359
>>> STACK: Bio::SeqIO::embl::_read_EMBL_Species
>>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
>>> STACK: Bio::SeqIO::embl::next_seq
>>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
>>> STACK: -e:1
>>> -----------------------------------------------------------
>>>
>>>
>>> It seems to be dissatisfied with this:
>>> OS   Acetobacter aceti
>>> OC   Bacteria; Proteobacteria; Alphaproteobacteria;  
>>> Rhodospirillales;
>>> OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.
>>>
>>> Thanks.
>>> -- 
>>> Jesper Krogh
>>>
>>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From rrc22 at cam.ac.uk  Tue Jun 26 09:45:01 2007
From: rrc22 at cam.ac.uk (Roy Chaudhuri)
Date: Tue, 26 Jun 2007 14:45:01 +0100
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
	<246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
Message-ID: <4681185D.5030402@cam.ac.uk>

Sorry, replied to this but forgot to cc the list.

It looks like a problem related to bug 2288, which I filed against
Bio::SeqIO::swiss: the period after "subgen." is what causes the problem,
since it is interpreted as a separator between nodes. I put in a patch
for Bio::SeqIO::swiss that works for me, but I guess it might have side
effects.
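
To illustrate the failure mode, here is a minimal sketch (not the patch
attached to bug 2288, and not the code in Bio::SeqIO::swiss or
Bio::SeqIO::embl; the sub name parse_oc_lines is made up for this example).
It splits the OC lines on semicolons only and strips just the one
terminating period, so an abbreviation such as "subgen." inside a node
name is left alone:

```perl
use strict;
use warnings;

# Hypothetical helper: turn OC lines into a list of classification nodes.
# Nodes are separated by ';' and the classification ends with a single '.'.
sub parse_oc_lines {
    my @lines = @_;
    my @parts;
    for my $line (@lines) {
        (my $body = $line) =~ s/^OC\s+//;    # strip the line tag
        $body =~ s/\s+$//;                   # strip trailing whitespace
        push @parts, $body;
    }
    my $text = join ' ', @parts;
    $text =~ s/\.$//;                        # drop only the terminating period

    my @nodes;
    for my $node (split /;/, $text) {
        $node =~ s/^\s+//;
        $node =~ s/\s+$//;
        push @nodes, $node if length $node;
    }
    return @nodes;
}

my @nodes = parse_oc_lines(
    'OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;',
    'OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.',
);
print "$_\n" for @nodes;
```

With the AB019196 lines above, the last node stays in one piece as
"Acetobacter subgen. Acetobacter" instead of being cut at the period.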

Roy.
--
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.

Chris Fields wrote:
> I can verify this using bioperl-live.  Can you file this as a bug?
> 
> http://bugzilla.open-bio.org/
> 
> chris
> 
> On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote:
> 
>> Hi List.
>>
>> Trying to parse the embl database, the embl-parser fails on: AB019196
>> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196
>>
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: AB019196 seems to have an invalid species classification.
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359
>> STACK: Bio::SeqIO::embl::_read_EMBL_Species
>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
>> STACK: Bio::SeqIO::embl::next_seq
>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
>> STACK: -e:1
>> -----------------------------------------------------------
>>
>>
>> It seems to be dissatisfied with this:
>> OS   Acetobacter aceti
>> OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;
>> OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.
>>
>> Thanks.
>> -- 
>> Jesper Krogh
>>
>>
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 


From bix at sendu.me.uk  Tue Jun 26 10:13:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 26 Jun 2007 15:13:48 +0100
Subject: [Bioperl-l] Error in constructing Phylogenetic tree
	using	BioPerl
In-Reply-To: <571051.26423.qm@web51107.mail.re2.yahoo.com>
References: <571051.26423.qm@web51107.mail.re2.yahoo.com>
Message-ID: <46811F1C.3020307@sendu.me.uk>

SujiBala wrote:
> Hi Hello
>   This is sujatha from singapore. I am trying to construct phylo tree using DNAStatistics and Kirma method. But I am getting the following error message. It would be nice if you could help me resolve this problem asap. 
>    
>   Error messasge
>     Must supply  a valid Bio::Align::AlignI for the _align parameter  in the distance 
>   My program
>   use Bio::AlignIO;
> use Bio::Align::DNAStatistics;
> use Bio::Tree::DistanceFactory;
> # for a dna alignment  can also use ProteinStatistics
> @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
> $stats = Bio::Align::DNAStatistics->new;
> $mat = $stats->distance( -align  => @aln,-method => 'Kimura');

Without looking at the docs for these modules, it is immediately obvious 
that Bio::AlignIO->new() is going to return an instance of Bio::AlignIO 
and not an array of alignments. It is also obvious that the -align => 
parameter for the distance() method can't take an array of anything (but 
probably an array ref?).

Check the documentation and make sure you know what objects you're 
generating and passing around.


From schlesi at ebi.ac.uk  Tue Jun 26 10:59:13 2007
From: schlesi at ebi.ac.uk (Felix Schlesinger)
Date: Tue, 26 Jun 2007 15:59:13 +0100
Subject: [Bioperl-l] PAML parser
Message-ID: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>

Hello,

I am trying to use the PAML result parser (BioPerl
Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15.
However on all outputs I have tested no result object is returned
(next_result is undef). This includes the HIV and Lysin datasets
included with PAML.
My code is:

my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir =>
"/.");
my $result = $codemlp->next_result;
foreach my $model ( $result->get_NSSite_results ) {
...

and the error is: Can't call method "get_NSSite_results" on an
undefined value ...

I can include the mlc file if needed. Is this supposed to work? Or do
I have to run PAML from BioPerl to parse the results?

Thanks
  Felix


From Xianjun.Dong at bccs.uib.no  Tue Jun 26 10:35:17 2007
From: Xianjun.Dong at bccs.uib.no (Xianjun Dong)
Date: Tue, 26 Jun 2007 16:35:17 +0200
Subject: [Bioperl-l] bug for PAML::Baseml
Message-ID: <46812425.8000509@ii.uib.no>

An HTML attachment was scrubbed...
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070626/cb3d8193/attachment-0003.html>

From Xianjun.Dong at bccs.uib.no  Tue Jun 26 11:40:47 2007
From: Xianjun.Dong at bccs.uib.no (Xianjun Dong)
Date: Tue, 26 Jun 2007 17:40:47 +0200
Subject: [Bioperl-l] bug for PAML::Baseml
In-Reply-To: <46812425.8000509@ii.uib.no>
References: <46812425.8000509@ii.uib.no>
Message-ID: <4681337F.1000902@ii.uib.no>

An HTML attachment was scrubbed...
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070626/604ce866/attachment-0003.html>

From hartzell at alerce.com  Tue Jun 26 14:12:04 2007
From: hartzell at alerce.com (George Hartzell)
Date: Tue, 26 Jun 2007 14:12:04 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
Message-ID: <18049.22260.967524.353173@almost.alerce.com>


There don't seem to be any .cvsignore files in the repository, or in
CVSROOT/cvsignore.

Am I missing something, or don't we use them?

g.


From cjfields at uiuc.edu  Tue Jun 26 15:54:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 14:54:25 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18049.22260.967524.353173@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
Message-ID: <74515C87-5553-4AF0-9B83-26F3E71E15C8@uiuc.edu>

Not sure.  You may want to email support at open-bio.org; my guess is  
Chris D or Jason would have an answer.

chris

On Jun 26, 2007, at 1:12 PM, George Hartzell wrote:

>
> There don't seem to be any .cvsignore files in the repository, or in
> CVSROOT/cvsignore.
>
> Am I missing something, or don't we use them?
>
> g.
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Tue Jun 26 15:55:21 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 26 Jun 2007 16:55:21 -0300
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18049.22260.967524.353173@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
Message-ID: <E6FC4C83-7C71-4D3D-902A-3DE79E02A57C@gmx.net>

Maybe we've been using the default?

On Jun 26, 2007, at 3:12 PM, George Hartzell wrote:

>
> There don't seem to be any .cvsignore files in the repository, or in
> CVSROOT/cvsignore.
>
> Am I missing something, or don't we use them?
>
> g.
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Tue Jun 26 16:21:30 2007
From: hartzell at alerce.com (George Hartzell)
Date: Tue, 26 Jun 2007 16:21:30 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
Message-ID: <18049.30026.61328.134490@almost.alerce.com>

Chris Fields writes:
 > [...]
 > It looks like George Hartzell may be taking a crack at it, with  
 > Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
 > could have something testable relatively soon.  After that we'll need  
 > to work out a few other issues, basically what's on Hilmar's list.

There's a repository at file:///home/hartzell/bioperl with all of the
component projects in place.

If you have a dev.open-bio.org account and you're in the bioperl
group, you're good to get at it via:

  file:///home/hartzell/bioperl

or 

  svn+ssh://dev.open-bio.org/home/hartzell/bioperl

There are a couple of things to think about:

  - How are we going to provide access?  I *think* that I heard a
    decision to use http:// and https://.  Who gets to set that up?

  - What do we want to do about keywords?  The cvs2svn tool guesses
    and automatically sets the svn:keywords property to Author Date
    Revision and Id on many of the files in the tree.  If it looks
    like it got it right, we can stick with it.  Or, we can disable
    that conversion; I've cribbed a little script that'll grep out
    files using Id and set the svn:keywords property accordingly.

  - What do we want to do about svn:ignore?  I haven't seen any
    .cvsignore files.
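
For the keywords point, the kind of script meant could look roughly like
the sketch below. This is not the actual script; the sub name
files_using_id is invented here, and the example only prints the
svn propset commands instead of running them, so nothing in a working
copy is changed:

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return the files under $root that contain a
# CVS-style Id keyword, skipping .svn and CVS administrative directories.
sub files_using_id {
    my ($root) = @_;
    my @hits;
    find({
        no_chdir => 1,                       # $_ holds the full path
        wanted   => sub {
            if (-d $_ && m{(?:^|/)(?:\.svn|CVS)\z}) {
                $File::Find::prune = 1;      # do not descend into it
                return;
            }
            return unless -f $_;
            open my $fh, '<', $_ or return;
            while (my $line = <$fh>) {
                # matches both the bare and the expanded keyword form
                if ($line =~ /\$Id(?::[^\$]*)?\$/) {
                    push @hits, $_;
                    last;
                }
            }
            close $fh;
        },
    }, $root);
    my @sorted = sort @hits;
    return @sorted;
}

# Dry run: print the commands for the tree given on the command line.
if (@ARGV) {
    print "svn propset svn:keywords Id '$_'\n" for files_using_id($ARGV[0]);
}
```

Printing the commands first makes it easy to compare the list against
what cvs2svn guessed before deciding which approach to keep.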

Beyond that, how does the repo look?

How are we going to cut over?

Are we going to try to push svn commits to the read-mostly CVS repo,
or just keep it around for history's sake?  (I lean towards the latter.)

g.


From jason at bioperl.org  Tue Jun 26 19:22:20 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:22:20 -0300
Subject: [Bioperl-l] PAML parser
In-Reply-To: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>
References: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>
Message-ID: <D536496C-D716-42DF-B614-DD43C1B13A67@bioperl.org>

Can you make sure you have the latest version of these
modules from the CVS repository?  We had to fix things to parse PAML 3.15
output, but I can't tell if this is the problem or something else.
You can also add -verbose => 1 when you initialize the object; it
may print more warnings about any problems it runs into.
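
A minimal sketch of what that could look like, assuming BioPerl is
installed. The sub name first_paml_result is made up for this example,
and the explicit check on next_result is an addition so that a failed
parse is reported instead of dying on an undefined value:

```perl
use strict;
use warnings;
use Bio::Tools::Phylo::PAML;

# Hypothetical wrapper: parse a codeml output file verbosely and
# return the first result, or undef if nothing could be parsed.
sub first_paml_result {
    my ($file) = @_;
    my $parser = Bio::Tools::Phylo::PAML->new(
        -file    => $file,
        -verbose => 1,      # ask the parser to report what it has trouble with
    );
    my $result = $parser->next_result;
    warn "no result parsed from $file; check the PAML and BioPerl versions\n"
        unless defined $result;
    return $result;
}

# Usage: perl script.pl mlc
if (@ARGV) {
    my $result = first_paml_result($ARGV[0]) or exit 1;
    for my $model ($result->get_NSSite_results) {
        print "model ", $model->model_num, "\n";
    }
}
```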


-jason

On Jun 26, 2007, at 11:59 AM, Felix Schlesinger wrote:

> Hello,
>
> I am trying to use the PAML result parser (BioPerl
> Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15.
> However on all outputs I have tested no result object is returned
> (next_result is undef). This includes the HIV and Lysin datasets
> included with PAML.
> My code is:
>
> my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir =>
> "/.");
> my $result = $codemlp->next_result;
> foreach my $model ( $result->get_NSSite_results ) {
> ...
>
> and the error is: Can't call method "get_NSSite_results" on an
> undefined value ...
>
> I can include the mlc file is needed. Is this supposed to work? Or do
> I have to run paml from bioperl to parse the results?
>
> Thanks
>   Felix

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Tue Jun 26 19:27:05 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:27:05 -0300
Subject: [Bioperl-l] Error in constructing Phylogenetic tree
	using	BioPerl
In-Reply-To: <46811F1C.3020307@sendu.me.uk>
References: <571051.26423.qm@web51107.mail.re2.yahoo.com>
	<46811F1C.3020307@sendu.me.uk>
Message-ID: <A99815DC-0FC2-4019-B0C4-CA8EA713FEB0@bioperl.org>


On Jun 26, 2007, at 11:13 AM, Sendu Bala wrote:

> SujiBala wrote:
>> Hi Hello
>>   This is sujatha from singapore. I am trying to construct phylo  
>> tree using DNAStatistics and Kirma method. But I am getting the  
>> following error message. It would be nice if you could help me  
>> resolve this problem asap.
>>
>>   Error messasge
>>     Must supply  a valid Bio::Align::AlignI for the _align  
>> parameter  in the distance
>>   My program
>>   use Bio::AlignIO;
>> use Bio::Align::DNAStatistics;
>> use Bio::Tree::DistanceFactory;
>> # for a dna alignment  can also use ProteinStatistics
>> @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
>> $stats = Bio::Align::DNAStatistics->new;
>> $mat = $stats->distance( -align  => @aln,-method => 'Kimura');
>

Yep, you want to call next_aln on the Bio::AlignIO object and pass the
alignment it returns as -align.
I fixed the example code in the HOWTO so it should work properly now:
http://bioperl.org/wiki/HOWTO:Trees#Constructing_Trees
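
A sketch of the corrected script, assuming BioPerl is installed and that
the input really is in clustalw format. The sub name build_nj_tree and
the Newick output step are only for illustration; they are not part of
the original post or of the HOWTO:

```perl
use strict;
use warnings;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;
use Bio::Tree::DistanceFactory;
use Bio::TreeIO;

# Hypothetical wrapper: read the first alignment from a clustalw file
# and build a neighbor-joining tree from Kimura distances.
sub build_nj_tree {
    my ($file) = @_;

    # new() returns a Bio::AlignIO stream, not the alignment itself
    my $alnio = Bio::AlignIO->new(-file => $file, -format => 'clustalw');

    # next_aln() hands back one alignment object (undef at end of input)
    my $aln = $alnio->next_aln
        or die "no alignment found in $file\n";

    my $stats = Bio::Align::DNAStatistics->new;
    my $mat   = $stats->distance(-align => $aln, -method => 'Kimura');

    my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ');
    return $dfactory->make_tree($mat);
}

# Usage: perl script.pl out4.fa
if (@ARGV) {
    my $tree = build_nj_tree($ARGV[0]);
    Bio::TreeIO->new(-format => 'newick', -fh => \*STDOUT)->write_tree($tree);
}
```

The key difference from the posted code is that a single scalar alignment
object, not an array, is passed as -align.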

> Without looking at the docs for these modules, it is immediately  
> obvious
> that Bio::AlignIO->new() is going to return an instance of  
> Bio::AlignIO
> and not an array of alignments. It is also obvious that the -align =>
> parameter for the distance() method can't take an array of anything  
> (but
> probably an array ref?).
>
> Check the documentation and make sure you know what objects you're
> generating and passing around.

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Tue Jun 26 19:29:11 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:29:11 -0300
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <E6FC4C83-7C71-4D3D-902A-3DE79E02A57C@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
	<E6FC4C83-7C71-4D3D-902A-3DE79E02A57C@gmx.net>
Message-ID: <5A8FD8A3-9593-4925-AA74-D4B03CDC1C34@bioperl.org>

We don't have one. I have one on my local machine that basically
defines *~ and .#* so I never had a problem.

Feel free to propose one if you think it is important; I never really
thought it was.

On Jun 26, 2007, at 4:55 PM, Hilmar Lapp wrote:

> Maybe we've been using the default?
>
> On Jun 26, 2007, at 3:12 PM, George Hartzell wrote:
>
>>
>> There don't seem to be any .cvsignore files in the repository, or in
>> CVSROOT/cvsignore.
>>
>> Am I missing something, or don't we use them?
>>
>> g.
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From j_martin at lbl.gov  Tue Jun 26 21:01:29 2007
From: j_martin at lbl.gov (Joel Martin)
Date: Tue, 26 Jun 2007 18:01:29 -0700
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To: <11302459.post@talk.nabble.com>
References: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
	<BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>
	<11302459.post@talk.nabble.com>
Message-ID: <20070627010129.GA8628@eniac.jgi-psf.org>

Hello, 
  The tutorial code snippet is an endless loop; I think it's supposed
to remove the rid.  As the only print statement you added is after the
endless loop, you aren't seeing anything happen.

Use the code from this instead:

perldoc Bio::Tools::Run::RemoteBlast

  The bptutorial.pl does have a note saying that the snippet is not useful
and to read the pod for Bio::Tools::Run::RemoteBlast; it's in the sentences
right after the code snippet you used.

  Though, as it's a tutorial example it might be nice to remove the while
loop .. or at least add the sleep(5) part.
http://www.bioperl.org/wiki/Bptutorial.pl#Running_BLAST_.28using_RemoteBlast.pm.29
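
For orientation, the polling loop in the RemoteBlast pod has roughly the
shape below. This is a sketch written as a standalone sub: the name
poll_blast and the $delay parameter are made up, and $factory is assumed
to be a Bio::Tools::Run::RemoteBlast object on which submit_blast has
already been called. It shows where the rid is removed and where the
sleep goes:

```perl
use strict;
use warnings;

# Hypothetical helper: poll until every submitted request has either
# produced a report or failed, and return the reports.
sub poll_blast {
    my ($factory, $delay) = @_;
    $delay = 5 unless defined $delay;
    my @reports;
    while (my @rids = $factory->each_rid) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if (!ref $rc) {
                # a negative code means the request failed: give up on it
                $factory->remove_rid($rid) if $rc < 0;
                # otherwise the job is still running; wait before retrying
                sleep $delay;
            }
            else {
                push @reports, $rc;
                # without this the while loop above never terminates
                $factory->remove_rid($rid);
            }
        }
    }
    return @reports;
}
```

Each returned report can then be read with next_result as shown in the pod.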

  Aside from that, you may have network issues but www.ncbi.nlm.nih.gov
doesn't respond to ping as far as I can tell. 
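
Since ping is not a reliable check here, a plain TCP connection to port 80
says more about whether a direct connection to the server is possible.
A small sketch using only core modules (the sub name can_connect and the
5 second default timeout are arbitrary choices for this example):

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: return 1 if a TCP connection to $host:$port can
# be opened within $timeout seconds, 0 otherwise.
sub can_connect {
    my ($host, $port, $timeout) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => defined $timeout ? $timeout : 5,
    );
    return 0 unless $sock;
    close $sock;
    return 1;
}

# Usage: perl script.pl www.ncbi.nlm.nih.gov
if (@ARGV) {
    my $host = $ARGV[0];
    print "$host:80 is ",
        (can_connect($host, 80) ? "reachable" : "NOT reachable"), "\n";
}
```

Note that this tests a direct connection only. If it fails while a web
browser on the same machine works, a proxy is the likely explanation.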

Joel


On Tue, Jun 26, 2007 at 02:26:03AM -0700, don esteban wrote:
> 
> Try using the Proxyconfiguration in your script:
> 
> $ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080";
> 
> 
> 
> 
> L Xu wrote:
> > 
> > I do have the internet connection bu not use the proxy server.
> > I tested the network connection with ping command (below). The ncbi
> > website 
> > does not response. Is there any special network setting needed for 
> > connecting the ncbi website?
> > Thank you so much.
> > 
> > C:\>ping www.yahoo.com
> > 
> > Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:
> > 
> > Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
> > Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
> > Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
> > Reply from 69.147.114.210: bytes=32 time=360ms TTL=45
> > 
> > Ping statistics for 69.147.114.210:
> >     Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
> > Approximate round trip times in milli-seconds:
> >     Minimum = 312ms, Maximum = 363ms, Average = 338ms
> > 
> > C:\>ping www.ncbi.nlm.nih.gov
> > 
> > Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:
> > 
> > Request timed out.
> > Request timed out.
> > Request timed out.
> > Request timed out.
> > 
> > Ping statistics for 130.14.29.110:
> >     Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
> > 
> > 
> > 
> > = = = Original message = = =
> > 
> > Judging by the output it looks like you have no network access or? can't 
> > connect to the server (what remoteblast needs).? Make sure you? don't need 
> > proxy settings.
> > 
> > To preempt the next question, no, I'm not going to explain what a? proxy 
> > is.? The RemoteBlast docs show how to set them, and Google is a? wonderful 
> > tool...
> > 
> > chris
> > 
> > On Jun 13, 2007, at 7:16 AM, L Xu wrote:
> > 
> > 
> >    ...
> > -------------------- WARNING ---------------------
> > MSG: <HTML>
> > <HEAD><TITLE>An Error Occurred</TITLE></HEAD>
> > <BODY>
> > <H1>An Error Occurred</H1>
> > 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
> > </BODY>
> > </HTML>
> > 
> > ---------------------------------------------------
> > ...
> > 


From melvinp at pacific.net.sg  Wed Jun 27 01:25:08 2007
From: melvinp at pacific.net.sg (Melvin P)
Date: Wed, 27 Jun 2007 13:25:08 +0800
Subject: [Bioperl-l] finding statistics on AA
Message-ID: <4681F4B4.8010609@pacific.net.sg>

Hi, I am new to BioPerl. I am trying to find out whether there is any class
that I can use to compute occupancy/occurrence counts, pseudocounts,
observed frequencies, etc. given a few amino acid sequences. For example,
what is the observed frequency of residue i at position p? My objective
is to analyze the information content. Thanks.


From bix at sendu.me.uk  Wed Jun 27 06:23:58 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 11:23:58 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <467FBDD3.8050009@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
Message-ID: <46823ABE.2080300@sendu.me.uk>

Sendu Bala wrote:
> Sendu Bala wrote:
>> In considering updating all the test scripts to [... use] 
>> t/lib/BioperlTest.pm
> 
> I'm now in the process of converting all test scripts.

And I've now completed that job (for bioperl-live at least), except for 
t/EUtilities.t since I know Chris is working on it.


In addition to converting to Test::More where necessary, I've also made 
all pseudo-TODO blocks real ones. Previously I had advised using SKIP 
blocks instead, since TODO blocks need a Test::Harness upgrade. However, I 
think in the next release we ought to make such upgrading compulsory 
(which should be automatic when combined with compulsory usage of 
Module::Build and Test::More in turn: users simply have to update CPAN).
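
For anyone unfamiliar with the two block types, here is a minimal
Test::More sketch. The test names are invented, and using BIOPERLDEBUG as
the skip condition is only an illustration, not how BioperlTest.pm
decides what to skip:

```perl
use strict;
use warnings;
use Test::More tests => 3;

# A normal test that is expected to pass.
is(lc('ACGT'), 'acgt', 'lowercasing works');

# A real TODO block: the failure is reported as "not ok ... # TODO"
# but does not make the test script fail.
TODO: {
    local $TODO = 'feature not implemented yet';
    ok(0, 'unimplemented feature');
}

# A SKIP block: the test is not run at all unless the condition holds.
SKIP: {
    skip 'set BIOPERLDEBUG to run this test', 1 unless $ENV{BIOPERLDEBUG};
    ok(1, 'debug-only test');
}
```

The TODO test still runs every time, so once the feature is implemented
the harness reports it as unexpectedly passing; a skipped test never runs.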


The conversion to BioperlTest directly led to the discovery and fixing 
of 6 minor bugs, so was certainly not without merit.


No user or developer needs to have BIOPERLDEBUG permanently set to true 
anymore. To run all tests you just have to answer yes to the BioDBGFF 
and networking questions of 'perl Build.PL'. With './Build test' you 
then get clean, easy-to-read output that makes it obvious that we 
currently have these issues:

t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in 
another thread.

t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, 
t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and 
t/Annotation.t all have TODO tests. If you know about those modules, now 
would be a great time to implement those TODOs!

Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are 
deprecated' warnings.


To debug a particular test you could say:
BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t


I've updated the HOWTO for writing test scripts:
http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests


From cjfields at uiuc.edu  Wed Jun 27 07:55:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 06:55:47 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <46823ABE.2080300@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk>
Message-ID: <DC0F57B9-D733-4C89-9B7A-65E1ADFCFDD2@uiuc.edu>


On Jun 27, 2007, at 5:23 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Sendu Bala wrote:
>>> In considering updating all the test scripts to [... use]
>>> t/lib/BioperlTest.pm
>>
>> I'm now in the process of converting all test scripts.
>
> And I've now completed that job (for bioperl-live at least), except  
> for
> t/EUtilities.t since I know Chris is working on it.

The network tests will be much shorter; the bulk will be transferred
to a new suite for the backend Bio::Tools::EUtilities parser (which
will test static files in t/data/eutils, so no dynamic changes).

> In addition to converting to Test::More where necessary, I've also  
> made
> all psuedo-TODO blocks real ones. Previously I had advised to use SKIP
> blocks instead since TODO blocks need a Test::Harness upgrade.  
> However I
> think in the next release we ought to make such upgrading compulsory
> (which should be automatic when combined with compulsory usage of
> Module::Build and Test::More in turn: users simply have to update  
> CPAN).

Sounds good to me, but there may be some grumblings out there.

Having specific TODOs is nice because we can test them without failures.  Handy.

> The conversion to BioperlTest directly led to the discovery and fixing
> of 6 minor bugs, so was certainly not without merit.
>
>
> No user or developer needs to have BIOPERLDEBUG permanently set to  
> true
> anymore. To run all tests you just have to answer yes to the BioDBGFF
> and networking questions of 'perl Build.PL'. With './Build test' you
> then get clean, easy-to-read output where it is obvious to see that we
> currently have these issues:
>
> t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in
> another thread.
>
> t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t,
> t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and
> t/Annotation.t all have TODO tests. If you know about those  
> modules, now
> would be a great time to implement those TODOs!

The RNA_SearchIO.t is from ERPIN output; there's no easy way to  
generate it beyond having the user supply the info (or having the  
program author change the output).

Will have to look at the others to see what's involved; maybe  
something for the priority list?

> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are
> deprecated' warnings.

I ran into this with XML::Simple data structures recently; there was
an easy way around it via XML::Simple's ForceArray option.  It has to
do with attempting to assign data to/from a hash in a specific way
involving array references (though I can't remember exactly how; I've
slept since then).

> To debug a particular test you could say:
> BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t
>
>
> I've updated the HOWTO for writing test scripts:
> http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests

Good work!

chris


From schlesi at ebi.ac.uk  Wed Jun 27 07:57:27 2007
From: schlesi at ebi.ac.uk (Felix Schlesinger)
Date: Wed, 27 Jun 2007 12:57:27 +0100
Subject: [Bioperl-l] Selecting columns from alignment
Message-ID: <7317d50c0706270457i1c3d92a8hb124fa663f51b837@mail.gmail.com>

Hi,

is there an elegant way to select columns from an alignment object
fulfilling a certain property (for example less than x gaps)?
Everything I can see from Align::AlignI seems to involve looking at
the individual sequences, creating lots of slices and appending them.
Is there a better way in bioperl or, failing that, does anyone know a
software package with similar functionality (t-coffee has lots of
filters for alignments, but nothing to select columns besides by
position it seems). Ideally this would also return a mapping from old
to new positions in one of the sequences of course.
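
A minimal sketch of the kind of column filter described above, written
as a shell function around awk rather than with BioPerl objects. It
assumes the alignment is supplied as one gapped sequence per line, all
of equal length, with '-' as the gap character. The function name
filter_columns and the "map old new" output lines are made up for
illustration. The mapping is between alignment columns; turning it into
residue positions within one sequence would need an extra pass over
that sequence.

```shell
# Keep only alignment columns with fewer than $1 gap characters.
# Input (stdin): one aligned sequence per line, equal lengths assumed.
# Output: the filtered rows, then one "map OLD NEW" line per kept column
# (1-based column numbers).
filter_columns() {
    max_gaps=$1
    awk -v max="$max_gaps" '
        { row[NR] = $0; if (length($0) > width) width = length($0) }
        END {
            kept = 0
            # Count gaps column by column and remember the columns to keep.
            for (c = 1; c <= width; c++) {
                gaps = 0
                for (r = 1; r <= NR; r++)
                    if (substr(row[r], c, 1) == "-") gaps++
                if (gaps < max) keep[++kept] = c
            }
            # Rebuild every row from the kept columns only.
            for (r = 1; r <= NR; r++) {
                out = ""
                for (k = 1; k <= kept; k++)
                    out = out substr(row[r], keep[k], 1)
                print out
            }
            # Old-to-new column mapping.
            for (k = 1; k <= kept; k++) print "map", keep[k], k
        }'
}
```

For example, feeding the three rows AC-GT, A--GT and ACTG- through
"filter_columns 2" keeps columns 1, 2, 4 and 5 and drops column 3,
which has two gaps.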

Thanks
  Felix


From cjfields at uiuc.edu  Wed Jun 27 10:36:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 09:36:41 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>


On Jun 26, 2007, at 3:21 PM, George Hartzell wrote:

> ...
> If you have a dev.open-bio.org account and you're in the bioperl
> group, you're good to get at it via:
>
>   file:///home/hartzell/bioperl
>
> or
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl

I managed to get it working using file://.  Haven't tried svn+ssh yet  
but I've had persistent problems getting ssh to work properly on my  
macbook; not sure why yet but I haven't had time to play around with it.

> There are a couple of things to think about:
>
>   - how are we going to provide access.  I *think* that I heard a
>     decision to use http:// and https://.  Who gets to set that up?

That hasn't been decided yet and will be up to a consensus of the  
core devs, but I think the odds are in favor of allowing https:// but  
against allowing http://.

As for setup that could be anyone with admin privs, though it may be  
best left up to Chris D, Jason, or Mauricio.

>   - what do we want to do about keywords.  The cvs2svn tool guesses
>     and automatically sets the svn:keywords property to Author Date
>     Revision and Id on many of the files in the tree.  If it looks
>     like it got it right, we can stick with it.  Or, we can disable
>     that conversion and I've cribbed a little script that'll grep out
>     files using Id and set the svn:keywords property accordingly.

Probably again a consensus issue, but you can choose one route.  My  
inclination is the former if it's easier.

>   - what do we want to do about svn:ignore?  I haven't seen any
>     .cvsignore files.

Not sure.  I've never used one personally, but (as Jason suggests) if  
you have ideas for one you can propose them, or we can suggest devs  
set up svn:ignore locally.

> Beyond that, how does the repo look?

Seems fine, though a simple checkout of file:///home/hartzell/bioperl
gets everything (all distros, branches, etc).  We need to make sure
everyone uses
'svn co file:///home/hartzell/bioperl/bioperl-live/trunk /live'
or similar if they just want the latest core/db/etc.

We'll also need to start a svn wiki page to show how to get relevant  
distros (similar in style probably to the cvs page, with dev  
information, how to set up ssh keys, https stuff, etc).

> How are we going to cut over?
>
> Are we going to try to push svn commits to the read-mostly CVS repo,
> or just keep it around for history's sake (I lean towards the latter).

I think a clean cut-over.  Everyone would be warned to hold commits  
for a day (lest they be lost), then probably do something in this order:

- switch cvs to read-only except for svn commits
- run a clean cvs2svn
- set up svn as read/write
- set up test commits to cvs via svn
- disable cvs commit messages to bioperl-guts, enable svn commit  
messages in its place.
- push svn commits over to read-only cvs

cvs >>must<< be read-only after that point (no cvs->svn commits),  
with write access only available through svn.  If at some future  
point there is no reason to keep it around or that it is more trouble  
than it's worth, we can make a decision then on cvs's fate.

> g.

chris


From rvos at interchange.ubc.ca  Wed Jun 27 10:23:25 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Wed, 27 Jun 2007 07:23:25 -0700 (PDT)
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
Message-ID: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>

 
> Are we going to try to push svn commits to the read-mostly CVS repo,
> or just keep it around for history's sake (I lean towards the latter).

I'm a little confused - surely once the svn is up and running we'll want *no more* cvs commits? Parallel repositories that each accumulate stuff will be a nightmare. I'm probably just not getting your point.

Rutger


From cjfields at uiuc.edu  Wed Jun 27 11:18:03 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 10:18:03 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>


On Jun 27, 2007, at 9:23 AM, rvos wrote:

>
>> Are we going to try to push svn commits to the read-mostly CVS repo,
>> or just keep it around for history's sake (I lean towards the  
>> latter).
>
> I'm a little confused - surely once the svn is up and running we'll  
> want *no more* cvs commits? Parallel repositories that each  
> accumulate stuff will be a nightmare. I'm probably just not getting  
> your point.
>
> Rutger

Most projects make a clean break with cvs (no more commits) for the  
reasons you point out.  Not sure how the other core devs feel about  
that but I could go for that; it would def. prevent headaches.  We  
could keep cvs for the time being as read-only, with no svn->cvs  
syncing.

There are a few projects which have (as a phase-out plan) old read-only  
cvs repositories available, with an automatic svn->cvs commit  
following every new svn commit.  Not sure how that works, esp. for  
branching/merging and so on which I could see potentially getting hairy.

chris


From cjfields at uiuc.edu  Wed Jun 27 12:05:49 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 11:05:49 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <5EA56270-3427-4995-B3C1-2789229AACF1@uiuc.edu>


On Jun 26, 2007, at 3:21 PM, George Hartzell wrote:

> ...If you have a dev.open-bio.org account and you're in the bioperl
> group, you're good to get at it via:
>
>   file:///home/hartzell/bioperl
>
> or
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl

Did manage to get svn+ssh working (with some password harassment);  
core tests passed enough that I think everything's okay.  If ssh keys  
are set up correctly (mine aren't) it should work fine.

chris


From dmessina at wustl.edu  Wed Jun 27 12:27:32 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 27 Jun 2007 11:27:32 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
Message-ID: <BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>

> [Chris]
>
> I managed to get it working using file://.  Haven't tried svn+ssh yet
> but I've had persistent problems getting ssh to work properly on my
> macbook; not sure why yet but I haven't had time to play around  
> with it.

I just did a checkout and a test commit, both via svn+ssh -- works  
great for me.


>> [George]
>>
>>   - what do we want to do about keywords.  The cvs2svn tool guesses
>>     and automatically sets the svn:keywords property to Author Date
>>     Revision and Id on many of the files in the tree.  If it looks
>>     like it got it right, we can stick with it.  Or, we can disable
>>     that conversion and I've cribbed a little script that'll grep out
>>     files using Id and set the svn:keywords property accordingly.


I would think we would want "Author Date Id Rev URL" set on  
everything, no? So either cvs2svn or your tool (whichever you think  
is better), followed by

	svn propset svn:keywords "Author Date Id Rev URL" *

from the root of a working copy would take care of all of the  
existing files in the repository, I think.

George knows more about this than I do, but I think you can set up a  
global config file with

	[miscellany]
	enable-auto-props = yes

	[auto-props]
	* = svn:keywords="Author Date Id Rev URL"

to ensure it gets set on any future additions to the repository.


>>   - what do we want to do about svn:ignore?  I haven't seen any
>>     .cvsignore files.
>
> Not sure.  I've never used one personally, but (as Jason suggests) if
> you have ideas for one you can propose them, or we can suggest devs
> set up svn:ignore locally.

I use the default global-ignores

	global-ignores = *.o *.lo *.la #*# .*.rej *.rej .*~ *~ .#* .DS_Store

(again, in my system-wide config file), but I'm not tied to that. I  
do think we should have one, though; individuals can easily override  
any settings in the system-wide config with their own ~/.subversion/ 
config.


>> Beyond that, how does the repo look?

Looks great, George! Thanks for doing this.


Dave


From hartzell at alerce.com  Wed Jun 27 13:00:53 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 13:00:53 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <18050.38853.526224.791878@almost.alerce.com>

rvos writes:
 >  
 > > Are we going to try to push svn commits to the read-mostly CVS repo,
 > > or just keep it around for history's sake (I lean towards the latter).
 > 
 > I'm a little confused - surely once the svn is up and running we'll
 > want *no more* cvs commits? Parallel repositories that each
 > accumulate stuff will be a nightmare. I'm probably just not getting
 > your point. 

There had been some talk of keeping a CVS repository around as a
read-only mirror of the svn repo, presumably for people whose habits
or setup won't let them use svn.

In theory, each commit to the svn repo can be automagically pushed
down into CVS w/out user intervention; google will tell you how, but
I've never run anything that way.

g.


From dmessina at wustl.edu  Wed Jun 27 13:27:01 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 27 Jun 2007 12:27:01 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
Message-ID: <99969FC2-479E-408C-AADB-7664EBE937CF@wustl.edu>

> [Chris]
> We'll also need to start a svn wiki page to show how to get relevant
> distros (similar in style probably to the cvs page, with dev
> information, how to set up ssh keys, https stuff, etc).

I cloned the CVS page and have started adapting it for Subversion:

	http://www.bioperl.org/wiki/Using_Subversion

I'll do some more on it later today, but if anyone wants to fiddle  
with it in the interim, please do.


Dave


From n.haigh at sheffield.ac.uk  Wed Jun 27 14:44:16 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 19:44:16 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <46823ABE.2080300@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk>
Message-ID: <4682B000.2050707@sheffield.ac.uk>

Sendu Bala wrote:
> Sendu Bala wrote:
>> Sendu Bala wrote:
>>> In considering updating all the test scripts to [... use] 
>>> t/lib/BioperlTest.pm
>> I'm now in the process of converting all test scripts.
> 
> And I've now completed that job (for bioperl-live at least), except for 
> t/EUtilities.t since I know Chris is working on it.
> 
> 
> In addition to converting to Test::More where necessary, I've also made 
> all pseudo-TODO blocks real ones. Previously I had advised to use SKIP 
> blocks instead since TODO blocks need a Test::Harness upgrade. However I 
> think in the next release we ought to make such upgrading compulsory 
> (which should be automatic when combined with compulsory usage of 
> Module::Build and Test::More in turn: users simply have to update CPAN).
> 
> 
> The conversion to BioperlTest directly led to the discovery and fixing 
> of 6 minor bugs, so was certainly not without merit.
> 
> 
> No user or developer needs to have BIOPERLDEBUG permanently set to true 
> anymore. To run all tests you just have to answer yes to the BioDBGFF 
> and networking questions of 'perl Build.PL'. With './Build test' you 
> then get clean, easy-to-read output where it is obvious to see that we 
> currently have these issues:
> 
> t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in 
> another thread.
> 
> t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, 
> t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and 
> t/Annotation.t all have TODO tests. If you know about those modules, now 
> would be a great time to implement those TODOs!
> 
> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are 
> deprecated' warnings.

Ah, that reminds me!

I recently tried to do an install of the cvs head (a week or two ago) on
a clean installation of Debian 4.0 (etch). During the installation of
dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
Bioperl. I seem to remember this circular dependency cropping up before
- am I correct - and can you remind me how this was "fixed"?

Cheers
Nath


From bix at sendu.me.uk  Wed Jun 27 14:52:01 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 19:52:01 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B000.2050707@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
Message-ID: <4682B1D1.3080206@sendu.me.uk>

Nathan S. Haigh wrote:
> I recently tried to do an install of the cvs head (a week or two ago) on
> a clean installation of Debian 4.0 (etch). During the installation of
> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
> Bioperl. I seem to remember this circular dependency cropping up before
> - am I correct - and can you remind me how this was "fixed"?

Yes, it always happens. It was 'fixed' by being completely ignored by 
me. Installation is guaranteed to fail, but if you really want it, 
trying to install again after you already have Bioperl installed will 
result in success.
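
The workaround amounts to a two-step install. A rough sketch, assuming
the CPAN shell is already configured and using Bio::Root::Root merely
as a convenient probe for whether Bioperl is present; the function name
is invented for this example.

```shell
# Install Bio::ASN1::EntrezGene only once Bioperl itself can be loaded.
install_entrezgene_after_bioperl() {
    # Probe for an installed Bioperl by trying to load one of its modules.
    if perl -MBio::Root::Root -e 1 2>/dev/null; then
        perl -MCPAN -e 'install "Bio::ASN1::EntrezGene"'
    else
        echo "Bioperl is not installed yet; install it first" >&2
        return 1
    fi
}
```

Run it after the Bioperl install has finished; if Bioperl cannot be
loaded it returns 1 instead of starting an install that is bound to
fail.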

Clearly something nicer could be done. Suggestions on a postcard...


From cjfields at uiuc.edu  Wed Jun 27 15:01:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 14:01:01 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B000.2050707@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
Message-ID: <A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>


On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote:

> Sendu Bala wrote:
>> ...
>> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are
>> deprecated' warnings.
>
> Ah, that reminds me!
>
> I recently tried to do an install of the cvs head (a week or two  
> ago) on
> a clean installation of Debian 4.0 (etch). During the installation of
> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
> Bioperl. I seem to remember this circular dependency cropping up  
> before
> - am I correct - and can you remind me how this was "fixed"?
>
> Cheers
> Nath

Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part  
of Bioperl (and he could become a dev).  That would solve it.

chris


From n.haigh at sheffield.ac.uk  Wed Jun 27 15:16:40 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 20:16:40 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
Message-ID: <4682B798.1010409@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> 
> On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote:
> 
>> Sendu Bala wrote:
>>> ...
>>> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are
>>> deprecated' warnings.
>>
>> Ah, that reminds me!
>>
>> I recently tried to do an install of the cvs head (a week or two ago) on
>> a clean installation of Debian 4.0 (etch). During the installation of
>> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
>> Bioperl. I seem to remember this circular dependency cropping up before
>> - am I correct - and can you remind me how this was "fixed"?
>>
>> Cheers
>> Nath
> 
> Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part of
> Bioperl (and he could become a dev).  That would solve it.
> 
> chris

Just to put the feelers out to see what people think.

It seems (to me at least) that Bioperl modules could/should? be released
as individual modules and that "bioperl" would really constitute a
"bundle" of all these modules - in terms of CPAN anyway. Am I correct in
this thinking? The Bio::ASN1::EntrezGene could simply require a
particular module rather than the whole of bioperl - might get out of
the circular dependency theoretically!?

I'm not suggesting moving in this direction, but just wondered what
others thought about this concept?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgreYczuW2jkwy2gRAi5IAJ9/Alq1fktEmAF16DlKcBVcy7d+jQCeIj+X
tOFQUQ7cGJLUITEDw1+QLxc=
=Yc+g
-----END PGP SIGNATURE-----


From cjfields at uiuc.edu  Wed Jun 27 15:31:44 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 14:31:44 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B798.1010409@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
	<4682B798.1010409@sheffield.ac.uk>
Message-ID: <33C76559-4771-4FDC-9EEA-1645BC3C576C@uiuc.edu>


On Jun 27, 2007, at 2:16 PM, Nathan S. Haigh wrote:

> ...
>
> Just to put the feelers out to see what people think.
>
> It seems (to me at least) that Bioperl modules could/should? be  
> released
> as individual modules and that "bioperl" would really constitute a
> "bundle" of all these modules - in terms of CPAN anyway. Am I  
> correct in
> this thinking? The Bio::ASN1::EntrezGene could simply require a
> particular module rather than the whole of bioperl - might get out of
> the circular dependency theoretically!?
>
> I'm not suggesting moving in this direction, but just wondered what
> others thought about this concept?
>
> Nath

Well, Steve suggested splitting some of core into distinct groups,  
which I tend to agree with in some respects (speed up releases for  
those modules, such as SearchIO, DB, Graphics).  The problem we have  
yet to solve is what we consider 'core'.  Is it Bio::Seq and  
related?  Should it include Bio::DB*?  Should it just be Bio::*  
modules with no or very few external dependencies?  And so on...
That's probably not a decision we want to make immediately (not until
after svn migration, tests finished, maybe a release or two, a beer)...

The Bioperl module dependency that Bio::ASN1::EntrezGene has is  
Bio::Index::AbstractSeq.  You could try a test build of  
Bio::ASN1::EntrezGene to see what happens.

chris


From hlapp at gmx.net  Wed Jun 27 15:49:15 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 16:49:15 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
Message-ID: <E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>


On Jun 27, 2007, at 1:27 PM, David Messina wrote:

> I would think we would want "Author Date Id Rev URL" set on
> everything, no?. So either cvs2svn or your tool (whichever you think
> is better), followed by
>
> 	svn propset svn:keywords "Author Date Id Rev URL" *

Shouldn't this be done recursively?
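
For what it's worth, svn propset does take -R (--recursive). A sketch,
assuming it is pointed at the root of a working copy; the function name
is only for illustration.

```shell
# Set svn:keywords on every versioned file below a working-copy root.
# $1 is the working-copy directory (defaults to the current directory).
set_keywords_recursively() {
    svn propset -R svn:keywords "Author Date Id Rev URL" "${1:-.}"
}
```

Calling set_keywords_recursively from the top of a checkout and then
running 'svn status' should show the property modifications before
anything is committed.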

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Jun 27 15:50:27 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 16:50:27 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
Message-ID: <E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>


On Jun 27, 2007, at 12:18 PM, Chris Fields wrote:

> Most projects make a clean break with cvs (no more commits) for the
> reasons you point out.  Not sure how the other core devs feel about
> that but I could go for that; it would def. prevent headaches.

There shouldn't be any cvs write support after the cut-over I think.  
I don't see the benefit that would justify the huge headache potential.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 27 16:01:40 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:01:40 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
Message-ID: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>


On Jun 27, 2007, at 2:50 PM, Hilmar Lapp wrote:

>
> On Jun 27, 2007, at 12:18 PM, Chris Fields wrote:
>
>> Most projects make a clean break with cvs (no more commits) for the
>> reasons you point out.  Not sure how the other core devs feel about
>> that but I could go for that; it would def. prevent headaches.
>
> There shouldn't be any cvs write support after the cut-over I  
> think. I don't see the benefit that would justify the huge headache  
> potential.
>
> 	-hilmar

Agreed, so maybe we should set that in stone.  That means no svn->cvs  
syncing post-migration as well, I assume.

Now how about a quick straw poll, what kind of access?  svn+ssh is  
already available, but some (Aaron among them) have indicated they  
would like https as well (not sure how involved it would be to set up).

chris


From hlapp at gmx.net  Wed Jun 27 16:08:40 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 17:08:40 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
Message-ID: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net>


On Jun 27, 2007, at 5:01 PM, Chris Fields wrote:

> That means no svn->cvs syncing post-migration as well, I assume.

That's a bit of a different story. People out there have URL links  
into our anonymous CVS repository. If it's not too troublesome (and  
I tend to think it's not) I'd like to maintain those in working  
order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi  
script that maps between the URL flavors (i.e., that maps a CVS-style  
URL to the equivalent SVN link).

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Wed Jun 27 16:15:10 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 16:15:10 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
Message-ID: <18050.50510.84363.355034@almost.alerce.com>

David Messina writes:
 > > [Chris]
 > >
 > > I managed to get it working using file://.  Haven't tried svn+ssh yet
 > > but I've had persistent problems getting ssh to work properly on my
 > > macbook; not sure why yet but I haven't had time to play around  
 > > with it.
 > 
 > I just did a checkout and a test commit, both via svn+ssh -- works  
 > great for me.

Is there anyone working outside of bioperl-{run,live,ext}?

g.


From bix at sendu.me.uk  Wed Jun 27 16:22:13 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 21:22:13 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B798.1010409@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
	<4682B798.1010409@sheffield.ac.uk>
Message-ID: <4682C6F5.4020406@sendu.me.uk>

Nathan S. Haigh wrote:
> It seems (to me at least) that Bioperl modules could/should? be released
> as individual modules and that "bioperl" would really constitute a
> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
> this thinking? The Bio::ASN1::EntrezGene could simply require a
> particular module rather than the whole of bioperl - might get out of
> the circular dependency theoretically!?

No, it wouldn't. The 'problem' only arises because the user is 
/choosing/ to install both Bioperl and Bio::ASN1::EntrezGene at the same 
time. So even if Bioperl was released as separate modules there would 
still be that 'bundle' and users would still choose to do the same 
thing: install all the Bioperl modules as well as all its /optional/ 
recommended modules. And there lies the problem: Bio::ASN1::EntrezGene 
requires  Bioperl modules, and one Bioperl module requires 
Bio::ASN1::EntrezGene, so the circularity isn't solved.


(FYI:
Bio::ASN1::EntrezGene requires Bio::Index::AbstractSeq
Bio::Index::AbstractSeq requires a couple of Bioperl modules, including 
Bio::Root::Root

Bio::SeqIO::entrezgene requires Bio::ASN1::EntrezGene and a bunch of 
Bioperl modules, including Bio::Root::Root.
)


You only avoid circularity by choosing not to install everything in one 
go. Which is something you can do right now with no problems.


From n.haigh at sheffield.ac.uk  Wed Jun 27 16:24:18 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 21:24:18 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
In-Reply-To: <E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
Message-ID: <4682C772.5070502@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hilmar Lapp wrote:
> On Jun 27, 2007, at 12:18 PM, Chris Fields wrote:
> 
>> Most projects make a clean break with cvs (no more commits) for the
>> reasons you point out.  Not sure how the other core devs feel about
>> that but I could go for that; it would def. prevent headaches.
> 
> There shouldn't be any cvs write support after the cut-over I think.  
> I don't see the benefit that would justify the huge headache potential.
> 
> 	-hilmar

I agree. A clean switch from cvs read/write to svn read/write plus cvs
read only sounds the least problematic!

However, how will links to cvs be dealt with? Links on Bioperl could be
switched over to point to svn, but what about possible links from
external sources? Maybe a more generic approach of redirection could
work? Or a simple warning page stating the fact that we have moved from
cvs to svn and provide a common link to follow?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgsdyczuW2jkwy2gRAtuyAKDIpN0TNX0U7sTuE3i+fj6WFZ1K0QCfcX7Y
81KurFwJlRtYFxSmLZP56Sk=
=pp7b
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Wed Jun 27 16:30:19 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 17:30:19 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>


On Jun 26, 2007, at 5:21 PM, George Hartzell wrote:

>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>

Cool - this works for me.

One thing I notice is that in cvs log you see which version is in  
which branch which is useful to answer user queries that might be a  
version problem. svn log doesn't seem to want to show that. Does  
anyone have ideas for how to do this in svn?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Jun 27 16:32:18 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 17:32:18 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <4682C772.5070502@sheffield.ac.uk>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<4682C772.5070502@sheffield.ac.uk>
Message-ID: <D080DC49-A2A4-44E4-9027-A63C1772CD85@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jun 27, 2007, at 5:24 PM, Nathan S. Haigh wrote:

> However, how will links to cvs be dealt with?

Well I said before that probably one can write a couple of lines of  
Perl to write a cgi script that returns the appropriate redirect URL  
with a redirect status code.
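
A rough sketch of the mapping such a script would perform, written in Python
rather than Perl purely for illustration. Every URL, the ViewCVS-style path
prefix, and the "<project>/trunk/<file>" layout on the SVN side are
assumptions, not the actual open-bio.org setup.

```python
from urllib.parse import urlsplit

# Hypothetical values; the real CVS and SVN web locations may differ.
CVS_PREFIX = "/cgi-bin/viewcvs/viewcvs.cgi/"
SVN_BASE = "http://code.example.org/svn/"


def redirect_for(cvs_url):
    """Map a CVS-style browse URL to (status line, SVN URL or None)."""
    path = urlsplit(cvs_url).path  # the query string (e.g. ?rev=1.5) is dropped
    if not path.startswith(CVS_PREFIX):
        return ("404 Not Found", None)
    rest = path[len(CVS_PREFIX):]  # e.g. "bioperl-live/Bio/Seq.pm"
    project, _, file_path = rest.partition("/")
    # Assumed SVN layout: <project>/trunk/<file path>
    return ("301 Moved Permanently", f"{SVN_BASE}{project}/trunk/{file_path}")


status, location = redirect_for(
    "http://cvs.example.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/Bio/Seq.pm?rev=1.5"
)
print(f"Status: {status}")
print(f"Location: {location}")
```

A real CGI script would print these two values as response headers. CVS
revision numbers have no direct SVN equivalent, so this sketch simply drops
them.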

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFGgslWuV6N2JxL7qsRAvsTAKDjR18NzWzlj74mCF+diNpe2dLV2ACgn/4Y
f6sJ/ngeKEGpKHgyAHM1DAA=
=8n0E
-----END PGP SIGNATURE-----


From cjfields at uiuc.edu  Wed Jun 27 16:50:11 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:50:11 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>
Message-ID: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu>


On Jun 27, 2007, at 3:30 PM, Hilmar Lapp wrote:

>
> On Jun 26, 2007, at 5:21 PM, George Hartzell wrote:
>
>>
>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>>
>
> Cool - this works for me.
>
> One thing I notice is that in cvs log you see which version is in  
> which branch which is useful to answer user queries that might be a  
> version problem. svn log doesn't seem to want to show that. Does  
> anyone have ideas for how to do this in svn?
>
> 	-hilmar

We prob. should move it to a new directory ASAP which george can  
write to when he needs to update.  cvs is in /home/repository/bioperl,
so maybe something similar, like /home/svn/repository/bioperl?

chris


From cjfields at uiuc.edu  Wed Jun 27 16:51:37 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:51:37 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net>
Message-ID: <4D8CAAD9-4774-47FB-84E0-7FBA50EC377B@uiuc.edu>


On Jun 27, 2007, at 3:08 PM, Hilmar Lapp wrote:

>
> On Jun 27, 2007, at 5:01 PM, Chris Fields wrote:
>
>> That means no svn->cvs syncing post-migration as well, I assume.
>
> That's a bit of a different story. People out there have URL links  
> into our anonymous CVS repository. If it's not too troublesome (and  
> I tend to think it's not) I'd like to maintain those in working  
> order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi  
> script that maps between the URL flavors (i.e., that maps a
> CVS-style URL to the equivalent SVN link).
>
> 	-hilmar

I'll try getting a wiki page up as a checklist for this, including  
what direction we're heading in, ideas (your list and CGI redirect  
ideas, svn::ignore issues, etc).  Dave has already started on the  
'getting bioperl using svn' wiki page.

If we intend to sync cvs with svn we need to find the right tools or  
at least check for other projects which have done something similar.   
I haven't googled on that yet but I'll attempt to tonight.

chris


From cjfields at uiuc.edu  Wed Jun 27 16:53:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:53:08 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <C2A83EA3.EC27%bosborne11@verizon.net>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
Message-ID: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>

bioperl-run also.  I think the run CVS repo has some binary files, so  
if there are any problems with cvs2svn it'll be there.

chris

On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote:

> George,
>
> bioperl-db and bioperl-network should be included, I think.
>
> Brian O
>
>
> On 6/27/07 4:15 PM, "George Hartzell" <hartzell at alerce.com> wrote:
>
>> David Messina writes:
>>>> [Chris]
>>>>
>>>> I managed to get it working using file://.  Haven't tried svn+ssh yet
>>>> but I've had persistent problems getting ssh to work properly on my
>>>> macbook; not sure why yet but I haven't had time to play around
>>>> with it.
>>>
>>> I just did a checkout and a test commit, both via svn+ssh -- works
>>> great for me.
>>
>> Is there anyone working outside of bioperl-{run,live,ext}?
>>
>> g.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Wed Jun 27 17:05:50 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 22:05:50 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682C6F5.4020406@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk>
Message-ID: <4682D12E.3000803@sendu.me.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> It seems (to me at least) that Bioperl modules could/should? be released
>> as individual modules and that "bioperl" would really constitute a
>> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
>> this thinking? The Bio::ASN1::EntrezGene could simply require a
>> particular module rather than the whole of bioperl - might get out of
>> the circular dependency theoretically!?
> 
> No, it wouldn't.
[snip]
> You only avoid circularity by choosing not to install everything in one 
> go.

Errr... I take that back. Since CPAN bundles install things in a certain 
order, you just have to make sure that everything Bio::ASN1::EntrezGene 
needs is installed first, then Bio::ASN1::EntrezGene, then 
Bio::SeqIO::entrezgene.
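
A minimal sketch of that ordering, in Python with the standard-library
graphlib module (Python 3.9+). The dependency sets are taken from the FYI
list earlier in this thread and are deliberately incomplete. Treating each
module as its own install unit is the assumption under discussion, not how
Bioperl is currently packaged.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of modules that must be installed before it.
requires = {
    "Bio::Root::Root": set(),
    "Bio::Index::AbstractSeq": {"Bio::Root::Root"},
    "Bio::ASN1::EntrezGene": {"Bio::Index::AbstractSeq"},
    "Bio::SeqIO::entrezgene": {"Bio::ASN1::EntrezGene", "Bio::Root::Root"},
}

# static_order() yields every module after all of its prerequisites.
order = list(TopologicalSorter(requires).static_order())
for name in order:
    print(name)
```

With per-module units the graph has no cycle, so a bundle only has to list
the modules in an order like this one.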

But the main problem with this approach is that maintenance, 
global-style code improvements and releases become a nightmare. I could, 
perhaps, imagine a scenario where the repository stayed as-is (one 
monolithic collection), but the dist action of Build.PL could be altered 
to generate a release package per module instead of one big release 
package of all modules, as is currently the case.

Is there much value in doing that? Does anyone want me to look into the 
feasibility of such a thing?


From bosborne11 at verizon.net  Wed Jun 27 16:19:47 2007
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 27 Jun 2007 16:19:47 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
In-Reply-To: <18050.50510.84363.355034@almost.alerce.com>
Message-ID: <C2A83EA3.EC27%bosborne11@verizon.net>

George,

bioperl-db and bioperl-network should be included, I think.

Brian O


On 6/27/07 4:15 PM, "George Hartzell" <hartzell at alerce.com> wrote:

> David Messina writes:
>>> [Chris]
>>> 
>>> I managed to get it working using file://.  Haven't tried svn+ssh yet
>>> but I've had persistent problems getting ssh to work properly on my
>>> macbook; not sure why yet but I haven't had time to play around
>>> with it.
>> 
>> I just did a checkout and a test commit, both via svn+ssh -- works
>> great for me.
> 
> Is there anyone working outside of bioperl-{run,live,ext}?
> 
> g.
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From n.haigh at sheffield.ac.uk  Wed Jun 27 17:25:53 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 22:25:53 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682D12E.3000803@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
Message-ID: <4682D5E1.2030507@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> It seems (to me at least) that Bioperl modules could/should? be released
>>> as individual modules and that "bioperl" would really constitute a
>>> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
>>> this thinking? The Bio::ASN1::EntrezGene could simply require a
>>> particular module rather than the whole of bioperl - might get out of
>>> the circular dependency theoretically!?
>>
>> No, it wouldn't.
> [snip]
>> You only avoid circularity by choosing not to install everything in
>> one go.
> 
> Errr... I take that back. Since CPAN bundles install things in a certain
> order, you just have to make sure that everything Bio::ASN1::EntrezGene
> needs is installed first, then Bio::ASN1::EntrezGene, then
> Bio::SeqIO::entrezgene.
> 
> But the main problem with this approach is that maintenance,
> global-style code improvements and releases become a nightmare. I could,
> perhaps, imagine a scenario where the repository stayed as-is (one
> monolithic collection), but the dist action of Build.PL could be altered
> to generate a release package per module instead of one big release
> package of all modules, as is currently the case.
> 
> Is there much value in doing that? Does anyone want me to look into the
> feasibility of such a thing?


I think the value would be in other external modules being able to use
bioperl modules with more ease (not sure how many modules have, or
currently depend on bioperl) as they would depend on a single module,
rather than the whole package. However, how would the dependencies of
each module be handled? I'm clearly thinking aloud, but....Maybe this
would tease apart "cliques" of modules that are interdependent? and
could in themselves be shipped as bundles e.g. Bio::Graphics and have a
"master" bioperl bundle that installs all the bioperl modules.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgtXhczuW2jkwy2gRAiftAKDZQGDpaq5saEyE3ZfPyFqli4j+8QCfXbIB
2EZjccEFEzfFlx4H47gzwLk=
=nobl
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Wed Jun 27 17:35:28 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 18:35:28 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
	<4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
Message-ID: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>

Is there a reason not to port every subproject over?

	-hilmar

On Jun 27, 2007, at 5:53 PM, Chris Fields wrote:

> bioperl-run also.  I think the run CVS repo has some binary files, so
> if there are any problems with cvs2svn it'll be there.
>
> chris
>
> On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote:
>
>> George,
>>
>> bioperl-db and bioperl-network should be included, I think.
>>
>> Brian O
>>
>>
>> On 6/27/07 4:15 PM, "George Hartzell" <hartzell at alerce.com> wrote:
>>
>>> David Messina writes:
>>>>> [Chris]
>>>>>
>>>>> I managed to get it working using file://.  Haven't tried svn+ssh yet
>>>>> but I've had persistent problems getting ssh to work properly on my
>>>>> macbook; not sure why yet but I haven't had time to play around
>>>>> with it.
>>>>
>>>> I just did a checkout and a test commit, both via svn+ssh -- works
>>>> great for me.
>>>
>>> Is there anyone working outside of bioperl-{run,live,ext}?
>>>
>>> g.
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 27 17:36:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 16:36:29 -0500
Subject: [Bioperl-l] Splits again, formerly  Test overhaul complete
In-Reply-To: <4682D12E.3000803@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
Message-ID: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>


On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> It seems (to me at least) that Bioperl modules could/should? be  
>>> released
>>> as individual modules and that "bioperl" would really constitute a
>>> "bundle" of all these modules - in terms of CPAN anyway. Am I  
>>> correct in
>>> this thinking? The Bio::ASN1::EntrezGene could simply require a
>>> particular module rather than the whole of bioperl - might get  
>>> out of
>>> the circular dependency theoretically!?
>> No, it wouldn't.
> [snip]
>> You only avoid circularity by choosing not to install everything  
>> in one go.
>
> Errr... I take that back. Since CPAN bundles install things in a  
> certain order, you just have to make sure that everything  
> Bio::ASN1::EntrezGene needs is installed first, then  
> Bio::ASN1::EntrezGene, then Bio::SeqIO::entrezgene.
>
> But the main problem with this approach is that maintenance,
> global-style code improvements and releases become a nightmare. I could,  
> perhaps, imagine a scenario where the repository stayed as-is (one  
> monolithic collection), but the dist action of Build.PL could be  
> altered to generate a release package per module instead of one big  
> release package of all modules, as is currently the case.
>
> Is there much value in doing that? Does anyone want me to look into  
> the feasibility of such a thing?

Not for the time being, at least in my opinion.  Too much on our  
plate at this point with svn migration, test conversion, bugzilla  
running over (next point of attack!), etc.  Maybe something to think  
about after, though I like the idea of a few splits to core as Steve  
suggested (SearchIO, Graphics, some LWP-related DB modules).

My (albeit extreme) thought is to have a lean-and-mean set of 'core'  
modules with as few external dependencies as possible, which could  
work around the circular dependency issue in this case:

                dep.on                  dep.on
Bio::Auxiliary -----> ASN1::EntrezGene -----> core
(with EntrezGene)                            (basic SeqIO, Index, DB, etc)
       \---->------>--- dep.on ->----->----->----/

Bioperl auxiliary modules would list core as a required dependency  
along with anything else needed for that particular aux. section  
(i.e. XML parsers, LWP, GD, etc.).  The whole mess, if needed, would  
be installed using Bundle::BioPerl or similar, with no part released  
w/o testing on the whole 'base' to ensure proper interaction.

If a fix needed to be made in one set, make the fix, test against  
bioperl 'base' as a whole, and release when possible.  No need to  
wait for a full-fledged 1.5.3 release.
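
As a sketch of what "test against the base" could mean in practice, the
layout in the diagram can be written down as a graph and queried for
everything that sits on top of a changed set. This is Python for illustration
only; the three group names come from the diagram, and the helper function is
hypothetical.

```python
# Each key maps to the groups it depends on, as drawn in the diagram above.
depends_on = {
    "core": set(),
    "ASN1::EntrezGene": {"core"},
    "Bio::Auxiliary": {"ASN1::EntrezGene", "core"},
}


def dependents(graph, changed):
    """Return every group that depends, directly or indirectly, on `changed`."""
    found = set()
    stack = [changed]
    while stack:
        current = stack.pop()
        for group, deps in graph.items():
            if current in deps and group not in found:
                found.add(group)
                stack.append(group)
    return found


print(sorted(dependents(depends_on, "core")))
print(sorted(dependents(depends_on, "Bio::Auxiliary")))
```

A fix in core touches everything built on it, while a fix in an auxiliary set
touches nothing below it, which is why keeping core lean matters for this
scheme.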

Maybe wishful thinking...

chris


From cjfields at uiuc.edu  Wed Jun 27 17:44:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 16:44:47 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
	<4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
	<9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>
Message-ID: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>

We should port them all, yes.

chris

On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote:

> Is there a reason not to port every subproject over?
>
> 	-hilmar


From cjfields at uiuc.edu  Wed Jun 27 17:53:02 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 16:53:02 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682D5E1.2030507@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<4682D5E1.2030507@sheffield.ac.uk>
Message-ID: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu>


On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote:

>> ...
>> Is there much value in doing that? Does anyone want me to look  
>> into the
>> feasibility of such a thing?
>
>
> I think the value would be in other external modules being able to use
> bioperl modules with more ease (not sure how many modules have, or
> currently depend on bioperl) as they would depend on a single module,
> rather than the whole package. However, how would the dependencies of
> each module be handled? I'm clearly thinking aloud, but....Maybe this
> would tease apart "cliques" of modules that are interdependent? and
> could in themselves be shipped as bundles e.g. Bio::Graphics and  
> have a
> "master" bioperl bundle that installs all the bioperl modules.

See my response to Sendu, and Steve Chervitz's original post and  
related thread:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315

which pretty much covers the same ground.  I think at most 4-5 split  
'cliques', including core, with the fewest possible dependencies in  
core.  If we do any of this, it prob. should wait until after an svn  
migration and bugzilla bug stomping unless there is a (well-argued)  
advantage to doing it now.

chris


From n.haigh at sheffield.ac.uk  Wed Jun 27 18:07:31 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 23:07:31 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<4682D5E1.2030507@sheffield.ac.uk>
	<1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu>
Message-ID: <4682DFA3.9090100@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> 
> On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote:
> 
>>> ...
>>> Is there much value in doing that? Does anyone want me to look into the
>>> feasibility of such a thing?
>>
>>
>> I think the value would be in other external modules being able to use
>> bioperl modules with more ease (not sure how many modules have, or
>> currently depend on bioperl) as they would depend on a single module,
>> rather than the whole package. However, how would the dependencies of
>> each module be handled? I'm clearly thinking aloud, but....Maybe this
>> would tease apart "cliques" of modules that are interdependent? and
>> could in themselves be shipped as bundles e.g. Bio::Graphics and have a
>> "master" bioperl bundle that installs all the bioperl modules.
> 
> See my response to Sendu, and Steve Chervitz's original post and related
> thread:
> 
> http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315
> 
> which pretty much covers the same ground.  I think at most 4-5 split
> 'cliques', including core, with the fewest possible dependencies in
> core.  If we do any of this, it prob. should wait until after an svn
> migration and bugzilla bug stomping unless there is a (well-argued)
> advantage to doing it now.
> 
> chris


That's fine by me - or should I say, the best way forward - I was really
just thinking aloud :)

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgt+jczuW2jkwy2gRAhPmAKDCgI1BOp/MOQVUQhQGqWaRRfPTaACfTPix
TSi/e8PtYTwpxn6x+ewrjBs=
=7Vp1
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Wed Jun 27 18:43:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 23:43:48 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
Message-ID: <4682E824.1050507@sendu.me.uk>

Chris Fields wrote:
> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote:
>> But the main problem with this approach is that maintenance,
>> global-style code improvements and releases become a nightmare. I could,  
>> perhaps, imagine a scenario where the repository stayed as-is (one  
>> monolithic collection), but the dist action of Build.PL could be  
>> altered to generate a release package per module instead of one big  
>> release package of all modules, as is currently the case.
>>
>> Is there much value in doing that? Does anyone want me to look into  
>> the feasibility of such a thing?
> 
> Not for the time being, at least in my opinion.  Too much on our  
> plate at this point with svn migration, test conversion, bugzilla  
> running over (next point of attack!), etc.  Maybe something to think  
> about after, though I like the idea of a few splits to core as Steve  
> suggested (SearchIO, Graphics, some LWP-related DB modules).
[snip]
> If a fix needed to be made in one set, make the fix, test against  
> bioperl 'base' as a whole, and release when possible.  No need to  
> wait for a full-fledged 1.5.3 release.

What advantage is there of these defined splits instead of individual 
modules? As I see it you lose some of the potential benefits of breaking 
Bioperl up completely, whilst also suffering the maintenance problems I 
outlined in my objection to Steve's post.

Being able to work on all Bioperl from a single cvs (ne svn) check out/ 
archive, whilst distributing it as individual modules on CPAN seems like 
the best of both worlds to me. What am I missing?


From hartzell at alerce.com  Wed Jun 27 20:41:01 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 20:41:01 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>
	<9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu>
Message-ID: <18051.925.23313.932916@almost.alerce.com>

Chris Fields writes:
 > [...]
 > We prob. should move it to a new directory ASAP which george can  
 > write to when he needs to update.  cvs is in /home/repository/bioperl,
 > so maybe something similar, like /home/svn/repository/bioperl?

I'd be parsimonious (lazy...) and go for /home/svn/bioperl.

g.


From hartzell at alerce.com  Wed Jun 27 20:46:29 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 20:46:29 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
Message-ID: <18051.1253.87485.235496@almost.alerce.com>

Chris Fields writes:
 > [...]
 > Now how about a quick straw poll, what kind of access?  svn+ssh is  
 > already available, but some (Aaron among them) have indicated they  
 > would like https as well (not sure how involved it would be to set up).

What we do here, in large part, depends on what our host machine makes
available to us.

Is there an apache instance that we can use?  Maybe a separate one?

May someone among us configure it, or do we need to ask for help?  (in
other words, does anyone have sudo?)

Is there some reason to not include http: (using Digest authentication
so that passwords aren't passed in the clear?)?  Maybe even go so far
as to ask why bother with https:, it's not like we need to transfer
any data encrypted....
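
For reference, a sketch of why Digest authentication keeps the password off
the wire: the client sends only an MD5 response computed from the password,
a server-supplied nonce and the request details (the qop=auth form from RFC
2617). This is illustrative Python, not a server configuration, and the user,
realm, password, URI and nonce values are all made up.

```python
import hashlib


def md5_hex(text):
    """Hex MD5 digest of a string, as used by HTTP Digest authentication."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(user, realm, password, method, uri, nonce, nc, cnonce):
    """Compute the 'response' value for qop=auth (RFC 2617)."""
    ha1 = md5_hex(f"{user}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:auth:{ha2}")


# Hypothetical credentials and request, for illustration only.
response = digest_response(
    user="committer", realm="bioperl-svn", password="not-a-real-password",
    method="GET", uri="/svn/bioperl-live/trunk/",
    nonce="abc123", nc="00000001", cnonce="0a4f113b",
)
print(response)
```

Only the credentials are protected this way; file contents still travel
unencrypted over plain http, which is the trade-off being asked about.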

g.


From dmessina at wustl.edu  Wed Jun 27 23:02:25 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 27 Jun 2007 22:02:25 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
Message-ID: <D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>


On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote:

>
> On Jun 27, 2007, at 1:27 PM, David Messina wrote:
>
>> I would think we would want "Author Date Id Rev URL" set on
>> everything, no?. So either cvs2svn or your tool (whichever you think
>> is better), followed by
>>
>> 	svn propset svn:keywords "Author Date Id Rev URL" *
>
> Shouldn't this be done recursively?


Yep, good catch! Thanks, Hilmar.

Should be:

	svn propset --recursive svn:keywords "Author Date Id Rev URL" *


From jason at bioperl.org  Wed Jun 27 23:29:09 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 28 Jun 2007 00:29:09 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <18051.1253.87485.235496@almost.alerce.com>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
Message-ID: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>

I think Chris D and I will need to confer a bit on https+svn.  I  
don't know when we'll have a good chance to discuss everything.  At  
some point this discussion may need to be taken off the bioperl list  
and continued among just the interested parties, as we're delving  
into hardware geek land.

The repository machine (dev) is a locked-down machine, meaning it  
really only runs ssh and few other servers; httpd is not among them.  
We have anonymous CVS (client and through httpd browsing) running on a  
separate machine (code) that has the info rsynced over every 10 or 15  
minutes. The foundation websites and mailing lists run on a third  
machine (portal).


If we decide to support https we'll need to spend a little time  
deciding how well we can keep it locked down - it will only be https  
not http for example and we may want to see about limiting ssh access  
to everyone if we migrate all OBF projects over to SVN and only  
support https.

Again to re-iterate what I think we would do:
  - SVN read/write will live on 'dev'; _WHEN_ we switch over, no more  
writes to the CVS repository. It will be available by svn+ssh and  
potentially by https+svn
  - SVN read-only will live on 'code', it will be accessible by http+svn
  - CVS read-only will live on 'code', this will only be a sync from  
the SVN to the CVS.  See http://svn2cvs.tigris.org/ for details


As I tried to ask for in the past, would someone also illustrate the  
importance of why _WE_ need to switch to SVN on a wiki page on  
Bioperl so that when someone complains/asks about this in the future  
the arguments are already laid out.  I am basically fine with it, but  
I don't honestly see a compelling reason beyond what has been  
mentioned wrt better integration in IDEs.
http://bioperl.org/wiki/Why_SVN

-jason
On Jun 27, 2007, at 9:46 PM, George Hartzell wrote:

> Chris Fields writes:
>> [...]
>> Now how about a quick straw poll, what kind of access?  svn+ssh is
>> already available, but some (Aaron among them) have indicated they
>> would like https as well (not sure how involved it would be to set  
>> up).
>
> What we do here, in large part, depends on what our host machine makes
> available to us.
>
> Is there an apache instance that we can use?  Maybe a separate one?
>
> May someone among us configure it, or do we need to ask for help?  (in
> other words, does anyone have sudo?)
>
> Is there some reason to not include http: (using Digest authentication
> so that passwords aren't passed in the clear?)?  Maybe even go so far
> as to ask why bother with https:, it's not like we need to transfer
> any data encrypted....
>
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Wed Jun 27 23:51:32 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 28 Jun 2007 00:51:32 -0300
Subject: [Bioperl-l] Splits again
In-Reply-To: <4682E824.1050507@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
Message-ID: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>

Hey guys - I'm wading in a bit late as I haven't had time to keep up  
with whole discussion.

So you are suggesting 800+ individual CPAN modules?  I don't think  
that is a good idea.  Why would you split up Bio::Seq::RichSeq and  
Bio::Seq into two separate packages for example? I think if you  
really want to move away from the monolithic install it has to be  
more logical by function - but I am not that optimistic that this is  
going to actually be easier for people.  Maybe I'm misunderstanding.

What are the arguments for separating things -- to make it so people  
aren't scared by the number of modules so they'll code?  It seems  
like some people just want it to be installed and run scripts - does  
having them install dozens of modules work?  Do we need to consider  
how much this would suck for people who can't use CPAN or  
Module::Build to automate dependency tracking and installation?  How  
does it work when modules are deprecated?

I'm not sure I have made up my mind on what I'd like to see, but at  
some point I think we need to get a clearer idea of what audience we  
are trying to serve best.  If we want it to be easy to install maybe we  
should invest time into making OSX double-click installers, RPMs, and  
the Windows stuff easily installable.  Or do we want to serve the  
developers who aren't using SVN, so that we push out releases of  
modules ASAP?  I just am not clear on the motivation for some of the  
proposed changes.

Also - the main point I wanted to make - Can I suggest we spend a  
little time discussing what it will take to get a stable release for  
the current code as it stands (bioperl-live and bioperl-run)?  It  
seems like we really need to do this first so that we have a stable  
release that can be followed by CVS -> SVN migration, then consider  
major changes to the repository structure and release packaging, and  
potential deprecation and incorporation of other modules.


I assume there is no chance that we'd have a 1.6 candidate by BOSC  
next month?

Will it be productive to schedule a fair amount of time at BOSC  
discussing how to partition out the packages into separate sub- 
packages after we've done a successful release rather than trying to  
change things right now? I realize not everyone will be there but  
maybe it will be easier to interact on this then.

I think it will also be time to talk with Lincoln/Scott about how  
Gbrowse is structured and if that is working for them.  There is too  
much code in different places that I think we need to figure out how  
to structure it properly so those packages can be released.  It would  
probably mean moving Bio::Graphics, Bio::DB::GFF and  
Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages  
so they could be released more regularly on par with Gbrowse  
schedules.   Also I think someone needs to figure out Bio::Tools::GFF  
vs Bio::FeatureIO -- what do we want to do?  I don't think we really  
fully support GFF3 that well -- the X2GFF scripts probably need some  
more good testing (where X is BLAST, FASTA, Sim4, GenBank, EMBL,  
etc.) and/or migration to the proper GFF writing.


-jason
On Jun 27, 2007, at 7:43 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote:
>>> But the main problem with this approach is that maintenance, global-
>>> style code improvements and releases become a nightmare. I could,
>>> perhaps, imagine a scenario where the repository stayed as-is (one
>>> monolithic collection), but the dist action of Build.PL could be
>>> altered to generate a release package per module instead of one big
>>> release package of all modules, as is currently the case.
>>>
>>> Is there much value in doing that? Does anyone want me to look into
>>> the feasibility of such a thing?
>>
>> Not for the time being, at least in my opinion.  Too much on our
>> plate at this point with svn migration, test conversion, bugzilla
>> running over (next point of attack!), etc.  Maybe something to think
>> about after, though I like the idea of a few splits to core as Steve
>> suggested (SearchIO, Graphics, some LWP-related DB modules).
> [snip]
>> If a fix needed to be made in one set, make the fix, test against
>> bioperl 'base' as a whole, and release when possible.  No need to
>> wait for a full-fledged 1.5.3 release.
>
> What advantage is there of these defined splits instead of individual
> modules? As I see it you lose some of the potential benefits of  
> breaking
> Bioperl up completely, whilst also suffering the maintenance  
> problems I
> outlined in my objection to Steve's post.
>
> Being able to work on all Bioperl from a single cvs (ne svn) check  
> out/
> archive, whilst distributing it as individual modules on CPAN seems  
> like
> the best of both worlds to me. What am I missing?

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From chris at bioteam.net  Thu Jun 28 00:08:25 2007
From: chris at bioteam.net (Chris Dagdigian)
Date: Thu, 28 Jun 2007 00:08:25 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
Message-ID: <97A3257B-8E00-48D7-8B7D-51AD728CB8F7@bioteam.net>


My understanding of "https+svn" is that it is actually WebDAV-over- 
HTTP which means that not only would we need to light up a HTTPD  
server on the developer box we'd also have to get a stable mod_dav  
module installed (sometimes not trivial) and then we would have to  
figure out how to handle the authentication bits. Right now with SSH  
we use Unix group permissions to figure out who can write to what  
repository -- WebDAV makes this a lot more complicated.

Forcing encryption over https will prevent someone from sniffing a  
developer password which removes the main security issue. The next  
problem is going to be integrating the DAV module with Linux PAM so  
that existing usernames and passwords can be used, -OR- we have to  
set up and maintain an entirely separate set of username and password  
maps for each developer and each SVN project.

I'm not super concerned about this -- BioTeam runs svn internally and  
we expose our SVN for employees both via WebDAV and SVN+SSH - it's  
not that hard to set up.

My biggest concern really has to do with how much extra work this  
will mean for the OBF sysadmin team. If there is an easy way to get a  
stable Apache/DAV/SVN integration going with authentication coming  
from Linux PAM then this is no big deal. If we have to manually  
maintain separate authentication lists then it will be kind of a hassle.

Like Jason mentioned, the OBF currently segregates "stuff" onto three  
different servers with three levels of security:

- dev.open-bio.org -- Developers only, SSH access only (main  
sourcecode repository for OBF)
- portal.open-bio.org -- Websites, Wikis, Blogs, Mailing list servers  
and helpdesk.open-bio.org
- code.open-bio.org -- "Disposable" anonymous access server that we  
can easily burn/wipe/reinstall if it ever gets hacked

Everything else that Jason mentioned is fine and easy to set up (if  
not already running):

  - SVN+SSH for developers
  - Anonymous SVN and Anonymous RSYNC for community access on  
code.open-bio.org
  - svn2cvs for whomever wants it on code.open-bio.org
  - web based SVN code browser installed on http://code.open-bio.org


Regards,
Chris


On Jun 27, 2007, at 11:29 PM, Jason Stajich wrote:

> I think Chris D and I will need to confer a bit on https+svn.  I  
> don't know when we'll have a good chance to discuss everything.  At  
> some point this discussion is may need to be taken off bioperl and  
> just the interested parties as we're delving into hardware geek land.
>
> The repository machine (dev) is a locked down machine meaning it  
> only really runs ssh and not many servers include httpd.  We have  
> anonymous CVS (client and through httpd browsing) running on a  
> separate machine (code) that has the info rsynced over every 10 or  
> 15 minutes. The foundation websites and mailing lists run on a  
> third machine (portal).
>
>
> If we decide to support https we'll need to spend a little time  
> deciding how well we can keep it locked down - it will only be  
> https not http for example and we may want to see about limiting  
> ssh access to everyone if we migrate all OBF projects over to SVN  
> and only support https.
>
> Again to re-iterate what I think we would do:
>  - SVN read/write will live on 'dev', _WHEN_ we switch over no  
> writes to the CVS repository. It will be available by ssh+svn and  
> potentially by https+svn
>  - SVN read-only will live on 'code', it will be accessible by http 
> +svn
>  - CVS read-only will live on 'code', this will only be a sync from  
> the SVN to the CVS.  See http://svn2cvs.tigris.org/ for details
>
>
> As I tried to ask for in the past, would someone also illustrate  
> the importance of why _WE_ need to switch to SVN on a wiki page on  
> Bioperl so that when someone complains/asks about this in the  
> future the arguments are already laid out.  I am basically fine  
> with it, but I don't honestly see a compelling reason beyond what  
> has been mentioned wrt better integration in IDEs.
> http://bioperl.org/wiki/Why_SVN
>
> -jason
> On Jun 27, 2007, at 9:46 PM, George Hartzell wrote:
>
>> Chris Fields writes:
>>> [...]
>>> Now how about a quick straw poll, what kind of access?  svn+ssh is
>>> already available, but some (Aaron among them) have indicated they
>>> would like https as well (not sure how involved it would be to  
>>> set up).
>>
>> What we do here, in large part, depends on what our host machine  
>> makes
>> available to us.
>>
>> Is there an apache instance that we can use?  Maybe a separate one?
>>
>> May someone among us configure it, or do we need to ask for help?   
>> (in
>> other words, does anyone have sudo?)
>>
>> Is there some reason to not include http: (using Digest  
>> authentication
>> so that passwords aren't passed in the clear?)?  Maybe even go so far
>> as to ask why bother with https:, it's not like we need to transfer
>> any data encrypted....
>>
>> g.
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>


From cjfields at uiuc.edu  Thu Jun 28 00:18:03 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 23:18:03 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4682E824.1050507@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
Message-ID: <FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>


On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:

> Chris Fields wrote:
> ...
>> If a fix needed to be made in one set, make the fix, test against   
>> bioperl 'base' as a whole, and release when possible.  No need to   
>> wait for a full-fledged 1.5.3 release.
>
> What advantage is there of these defined splits instead of  
> individual modules? As I see it you lose some of the potential  
> benefits of breaking Bioperl up completely, whilst also suffering  
> the maintenance problems I outlined in my objection to Steve's post.
>
> Being able to work on all Bioperl from a single cvs (ne svn) check  
> out/ archive, whilst distributing it as individual modules on CPAN  
> seems like the best of both worlds to me. What am I missing?

Okay, forewarned, but here's my long-winded reasoning.  The short and  
sweet version: I (very) respectfully don't agree with you, at least  
re: the idea we should commit all modules to CPAN independently.  It  
doesn't make any sense to me, but maybe you can elaborate more?   
Maybe I'm misinterpreting what you mean?

Also, I agree with Steve C. that core is anything but a  
representation of a 'core' set of modules, and some sections could  
(should?) be split off into discrete, cohesive units.  We may be  
alone in that camp, though it doesn't seem so (it's popped up more  
than a few times, in one form or another).  If you want an in-depth  
explanation for both opinions, read on (below my sig), or feel free  
to bypass it.  I'll understand.

Finally, all of this should wait until later.  Much later, like after  
a decent release, after svn, etc kind of 'later'.  I think we can  
agree on that.

.
.
.
.
.

Still here?  Okay... each issue (skip as needed):

Individual CPAN modules:

CPAN is not our personal versioning system; it may be if a  
distribution consists of only a few modules, but not when it's one of  
the largest distros present.  If someone wants to update an  
individual bioperl module for a quick bug fix they are more than  
welcome to download it via cvs, svn, or even using a web browser, and  
replace the one they have.  In most cases, it works w/o problems.   
With Module::Build you have even made it easier if a full  
installation is necessary.

I'm trying to reason how one could break up the individual SeqIO/ 
SearchIO/otherIO modules into single module distributions.  They are  
intrinsically tied together (SeqIO::genbank won't work w/o SeqIO,  
which relies on the various interfaces, RootIO, and on down).  How  
would tests be run off CPAN when the modules are distributed  
independently?  Would they also be individually distributed?  What  
would you use to tie all the individual modules together?  How would  
you explain to the CPAN maintainers that you want to split bioperl  
into 990 individual modules, all updated independently, but intend on  
bundling them afterwards anyway?

I'm failing to see the advantages to this approach, but if you can  
find an example where this was done successfully on CPAN or elsewhere  
maybe I could see what you mean.

Splitting up core:

Here are the advantages of a defined split as Steve and I see them  
(off the top of my head).  Some of this probably reiterates  
my previous points, as well as Steve's, so apologies in advance.

- A lean, mean, focused set of bioperl base modules (core) w/o or  
with very few external deps, minimal installation issues, etc.  The  
very basic stuff to get up and running.

- BioPerl bundled modules (Nathan's 'cliques') with defined, focused  
functionality, code, and tests, which add a bit more 'sugar' to the  
base functionality of the core.  If you only care about parsing BLAST  
reports, get SearchIO, which requires core and optionally other  
modules (XML::SAX).  If you want additional DB functionality apart  
from the very basic ones in core, install DB (with its additional  
requirements, including core, DBI, and so on).  Same with Graphics,  
Tools, Tree/Phylo, etc.  We just need to define and limit the number  
of splits.

- Easier to add additional bundled modules.  For instance, I could  
focus all of my RNA work into a discrete set of modules (say, bioperl- 
rna) which I maintain, I ensure works with the latest core code, I  
ensure also plays well with the other children =) , and I distribute  
via CPAN.  Same with EUtilities, which could go into a separated DB- 
related set or stay in core.

- If we want a full-fledged 'install everything', the CPAN Bundle  
system is available.  I think it's easier to use a Bundle for 4-5,  
even 10 groups of modules as opposed to over 900.

- A Bundle or a build file where discrete distributions are listed  
(Bio::SearchIO, etc) wouldn't need to be updated every time a new  
module is added to a distribution.  I suppose this could be  
automated, but why have the additional headache?

- A chance to cut out some cruft.  We all know that particular areas  
need work or a complete overhaul (Restriction, Structure, maybe a few  
others).  Smaller, concentrated sets of modules I believe would be  
easier to maintain, and those that don't get used will eventually fall  
out of favor and may be lost or replaced from the more maintained  
group of modules.  Survival of the fittest.

- We already have had practice; bioperl-db, bioperl-run, bioperl- 
network, and others.  Those that have been routinely maintained and  
enjoy wide use (db, run, network) have survived; others not so much  
(corba-related stuff, microarray, ext, etc., though the code is still  
available if someone else wants to take it up and revive it!).

Disadvantages of a defined split:

- The initial headache of identifying which groups go where,  
coordinating with those who rely on bioperl (GMOD, etc) on how this  
will be set up, so on...

- Separate groups of modules require testing together to ensure  
functionality is consistent and maintained (something I think you  
pointed out previously).

- I think there is an increased possibility of branching.

- Extra headaches for devs, who have to keep track of the various  
critical distributions and make sure they work well together.

- Maybe others, but it's getting late here.  Add more as needed; I'm  
sure there are a number more.


chris


From cjfields at uiuc.edu  Thu Jun 28 01:17:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 00:17:01 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
Message-ID: <671B8432-28DA-47DA-9E0C-66AF0E3D5973@uiuc.edu>

D'oh!  Just when I wanted to go to bed.  It's not fair, you're in  
California...

On Jun 27, 2007, at 10:51 PM, Jason Stajich wrote:

> Hey guys - I'm wading in a bit late as I haven't had time to keep up
> with whole discussion.
>
> So you are suggesting 800+ individual CPAN modules?  I don't think
> that is a good idea.  Why would you split up Bio::Seq::RichSeq and
> Bio::Seq into two separate packages for example? I think if you
> really want to move away from the monolithic install it has to be
> more logical by function - but I am not that optimistic that this is
> going to actually be easier for people.  Maybe I'm misunderstanding.

Okay, so maybe it wasn't just me.

> What are the arguments for separating things -- to make it so people
> aren't scared by the number of modules so they'll code?  It seems
> like some people just want it to be installed and run scripts - does
> having them install dozens of modules work.  Do we need to consider
> people how much this would suck if someone can't use CPAN or
> Module::Builder to automate dependancy tracking installation?  How
> does it work when modules are deprecated?

What I envision for core is maybe not just one distribution, but a  
cluster of distributions:

base - Bio::Seq; Bio::SeqIO; Bio::AlignIO, some Bio::DB, associated  
modules.  Bare bones, with as few dependencies as possible.
aux - Any Bio::SeqIO, Bio::AlignIO, Bio::DB etc. that requires  
additional modules.
search - Bio::Search and SearchIO
tools - Bio::Tools, Bio::Restriction, maybe DB modules, GFF-related  
stuff?
graphics - Bio::Graphics.  Maybe GMOD-related stuff here?

The last four would list bioperl-core as a dependency themselves  
along with any other modules necessary.  We could also have the core  
Build.PL ask the user if they want to install the other non-base  
distros, and maybe include bioperl-db, bioperl-network, and bioperl- 
run in the loop if requested.

All would be installed as a bundle similar to Bundle::BioPerl, but  
have regular CPAN point releases (1.x.x) independently from one  
another i.e. for bug fixes, with a yearly/biyearly timed full release  
(1.x) of the whole shebang.  Any point release for any 'core'  
distribution would have to be tested against the others prior to  
release.
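
To picture the retesting rule, here is a rough sketch in Python (used only for brevity; this is not BioPerl or CPAN code). The cluster layout mirrors the list above, but the dictionary and the helpers `must_test_with` and `dependents` are made up for illustration and are not part of any build tool.

```python
# Hypothetical model of the proposed cluster: distribution -> dependencies.
# Names follow the list above; the layout is an assumption, not a decision.
CLUSTER = {
    "base": [],
    "aux": ["base"],
    "search": ["base"],
    "tools": ["base"],
    "graphics": ["base"],
}


def must_test_with(released, cluster=CLUSTER):
    """Distributions a point release of `released` is tested against.

    Implements the rule stated above: every other member of the cluster.
    """
    if released not in cluster:
        raise KeyError(f"unknown distribution: {released}")
    return sorted(name for name in cluster if name != released)


def dependents(dist, cluster=CLUSTER):
    """Distributions that list `dist` as a direct dependency."""
    return sorted(name for name, deps in cluster.items() if dist in deps)


if __name__ == "__main__":
    print(must_test_with("search"))
    print(dependents("base"))
```

With only five distributions the "test against everything else" rule stays cheap; `dependents` shows which ones would actually break first if base changed.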

This is basically following Steve's train of thought, though more  
elaborated:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/ 
focus=15315

> I'm not sure I have made up my mind on what I'd like to see, but at
> some point I think we need to get a clearer idea of what audience we
> are trying to serve best.  If want it to be easy to install maybe we
> should invest time into making OSX double-click installers, RPMs, and
> the Windows stuff easily installable.  If we want to serve the
> developers who aren't using SVN so we want to push out releases of
> modules ASAP?  I just am not clear on the motivation for some of the
> proposed changes.

I think regular CPAN releases with updated PPMs hosted via portal  
work fine for the most part, but it would be nice to host RPMs.   
Others (Allen Day, for instance) have donated time to generate RPMs  
but they seem to lag behind a bit more.

The original idea for svn arose from an unrelated thread with Mark  
Johnson discussing something (Glimmer maybe?) and took off from  
there.  I was actually pretty surprised it took on a life of its  
own.  As for the motivation to switch, I haven't specifically used it  
myself, but the large number of responses seem to indicate others  
have and seem happy with it.  Rutger Vos had also indicated he would  
move Bio::Phylo over to the repo if we used svn.  We def. should  
address the issues you bring up (why _WE_ need svn) more succinctly  
but that shouldn't be an issue.

> Also - the main point I wanted to make - Can I suggest we spend a
> little time discussing what it will take to get a stable release for
> the current code as it stands (bioperl-live and bioperl-run)?  It
> seems like we really need to do this first so that we have a stable
> release that can be followed by CVS -> SVN migration, then consider
> major changes to the repository structure and release packaging, and
> potential deprecation and incorporation of other modules.

Agreed.  We prob. need to schedule a good couple of days (or so) to  
squash bugs.

> I assume there is no chance that we'd have a 1.6 candidate by BOSC
> next month?

Um, not likely as nothing has been addressed Feature/Annotation-wise  
(overloads are still there, methods have not been deprecated, etc).   
There was an underlying assumption these would have an effect on GMOD- 
related stuff (I remember reading a post from Scott Cain in the mail  
archive mentioning something along these lines after the 1.5 release  
hubbub).

Maybe a quick 1.5.3 for BOSC, with a 1.6 for fall?

> Will it be productive to schedule a fair amount of time at BOSC
> discussing how to partition out the packages into separate sub-
> packages after we've done a successful release rather than trying to
> change things right now? I realize not everyone will be there but
> maybe it will be easier to interact on this then.

How many are going to be there?  I can't go this year except on my  
own dime (which I don't have many of, student loans and all, sorry),  
though I'll likely be in a new lab by spring which is likely more  
amenable to funding.  If there is a hackathon in the late fall (post- 
sept) I'll make it a point to go regardless.

> I think it will also be time to talk with Lincoln/Scott about how
> Gbrowse is structured and if that is working for them.  There is too
> much code in different places that I think we need to figure out how
> to structure it properly so those packages can be released.  It would
> probably mean moving Bio::Graphics, Bio::DB::GFF and
> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages
> so they could be released more regularly on par with Gbrowse
> schedules.   Also I think someone needs to figure out Bio::Tools::GFF
> vs Bio::FeatureIO -- what do we want to do?  I don't think we really
> fully support GFF3 that well -- the X2GFF scripts probably need some
> more good testing (where X is BLAST,FASTA,Sim4, GenBank, EMBL,
> etc... ) and or migration to the proper GFF writing.
>
>
> -jason

Will Lincoln or Scott be at BOSC?

chris


From dmessina at wustl.edu  Thu Jun 28 01:21:58 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 00:21:58 -0500
Subject: [Bioperl-l] finding statistics on AA
In-Reply-To: <4681F4B4.8010609@pacific.net.sg>
References: <4681F4B4.8010609@pacific.net.sg>
Message-ID: <F57E70E8-BBDA-45CF-B2C7-E05AED04F4C6@wustl.edu>

Hi Melvin,

I don't think BioPerl has any information content-related code. I'm  
not terribly familiar with it myself, but the usual recommendation is  
to look at the EMBOSS package:

	http://en.wikipedia.org/wiki/EMBOSS

Dave


From bix at sendu.me.uk  Thu Jun 28 02:38:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 07:38:48 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
Message-ID: <46835778.5070901@sendu.me.uk>

Jason Stajich wrote:
> So you are suggesting 800+ individual CPAN modules?
> I don't think that is a good idea.  Why would you split up
> Bio::Seq::RichSeq and Bio::Seq into two separate packages for
> example? I think if you really want to move away from the monolithic
> install it has to be more logical by function - but I am not that
> optimistic that this is going to actually be easier for people.
> Maybe I'm misunderstanding.
> 
> What are the arguments for separating things -- to make it so people
>  aren't scared by the number of modules so they'll code?  It seems
> like some people just want it to be installed and run scripts - does
> having them install dozens of modules work.  Do we need to consider
> people how much this would suck if someone can't use CPAN or
> Module::Builder to automate dependancy tracking installation?  How
> does it work when modules are deprecated?

See my upcoming reply to Chris. Briefly, if the only change is to the
dist action of Build.PL, we can make a single archive of all modules
available to non-CPAN users, and individual modules available to CPAN
users. No problems.
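
As a rough illustration of what such a dist action would have to produce, here is a small sketch in Python (chosen for brevity; the real change would live in Build.PL). The only fact relied on is the usual CPAN convention of turning '::' into '-' in archive names. The helper names, the 'bioperl' bundle name and the single shared version number are assumptions for the example.

```python
def dist_name(module, version):
    """CPAN-style archive name: '::' becomes '-', version is appended."""
    return f"{module.replace('::', '-')}-{version}.tar.gz"


def plan_dist(modules, version, bundle_name="bioperl"):
    """Return (monolithic_archive, per_module_archives) for one release.

    bundle_name and the shared version are assumptions for illustration;
    in practice each module could carry its own version number.
    """
    monolithic = f"{bundle_name}-{version}.tar.gz"
    per_module = {m: dist_name(m, version) for m in sorted(modules)}
    return monolithic, per_module


if __name__ == "__main__":
    mono, per = plan_dist(["Bio::SeqIO", "Bio::SeqIO::genbank"], "1.5.3")
    print(mono)
    for module, archive in per.items():
        print(module, "->", archive)
```

The point is only that both outputs come from the same checkout: one archive for non-CPAN users, one archive per module for CPAN users.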


> Also - the main point I wanted to make - Can I suggest we spend a
> little time discussing what it will take to get a stable release for
> the current code as it stands (bioperl-live and bioperl-run)?  It
> seems like we really need to do this first so that we have a stable
> release that can be followed by CVS -> SVN migration, then consider
> major changes to the repository structure and release packaging, and
> potential deprecation and incorporation of other modules.

I'd recommend that a 'stable' release shouldn't happen until we resolve
all the missing tests and bugzilla bugs (because I think the opportunity
should be taken to have it stable both in terms of interface /and/
bugs). Which is a lot of work.


> I assume there is no chance that we'd have a 1.6 candidate by BOSC
> next month?

None.


From bix at sendu.me.uk  Thu Jun 28 03:25:03 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 08:25:03 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
Message-ID: <4683624F.6020402@sendu.me.uk>

Chris Fields wrote:
> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:
>> What advantage is there of these defined splits instead of  
>> individual modules? As I see it you lose some of the potential  
>> benefits of breaking Bioperl up completely, whilst also suffering  
>> the maintenance problems I outlined in my objection to Steve's post.
>>
>> Being able to work on all Bioperl from a single cvs (ne svn) check  
>> out/ archive, whilst distributing it as individual modules on CPAN  
>> seems like the best of both worlds to me. What am I missing?
> 
> Okay, forewarned, but here's my long-winded reasoning.  The short and  
> sweet version: I (very) respectfully don't agree with you, at least  
> re: the idea we should commit all modules to CPAN independently. It  
> doesn't make any sense to me, but maybe you can elaborate more?   
> Maybe I'm misinterpreting what you mean?

The short and sweet version: my proposal has all the benefits of yours, 
but none of the disadvantages. What's not to like?


> Finally, all of this should wait until later.  Much later, like after  
> a decent release, after svn, etc kind of 'later'.  I think we can  
> agree on that.

Hmm, not really. If it can be implemented by a change in just Build.PL 
and ModuleBuildBioperl, it's really independent of everything else. 
That's the beauty of it: the only thing that changes is how things are 
uploaded to and downloaded from CPAN. The only person that normally 
deals with that issue is the pumpkin for a release, and he only cares 
about it at release time.

In fact, if we're going to do it at all it makes sense to try it out on 
a minor release like 1.5.3. We've already got experience of doing it 
split-style from 1.5.2. (And let me tell you: splits at the code-base 
level suck.)


> Individual CPAN modules:
> 
> CPAN is not our personal versioning system; it may be if a  
> distribution consists of only a few modules, but not when it's one of  
> the largest distros present.  If someone wants to update an  
> individual bioperl module for a quick bug fix they are more than  
> welcome to download it via cvs, svn, or even using a web browser, and  
> replace the one they have.

And where is the harm in letting them do it via CPAN as well? In fact, 
there are significant benefits:


> I'm trying to reason how one could break up the individual SeqIO/ 
> SearchIO/otherIO modules into single module distributions.  They are  
> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO,  
> which relies on the various interfaces, RootIO, and on down).  How  
> would tests be run off CPAN when the modules are distributed  
> independently?

Bio::SeqIO::genbank would have a dependency on the latest version of 
Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies.

So when a user wants to get the latest version of Bio::SeqIO::genbank, 
they no longer have to worry about what other modules in its dependency 
hierarchy they should also install.

Instead they just request Bio::SeqIO::genbank which itself ensures you 
have the latest version of all its dependencies before installing itself 
and running its tests.
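
To make that concrete, here is a rough sketch of what the Build.PL of a
single-module distribution might look like. It uses plain Module::Build
rather than our ModuleBuildBioperl subclass, and the distribution layout
and all version numbers are made up for illustration. Note that CPAN
dependency declarations express a minimum version rather than "the
latest", so in practice each release would bump the minimums it needs.

```perl
# Build.PL for a hypothetical single-module distribution of
# Bio::SeqIO::genbank. All version numbers are invented for illustration.
use strict;
use warnings;
use Module::Build;

my $build = Module::Build->new(
    module_name  => 'Bio::SeqIO::genbank',
    dist_version => '1.005003',          # hypothetical version
    license      => 'perl',
    requires     => {
        # Minimum acceptable version; a CPAN client that follows
        # prerequisites would fetch or upgrade anything older.
        'Bio::SeqIO' => '1.005003',      # hypothetical version
    },
    build_requires => {
        'Test::More' => 0,               # any version will do
    },
);

# Writes the ./Build script used for 'Build test' and 'Build install'.
$build->create_build_script;
```

Bio::SeqIO would declare its own requirements in the same way, so the
CPAN client walks the rest of the hierarchy without the user having to
know about it.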

When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank 
users should have, he could just call './Build dist Bio::SeqIO::genbank' 
which would generate a new package for Bio::SeqIO::genbank suitable for 
uploading to CPAN. No more long release cycles and having to constantly 
tell people to 'use CVS' to get working Bioperl code.


> Would they also be individually distributed?  What  
> would you use to tie all the individual modules together?  How would  
> you explain to the CPAN maintainers that you want to split bioperl  
> into 990 individual modules, all updated independently, but intend on  
> bundling them afterwards anyway?

They would be tied together by a CPAN bundle. You don't have to 
'explain' anything to the CPAN maintainers because you're not doing 
anything wrong. In fact, you're using it the way you're supposed to.
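
For anyone who hasn't looked inside one, a bundle is just a module whose
POD has a CONTENTS section listing module names, each optionally
followed by a minimum version and a comment. A minimal sketch follows;
the bundle name, the selection of modules and the version number are
hypothetical.

```perl
# lib/Bundle/BioPerl/SeqIO.pm (hypothetical bundle name and contents)
package Bundle::BioPerl::SeqIO;

use strict;
use warnings;

our $VERSION = '0.01';

1;

__END__

=head1 NAME

Bundle::BioPerl::SeqIO - sequence I/O modules grouped for one-step install

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerl::SeqIO'

=head1 CONTENTS

Bio::SeqIO

Bio::SeqIO::genbank 1.005003 - the version number here is made up

Bio::SeqIO::fasta

=cut
```

Changing a grouping means editing the CONTENTS list and uploading a new
version of this one file; none of the listed modules needs to be touched.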


> Splitting up core:
> 
> As I see it, here are the advantages of a defined split as Steve and  
> I see it (off the top of my head).  Some of this probably reiterates  
> my previous points, as well as Steve's, so apologies in advance.

Below I answer with how it would be with my single-module approach 
compared to the defined splits.


> - A lean, mean, focused set of bioperl base modules (core) w/o or  
> with very few external deps, minimal installation issues, etc.  The  
> very basic stuff to get up and running.

Even leaner, even more focused.


> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused  
> functionality, code, and tests, which add a bit more 'sugar' to the  
> base functionality of the core.  If you only care about parsing BLAST  
> reports, get SearchIO, which requires core and optionally other  
> modules (XML::SAX).  If you want additional DB functionality apart  
> from the very basic ones in core, install DB (with it's additional  
> requirements, including core, DBI, and so on).  Same with Graphics,  
> Tools, Tree/Phylo, etc.  We just need to define and limit the number  
> of splits.

The same can be achieved with CPAN bundles for each kind of functional 
grouping you can think of. And since it's just a single text file that 
defines such a grouping, it's easy to change or add new ones as you feel 
like it, as opposed to the rather more permanent and substantial effort 
of creating one of your splits on the code-base level.

Also, the world doesn't have to rely on /our/ ideas of what a useful 
functional split is. If someone just wants to parse Blast results, they 
can just use CPAN to install Bio::SearchIO::blast_pull instead of having 
to install all of SearchIO.


> - Easier to add additional bundled modules.  For instance, I could  
> focus all of my RNA work into a discrete set of modules (say, bioperl- 
> rna) which I maintain, I ensure works with the latest core code, I  
> ensure also plays well with the other children =) , and I distribute  
> via CPAN.  Same with EUtilities, which could go into a separated DB- 
> related set or stay in core.

And if you lose interest in them? They eventually die because they no 
longer have someone looking after them by default (the pumpkin and other 
devs). Alternatively you could just make a CPAN bundle. One text file! 
Easy! No duplication of modules in CPAN, no new hassle for you or the 
Bioperl 'core' pumpkin to ensure that the latest version of each works 
with each other and other splits.


> - If we want a full-fledged 'install everything', the CPAN Bundle  
> system is available.  I think it's easier to use a Bundle for 4-5,  
> even 10 groups of modules as opposed to over 900.

No, it isn't any easier. It's /equally/ easy to install a bundle of 900 
packages of 900 modules as it is to install 5 packages of 900 modules.

When not installing absolutely everything, but perhaps 'most' things, 
there's the additional benefit that it would be easier to skip a 
particular Bio::module because you didn't want to install its external 
dependencies and weren't that interested in it anyway.


> - A Bundle or a build file where discrete distributions are listed  
> (Bio::SearchIO, etc) wouldn't need to be updated every time a new  
> module is added to a distribution.  I suppose this could be  
> automated, but why have the additional headache?

Yes, it would be automated, and no, it wouldn't at all be any kind of 
additional headache. I'm proposing a fully-automated system that the 
pumpkin wouldn't even have to think about. Much /less/ of a headache 
than dealing with splits. Orders of magnitude easier to deal with.


> - A chance to cut out some cruft.  We all know that particular areas  
> need work or a complete overhaul (Restriction, Structure, maybe a few  
> others).  Smaller, concentrated sets of modules I believe would be  
> easier to maintain, and those that don't get use will eventually fall  
> out of favor and may be lost or replaced from the more maintained  
> group of modules.  Survival of the fittest.

And the smallest, most concentrated set of modules is the individual module.


> - We already have had practice; bioperl-db, bioperl-run, bioperl- 
> network, and others.  Those that have been routinely maintained and  
> enjoy wide use (db, run, network) have survived; others not so much  
> (corba-related stuff, microarray, ext, etc., though the code is still  
> available if someone else wants to take it up and revive it!).

The reason some of these existing splits (microarray, ext) have fallen by 
the way-side? /Because/ they're splits. If they had been part of 
bioperl-live all along, they'd have been kept in a working, compatible 
state and would have been released along with everything else in 1.5.2.


> Disadvantages of a defined split:
> 
> - The initial headache of identifying which groups go where,  
> coordinating with those who rely on bioperl (GMOD, etc) on how this  
> will be set up, so on...

No need to worry about this with individual modules.


> - Separate groups of modules require testing together to ensure  
> functionality is consistent and maintained (something I think you  
> pointed out previously).

No need to worry.


> - I think an increased possibility of branching is possible.
> 
> - Extra headaches for devs, who have to keep track of the various  
> critical distributions and make sure they work well together.

No headaches.


From charles-listes+bioperl at plessy.org  Thu Jun 28 03:40:04 2007
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Thu, 28 Jun 2007 16:40:04 +0900
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
Message-ID: <20070628074004.GD6338@kunpuu.plessy.org>

Dear developers,

I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if
it would make sense to call it "bioperl-live" and distribute it in
parallel with the stable 1.4.0 version, if bioperl-live means "the
current developer version".

If I am wrong, can somebody explain to me what exactly bioperl-live refers
to?

Have a nice day,

-- 
Charles Plessy
Debian-med packaging team
Wako, Saitama, Japan


From n.haigh at sheffield.ac.uk  Thu Jun 28 04:23:10 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 09:23:10 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683624F.6020402@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
Message-ID: <46836FEE.5030203@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Chris Fields wrote:
>> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:
>>> What advantage is there of these defined splits instead of 
>>> individual modules? As I see it you lose some of the potential 
>>> benefits of breaking Bioperl up completely, whilst also suffering 
>>> the maintenance problems I outlined in my objection to Steve's post.
>>>
>>> Being able to work on all Bioperl from a single cvs (ne svn) check 
>>> out/ archive, whilst distributing it as individual modules on CPAN 
>>> seems like the best of both worlds to me. What am I missing?
>>
>> Okay, forewarned, but here's my long-winded reasoning.  The short and 
>> sweet version: I (very) respectfully don't agree with you, at least 
>> re: the idea we should commit all modules to CPAN independently. It 
>> doesn't make any sense to me, but maybe you can elaborate more?  
>> Maybe I'm misinterpreting what you mean?
> 
> The short and sweet version: my proposal has all the benefits of yours,
> but none of the disadvantages. What's not to like?
> 
> 
>> Finally, all of this should wait until later.  Much later, like after 
>> a decent release, after svn, etc kind of 'later'.  I think we can 
>> agree on that.
> 
> Hmm, not really. If it can be implemented by a change in just Build.PL
> and ModuleBuildBioperl, its really independent of everything else.
> That's the beauty of it: the only thing that changes is how things are
> uploaded to and downloaded from CPAN. The only person that normally
> deals with that issue is the pumpkin for a release, and he only cares
> about it at release time.
> 
> In fact, if we're going to do it at all it makes sense to try it out on
> a minor release like 1.5.3. We've already got experience of doing it
> split-style from 1.5.2. (And let me tell you: splits at the code-base
> level suck.)
> 
> 
>> Individual CPAN modules:
>>
>> CPAN is not our personal versioning system; it may be if a 
>> distribution consists of only a few modules, but not when it's one of 
>> the largest distros present.  If someone wants to update an 
>> individual bioperl module for a quick bug fix they are more than 
>> welcome to download it via cvs, svn, or even using a web browser, and 
>> replace the one they have.
> 
> And where is the harm in letting them do it via CPAN as well? In fact,
> there are significant benefits:
> 
> 
>> I'm trying to reason how one could break up the individual SeqIO/
>> SearchIO/otherIO modules into single module distributions.  They are 
>> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, 
>> which relies on the various interfaces, RootIO, and on down).  How 
>> would tests be run off CPAN when the modules are distributed 
>> independently?
> 
> Bio::SeqIO::genbank would have a dependency on the latest version of
> Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies.
> 
> So when a user wants to get the latest version of Bio::SeqIO::genbank,
> they no longer have to worry about what other modules in its dependency
> hierarchy they should also install.
> 
> Instead they just request Bio::SeqIO::genbank which itself ensures you
> have the latest version of all its dependencies before installing itself
> and running its tests.

This was my thinking when I first brought this up at the
beginning/splitting of this thread. Thinking of modules as the
constituent parts of a larger package should make it far easier for
people to define dependencies, and users would only need to install the
parts they require. As Sendu points out, if the user wants to convert
seqs from genbank to fasta they could simply install
Bio::SeqIO::genbank and Bio::SeqIO::fasta and they would get all the
other modules that are the dependencies of Bio::SeqIO::genbank and
Bio::SeqIO::fasta.
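
As a toy illustration of that resolution step (not how CPAN.pm actually
implements it), the sketch below walks a small dependency table
depth-first and lists each module once, prerequisites first. The table
is a simplified assumption rather than the real dependency data, and the
function names are invented for this example.

```perl
use strict;
use warnings;

# Simplified, hypothetical dependency table. Real data would come from
# each distribution's metadata.
my %requires = (
    'Bio::SeqIO::genbank' => ['Bio::SeqIO'],
    'Bio::SeqIO::fasta'   => ['Bio::SeqIO'],
    'Bio::SeqIO'          => ['Bio::Root::IO'],
    'Bio::Root::IO'       => ['Bio::Root::Root'],
    'Bio::Root::Root'     => [],
);

# Depth-first walk: record a module only after all its prerequisites.
# The %$seen check means shared prerequisites are visited once.
# No cycle detection; this assumes the dependency graph is acyclic.
sub _visit {
    my ($mod, $seen, $order) = @_;
    return if $seen->{$mod}++;
    _visit($_, $seen, $order) for @{ $requires{$mod} || [] };
    push @$order, $mod;
    return;
}

# Returns the requested modules plus prerequisites in install order.
sub install_order {
    my @wanted = @_;
    my (%seen, @order);
    _visit($_, \%seen, \@order) for @wanted;
    return @order;
}

print "$_\n" for install_order('Bio::SeqIO::genbank', 'Bio::SeqIO::fasta');
# prints:
#   Bio::Root::Root
#   Bio::Root::IO
#   Bio::SeqIO
#   Bio::SeqIO::genbank
#   Bio::SeqIO::fasta
```

The shared prerequisites (Bio::SeqIO and below) appear only once even
though both requested modules need them.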

> 
> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank
> users should have, he could just call './Build dist Bio::SeqIO::genbank'
> which would generate a new package for Bio::SeqIO::genbank suitable for
> uploading to CPAN. No more long release cycles and having to constantly
> tell people to 'use CVS' to get working Bioperl code.

However, how would the test suite work with this? E.g. when someone
installs Bio::SeqIO::genbank they want the tests associated with
Bio::SeqIO::genbank to be run. Would there be tests that would be run
redundantly if, for example, someone installed both Bio::SeqIO::genbank
and Bio::SeqIO::fasta?

> 
> 
>> Would they also be individually distributed?  What  would you use to
>> tie all the individual modules together?  How would  you explain to
>> the CPAN maintainers that you want to split bioperl  into 990
>> individual modules, all updated independently, but intend on  bundling
>> them afterwards anyway?
> 
> They would be tied together by a CPAN bundle. You don't have to
> 'explain' anything to the CPAN maintainers because you're not doing
> anything wrong. In fact, you're using it the way you're supposed to.

Yep. Real modules are released as modules, each with its own set of
dependencies. Then use CPAN bundles the way they were supposed to be
used: for distributing a set of CPAN modules that make up a coherent set
of functionality. You "could" also bundle in other authors' modules, e.g.
Bio::ASN1::EntrezGene?

> 
> 
>> Splitting up core:
>>
>> As I see it, here are the advantages of a defined split as Steve and 
>> I see it (off the top of my head).  Some of this probably reiterates 
>> my previous points, as well as Steve's, so apologies in advance.
> 
> Below I answer with how it would be with my single-module approach
> compared to the defined splits.
> 
> 
>> - A lean, mean, focused set of bioperl base modules (core) w/o or 
>> with very few external deps, minimal installation issues, etc.  The 
>> very basic stuff to get up and running.
> 
> Even leaner, even more focused.
> 
> 
>> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused 
>> functionality, code, and tests, which add a bit more 'sugar' to the 
>> base functionality of the core.  If you only care about parsing BLAST 
>> reports, get SearchIO, which requires core and optionally other 
>> modules (XML::SAX).  If you want additional DB functionality apart 
>> from the very basic ones in core, install DB (with it's additional 
>> requirements, including core, DBI, and so on).  Same with Graphics, 
>> Tools, Tree/Phylo, etc.  We just need to define and limit the number 
>> of splits.
> 
> The same can be achieved with CPAN bundles for each kind of functional
> grouping you can think of. And since its just a single text file that
> defines such a grouping, its easy to change or add new ones as you feel
> like it, as opposed to the rather more permanent and substantial effort
> of creating one of your splits on the code-base level.
> 
> Also, the world doesn't have to rely on /our/ ideas of what a useful
> functional split is. If someone just wants to parse Blast results, they
> can just use CPAN to install Bio::SearchIO::blast_pull instead of having
> to install all of SearchIO.
> 
> 
>> - Easier to add additional bundled modules.  For instance, I could 
>> focus all of my RNA work into a discrete set of modules (say, bioperl-
>> rna) which I maintain, I ensure works with the latest core code, I 
>> ensure also plays well with the other children =) , and I distribute 
>> via CPAN.  Same with EUtilities, which could go into a separated DB-
>> related set or stay in core.
> 
> And if you lose interest in them? They eventually die because they no
> longer have someone looking after them by default (the pumpkin and other
> devs). Alternatively you could just make a CPAN bundle. One text file!
> Easy! No duplication of modules in CPAN, no new hassle for you or the
> Bioperl 'core' pumpkin to ensure that the latest version of each work
> with each other and other splits.

Hmm, how would module versions be handled? Wouldn't this approach
require each module to have its own independent version number, which
could then be used for building the dependencies? Each new release of
a module would only bump that module's version number.

Bundles can specify the minimum version of a module to be installed,
such that bug fixes to individual modules can be released to CPAN and
would automatically get picked up when installing bundles etc.

I'm not quite sure how the current stable/dev releases would work. I
assume bug fixes would have to be made on a branch, e.g. branch 1.6, and
released to CPAN from there. Then when the next stable release is made,
all module versions would be bumped and released to CPAN, with any
modifications to the content of the bundle made at the same time. Is it
possible to have stable and developer release bundles that specify the
minimum stable and developer module versions respectively?


> 
> 
>> - If we want a full-fledged 'install everything', the CPAN Bundle 
>> system is available.  I think it's easier to use a Bundle for 4-5, 
>> even 10 groups of modules as opposed to over 900.
> 
> No, it isn't any easier. Its /equally/ easy to install a bundle of 900
> packages of 900 modules as it is to install 5 packages of 900 modules.
> 
> When not installing absolutely everything, but perhaps 'most' things,
> there's the additional benefit that it would be easier to skip a
> particular Bio::module because you didn't want to install its external
> dependencies and weren't that interested in it anyway.
> 
> 
>> - A Bundle or a build file where discrete distributions are listed 
>> (Bio::SearchIO, etc) wouldn't need to be updated every time a new 
>> module is added to a distribution.  I suppose this could be 
>> automated, but why have the additional headache?
> 
> Yes, it would be automated, and no, it wouldn't at all be any kind of
> additional headache. I'm proposing a fully-automated system that the
> pumpkin wouldn't even have to think about it. Much /less/ of a headache
> than dealing with splits. Orders of magnitude easier to deal with.
> 
> 
>> - A chance to cut out some cruft.  We all know that particular areas 
>> need work or a complete overhaul (Restriction, Structure, maybe a few 
>> others).  Smaller, concentrated sets of modules I believe would be 
>> easier to maintain, and those that don't get use will eventually fall 
>> out of favor and may be lost or replaced from the more maintained 
>> group of modules.  Survival of the fittest.
> 
> And the smallest, most concentrated set of modules is the individual
> module.
> 
> 
>> - We already have had practice; bioperl-db, bioperl-run, bioperl-
>> network, and others.  Those that have been routinely maintained and 
>> enjoy wide use (db, run, network) have survived; others not so much 
>> (corba-related stuff, microarray, ext, etc., though the code is still 
>> available if someone else wants to take it up and revive it!).
> 
> The reason some of these existing splits (micoarray, ext) have fallen by
> the way-side? /Because/ they're splits. If they had been part of
> bioperl-live all along, they'd have been kept in a working, compatible
> state and would have been released along with everything else in 1.5.2
> 
> 
>> Disadvantages of a defined split:
>>
>> - The initial headache of identifying which groups go where, 
>> coordinating with those who rely on bioperl (GMOD, etc) on how this 
>> will be set up, so on...
> 
> No need to worry about this with individual modules.
> 
> 
>> - Separate groups of modules require testing together to ensure 
>> functionality is consistent and maintained (something I think you 
>> pointed out previously).
> 
> No need to worry.

Maybe need to worry about how the tests are run when installing individual
modules etc?

> 
> 
>> - I think an increased possibility of branching is possible.
>>
>> - Extra headaches for devs, who have to keep track of the various 
>> critical distributions and make sure they work well together.
> 
> No headaches.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg2/uczuW2jkwy2gRAlR4AJ44kHIXWWapNVGOIrkFBJdP9rn3vwCdErhT
VkymyXNshguE44/RilEXWDA=
=O5ex
-----END PGP SIGNATURE-----


From n.haigh at sheffield.ac.uk  Thu Jun 28 04:27:54 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 09:27:54 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683624F.6020402@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
Message-ID: <4683710A.9010808@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Chris Fields wrote:
>> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:
>>> What advantage is there of these defined splits instead of 
>>> individual modules? As I see it you lose some of the potential 
>>> benefits of breaking Bioperl up completely, whilst also suffering 
>>> the maintenance problems I outlined in my objection to Steve's post.
>>>
>>> Being able to work on all Bioperl from a single cvs (ne svn) check 
>>> out/ archive, whilst distributing it as individual modules on CPAN 
>>> seems like the best of both worlds to me. What am I missing?
>>
>> Okay, forewarned, but here's my long-winded reasoning.  The short and 
>> sweet version: I (very) respectfully don't agree with you, at least 
>> re: the idea we should commit all modules to CPAN independently. It 
>> doesn't make any sense to me, but maybe you can elaborate more?  
>> Maybe I'm misinterpreting what you mean?
> 
> The short and sweet version: my proposal has all the benefits of yours,
> but none of the disadvantages. What's not to like?
> 
> 
>> Finally, all of this should wait until later.  Much later, like after 
>> a decent release, after svn, etc kind of 'later'.  I think we can 
>> agree on that.
> 
> Hmm, not really. If it can be implemented by a change in just Build.PL
> and ModuleBuildBioperl, its really independent of everything else.
> That's the beauty of it: the only thing that changes is how things are
> uploaded to and downloaded from CPAN. The only person that normally
> deals with that issue is the pumpkin for a release, and he only cares
> about it at release time.
> 
> In fact, if we're going to do it at all it makes sense to try it out on
> a minor release like 1.5.3. We've already got experience of doing it
> split-style from 1.5.2. (And let me tell you: splits at the code-base
> level suck.)
> 
> 
>> Individual CPAN modules:
>>
>> CPAN is not our personal versioning system; it may be if a 
>> distribution consists of only a few modules, but not when it's one of 
>> the largest distros present.  If someone wants to update an 
>> individual bioperl module for a quick bug fix they are more than 
>> welcome to download it via cvs, svn, or even using a web browser, and 
>> replace the one they have.
> 
> And where is the harm in letting them do it via CPAN as well? In fact,
> there are significant benefits:
> 
> 
>> I'm trying to reason how one could break up the individual SeqIO/
>> SearchIO/otherIO modules into single module distributions.  They are 
>> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, 
>> which relies on the various interfaces, RootIO, and on down).  How 
>> would tests be run off CPAN when the modules are distributed 
>> independently?
> 
> Bio::SeqIO::genbank would have a dependency on the latest version of
> Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies.
> 
> So when a user wants to get the latest version of Bio::SeqIO::genbank,
> they no longer have to worry about what other modules in its dependency
> hierarchy they should also install.
> 
> Instead they just request Bio::SeqIO::genbank which itself ensures you
> have the latest version of all its dependencies before installing itself
> and running its tests.
> 
> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank
> users should have, he could just call './Build dist Bio::SeqIO::genbank'
> which would generate a new package for Bio::SeqIO::genbank suitable for
> uploading to CPAN. No more long release cycles and having to constantly
> tell people to 'use CVS' to get working Bioperl code.
> 
> 
>> Would they also be individually distributed?  What  would you use to
>> tie all the individual modules together?  How would  you explain to
>> the CPAN maintainers that you want to split bioperl  into 990
>> individual modules, all updated independently, but intend on  bundling
>> them afterwards anyway?
> 
> They would be tied together by a CPAN bundle. You don't have to
> 'explain' anything to the CPAN maintainers because you're not doing
> anything wrong. In fact, you're using it the way you're supposed to.
> 


The successor to Bundles - may prove interesting:
http://search.cpan.org/~adamk/Task-1.01/lib/Task.pm


> 
>> Splitting up core:
>>
>> As I see it, here are the advantages of a defined split as Steve and 
>> I see it (off the top of my head).  Some of this probably reiterates 
>> my previous points, as well as Steve's, so apologies in advance.
> 
> Below I answer with how it would be with my single-module approach
> compared to the defined splits.
> 
> 
>> - A lean, mean, focused set of bioperl base modules (core) w/o or 
>> with very few external deps, minimal installation issues, etc.  The 
>> very basic stuff to get up and running.
> 
> Even leaner, even more focused.
> 
> 
>> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused 
>> functionality, code, and tests, which add a bit more 'sugar' to the 
>> base functionality of the core.  If you only care about parsing BLAST 
>> reports, get SearchIO, which requires core and optionally other 
>> modules (XML::SAX).  If you want additional DB functionality apart 
>> from the very basic ones in core, install DB (with it's additional 
>> requirements, including core, DBI, and so on).  Same with Graphics, 
>> Tools, Tree/Phylo, etc.  We just need to define and limit the number 
>> of splits.
> 
> The same can be achieved with CPAN bundles for each kind of functional
> grouping you can think of. And since its just a single text file that
> defines such a grouping, its easy to change or add new ones as you feel
> like it, as opposed to the rather more permanent and substantial effort
> of creating one of your splits on the code-base level.
> 
> Also, the world doesn't have to rely on /our/ ideas of what a useful
> functional split is. If someone just wants to parse Blast results, they
> can just use CPAN to install Bio::SearchIO::blast_pull instead of having
> to install all of SearchIO.
> 
> 
>> - Easier to add additional bundled modules.  For instance, I could 
>> focus all of my RNA work into a discrete set of modules (say, bioperl-
>> rna) which I maintain, I ensure works with the latest core code, I 
>> ensure also plays well with the other children =) , and I distribute 
>> via CPAN.  Same with EUtilities, which could go into a separated DB-
>> related set or stay in core.
> 
> And if you lose interest in them? They eventually die because they no
> longer have someone looking after them by default (the pumpkin and other
> devs). Alternatively you could just make a CPAN bundle. One text file!
> Easy! No duplication of modules in CPAN, no new hassle for you or the
> Bioperl 'core' pumpkin to ensure that the latest version of each work
> with each other and other splits.
> 
> 
>> - If we want a full-fledged 'install everything', the CPAN Bundle 
>> system is available.  I think it's easier to use a Bundle for 4-5, 
>> even 10 groups of modules as opposed to over 900.
> 
> No, it isn't any easier. Its /equally/ easy to install a bundle of 900
> packages of 900 modules as it is to install 5 packages of 900 modules.
> 
> When not installing absolutely everything, but perhaps 'most' things,
> there's the additional benefit that it would be easier to skip a
> particular Bio::module because you didn't want to install its external
> dependencies and weren't that interested in it anyway.
> 
> 
>> - A Bundle or a build file where discrete distributions are listed 
>> (Bio::SearchIO, etc) wouldn't need to be updated every time a new 
>> module is added to a distribution.  I suppose this could be 
>> automated, but why have the additional headache?
> 
> Yes, it would be automated, and no, it wouldn't at all be any kind of
> additional headache. I'm proposing a fully-automated system that the
> pumpkin wouldn't even have to think about it. Much /less/ of a headache
> than dealing with splits. Orders of magnitude easier to deal with.
> 
> 
>> - A chance to cut out some cruft.  We all know that particular areas 
>> need work or a complete overhaul (Restriction, Structure, maybe a few 
>> others).  Smaller, concentrated sets of modules I believe would be 
>> easier to maintain, and those that don't get use will eventually fall 
>> out of favor and may be lost or replaced from the more maintained 
>> group of modules.  Survival of the fittest.
> 
> And the smallest, most concentrated set of modules is the individual
> module.
> 
> 
>> - We already have had practice; bioperl-db, bioperl-run, bioperl-
>> network, and others.  Those that have been routinely maintained and 
>> enjoy wide use (db, run, network) have survived; others not so much 
>> (corba-related stuff, microarray, ext, etc., though the code is still 
>> available if someone else wants to take it up and revive it!).
> 
> The reason some of these existing splits (micoarray, ext) have fallen by
> the way-side? /Because/ they're splits. If they had been part of
> bioperl-live all along, they'd have been kept in a working, compatible
> state and would have been released along with everything else in 1.5.2
> 
> 
>> Disadvantages of a defined split:
>>
>> - The initial headache of identifying which groups go where, 
>> coordinating with those who rely on bioperl (GMOD, etc) on how this 
>> will be set up, so on...
> 
> No need to worry about this with individual modules.
> 
> 
>> - Separate groups of modules require testing together to ensure 
>> functionality is consistent and maintained (something I think you 
>> pointed out previously).
> 
> No need to worry.
> 
> 
>> - I think an increased possibility of branching is possible.
>>
>> - Extra headaches for devs, who have to keep track of the various 
>> critical distributions and make sure they work well together.
> 
> No headaches.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg3EKczuW2jkwy2gRAriiAJ47Qz9jTshEXuaG0XMYrUTI0hHqAwCeL45r
r/BykCKbM9lqJM0khARuEms=
=NB4B
-----END PGP SIGNATURE-----


From n.haigh at sheffield.ac.uk  Thu Jun 28 04:51:19 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 09:51:19 +0100
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org>
References: <20070628074004.GD6338@kunpuu.plessy.org>
Message-ID: <46837687.7010101@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Charles Plessy wrote:
> Dear developers,
> 
> I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if
> it would make sense to call it "bioperl-live" and distribute it in
> parallel with the stable 1.4.0 version, if bioperl-live means "the
> current developer version".
> 
> If I am wrong, can somebody explain to me what bioperl-live exactly refers
> to ?
> 
> Have a nice day,
> 

bioperl-live really means the HEAD of the cvs repository so is the most
bleeding-edge code available.

Version 1.5.* is the developer release, while 1.4.* is the stable
release. However, there have been few updates to the 1.4.* release, which
means that in practice it is less stable than the 1.5.* dev release. I
think the consensus was to have more rapid release cycles of the stable
branch in future in order to avoid this. I'm sure there are others more
qualified to expand on or correct this if needs be.

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg3aHczuW2jkwy2gRAo5pAJ95BGqrA5bLwRKNfUQi/HfBnkUJjwCg0mYB
/fHFyYkqAvcmOSxu4djPll0=
=KwVH
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Thu Jun 28 05:11:39 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 10:11:39 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <46836FEE.5030203@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk> <46836FEE.5030203@sheffield.ac.uk>
Message-ID: <46837B4B.7060705@sendu.me.uk>

Nathan S. Haigh wrote:
(Please try and snip more: don't quote whole posts just to reply to 
certain paragraphs)

> Sendu Bala wrote:
>> Chris Fields wrote:
>> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank
>> users should have, he could just call './Build dist Bio::SeqIO::genbank'
>> which would generate a new package for Bio::SeqIO::genbank suitable for
>> uploading to CPAN. No more long release cycles and having to constantly
>> tell people to 'use CVS' to get working Bioperl code.
> 
> However, how would the test suite work out with this? e.g. when someone
> installs Bio::SeqIO::genbank they want to have the tests associated with
> Bio::SeqIO::genbank to be run. Would there be tests that would be run
> redundantly if for example someone installed Bio::SeqIO::genbank and
> Bio::SeqIO::fasta?

We would want to move to a strict test-script-per-module system. But 
that's desirable in any case, as it would greatly ease reaching our goal 
of complete test coverage, and subsequent maintenance of those tests.

The genbank test would only run tests specific to genbank parsing, and 
likewise for fasta. They would both have a dependency on Bio::SeqIO, and 
if that was also recently updated, it would get installed prior to you 
installing genbank (and therefore run its own generic SeqIO tests), but 
wouldn't get installed again (wouldn't run its tests again) when you 
install fasta afterwards.
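
A toy sketch of that install order, just to make the "tests run once" point
concrete. This is not how CPAN.pm works internally; install_module and the
single-dependency interface are made up for illustration.

```shell
#!/bin/sh
# Toy model: a module's tests run only the first time it is installed.
# install_module and its arguments are hypothetical, not a CPAN interface.

installed=""   # space-separated list of modules already installed

install_module() {
    # $1 = module to install, $2 = optional single dependency
    case " $installed " in
        *" $1 "*) return 0 ;;      # already installed: tests are not re-run
    esac
    if [ -n "${2:-}" ]; then
        install_module "$2"        # the dependency goes in first
    fi
    echo "testing $1"
    installed="$installed $1"
}

# Example:
#   install_module Bio::SeqIO::genbank Bio::SeqIO
#   install_module Bio::SeqIO::fasta   Bio::SeqIO
# The generic Bio::SeqIO tests are reported once, before genbank.
```

With a strict test-script-per-module layout, the second call only has the
fasta-specific tests left to run.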


On the subject of tests, I'm reminded of another benefit of the 
individual-module approach. Currently if a test fails during a CPAN 
install, nothing gets installed. Users do one of:

# refuse to install at all (strict sys-admins)
# cry and give up (newbies)
# cry and seek help (newbies who really really need Bioperl)
# force install, leaving them in some undefined state because they 
didn't understand the problems (most remaining users)
# force install, happy that the problems are ok (some Bioperl devs)

With a bundle of individual modules you would install virtually all 
Bioperl modules with no problems, and the problems with the remainder 
would be clear to everyone. No one would need to force install since the 
tests results would now be meaningful: the thing you're trying to 
install really isn't going to work if the tests are failing. If you 
really needed that particular Bioperl module you could then pay 
particular attention to why it's failing (most likely some problem with 
an external dependency).


>>> Would they also be individually distributed?  What  would you use to
>>> tie all the individual modules together?
>>
>> They would be tied together by a CPAN bundle. You don't have to
>> 'explain' anything to the CPAN maintainers because you're not doing
>> anything wrong. In fact, you're using it the way you're supposed to.
> 
> Yep. Real modules are released as modules, each with their own set of
> dependencies. Then use CPAN bundles the way they were supposed to be
> used: for distributing a set of CPAN modules that make a coherent set of
> functionality. You "could" also bundle in other authors' modules e.g.
> Bio::ASN1::EntrezGene?

Any bundle featuring Bio::SeqIO::entrezgene would necessarily include 
Bio::ASN1::EntrezGene in the bundle.


> Hmm, how would module versions be handled? Wouldn't this approach
> require each module to have its own independent version number, which
> could then be used for building the dependencies? Each new release of
> that module would only bump that module's version number.

Yes, that's how it would work. No more global version number.


> Bundles can specify the minimum version of a module to be installed,
> such that bug fixes to individual modules can be released into CPAN and
> would automatically get picked up when installing bundles etc.

Yes.


> I'm not quite sure how the current stable/dev releases would work. I
> assume bug fixes would have to be made on a branch e.g. branch 1.6 and
> released to cpan from there. Then when the next stable release is made,
> all module versions would be bumped and released to CPAN, with any
> modifications to the content of the bundle made then too. Is it possible to
> have a stable and developer release bundles that are able to specify the
> minimum stable and developer modules versions respectively?

No, the distinction becomes pretty meaningless. We could still do big 
major releases, but modules wouldn't be version-bumped. The big release 
would just be an update of the bundle that specifies the latest version 
of all Bioperl modules.

Remember that bundles only specify the minimum version, not the required 
version: in this brave new world users would end up with the same 
versions of modules whether they installed a 1.8 bundle or a 1.7 bundle.
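
A minimal sketch of the minimum-version rule, assuming plain decimal module
versions (e.g. 1.5, not 1.5.2). needs_upgrade is a made-up helper for
illustration, not part of CPAN.pm.

```shell
#!/bin/sh
# Sketch only: decide whether a bundle entry triggers an install.
# Assumes decimal version numbers; needs_upgrade is a hypothetical name.

needs_upgrade() {
    # $1 = installed version ("" if not installed)
    # $2 = minimum version listed in the bundle
    [ -z "$1" ] && return 0
    # Succeeds (exit 0) only when the installed version is below the minimum.
    awk -v have="$1" -v want="$2" 'BEGIN { exit !(have + 0 < want + 0) }'
}

# Example:
#   needs_upgrade 1.4 1.5 && echo "would fetch the latest version from CPAN"
#   needs_upgrade 1.7 1.5 || echo "already satisfied, left alone"
```

Since an unmet minimum pulls in whatever is newest on CPAN at that moment,
an older bundle and a newer one can leave you with the same module versions.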

The only way to get a true snapshot of 1.7 after it was released would 
be if we took snapshots and archived them, making them available from 
bioperl.org (or by checking out the 1.7 tag from cvs/svn).

I don't see that as a significant problem. You lose the trivial benefit 
of being able to install old snapshots from CPAN. The people who have a 
great need to install old snapshots can find their way to bioperl.org no 
problem.


From bix at sendu.me.uk  Thu Jun 28 04:50:09 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 09:50:09 +0100
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org>
References: <20070628074004.GD6338@kunpuu.plessy.org>
Message-ID: <46837641.8050106@sendu.me.uk>

Charles Plessy wrote:
> I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if
> it would make sense to call it "bioperl-live" and distribute it in
> parallel with the stable 1.4.0 version, if bioperl-live means "the
> current developer version".
> 
> If I am wrong, can somebody explain to me what bioperl-live exactly refers
> to ?

bioperl-live is the name of the CVS repository containing what is 
currently considered the 'Core package' or core modules.
http://www.bioperl.org/wiki/Using_CVS

If you want to call it something to distinguish it from stable, call it 
'developer' vs 'stable' or '1.5.2' vs '1.4.0'.

To distinguish them both from the other packages, call them 'core' vs 
'run' etc.


From hlapp at gmx.net  Thu Jun 28 06:31:29 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 28 Jun 2007 07:31:29 -0300
Subject: [Bioperl-l] Splits again
In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
Message-ID: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>


On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote:

> [...] Also - the main point I wanted to make - Can I suggest we  
> spend a
> little time discussing what it will take to get a stable release for
> the current code as it stands (bioperl-live and bioperl-run)?  It
> seems like we really need to do this first so that we have a stable
> release that can be followed by CVS -> SVN migration, then consider
> major changes to the repository structure and release packaging, and
> potential deprecation and incorporation of other modules.

I agree we need to discuss a path towards 1.6, but I think that  
should be kept separate from the cvs->svn migration. Otherwise one  
stalls the other (by stopping people who seem to have the energy and  
motivation right now to do one but not the other) for no really good  
reason.

> I assume there is no chance that we'd have a 1.6 candidate by BOSC
> next month?

I'm not sure that's feasible, but if someone steps up, maybe it is.

>
> Will it be productive to schedule a fair amount of time at BOSC
> discussing how to partition out the packages into separate sub-
> packages after we've done a successful release rather than trying to
> change things right now?

I agree. I also don't think that people are partitioning right now  
(other than the existing partitioning), though maybe I'm mistaken.

> [...]
> It would  probably mean moving Bio::Graphics, Bio::DB::GFF and
> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages
> so they could be released more regularly on par with Gbrowse
> schedules.

Possibly. I'm not fully sure why those modules couldn't also be  
released more often out of the "main trunk" of modules. In Java/ant,  
it'd be relatively easy to write build script filters that select the  
appropriate modules and package them on the fly. I'm not sure whether  
the build tools for Perl can do that too, though.

>   Also I think someone needs to figure out Bio::Tools::GFF
> vs Bio::FeatureIO -- what do we want to do?

I believe FeatureIO has the ontology download tied into it?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Thu Jun 28 06:47:39 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 28 Jun 2007 07:47:39 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
Message-ID: <F2858007-63BC-4E72-B5BD-5420BE39E6D2@gmx.net>


On Jun 28, 2007, at 12:29 AM, Jason Stajich wrote:

> As I tried to ask for in the past, would someone also illustrate the
> importance of why _WE_ need to switch to SVN on a wiki page on
> Bioperl so that when someone complains/asks about this in the future
> the arguments are already laid out.  I am basically fine with it, but
> I don't honestly see a compelling reason beyond what has been
> mentioned wrt better integration in IDEs.
> http://bioperl.org/wiki/Why_SVN

I guess at the end of the day svn is just the system of choice for  
new developers. I've had people tell me who started with svn that cvs  
seems a lot harder to use. The newer projects are all on svn, and for  
example integrating Bio::Phylo into BioPerl should not become a question  
of the revision control system.

At the end of the day if being on svn makes it easier for new people  
to contribute it's enough of an argument for me, whether it's  
rational or not.

IMHO, there's two advantages that svn has over cvs. First,  
directories are versioned, have properties, and generally are the  
same class of citizens as files. They can be added, renamed, and  
removed from the repository. In cvs, we all know what a hassle it is  
to rename or even retire directories. Second, svn log gives you the  
commits, i.e., the set of changes that constituted one particular  
commit (and therefore version increase). In cvs that's hard or  
impossible to reconstruct.

Bottom line - I don't think many people if any will question why we  
moved from cvs to svn ...

My $0.02 ...

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Wed Jun 27 20:34:37 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 20:34:37 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
	<4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
	<9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>
	<1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>
Message-ID: <18051.541.684705.567954@almost.alerce.com>

Chris Fields writes:
 > We should port them all, yes.
 > 
 > chris
 > 
 > On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote:
 > 
 > > Is there a reason not to port every subproject over?
 > >
 > > 	-hilmar

They're all there.  At least everything that I found in the CVS repo.
Some of the directories were empty, some had very little content, I
was just mechanical about it.

Here's what I have:

  [hartzell at dev ~]$ svn ls file://`pwd`/bioperl
  biodata/
  bioperl-cookbook/
  bioperl-corba-client/
  bioperl-corba-server/
  bioperl-das-client/
  bioperl-db/
  bioperl-ext/
  bioperl-gui/
  bioperl-live/
  bioperl-microarray/
  bioperl-network/
  bioperl-papers/
  bioperl-pedigree/
  bioperl-pipeline/
  bioperl-run/
  biosql-schema/
  html/
  task-manager/
  xml-html/

I wasn't very clear in my original request, but I was hoping that
someone out there who's familiar with the various out-of-the-way bits
and pieces could take a look at them.  I was afraid that everyone was
just checking out bioperl-live and doing 'make test'.

Someone (chris?) made a point about binary files in bioperl-run.  It'd
be great if someone in the know could check on them.

Also, to the degree that it's possible, look around at various tags
and branches and see if they're what you'd expect.

Thanks!

g.


From bix at sendu.me.uk  Thu Jun 28 08:21:37 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 13:21:37 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <4683A7D1.8070403@sendu.me.uk>

George Hartzell wrote:
> Chris Fields writes:
>  > [...]
>  > It looks like George Hartzell may be taking a crack at it, with  
>  > Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
>  > could have something testable relatively soon.  After that we'll need  
>  > to work out a few other issues, basically what's on Hilmar's list.
> 
> There's a repository on file:///home/hartzell/bioperl with all of the
> components projects in place.
> 
> If you have a dev.open-bio.org account and you're in the bioperl
> group, you're good to get at it via:
> 
>   file:///home/hartzell/bioperl

I'm confused. Presumably that only works whilst logged into 
dev.open-bio.org?


>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl

I just tried:

svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl

on Mac OS X and things seemed to go well, except for this error message 
at the end:


svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
svn: Can't move source to dest
svn: Can't move 
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
to 
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
No such file or directory

I also ended up with only:
bioperl-corba-server    bioperl-db              bioperl-live 
bioperl-network         bioperl-papers          biosql-schema


Am I doing something totally wrong here?


From hartzell at alerce.com  Thu Jun 28 08:32:36 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:32:36 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN
	and	...Re:	Perltidy]
In-Reply-To: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
Message-ID: <18051.43620.481558.447399@almost.alerce.com>

Jason Stajich writes:
 > [...]
 > The repository machine (dev) is a locked down machine meaning it only  
 > really runs ssh and not many servers include httpd.  We have  
 > anonymous CVS (client and through httpd browsing) running on a  
 > separate machine (code) that has the info rsynced over every 10 or 15  
 > minutes.

A great way to provide a read-only mirror of the repos. for anonymous
users is to have svnsync running out of cron on code.open-bio.org,
configured to pull from the dev.open-bio.org repository.  It might
actually work to have rsync mirror the fsfs-backed repository, but
that's scary-poking-into-the-internals.
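
A rough sketch of that svnsync setup, assuming Subversion 1.4 or later on
both machines. The source URL is the demo repository from this thread; the
mirror path is a placeholder, and setup_mirror and mirror_cron_line are
made-up helper names.

```shell
#!/bin/sh
# Sketch only: a read-only mirror fed by svnsync.
SRC_URL="svn+ssh://dev.open-bio.org/home/hartzell/bioperl"  # demo repo from this thread
MIRROR_DIR="/path/to/mirror/bioperl"                        # placeholder on the mirror host

setup_mirror() {
    # One-time setup, run on the mirror host.
    svnadmin create "$MIRROR_DIR" &&
    # svnsync writes revision properties, so the mirror has to permit that.
    printf '#!/bin/sh\nexit 0\n' > "$MIRROR_DIR/hooks/pre-revprop-change" &&
    chmod +x "$MIRROR_DIR/hooks/pre-revprop-change" &&
    svnsync initialize "file://$MIRROR_DIR" "$SRC_URL"
}

mirror_cron_line() {
    # $1 = polling interval in minutes; prints a crontab entry.
    printf '*/%s * * * * svnsync synchronize --non-interactive file://%s\n' \
        "$1" "$MIRROR_DIR"
}

# Example: mirror_cron_line 15
```

Nobody should commit to the mirror directly; svnsync expects to be the only
writer.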

g.


From hartzell at alerce.com  Thu Jun 28 08:43:37 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:43:37 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
Message-ID: <18051.44281.831316.749586@almost.alerce.com>

David Messina writes:
 > 
 > On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote:
 > 
 > >
 > > On Jun 27, 2007, at 1:27 PM, David Messina wrote:
 > >
 > >> I would think we would want "Author Date Id Rev URL" set on
 > >> everything, no?. So either cvs2svn or your tool (whichever you think
 > >> is better), followed by
 > >>
 > >> 	svn propset svn:keywords "Author Date Id Rev URL" *
 > >
 > > Shouldn't this be done recursively?
 > 
 > 
 > Yep, good catch! Thanks, Hilmar.
 > 
 > Should be:
 > 
 > 	svn propset --recursive svn:keywords "Author Date Id Rev URL" *

That's not quite what you want either.  It'll set the keyword
property on all of the files, including things where you probably
don't want expansion to happen (e.g. images, someone said there are
binary wads in bioperl-run, etc...).

The Right Thing To Do is to grub around (grep) for '\$Id:' (and the
others) and set svn:keywords only on files that are already using
keywords.  I have a bourne shell hack that'll do this, although it's
painful because it has to run in working directories....
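
Something along these lines, as a sketch rather than the actual hack
mentioned above. It assumes a grep that supports -I (skip binary files), as
GNU and BSD grep do; keyword_files and print_propset_commands are made-up
names, and the second one only prints the commands so they can be reviewed
first.

```shell
#!/bin/sh
# Sketch only: find text files that already contain an svn/cvs keyword.

keyword_files() {
    # $1 = root of a working copy; .svn admin directories are skipped.
    find "$1" -name .svn -prune -o -type f \
        -exec grep -IlE '\$(Author|Date|Id|Rev|URL)(:[^$]*)?\$' {} +
}

print_propset_commands() {
    # Dry run: print one propset command per matching file.
    keyword_files "$1" | while IFS= read -r f; do
        printf 'svn propset svn:keywords "Author Date Id Rev URL" "%s"\n' "$f"
    done
}

# Example: print_propset_commands bioperl-live > propset.sh
```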

Once we settle on a list of keywords to use, I'll take a whack at the
demo repository.

Likewise, you probably DON'T want to use this in your config file:

	  enable-auto-props = yes
	  * = svn:keywords="Author Date Id Rev URL"

since it'll do the same thing.

The Right Thing To Do is a more tedious 

	  *.pl = svn:keywords="Author Date Id Rev URL"
	  *.pm = svn:keywords="Author Date Id Rev URL"
  	  *.c = svn:keywords="Author Date Id Rev URL"

A bit of googling will give you a good starting point for the list,
and we should probably maintain a common one somewhere in the repo.
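
One way to keep such a list in one place is to generate the stanza from a
list of extensions. A sketch only: auto_props_stanza is a made-up helper, and
the value is written without quotes here because I'm not certain every client
version strips them.

```shell
#!/bin/sh
# Sketch only: print the client config stanza for per-extension auto-props.
KEYWORDS="Author Date Id Rev URL"

auto_props_stanza() {
    # Arguments: file extensions that should get keyword expansion.
    printf '[miscellany]\nenable-auto-props = yes\n\n[auto-props]\n'
    for ext in "$@"; do
        printf '*.%s = svn:keywords=%s\n' "$ext" "$KEYWORDS"
    done
}

# Example: auto_props_stanza pl pm c
# Each developer still has to merge the output into their own
# ~/.subversion/config by hand.
```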

I don't think that there's a server side way of doing this, short of
running some script via a hook around commit time.

g.


From hartzell at alerce.com  Thu Jun 28 08:54:40 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:54:40 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN
	and	...Re:	Perltidy]
In-Reply-To: <F2858007-63BC-4E72-B5BD-5420BE39E6D2@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
	<F2858007-63BC-4E72-B5BD-5420BE39E6D2@gmx.net>
Message-ID: <18051.44944.982207.37624@almost.alerce.com>

Hilmar Lapp writes:
 > [...]
 > IMHO, there's two advantages that svn has over cvs. First,  
 > directories are versioned, have properties, and generally are the  
 > same class of citizens as files. They can be added, renamed, and  
 > removed from the repository. In cvs, we all know what a hassle it is  
 > to rename or even retire directories. Second, svn log gives you the  
 > commits, i.e., the set of changes that constituted one particular  
 > commit (and therefore version increase). In cvs that's hard or  
 > impossible to reconstruct.

Two more:

  - svn groups changes into revisions, so that they can be considered
    together, CVS versions individual files.
  - subversion tracks renames/moves correctly,
  - subversion commits are atomic, so you never have to worry about
    all of your stuff making it into the repos. at the same time [if
    you've never had to un-muck this, count yourself blessed!] ,
  - svk, which allows disconnected development while still committing
    your work to a repo at natural points along the way (you can
    revert, branch, etc.... to your heart's content).

[yeah, that's 3, err, 4. Math is hard.]

g.


From cjfields at uiuc.edu  Thu Jun 28 09:07:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 08:07:24 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
	<23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>
Message-ID: <01812F01-9409-49FB-9061-330FA52177C1@uiuc.edu>


On Jun 28, 2007, at 5:31 AM, Hilmar Lapp wrote:

>
> On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote:
>
>> ...It
>> seems like we really need to do this first so that we have a stable
>> release that can be followed by CVS -> SVN migration, then consider
>> major changes to the repository structure and release packaging, and
>> potential deprecation and incorporation of other modules.
>
> I agree we need to discuss a path towards 1.6, but I think that
> should be kept separate from the cvs->svn migration. Otherwise one
> stalls the other (by stopping people who seem to have the energy and
> motivation right now to do one but not the other) for no really good
> reason.

It's good to discuss it as long as it doesn't take time and energy  
away from other priorities.

>> I assume there is no chance that we'd have a 1.6 candidate by BOSC
>> next month?
>
> I'm not sure that's feasible, but if someone steps up, maybe it is.

Maybe a 1.5.3 and (if we work hard on it) a 1.6 soon after.  Then  
maybe work on partitioning if everyone's up for it and a scheme is  
worked out.

>> Will it be productive to schedule a fair amount of time at BOSC
>> discussing how to partition out the packages into separate sub-
>> packages after we've done a successful release rather than trying to
>> change things right now?
>
> I agree. I also don't think that people are partitioning right now
> (other than the existing partitioning), though maybe I'm mistaken.

The original proposal was based on Steve's idea of splitting up  
core.  I don't think a partition is feasible at this point, at least  
until we put more thought into it  (our energy should be focused  
elsewhere), but it's well worth discussing as a future path.

At this time there are two proposals:

1)  Steve's and my 'split into discrete sections' proposal, where we  
split core into self-sustaining sections with a common core listed as  
a dependency, tying installation of all together with a Bundle or  
similar.

2)  Sendu's 'break everything up' approach where all modules are  
submitted independently to CPAN, with their own tests, dependencies,  
etc.

There are advantages and disadvantages to both approaches.  Not sure  
if CPAN would go for the latter (it's pretty drastic), but I don't  
know for sure.  If you want in on that discussion (in this thread)  
feel free to join in!  The more the merrier!

>> [...]
>> It would  probably mean moving Bio::Graphics, Bio::DB::GFF and
>> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages
>> so they could be released more regularly on par with Gbrowse
>> schedules.
>
> Possibly. I'm not fully sure why those modules couldn't also be
> released more often out of the "main trunk" of modules. In Java/ant,
> it'd be relatively easy to write build script filters that select the
> appropriate modules and package them on the fly. I'm not sure whether
> the build tools for Perl can do that too, though.

Both approaches above would probably use Module::Build to install  
other bioperl dependencies, each of which could have its own  
dependency set, possibly using a Bundle to tie everything together.

>>   Also I think someone needs to figure out Bio::Tools::GFF
>> vs Bio::FeatureIO -- what do we want to do?
>
> I believe FeatureIO has the ontology download tied into it?
>
> 	-hilmar

 From recent posts here and on the gbrowse mail list by Scott and  
Lincoln, it seemed like they were moving away from using Bio::DB::GFF  
and were trying to get users to switch to Bio::DB::SeqFeature.  Maybe  
we should get a more direct response?

chris


From hartzell at alerce.com  Thu Jun 28 09:16:18 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 09:16:18 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683A7D1.8070403@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
Message-ID: <18051.46242.942184.758493@almost.alerce.com>

Sendu Bala writes:
 > George Hartzell wrote:
 > > Chris Fields writes:
 > >  > [...]
 > >  > It looks like George Hartzell may be taking a crack at it, with  
 > >  > Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
 > >  > could have something testable relatively soon.  After that we'll need  
 > >  > to work out a few other issues, basically what's on Hilmar's list.
 > > 
 > > There's a repository on file:///home/hartzell/bioperl with all of the
 > > components projects in place.
 > > 
 > > If you have a dev.open-bio.org account and you're in the bioperl
 > > group, you're good to get at it via:
 > > 
 > >   file:///home/hartzell/bioperl
 > 
 > I'm confused. Presumably that only works whilst logged into 
 > dev.open-bio.org?

Yes, that only works if you're actually on the machine.

 > >   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
 > 
 > I just tried:
 > 
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
 > 
 > on Mac OS X and things seemed to go well, except for this error message 
 > at the end:
 > 
 > 
 > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
 > svn: Can't move source to dest
 > svn: Can't move 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
 > to 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
 > No such file or directory
 > 
 > I also ended up with only:
 > bioperl-corba-server    bioperl-db              bioperl-live 
 > bioperl-network         bioperl-papers          biosql-schema
 > 
 > 
 > Am I doing something totally wrong here?

It looks like you tried to check out the *entire* repository.  It
never occurred to me to try that.  I'll take a look at what you
reported.

g.


From bix at sendu.me.uk  Thu Jun 28 09:20:19 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 14:20:19 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18051.46242.942184.758493@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.46242.942184.758493@almost.alerce.com>
Message-ID: <4683B593.3050108@sendu.me.uk>

George Hartzell wrote:
> Sendu Bala writes:
>> I just tried:
>> 
>> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
[snip]
> It looks like you tried to check out the *entire* repository.

Yes. If you don't want everything, how does one 'browse' the repository
to find out the address of the thing you /do/ want?
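
For the record, 'svn ls' (as used in the listing posted earlier in this
thread) is one way to browse. A sketch, assuming each subproject has the
usual trunk/tags/branches layout; only the tags directory is actually
confirmed by the error message above, and project_url is a made-up helper.

```shell
#!/bin/sh
# Sketch only: build the URL of one subproject instead of the whole repo.
REPO="svn+ssh://dev.open-bio.org/home/hartzell/bioperl"

project_url() {
    # $1 = subproject, $2 = path inside it (default: trunk, an assumption)
    printf '%s/%s/%s\n' "$REPO" "$1" "${2:-trunk}"
}

# Browse first, then check out only what is needed:
#   svn ls "$REPO"
#   svn ls "$REPO/bioperl-live"
#   svn co "$(project_url bioperl-live)" bioperl-live
```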


> It never occurred to me to try that.  I'll take a look at what you 
> reported.

Cheers.


From bix at sendu.me.uk  Thu Jun 28 09:27:29 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 14:27:29 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18049.22260.967524.353173@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
Message-ID: <4683B741.5020600@sendu.me.uk>

George Hartzell wrote:
> There don't seem to be any .cvsignore files in the repository, or in
> CVSROOT/cvsignore.
> 
> Am I missing something, or don't we use them?

It would be great to have the following files svn:ignored :

In all package roots:
? Build
? MANIFEST
? MANIFEST.SKIP
? META.yml
? _build
? bioperl-*.tar.bz2
? bioperl-*.tar.gz
? bioperl-*.zip
? blib
? cover_db

In any and all directories:
? .DS_Store
? .DAV

In bioperl-live:
? t/BioDBSeqFeature.t
? t/BioDBSeqFeature_BDB.t
? t/BioDBSeqFeature_mysql.t


Can't think of anything else right now.
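
A minimal sketch of how the package-root patterns listed above could be applied once the code lives in Subversion. `svn propset svn:ignore -F FILE DIR` is the standard way to set the property from a file; the function names, the pattern-file argument and the dry-run default are assumptions made for illustration.

```shell
# Write the package-root ignore patterns from the list above to a file.
write_ignore_patterns() {
    cat > "$1" <<'EOF'
Build
MANIFEST
MANIFEST.SKIP
META.yml
_build
bioperl-*.tar.bz2
bioperl-*.tar.gz
bioperl-*.zip
blib
cover_db
EOF
}

# usage: apply_ignores PATTERN_FILE DIR
# Prints the svn command unless DRY_RUN=0 is set (dry run is the default).
apply_ignores() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "svn propset svn:ignore -F $1 $2"
    else
        svn propset svn:ignore -F "$1" "$2"
    fi
}
```

Because svn:ignore is set per directory, the "any and all directories" patterns (.DS_Store, .DAV) are probably better handled through the global-ignores option in each user's Subversion config than through this property.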

Thanks for your efforts,
Sendu.


From cjfields at uiuc.edu  Thu Jun 28 09:30:43 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 08:30:43 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683A7D1.8070403@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
Message-ID: <A2B0A715-BEF7-4632-91B3-1A215FBFE3D5@uiuc.edu>


On Jun 28, 2007, at 7:21 AM, Sendu Bala wrote:

>> ...
>>   file:///home/hartzell/bioperl
>
> I'm confused. Presumably that only works whilst logged into
> dev.open-bio.org?

Yes, it's just a tester.

>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>
> I just tried:
>
> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl

Try

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk /mybiodir

to check out the main trunk for core.

chris


From hartzell at alerce.com  Thu Jun 28 09:57:00 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 09:57:00 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683A7D1.8070403@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
Message-ID: <18051.48684.996884.134046@almost.alerce.com>

Sendu Bala writes:
 > [...]
 > I just tried:
 > 
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
 > 
 > on Mac OS X and things seemed to go well, except for this error message 
 > at the end:
 > 
 > 
 > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
 > svn: Can't move source to dest
 > svn: Can't move 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
 > to 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
 > No such file or directory
 > 
 > I also ended up with only:
 > bioperl-corba-server    bioperl-db              bioperl-live 
 > bioperl-network         bioperl-papers          biosql-schema
 > 
 > 
 > Am I doing something totally wrong here?

So, you probably wanted something like

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk

to pick up the head of the bioperl live tree (or
/.../bioperl-run/trunk, etc...).
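
The per-package layout described above can be sketched as a small helper that builds the trunk URL for a package. The repository root is the test location used in this thread; the helper name is hypothetical, and the loop only prints the checkout commands rather than running them.

```shell
# Repository root as given in this thread (a test location, likely to move).
REPO_ROOT='svn+ssh://dev.open-bio.org/home/hartzell/bioperl'

# Hypothetical helper: print the trunk URL for one package.
trunk_url() {
    printf '%s/%s/trunk\n' "$REPO_ROOT" "$1"
}

# Print the checkout commands instead of executing them.
for pkg in bioperl-live bioperl-run; do
    printf 'svn co %s %s\n' "$(trunk_url "$pkg")" "$pkg"
done
```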

I just checked out

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/

and it ran to completion and gave me 

   (delicious)[6:50am]~/tmp>>ls bioperl | cat
   biodata
   bioperl-cookbook
   bioperl-corba-client
   bioperl-corba-server
   bioperl-das-client
   bioperl-db
   bioperl-ext
   bioperl-gui
   bioperl-live
   bioperl-microarray
   bioperl-network
   bioperl-papers
   bioperl-pedigree
   bioperl-pipeline
   bioperl-run
   biosql-schema
   html
   task-manager
   xml-html

Could another Mac OS X user out there give the Great Big Checkout a try
and see if it runs to completion?  Potential problems that come to
mind are:

  - the "Macs are case insensitive, sort of" problem
  - you filled up your disk
  - something else.

g.


From charles-listes+bioperl at plessy.org  Thu Jun 28 09:44:56 2007
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Thu, 28 Jun 2007 22:44:56 +0900
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <46837687.7010101@sheffield.ac.uk>
References: <20070628074004.GD6338@kunpuu.plessy.org>
	<46837687.7010101@sheffield.ac.uk>
Message-ID: <20070628134456.GB14492@kunpuu.plessy.org>

On Thu, Jun 28, 2007 at 09:51:19AM +0100, Nathan S. Haigh wrote:
> 
> Version 1.5.* is the developer release, while the 1.4.* is the stable
> release. However, there have been few updates to the 1.4.* release which
> means that it is more unstable than the 1.5.* dev release. I think the
> consensus, was to have more rapid release cycles of the stable branch in
> future in order to avoid this. I'm sure there are others more qualified
> to expand/correct me on this if needs be.

Ok, thank you all for the answers. I think that I will simply upgrade
bioperl to 1.5.2 in Debian testing, and maybe rename it bioperl-core
when I package other components.

Have a nice day,

-- 
Charles Plessy
Debian-Med packaging team
Wako, Saitama, Japan


From bix at sendu.me.uk  Thu Jun 28 10:19:49 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 15:19:49 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18051.48684.996884.134046@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
Message-ID: <4683C385.3050904@sendu.me.uk>

George Hartzell wrote:
> Sendu Bala writes:
>  > [...]
>  > I just tried:
>  > 
>  > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>  > 
>  > on Mac OS X and things seemed to go well, except for this error message 
>  > at the end:
>  > 
>  > 
>  > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
>  > svn: Can't move source to dest
>  > svn: Can't move 
>  > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
>  > to 
>  > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
>  > No such file or directory
>  > 
>  > I also ended up with only:
>  > bioperl-corba-server    bioperl-db              bioperl-live 
>  > bioperl-network         bioperl-papers          biosql-schema

I tried again in the same location and it told me I had to 'svn 
cleanup', which I did. But subsequently it kept complaining about files 
already being there.


> I just checked out
> 
>   svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/
> 
> and it ran to completion
[snip]
> Can another mac os x user out there give the Great Big Checkout a try
> and see if it runs to completion.  Potential problems that come to
> mind are:
> 
>   - the "mac's are case insensitive, sort of" problem
>   - you filled up your disk
>   - something else.

Well, I didn't run out of disc space. After a rm -fr * and trying again 
it failed at exactly the same point, in the same way.

svn co 
svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data

causes this repeatable problem:

[...]
A    data/phredfile.phd
svn: In directory 'data'
svn: Can't move source to dest
svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 
'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory

That is with Mac OS X svn command-line client, version 1.4.4

I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with 
a linux svn command-line client, version 1.2.3.
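
One way to test the case-insensitivity hypothesis would be to look for names in the directory listing that differ only in case, since those would collide on a default Mac OS X filesystem but not on Linux. This is only a sketch: the function name is made up, and whether the release-0-9-2 tag really contains such a pair has not been verified here.

```shell
# Read file names on stdin and print (lowercased) any name that occurs
# more than once when case is ignored.
find_case_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Usage (needs repository access):
#   svn ls svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | find_case_collisions
```

An empty result would rule out a simple case clash in that directory and point toward "something else".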


Cheers,
Sendu.


From dmessina at wustl.edu  Thu Jun 28 11:08:59 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 10:08:59 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18051.44281.831316.749586@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
	<18051.44281.831316.749586@almost.alerce.com>
Message-ID: <F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>

> [George]
> Likewise, you probably DON'T want to use this in your config file:
>
> 	  enable-auto-props = yes
> 	  * = svn:keywords="Author Date Id Rev URL"
>
> since it'll do the same thing.

Ah, so I've been doing it wrong all along then. :) Thanks, George!


> The Right Thing To Do is a more tedious
>
> 	  *.pl = svn:keywords="Author Date Id Rev URL"
> 	  *.pm = svn:keywords="Author Date Id Rev URL"
>   	  *.c = svn:keywords="Author Date Id Rev URL"
>
> A bit of googling will give you a good starting point for the list,
> and we should probably maintain a common one somewhere in the repo.


I've googled around and gathered the following as a possible list for  
our repo. Since I obviously don't know what I'm doing :), of course  
adjust and refine as necessary.

Dave

-------
[auto-props]
# Code formats
*.c          = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cpp        = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.h          = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.java       = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.as         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cgi        = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.js         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/javascript
*.php        = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-php
*.pl         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl; svn:executable
*.pm         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl
*.py         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-python; svn:executable
*.sh         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-sh; svn:executable

# Image formats
*.bmp        = svn:mime-type=image/bmp
*.gif        = svn:mime-type=image/gif
*.ico        = svn:mime-type=image/ico
*.jpeg       = svn:mime-type=image/jpeg
*.jpg        = svn:mime-type=image/jpeg
*.png        = svn:mime-type=image/png
*.tif        = svn:mime-type=image/tiff
*.tiff       = svn:mime-type=image/tiff

# Data formats
*.pdf        = svn:mime-type=application/pdf
*.avi        = svn:mime-type=video/avi
*.doc        = svn:mime-type=application/msword
*.eps        = svn:mime-type=application/postscript
*.gz         = svn:mime-type=application/gzip
*.mov        = svn:mime-type=video/quicktime
*.mp3        = svn:mime-type=audio/mpeg
*.ppt        = svn:mime-type=application/vnd.ms-powerpoint
*.ps         = svn:mime-type=application/postscript
*.psd        = svn:mime-type=application/photoshop
*.rtf        = svn:mime-type=text/rtf
*.swf        = svn:mime-type=application/x-shockwave-flash
*.tgz        = svn:mime-type=application/gzip
*.wav        = svn:mime-type=audio/wav
*.xls        = svn:mime-type=application/vnd.ms-excel
*.zip        = svn:mime-type=application/zip

# Text formats
.htaccess    = svn:mime-type=text/plain
*.css        = svn:mime-type=text/css
*.dtd        = svn:mime-type=text/xml
*.html       = svn:mime-type=text/html
*.ini        = svn:mime-type=text/plain
*.sql        = svn:mime-type=text/x-sql
*.txt        = svn:mime-type=text/plain
*.xhtml      = svn:mime-type=text/xhtml+xml
*.xml        = svn:mime-type=text/xml
*.xsd        = svn:mime-type=text/xml
*.xsl        = svn:mime-type=text/xml
*.xslt       = svn:mime-type=text/xml
*.xul        = svn:mime-type=text/xul
*.yml        = svn:mime-type=text/plain
CHANGES      = svn:mime-type=text/plain
COPYING      = svn:mime-type=text/plain
INSTALL      = svn:mime-type=text/plain
Makefile*    = svn:mime-type=text/plain
README       = svn:mime-type=text/plain
TODO         = svn:mime-type=text/plain
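
Auto-props only take effect when a file is first added or imported, so files already in the repository would need their properties set explicitly. A minimal sketch, assuming only Perl files are of interest; the function name is hypothetical and the commands are printed for review rather than executed.

```shell
# Print one `svn propset` command per existing .pl/.pm file under a directory.
# Paths containing spaces would need quoting before the output is run.
print_keyword_propsets() {
    find "$1" -type f \( -name '*.pl' -o -name '*.pm' \) | sort |
    while IFS= read -r f; do
        printf 'svn propset svn:keywords "Author Date Id Rev URL" %s\n' "$f"
    done
}

# Usage: print_keyword_propsets Bio > propsets.sh   (review, then run with sh)
```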


From dmessina at wustl.edu  Thu Jun 28 11:11:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 10:11:23 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.46242.942184.758493@almost.alerce.com>
	<4683B593.3050108@sendu.me.uk>
Message-ID: <F55A8B8A-B7B8-4354-85B7-E459B3679E41@wustl.edu>

> [Sendu]
>
> Yes. If you don't want everything, how does one 'browse' the  
> repository
> to find out the address of the thing you /do/ want?

svn ls file:///home/hartzell/bioperl   (while logged in to dev.open-bio.org)

or

svn ls svn+ssh://dev.open-bio.org/home/hartzell/bioperl


From n.haigh at sheffield.ac.uk  Thu Jun 28 11:13:58 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 16:13:58 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>	<18051.46242.942184.758493@almost.alerce.com>
	<4683B593.3050108@sendu.me.uk>
Message-ID: <4683D036.5060109@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> George Hartzell wrote:
>> Sendu Bala writes:
>>> I just tried:
>>>
>>> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
> [snip]
>> It looks like you tried to check out the *entire* repository.
> 
> Yes. If you don't want everything, how does one 'browse' the repository
> to find out the address of the thing you /do/ want?
> 

You could try:
svn ls

or

svn ls -R

to get a list of directories.

> 
>> It never occurred to me to try that.  I'll take a look at what you 
>> reported.
> 
> Cheers.
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg9A2czuW2jkwy2gRAgirAKCnMAg6a7W7RM22O2rOi4vD5w3HPwCePsku
akLhIszoQbRc/aVX3d/Jp7w=
=mlHY
-----END PGP SIGNATURE-----


From cjfields at uiuc.edu  Thu Jun 28 11:20:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 10:20:46 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683C385.3050904@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
Message-ID: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu>

I can replicate the same problem (Mac OS X) with a full checkout:

svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
svn: Can't move source to dest
svn: Can't move
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base'
to
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base':
No such file or directory

What local (mac) svn version are you using?  I'm running off macports:

svn --version
svn, version 1.4.4 (r25188)
    compiled Jun 16 2007, 23:40:53

chris

On Jun 28, 2007, at 9:19 AM, Sendu Bala wrote:
...

> I tried again in the same location and it told me I had to 'svn
> cleanup', which I did. But subsequently it kept complaining about  
> files
> already being there.
>>
> [snip]
>> Can another mac os x user out there give the Great Big Checkout a try
>> and see if it runs to completion.  Potential problems that come to
>> mind are:
>>
>>   - the "mac's are case insensitive, sort of" problem
>>   - you filled up your disk
>>   - something else.
>
> Well, I didn't run out of disc space. After a rm -fr * and trying  
> again
> it failed at exactly the same point, in the same way.
>
> svn co
> svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
>
> causes this repeatable problem:
>
> [...]
> A    data/phredfile.phd
> svn: In directory 'data'
> svn: Can't move source to dest
> svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to
> 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or  
> directory
>
> That is with Mac OS X svn command-line client, version 1.4.4
>
> I can get bioperl-live/tags/release-0-9-2/t/data to check out fine  
> with
> a linux svn command-line client, version 1.2.3.
>
>
> Cheers,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Jun 28 11:37:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 10:37:27 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683624F.6020402@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
Message-ID: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>

On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> ...
>
> The short and sweet version: my proposal has all the benefits of  
> yours, but none of the disadvantages. What's not to like?

The short and sweet version: I'm more convinced after you laid out  
your argument in detail, which would have saved me some typing last  
night, BTW, thanks! ; >

The other core devs need to chip in and we need to openly (candidly)  
discuss it some more (I've added Hilmar to this).  There is also a  
tenable solution that allows both aspects ('cliques' and single mode)  
which might make everybody happy.

Let's say we only want to install Bio::SeqIO::genbank.  The
Bio::SeqIO::genbank Build.PL would only install what was needed (as
you indicated), only Bio::SeqIO::genbank-related tests would run
(along with dependency test, if available), and life would go on.
However, what if we wanted to install everything in
SeqIO/DB/AlignIO/etc?

We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO
modules installed or a select few (maybe a quick 'install all (y/n)?'
followed by a list, which installs them one at a time along with
dependencies), or have the option to specifically denote them as
passed args to SeqIO's Build.PL, something like
'perl Build.PL -install-plugins genbank embl swiss',
'perl Build.PL -install-plugins all', etc.  If a specific module
(Bio::SeqIO::genbank) is installed directly then maybe the
installation q&a's of followed modules could be bypassed when
installing down the dependency tree with additional passed args.

This would, in effect, be a bioperl-specific mini-CPAN within CPAN.   
Nice!

Now, this doesn't address several related issues, such as how we  
handle versioning of the independent modules (should be in a  
controlled manner), what we do about deprecated modules which linger  
about on CPAN, how we deal with PPMs/RPMs/packaging, and so on.  All  
have possible reasonable ways they can be addressed, I believe.   
Also, I think we should still think about doing regular full-scale  
'stable' (1.#) releases (sort of our stamp of approval for that batch  
of modules at that point in time, with a reasonable 'sell-by' date).

Again, it should be seriously discussed among the core devs and the  
bioperl community at large prior to any serious work on it, and it  
would be quite a large-scale project, but possibly worth it.  It can  
only go forward if there is enough momentum behind it.

>> Finally, all of this should wait until later.  Much later, like  
>> after  a decent release, after svn, etc kind of 'later'.  I think  
>> we can  agree on that.
>
> Hmm, not really. If it can be implemented by a change in just  
> Build.PL and ModuleBuildBioperl, it's really independent of  
> everything else. That's the beauty of it: the only thing that  
> changes is how things are uploaded to and downloaded from CPAN. The  
> only person that normally deals with that issue is the pumpkin for  
> a release, and he only cares about it at release time.
>
> In fact, if we're going to do it at all it makes sense to try it  
> out on a minor release like 1.5.3. We've already got experience of  
> doing it split-style from 1.5.2. (And let me tell you: splits at  
> the code-base level suck.)

BOSC is coming up, and I would like to focus on getting svn migration  
taken care of ASAP (which is sounding more and more like we plan on  
moving all open-bio over, unless I misread Jason's post?) and  
stomping of bugs (my next priority after EUtilities).  Maybe in the  
interim we should try focusing on bug squashing, get out a quick  
standard dev release (1.5.3) before BOSC, and then a few of us could  
all communicate there via email/text/IM/phone off-list?  Maybe post  
updates via the bioperl blog and list?

> And where is the harm in letting them do it via CPAN as well? In  
> fact, there are significant benefits:
...

I'm already pretty convinced...

> The same can be achieved with CPAN bundles for each kind of  
> functional grouping you can think of. And since it's just a single  
> text file that defines such a grouping, it's easy to change or add  
> new ones as you feel like it, as opposed to the rather more  
> permanent and substantial effort of creating one of your splits on  
> the code-base level.

... or it could be run right in Module::Build for specific parent  
classes (as I mention above).  Bundling could be instituted for  
something like a standard GBrowse release (Bundle::BioPerl::GBrowse)  
where the functionality might be more spread out (Bio::DB*,  
Bio::Graphics, Bio::FeatureIO, etc).  For a full-scale old-style core  
install, another Bundle (Bundle::BioPerl::Standard).

...

> Yes, it would be automated, and no, it wouldn't at all be any kind  
> of additional headache. I'm proposing a fully-automated system that  
> the pumpkin wouldn't even have to think about. Much /less/ of a  
> headache than dealing with splits. Orders of magnitude easier to  
> deal with.

The 'headache' would be the initial setup (splitting test, individual  
Build.PL, etc), but this could be done stepwise or section-wise, I  
suppose.
...

> And the smallest, most concentrated set of modules is the  
> individual module.

Well, only if it runs correctly (i.e. has the entire dep. tree  
installed).  But the 'follow' tests would handle that.

> The reason some of these existing splits (microarray, ext) have  
> fallen by the way-side? /Because/ they're splits. If they had been  
> part of bioperl-live all along, they'd have been kept in a working,  
> compatible state and would have been released along with everything  
> else in 1.5.2

Microarray fell out of favor for other reasons (much faster ways to  
do the same thing via R), though I think it still could be salvaged  
if someone wanted to take it up.

The other bioperl distros (network, db, run, etc.) would also need to  
follow the same path as core, but I guess they could be bundled as  
well.

> ...
> No headaches.

I already have one, sorry!

chris


From n.haigh at sheffield.ac.uk  Thu Jun 28 11:53:52 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 16:53:52 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
Message-ID: <4683D990.8090909@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote:
> 
>> Chris Fields wrote:
>>> ...
>>
>> The short and sweet version: my proposal has all the benefits of
>> yours, but none of the disadvantages. What's not to like?
> 
> The short and sweet version: I'm more convinced after you laid out your
> argument in detail, which would have saved me some typing last night,
> BTW, thanks! ; >
> 
> The other core devs need to chip in and we need to openly (candidly)
> discuss it some more (I've added Hilmar to this).  There is also a
> tenable solution that allows both aspects ('cliques' and single mode)
> which might make everybody happy.

Couldn't "cliques" simply be satisfied with CPAN Bundles?

> 
> Let's say we only want to install Bio::SeqIO::genbank.  The
> Bio::SeqIO::genbank Build.PL would only install what was needed (as you
> indicated), only Bio::SeqIO::genbank-related tests would run (along with
> dependency test, if available), and life would go on.  However, what if
> we wanted to install everything in SeqIO/DB/AlignIO/etc?

I think this might be where Bundles come in for installing these
"cliques" of related modules?

- -- snip --

> 
>> Yes, it would be automated, and no, it wouldn't at all be any kind of
>> additional headache. I'm proposing a fully-automated system that the
>> pumpkin wouldn't even have to think about. Much /less/ of a
>> headache than dealing with splits. Orders of magnitude easier to deal
>> with.
> 
> The 'headache' would be the initial setup (splitting test, individual
> Build.PL, etc), but this could be done stepwise or section-wise, I suppose.

Yes, I think this is where most of the labour will be. However, setting
the test suite up like this would be beneficial with or without
publishing modules individually.

- -- snip --
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg9mQczuW2jkwy2gRAlfBAKCFP7XUvWXsjycSv0MVGN3Ru40D/wCcDiDg
UKE/Q/wA3gu1Gb7S6rarCQw=
=WQdY
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Thu Jun 28 12:03:54 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 17:03:54 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
Message-ID: <4683DBEA.90005@sendu.me.uk>

Chris Fields wrote:
> On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote:
> Let's say we only want to install Bio::SeqIO::genbank.  The 
> Bio::SeqIO::genbank Build.PL would only install what was needed (as you 
> indicated), only Bio::SeqIO::genbank-related tests would run (along with 
> dependency test, if available), and life would go on.  However, what if 
> we wanted to install everything in SeqIO/DB/AlignIO/etc?
> 
> We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO 
> modules installed or a select few (maybe a quick 'install all (y/n)?' 
> followed by a list, which installs them one at a time along with 
> dependencies), or have the option to specifically denote them as passed 
> args to SeqIO's Build.PL, something like 'perl Build.PL -install-plugins 
> genbank embl swiss', 'perl Build.PL -install-plugins all', etc.  If a 
> specific module (Bio::SeqIO::genbank) is installed directly then maybe 
> the installation q&a's of followed modules could be bypassed when 
> installing down the dependency tree with additional passed args.

I'd probably stay away from something like this. My primary reason is 
that, off the top of my head, I don't see how to get it to work. If 
you're installing Bio::SeqIO for the first time via CPAN you can't ask 
it to install Bio::SeqIO::genbank et al. at the same time because 
Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some circularity.
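
The circularity can be illustrated with the standard `tsort` utility, which takes "A B" pairs meaning "A must be installed before B" and complains on stderr when the pairs contain a loop. This is a sketch of the argument only, not of how CPAN resolves prerequisites; the function name is made up.

```shell
# Succeeds (exit 0) if the "before after" pairs on stdin contain a cycle.
# Relies on tsort writing a diagnostic to stderr when it finds a loop.
has_cycle() {
    [ -n "$(tsort 2>&1 >/dev/null)" ]
}

# genbank requires SeqIO, and SeqIO's installer would pull in genbank:
printf '%s\n' \
    'Bio::SeqIO Bio::SeqIO::genbank' \
    'Bio::SeqIO::genbank Bio::SeqIO' | has_cycle && echo 'circular dependency'
```

With only the first pair (the plugin requires its parent, and nothing more) no cycle is reported, which is the arrangement a separate bundle would keep.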

I also wouldn't want these things to be complicated. There should be 
little in the way of questions to ask during install. Each module's 
Build.PL should be ultra-simple with no advanced logic at all. It should 
just specify things that are absolute requirements. This simplicity 
helps avoid some of the problems we face by distributing the monolithic 
Bioperl.

No, much better for us and for users to provide a Bundle::Bio-SeqIO.


> Now, this doesn't address several related issues, such as how we handle 
> versioning of the independent modules (should be in a controlled 
> manner),

When a module is changed, it gets a version bump. Nothing complicated 
needs to be done. Transparent and obvious, behaving like all other CPAN 
modules would be my choice.


> what we do about deprecated modules which linger about on CPAN,

Deleting them from CPAN seems appropriate.


> how we deal with PPMs/RPMs/packaging, and so on.  All have possible 
> reasonable ways they can be addressed, I believe.  Also, I think we 
> should still think about doing regular full-scale 'stable' (1.#) 
> releases (sort of our stamp of approval for that batch of modules at 
> that point in time, with a reasonable 'sell-by' date).

Yes, we can still choose to take a snapshot and announce it to the 
world, but at the module-level nothing special would happen. There would 
just be an updated Bundle::Bioperl-everything (or whatever).


> Again, it should be seriously discussed among the core devs and the 
> bioperl community at large prior to any serious work on it, and it would 
> be quite a large-scale project, but possibly worth it.  It can only go 
> forward if there is enough momentum behind it.

The requirement for this approach is per-module test scripts, which, as 
I identified already, are very desirable anyway so we can hit 100% test 
coverage.

So, regardless of anything else can we all agree that per-module test 
scripts are a good idea and should be worked on? If so, I'll look into 
the feasibility and figure out how much work will be involved.


From cjfields at uiuc.edu  Thu Jun 28 13:17:50 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 12:17:50 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683DBEA.90005@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
Message-ID: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>


On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote:

> ...
> I'd probably stay away from something like this. My primary reason  
> being, off-the-top-of-my-head I don't see how to get it to work. If  
> you're installing Bio::SeqIO for the first time via CPAN you can't  
> ask it to install Bio::SeqIO::genbank et al. at the same time  
> because Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some  
> circularity.

True...

> I also wouldn't want these things to be complicated. There should  
> be little in the way of questions to ask during install. Each  
> module's Build.PL should be ultra-simple with no advanced logic at  
> all. It should just specify things that are absolute requirements.  
> This simplicity helps avoid some of the problems we face by  
> distributing the monolithic Bioperl.
>
> No, much better for us and for users to provide a Bundle::Bio-SeqIO.

I just don't want too much Bundle-itis as it'll get confusing for  
newbies (i.e. Vista-itis, or AdobeCS-itis).  It should be limited to  
functional grouping (SeqIO, AlignIO, DB, etc), 'install everything',  
or distribution-specific (GBrowse).

I also think (though Hilmar may veto this) that we should work on  
integrating bioperl-db, network, etc. into this if it goes forward.

Here's a question: how do we plan on handling uploading bioperl  
updates to CPAN via PAUSE?  Do we want to run every single module  
through one pumpkin?  Or do we want to have a core dev group PAUSE  
account?  I can see, for instance, removing everything EUtilities- 
related and submitting it independently using my own PAUSE account,  
but it would be nice to have it under an umbrella 'bioperl-devs'  
account instead.

>> Now, this doesn't address several related issues, such as how we  
>> handle versioning of the independent modules (should be in a  
>> controlled manner),
>
> When a module is changed, it gets a version bump. Nothing  
> complicated needs to be done. Transparent and obvious, behaving  
> like all other CPAN modules would be my choice.
>
>> what we do about deprecated modules which linger about on CPAN,
>
> Delete them from CPAN seems appropriate.

I know you can do that via PAUSE, but I think it lingers about on  
search.cpan.org (unless that's been fixed).  This would prob. have to  
be used sparingly.

>> how we deal with PPMs/RPMs/packaging, and so on.  All have  
>> possible reasonable ways they can be addressed, I believe.  Also,  
>> I think we should still think about doing regular full-scale  
>> 'stable' (1.#) releases (sort of our stamp of approval for that  
>> batch of modules at that point in time, with a reasonable  
>> 'sell-by' date).
>
> Yes, we can still choose to take a snapshot and announce it to the  
> world, but at the module-level nothing special would happen. There  
> would just be an updated Bundle::Bioperl-everything (or whatever).

Right, it would basically be a stamp of certification.

>> Again, it should be seriously discussed among the core devs and  
>> the bioperl community at large prior to any serious work on it,  
>> and it would be quite a large-scale project, but possibly worth  
>> it.  It can only go forward if there is enough momentum behind it.
>
> The requirement for this approach is per-module test scripts. Which  
> as I identified already, is very desirable anyway so we can hit  
> 100% test coverage.
>
> So, regardless of anything else can we all agree that per-module  
> test scripts are a good idea and should be worked on? If so, I'll  
> look into the feasibility and figure out how much work will be  
> involved.

I think so, but the feasibility issue is critical.  Do we want  
cvs/svn to be divided up into 900 subdirectories (one for each module),  
or do we want to have a similar directory structure as we have now,  
but with each module in its own directory?  Or leave everything as  
is and generate Build.PL on-the-fly (prob. least feasible)?

This is where it might be wise to do it piece-meal at first (maybe  
starting with something somewhat segregated like Bio::Tools), then  
progress from there.

chris


From hartzell at alerce.com  Thu Jun 28 13:38:48 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 13:38:48 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
	<18051.44281.831316.749586@almost.alerce.com>
	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
Message-ID: <18051.61992.627473.323346@almost.alerce.com>

David Messina writes:
 > > [George]
 > > Likewise, you probably DON'T want to use this in your config file:
 > >
 > > 	  enable-auto-props = yes
 > > 	  * = svn:keywords="Author Date Id Rev URL"
 > >
 > > since it'll do the same thing.
 > 
 > Ah, so I've been doing it wrong all along then. :) Thanks, George!

It's not *wrong* if it's never done anything to you that you've
regretted.  The right answer depends on your situation....

 > [...]
 > I've googled around and gathered the following as a possible list for  
 > our repo. Since I obviously don't know what I'm doing :), of course  
 > adjust and refine as necessary.
 > 

That's a great starting point.  Do you have write access to the wiki?
Could you link it off of the instructions for using svn?

g.


From hartzell at alerce.com  Thu Jun 28 14:06:50 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 14:06:50 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683C385.3050904@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
Message-ID: <18051.63674.685297.426813@almost.alerce.com>

Sendu Bala writes:
 > [...]
 > I tried again in the same location and it told me I had to 'svn 
 > cleanup', which I did. But subsequently it kept complaining about files 
 > already being there.

You need to do the cleanup because svn exited gracelessly and you
needed to help it get back on its feet.  The cleanup doesn't remove
the stuff that you did get checked out, so it's still there getting in
the way of your new checkout.
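
A minimal sketch of one way out, assuming the half-finished working copy
holds nothing worth keeping: move it aside, then check out again into a
clean directory. The function name move_aside and the directory name are
made up for illustration.

```shell
# move_aside DIR: rename a partial working copy so a fresh checkout
# can start from an empty directory. Prints the new name.
move_aside() {
    dir=$1
    [ -d "$dir" ] || return 0        # nothing to move
    aside="$dir.partial.$$"          # $$ = this shell's PID, keeps the name unique
    mv "$dir" "$aside" && printf '%s\n' "$aside"
}

# Usage (repository URL left as a placeholder):
#   move_aside data
#   svn co <repository-url> data
```

Moving rather than deleting keeps the old files around in case something
in them turns out to matter.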

 > [...]
 > svn co 
 > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
 > 
 > causes this repeatable problem:
 > 
 > [...]
 > A    data/phredfile.phd
 > svn: In directory 'data'
 > svn: Can't move source to dest
 > svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 
 > 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory
 > 
 > That is with Mac OS X svn command-line client, version 1.4.4
 > 
 > I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with 
 > a linux svn command-line client, version 1.2.3.

I'm not 100% sure what's going on here, but I'm inclined to say "get a
real computer" (and yes, I'm typing this on a mac...).  I have a mac
pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony
the tiger used to say)....

I think that we're having trouble with case sensitivity.  My only
evidence is that I can see where there have been both HUMBETGLOA.FASTA
and HUMBETGLOA.fasta in the tree at various times.  I can't figure out
anything else that's weird about that file.  On the other hand, I
can't see how this would cause the error you're seeing though.
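
One quick way to look for that kind of clash is to list names that become
identical once case is ignored. This is only a sketch: the function name
case_collisions is made up, and it checks a list of names on standard
input, not the repository history.

```shell
# case_collisions: read file names on stdin, print each lowercased name
# that occurs more than once when case is ignored.
case_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Usage, e.g. against a directory listing:
#   ls t/data | case_collisions
#   svn ls <repository-url> | case_collisions
```

An empty result means no two names in that listing differ only by case.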

The experiment would be to grab a usb or firewire disk (or even a
memory stick), partition/format it as case sensitive (or even *unix*)
and try to do

 svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data

into it.  If it works, voila.  If not, I'll keep making stuff up, err,
thinking about it.
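
To confirm which kind of volume a directory sits on before trying the
checkout, a probe along these lines should do. It is a sketch that
assumes you can write to the directory; the function name is made up.

```shell
# is_case_sensitive [DIR]: exit 0 if DIR is on a case-sensitive
# filesystem, 1 if it is case-insensitive, 2 if DIR is not writable.
is_case_sensitive() {
    dir=${1:-.}
    probe="$dir/casetest.$$"
    upper="$dir/CASETEST.$$"
    : > "$probe" || return 2         # create the lowercase probe file
    if [ -e "$upper" ]; then         # same file visible under the other case?
        rm -f "$probe"
        return 1
    fi
    rm -f "$probe"
    return 0
}

# Usage:
#   if is_case_sensitive /Volumes/stick; then echo "case sensitive"; fi
```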

g.


From dmessina at wustl.edu  Thu Jun 28 14:15:32 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 13:15:32 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu>
Message-ID: <459D9BC0-4FBA-4560-80A8-E6243DE9D9CC@wustl.edu>

Same svn error here on the full checkout.


> What local (mac) svn version are you using?  I'm running off macports:
>
> svn --version
> svn, version 1.4.4 (r25188)
>     compiled Jun 16 2007, 23:40:53

I have svn 1.4.3.

% svn --version
svn, version 1.4.3 (r23084)
    compiled Apr  1 2007, 02:47:14

Copyright (C) 2000-2006 CollabNet.
Subversion is open source software, see http://subversion.tigris.org/
This product includes software developed by CollabNet (http://www.Collab.Net/).

The following repository access (RA) modules are available:

* ra_dav : Module for accessing a repository via WebDAV (DeltaV) protocol.
   - handles 'http' scheme
* ra_svn : Module for accessing a repository using the svn network protocol.
   - handles 'svn' scheme
* ra_local : Module for accessing a repository on local disk.
   - handles 'file' scheme


From cjfields at uiuc.edu  Thu Jun 28 14:54:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 13:54:15 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18051.63674.685297.426813@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
Message-ID: <D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>


On Jun 28, 2007, at 1:06 PM, George Hartzell wrote:

> ...
> I'm not 100% sure what's going on here, but I'm inclined to say "get a
> real computer" (and yes, I'm typing this on a mac...).  I have a mac
> pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony
> the tiger used to say)....

Ouch!  Though it could be worse (**coughwindowscough**).

> I think that we're having trouble with case sensitivity.  My only
> evidence is that I can see where there have been both HUMBETGLOA.FASTA
> and HUMBETGLOA.fasta in the tree at various times.  I can't figure out
> anything else that's weird about that file.  On the other hand, I
> can't see how this would cause the error you're seeing though.

Odd that other branches (including the main trunk) work but that one  
doesn't.

> The experiment would be to grab a usb or firewire disk (or even a
> memory stick), partition/format it as case sensitive (or even *unix*)
> and try to do
>
>  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
>
> into it.  If it works, voila.  If not, I'll keep making stuff up, err,
> thinking about it.
>
> g.

I'll have to figure out why I can't get ssh keys to work locally to  
test it out more (I have a usb drive to test with); just don't have  
time at the moment.

chris


From dmessina at wustl.edu  Thu Jun 28 14:47:04 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 13:47:04 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18051.61992.627473.323346@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
	<18051.44281.831316.749586@almost.alerce.com>
	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
Message-ID: <0027C4E0-26B1-41F3-8FD8-EAB5465CA80E@wustl.edu>

> That's a great starting point.  Do you have write access to the wiki?
> Could you link it off of the instructions for using svn?

Done.

http://www.bioperl.org/wiki/Svn_auto-props

linked from:
http://www.bioperl.org/wiki/Using_Subversion (bottom of page)


From bix at sendu.me.uk  Thu Jun 28 15:19:35 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 20:19:35 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
Message-ID: <468409C7.7020102@sendu.me.uk>

Chris Fields wrote:
> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote:
> Here's a question: how do we plan on handling uploading bioperl  
> updates to CPAN via PAUSE?  Do we want to run every single module  
> through one pumpkin?  Or do we want to have a core dev group PAUSE  
> account?  I can see, for instance, removing everything EUtilities- 
> related and submitting it independently using my own PAUSE account,  
> but it would be nice to have it under an umbrella 'bioperl-devs'  
> account instead.

All Bioperl modules (except the Bundle!) are owned by BIOPERLML on 
PAUSE. It's a little awkward since PAUSE is uploader-centric, but see my 
notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release

And everything that wants to consider itself part of Bioperl (and gain 
the benefit of lots of devs looking after it) should certainly have 
BIOPERLML as the primary owner.


> I think so, but the feasibility issue is critical.  Do we want  
> cvs/svn to be divided up into 900 subdirectories (one for each module),  
> or do we want to have a similar directory structure as we have now,  
> but with each module in its own directory?  Or leave everything as  
> is and generate Build.PL on-the-fly (prob. least feasible)?

Very definitely the latter. The key benefit of my approach is that the 
organisation stays as is and that a snapshot of the repository remains a 
single directory of modules in Bio so that people don't have to 
'install' Bioperl, they can still just uncompress the archive (or check 
out the package from svn) and point their PERL5LIB to the root dir of 
the package.
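
In shell terms that "no install" setup might look like the following.
This is a sketch: the function name and the example path are made up,
and it assumes the directory you point at is the one that contains the
Bio/ subdirectory.

```shell
# use_bioperl_from DIR: put DIR at the front of PERL5LIB so that
# 'use Bio::SeqIO' resolves to DIR/Bio/SeqIO.pm without an install step.
use_bioperl_from() {
    root=$1
    # Append the old value only if there was one, to avoid a trailing ':'
    PERL5LIB="$root${PERL5LIB:+:$PERL5LIB}"
    export PERL5LIB
}

# Usage (path is hypothetical):
#   use_bioperl_from "$HOME/src/bioperl-live"
#   perl -MBio::SeqIO -e 'print "found Bio::SeqIO\n"'
```

Putting the checkout first means it takes precedence over any copy of
the same modules already installed elsewhere on the search path.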

For that reason I very much like the idea of folding the current 
split-out packages (run, network etc.) back into the core package so 
everything is in one place. Folding them back in should obviously wait 
until everything is in place and working with core already.


My proposal obviously wasn't very clear. As far as all other devs are 
concerned, nothing changes at all (except for lots of new improved test 
scripts). The pumpkin will, however, be able to say:

./Build dist

Right now that generates the distribution archives (in different 
compression formats) - one big archive containing everything.
My proposal is simply that instead it generates lots of archives, one 
archive per module. It will also generate some Bundles and whatever else 
might be needed.
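
To make the "one archive per module" idea concrete, here is a rough
shell sketch. It is not how './Build dist' works today and is not the
proposed implementation: it only shows the mapping from one .pm file to
one archive. The function name is made up, and a real per-module
distribution would also need its own Build.PL, tests and data.

```shell
# module_archives SRC OUT: write one .tar.gz per .pm file found under
# SRC/Bio into OUT. OUT should be an absolute path.
module_archives() {
    src=$1
    out=$2
    mkdir -p "$out" || return 1
    ( cd "$src" && find Bio -name '*.pm' ) | while IFS= read -r pm; do
        # Bio/SeqIO/genbank.pm -> Bio-SeqIO-genbank
        dist=$(printf '%s\n' "$pm" | sed -e 's/\.pm$//' -e 's|/|-|g')
        tar -czf "$out/$dist.tar.gz" -C "$src" "$pm"
    done
}

# Usage (paths are hypothetical):
#   module_archives "$HOME/src/bioperl-live" "$HOME/dist"
```

The repository layout is untouched; only the packaging step fans out.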

I don't envisage any major difficulties in achieving this. The 
'feasibility' issue I was going to look into was strictly regarding 
doing all the new test scripts.


From hartzell at alerce.com  Thu Jun 28 15:43:38 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 15:43:38 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
Message-ID: <18052.3946.224905.415905@almost.alerce.com>

Chris Fields writes:
 > 
 > On Jun 28, 2007, at 1:06 PM, George Hartzell wrote:
 > 
 > > ...
 > > I'm not 100% sure what's going on here, but I'm inclined to say "get a
 > > real computer" (and yes, I'm typing this on a mac...).  I have a mac
 > > pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony
 > > the tiger used to say)....
 > 
 > Ouch!  Though it could be worse (**coughwindowscough**).
 > 
 > > I think that we're having trouble with case sensitivity.  My only
 > > evidence is that I can see where there have been both HUMBETGLOA.FASTA
 > > and HUMBETGLOA.fasta in the tree at various times.  I can't figure out
 > > anything else that's weird about that file.  On the other hand, I
 > > can't see how this would cause the error you're seeing though.
 > 
 > Odd that other branches (including the main trunk) work but that one  
 > doesn't.
 > 
 > > The experiment would be to grab a usb or firewire disk (or even a
 > > memory stick), partition/format it as case sensitive (or even *unix*)
 > > and try to do
 > >
 > >  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
 > >
 > > into it.  If it works, voila.  If not, I'll keep making stuff up, err,
 > > thinking about it.
 > >
 > > g.
 > 
 > I'll have to figure out why I can't get ssh keys to work locally to  
 > test it out more (I have a usb drive to test with); just don't have  
 > time at the moment.

I just did the experiment, and filename case-insensitivity seems to be
breaking something.

I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.

I reformatted a memory stick to be case sensitive and co of

  bioperl/bioperl-live/tags/release-0-9-2/t 

worked, then I made a directory in my home dir (normal mac thing) and
got the same error as above.

I can get a copy of the trunk, so I'm inclined to ask someone to
mention the problem on the wiki and then just ignore it.

g.


From cjfields at uiuc.edu  Thu Jun 28 16:29:09 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 15:29:09 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <468409C7.7020102@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
Message-ID: <026156F4-4C46-4CC6-82B5-07FC5326A244@uiuc.edu>


On Jun 28, 2007, at 2:19 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote:
>> Here's a question: how do we plan on handling uploading bioperl
>> updates to CPAN via PAUSE?  Do we want to run every single module
>> through one pumpkin?  Or do we want to have a core dev group PAUSE
>> account?  I can see, for instance, removing everything EUtilities-
>> related and submitting it independently using my own PAUSE account,
>> but it would be nice to have it under an umbrella 'bioperl-devs'
>> account instead.
>
> All Bioperl modules (except the Bundle!) are owned by BIOPERLML on
> PAUSE. It's a little awkward since PAUSE is uploader-centric, but see my
> notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release
>
> And certainly, everything that wants to consider itself part of Bioperl
> (and gain the benefit of lots of devs looking after it) should certainly
> have BIOPERLML as the primary owner.

Alrighty then.

>> I think so, but the feasibility issue is critical.  Do we want
>> cvs/svn to be divided up into 900 subdirectories (one for each module),
>> or do we want to have a similar directory structure as we have now,
>> but with each module in its own directory?  Or leave everything as
>> is and generate Build.PL on-the-fly (prob. least feasible)?
>
> Very definitely the latter. The key benefit of my approach is that the
> organisation stays as is and that a snapshot of the repository remains a
> single directory of modules in Bio so that people don't have to
> 'install' Bioperl, they can still just uncompress the archive (or check
> out the package from svn) and point their PERL5LIB to the root dir of
> the package.

Okay, makes sense.

> For that reason I very much like the idea of folding the current
> split-out packages (run, network etc.) back into the core package so
> everything is one place. Folding them back in should obviously wait
> until everything is in place and working with core already.

I agree, but that's up to Brian, Hilmar, and the others who donated  
the packages (or at least a consensus of core devs).  One thing at a  
time.

> My proposal obviously wasn't very clear. As far as all other devs are
> concerned, nothing changes at all (except for lots of new improved test
> scripts). The pumpkin will, however, be able to say:
>
> ./Build dist
>
> Right now that generates the distribution archives (in different
> compression formats) - one big archive containing everything.
> My proposal is simply that instead it generates lots of archives, one
> archive per module. It will also generate some Bundles and whatever else
> might be needed.

We'll need to define which tests and data go with each module and  
so on.

> I don't envisage any major difficulties in achieving this. The
> 'feasibility' issue I was going to look into was strictly regarding
> doing all the new test scripts.

Okay.  Maybe it's worth doing on a branch as a test run when 1.5.3  
is ready to go.  We'll still need to get thoughts on this from other  
core devs out there, and it prob. should wait until everybody is  
comfortable with the idea.

chris


From dmessina at wustl.edu  Thu Jun 28 18:13:48 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 17:13:48 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
Message-ID: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>

Coming late to this party, I'm replying to snippets from multiple  
emails.


> [Chris]
> what we do about deprecated modules which linger
> about on CPAN

> [Sendu]
> Delete them from CPAN seems appropriate.

I coulda sworn this was frowned upon, but a recent thread suggests  
it's totally kosher.

	http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html


> [Sendu]
> So, regardless of anything else can we all agree that per-module test
> scripts are a good idea and should be worked on?

I agree.


> [Sendu]
> people don't have to
> 'install' Bioperl, they can still just uncompress the archive (or check
> out the package from svn) and point their PERL5LIB to the root dir of
> the package.

Could you elaborate a bit on how this works? How is XS code that  
needs compiling handled? Or the scripts directory? I would love to be  
able to do this.


> [Sendu]
> For that reason I very much like the idea of folding the current
> split-out packages (run, network etc.) back into the core package so
> everything is one place. Folding them back in should obviously wait
> until everything is in place and working with core already.

 From an organizational standpoint, I'm concerned that with ~900  
modules in core right now, adding all of the additional stuff from  
the split-out packages would make for a daunting directory.

But as you said, this is way down the road, so this proposal doesn't  
bear on the other, closer-to-now issues on the table.


> [Chris]
> Okay.  Maybe it's worth doing on a branch  as a test run when 1.5.3
> is ready to go.  We'll still need to get thoughts on this from other
> core devs out there, and it prob. should until everybody is
> comfortable with the idea.

If we go forward with the CPAN split plan, I like the idea of having  
a trial. We can foresee some of the issues that such a change may  
bring, and yet still more no doubt wait for us once we do it.


Dave


From bix at sendu.me.uk  Thu Jun 28 18:59:35 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 23:59:35 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
Message-ID: <46843D57.2080409@sendu.me.uk>

David Messina wrote:
>> people don't have to 'install' Bioperl, they can still just
>> uncompress the archive (or check out the package from svn) and
>> point their PERL5LIB to the root dir of the package.
> 
> Could you elaborate a bit on how this works? How is XS code that 
> needs compiling handled? Or the scripts directory? I would love to be
> able to do this.

I meant for the most part. Core doesn't have any XS code so that's not 
an issue. Scripts can be run manually like any other perl script. When 
you discover something isn't working because of a missing external 
dependency, you just install it. (But that happens very rarely.)

Personally I've /never/ installed Bioperl and used that installed set of 
modules. I've always just pointed my PERL5LIB at the distribution folder 
or my cvs checkout.

Which makes me a strange candidate for advocating all these 
CPAN-specific changes, but there you go ;)


From cjfields at uiuc.edu  Thu Jun 28 19:03:02 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 18:03:02 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
Message-ID: <8B6FBB52-5CCE-4122-876C-B9827C86E46E@uiuc.edu>


On Jun 28, 2007, at 5:13 PM, David Messina wrote:

> Coming late to this party, I'm replying to snippets from multiple  
> emails.
>
>
>> [Chris]
>> what we do about deprecated modules which linger
>> about on CPAN
>
>> [Sendu]
>> Delete them from CPAN seems appropriate.
>
> I coulda sworn this was frowned upon, but a recent thread suggests  
> it's totally kosher.
>
> 	http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html

As long as it doesn't show up somewhere to confuse newbies I'm okay  
with it.

>> [Sendu]
>> people don't have to
>> 'install' Bioperl, they can still just uncompress the archive (or check
>> out the package from svn) and point their PERL5LIB to the root dir of
>> the package.
>
> Could you elaborate a bit on how this works? How is XS code that  
> needs compiling handled? Or the scripts directory? I would love to  
> be able to do this.

Maybe Sendu can add to this, but the XS code is limited to  
bioperl-ext AFAIK.  We could keep that separate until it plays well  
with bioperl itself.

Scripts and examples - maybe packaged along with a Bundle?

>> [Sendu]
>> For that reason I very much like the idea of folding the current
>> split-out packages (run, network etc.) back into the core package so
>> everything is one place. Folding them back in should obviously wait
>> until everything is in place and working with core already.
>
> From an organizational standpoint, I'm concerned that with ~900  
> modules in core right now, adding all of the additional stuff from  
> the split-out packages would make for a daunting directory.
>
> But as you said, this is way down the road, so this proposal  
> doesn't bear on the other, closer-to-now issues on the table.

Well, the code in bioperl-db and network complement code in core, so  
I agree with Sendu they belong there.  They should be under the same  
scrutiny as the rest anyway (code, tests, etc), but won't be bundled  
unless there is an 'install everything' Bundle.

>> [Chris]
>> Okay.  Maybe it's worth doing on a branch  as a test run when 1.5.3
>> is ready to go.  We'll still need to get thoughts on this from other
>> core devs out there, and it prob. should until everybody is
>> comfortable with the idea.
>
> If we go forward with the CPAN split plan, I like the idea of  
> having a trial. We can foresee some of the issues that such a  
> change may bring, and yet still more no doubt wait for us once we  
> do it.

That's what branches are for; testing stuff out like this.

chris


From hartzell at alerce.com  Thu Jun 28 19:05:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 19:05:32 -0400
Subject: [Bioperl-l] problem with binary files.
Message-ID: <18052.16060.932502.183552@almost.alerce.com>


Ok, after pointing out the problem with setting the svn:keywords
property on binary files, it turns out that I *did* that.  Worse yet,
I set the svn:eol-style to 'native' on everything, including binary
files, so depending on your platform they're likely to be fubar.

For example, bioperl-run/t/data/H_pylori_J99.glimmer2.icm may or may
not be what you expect it to be, depending on whether your eol-style
matches the server's and whether any conversions were done.
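
To illustrate why an end-of-line conversion is harmful for binary data,
here is a minimal shell sketch. It only simulates a CRLF-to-LF
normalisation with tr and does not call svn; the scratch file names and
the byte values are made up for the demonstration.

```shell
# Simulate what an end-of-line conversion does to binary content.
# Nothing here touches a real repository; the scratch files are hypothetical.
demo_dir=$(mktemp -d)

# A 4-byte "binary" payload that happens to contain a CR LF pair.
printf '\001\r\n\002' > "$demo_dir/original.bin"

# Stand-in for a CRLF -> LF normalisation on checkout.
tr -d '\r' < "$demo_dir/original.bin" > "$demo_dir/converted.bin"

orig_size=$(( $(wc -c < "$demo_dir/original.bin") ))
conv_size=$(( $(wc -c < "$demo_dir/converted.bin") ))
echo "original: $orig_size bytes, converted: $conv_size bytes"

if cmp -s "$demo_dir/original.bin" "$demo_dir/converted.bin"; then
    echo "unchanged"
else
    echo "content altered: a binary file would now be corrupt"
fi
```

A text file survives this kind of rewrite, but a binary file that
happens to contain the same byte pair silently loses data.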

I'll touch up the way that the little tool I'm using calls cvs2svn and
redo the repository.

g.


From n.haigh at sheffield.ac.uk  Fri Jun 29 02:59:21 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 29 Jun 2007 07:59:21 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
References: <467949EC.9040100@sendu.me.uk>	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>	<4682C6F5.4020406@sendu.me.uk>
	<4682D12E.3000803@sendu.me.uk>	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>	<4682E824.1050507@sendu.me.uk>	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>	<4683624F.6020402@sendu.me.uk>	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
Message-ID: <4684ADC9.8040404@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

- -- split --
>> [Sendu]
>> For that reason I very much like the idea of folding the current
>> split-out packages (run, network etc.) back into the core package so
>> everything is one place. Folding them back in should obviously wait
>> until everything is in place and working with core already.
> 
>  From an organizational standpoint, I'm concerned that with ~900  
> modules in core right now, adding all of the additional stuff from  
> the split-out packages would make for a daunting directory.
> 
> But as you said, this is way down the road, so this proposal doesn't  
> bear on the other, closer-to-now issues on the table.
> 

I don't think this is an issue - it would simply mean everything is
under the same version control hierarchy. And with svn it's Soooooo much
easier to fiddle around with directory structures.

> 
> 
>> [Chris]
>> Okay.  Maybe it's worth doing on a branch  as a test run when 1.5.3
>> is ready to go.  We'll still need to get thoughts on this from other
>> core devs out there, and it prob. should until everybody is
>> comfortable with the idea.
> 
> If we go forward with the CPAN split plan, I like the idea of having  
> a trial. We can foresee some of the issues that such a change may  
> bring, and yet still more no doubt wait for us once we do it.
> 

Under svn it would be easy to make an "svn copy" of run, network etc
into a branch of live to test this out. Not that this might be a
problem, but: Since we are looking at bioperl-* packages being under the
same svn repository, "svn copy" operations are cheap on disk space.
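
A rough sketch of what such a trial branch could look like, assuming all
packages live in one repository (svn copy between URLs only works within
a single repository). The branch name, the existence of a branches
directory and the per-package trunk layout are assumptions, not the
actual repository layout. By default the commands are only printed.

```shell
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to run them for real.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

plan_fold_in_branch() {
    # Test repository URL as quoted elsewhere in this thread.
    repo="svn+ssh://dev.open-bio.org/home/hartzell/bioperl"
    # Hypothetical branch name; assumes bioperl-live/branches already exists.
    branch="$repo/bioperl-live/branches/fold-in-trial"

    run svn copy "$repo/bioperl-live/trunk" "$branch" -m "Branch core for fold-in trial"
    # Assumes each split-out package has its own trunk directory.
    for pkg in bioperl-run bioperl-network; do
        run svn copy "$repo/$pkg/trunk" "$branch/$pkg" -m "Copy $pkg into trial branch"
    done
}

plan_fold_in_branch
```

Because the copies are URL-to-URL, they happen on the server and need no
working copy.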

> 
> Dave
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGhK3JczuW2jkwy2gRAtI2AJ4kNrpGY8XMMh9KxOqs+l0PrEVcwgCfVFj6
BCvltmPyWF4ImueYmd7VFAc=
=ktl+
-----END PGP SIGNATURE-----


From n.haigh at sheffield.ac.uk  Fri Jun 29 03:05:33 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 29 Jun 2007 08:05:33 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
 Perltidy]
In-Reply-To: <18051.61992.627473.323346@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
Message-ID: <4684AF3D.5090907@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George Hartzell wrote:

- -- snip --

>  > [...]
>  > I've googled around and gathered the following as a possible list for  
>  > our repo. Since I obviously don't know what I'm doing :), of course  
>  > adjust and refine as necessary.
>  > 
> 
> That's a great starting point.  Do you have write access to the wiki?
> Could you link it off of the instructions for using svn?
> 
> g.

Don't .t files need adding to the auto-props?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGhK89czuW2jkwy2gRAnRGAJ0VnBNVBAdQdfUnqPhmvsyQnD/bswCggSHC
/Iivb6Lc4/51bUdrTmRQYlE=
=V+t2
-----END PGP SIGNATURE-----


From sac at bioperl.org  Fri Jun 29 04:25:36 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Fri, 29 Jun 2007 01:25:36 -0700
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
Message-ID: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>

On 6/27/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On Jun 26, 2007, at 3:21 PM, George Hartzell wrote:
>
> > ...
> > If you have a dev.open-bio.org account and you're in the bioperl
> > group, you're good to get at it via:
> >
> >   file:///home/hartzell/bioperl
> >
> > or
> >
> >   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>
> I managed to get it working using file://.  Haven't tried svn+ssh yet
> but I've had persistent problems getting ssh to work properly on my
> macbook; not sure why yet but I haven't had time to play around with it.

Are you using the ssh that comes installed with OSX? If so, I'd
recommend installing openssh from MacPorts. I recall having issues
with the stock version which were resolved by using the more
up-to-date version you can get via MacPorts.

BTW, I haven't been able to check out the new svn repository via
svn+ssh:// because I can't get svn to authenticate with an alternative
username. My username on dev.open-bio.org differs from what it is on
my local machine, so I issue a command such as:

steve at localhost $ svn --username sac checkout
svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk

but I get challenged with:
steve at dev.open-bio.org's password:

I also tried putting the --username argument after the subcommand, but
it still wants to use my local username. I can ssh -l sac into the dev
box no problem. Any suggestions?

Steve


From bix at sendu.me.uk  Fri Jun 29 04:52:42 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 29 Jun 2007 09:52:42 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
Message-ID: <4684C85A.5030206@sendu.me.uk>

Steve Chervitz wrote:
> BTW, I haven't been able to check out the new svn repository via
> svn+ssh:// because I can't get svn to authenticate with an alternative
> username. My username on dev.open-bio.org differs from what it is on
> my local machine, so I issue a command such as:
> 
> steve at localhost $ svn --username sac checkout
> svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
> 
> but I get challenged with:
> steve at dev.open-bio.org's password:
> 
> I also tried putting the --username argument after the subcommand, but
> it still wants to use my local username. I can ssh -l sac into the dev
> box no problem. Any suggestions?

Set up your ssh key on the dev machine. I'm also on a machine with the 
wrong username and it works even without attempting to supply the 
correct one.

It does, however, show the 'Welcome to the new developer system' message 
2 or 3 times for every svn+ssh action, which freaks me out a little.


From N.Haigh at sheffield.ac.uk  Fri Jun 29 05:32:38 2007
From: N.Haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 29 Jun 2007 10:32:38 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
Message-ID: <1183109558.4684d1b69bcec@webmail.shef.ac.uk>

Quoting Steve Chervitz <sac at bioperl.org>:

-- snip --

> BTW, I haven't been able to check out the new svn repository via
> svn+ssh:// because I can't get svn to authenticate with an alternative
> username. My username on dev.open-bio.org differs from what it is on
> my local machine, so I issue a command such as:
> 
> steve at localhost $ svn --username sac checkout
> svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
> 
> but I get challenged with:
> steve at dev.open-bio.org's password:
> 
> I also tried putting the --username argument after the subcommand, but
> it still wants to use my local username. I can ssh -l sac into the dev
> box no problem. Any suggestions?
> 
> Steve


You could try:
svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
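
Spelled out as a command, this might look like the sketch below. Note
that the list archive renders "@" as " at "; the real URL needs a
literal "@". The login "sac" is the one Steve mentions in his message
and should be replaced with your own. The command is printed rather
than executed, since running it needs an account on the server.

```shell
# Build the checkout URL with the remote login embedded in it.
remote_user="sac"   # Steve's dev.open-bio.org login from this thread; use your own
repo_path="/home/hartzell/bioperl/bioperl-live/trunk"
url="svn+ssh://${remote_user}@dev.open-bio.org${repo_path}"

# Printed rather than executed; paste the output into a shell to run it.
echo "svn checkout $url"
```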

Nath


From dmessina at wustl.edu  Fri Jun 29 08:28:26 2007
From: dmessina at wustl.edu (David Messina)
Date: Fri, 29 Jun 2007 07:28:26 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
Message-ID: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>

>
> BTW, I haven't been able to check out the new svn repository via
> svn+ssh:// because I can't get svn to authenticate with an alternative
> username.

I have the same issue. I set up a stanza in my ~/.ssh/config:

Host dev.open-bio.org
   User dave_messina

where dave_messina is my dev.open-bio.org username.


From cjfields at uiuc.edu  Fri Jun 29 13:00:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 29 Jun 2007 12:00:27 -0500
Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository]
In-Reply-To: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
Message-ID: <F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>


On Jun 29, 2007, at 7:28 AM, David Messina wrote:

>>
>> BTW, I haven't been able to check out the new svn repository via
>> svn+ssh:// because I can't get svn to authenticate with an  
>> alternative
>> username.
>
> I have the same issue. I set up a stanza in my ~/.ssh/config:
>
> Host dev.open-bio.org
>    User dave_messina
>
> where dave_messina is my dev.open-bio.org username.

I changed to the macports ssh w/o luck.  It appears the key is  
offered up, so maybe the problem is how I have everything set up on  
dev (though I followed everything on the wiki):

....
  Contact 'support at open-bio.org' for
your new login information.
======================================
debug1: Authentications that can continue: publickey,gssapi-with-mic,password
debug1: Next authentication method: publickey
debug1: Offering public key: /Users/cjfields/.ssh/id_dsa
debug2: we sent a publickey packet, wait for reply
debug1: Authentications that can continue: publickey,gssapi-with-mic,password
debug2: we did not send a packet, disable method
debug1: Next authentication method: password

It's odd; I can use passwordless logins for other servers (admittedly  
Mac servers) w/o problems using ssh keys, but dev.open-bio.org always  
prompts for a password regardless.

My feeling is it's something with my local ssh or sshd config; I'll  
try fiddling with it to see what happens.  Anyone have suggestions?   
I've lost enough hair as is; don't want to lose more!

chris


From sac at bioperl.org  Fri Jun 29 13:07:45 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Fri, 29 Jun 2007 10:07:45 -0700
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <1183109558.4684d1b69bcec@webmail.shef.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<1183109558.4684d1b69bcec@webmail.shef.ac.uk>
Message-ID: <8f200b4c0706291007x2b765323n75c9003a47fe7cbb@mail.gmail.com>

On 6/29/07, Nathan S. Haigh <N.Haigh at sheffield.ac.uk> wrote:
> Quoting Steve Chervitz <sac at bioperl.org>:
>
> -- snip --
>
> > BTW, I haven't been able to check out the new svn repository via
> > svn+ssh:// because I can't get svn to authenticate with an alternative
> > username. My username on dev.open-bio.org differs from what it is on
> > my local machine, so I issue a command such as:
> >
> > steve at localhost $ svn --username sac checkout
> > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
> >
> > but I get challenged with:
> > steve at dev.open-bio.org's password:
> >
> > I also tried putting the --username argument after the subcommand, but
> > it still wants to use my local username. I can ssh -l sac into the dev
> > box no problem. Any suggestions?
>
> [...]
> You could try:
> svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk

Bingo. Thanks for the tips, guys.

BTW, setting up ssh keys was not the issue, since my key is already
set up on the dev machine. The svn --username setting appears to not
be operative at the ssh layer. I  suspected this might be the case
given that the usage info says:

 $ svn --help co
  --username arg           : specify a username ARG
  --password arg           : specify a password ARG

which seemed insecure. I didn't want to send my password in the clear,
and didn't know whether svn would hand it off to ssh. It wasn't
even sending my username to ssh, so I knew something was wrong. These
args are probably only intended for accessing local svn repositories,
or non-svn+ssh-based checkouts.

BTW, the svn+ssh check out on Mac OS X works for me. I'm using svn and
openssh installed via MacPorts:

$ svn --version
svn, version 1.4.4 (r25188)
   compiled Jun 28 2007, 23:51:53

$ ssh -version
OpenSSH_4.6p1, OpenSSL 0.9.8e 23 Feb 2007

Steve


From hartzell at alerce.com  Fri Jun 29 15:19:31 2007
From: hartzell at alerce.com (George Hartzell)
Date: Fri, 29 Jun 2007 15:19:31 -0400
Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository]
In-Reply-To: <F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
	<F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>
Message-ID: <18053.23363.102371.602742@almost.alerce.com>

Chris Fields writes:
 > 
 > On Jun 29, 2007, at 7:28 AM, David Messina wrote:
 > 
 > >>
 > >> BTW, I haven't been able to check out the new svn repository via
 > >> svn+ssh:// because I can't get svn to authenticate with an  
 > >> alternative
 > >> username.
 > >
 > > I have the same issue. I set up a stanza in my ~/.ssh/config:
 > >
 > > Host dev.open-bio.org
 > >    User dave_messina
 > >
 > > where dave_messina is my dev.open-bio.org username.
 > 
 > I changed to the macports ssh w/o luck.  It appears the key is  
 > offered up, so maybe the problem is how I have everything set up on  
 > dev (though I followed everything on the wiki):

A couple of things to check.

  - make sure that you put your public key in ~/.ssh/authorized_keys2
    (not authorized_keys)

  - make sure that authorized_keys2 is chmod'ed 600 (644 might be
    enough...).

  - make sure that ~/.ssh is chmoded 700.

  - make sure that your home directory is 755.

Then see if it works.  You might be able to relax some of those
protections a bit, but ssh's uptight about letting other people mess
with that data.
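
The checklist above can be sketched as a small shell function. The
function name is made up for illustration, and it assumes the key file
is ~/.ssh/authorized_keys2 as described; sshd setups that use strict
permission checks typically reject keys when the home directory or
~/.ssh is writable by group or others.

```shell
# Apply the permissions from the checklist to the given home directory.
# Run this on the server (dev.open-bio.org), not on the local machine.
tighten_ssh_perms() {
    home_dir=$1
    chmod 755 "$home_dir"                         # home: not group/world writable
    chmod 700 "$home_dir/.ssh"                    # .ssh: owner only
    chmod 600 "$home_dir/.ssh/authorized_keys2"   # key file: owner read/write
    # Show the result so the modes can be checked by eye.
    ls -ld "$home_dir" "$home_dir/.ssh" "$home_dir/.ssh/authorized_keys2"
}
```

After logging in to the server with a password, it would be called as
tighten_ssh_perms "$HOME".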

g.


From dmessina at wustl.edu  Fri Jun 29 18:47:14 2007
From: dmessina at wustl.edu (David Messina)
Date: Fri, 29 Jun 2007 17:47:14 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <4684AF3D.5090907@sheffield.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
Message-ID: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>

> [Nathan]
> Don't .t files need adding to the auto-props?

Yes -- thanks for reminding me. Please feel free to add it to the  
wiki page. I'll be tweaking it some more later on in any case.


Dave


From n.haigh at sheffield.ac.uk  Sat Jun 30 05:55:56 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 30 Jun 2007 10:55:56 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
 Perltidy]
In-Reply-To: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
Message-ID: <468628AC.9060200@sheffield.ac.uk>

David Messina wrote:
>> [Nathan]
>> Don't .t files need adding to the auto-props?
> 
> Yes -- thanks for reminding me. Please feel free to add it to the wiki 
> page. I'll be tweaking it some more later on in any case.
> 
> 
> Dave

I noticed this has already been done. I have just been through the 
t/data dir and added a list of extensions I found (without props). There 
are some files without extensions, how should these be dealt with? There 
seems to be a plethora of file naming styles which means there's a 
pretty long list of non-standard extensions. So at some point someone 
will commit a new data file with a new extension (often describing what 
program created the output or the test for which it's intended) that 
won't be in the auto-props file - can you think of a way around this?

Nath


From cjfields at uiuc.edu  Sat Jun 30 08:48:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 07:48:10 -0500
Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository]
In-Reply-To: <18053.23363.102371.602742@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
	<F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>
	<18053.23363.102371.602742@almost.alerce.com>
Message-ID: <3874B4EE-0119-40BC-8B92-11133A766417@uiuc.edu>


On Jun 29, 2007, at 2:19 PM, George Hartzell wrote:

> Chris Fields writes:
>>
>> On Jun 29, 2007, at 7:28 AM, David Messina wrote:
>>
>>>>
>>>> BTW, I haven't been able to check out the new svn repository via
>>>> svn+ssh:// because I can't get svn to authenticate with an
>>>> alternative
>>>> username.
>>>
>>> I have the same issue. I set up a stanza in my ~/.ssh/config:
>>>
>>> Host dev.open-bio.org
>>>    User dave_messina
>>>
>>> where dave_messina is my dev.open-bio.org username.
>>
>> I changed to the macports ssh w/o luck.  It appears the key is
>> offered up, so maybe the problem is how I have everything set up on
>> dev (though I followed everything on the wiki):
>
> A couple of things to check.
>
>   - make sure that you put your public key in ~/.ssh/authorized_keys2
>     (not authorized_keys)
>
>   - make sure that authorized_keys2 is chmod'ed 600 (644 might be
>     enough...).
>
>   - make sure that ~/.ssh is chmoded 700.
>
>   - make sure that your home directory is 755.
>
> Then see if it works.  You might be able to relax some of those
> protections a bit, but ssh's uptight about letting other people mess
> with that data.
>
> g.

Got it working; it was the permissions on my home dir (the last  
one).  Thanks George!

chris


From dmessina at wustl.edu  Sat Jun 30 11:37:44 2007
From: dmessina at wustl.edu (David Messina)
Date: Sat, 30 Jun 2007 10:37:44 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <468628AC.9060200@sheffield.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
Message-ID: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>

> I have just been through the t/data dir and added a list of  
> extensions I found

Thanks! That's a big help. I'll add prop definitions to those shortly.


>  There are some files without extensions, how should these be dealt  
> with?

If you look in the text files section, there are some files there  
which don't have extensions, e.g. AUTHORS, BUGS. There's also

	Makefile.*

so we have some flexibility in how svn knows to auto-prop a file. I  
haven't read up on the details yet to find out how it handles files  
that match multiple criteria -- it may be dependent simply on the  
order they're defined.


> There seems to be a plethora of file naming styles which means  
> there's a pretty long list of non-standard extensions. So at some  
> point someone will commit a new data file with a new extension  
> (often describing what program created the output or the test for  
> which it's intended) that won't be in the auto-props file - can you  
> think of a way around this?

I've been thinking about this a bit. How about this?

- We have just "standard" files and extensions (like *.blast,  
*.fasta) in the auto-props list.

- We manually add props for the files that have nonstandard,  
arbitrary extensions so all the files have now are prop'd.

- At some point we rename those nonstandard files to have standard  
extensions. Especially for the t/data/ files, we'll have to make sure  
to update the tests that rely on them.

- We can have the suggested list of extensions for new files that get  
added. I don't think we need to strictly enforce this just for the  
sake of svn (after all, its primary function of version control will  
work just fine without any properties set), but it would be nice if  
we could try to keep to it mostly.
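
For the manual step in the list above, a rough shell sketch that prints
a suggested svn propset command for each file in a directory. The
function names are hypothetical, and the "contains a NUL byte" check is
only a crude heuristic for spotting binary files, so the output should
be reviewed before any of it is run.

```shell
# Succeeds if the file contains at least one NUL byte (a crude "binary" test).
has_nul_bytes() {
    total=$(( $(wc -c < "$1") ))
    without_nul=$(( $(tr -d '\000' < "$1" | wc -c) ))
    [ "$total" -ne "$without_nul" ]
}

# Print (not run) a suggested svn propset command for each file in a directory.
suggest_props() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        if has_nul_bytes "$f"; then
            echo "svn propset svn:mime-type application/octet-stream $f"
        else
            echo "svn propset svn:eol-style native $f"
        fi
    done
}
```

For example, suggest_props t/data would list one suggestion per data
file, keeping svn:eol-style away from anything that looks binary.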

Many distros come with an /etc/mime.types file which has the list of  
officially registered MIME types. I found a script that will take  
this list and convert it into auto-props format. I don't think we  
need to support *all* of the gazillion filetypes since most of them  
our repository will never see, but we certainly could.
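
This is not the script referred to above, just a minimal sketch of the
idea. It assumes the usual mime.types layout (a MIME type followed by
whitespace-separated extensions, with '#' starting a comment) and emits
lines in the "pattern = property=value" form used in the [auto-props]
section of the Subversion config file. The function name is made up.

```shell
# Convert mime.types-style input into svn auto-props lines.
# Input format assumed: "<mime-type> <ext> [<ext> ...]", '#' starts a comment.
mime_to_autoprops() {
    awk '
        /^[ \t]*#/ { next }    # skip comment lines
        NF < 2     { next }    # skip blank lines and types with no extension
        {
            for (i = 2; i <= NF; i++)
                printf "*.%s = svn:mime-type=%s\n", $i, $1
        }
    ' "$@"
}
```

Running mime_to_autoprops /etc/mime.types would print one line per
extension, which could then be trimmed down to the types the repository
actually contains.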


Dave


From dmessina at wustl.edu  Sat Jun 30 12:26:27 2007
From: dmessina at wustl.edu (David Messina)
Date: Sat, 30 Jun 2007 11:26:27 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
	<461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
Message-ID: <D6917C62-FA0C-4261-ACFD-014DEF4D89E6@wustl.edu>


On Jun 30, 2007, at 10:37 AM, David Messina wrote:

> - We manually add props for the files that have nonstandard,
> arbitrary extensions so all the files have now are prop'd.

Er, that should be

- We manually add props for the files that have nonstandard,  
arbitrary extensions so that all the files now in the repository are  
prop'd.


From n.haigh at sheffield.ac.uk  Sat Jun 30 13:25:58 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 30 Jun 2007 18:25:58 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
 Perltidy]
In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
	<461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
Message-ID: <46869226.70203@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

- -- snip --
> 
> 
>> There seems to be a plethora of file naming styles which means there's
>> a pretty long list of non-standard extensions. So at some point
>> someone will commit a new data file with a new extension (often
>> describing what program created the output or the test for which it's
>> intended) that won't be in the auto-props file - can you think of a
>> way around this?
> 
> I've been thinking about this a bit. How about this?
> 
> - We have just "standard" files and extensions (like *.blast, *.fasta)
> in the auto-props list.

I think the list of seq formats recognised by Bioperl in Bio::SeqIO and
Bio::AlignIO would be a good start, as these are likely to be the ones
that are sensitive to file format recognition and thus could break tests
if renamed.

I think a lot of people have used "." in file names as an alternative to
a space. I think it would be beneficial to use an underscore "_" in
these cases and leave the "." to represent the beginning of the file
extension.

> 
> - We manually add props for the files that have nonstandard, arbitrary
> extensions so all the files that we currently have now are prop'd.
> 
> - At some point we rename those nonstandard files to have standard
> extensions. Especially for the t/data/ files, we'll have to make sure to
> update the tests that rely on them.

Nice and easy with svn :)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGhpHiczuW2jkwy2gRAuZ5AKCnd2MvCsvSn1NemDVMmabnieR2vACg1Qk0
pYVvXwxq0lpiGfM09RQ6A1I=
=3Lhw
-----END PGP SIGNATURE-----


From cjfields at uiuc.edu  Sat Jun 30 15:11:52 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 14:11:52 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <D6917C62-FA0C-4261-ACFD-014DEF4D89E6@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
	<461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
	<D6917C62-FA0C-4261-ACFD-014DEF4D89E6@wustl.edu>
Message-ID: <C274666B-9771-4296-80BB-8DFFB036F29C@uiuc.edu>


On Jun 30, 2007, at 11:26 AM, David Messina wrote:

>
> On Jun 30, 2007, at 10:37 AM, David Messina wrote:
>
>> - We manually add props for the files that have nonstandard,
>> arbitrary extensions so all the files have now are prop'd.
>
> Er, that should be
>
> - We manually add props for the files that have nonstandard,
> arbitrary extensions so that all the files now in the repository are
> prop'd.

Do we need to define every filetype extension, or can there be a  
fallback (eg if it isn't on the list or has no extension it's plain  
text)?
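
Whatever the answer on a fallback turns out to be, it helps to know which
files the list would miss. A minimal sketch, assuming a plain list of file
names on stdin and the known extensions as arguments; the function name
unlisted_extensions is hypothetical and nothing here calls svn:

```shell
# Hypothetical helper: print the file names whose extension is NOT in the
# list of known extensions given as arguments. Names are read from stdin.
unlisted_extensions() {
    known=" $* "                      # e.g. " blast fasta "
    while IFS= read -r name; do
        base=${name##*/}              # drop any directory part
        case "$base" in
            *.*) ext=${base##*.} ;;   # text after the last dot
            *)   ext= ;;              # no extension at all
        esac
        if [ -z "$ext" ]; then
            printf '%s\n' "$name"     # extensionless files are always reported
            continue
        fi
        case "$known" in
            *" $ext "*) ;;            # covered by the list, stay quiet
            *) printf '%s\n' "$name" ;;
        esac
    done
}
```

Something like `find t/data -type f | unlisted_extensions blast fasta` would
then show the files that still need props set by hand.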

chris


From hlapp at gmx.net  Sat Jun 30 17:26:22 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 17:26:22 -0400
Subject: [Bioperl-l] Splits again
In-Reply-To: <468409C7.7020102@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
Message-ID: <A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>


On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote:

> [...]
> Very definitely the latter. The key benefit of my approach is that  
> the organisation stays as is and that a snapshot of the repository  
> remains a single directory of modules in Bio so that people don't  
> have to 'install' Bioperl, they can still just uncompress the  
> archive (or check out the package from svn) and point their  
> PERL5LIB to the root dir of the package.

I think this is absolutely key to keep in mind. Anything without this  
feature will likely be a non-starter.
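
For reference, the "no install" usage quoted above amounts to something like
the sketch below. The function name use_bioperl_checkout and the example path
are assumptions, not part of any Bioperl tooling:

```shell
# Hypothetical helper: put an unpacked archive or svn checkout first on
# PERL5LIB, without leaving a stray ':' when PERL5LIB was empty or unset.
use_bioperl_checkout() {
    PERL5LIB="$1${PERL5LIB:+:$PERL5LIB}"
    export PERL5LIB
}
```

After `use_bioperl_checkout "$HOME/src/bioperl-live"` a plain `perl script.pl`
would pick up the Bio modules from that directory.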

I don't really have time to follow the discussion let alone  
participate, so really all I can contribute is to offer some  
sanity/reality checks (such as the above).

In this sense, I understand a release pumpkin will generate ~900  
packages to upload to CPAN? How much hassle is that compared to what  
uploading a bioperl release means right now?

How brittle is all the Build.PL code that will be needed to automate  
all of this, and how difficult will it be to maintain? For example,  
if someone adds in 10 new modules, what Build.PL-related work will  
need to be done?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Sat Jun 30 17:32:52 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 30 Jun 2007 22:32:52 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
	<A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>
Message-ID: <4686CC04.6000403@sendu.me.uk>

Hilmar Lapp wrote:
> On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote:
> 
>> [...]
>> Very definitely the latter. The key benefit of my approach is that  
>> the organisation stays as is and that a snapshot of the repository  
>> remains a single directory of modules in Bio so that people don't  
>> have to 'install' Bioperl, they can still just uncompress the  
>> archive (or check out the package from svn) and point their  
>> PERL5LIB to the root dir of the package.
[snip]
> In this sense, I understand a release pumpkin will generate ~900  
> packages to upload to CPAN? How much hassle is that compared to what  
> uploading a bioperl release means right now?

I'd have to investigate. I did my uploads using the PAUSE website, which 
for 900 packages would be unfeasible. Will have to see if the process 
can be automated.


> How brittle is all the Build.PL code that will be needed to automate  
> all of this, and how difficult will it be to maintain? For example,  
> if someone adds in 10 new modules, what Build.PL-related work will  
> need to be done?

Well, my plan will be that once the work is done, you won't need to 
touch the Build.PL code again. My intent is that the pumpkin can just 
type one command and not think about anything.

As for the reality, I won't know until I think about it properly and 
experiment.


From hlapp at gmx.net  Sat Jun 30 19:36:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 19:36:45 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18052.3946.224905.415905@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
	<18052.3946.224905.415905@almost.alerce.com>
Message-ID: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>


On Jun 28, 2007, at 3:43 PM, George Hartzell wrote:

> I just did the experiment, and filename-insensitivity seems to be
> breaking something.
>
> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.
>
> I reformatted a memory stick to be case sensitive and co of
>
>   bioperl/bioperl-live/tags/release-0-9-2/t
>
> worked, then I made a directory in my home dir (normal mac thing) and
> got the same error as above.

You picked up a rename of a file from lower case extension to upper  
case extension. Unfortunately, there are several months between  
adding the upper-case and removing the lower-case version.

We can reconstruct what happened with this using svn log on the  
directory (this does not require a checkout):

$ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk/t/data

Searching for HUMBETGLOA yields the following two commits that added  
one and removed the other:

------------------------------------------------------------------------
r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines
Changed paths:
    M /bioperl-live/trunk/t/SearchIO.t
    A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA
    A /bioperl-live/trunk/t/data/cysprot1.FASTA

added tests for FASTA

------------------------------------------------------------------------
r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines
Changed paths:
    A /bioperl-live/trunk/t/data/HUMBETGLOA.fa
    D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta

renaming file to avoid clobbering on windows

Unfortunately, both files are in the tag (again, no checkout required):

$ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta
HUMBETGLOA.FASTA
HUMBETGLOA.fasta
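
The grep above only works once you know which name to look for. A more
general check could look like the sketch below, which reads any list of
names and prints the ones that differ only by case. The function name
find_case_collisions is hypothetical:

```shell
# Hypothetical helper: read file names on stdin and print every name that
# collides with another one when compared case-insensitively.
find_case_collisions() {
    awk '{
             key = tolower($0)
             count[key]++
             # Collect all spellings that share the same lowercase key.
             names[key] = (count[key] > 1) ? (names[key] "\n" $0) : $0
         }
         END {
             for (key in count)
                 if (count[key] > 1) print names[key]
         }' | LC_ALL=C sort
}
```

Feeding it the output of `svn list -R` on a tag should flag pairs like the
two above before anyone tries a checkout on a case-insensitive filesystem.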

We can remove the offending version from the repository (again,  
without needing a checkout):

$ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta

I did this, and now the tag checks out fine on OSX. Can anyone confirm?

(BTW the ability to operate on the repository w/o needing a checkout  
is another advantage of svn)

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Jun 30 20:40:53 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 19:40:53 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
	<18052.3946.224905.415905@almost.alerce.com>
	<2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
Message-ID: <A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>

Checkout worked for me (Mac OS X) using both:

svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/

so removing the offending file worked (good catch!).  Haven't run a  
full co but probably isn't necessary.

chris

On Jun 30, 2007, at 6:36 PM, Hilmar Lapp wrote:

>
> On Jun 28, 2007, at 3:43 PM, George Hartzell wrote:
>
>> I just did the experiment, and filename-insensitivity seems to be
>> breaking something.
>>
>> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.
>>
>> I reformatted a memory stick to be case sensitive and co of
>>
>>   bioperl/bioperl-live/tags/release-0-9-2/t
>>
>> worked, then I made a directory in my home dir (normal mac thing) and
>> got the same error as above.
>
> You picked up a rename of a file from lower case extension to upper  
> case extension. Unfortunately, there are several months between  
> adding the upper-case and removing the lower-case version.
>
> We can reconstruct what happened with this using svn log on the  
> directory (this does not require a checkout):
>
> $ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/ 
> bioperl/bioperl-live/trunk/t/data
>
> Searching for HUMBETGLOA yields the following two commits that  
> added one and removed the other:
>
> ---------------------------------------------------------------------- 
> --
> r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines
> Changed paths:
>    M /bioperl-live/trunk/t/SearchIO.t
>    A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA
>    A /bioperl-live/trunk/t/data/cysprot1.FASTA
>
> added tests for FASTA
>
> ---------------------------------------------------------------------- 
> --
> r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines
> Changed paths:
>    A /bioperl-live/trunk/t/data/HUMBETGLOA.fa
>    D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta
>
> renaming file to avoid clobbering on windows
>
> Unfortunately, both files are in the tag (again, no checkout  
> required):
>
> $ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- 
> live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta
> HUMBETGLOA.FASTA
> HUMBETGLOA.fasta
>
> We can remove the offending version from the repository (again,  
> without needing a checkout):
>
> $ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- 
> live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta
>
> I did this, and now the tag checks out fine on OSX. Can anyone  
> confirm?
>
> (BTW the ability to operate on the repository w/o needing a  
> checkout is another advantage of svn)
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hartzell at alerce.com  Sat Jun 30 20:48:06 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 30 Jun 2007 17:48:06 -0700
Subject: [Bioperl-l] Take 2 of the new subversion repository.
Message-ID: <18054.63942.316904.413911@almost.alerce.com>


There's a second cut at the subversion repository.  I've done a better
job of setting svn:keywords and svn:eol-style on various files.  The
defaults were more cautious and I used an auto-props file based on
the wiki version.

  svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2

The old repository's still around as

  svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1

I renamed it so that people wouldn't work with it by mistake.  If, for
some hard-to-imagine reason, you have a working copy that you want to
run against it, you should be able to do an svn switch --relocate on
your working copy and be back in shape.  In fact, it might be a good
time to give it a try....

g.


From hartzell at alerce.com  Sat Jun 30 21:17:18 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 30 Jun 2007 18:17:18 -0700
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
	<18052.3946.224905.415905@almost.alerce.com>
	<2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
	<A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>
Message-ID: <18055.158.30409.808612@almost.alerce.com>

Chris Fields writes:
 > Checkout worked for me (Mac OS X) using both:
 > 
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/ 
 > tags/release-0-9-2/t/data
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/ 
 > tags/release-0-9-2/
 > 
 > so removing the offending file worked (good catch!).  Haven't run a  
 > full co but probably isn't necessary.
 > [...]

I'll keep a note of that as something to do when I prepare the final
cut of the repository.

g.


From jason at bioperl.org  Sat Jun 30 21:25:30 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 30 Jun 2007 18:25:30 -0700
Subject: [Bioperl-l] Take 2 of the new subversion repository.
In-Reply-To: <18054.63942.316904.413911@almost.alerce.com>
References: <18054.63942.316904.413911@almost.alerce.com>
Message-ID: <D8C71EF7-6E2E-498E-8638-373512ADE3EE@bioperl.org>

Thanks George -
I also did
chgrp -R bioperl /home/hartzell/bioperl_take?
to make sure the group permission was set right.

We may also want to do a chmod g+s on all the dirs in there as well  
so that permissions are preserved when this gets deployed for real.
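
A minimal sketch of that step, assuming the repository lives in a single
directory tree; the function name setgid_dirs is hypothetical. With the
setgid bit on a directory, files created in it inherit the directory's
group rather than the creating user's primary group:

```shell
# Hypothetical helper: set the setgid bit on every directory below (and
# including) the given path. Regular files are left untouched.
setgid_dirs() {
    find "$1" -type d -exec chmod g+s {} +
}
```

It would be run once per repository, for example
`setgid_dirs /home/hartzell/bioperl_take2`.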

If anyone wants to make some changes to files and commit them, or  
make some branches/tags to play around a little bit, go ahead, since  
we'll likely throw this away and do it again from a locked-down  
version of CVS at some appointed time.

Do you know how to have svn commit messages generate summary emails  
as well?

-j
On Jun 30, 2007, at 5:48 PM, George Hartzell wrote:

>
> There's a second cut at the subversion repository.  I've done a better
> job of setting svn:keywords and svn:eol-style on various files.  The
> defaults were more cautious and I used an auto-props file based on
> the wiki version.
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2
>
> The old repository's still around as
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1
>
> I renamed it so that people wouldn't work with it by mistake.  If, for
> some hard-to-imagine reason, you have a working copy that you want to
> run against it, you should be able to do an svn switch --relocate on
> your working copy and be back in shape.  In fact, it might be a good
> time to give it a try....
>
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From hlapp at gmx.net  Sat Jun 30 22:21:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 22:21:25 -0400
Subject: [Bioperl-l] Take 2 of the new subversion repository.
In-Reply-To: <18054.63942.316904.413911@almost.alerce.com>
References: <18054.63942.316904.413911@almost.alerce.com>
Message-ID: <5F53A433-BAA9-431D-A0C5-5955690D0B73@gmx.net>


On Jun 30, 2007, at 8:48 PM, George Hartzell wrote:

> I renamed it so that people wouldn't work with it by mistake.  If, for
> some hard-to-imagine reason, you have a working copy that you want to
> run against it,

It's not so hard to imagine - checking out the entire repository  
takes a long time.

> you should be able to do an svn switch --relocate on
> your working copy and be back in shape.  In fact, it might be a good
> time to give it a try....

It doesn't work:

svn: The repository at 'svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2'
has uuid '31277767-6726-dc11-ab4c-0019e3f901d6', but  
the WC has '27e854f1-f323-dc11-8c1b-0019e3f901d6'

You can't relocate to a totally new repository (relocating to  
bioperl_take1 does work though).
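
To check in advance whether a relocate can work, one could compare the two
UUIDs. The sketch below only parses the text that `svn info` prints; the
function names repo_uuid and same_repository are hypothetical:

```shell
# Hypothetical helper: extract the UUID from `svn info` output on stdin.
repo_uuid() {
    sed -n 's/^Repository UUID: //p'
}

# Hypothetical helper: succeed only if two UUIDs are non-empty and equal,
# i.e. the working copy and the target URL belong to the same repository.
same_repository() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}
```

Usage would be along the lines of
`same_repository "$(svn info . | repo_uuid)" "$(svn info NEW_URL | repo_uuid)"`,
with a fresh checkout as the fallback when it fails.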

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Jun 30 22:39:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 21:39:27 -0500
Subject: [Bioperl-l] Take 2 of the new subversion repository.
In-Reply-To: <D8C71EF7-6E2E-498E-8638-373512ADE3EE@bioperl.org>
References: <18054.63942.316904.413911@almost.alerce.com>
	<D8C71EF7-6E2E-498E-8638-373512ADE3EE@bioperl.org>
Message-ID: <7C6FD6C9-CBED-40D3-BA90-4B34F79E6DE0@uiuc.edu>

There are a few CPAN modules available; here's one:

http://search.cpan.org/~dwheeler/SVN-Notify-2.66/lib/SVN/Notify.pm

chris

On Jun 30, 2007, at 8:25 PM, Jason Stajich wrote:

> Thanks George -
> I also did
> chgrp -R bioperl /home/hartzell/bioperl_take?
> to make sure the group permission was set right.
>
> We may also want to do a chmod g+s on all the dirs in there as well
> so that permissions are preserved when this gets deployed for real.
>
> If anyone wants to make some changes to files and commit them, as
> well as make some branches/tags to play around a little bit since
> we'll likely throw this away and do it again from locked down version
> from CVS at some appointed time.
>
> Do you know how to have svn commit messages generate summary emails
> as well?
>
> -j
> On Jun 30, 2007, at 5:48 PM, George Hartzell wrote:
>
>>
>> There's a second cut at the subversion repository.  I've done a  
>> better
>> job of setting svn:keywords and svn:eol-style on various files.  The
>> defaults were more cautious and I used an auto-props file based on
>> the wiki version.
>>
>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take2
>>
>> The old repository's still around as
>>
>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl_take1
>>
>> I renamed it so that people wouldn't work with it by mistake.  If, for
>> some hard-to-imagine reason, you have a working copy that you want to
>> run against it, you should be able to do an svn switch --relocate on
>> your working copy and be back in shape.  In fact, it might be a good
>> time to give it a try....
>>
>> g.
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sat Jun 30 22:46:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 21:46:05 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4686CC04.6000403@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
	<A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>
	<4686CC04.6000403@sendu.me.uk>
Message-ID: <D10BF6DE-D8A6-448A-8850-A7B13AE54266@uiuc.edu>


On Jun 30, 2007, at 4:32 PM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote:
>>> [...]
>>> Very definitely the latter. The key benefit of my approach is  
>>> that  the organisation stays as is and that a snapshot of the  
>>> repository  remains a single directory of modules in Bio so that  
>>> people don't  have to 'install' Bioperl, they can still just  
>>> uncompress the  archive (or check out the package from svn) and  
>>> point their  PERL5LIB to the root dir of the package.
> [snip]
>> In this sense, I understand a release pumpkin will generate ~900   
>> packages to upload to CPAN? How much hassle is that compared to  
>> what  uploading a bioperl release means right now?
>
> I'd have to investigate. I did my uploads using the PAUSE website,  
> which for 900 packages would be unfeasible. Will have to see if the  
> process can be automated.

Not that they would care one way or another but maybe we should  
contact the CPAN maintainers to get their thoughts.  They might have  
some ideas...

>> How brittle is all the Build.PL code that will be needed to  
>> automate  all of this, and how difficult will it be to maintain?  
>> For example,  if someone adds in 10 new modules, what Build.PL- 
>> related work will  need to be done?
>
> Well, my plan will be that once the work is done, you won't need to  
> touch the Build.PL code again. My intent is that the pumpkin can  
> just type one command and not think about anything.
>
> As for the reality, I won't know until I think about it properly  
> and experiment.

A good experiment for a branch.  I still think this could be  
accomplished step-wise; for instance run a quick test using something  
with a simple dependency tree like Bio::Root::Root (only needs  
RootI), finish up with Bio::Root*, then work down into PrimarySeq,  
Seq, etc.  Submit them to CPAN piecemeal or in batches (all  
Bio::Seq*, so on).

If the Build.PL, etc are to be generated on the fly then maybe there  
should be a simple way of registering or matching tests to modules  
(or vice versa) to ease the pain, particularly for new code.

chris


From hlapp at gmx.net  Sat Jun 30 22:56:04 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 22:56:04 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
	<18052.3946.224905.415905@almost.alerce.com>
	<2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>
	<A348C2D6-F00B-4E76-A78F-E192A912E785@uiuc.edu>
Message-ID: <E250DB37-E2C1-4F71-A2FE-B64603EB69FD@gmx.net>

It turns out that both files are also present on the release-0-9-3,  
bioperl-1-0-0, bioperl-1-0-alpha, and bioperl-1-0-alpha2-rc tags, so add

$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-3/t/data/HUMBETGLOA.fasta
$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-0/t/data/HUMBETGLOA.fasta
$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-alpha/t/data/HUMBETGLOA.fasta
$ svn rm -m "Removing offending duplicate" svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/bioperl-1-0-alpha2-rc/t/data/HUMBETGLOA.fasta

to the post-processing commands.
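
If more tags turn up, the same commands could be generated rather than
typed. A minimal sketch, assuming the repository path used above; the
function name print_rm_commands is hypothetical, and it only prints the
commands so they can be reviewed before being run:

```shell
# Repository root as used in this thread.
repo='svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live'

# Hypothetical helper: print one `svn rm` command per tag name given as an
# argument. Nothing is executed; pipe the output to sh once it looks right.
print_rm_commands() {
    for tag in "$@"; do
        printf 'svn rm -m "Removing offending duplicate" %s/tags/%s/t/data/HUMBETGLOA.fasta\n' \
            "$repo" "$tag"
    done
}
```

For example: `print_rm_commands release-0-9-3 bioperl-1-0-0 bioperl-1-0-alpha bioperl-1-0-alpha2-rc`.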

	-hilmar

On Jun 30, 2007, at 8:40 PM, Chris Fields wrote:

> Checkout worked for me (Mac OS X) using both:
>
> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- 
> live/tags/release-0-9-2/t/data
> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- 
> live/tags/release-0-9-2/
>
> so removing the offending file worked (good catch!).  Haven't run a  
> full co but probably isn't necessary.
>
> chris
>
> On Jun 30, 2007, at 6:36 PM, Hilmar Lapp wrote:
>
>>
>> On Jun 28, 2007, at 3:43 PM, George Hartzell wrote:
>>
>>> I just did the experiment, and filename-insensitivity seems to be
>>> breaking something.
>>>
>>> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.
>>>
>>> I reformatted a memory stick to be case sensitive and co of
>>>
>>>   bioperl/bioperl-live/tags/release-0-9-2/t
>>>
>>> worked, then I made a directory in my home dir (normal mac thing)  
>>> and
>>> got the same error as above.
>>
>> You picked up a rename of a file from lower case extension to  
>> upper case extension. Unfortunately, there are several months  
>> between adding the upper-case and removing the lower-case version.
>>
>> We can reconstruct what happened with this using svn log on the  
>> directory (this does not require a checkout):
>>
>> $ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/ 
>> bioperl/bioperl-live/trunk/t/data
>>
>> Searching for HUMBETGLOA yields the following two commits that  
>> added one and removed the other:
>>
>> --------------------------------------------------------------------- 
>> ---
>> r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2  
>> lines
>> Changed paths:
>>    M /bioperl-live/trunk/t/SearchIO.t
>>    A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA
>>    A /bioperl-live/trunk/t/data/cysprot1.FASTA
>>
>> added tests for FASTA
>>
>> --------------------------------------------------------------------- 
>> ---
>> r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2  
>> lines
>> Changed paths:
>>    A /bioperl-live/trunk/t/data/HUMBETGLOA.fa
>>    D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta
>>
>> renaming file to avoid clobbering on windows
>>
>> Unfortunately, both files are in the tag (again, no checkout  
>> required):
>>
>> $ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/ 
>> bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i  
>> fasta
>> HUMBETGLOA.FASTA
>> HUMBETGLOA.fasta
>>
>> We can remove the offending version from the repository (again,  
>> without needing a checkout):
>>
>> $ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl- 
>> live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta
>>
>> I did this, and now the tag checks out fine on OSX. Can anyone  
>> confirm?
>>
>> (BTW the ability to operate on the repository w/o needing a  
>> checkout is another advantage of svn)
>>
>> 	-hilmar
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Fri Jun  1 08:06:04 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 01 Jun 2007 09:06:04 +0100
Subject: [Bioperl-l] ClustalW Score?
In-Reply-To: <1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu>
References: <00e201c7a2de$91f60f50$2d01a8c0@PICO><DFEEDFC9-68C4-4821-846F-69AC9559C70B@bioperl.org><465E9B58.1020403@sendu.me.uk>	<49B6333A-18B9-4B63-80EF-81C57A295494@bioperl.org>
	<1A4207F8295607498283FE9E93B775B40334A01A@EX02.asurite.ad.asu.edu>
Message-ID: <465FD36C.5060603@sendu.me.uk>

Kevin Brown wrote:
>> you're right --- it is not really my code, I was just 
>> elaborating Kevin's example --- it would probably need to be 
>> more specific or perhaps the last Score seen is sufficient 
>> for what one is trying to capture?
> 
> I took that code from a pairwise clustal alignment script that I wrote
> to deal with aligning a bunch of short sequences against a long one to
> see where they line up at.  When all of them were fed to Clustal the
> short sequences all ended up aligned to each other and not well aligned
> to the longer sequence.  I only saw one score in the output from the
> pairwise, so that is what I used to find a reasonable value.

Ok, well I've hedged my bets and used both. Now committed to CVS.
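
For anyone following along, "using both" could look roughly like the
sketch below. It is not the committed code: the function name and the
sample output lines are assumptions, based on the usual ClustalW screen
output with a per-pair "Score:" line and a final "Alignment Score" line.

```perl
use strict;
use warnings;

# Return the last pairwise "Score:" and the last "Alignment Score" seen
# in a block of ClustalW screen output (either may be undef).
sub clustalw_scores {
    my ($output) = @_;
    my ( $pair_score, $aln_score );
    for my $line ( split /\n/, $output ) {
        $pair_score = $1 if $line =~ /Aligned\.\s+Score:\s*(-?\d+)/;
        $aln_score  = $1 if $line =~ /^Alignment Score\s+(-?\d+)/;
    }
    return ( $pair_score, $aln_score );
}

# Assumed sample output, for illustration only.
my $sample = <<'END';
Sequences (1:2) Aligned. Score:  88
Guide tree file created:   [example.dnd]
Alignment Score 1437
END

my ( $pair, $aln ) = clustalw_scores($sample);
printf "pairwise=%s alignment=%s\n",
    defined $pair ? $pair : 'n/a',
    defined $aln  ? $aln  : 'n/a';
```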


From jy at genseq.co.uk  Sat Jun  2 02:39:48 2007
From: jy at genseq.co.uk (Jean-Yves Sireau)
Date: Sat, 2 Jun 2007 10:39:48 +0800
Subject: [Bioperl-l] Genseq
Message-ID: <20070602103948.093d713c@jys.my.regentmarkets.com>

Dear List members,

I would like to let you know of the formation of Genseq Ltd., a
bioinformatics company that will (in time!) offer genome sequencing to
high net worth individuals and bioinformatic analysis of the sequence
data to detect predisposition to illness.  The company's website is
www.genseq.co.uk

Genseq would be willing to sponsor bioperl, whether financially or by
providing resources, notably for any bioperl-related activities in the
Asia Pacific region.  Genseq's bioinformatics team will be based in
Cyberjaya (Malaysia), and we are particularly interested in promoting
bioperl in Malaysia.  We are also actively recruiting at the moment
in Malaysia and India.

If there was sufficient demand, we would be willing to organise a
bioperl conference in Cyberjaya at the Cyberview Lodge
(www.cyberview-lodge.com), which would be the ideal place for such a
conference in Malaysia.

Looking forward to your comments, suggestions and proposals.

Best regards
Jean-Yves Sireau

-- 

Jean-Yves Sireau
CEO, Genseq Ltd.
www.genseq.co.uk


From cjfields at uiuc.edu  Sat Jun  2 05:16:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 00:16:05 -0500
Subject: [Bioperl-l] EUtilities overhaul started
Message-ID: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>

To anyone using Bio::DB::EUtilities,

I am in the midst of a major overhaul to the various EUtilities tools  
and to Bio::DB::GenericWebDBI (the latter which I am forming into  
more or less a test bed for other database interfaces).  I'm about  
80% done at this point, and will likely start committing changes this  
coming week.

The overall interface will change (something I had warned about in  
the Bio::DB::EUtilities POD) but I am hoping it will be more  
intuitive and easier to use in the long run.  I'll describe the  
overall redesign and use in an upcoming HOWTO (as recommended by  
Brian a while back).

If anyone has any suggestions/ideas/flames, please let me know!

Cheers!

chris


From cjfields at uiuc.edu  Sat Jun  2 14:39:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 09:39:25 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <e572b3c70706020628v71b10e7bm34cebfab4954890c@mail.gmail.com>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
	<e572b3c70706020628v71b10e7bm34cebfab4954890c@mail.gmail.com>
Message-ID: <AF243C87-B82E-4C33-939D-2B84B9E41537@uiuc.edu>

Yes, there are a few odd issues, though that's one I've not heard of  
yet.  You might try one of the sub-nucleotide databases (nuccore,  
nucest, nucgss).

I'll try looking into it and (if necessary) pester NCBI about it.   
I'll pass this on to the mail list to see if anyone else knows about  
the problem.

chris

On Jun 2, 2007, at 8:28 AM, Bernd Brandt wrote:

> Hi Chris,
>
> Thanks for your work on EUtilities.
> For a production task, I used EUtilities directly (given your
> announced overhaul). I noticed a recent problem at NCBI (reported two
> weeks ago to NCBI, no reply yet). Possibly you may run into this with
> testing: if you ePOST gi ids to the EU server and then use this set in
> Esearch (using the query key) no results are returned for the
> nucleotide database.
> ESearches like "db=$db%23$QueryKey" typically fail if the $db is
> nucleotide (but work if $db='protein'). The XML output has Count 0 and
> an empty QueryTranslationSet for db=nucleotide only.
> For completeness, I attach a simple test script I used.
>
>
> Best regards,
> Bernd
>
>
> On 6/2/07, Chris Fields <cjfields at uiuc.edu> wrote:
>> To anyone using Bio::DB::EUtilities,
>>
>> I am in the midst of a major overhaul to the various EUtilities tools
>> and to Bio::DB::GenericWebDBI (the latter which I am forming into
>> more or less a test bed for other database interfaces).  I'm about
>> 80% done at this point, and will likely start committing changes this
>> coming week.
>>
>> The overall interface will change (something I had warned about in
>> the Bio::DB::EUtilities POD) but I am hoping it will be more
>> intuitive and easier to use in the long run.  I'll describe the
>> overall redesign and use in an upcoming HOWTO (as recommended by
>> Brian a while back).
>>
>> If anyone has any suggestions/ideas/flames, please let me know!
>>
>> Cheers!
>>
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> <EUsearch.pl>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Jun  3 04:51:57 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 2 Jun 2007 23:51:57 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <e572b3c70706020948l708f14c8q706b65c73617c86d@mail.gmail.com>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
	<e572b3c70706020628v71b10e7bm34cebfab4954890c@mail.gmail.com>
	<AF243C87-B82E-4C33-939D-2B84B9E41537@uiuc.edu>
	<e572b3c70706020948l708f14c8q706b65c73617c86d@mail.gmail.com>
Message-ID: <1A2AF5C4-6A58-4FDD-A4CA-6ABCE30F0D1B@uiuc.edu>

I can confirm this; however it only relates to the use of history  
with esearch and nucleotide (use of the history with other eutils  
seems to work fine); retrieving sequences via efetch is not  
affected.  If I find out anything more I'll post something on the  
mail list.
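
For reference, the workaround amounts to pointing the history-based
esearch at a sub-database such as nuccore instead of nucleotide. A
minimal sketch that only builds the request URL (no network access);
the function name and the example WebEnv value are made up, and the
WebEnv string is assumed to be URL-safe already:

```perl
use strict;
use warnings;

# Build an ESearch URL that reuses a history set created by EPost.
# query_key and web_env are the values returned by the EPost call.
sub esearch_history_url {
    my (%arg) = @_;
    my $base  = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi';
    my @pairs = (
        [ db         => $arg{db} ],
        [ term       => '%23' . $arg{query_key} ],    # "#<key>", URL-encoded
        [ WebEnv     => $arg{web_env} ],
        [ usehistory => 'y' ],
    );
    return $base . '?' . join( '&', map { "$_->[0]=$_->[1]" } @pairs );
}

print esearch_history_url(
    db        => 'nuccore',           # sub-database instead of 'nucleotide'
    query_key => 1,
    web_env   => 'EXAMPLE_WEBENV',    # placeholder, not a real WebEnv
), "\n";
```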

chris

On Jun 2, 2007, at 11:48 AM, Bernd Brandt wrote:

> I can confirm that using the correct sub-nucleotide database works
> (nuccore in my case).
> This seems to be a quite recent change/bug at NCBI. Until recently,
> db=nucleotide worked. Moreover, EInfo still lists nucleotide as valid
> db.
> It is not optimal to have to choose the sub-database and the searches
> work via the Entrez web-interface. Note that this problem is related
> to the ESearch and db=nucleotide.
>
> bernd
>
> On 6/2/07, Chris Fields <cjfields at uiuc.edu> wrote:
>> Yes, there are a few odd issues, though that's one I've not heard of
>> yet.  You might try one of the sub-nucleotide databases (nuccore,
>> nucest, nucgss).
>>
>> I'll try looking into it and (if necessary) pester NCBI about it.
>> I'll pass this on to the mail list to see if anyone else knows about
>> the problem.
>>
>> chris
>>
>> On Jun 2, 2007, at 8:28 AM, Bernd Brandt wrote:
>>
>> > Hi Chris,
>> >
>> > Thanks for your work on EUtilities.
>> > For a production task, I used EUtilities directly (given your
>> > announced overhaul). I noticed a recent problem at NCBI (reported two
>> > weeks ago to NCBI, no reply yet). Possibly you may run into this with
>> > testing: if you ePOST gi ids to the EU server and then use this set in
>> > Esearch (using the query key) no results are returned for the
>> > nucleotide database.
>> > ESearches like "db=$db%23$QueryKey" typically fail if the $db is
>> > nucleotide (but work if $db='protein'). The XML output has Count 0 and
>> > an empty QueryTranslationSet for db=nucleotide only.
>> > For completeness, I attach a simple test script I used.
>> >
>> >
>> > Best regards,
>> > Bernd
>> >
>> >
>> > On 6/2/07, Chris Fields <cjfields at uiuc.edu> wrote:
>> >> To anyone using Bio::DB::EUtilities,
>> >>
>> >> I am in the midst of a major overhaul to the various EUtilities tools
>> >> and to Bio::DB::GenericWebDBI (the latter which I am forming into
>> >> more or less a test bed for other database interfaces).  I'm about
>> >> 80% done at this point, and will likely start committing changes this
>> >> coming week.
>> >>
>> >> The overall interface will change (something I had warned about in
>> >> the Bio::DB::EUtilities POD) but I am hoping it will be more
>> >> intuitive and easier to use in the long run.  I'll describe the
>> >> overall redesign and use in an upcoming HOWTO (as recommended by
>> >> Brian a while back).
>> >>
>> >> If anyone has any suggestions/ideas/flames, please let me know!
>> >>
>> >> Cheers!
>> >>
>> >> chris
>> >> _______________________________________________
>> >> Bioperl-l mailing list
>> >> Bioperl-l at lists.open-bio.org
>> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >>
>> >> <EUsearch.pl>
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From basu at pharm.stonybrook.edu  Sun Jun  3 14:44:18 2007
From: basu at pharm.stonybrook.edu (Siddhartha Basu)
Date: Sun, 03 Jun 2007 10:44:18 -0400
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
Message-ID: <web-5961520@pharm.stonybrook.edu>

On Sat, 2 Jun 2007 00:16:05 -0500
  Chris Fields <cjfields at uiuc.edu> wrote:
> To anyone using Bio::DB::EUtilities,
> 
> I am in the midst of a major overhaul to the various EUtilities tools
> and to Bio::DB::GenericWebDBI (the latter which I am forming into
> more or less a test bed for other database interfaces).  I'm about
> 80% done at this point, and will likely start committing changes this
> coming week.
> 
> The overall interface will change (something I had warned about in
> the Bio::DB::EUtilities POD) but I am hoping it will be more
> intuitive and easier to use in the long run.  I'll describe the
> overall redesign and use in an upcoming HOWTO (as recommended by
> Brian a while back).

Hi Chris,
As a frequent user of EUtilities, I expect this API facelift
and the upcoming HOWTO to be very helpful.
One thing I noticed is that each eutil call (efetch, epost,
esearch, esummary) needs a newly instantiated
'Bio::DB::EUtilities' object, and its parameters cannot be
changed afterwards at runtime, e.g. with $eutils->id($newid).
For example:

my $eutils = Bio::DB::EUtilities->new( -id    => $id,
                                       -eutil => 'esummary',
                                       -db    => 'protein',
                                     );
my $ct = $eutils->get_response->content();

## -- now I cannot do this...
$eutils->id($newid);
$ct = $eutils->get_response->content();

Is the new API going to address this, or is there currently
any way to reuse the object?
Thanks again for this nice toolkit.

-siddhartha


> 
> If anyone has any suggestions/ideas/flames, please let me know!
> 
> Cheers!
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Sun Jun  3 23:52:39 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 3 Jun 2007 18:52:39 -0500
Subject: [Bioperl-l] EUtilities overhaul started
In-Reply-To: <web-5961520@pharm.stonybrook.edu>
References: <BA26E0D6-5D3F-45E5-A1D2-2409290B3A61@uiuc.edu>
	<web-5961520@pharm.stonybrook.edu>
Message-ID: <5120BD7B-CA89-46E4-8D6B-6B24C1F93A5E@uiuc.edu>

On Jun 3, 2007, at 9:44 AM, Siddhartha Basu wrote:

> ...
> Hi chris,
> Being a frequent user of EUtilities, hopefully this api facelift  
> and upcoming howto will definitely be more helpful.
> Anyway, one thing i noticed that for each eutil call such as  
> efetch,epost,esearch,esummary a new 'Bio::DB::EUtilities' object has  
> to be
> instantiated. And thereafter it cannot be set during runtime such as
> $eutils->id('ids'), for example....
>
> my $eutils = Bio::DB::EUtilities->new ( -id => $id,
>                                        -eutil => 'esummary',
>                                        -db => 'protein',
>                                      );
> my $ct = $eutils->get_response->content();
>
> ## -- now i cannot do this...
> $eutils->id($newid);
> my $ct = $eutils->get_response->content();

I'll have to check up on that, though changing id() should work with  
the old API.  It won't matter with the new API (it works fine), but  
it is still troubling...

> Is the new api going to address something along this line or is  
> there currently anyway to reuse
> the object.
> Thanks again for this nice toolkit.
>
> -siddhartha

The old API was based upon the idea of creating discrete user agents  
for each eutil to retrieve data.  The problem with the old interface  
is it attempts to do too much (take care of parameters, set up  
requests, retrieve responses, parse data, etc), and many tasks  
required instantiating a new EUtilities object.  I was never really  
satisfied with it.

The new interface is a composition of three classes: the web user  
agent (LWP::UserAgent), a class encapsulating parameter handling, and  
a parser class (all of which can be used independently if needed).  When  
parameters change a new request is made 'lazily' (i.e. only when  
needed).  Similarly, when data is requested after any parameter  
change a new parser instance is created and the new response is parsed.

With that in mind you can now do the following:
----------------------------------------
my @params = (-eutil => 'esearch',
               -db    => 'protein',
               -term => 'BRCA1',
               -retmax => 100);

my $eutil = Bio::DB::EUtilities->new(@params);

# no need to get response first; get_ids() calls that if needed

my @ids = $eutil->get_ids;

# below changes only those parameters, leaves all others set as before
$eutil->set_parameters(-eutil => 'efetch',
                        -id  => \@ids,
                        -retmode => 'text',
                        -rettype => 'fasta');

# sends streamed content directly to a file
$eutil->get_response(-content_file => 'seqs.fas');

# or to a LWP::UserAgent-supported request callback
$eutil->get_response(-content_cb => \&my_cb);

my @newparams = (-eutil => 'esearch',
               -db    => 'protein',
               -term => 'BRCA2',
               -retmax => 100);

# Resets eutility to passed parameters (or undef)
$eutil->reset_parameters(@newparams);

# retrieve new IDs
my @new_ids = $eutil->get_ids;
----------------------------------------

Note the same eutil object is used for all of the above, so to answer  
your last question, yes, you should be able to create data pipelines  
using the same object if necessary.
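
To make the 'lazy' part concrete, here is a toy sketch of the idea in
plain Perl. It is not the Bio::DB::EUtilities implementation: the class
name LazyQuery, its fields, and the simulated "request" (a joined
parameter string instead of an HTTP call) are all invented for
illustration.

```perl
use strict;
use warnings;

package LazyQuery;    # hypothetical class, for illustration only

sub new {
    my ( $class, %params ) = @_;
    return bless { params => {%params}, dirty => 1, fetches => 0 }, $class;
}

# Changing any parameter only marks the cached response as stale.
sub set_parameters {
    my ( $self, %new ) = @_;
    @{ $self->{params} }{ keys %new } = values %new;
    $self->{dirty} = 1;
    return $self;
}

# The (simulated) request happens here, and only if something changed.
sub get_response {
    my ($self) = @_;
    if ( $self->{dirty} ) {
        my $p = $self->{params};
        $self->{response} = join '&', map { "$_=$p->{$_}" } sort keys %$p;
        $self->{fetches}++;
        $self->{dirty} = 0;
    }
    return $self->{response};
}

package main;

my $q = LazyQuery->new( eutil => 'esearch', db => 'protein', term => 'BRCA1' );
$q->get_response for 1 .. 3;              # only the first call "fetches"
$q->set_parameters( term => 'BRCA2' );    # nothing is fetched yet
print $q->get_response, "\n";             # second simulated fetch
print "fetches: $q->{fetches}\n";
```

The same object is reused throughout; only a parameter change followed
by a data request triggers new work.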

chris


From sac at bioperl.org  Mon Jun  4 17:56:57 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Mon, 4 Jun 2007 10:56:57 -0700
Subject: [Bioperl-l] question about Bio::Restriction::Analysis
In-Reply-To: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu>
References: <3E4CBE0B-6EE4-4973-80DF-90C7E778DA83@cshl.edu>
Message-ID: <8f200b4c0706041056o4dbaadfexddf9f82fc33c6da@mail.gmail.com>

Hi Apurva,

I'm cc:ing the list to let others know you have found performance
issues with Bio::Restriction::Analysis. Ideally, we should focus on
addressing those issues rather than fixing a module that is now
deprecated.

But taking a quick look at my Bio::Tools::RestrictionEnzyme module,
I'm not sure why HpaII would give slower performance relative to other
non-ambiguous cutters. This enzyme has a 4-base recognition sequence
CCGG, and if you're feeding it a large CG-rich input sequence, that
could be a factor. To test, you might try using some other 4-base
cutters that aren't CG-rich (TaqI, TasI) or try some other input
sequences. There is no special flag to indicate that the enzyme is
non-ambiguous. The module handles that automatically.
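
To check quickly whether the input really is dense in CCGG sites,
independent of either module, something like the sketch below can
count the fragments. It is plain Perl, looks at the given strand of a
linear sequence only, and the function name is made up; HpaII cuts
C^CGG.

```perl
use strict;
use warnings;

# Cut a linear sequence at every HpaII site (C^CGG) and return the fragments.
sub hpaii_fragments {
    my ($seq) = @_;
    # Zero-width split point: after a C that is followed by CGG.
    return split /(?<=C)(?=CGG)/i, $seq;
}

my $seq   = 'AACCGGTTTTCCGGA';
my @frags = hpaii_fragments($seq);
print scalar(@frags), " fragments: @frags\n";
```

Comparing the fragment count per kb for a few input sequences should
show whether site density, rather than the module, explains the
slowdown.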

Good luck,
Steve

On 6/4/07, Apurva Narechania <apurva at cshl.edu> wrote:
> Hi Rob and Steve,
>
> I was hoping you could answer a quick performance question regarding
> the Bio::Restriction::Analysis module. I have found that though this
> module works well, it is considerably slower than the deprecated
> Bio::Tools::RestrictionEnzyme. I see that there are two algorithms
> available to your module, and since I am using HpaII, a non-ambiguous
> enzyme, I thought I might find similar performance to the older,
> deprecated module, but I do not. Is it possible that I am not setting
> the non-ambiguous flag correctly? Does it need to be set in the first
> place?
>
> As far as Bio::Tools::RestrictionEnzyme, though it is faster, I have
> found instances where it is inaccurate, especially in calculating
> fragments of extremely small size 1-5 base pairs, so I would like to
> use your module if possible. It just seems slow to me.
>
> Can you clarify?
>
> I have copied my code below since it is a short, simple script.
>
> Thanks!
> Apurva Narechania
> Ware Lab
> Cold Spring Harbor Labs
>
> ----------
>
> #!/usr/bin/perl
>
> # This program generates a fasta of restriction frags given an
> # input fasta and a restriction cut site
>
> use Getopt::Std;
> use Bio::Seq;
> use Bio::SeqIO;
> use strict;
>
> use Bio::Tools::RestrictionEnzyme;
>
> my %opts = ();
> getopts ('f:', \%opts);
> my $fasta  = $opts{'f'};
>
> # read fasta file
> my $seqin = Bio::SeqIO -> new (-format => 'Fasta', -file => "$fasta");
>
> my $x = 0;
> while (my $sequence_obj = $seqin -> next_seq()){
>      $x++;
>      my $id = $sequence_obj->id();
>
>      print STDERR "$x Working on $id\n";
>
>      # generate the rx object
>      my $ra = new Bio::Tools::RestrictionEnzyme(-NAME=>'HpaII');
>
>      my @frags = $ra->cut_seq($sequence_obj);
>
>      my $counter = 0;
>      foreach my $frag (@frags){
>          $counter++;
>          my $length = length ($frag);
>          print ">$id.$counter length=$length\n$frag\n";
>      }
>
> }
>
>


From anhthu.tieu at gsf.de  Tue Jun  5 08:14:09 2007
From: anhthu.tieu at gsf.de (Tieu, Anh-Thu)
Date: Tue, 5 Jun 2007 10:14:09 +0200
Subject: [Bioperl-l] problems with image maps and IE 6 or higher
Message-ID: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de>

Hi, 

 I have a problem using the bioperl image map function with IE6 or
 higher. It might be a more general problem with IE6 rather than with bioperl,
 but as I used bioperl to create my image maps, I thought I could still post this problem 
 here and ask for people's opinion. I wondered if anyone else faced the same problem and if
 possible if anyone could share their experiences and their solutions. 
 
  
<div>
<p><img src="/ggtc/tmp_bilder/19727dab708e1cbf567dd48480febb96.png" usemap="mapnameD064C01" style="border:2px solid #CCCCCC;"/></p>
<map name="mapnameD064C01" id="mapnameD064C01">
<area shape="rect" coords="108,0,608,20" href="javascript:void(0)" onclick="javascript:void(zmenu( 'scale' ));;return false;" title="scale " alt="scale " target="_blank"/>
<area shape="rect" coords="234,44,244,55" href="javascript:void(0)" onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', 'start: ', '', 'stop: 0', '', 'length:  bp', '', 'identity: ', '', 'e-value: ' ));;return false;" title="alignment5 " alt="alignment5 " target="_blank"/>
<area shape="rect" coords="241,57,247,68" href="javascript:void(0)" onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '', 'start: ', '', 'stop: 0', '', 'length:  bp', '', 'identity: ', '', 'e-value: ' ));;return false;" title="integration_pt " alt="integration_pt " target="_blank"/>
<area shape="rect" coords="108,70,608,81" href="javascript:void(0)" onclick="javascript:void(zmenu( 'Nphs1                                   ', '', 'ensembl_id: ENSMUSG00000006649', '', 'start: 30168485', '', 'stop: 30195968', '', 'length: 27483 bp' ));;return false;" title="gene " alt="gene " target="_blank"/>
<area shape="rect" coords="108,83,117,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'exon1', '', 'start: 30168485', '', 'stop: 30169003', '', 'length: 518 bp' ));;return false;" title="exon1 " alt="exon1 " target="_blank"/>
<area shape="rect" coords="117,83,119,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'intron1', '', 'start: 30169004', '', 'stop: 30169083', '', 'length: 79 bp ' ));;return false;" title="intron1 " alt="intron1 " target="_blank"/>
<area shape="rect" coords="119,83,123,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'exon2', '', 'start: 30169084', '', 'stop: 30169299', '', 'length: 215 bp' ));;return false;" title="exon2 " alt="exon2 " target="_blank"/>
<area shape="rect" coords="123,83,124,94" href="javascript:void(0)" onclick="javascript:void(zmenu( 'intron2', '', 'start: 30169300', '', 'stop: 30169373', '', 'length: 73 bp ' ));;return false;" title="intron2
...
</div>


 This is part of the code I used in my HTML file to display the image map and it really runs beautifully
 with Mozilla 1.7 or the latest Firefox version. However, if used in IE6 the clickable pop-ups do not appear/ work.
 
 I appreciate any help and would like to thank everyone for their help. 
 
 Best regards, 
 
 
 Anh-Thu
________________________________________________________________________
GSF-Forschungszentrum

Ingolstädter Landstr. 1

85764 München-Neuherberg, Germany

Chairman of Supervisory Board: MinDir Dr. Peter Lange

Board of Directors: Prof. Dr. Günther Wess and Dr. Nikolaus Blum

Register of Societies: Amtsgericht München HRB 6466


From lstein at cshl.edu  Tue Jun  5 13:56:57 2007
From: lstein at cshl.edu (Lincoln Stein)
Date: Tue, 5 Jun 2007 09:56:57 -0400
Subject: [Bioperl-l] problems with image maps and IE 6 or higher
In-Reply-To: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de>
References: <93739F94E0F3BA43AD72423E2482341A1435F6@sw-rz010.gsf.de>
Message-ID: <6dce9a0b0706050656n783d27b3u9229f948b2710d90@mail.gmail.com>

Hi Anh-Thu,

Could you send me a snippet of the code that is generating this imagemap? It
looks like you are relying on a javascript library for the zmenu() call, and
it may be that this library is in need of updating.

You might also consider replacing the library with Sheldon McKay's popup
balloon library, located at
http://www.wormbase.org/wiki/index.php/Balloon_Tooltips

Lincoln

On 6/5/07, Tieu, Anh-Thu <anhthu.tieu at gsf.de> wrote:
>
> Hi,
>
> I have a problem using the bioperl image maps function with the IE6 or and
> higher browser. It might be a more general problem with IE6 rather than
> with bioperl,
> but as I used bioperl to create my image maps, I thought I could still
> post this problem
> here and ask for people's opinion. I wondered if anyone else faced the
> same problem and if
> possible if anyone could share their experiences and their solutions.
>
>
> <div>
> <p><img src="/ggtc/tmp_bilder/19727dab708e1cbf567dd48480febb96.png"
> usemap="mapnameD064C01" style="border:2px solid #CCCCCC;"/></p>
> <map name="mapnameD064C01" id="mapnameD064C01">
> <area shape="rect" coords="108,0,608,20" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'scale' ));;return false;" title="scale "
> alt="scale " target="_blank"/>
> <area shape="rect" coords="234,44,244,55" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '',
> 'start: ', '', 'stop: 0', '', 'length:  bp', '', 'identity: ', '', 'e-v
> alue: ' ));;return false;" title="alignment5 " alt="alignment5 "
> target="_blank"/>
> <area shape="rect" coords="241,57,247,68" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'alignment 5splk', '', 'seq_id: ', '',
> 'start: ', '', 'stop: 0', '', 'length:  bp', '', 'identity: ', '', 'e-v
> alue: ' ));;return false;" title="integration_pt " alt="integration_pt "
> target="_blank"/>
> <area shape="rect" coords="108,70,608,81" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'Nphs1                                   ',
> '', 'ensembl_id: ENSMUSG00000006649', '', 'start: 30168485', '', '
> stop: 30195968', '', 'length: 27483 bp' ));;return false;" title="gene "
> alt="gene " target="_blank"/>
> <area shape="rect" coords="108,83,117,94" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'exon1', '', 'start: 30168485', '', 'stop:
> 30169003', '', 'length: 518 bp' ));;return false;" title="exon1 " a
> lt="exon1 " target="_blank"/>
> <area shape="rect" coords="117,83,119,94" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'intron1', '', 'start: 30169004', '', 'stop:
> 30169083', '', 'length: 79 bp ' ));;return false;" title="intron1
> " alt="intron1 " target="_blank"/>
> <area shape="rect" coords="119,83,123,94" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'exon2', '', 'start: 30169084', '', 'stop:
> 30169299', '', 'length: 215 bp' ));;return false;" title="exon2 " a
> lt="exon2 " target="_blank"/>
> <area shape="rect" coords="123,83,124,94" href="javascript:void(0)"
> onclick="javascript:void(zmenu( 'intron2', '', 'start: 30169300', '', 'stop:
> 30169373', '', 'length: 73 bp ' ));;return false;" title="intron2
> ..
> </div>
>
>
> This is part of the code I used in my HTML file to display the image map
> and it really runs beautifully
> with Mozilla 1.7 or the latest Firefox version. However, if used in IE6
> the clickable pop-ups do not appear/ work.
>
> I appreciate any help and would like to thank everyone for their help.
>
> Best regards,
>
>
> Anh-Thu
> ________________________________________________________________________
> GSF-Forschungszentrum
>
> Ingolstädter Landstr. 1
>
> 85764 München-Neuherberg, Germany
>
> Chairman of Supervisory Board: MinDir Dr. Peter Lange
>
> Board of Directors: Prof. Dr. Günther Wess and Dr. Nikolaus Blum
>
> Register of Societies: Amtsgericht München HRB 6466
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From cjfields at uiuc.edu  Tue Jun  5 15:28:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 5 Jun 2007 10:28:24 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <46656D64.7010508@ribosome.natur.cuni.cz>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
Message-ID: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>

Martin,

The example file you give in the bioperl bugzilla report has several  
blank annotation lines which may lead to additional problems.  When  
the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM,  
DEFINITION, etc) then it expects there will also be relevant data  
(text descriptions) accompanying it; I assume the BioPython parser  
expects likewise though I may be wrong.

AFAIK the inclusion of field names w/o text isn't
GenBank/EMBL-compliant.  GenBank records lacking text either have a
'.' instead or are left out entirely:

http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html

We could add a fix but you should probably contact the ApE developers  
and request that field names w/o text be left out or have '.' added.
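
As a stop-gap on the user side, the bare field names could be given a
'.' before parsing. A rough sketch in plain Perl, assuming chomped
input lines; the function name and the keyword list are illustrative
only, and the 12-column keyword field follows the usual GenBank flat
file layout:

```perl
use strict;
use warnings;

# Give bare GenBank keyword lines (keyword with no text) a '.' placeholder.
# Only the keywords listed here are touched; the list is illustrative.
sub fill_empty_fields {
    my (@lines) = @_;
    my $keywords = qr/DEFINITION|ACCESSION|VERSION|KEYWORDS|SOURCE|ORGANISM/;
    for my $line (@lines) {
        if ( $line =~ /^(\s{0,2}(?:$keywords))\s*$/ ) {
            $line = sprintf '%-12s.', $1;    # text starts in column 13
        }
    }
    return @lines;
}

my @record = ( 'DEFINITION  ', 'SOURCE', '  ORGANISM', 'KEYWORDS    cloning vector' );
print "$_\n" for fill_empty_fields(@record);
```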

chris

On Jun 5, 2007, at 9:04 AM, Martin MOKREJŠ wrote:

> Ezequiel Panepucci wrote:
>>>     genbank entry = parser.parse(fhandle)
>>
>> there is a space character between "genbank" and "entry".
>> It is a syntax error.
>> I suppose you meant "genbank_entry" ?
>
> Yes, the next command was right and has shown the error. Sorry, I  
> forgot
> to delete the first attempt. ;-)
>
>>>> genbank_entry = parser.parse(fhandle)
> Traceback (most recent call last):
>  File "<stdin>", line 1, in ?
>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py",  
> line 187, in parse
>    self._scanner.feed(handle, self._consumer)
>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",  
> line 360, in feed
>    self._feed_first_line(consumer, self.line)
>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",  
> line 835, in _feed_first_line
>    assert False, \
> AssertionError: Did not recognise the LOCUS line layout:
> LOCUS               6499 bp ds-DNA     linear       02-AUG-2006
>
>>>>
>
> Martin
> _______________________________________________
> BioPython mailing list  -  BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From stewarta at nmrc.navy.mil  Tue Jun  5 15:34:14 2007
From: stewarta at nmrc.navy.mil (Andrew Stewart)
Date: Tue, 5 Jun 2007 11:34:14 -0400
Subject: [Bioperl-l] Setting attributes on a Bio::DB::GFF::Feature object
Message-ID: <95C9F539-A4C4-4B6A-8DA8-079B957BF909@nmrc.navy.mil>

I see bidirectional mutator methods for source, type, strand, etc. in  
the Bio::DB::GFF::Feature documentation but I see that ->attributes  
is only able to get and not set the feature attributes.  Is there no  
way to modify the attributes of a Bio::DB::GFF::Feature live?


--
Andrew Stewart
Research Assistant, Genomics Team
Navy Medical Research Center (NMRC)
Biological Defense Research Directorate (BDRD)
BDRD Annex
12300 Washington Avenue, 2nd Floor
Rockville, MD 20852

email: stewarta at nmrc.navy.mil
phone: 301-231-6700 Ext 270


From cjfields at uiuc.edu  Tue Jun  5 16:07:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 5 Jun 2007 11:07:41 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
	<24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
Message-ID: <D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>

One thing I missed which explains the biopython error: the LOCUS line  
is missing the locus identifier (see the NCBI example record link).   
This doesn't choke the bioperl parser but it appears to stop the  
biopython parser in its tracks (maybe a feature instead of a bug!).

You should try adding a unique identifier (maybe the name of the file  
or record) to the LOCUS line to see if it works:

LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006

The bioperl parser in CVS writes out the correct alphabet when this  
is added:

LOCUS       testfile                6499 bp    ds-DNA  linear   02-AUG-2006

I'll try adding a warning to the bioperl parser for this.
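
In the meantime the identifier can be patched in before parsing. A
minimal sketch in plain Perl, assuming the only problem is the missing
name; the function name is made up, and the output spacing is not
guaranteed to match the strict GenBank column layout:

```perl
use strict;
use warnings;

# If a LOCUS line has no identifier (the first token after "LOCUS" is
# already the sequence length), insert the supplied name.
sub add_locus_name {
    my ( $line, $name ) = @_;
    if ( $line =~ /^LOCUS\s+(\d+\s+(?:bp|aa)\b.*)$/ ) {
        return sprintf 'LOCUS       %-16s %s', $name, $1;
    }
    return $line;    # an identifier is already present
}

my $bad = 'LOCUS               6499 bp ds-DNA     linear       02-AUG-2006';
print add_locus_name( $bad, 'testfile' ), "\n";
```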

chris

On Jun 5, 2007, at 10:28 AM, Chris Fields wrote:

> Martin,
>
> The example file you give in the bioperl bugzilla report has several
> blank annotation lines which may lead to additional problems.  When
> the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM,
> DEFINITION, etc) then it expects there will also be relevant data
> (text descriptions) accompanying it; I assume the BioPython parser
> expects likewise though I may be wrong.
>
> AFAIK the inclusion of field names w/o text isn't GenBank/EMBL-
> compliant.  GenBank records lacking text either have a '.' instead or
> are left out entirely:
>
> http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html
>
> We could add a fix but you should probably contact the ApE developers
> and request that field names w/o text be left out or have '.' added.
>
> chris
>
> On Jun 5, 2007, at 9:04 AM, Martin MOKREJ? wrote:
>
>> Ezequiel Panepucci wrote:
>>>>     genbank entry = parser.parse(fhandle)
>>>
>>> there is a space character between "genbank" and "entry".
>>> It is a syntax error.
>>> I suppose you meant "genbank_entry" ?
>>
>> Yes, the next command was right and has shown the error. Sorry, I
>> forgot
>> to delete the first attempt. ;-)
>>
>>>>> genbank_entry = parser.parse(fhandle)
>> Traceback (most recent call last):
>>  File "<stdin>", line 1, in ?
>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py",
>> line 187, in parse
>>    self._scanner.feed(handle, self._consumer)
>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>> line 360, in feed
>>    self._feed_first_line(consumer, self.line)
>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>> line 835, in _feed_first_line
>>    assert False, \
>> AssertionError: Did not recognise the LOCUS line layout:
>> LOCUS               6499 bp ds-DNA     linear       02-AUG-2006
>>
>>>>>
>>
>> Martin
>> _______________________________________________
>> BioPython mailing list  -  BioPython at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>
> _______________________________________________
> BioPython mailing list  -  BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From staffa at niehs.nih.gov  Wed Jun  6 02:00:34 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Tue, 05 Jun 2007 22:00:34 -0400
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <C170E69F.246E%staffa@niehs.nih.gov>
Message-ID: <C28B8D82.51AE%staffa@niehs.nih.gov>

I am wondering whether, if I knew exactly what this error message meant,
I could discern my error.
I don't see much difference between this program and programs that worked.
Can I assume that new() worked because an index file exists?
I don't know how the filehandle UTR_TT_GENES gets involved.
Maybe I should use some other module, but I really would like to have
get_Seq_by_id functionality.

The error message:
Dpse ortholog = Dpse_GA17307
fetching GA17307
Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,
<UTR_TT_GENES> line 4.

Relevant code:
#!/usr/bin/perl
#
#
#
use strict;
use Bio::DB::Fasta;
use Bio::Tools::SeqWords;
use Bio::Seq;
use Bio::SeqIO;
#
my $db = Bio::DB::Fasta->new(
    '/home/staffa/clients/Kari/D_pse_genome/testit/TT_orthologs_Dpse_genes.fa',
    -makeid => \&make_my_id);
...
...
...
my $pse_obj = $db->get_Seq_by_id('GA17307');
my $pse_sequence = $pse_obj->seq;


Nick Staffa 
Telephone: 919-316-4569  (NIEHS: 6-4569)
Scientific Computing Support Group
NIEHS Information Technology Support Services Contract
(Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov)
National Institute of Environmental Health Sciences
National Institutes of Health
Research Triangle Park, North Carolina


From jason at bioperl.org  Wed Jun  6 03:12:40 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 5 Jun 2007 20:12:40 -0700
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <C28B8D82.51AE%staffa@niehs.nih.gov>
References: <C28B8D82.51AE%staffa@niehs.nih.gov>
Message-ID: <EC9E4A2E-2C06-4ADE-8317-9E25DDF1C9C4@bioperl.org>

the file handle is probably not important, Perl just reports this if  
there is a filehandle open.

more importantly what is on line 84....

my guess is you are trying to get a sequence out and it doesn't exist  
- some error code around the lines getting the sequence out would be  
helpful.


On Jun 5, 2007, at 7:00 PM, Staffa, Nick (NIH/NIEHS) wrote:

> I am wondering if I knew what this error message exactly meant, if  
> I could
> discern my error.
> I don't see much difference in this program and programs that worked.
> Can I assume that the new worked because an index file exists?
> I don't know how the filehandle UTR_TT_GENES gets involved.
> Maybe I should use some other module, but I really would like to have
> get_Seq_by_id functionality.
>
> The error message:
> Dpse ortholog = Dpse_GA17307
> fetching GA17307
> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl  
> line 84,
> <UTR_TT_GENES> line 4.
>
> Relevant code:
> #!/usr/bin/perl
> #
> #
> #
> use strict;
> use Bio::DB::Fasta;
> use Bio::Tools::SeqWords;
> use Bio::Seq;
> use Bio::SeqIO;
> #
> my $db =
> Bio::DB::Fasta->new('/home/staffa/clients/Kari/D_pse_genome/testit/ 
> TT_orthol
> ogs_Dpse_genes.fa',
>                                 -makeid => \&make_my_id);
> ...
> ...
> ...
> my $pse_obj = $db->get_Seq_by_id('GA17307');
> my $pse_sequence = $pse_obj->seq;
>
>
>
>
> Nick Staffa
> Telephone: 919-316-4569  (NIEHS: 6-4569)
> Scientific Computing Support Group
> NIEHS Information Technology Support Services Contract
> (Science Task Monitor: John D. Grovenstein (grovens1 at niehs.nih.gov)
> National Institute of Environmental Health Sciences
> National Institutes of Health
> Research Triangle Park, North Carolina
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/



From torsten.seemann at infotech.monash.edu.au  Wed Jun  6 06:06:37 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 6 Jun 2007 16:06:37 +1000
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <C28B8D82.51AE%staffa@niehs.nih.gov>
References: <C170E69F.246E%staffa@niehs.nih.gov>
	<C28B8D82.51AE%staffa@niehs.nih.gov>
Message-ID: <a79f6a4b0706052306r16f7ce61y28448c18349ac3f4@mail.gmail.com>

Nick,

> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,

The error makes it pretty clear. You are calling the ->seq method on
an undefined value, ie. $pse_obj.

> my $pse_obj = $db->get_Seq_by_id('GA17307');

# check we got something!
die "sequence not in database" unless $pse_obj;

> my $pse_sequence = $pse_obj->seq;


-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010


From shameer at ncbs.res.in  Wed Jun  6 06:27:42 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Wed, 6 Jun 2007 11:57:42 +0530 (IST)
Subject: [Bioperl-l] Validation of files using BioPerl
Message-ID: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>

Dear All,

How to validate an input file in fasta/PIR/GenPept/PDB format using
Bioperl ? (This is to avoid unnecessary files to be submitted to servers
by new users).   Any module available ?

Many thanks in advance,
-- 
Shameer Khadar


From cjfields at uiuc.edu  Wed Jun  6 12:37:28 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 6 Jun 2007 07:37:28 -0500
Subject: [Bioperl-l] Validation of files using BioPerl
In-Reply-To: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>
References: <34441.192.168.1.1.1181111262.squirrel@mail.ncbs.res.in>
Message-ID: <39F5F622-0C93-4DC5-B969-491F789FC932@uiuc.edu>

It has been discussed but never coded.  I believe if it passes  
through the Bio::SeqIO parser it's generally considered validly  
formatted (spacing, balanced quotes), though it doesn't specifically  
check FT keys and qualifiers for invalid ones, look for missing  
annotation, check taxonomy, etc.

As long as the end sequence mark (//) is present for every file, you  
could try parsing the file into chunks (read with 'local $/ = '//';')  
and tossing the seq chunks as a filehandle (via IO::String) to a  
Bio::SeqIO object wrapped in an eval block (the parser resets $/, so  
it should work).  Follow the eval with a check of $@ for caught  
errors.  It might get tedious for big sequences...
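
Roughly, the idea could be sketched as below.  Treat it as an untested
outline rather than a validator: the sub names are invented for this
example, it uses "//\n" rather than a bare '//' as the separator (so
that URLs inside a record don't split it), and it swaps IO::String for
Perl's built-in in-memory filehandles.  It also only catches problems
the parser actually throws on; anything that merely warns will pass.

```perl
use strict;
use warnings;

# Split a flat file into records, each ending with a "//" terminator line.
sub read_records {
    my ($fh) = @_;
    local $/ = "//\n";                 # read up to each end-of-record mark
    return grep { /\S/ } <$fh>;        # skip trailing blank chunks
}

# Returns '' if the record parses, otherwise the error text.
# Bio::SeqIO is loaded lazily, so read_records() works without BioPerl.
sub parse_error {
    my ($record, $format) = @_;
    require Bio::SeqIO;
    open my $fh, '<', \$record or die "in-memory open failed: $!";
    my $ok = eval {
        my $in = Bio::SeqIO->new(-fh => $fh, -format => $format);
        1 while $in->next_seq;         # force the parser through the record
        1;
    };
    return $ok ? '' : ($@ || 'unknown parse error');
}

# Usage: perl check_records.pl FILE [FORMAT]   (format defaults to genbank)
if (@ARGV) {
    my ($file, $format) = (@ARGV, 'genbank');
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $n = 0;
    for my $record (read_records($fh)) {
        $n++;
        my $err = parse_error($record, $format);
        print "record $n: ", ($err ? "FAILED: $err" : "ok"), "\n";
    }
}
```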

chris

On Jun 6, 2007, at 1:27 AM, Shameer Khadar wrote:

> Dear All,
>
> How to validate an input file in fasta/PIR/GenPept/PDB format using
> Bioperl ? (This is to avoid unnecessary files to be submitted to  
> servers
> by new users).   Any module available ?
>
> Many thanks in advance,
> -- 
> Shameer Khadar
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From staffa at niehs.nih.gov  Wed Jun  6 14:40:49 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Wed, 06 Jun 2007 10:40:49 -0400
Subject: [Bioperl-l] Bio::DB::Fasta
In-Reply-To: <a79f6a4b0706052306r16f7ce61y28448c18349ac3f4@mail.gmail.com>
Message-ID: <C28C3FB1.4B73%staffa@niehs.nih.gov>

Indeed.
One must know what is actually in his header,
AND 
one must write the appropriate make_id subroutine
AND
one must specify the exact ID.
THEN things might work.
And they did!
THANK YOU
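
For the archive, a minimal sketch of what such a make_id routine could
look like.  The header layout (">Dpse_GA17307 ...") is an assumption
based on the ortholog name printed in the output above, not the actual
FASTA file.  Bio::DB::Fasta hands the whole header line, including the
leading ">", to the -makeid callback and indexes the sequence under
whatever the callback returns.

```perl
use strict;
use warnings;

# Callback for Bio::DB::Fasta's -makeid option.
# Assumed header layout: ">Dpse_GA17307 some description"
sub make_my_id {
    my ($header) = @_;
    # Drop ">" and the assumed "Dpse_" prefix, keep the bare gene ID.
    return $1 if $header =~ /^>(?:Dpse_)?(\S+)/;
    return;    # unrecognised header: no ID
}

print make_my_id('>Dpse_GA17307 hypothetical description'), "\n";  # GA17307

# With BioPerl installed, the callback would be used like this
# (the path is a placeholder):
#   my $db  = Bio::DB::Fasta->new('/path/to/genes.fa', -makeid => \&make_my_id);
#   my $obj = $db->get_Seq_by_id('GA17307')
#       or die "GA17307 not found in index";
```

If an index built with a different ID scheme already exists next to
the FASTA file, it probably has to be rebuilt (the -reindex option)
before a new make_id takes effect.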


On 6/6/07 2:06 AM, "Torsten Seemann"
<torsten.seemann at infotech.monash.edu.au> wrote:

> Nick,
> 
>> Can't call method "seq" on an undefined value at Match-emNEWTEST.pl line 84,
> 
> The error makes it pretty clear. You are calling the ->seq method on
> an undefined value, ie. $pse_obj.
> 
>> my $pse_obj = $db->get_Seq_by_id('GA17307');
> 
> # check we got something!
> die "sequence not in database" unless $pse_obj;
> 
>> my $pse_sequence = $pse_obj->seq;
> 


From jaudall at gmail.com  Wed Jun  6 21:51:33 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Wed, 6 Jun 2007 15:51:33 -0600
Subject: [Bioperl-l] blastxml interation
Message-ID: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>

I was searching in the deobfuscator under
*Bio::Search::Result::BlastResult* but there doesn't seem to be a
method to extract the iteration number from a
blastxml report.  I can see this number being possibly useful to count the
number of queries that didn't hit anything since there are no empty reports in
the blastxml output.  If I'm missing something, I would welcome an example of
how to retrieve the result iteration number.  Thanks in advance for any
suggestions.

Josh


From dmessina at wustl.edu  Wed Jun  6 22:18:26 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 6 Jun 2007 17:18:26 -0500
Subject: [Bioperl-l] blastxml interation
In-Reply-To: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
Message-ID: <CBBAAD1F-563D-4B43-B086-F707989939EA@wustl.edu>

I think you want to look at the hits(), num_hits() and no_hits_found()
methods. There is a private method _next_iteration_index() which  
should do what you asked for, but num_hits() looks like the better way.

By the way, hits() and num_hits() are listed on the Deobfuscator as  
having no documentation. This (as the below shows) is incorrect and  
is due to some nonstandard formatting issues which I will correct.  
_next_iteration_index() isn't listed on the Deobfuscator because it's  
a private method.
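
As a rough sketch of the num_hits() route: the sub below (its name is
made up for this example) walks any object that has next_result() and
counts the results with zero hits.  Whether a hitless query shows up
as a result at all depends on what BLAST wrote to the XML, so if such
queries are simply absent you would instead have to compare the total
against the number of query sequences you submitted.

```perl
use strict;
use warnings;

# Count results and hitless results. $searchio is anything that provides
# next_result(), e.g. a Bio::SearchIO object opened on a blastxml report.
sub count_hitless {
    my ($searchio) = @_;
    my ($total, $hitless) = (0, 0);
    while (my $result = $searchio->next_result) {
        $total++;
        $hitless++ if $result->num_hits == 0;
    }
    return ($total, $hitless);
}

# Usage: perl count_hitless.pl report.xml
if (@ARGV) {
    require Bio::SearchIO;      # needs BioPerl plus its XML parser modules
    my $in = Bio::SearchIO->new(-format => 'blastxml', -file => $ARGV[0]);
    my ($total, $hitless) = count_hitless($in);
    print "$hitless of $total results had no hits\n";
}
```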


Hope this helps!
Dave


hits()

This method overrides Bio::Search::Result::GenericResult::hits to take
into account the possibility of multiple iterations, as occurs in
PSI-BLAST reports.
If there are multiple iterations, all 'new' hits for all iterations
are returned.
These are the hits that did not occur in a previous iteration.
See Also: Bio::Search::Result::GenericResult::hits

num_hits()

This method overrides Bio::Search::Result::GenericResult::num_hits to
take into account the possibility of multiple iterations, as occurs in
PSI-BLAST reports.
If there are multiple iterations, calling num_hits() returns the
number of 'new' hits for each iteration. These are the hits that did
not occur in a previous iteration.
See Also: Bio::Search::Result::GenericResult::num_hits

no_hits_found()

  Usage     : $nohits = $blast->no_hits_found( $iteration_number );
  Purpose   : Get boolean indicator indicating whether or not any hits
              were present in the report.
              This is NOT the same as determining the number of hits via
              the hits() method, which will return zero hits if there were no
              hits in the report or if all hits were filtered out during the parse.

              Thus, this method can be used to distinguish these possibilities
              for hitless reports generated when filtering.

  Returns   : Boolean
  Argument  : (optional) integer indicating the iteration number (PSI-BLAST)
              If iteration number is not specified and this is a PSI-BLAST result,
              then this method will return true only if all iterations had
              no hits found.


From apurva at cshl.edu  Wed Jun  6 23:51:45 2007
From: apurva at cshl.edu (Apurva Narechania)
Date: Wed, 6 Jun 2007 19:51:45 -0400
Subject: [Bioperl-l] non-palindromic issue in Bio::Restriction::Analysis
Message-ID: <3F7C7E33-416A-4141-969A-DDC4716E8A44@cshl.edu>

Hi,

I was hoping you could confirm and give me some feedback on an issue  
I think I've found with the Bio::Restriction::Analysis module. I am  
using the enzyme AciI, a non-palindromic restriction enzyme with a 5'  
C | CGC 3' recognition site. The module should search both the  
forward and the reverse complement strings in the case of a
non-palindromic enzyme. I have found that this works only
intermittently. For example, the following sequence:

GAAAAAAACAAAGGAAGAAGCTAGCTAGCAGGGCACGCGGTTTGAGGATGGCTGGTGGCCGACCGCAGGGCG 
CGCGGTTG
GAGGATTGCTGGTGGCCGACCAGATGAAACTCACGCGCGGCTGGGGACAGCTGGAATATTTGGGCGGCGGCG 
GCTGGTAT
TACGGGAAAGGAGAGATAGGGTTTTGGACGGCAGCAGCTGGTATTTGGGCCACCAATTTTGCGCGCCAGTAC 
AGGACACC
GATGCCGCAAATTGCACAATGCCTTTTATGGCGACTGACAGTGCGATGCTATAGGTATGAATTGTCGACTGA 
CAAAGTGA
CACTATTCACATATAAATATAACGAATAACACTCAGTTGGAATATAGACATATGCCGACTCACCATCTGTGG 
CAATGTAT
ACCGACTAACAATTCGATGCTAATTCTCTATTTATAGCGACAGTCGTCAGACACTAATTTGGTGTTGTGGTA 
TAATGCTA
GTGCCTCACCGCTGTAGGTGTTGGTCTACTGGTGC

Should digest into 10 fragments using this enzyme, but the module  
produces only 7. Could you please confirm this behavior, and if  
observed, suggest some possible fixes? This may be a bug in the  
_non_pal_enz method, or may be me overlooking something pretty obvious.
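
For an independent cross-check, here is a small plain-Perl sketch of
the kind of count I would expect.  It does not use or reproduce
Bio::Restriction::Analysis; the sub names are invented, and it assumes
a linear sequence in which every CCGC on the top strand and every GCGG
(a CCGC site on the bottom strand) is cut once.

```perl
use strict;
use warnings;

# Nominal cut positions for AciI (C^CGC) on a linear sequence, in top-strand
# coordinates: each value is the 0-based index of the first base after a cut.
sub acii_cut_positions {
    my ($seq) = @_;
    $seq = uc $seq;
    $seq =~ s/\s+//g;                   # tolerate wrapped input
    my %cuts;
    # Site on the top strand: CCGC, cut after the first C.
    # The lookahead lets overlapping sites (CCGCCGC) both be found.
    while ($seq =~ /(?=CCGC)/g) { $cuts{ $-[0] + 1 } = 1 }
    # Site on the bottom strand reads GCGG on top; cut before the last G.
    while ($seq =~ /(?=GCGG)/g) { $cuts{ $-[0] + 3 } = 1 }
    return sort { $a <=> $b } keys %cuts;
}

# A linear molecule with n cuts falls into n + 1 fragments.
sub fragment_count {
    my @cuts = acii_cut_positions(@_);
    return scalar(@cuts) + 1;
}

print fragment_count('AACCGCTTGCGGAA'), "\n";   # 3: one site on each strand
```

Running fragment_count() on the sequence above and comparing with the
module's answer should show whether sites on one strand are being
dropped.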

Thanks,
Apurva Narechania.


From cjfields at uiuc.edu  Thu Jun  7 00:51:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 6 Jun 2007 19:51:00 -0500
Subject: [Bioperl-l] blastxml interation
In-Reply-To: <CBBAAD1F-563D-4B43-B086-F707989939EA@wustl.edu>
References: <52cea20c0706061451i39e44aeev8dc58d1e635665e7@mail.gmail.com>
	<CBBAAD1F-563D-4B43-B086-F707989939EA@wustl.edu>
Message-ID: <B494A9F2-80CE-4761-B67F-127B37358819@uiuc.edu>

Joshua,

Just to make sure there is no confusion, do you mean a  
Bio::Search::Iteration::IterationI-based object?  The iteration tags  
have multiple meanings apparently in BLAST XML output (multiple  
queries, multiple PSI-BLAST iterations).  The current  
SearchIO::blastxml parser returns multiple  
Bio::Search::Result::BlastResult objects based on the iterations, so  
PSI-BLAST output is treated as multiple BLAST reports regardless  
(i.e. no Iteration objects).  This is something I want to rectify but  
it may not be an easy fix.

chris

On Jun 6, 2007, at 5:18 PM, David Messina wrote:

> I think you want to look at the hits(), num_hits() and no_hits_found
> () methods. There is a private method _next_iteration_index() which
> should do what you asked for, but num_hits() looks like the better  
> way.
>
> By the way, hits() and num_hits() are listed on the Deobfuscator as
> having no documentation. This (as the below shows) is incorrect and
> is due to some nonstandard formatting issues which I will correct.
> _next_iteration_index() isn't listed on the Deobfuscator because it's
> a private method.
>
>
> Hope this helps!
> Dave
>
>
> hits()
>
> This method overrides Bio::Search::Result::GenericResult::hits to take
> into account the possibility of multiple iterations, as occurs in PSI-
> BLAST reports.
> If there are multiple iterations, all 'new' hits for all iterations
> are returned.
> These are the hits that did not occur in a previous iteration.
> See Also: Bio::Search::Result::GenericResult::hits
>
> num_hits()
>
> This method overrides Bio::Search::Result::GenericResult::num_hits to
> take
> into account the possibility of multiple iterations, as occurs in PSI-
> BLAST reports.
> If there are multiple iterations, calling num_hits() returns the
> number of
> 'new' hits for each iteration. These are the hits that did not occur
> in a previous iteration.
> See Also: Bio::Search::Result::GenericResult::num_hits
>
> no_hits_found()
>
>   Usage     : $nohits = $blast->no_hits_found( $iteration_number );
>   Purpose   : Get boolean indicator indicating whether or not any hits
>               were present in the report.
>               This is NOT the same as determining the number of  
> hits via
>               the hits() method, which will return zero hits if there
> were no
>               hits in the report or if all hits were filtered out
> during the parse.
>
>               Thus, this method can be used to distinguish these
> possibilities
>               for hitless reports generated when filtering.
>
>   Returns   : Boolean
>   Argument  : (optional) integer indicating the iteration number (PSI-
> BLAST)
>               If iteration number is not specified and this is a PSI-
> BLAST result,
>               then this method will return true only if all
> iterations had
>               no hits found.
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Thu Jun  7 00:45:14 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 6 Jun 2007 20:45:14 -0400
Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db
Message-ID: <DC1816E2-68C0-400C-A777-F6D14DE2B870@gmx.net>

I have added support to BioSQL and bioperl-db for schemas in  
PostgreSQL. A schema in PostgreSQL is more or less a namespace for  
database objects (tables, indexes, views, etc) within a database.

(A database in PostgreSQL is similar to the concept of a user in  
Oracle or MySQL, and therefore for the latter two schemas are  
synonymous with a user. [Not sure I'm still up-to-date on this for  
MySQL, but at least that's what I recall.])

When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts,  
you specify the schema in which BioSQL resides using the --schema  
option.

If you are using bioperl-db as a library, the Bio::DB::BioDB->new()  
call also accepts a -schema named parameter, and Bio::DB::DBContextI  
objects have a $dbc->schema() property for getting/setting the  
schema, Bio::DB::SimpleDBContext->new() accepts a -schema parameter,  
and you may also add the property to the .bioperldb connection  
parameter file (-schema => 'yourschemahere').
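
As a rough illustration of the library route (a sketch, not tested
against a live database): the two subs and all connection values below
are placeholders made up for this example, and the other named
parameters are the usual Bio::DB::BioDB ones as far as I recall them.
Only -schema is the new piece.

```perl
use strict;
use warnings;

# Collect the named parameters for Bio::DB::BioDB->new(). All values are
# placeholders supplied by the caller; -schema is added only when given.
sub biosql_params {
    my (%opt) = @_;
    my @params = (
        -database => 'biosql',
        -driver   => 'Pg',
        -dbname   => $opt{dbname},
        -user     => $opt{user},
        -pass     => $opt{pass},
    );
    push @params, (-schema => $opt{schema}) if defined $opt{schema};
    return @params;
}

# Open the connection; bioperl-db is loaded only when this is called.
sub connect_biosql {
    require Bio::DB::BioDB;
    return Bio::DB::BioDB->new(biosql_params(@_));
}

# Example call (hypothetical names):
#   my $db = connect_biosql(dbname => 'biodb', user => 'me',
#                           pass   => 'secret', schema => 'biosql_test');
```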

Thanks to Brian Osborne for being the instigator (and tester, and  
for adding the code to load_ncbi_taxonomy.pl - I came too late).

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jaudall at gmail.com  Wed Jun  6 21:41:08 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Wed, 6 Jun 2007 15:41:08 -0600
Subject: [Bioperl-l] blastxml interation number
Message-ID: <52cea20c0706061441n96ce803v9422e8d14461c2bd@mail.gmail.com>

I was searching in the deobfuscator under
*Bio::Search::Result::BlastResult* but there doesn't seem to be a
method to extract the iteration number from a
blastxml report.  I can see this number being very useful to count the
number of queries that didn't hit anything since there are no empty reports in
the blastxml output.  If I'm missing something, I would welcome an example of
how to retrieve the result iteration number, otherwise I'm suggesting that
an iteration_count feature be added to the Result object.  Thanks in advance
for any suggestions.

Josh


From holland at ebi.ac.uk  Thu Jun  7 07:33:25 2007
From: holland at ebi.ac.uk (Richard Holland)
Date: Thu, 07 Jun 2007 08:33:25 +0100
Subject: [Bioperl-l] PostgreSQL schema support in BioSQL and bioperl-db
In-Reply-To: <DC1816E2-68C0-400C-A777-F6D14DE2B870@gmx.net>
References: <DC1816E2-68C0-400C-A777-F6D14DE2B870@gmx.net>
Message-ID: <4667B4C5.6070107@ebi.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sounds great.

BioJava users shouldn't need to change anything to get this to work as
PostgreSQL JDBC connection objects already require you to specify a schema.

cheers,
Richard


Hilmar Lapp wrote:
> I have added support to BioSQL and bioperl-db for schemas in PostgreSQL.
> A schema in PostgreSQL is more or less a namespace for database objects
> (tables, indexes, views, etc) within a database.
> 
> (A database in PostgreSQL is similar to the concept of a user in Oracle
> or MySQL, and therefore for the latter two schemas are synonymous with a
> user. [Not sure I'm still up-to-date on this for MySQL, but at least
> that's what I recall.])
> 
> When using the load_{seqdatabase,ontology,ncbi_taxonomy}.pl scripts, you
> specify the schema in which BioSQL resides using the --schema option.
> 
> If you are using bioperl-db as a library, the Bio::DB::BioDB->new() call
> also accepts a -schema named parameter, and Bio::DB::DBContextI objects
> have a $dbc->schema() property for getting/setting the schema,
> Bio::DB::SimpleDBContext->new() accepts a -schema parameter, and you may
> also add the property to the .bioperldb connection parameter file
> (-schema => 'yourschemahere').
> 
> Thanks for Brian Osborne for being the instigator (and tester, and for
> adding the code to load_ncbi_taxonomy.pl - I came too late).
> 
>     -hilmar
> --===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
> 
> 
> 
> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGZ7TF4C5LeMEKA/QRApwUAJ48q46iX152pB6Xcc/717Ie8foUTQCgm3ij
W/+0iO/ZsNDn1pLuf5yXbYA=
=asUn
-----END PGP SIGNATURE-----


From mmokrejs at ribosome.natur.cuni.cz  Thu Jun  7 14:26:44 2007
From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ)
Date: Thu, 07 Jun 2007 16:26:44 +0200
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
	<24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
	<D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>
Message-ID: <466815A4.9060505@ribosome.natur.cuni.cz>

Hi,

Chris Fields wrote:
> One thing I missed which explains the biopython error: the LOCUS line is 
> missing the locus identifier (see the NCBI example record link).  This 
> doesn't choke the bioperl parser but it appears to stop the biopython 
> parser in it's tracks (maybe a feature instead of a bug!).
> 
> You should try adding a unique identifier (maybe the name of the file or 
> record) to the LOCUS line to see if it works:
> 
> LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006
> 
> The bioperl parser in CVS writes out the correct alphabet when this is 
> added:
> 
> LOCUS       testfile                6499 bp    ds-DNA  linear   02-AUG-2006
> 
> I'll try adding a warning to the bioperl parser for this.

I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305 but let me
emphasize the LOCUS line now contains 

LOCUS                      pRL        5428 bp ds-DNA   linear       07-JUN-2007


which still does not comply with the line you have proposed. But it can be
parsed by bioperl-live from cvs. Is it still wrong? Testcase as pRL.gb-new
in the bugzilla record #2305.

Martin

> 
> chris
> 
> On Jun 5, 2007, at 10:28 AM, Chris Fields wrote:
> 
>> Martin,
>>
>> The example file you give in the bioperl bugzilla report has several
>> blank annotation lines which may lead to additional problems.  When
>> the BioPerl SeqIO parser finds annotation fields (SOURCE, ORGANISM,
>> DEFINITION, etc) then it expects there will also be relevant data
>> (text descriptions) accompanying it; I assume the BioPython parser
>> expects likewise though I may be wrong.
>>
>> AFAIK the inclusion of field names w/o text isn't GenBank/EMBL-
>> compliant.  GenBank records lacking text either have a '.' instead or
>> are left out entirely:
>>
>> http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html
>>
>> We could add a fix but you should probably contact the ApE developers
>> and request that field names w/o text be left out or have '.' added.
>>
>> chris
>>
>> On Jun 5, 2007, at 9:04 AM, Martin MOKREJ? wrote:
>>
>>> Ezequiel Panepucci wrote:
>>>>>     genbank entry = parser.parse(fhandle)
>>>>
>>>> there is a space character between "genbank" and "entry".
>>>> It is a syntax error.
>>>> I suppose you meant "genbank_entry" ?
>>>
>>> Yes, the next command was right and has shown the error. Sorry, I
>>> forgot
>>> to delete the first attempt. ;-)
>>>
>>>>>> genbank_entry = parser.parse(fhandle)
>>> Traceback (most recent call last):
>>>  File "<stdin>", line 1, in ?
>>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py",
>>> line 187, in parse
>>>    self._scanner.feed(handle, self._consumer)
>>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>>> line 360, in feed
>>>    self._feed_first_line(consumer, self.line)
>>>  File "/usr/lib/python2.4/site-packages/Bio/GenBank/Scanner.py",
>>> line 835, in _feed_first_line
>>>    assert False, \
>>> AssertionError: Did not recognise the LOCUS line layout:
>>> LOCUS               6499 bp ds-DNA     linear       02-AUG-2006
>>>
>>>>>>
>>>
>>> Martin
>>> _______________________________________________
>>> BioPython mailing list  -  BioPython at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biopython
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>>
>> _______________________________________________
>> BioPython mailing list  -  BioPython at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> 
> 

-- 
Dr. Martin Mokrejs
Dept. of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs


From cjfields at uiuc.edu  Thu Jun  7 15:31:45 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 7 Jun 2007 10:31:45 -0500
Subject: [Bioperl-l] [BioPython] Cannot parse GenBank file
In-Reply-To: <466815A4.9060505@ribosome.natur.cuni.cz>
References: <46655550.70400@ribosome.natur.cuni.cz>
	<bf30be80706050702v2ffa618hd3b60d05abb39461@mail.gmail.com>
	<46656D64.7010508@ribosome.natur.cuni.cz>
	<24065CBD-BBF6-4CA3-9523-AD50C524DAE5@uiuc.edu>
	<D29FB970-CFE0-44FF-A89B-7ACCDB072341@uiuc.edu>
	<466815A4.9060505@ribosome.natur.cuni.cz>
Message-ID: <2A403865-F1E8-4D19-8D19-455C22E7C6D9@uiuc.edu>

On Jun 7, 2007, at 9:26 AM, Martin MOKREJŠ wrote:

> Hi,
>
> Chris Fields wrote:
>> One thing I missed which explains the biopython error: the LOCUS  
>> line is missing the locus identifier (see the NCBI example record  
>> link).  This doesn't choke the bioperl parser but it appears to  
>> stop the biopython parser in it's tracks (maybe a feature instead  
>> of a bug!).
>> You should try adding a unique identifier (maybe the name of the  
>> file or record) to the LOCUS line to see if it works:
>> LOCUS  testfile           6499 bp ds-DNA     linear       02-AUG-2006
>> The bioperl parser in CVS writes out the correct alphabet when  
>> this is added:
>> LOCUS       testfile                6499 bp    ds-DNA  linear   02- 
>> AUG-2006
>> I'll try adding a warning to the bioperl parser for this.
>
> I have updated http://bugzilla.open-bio.org/show_bug.cgi?id=2305  
> but let me
> emphasize the LOCUS line now contains
> LOCUS                      pRL        5428 bp ds-DNA   linear        
> 07-JUN-2007
>
>
> which still does not comply with the line you have proposed. But it  
> can be
> parsed by bioperl-live from cvs. Is it still wrong? Testcase as  
> pRL.gb-new
> in the bugzilla record #2305.
>
> Martin

That should work.  There isn't a strict uniqueness test (that would  
require caching and isn't worth the trouble IMHO), though it's  
required you add something unique for the accession/locus if you plan  
on indexing them in the future.

Parsing GenBank data produced from third-party software is  
problematic at best; there seems to be no hard-and-fast rule for GenBank
output from some programs, even though the specification is plainly
stated in the NCBI release notes.  My take on that is to have a
stricter (read: follows release notes) GenBank parser which passes off
the data in the record to default handler methods.  A user could then
replace the defined handlers with their own by subclassing the
default handler class and overriding the methods or adding their own
code references directly.
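
To make that a bit more concrete, here is a toy sketch of the dispatch
idea.  None of this exists in BioPerl: the class names, the handle()
method and the handle_FIELD naming scheme are all invented for
illustration, and the "parser" is reduced to calling handle() with a
field name and its text.

```perl
use strict;
use warnings;

package My::DefaultHandlers;    # hypothetical class name

sub new {
    my ($class, %override) = @_;
    return bless { override => {%override} }, $class;
}

# Called by the (imagined) strict parser for each field it recognises.
sub handle {
    my ($self, $field, $text) = @_;
    my $code = $self->{override}{$field}        # per-object code reference
            || $self->can("handle_$field")      # method, possibly overridden
            || $self->can('handle_default');
    return $self->$code($text);
}

sub handle_LOCUS   { my ($self, $text) = @_; return (split ' ', $text)[0] }
sub handle_default { my ($self, $text) = @_; return $text }

package My::LenientHandlers;    # subclass that tolerates a missing locus name
our @ISA = ('My::DefaultHandlers');

sub handle_LOCUS {
    my ($self, $text) = @_;
    return $text =~ /^(\S+)\s+\d+\s+bp\b/ ? $1 : 'unnamed';
}

package main;

my $strict  = My::DefaultHandlers->new;
my $lenient = My::LenientHandlers->new;
my $custom  = My::DefaultHandlers->new(
    LOCUS => sub { my ($self, $text) = @_; return lc $text },
);

print $strict->handle(LOCUS => 'testfile 6499 bp'), "\n";   # testfile
print $lenient->handle(LOCUS => '6499 bp'), "\n";           # unnamed
print $custom->handle(LOCUS => 'PRL 5428 BP'), "\n";        # prl 5428 bp
```

The subclass covers the "override the methods" route and the
constructor arguments cover the "add your own code references" route;
which of the two wins when both are present is a design choice, and
here the code reference does.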

chris

...


From rich at thevillas.eclipse.co.uk  Fri Jun  8 11:00:45 2007
From: rich at thevillas.eclipse.co.uk (richard)
Date: Fri, 08 Jun 2007 12:00:45 +0100
Subject: [Bioperl-l] protparam
Message-ID: <466936DD.8080604@thevillas.eclipse.co.uk>


Hi,

I noticed that in April someone asked whether there was a bioperl mod 
for obtaining protein sequence related properties using protparam.
I have a module that could potentially be submitted to bioperl for this 
purpose. Does anybody have any thoughts on whether it should go in?

Example script and the module are at:

http://81.5.159.173/webshare/ 


Cheers
Rich


From cjfields at uiuc.edu  Fri Jun  8 12:37:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 8 Jun 2007 07:37:27 -0500
Subject: [Bioperl-l] protparam
In-Reply-To: <466936DD.8080604@thevillas.eclipse.co.uk>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
Message-ID: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>

Richard,

We'll gladly add this in, though it'll need to be bioperlized  
(inherit Bio::Root::Root).  We also generally ask for tests but it  
should be easy to write up a quick test suite using any protein seq.

If you can, could you add some bioperl-like POD to the module (i.e.  
SYNOPSIS, AUTHOR, DESCRIPTION, etc)?

thanks!

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From mmokrejs at ribosome.natur.cuni.cz  Fri Jun  8 11:09:42 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Fri, 08 Jun 2007 13:09:42 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted file?
Message-ID: <466938F6.7050903@ribosome.natur.cuni.cz>

Hi,
  how can I convert a GenBank/EMBL-formatted file to a GFF file? The manpage for
Bio::Graphics::FeatureFile does not help me with this. The information is in
the file, so I just want to extract the features to GFF format; the sequence
probably has to be stored somewhere too ...
 Is there a tool that can convert it automatically? ;) This would be great. I
can't make the GFF manually for every file. Other programs draw plasmid maps
automatically from GenBank-formatted input, so how can I do it in bioperl?
Thanks for help,
Martin


From shameer at ncbs.res.in  Fri Jun  8 14:11:00 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Fri, 8 Jun 2007 19:41:00 +0530 (IST)
Subject: [Bioperl-l] protparam
In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
	<4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
Message-ID: <54411.192.168.1.1.1181311860.squirrel@mail.ncbs.res.in>

Richard,

I was the one who asked for a protparam module in bioperl!
That's a good job.

Cheers,
SK


-- 
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From dmessina at wustl.edu  Fri Jun  8 14:58:20 2007
From: dmessina at wustl.edu (David Messina)
Date: Fri, 8 Jun 2007 09:58:20 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <466938F6.7050903@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
Message-ID: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>

Hi Martin,

You're in luck -- the BioPerl core distribution includes two scripts  
for doing just that:

	genbank2gff
	genbank2gff3

Look in the scripts directory of the distro.

Also, there is a *huge* amount of documentation and examples on the  
BioPerl website.

	http://www.bioperl.org/wiki/HOWTOs

Reading those, reading the FAQ, and searching the mailing list  
archives are where I look first when I don't know how to do something  
in BioPerl.


Dave

--
Dave Messina
Senior Analyst, Assembly Group
Genome Sequencing Center
Washington University
St. Louis, MO


From rich at thevillas.eclipse.co.uk  Fri Jun  8 15:51:21 2007
From: rich at thevillas.eclipse.co.uk (richard)
Date: Fri, 08 Jun 2007 16:51:21 +0100
Subject: [Bioperl-l] protparam
In-Reply-To: <4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
	<4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
Message-ID: <46697AF9.2090502@thevillas.eclipse.co.uk>


Hi,

ok, great, that's no problem. I'll add the POD and bioperlize it,

thanks
Rich



From cjfields at uiuc.edu  Fri Jun  8 17:45:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 8 Jun 2007 12:45:17 -0500
Subject: [Bioperl-l] protparam
In-Reply-To: <46697AF9.2090502@thevillas.eclipse.co.uk>
References: <466936DD.8080604@thevillas.eclipse.co.uk>
	<4F4085B4-E500-4FF1-88A2-9AA27F28F661@uiuc.edu>
	<46697AF9.2090502@thevillas.eclipse.co.uk>
Message-ID: <AA43E9C9-7064-438A-89A9-12E4B21E4F04@uiuc.edu>

Another issue is namespace.  I suggest Bio::Tools::ProtParam, though  
there may be some others out there.

We can add support for direct Bio::Seq/PrimarySeq input and other  
odds and ends once it's committed.  Good work!

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Mon Jun 11 11:30:24 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 11 Jun 2007 07:30:24 -0400
Subject: [Bioperl-l] script to load ITIS taxonomy
Message-ID: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>

Hi all -

I added a script to load the ITIS taxonomy (www.itis.gov) into the  
phylodb module. It is called load_itis_taxonomy.pl and is in the  
scripts/ directory.

It is independent of BioPerl right now (the ITIS download is either a  
MS SQL Server or an Informix dump - no kidding), but I'm hoping that  
at some point support for this can be integrated into Bio::TreeIO.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 11 12:24:50 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 11 Jun 2007 07:24:50 -0500
Subject: [Bioperl-l] script to load ITIS taxonomy
In-Reply-To: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>
References: <897DB32F-4AEE-4388-A499-C71BFD2281DE@gmx.net>
Message-ID: <99AC6C0F-10DD-4587-AFB3-32BC495CD2BD@uiuc.edu>


On Jun 11, 2007, at 6:30 AM, Hilmar Lapp wrote:

> Hi all -
>
> I added a script to load the ITIS taxonomy (www.itis.gov) into the
> phylodb module. It is called load_itis_taxonomy.pl and is in the
> scripts/ directory.
>
> It is independent of BioPerl right now (the ITIS download is either a
> MS SQL Server or an Informix dump - no kidding), but I'm hoping that
> at some point support for this can be integrated into Bio::TreeIO.
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

I second the TreeIO support.  Anyone up for it?

chris


From ryanx07 at hotmail.com  Mon Jun 11 15:24:31 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Mon, 11 Jun 2007 10:24:31 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F372C263F25DA66E3F2B986B41A0@phx.gbl>

I just started to learn BioPerl by reading the BioPerl Tutorial on the 
BioPerl website. When I tried the first example on my Windows machine:
use Bio::Perl;
$seq_object = get_sequence('swiss',"ID ROA1_HUMAN");
write_sequence(">roa1.fasta",'fasta',$seq_object);

I got the following error:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:153
STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
STACK: t8.pl:7

I cannot figure out what is wrong and cannot find a solution on the web. 
Could someone help me please?

Also, this leads to my 2nd question: is there a way to search the archive 
of the current list?

Thanks so much


R



From dmessina at wustl.edu  Mon Jun 11 16:34:29 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 11:34:29 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F372C263F25DA66E3F2B986B41A0@phx.gbl>
References: <BAY106-F372C263F25DA66E3F2B986B41A0@phx.gbl>
Message-ID: <25517EA3-7BDA-44AC-BDF3-93A6810D9D63@wustl.edu>

The example code works here, but I'm on OS X. Could you tell us which  
version of Perl and BioPerl you are using, and which operating system?

Are you getting anything in the roa1.fasta file?


> is there a way to search the archive of the current list?

http://www.bioperl.org/wiki/Mailing_lists


Dave


From dmessina at wustl.edu  Mon Jun 11 18:48:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 11 Jun 2007 13:48:23 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F39783926A21896CCB15CD9B41A0@phx.gbl>
References: <BAY106-F39783926A21896CCB15CD9B41A0@phx.gbl>
Message-ID: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu>

Hi,

Please use 'Reply All' so everyone on the list can follow the  
discussion.

Try adding the following line after the line that starts with  
$seq_object:

	print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";

And then run the program again. What do you get? Could you post a  
complete printout of what you're doing?


Dave


On Jun 11, 2007, at 11:45 AM, L Xu wrote:
> I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and  
> activeperl 5.8.8.819 Thank you very much.


From johnsonm at gmail.com  Tue Jun 12 00:45:13 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Mon, 11 Jun 2007 19:45:13 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
Message-ID: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>

    This bit in Bio::SeqFeature::Gene::Exon is causing me some
problems trying to extend Bio::Tools::Glimmer to handle 'wraparound'
genes (circular genomes):

sub location {
   my ($self,$value) = @_;

   if(defined($value) && $value->isa('Bio::Location::SplitLocationI')) {
       $self->throw("split or compound location is not allowed ".
                    "for an object of type " . ref($self));
   }
   return $self->SUPER::location($value);
}

    That seems to be there all the way back to the initial revision
(checked in by Hilmar).  I presume it's there because of code like
this ( from the seq() method in Bio::SeqFeature::Generic):

# assumming our seq object is sensible, it should not have to yank
# the entire sequence out here.

my $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());

    That's not going to work too well with a feature that has a
Bio::Location::Split location.  Fixing it up seems straightforward, if
a bit hackish.  Something like:

my $seq;

if (ref($self->location()) eq 'Bio::Location::Split') {
    my $seqstring = '';
    my @sublocs = $self->location()->sub_Location();

    # Concatenate the sequence under each sub-location, in order.
    foreach my $subloc (@sublocs) {
        $seqstring .= $self->{'_gsf_seq'}
            ->trunc($subloc->start(), $subloc->end())->seq();
    }

    # Assign to the outer $seq; a second 'my' here would shadow it.
    $seq = Bio::Seq->new(
        -id  => $self->{'_gsf_seq'}->display_id(),
        -seq => $seqstring,
    );
}
else {
    $seq = $self->{'_gsf_seq'}->trunc($self->start(), $self->end());
}

    I don't see any companion to trunc() in Bio::PrimarySeqI for
joining sequences.  A join() would be handy, and make the above
cleaner.
    Comments, suggestions, rotten fruit?
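
    To make the wraparound case concrete, here is a small sketch of the
concatenation step on a plain string, without any BioPerl objects.  The
join_sublocations helper and the toy 12 bp genome are made up for
illustration, coordinates are 1-based and inclusive, and only the
forward strand is handled.

```perl
use strict;
use warnings;

# Hypothetical helper: return the subsequences for a list of
# [start, end] pairs, concatenated in the order given.
sub join_sublocations {
    my ($sequence, @sublocs) = @_;
    my $joined = '';
    for my $loc (@sublocs) {
        my ($start, $end) = @$loc;
        die "bad sub-location $start..$end\n"
            if $start < 1 || $end > length($sequence) || $start > $end;
        # substr is 0-based, so shift the start by one.
        $joined .= substr($sequence, $start - 1, $end - $start + 1);
    }
    return $joined;
}

# A toy 12 bp circular genome with a gene running from position 10,
# across the origin, to position 3: join(10..12,1..3).
my $genome = 'AAACCCGGGTTT';
my $gene   = join_sublocations($genome, [10, 12], [1, 3]);
print "$gene\n";
```

    The order of the pairs matters: for a gene that crosses the origin
the tail of the genome has to come first, which is exactly what a
single start()/end() pair cannot express.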


From torsten.seemann at infotech.monash.edu.au  Tue Jun 12 06:18:27 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 12 Jun 2007 16:18:27 +1000
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
Message-ID: <a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>

Mark,

> if (ref($self->location()) eq 'Bio::Location::Split') {
>     my $seqstring;
>     my @sublocs = $self->location()->sub_Location();
>
>     foreach my $subloc (@sublocs) {
>         $seqstring .= $self->{'_gsf_seq'}->trunc($subloc->start(),
> $subloc->end())->seq();
>     }

Can you use the ->spliced_seq() method to do this?

http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010


From pengchy at yahoo.com.cn  Tue Jun 12 07:00:46 2007
From: pengchy at yahoo.com.cn (=?gb2312?q?=D1=EE=20=C5=F4=B3=CC?=)
Date: Tue, 12 Jun 2007 15:00:46 +0800 (CST)
Subject: [Bioperl-l] Can't locate loadable object for module
	TFBS::Ext::pwmsearch
Message-ID: <66745.92089.qm@web15205.mail.cnb.yahoo.com>

hi all,
   
  Today I downloaded the TFBS package from http://forkhead.cgb.ki.se/TFBS/, uncompressed it, and copied all the files contained in the TFBS and Ext directories to the directory "C:\perl\site\lib", then put Ext under the TFBS directory. I ran the example script1.pl, but got an error message: 
   
  Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC (@INC contains: C:/perl/site/lib C:/perl/lib .) at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 141, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52, <DATA> line 206.
Compilation failed in require at script1.pl line 3, <DATA> line 206.
BEGIN failed--compilation aborted at script1.pl line 3, <DATA> line 206.
shell returned 2
   
  When I run the list_matrices.pl script, I get the same message. But when I empty the pwmsearch.pm file, I get the following message instead:
   
  TFBS/Ext/pwmsearch.pm did not return a true value at :/perl/site/lib/TFBS/Matr
x/PWM.pm line 141, <DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PWM.pm line 11, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 137,
DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/Matrix/PFM.pm line 17, <DATA> line 206.
Compilation failed in require at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line 52,
DATA> line 206.
BEGIN failed--compilation aborted at C:/perl/site/lib/TFBS/DB/TRANSFAC.pm line2, <DATA> line 206.
Compilation failed in require at script1.pl line 3, <DATA> line 206.
BEGIN failed--compilation aborted at script1.pl line 3, <DATA> line 206.
   
  Has anyone else met the same problem? Is it a bug in the TFBS package?


Best wishes!

Sincerely, Pengcheng
       


From bix at sendu.me.uk  Tue Jun 12 07:32:02 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 12 Jun 2007 08:32:02 +0100
Subject: [Bioperl-l] Can't locate loadable object for
	module	TFBS::Ext::pwmsearch
In-Reply-To: <66745.92089.qm@web15205.mail.cnb.yahoo.com>
References: <66745.92089.qm@web15205.mail.cnb.yahoo.com>
Message-ID: <466E4BF2.7020504@sendu.me.uk>

Pengcheng Yang wrote:
> hi all,
> 
> Today I downloaded the TFBS package from
> http://forkhead.cgb.ki.se/TFBS/, uncompressed it, and copied all the
> files contained in the TFBS and Ext directories to the directory
> "C:\perl\site\lib", then put Ext under the TFBS directory. I ran the
> example script1.pl, but got an error message:
> 
> Can't locate loadable object for module TFBS::Ext::pwmsearch in @INC

You have to follow the installation instructions in the README file.
Copying the files out is insufficient - you have to 'make'.


From ryanx07 at hotmail.com  Tue Jun 12 11:30:09 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 06:30:09 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <127743A7-1923-4DBF-A96E-276B5E0A7692@wustl.edu>
Message-ID: <BAY106-F120C708A32F5077BA4DE68B4190@phx.gbl>

Here is the code:

use Bio::Perl;
$seq_object = get_sequence('swiss',"ROA1_HUMAN");
print STDERR ref($seq_object), "\t", $seq_object->display_id, "\n";
write_sequence(">roa1.fasta",'fasta',$seq_object);

The output looks the same as before:

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\~Scripts>perl test.pl

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:350
STACK: Bio::SeqIO::swiss::next_seq C:/Perl/site/lib/Bio\SeqIO\swiss.pm:178
STACK: Bio::DB::WebDBSeqI::get_Seq_by_id C:/Perl/site/lib/Bio/DB/WebDBSeqI.pm:153
STACK: Bio::Perl::get_sequence C:/Perl/site/lib/Bio/Perl.pm:510
STACK: test.pl:7
-----------------------------------------------------------

Thanks.




From pengchy at yahoo.com.cn  Tue Jun 12 14:33:15 2007
From: pengchy at yahoo.com.cn (Pengcheng Yang)
Date: Tue, 12 Jun 2007 22:33:15 +0800 (CST)
Subject: [Bioperl-l]
	=?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20=20basic=20questions?=
In-Reply-To: <BAY106-F120C708A32F5077BA4DE68B4190@phx.gbl>
Message-ID: <936780.8655.qm@web15215.mail.cnb.yahoo.com>


I have the same problem.

I guess the SwissProt database has some problems!

code:
use Bio::DB::SwissProt;
$sp = new Bio::DB::SwissProt;
$seq = $sp->get_Seq_by_id('KPY1_ECOLI'); 
print ref($seq),"\t",$seq->display_id,"\n"

the message:

------------- EXCEPTION  -------------
MSG: swissprot stream with no ID. Not swissprot in my book
STACK Bio::SeqIO::swiss::next_seq C:/perl/site/lib/Bio\SeqIO\swiss.pm:180
STACK Bio::DB::WebDBSeqI::get_Seq_by_id C:/perl/site/lib/Bio/DB/WebDBSeqI.pm:154

STACK toplevel t.pl:7

--------------------------------------



Best wishes!

Sincerely, Pengcheng




From drummike at gmail.com  Tue Jun 12 15:49:36 2007
From: drummike at gmail.com (Mike Williams)
Date: Tue, 12 Jun 2007 11:49:36 -0400
Subject: [Bioperl-l]
	=?GB2312?B?UmU6IFtCaW9wZXJsLWxdILvYuLSjuiBSZTogYmFzaWMgcXVlc3Rpb25z?=
In-Reply-To: <936780.8655.qm@web15215.mail.cnb.yahoo.com>
References: <BAY106-F120C708A32F5077BA4DE68B4190@phx.gbl>
	<936780.8655.qm@web15215.mail.cnb.yahoo.com>
Message-ID: <bc95ab8d0706120849qc60ee50qf743f4a7342580e1@mail.gmail.com>

On 6/12/07, Pengcheng Yang <pengchy at yahoo.com.cn> wrote:
> I have the same problem.
> I guess the SwissProt database has some problems!
> code:
> use Bio::DB::SwissProt;
> $sp = new Bio::DB::SwissProt;
> $seq = $sp->get_Seq_by_id('KPY1_ECOLI');
> print ref($seq),"\t",$seq->display_id,"\n"
> ------------- EXCEPTION  -------------
> MSG: swissprot stream with no ID. Not swissprot in my book
> STACK toplevel t.pl:7

This is a different problem.  The id was not valid.  If you change
KPY1 to KPYK1 it works fine.

$seq = $sp->get_Seq_by_id('KPYK1_ECOLI');
print ref($seq),"\t",$seq->display_id,"\n"
[mike at Wheatley]$ ./bio_quest2.pl

Bio::Seq::RichSeq       KPYK1_ECOLI

If you got this example from the BioPerl site, would you please post
the URL?  It seems to me this same problem has come up before, but I
could not find it in the archives or on the web site.

Mike


From ryanx07 at hotmail.com  Tue Jun 12 15:42:28 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 10:42:28 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F277321F382D18F01FE6C77B4190@phx.gbl>

I tested another piece of code from the tutorial (the 2nd test on the same 
machine) and got an error again. I don't know what happened; please help.
Thanks so much.

===========================================================Code:
use Bio::Restriction::EnzymeCollection;
my $all_collection = Bio::Restriction::EnzymeCollection;
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection){
   print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
   # prints name, recognition site, overhang
}
=========================================== Results:

C:\~Scripts>perl t9.pl
Can't use string ("Bio::Restriction::EnzymeCollecti") as a HASH ref while "strict refs" in use at C:/Perl/site/lib/Bio/Restriction/EnzymeCollection.pm line 236.


= = = Original message = = =

On Jun 11, 2007, at 11:45 AM, L Xu wrote:

   I used WinXP with BioPerl Inst_version 2.1.8 (Bioperl 1.5.2) and 
activeperl 5.8.8.819 Thank you very much.



From limericksean at gmail.com  Tue Jun 12 16:04:40 2007
From: limericksean at gmail.com (Sean O'Keeffe)
Date: Tue, 12 Jun 2007 18:04:40 +0200
Subject: [Bioperl-l] gff2xml
Message-ID: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>

Hi all,
I posted this on the gbrowse list earlier. I'm looking to convert gff
data files into xml. Does anyone know of a module written to do this
already?

respect,
sean.


From johnsonm at gmail.com  Tue Jun 12 16:10:45 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 11:10:45 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
Message-ID: <ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>

On 6/12/07, Torsten Seemann <torsten.seemann at infotech.monash.edu.au> wrote:
> Can you use the ->spliced_seq() method to do this?
>
> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/SeqFeatureI.html#POD11
>
> --
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> --Tel +61 3 9905 9010

    Actually, I'd forgotten about spliced_seq().  That seems like it
will Do The Right Thing.  It's just up to the invoker to call
spliced_seq() instead of seq() as appropriate.
    So, is there any other code that will break if I modify
Bio::SeqFeature::Gene::Exon::location to not throw an exception when
encountering Bio::Location::SplitLocationI?  I'm wondering if it's
just a paranoid check or if it's there to guard against something.  If
the latter, I need to know what code to fix.  I'll dig and look, but
if anybody knows or has an idea, save me some time.  I suppose I can
just change it and see what tests start failing. 8)


From dmessina at wustl.edu  Tue Jun 12 16:11:36 2007
From: dmessina at wustl.edu (David Messina)
Date: Tue, 12 Jun 2007 11:11:36 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F277321F382D18F01FE6C77B4190@phx.gbl>
References: <BAY106-F277321F382D18F01FE6C77B4190@phx.gbl>
Message-ID: <30B8F841-E694-4577-8C15-8703E846CDFE@wustl.edu>

Hmm, it almost looks like you're having an issue with line breaks.

The 'swissprot stream with no ID' error made me think that perhaps  
Perl wasn't seeing the second argument to get_sequence. And then your  
new program has the error 'Can't use string  
("Bio::Restriction::EnzymeCollecti")' where the end of the word is  
cut off.

I don't know how ActivePerl handles Windows vs UNIX line breaks.  Are  
there any example scripts that come with ActivePerl? If there are,  
and they run correctly, perhaps you could look to see how the line  
breaks are done and make sure that your program does it the same way.

Other than that, I'm not seeing an obvious answer to your problem --  
anyone else have a suggestion?

Perhaps the easiest thing for you to do would be to reinstall BioPerl  
and make sure that you run the full test suite and that all of the  
tests pass. My guess is that something in your current setup is not  
quite right.

Dave


From cjfields at uiuc.edu  Tue Jun 12 16:42:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 11:42:29 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
Message-ID: <E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>


On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:

> On 6/12/07, Torsten Seemann  
> <torsten.seemann at infotech.monash.edu.au> wrote:
>> Can you use the ->spliced_seq() method to do this?
>>
>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ 
>> SeqFeatureI.html#POD11
>>
>> --
>> --Torsten Seemann
>> --Victorian Bioinformatics Consortium, Monash University
>> --Tel +61 3 9905 9010
>
>     Actually, I'd forgotten about spliced_seq().  That seems like it
> will Do The Right Thing.  It's just up to the invoker to call
> spliced_seq() instead of seq() as appropriate.
>     So, is there any other code that will break if I modify
> Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
> just a paranoid check or if it's there to guard against something.  If
> the latter, I need to know what code to fix.  I'll dig and look, but
> if anybody knows or has an idea, save me some time.  I suppose I can
> just change it and see what tests start failing. 8)

I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to  
describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs  
state that the Exon class is used to specifically describe exons, as  
the name implies.  Exons are primarily eukaryotic in origin, so you  
shouldn't encounter wraparounds, and should not have split locations  
by definition (which likely explains the exception).

Wouldn't a SeqFeature::Generic work just as well using a split location?
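
As a rough illustration of that suggestion, here is a minimal sketch of a
wrap-around gene described as a Bio::SeqFeature::Generic with a split
location. The genome length, the coordinates, and the 'glimmer' source tag are
made-up values for illustration only; nothing here is taken from the Glimmer
or Genemark parsers.

```perl
use strict;
use warnings;
use Bio::Location::Simple;
use Bio::Location::Split;
use Bio::SeqFeature::Generic;

# Assumed example: a 4,600,000 bp circular genome with a gene that runs
# from 4,599,901 to the end and continues at 1..200.
my $split = Bio::Location::Split->new();
$split->add_sub_Location(
    Bio::Location::Simple->new(-start => 4_599_901, -end => 4_600_000, -strand => 1));
$split->add_sub_Location(
    Bio::Location::Simple->new(-start => 1, -end => 200, -strand => 1));

# A generic feature carries the split location; no Exon object is involved.
my $gene = Bio::SeqFeature::Generic->new(
    -primary => 'gene',
    -source  => 'glimmer',    # illustrative source tag
);
$gene->location($split);

# List the pieces of the location.
for my $part ($gene->location->each_Location) {
    printf "%d..%d\n", $part->start, $part->end;
}
```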

chris


From johnsonm at gmail.com  Tue Jun 12 16:59:54 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 11:59:54 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
Message-ID: <ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>

    That's a good point.  Both Bio::Tools::Glimmer and
Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
prokaryotic sequence (multiple exons for eukaryotic).  There are
eukaryotic and prokaryotic versions of both predictor families.  Maybe
the most elegant solution would be to simply modify both modules to
only emit Bio::SeqFeature::Generic features when operating on
prokaryotic mode output?  Fix the data model and the problem goes
away.  8)

On 6/12/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:
>
> > On 6/12/07, Torsten Seemann
> > <torsten.seemann at infotech.monash.edu.au> wrote:
> >> Can you use the ->spliced_seq() method to do this?
> >>
> >> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/
> >> SeqFeatureI.html#POD11
> >>
> >> --
> >> --Torsten Seemann
> >> --Victorian Bioinformatics Consortium, Monash University
> >> --Tel +61 3 9905 9010
> >
> >     Actually, I'd forgotten about spliced_seq().  That seems like it
> > will Do The Right Thing.  It's just up to the invoker to call
> > spliced_seq() instead of seq() as appropriate.
> >     So, is there any other code that will break if I modify
> > Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> > encountering Bio::Location::SplitLocationI?  I'm wondering if it's
> > just a paranoid check or if it's there to guard against something.  If
> > the latter, I need to know what code to fix.  I'll dig and look, but
> > if anybody knows or has an idea, save me some time.  I suppose I can
> > just change it and see what tests start failing. 8)
>
> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to
> describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs
> state that the Exon class is used to specifically describe exons, as
> the name implies.  Exons are primarily eukaryotic in origin, so you
> shouldn't encounter wraparounds, and should not have split locations
> by definition (which likely explains the exception).
>
> Wouldn't a SeqFeature::Generic work just as well using a split location?
>
> chris
>


From ryanx07 at hotmail.com  Tue Jun 12 17:17:18 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 12:17:18 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F19A3F4E0FD58F28A6CD765B4190@phx.gbl>

I reinstalled ActivePerl and BioPerl; ActivePerl is now 5.8.8 build 820.
However, both scripts still generate the same errors on my computer. I tested 
the code on another WinXP computer with the same versions of ActivePerl and 
BioPerl: the swissprot script worked there, but the restriction enzyme script 
generated the same error.

= = = Original message = = =

Hmm, it almost looks like you're having an issue with line breaks.

The 'swissprot stream with no ID' error made me think that perhaps Perl 
wasn't seeing the second argument to get_sequence. And then your new 
program has the error 'Can't use string 
("Bio::Restriction::EnzymeCollecti")' where the end of the word is cut off.

I don't know how ActivePerl handles Windows vs UNIX line breaks. Are there 
any example scripts that come with ActivePerl? If there are, and they run 
correctly, perhaps you could look to see how the line breaks are done and 
make sure that your program does it the same way.

Other than that, I'm not seeing an obvious answer to your problem -- anyone 
else have a suggestion?

Perhaps the easiest thing for you to do would be to reinstall BioPerl and 
make sure that you run the full test suite and that all of the tests pass. 
My guess is that something in your current setup is not quite right.

Dave


From cjfields at uiuc.edu  Tue Jun 12 17:51:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 12:51:47 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F19A3F4E0FD58F28A6CD765B4190@phx.gbl>
References: <BAY106-F19A3F4E0FD58F28A6CD765B4190@phx.gbl>
Message-ID: <D01CF97A-FE62-4E40-A3DD-FAFD97D8BA45@uiuc.edu>

This is an instance where 'use strict' would have shown the problem  
right away.  You left off your constructor call:

my $all_collection = Bio::Restriction::EnzymeCollection;

should be

my $all_collection = Bio::Restriction::EnzymeCollection->new;
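
To make the point about 'use strict' concrete, here is a small self-contained
sketch. My::Collection is a hypothetical stand-in class (it is not part of
BioPerl) so the example runs without any modules installed; the string evals
are only there so both the loose and the strict behaviour can be shown in one
script.

```perl
use strict;
use warnings;

package My::Collection;              # hypothetical stand-in for the real class
sub new { return bless {}, shift }

package main;

# Without strict, a bare class name quietly becomes a plain string.
my $loose = eval q{ no strict; no warnings; my $c = My::Collection; $c };
my $kind  = ref $loose ? 'an object' : "the string '$loose'";
print "without strict: got $kind\n";

# With strict, the same line fails to compile; eval traps the error in $@.
my $ok = eval q{ use strict; my $c = My::Collection; 1 };
print "with strict:    $@" unless $ok;

# The corrected form calls the constructor and yields an object.
my $obj = My::Collection->new;
print "with ->new:     got a ", ref $obj, " object\n";
```

Calling a method on the plain string is what later produces the confusing
"Can't use string ..." style of error, so catching the bareword at compile
time is the cheaper diagnosis.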

chris

On Jun 12, 2007, at 12:17 PM, L Xu wrote:

> I reinstalled activePerl and BioPerl, now the activePerl is 5.8.8  
> build 820.
> However, both scripts generated the same error with my computer. I  
> tested
> the code in another WinXP computer with the same versions of  
> activePerl and
> BioPerl, the one for the swissprot did work but the restriction enzyme
> generated the same error.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From ryanx07 at hotmail.com  Tue Jun 12 18:11:15 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Tue, 12 Jun 2007 13:11:15 -0500
Subject: [Bioperl-l] basic questions
Message-ID: <BAY106-F317762269B8D57367D89F3B4190@phx.gbl>

Thank you very much. That got the script a bit further, but now I get the 
following error:

C:\~Scripts>perl t9.pl
Can't locate object method "name" via package 
"Bio::Restriction::EnzymeCollection" at t9.pl line 5, <DATA> line 532.

I checked the documentation, and there is no "name" method for the package. 
Thanks.

= = = Original message = = =

This is an instance where 'use strict' would have shown the problem right 
away. You left off your constructor call:

my $all_collection = Bio::Restriction::EnzymeCollection;

should be

my $all_collection = Bio::Restriction::EnzymeCollection->new;

chris


From cjfields at uiuc.edu  Tue Jun 12 18:35:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 13:35:15 -0500
Subject: [Bioperl-l] basic questions
In-Reply-To: <BAY106-F317762269B8D57367D89F3B4190@phx.gbl>
References: <BAY106-F317762269B8D57367D89F3B4190@phx.gbl>
Message-ID: <287E93E2-1902-4796-971E-B1DCA805D032@uiuc.edu>

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme  
objects, each with its own name().  Using grouped methods like  
'$collection->cutters(6)' will retrieve a new EnzymeCollection  
containing all six-cutters from the original collection.  You should  
use one of the EnzymeCollection accessor methods to retrieve the  
enzyme that you wanted first or iterate through them all.  This works  
for me:

use Bio::Restriction::EnzymeCollection;
my $all_collection = Bio::Restriction::EnzymeCollection->new();
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection->each_enzyme){
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
}
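
If only one enzyme is wanted, the other route mentioned above (using an
accessor on the collection) might look like the sketch below. EcoRI is simply
an example name chosen for illustration; whether a given enzyme is present
depends on the collection that was loaded.

```perl
use strict;
use warnings;
use Bio::Restriction::EnzymeCollection;

my $all_collection = Bio::Restriction::EnzymeCollection->new();

# Fetch a single Bio::Restriction::Enzyme by name; EcoRI is just an example.
my $enz = $all_collection->get_enzyme('EcoRI');
if ($enz) {
    # name() lives on the Enzyme object, not on the collection.
    print $enz->name, "\t", $enz->site, "\n";
} else {
    warn "EcoRI is not in this collection\n";
}
```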

chris

On Jun 12, 2007, at 1:11 PM, L Xu wrote:

> Thank you very much, it did make the script advanced a bit but I  
> got the following error:
>
> C:\~Scripts>perl t9.pl
> Can't locate object method "name" via package  
> "Bio::Restriction::EnzymeCollectio
> n" at t9.pl line 5, <DATA> line 532.
>
> I checked the documentation , there is no "name" method for the  
> package. Thanks.


From johnsonm at gmail.com  Tue Jun 12 19:07:57 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Tue, 12 Jun 2007 14:07:57 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
	<ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
Message-ID: <ebf5eb170706121207p4ad86a6cr9af85e766168cfbe@mail.gmail.com>

I'll wait a day, and if there is no opinion to the contrary, implement
it this way.

On 6/12/07, Mark Johnson <johnsonm at gmail.com> wrote:
>     That's a good point.  Both Bio::Tools::Glimmer and
> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
> prokaryotic sequence (multiple exons for eukaryotic).  There are
> eukaryotic and prokaryotic versions of both predictor families.  Maybe
> the most elegant solution would be to simply modify both modules to
> only emit Bio::SeqFeature::Generic features when operating on
> prokaryotic mode output?  Fix the data model and the problem goes
> away.  8)


From torsten.seemann at infotech.monash.edu.au  Wed Jun 13 00:18:27 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 13 Jun 2007 10:18:27 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
Message-ID: <a79f6a4b0706121718g4b0ca6a4m97f253b2e2b84059@mail.gmail.com>

Sean

> I posted this on the gbrowse list earlier. I'm looking to convert gff
> data files into xml. Does anyone know of a module written to do this
> already?

What DTD do you want the XML to conform to?
eg. ChadoXML, TinySeq XML, TIGR XML ... ?

What program are you trying to get to load the XML?

BioPerl has some Bio::SeqIO::xxxxx modules for some XML formats that
you could use. There is a script "bp_seqconvert.pl -h" which comes
with BioPerl which may be useful.

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010


From hlapp at gmx.net  Wed Jun 13 00:55:57 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 12 Jun 2007 20:55:57 -0400
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
Message-ID: <0915FAB4-E554-4E65-BA3F-1B916F0F95FC@gmx.net>

I think it was just trying to guard against people trying to do  
stupid things.

I'm actually not sure that representing locations on a circular  
genome using split locations really is the best thing. I'm wondering  
whether one shouldn't rather introduce a CircularLocation object  
(though obviously it isn't the location that's circular...).
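
To show what "circular" means for coordinates, independent of any BioPerl
class, here is a plain-Perl sketch. circular_subseq is a hypothetical helper
written for this illustration; it is not an existing or proposed BioPerl
method, and it assumes 1-based inclusive coordinates where start > end means
the range wraps past the origin.

```perl
use strict;
use warnings;

# Hypothetical helper: extract a range from a circular sequence.
# Coordinates are 1-based and inclusive; start > end means "wrap around".
sub circular_subseq {
    my ($seq, $start, $end) = @_;
    my $len = length $seq;
    die "coordinates out of range\n"
        if $start < 1 || $end < 1 || $start > $len || $end > $len;

    # Ordinary range: one substr call is enough.
    return substr($seq, $start - 1, $end - $start + 1) if $start <= $end;

    # Wrapped range: tail of the sequence followed by its head.
    return substr($seq, $start - 1) . substr($seq, 0, $end);
}

print circular_subseq('AAACCCGGGTTT', 10, 3), "\n";   # wrapped range
print circular_subseq('AAACCCGGGTTT', 4, 6), "\n";    # ordinary range
```

A split location expresses the same wrapped range as two ordinary pieces
(10..12 and 1..3 here), which is why the order of the pieces matters.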

Just a thought. In the end, if you have a way to make this work that  
you feel comfortable with, then go for it.

	-hilmar

On Jun 12, 2007, at 12:10 PM, Mark Johnson wrote:

> On 6/12/07, Torsten Seemann  
> <torsten.seemann at infotech.monash.edu.au> wrote:
>> Can you use the ->spliced_seq() method to do this?
>>
>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/ 
>> SeqFeatureI.html#POD11
>>
>> --
>> --Torsten Seemann
>> --Victorian Bioinformatics Consortium, Monash University
>> --Tel +61 3 9905 9010
>
>     Actually, I'd forgotten about spliced_seq().  That seems like it
> will Do The Right Thing.  It's just up to the invoker to call
> spliced_seq() instead of seq() as appropriate.
>     So, is there any other code that will break if I modify
> Bio::SeqFeature::Gene::Exon::location to not throw an exception when
> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
> just a paranoid check or if it's there to guard against something.  If
> the latter, I need to know what code to fix.  I'll dig and look, but
> if anybody knows or has an idea, save me some time.  I suppose I can
> just change it and see what tests start failing. 8)

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Jun 13 00:57:06 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 12 Jun 2007 20:57:06 -0400
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
	<ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
Message-ID: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>

I like that. Don't force a model to do what you want if it doesn't  
really apply anyway.

	-hilmar

On Jun 12, 2007, at 12:59 PM, Mark Johnson wrote:

>     That's a good point.  Both Bio::Tools::Glimmer and
> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
> prokaryotic sequence (multiple exons for eukaryotic).  There are
> eukaryotic and prokaryotic versions of both predictor families.  Maybe
> the most elegant solution would be to simply modify both modules to
> only emit Bio::SeqFeature::Generic features when operating on
> prokaryotic mode output?  Fix the data model and the problem goes
> away.  8)
>
> On 6/12/07, Chris Fields <cjfields at uiuc.edu> wrote:
>>
>> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:
>>
>>> On 6/12/07, Torsten Seemann
>>> <torsten.seemann at infotech.monash.edu.au> wrote:
>>>> Can you use the ->spliced_seq() method to do this?
>>>>
>>>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/
>>>> SeqFeatureI.html#POD11
>>>>
>>>> --
>>>> --Torsten Seemann
>>>> --Victorian Bioinformatics Consortium, Monash University
>>>> --Tel +61 3 9905 9010
>>>
>>>     Actually, I'd forgotten about spliced_seq().  That seems like it
>>> will Do The Right Thing.  It's just up to the invoker to call
>>> spliced_seq() instead of seq() as appropriate.
>>>     So, is there any other code that will break if I modify
>>> Bio::SeqFeature::Gene::Exon::location to not throw an exception when
>>> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
>>> just a paranoid check or if it's there to guard against  
>>> something.  If
>>> the latter, I need to know what code to fix.  I'll dig and look, but
>>> if anybody knows or has an idea, save me some time.  I suppose I can
>>> just change it and see what tests start failing. 8)
>>
>> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to
>> describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs
>> state that the Exon class is used to specifically describe exons, as
>> the name implies.  Exons are primarily eukaryotic in origin, so you
>> shouldn't encounter wraparounds, and should not have split locations
>> by definition (which likely explains the exception).
>>
>> Wouldn't a SeqFeature::Generic work just as well using a split  
>> location?
>>
>> chris
>>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 13 01:20:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 12 Jun 2007 20:20:41 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Gene::Exon throws exception when
	encountering split location (Bio::Location::Split)
In-Reply-To: <80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>
References: <ebf5eb170706111745n78f1a6cesdc4008b1cef43eb5@mail.gmail.com>
	<a79f6a4b0706112318j6e1a1a29j6a287ff0c1a6475b@mail.gmail.com>
	<ebf5eb170706120910y47f32632j3bf68699ea1ca3be@mail.gmail.com>
	<E1CDA356-B327-4878-AEF5-65FEEDF0639C@uiuc.edu>
	<ebf5eb170706120959xd874006wdb4058b07f941fd4@mail.gmail.com>
	<80EAA2F1-B2DA-45F0-B591-8534C356E679@gmx.net>
Message-ID: <951EB9CA-2066-4CD1-BCD5-4E00232CA507@uiuc.edu>

It will be interesting to see if bioperl handles wrap-around split  
locations via spliced_seq() and other methods.  I can't see why it  
wouldn't but one never knows.  Might be something to add to location  
tests at some point...
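
A sketch of what such a check might look like, using a toy 12 bp sequence
invented for the purpose. It deliberately makes no claim about the result:
whether spliced_seq() keeps the sub-locations in the order given or re-orders
them by start coordinate is exactly what this would reveal.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::Location::Simple;
use Bio::Location::Split;
use Bio::SeqFeature::Generic;

# Toy sequence: positions 1-3 are AAA and positions 10-12 are TTT.
my $seq = Bio::Seq->new(-id => 'toy', -seq => 'AAACCCGGGTTT', -alphabet => 'dna');

# Wrap-around feature: 10..12 followed by 1..3.
my $split = Bio::Location::Split->new();
$split->add_sub_Location(
    Bio::Location::Simple->new(-start => 10, -end => 12, -strand => 1));
$split->add_sub_Location(
    Bio::Location::Simple->new(-start => 1, -end => 3, -strand => 1));

my $feat = Bio::SeqFeature::Generic->new(-primary => 'gene');
$feat->location($split);
$seq->add_SeqFeature($feat);    # attaches the sequence to the feature

# The biological order of the two pieces is TTT then AAA.
my $got = $feat->spliced_seq->seq;
print "spliced_seq returned $got; the wrap-around order would be TTTAAA\n";
```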

chris

On Jun 12, 2007, at 7:57 PM, Hilmar Lapp wrote:

> I like that. Don't force a model to do what you want if it doesn't
> really apply anyway.
>
> 	-hilmar
>
> On Jun 12, 2007, at 12:59 PM, Mark Johnson wrote:
>
>>     That's a good point.  Both Bio::Tools::Glimmer and
>> Bio::Tools::Genemark produce Bio::SeqFeature::Gene objects, each with
>> a single Bio::SeqFeature::Gene::Exon, when parsing predictions for
>> prokaryotic sequence (multiple exons for eukaryotic).  There are
>> eukaryotic and prokaryotic versions of both predictor families.   
>> Maybe
>> the most elegant solution would be to simply modify both modules to
>> only emit Bio::SeqFeature::Generic features when operating on
>> prokaryotic mode output?  Fix the data model and the problem goes
>> away.  8)
>>
>> On 6/12/07, Chris Fields <cjfields at uiuc.edu> wrote:
>>>
>>> On Jun 12, 2007, at 11:10 AM, Mark Johnson wrote:
>>>
>>>> On 6/12/07, Torsten Seemann
>>>> <torsten.seemann at infotech.monash.edu.au> wrote:
>>>>> Can you use the ->spliced_seq() method to do this?
>>>>>
>>>>> http://doc.bioperl.org/releases/bioperl-1.5.2/Bio/
>>>>> SeqFeatureI.html#POD11
>>>>>
>>>>> --
>>>>> --Torsten Seemann
>>>>> --Victorian Bioinformatics Consortium, Monash University
>>>>> --Tel +61 3 9905 9010
>>>>
>>>>     Actually, I'd forgotten about spliced_seq().  That seems  
>>>> like it
>>>> will Do The Right Thing.  It's just up to the invoker to call
>>>> spliced_seq() instead of seq() as appropriate.
>>>>     So, is there any other code that will break if I modify
>>>> Bio::SeqFeature::Gene::Exon::location to not throw an exception  
>>>> when
>>>> encountering Bio::Location::SplitLocationI?  I'm wondering if it's
>>>> just a paranoid check or if it's there to guard against
>>>> something.  If
>>>> the latter, I need to know what code to fix.  I'll dig and look,  
>>>> but
>>>> if anybody knows or has an idea, save me some time.  I suppose I  
>>>> can
>>>> just change it and see what tests start failing. 8)
>>>
>>> I'm wondering why you want to use Bio::SeqFeature::Gene::Exon to
>>> describe the 'wrap-around' genes.  The SeqFeature::Gene::Exon docs
>>> state that the Exon class is used to specifically describe exons, as
>>> the name implies.  Exons are primarily eukaryotic in origin, so you
>>> shouldn't encounter wraparounds, and should not have split locations
>>> by definition (which likely explains the exception).
>>>
>>> Wouldn't a SeqFeature::Generic work just as well using a split
>>> location?
>>>
>>> chris
>>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From ryanx07 at hotmail.com  Wed Jun 13 12:16:15 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 07:16:15 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>

Thanks so much, Chris, it works now.
All the code I tested was copied from the BioPerl Tutorial. Why does it have 
such problems? Is it a platform issue, or a difference between versions of 
BioPerl? So far I have tested 6 scripts; three work and three don't.

Here is the problem for the 3rd failed script:
=================================
use strict;
use Bio::Tools::Run::RemoteBlast;
my $remote_blast = Bio::Tools::Run::RemoteBlast->new (
         -prog => 'blastn', -data => 'ecoli', -expect => '1e-10' );
my $r = $remote_blast->submit_blast("d1.fa");
my $rc;
while ( my @rids = $remote_blast->each_rid ) {
    for my $rid ( @rids ) {
       $rc = $remote_blast->retrieve_blast($rid);
    }
}
print "$rc\n"; #I just want to print sth here before parsing the result
=========================================================d1.fa
>example
CCCTTCAGGTACCCCGAGGTAACACGAGACACTCGGGATCTGGGAAGGGGACTGGGGCTTCTTTAAAAGCGCTCAGTTTAAAAAGCTTCTATGCCTGAATAGGTGACCGGAGGCCGGCACC
=========================================================result
C:\>perl t13.pl

-------------------- WARNING ---------------------
MSG: <HTML>
<HEAD><TITLE>An Error Occurred</TITLE></HEAD>
<BODY>
<H1>An Error Occurred</H1>
500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
</BODY>
</HTML>

---------------------------------------------------

-------------------- WARNING ---------------------
MSG: <HTML>
<HEAD><TITLE>An Error Occurred</TITLE></HEAD>
<BODY>
<H1>An Error Occurred</H1>
500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
</BODY>
</HTML>

---------------------------------------------------
Terminating on signal SIGINT(2)

C:\>


Please help me to correct the problem, thanks.
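
Separately from the connection error, the polling loop above never removes a
request ID, so it appears it would keep looping even after a result arrived.
The sketch below follows the pattern in the Bio::Tools::Run::RemoteBlast
synopsis as I understand it: retrieve_blast() returns a non-reference while
the job is pending or has failed, and a report object once it is done. The
5-second wait is an arbitrary choice; the file name and parameters are the
ones from the script above.

```perl
use strict;
use warnings;
use Bio::Tools::Run::RemoteBlast;

my $remote_blast = Bio::Tools::Run::RemoteBlast->new(
    -prog => 'blastn', -data => 'ecoli', -expect => '1e-10');

$remote_blast->submit_blast('d1.fa');

# Poll until every request ID has been handled and removed.
while (my @rids = $remote_blast->each_rid) {
    for my $rid (@rids) {
        my $rc = $remote_blast->retrieve_blast($rid);
        if (!ref $rc) {
            # Negative return: the request failed, so stop polling this ID.
            $remote_blast->remove_rid($rid) if $rc < 0;
            sleep 5;    # otherwise not ready yet; wait before retrying
        } else {
            # Finished: $rc is a report object; read it, then drop the ID.
            my $result = $rc->next_result;
            print "Got result for ", $result->query_name, "\n";
            $remote_blast->remove_rid($rid);
        }
    }
}
```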


= = = Original message = = =

Bio::Restriction::EnzymeCollection holds Bio::Restriction::Enzyme objects, 
each with its own name(). Using grouped methods like 
'$collection->cutters(6)' will retrieve a new EnzymeCollection containing 
all six-cutters from the original collection. You should use one of the 
EnzymeCollection accessor methods to retrieve the enzyme that you wanted 
first or iterate through them all. This works for me:

use Bio::Restriction::EnzymeCollection;
my $all_collection = Bio::Restriction::EnzymeCollection->new();
my $six_cutter_collection = $all_collection->cutters(6);
for my $enz ($six_cutter_collection->each_enzyme){
    print $enz->name,"\t",$enz->site,"\t",$enz->overhang_seq,"\n";
}

chris


From cjfields at uiuc.edu  Wed Jun 13 14:41:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 09:41:55 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
References: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
Message-ID: <4F7BE556-BD8C-4378-BDE7-1F31364F49DA@uiuc.edu>

Judging by the output it looks like you have no network access or  
can't connect to the server (what remoteblast needs).  Make sure you  
don't need proxy settings.

To preempt the next question, no, I'm not going to explain what a  
proxy is.  The RemoteBlast docs show how to set them, and Google is a  
wonderful tool...
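
One quick way to separate a BioPerl problem from a network problem is to try
the same HTTP connection with LWP directly, outside RemoteBlast. This is only
a sketch: the 15-second timeout is arbitrary, and env_proxy() merely picks up
proxy settings from environment variables such as http_proxy if they happen
to be set. It does not show RemoteBlast's own proxy configuration, for which
the module docs are the place to look.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new(timeout => 15);
$ua->env_proxy;    # use http_proxy etc. from the environment, if set

# Same host and port that the warning above complains about.
my $res   = $ua->head('http://www.ncbi.nlm.nih.gov/');
my $label = $res->is_success ? 'reachable' : 'failed';
print "$label: ", $res->status_line, "\n";
```

If this also fails with a "Can't connect" status, the cause is most likely
outside BioPerl (firewall, proxy, or DNS).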

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:

> ...
> -------------------- WARNING ---------------------
> MSG: <HTML>
> <HEAD><TITLE>An Error Occurred</TITLE></HEAD>
> <BODY>
> <H1>An Error Occurred</H1>
> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
> </BODY>
> </HTML>
>
> ---------------------------------------------------
> ...


From ryanx07 at hotmail.com  Wed Jun 13 15:01:07 2007
From: ryanx07 at hotmail.com (L Xu)
Date: Wed, 13 Jun 2007 10:01:07 -0500
Subject: [Bioperl-l] Example code in Bioperl Tutorial
Message-ID: <BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>

I do have an internet connection, but I do not use a proxy server.
I tested the network connection with the ping command (below). The NCBI website 
does not respond. Is there any special network setting needed to connect to 
the NCBI website?
Thank you so much.

C:\>ping www.yahoo.com

Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:

Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
Reply from 69.147.114.210: bytes=32 time=360ms TTL=45

Ping statistics for 69.147.114.210:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 312ms, Maximum = 363ms, Average = 338ms

C:\>ping www.ncbi.nlm.nih.gov

Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:

Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 130.14.29.110:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),


= = = Original message = = =

Judging by the output it looks like you have no network access or can't 
connect to the server (what remoteblast needs). Make sure you don't need 
proxy settings.

To preempt the next question, no, I'm not going to explain what a proxy 
is. The RemoteBlast docs show how to set them, and Google is a wonderful 
tool...

chris

On Jun 13, 2007, at 7:16 AM, L Xu wrote:


   ...
-------------------- WARNING ---------------------
MSG: <HTML>
<HEAD><TITLE>An Error Occurred</TITLE></HEAD>
<BODY>
<H1>An Error Occurred</H1>
500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
</BODY>
</HTML>

---------------------------------------------------
...



From cjfields at uiuc.edu  Wed Jun 13 16:14:22 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 11:14:22 -0500
Subject: [Bioperl-l] method naming
Message-ID: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>

Some quick questions on method naming.  I couldn't find this on the  
mail list previously and just want some opinions.

1) Is there any preference on how to name a method that returns a  
list of class instances vs. data?  I have seen 'each' (each_Location,  
each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.  
simple (hits, hsps).

2) Do we want methods that return objects to have the object name  
in Title Case (each_Location, get_Seq_by_id, etc) or does it really  
matter?

chris


From dmessina at wustl.edu  Wed Jun 13 16:41:53 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 13 Jun 2007 11:41:53 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>

> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data?  I have seen 'each' (each_Location,
> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
> simple (hits, hsps).

I'd prefer 'get_all' because it's more intuitive to me what the  
method is doing. 'Each' is too programmer-y.


> 2) Do we want methods that return objects to have the object name
> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
> matter?

I like Title Case because it reinforces the notion that what you're  
getting back is a specific object with that name (Seq) rather than  
the generic thing that the name represents (AGTCTGTGATAT, the actual  
sequence as a string).


Dave


From hlapp at gmx.net  Wed Jun 13 17:03:59 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 13:03:59 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
Message-ID: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>

We set a convention a while back on how to name these. It is  
implemented in the bioperl.lisp file (too bad no one is using emacs  
any more these days - it's a great editor), and in fact we started a  
renaming campaign (not sure when that was) on the SeqI and  
SeqFeatureI classes (you'll still see the old names aliased).

However, we never got to finish the clean up.

The convention was to use get_{ClassName}s, and get_all_{ClassName}s  
if there is a difference to the former (mostly because of  
hierarchical data; for example features can be nested, and  
get_all_SeqFeatures returns them all flattened out, while  
get_SeqFeatures returns only the top objects), and for modifying  
add_{ClassName} and remove_{ClassName}s.

The class name was to be in title case to emphasize the fact that it  
is an array of objects you'd be getting back (and what kind of  
objects). If it is strings or any other scalar type, the name would  
be in lower case.
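
A minimal sketch of the convention, assuming a made-up Toy::Feature class (the package, its constructor arguments and its internals are hypothetical and not BioPerl code; only the method-name pattern follows the rules above):

```perl
use strict;
use warnings;

# Hypothetical toy class, NOT part of BioPerl; it only mirrors the naming scheme.
package Toy::Feature;

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{-name}, subfeatures => [] }, $class;
}

# Lower-case name: the method returns a plain scalar (a string).
sub name { return $_[0]->{name} }

# add_{ClassName}: attach one nested object.
sub add_SeqFeature {
    my ($self, $feat) = @_;
    push @{ $self->{subfeatures} }, $feat;
    return $self;
}

# get_{ClassName}s: only the objects directly below this one.
sub get_SeqFeatures { return @{ $_[0]->{subfeatures} } }

# get_all_{ClassName}s: the whole hierarchy, flattened into one list.
sub get_all_SeqFeatures {
    my ($self) = @_;
    my @all;
    for my $feat ($self->get_SeqFeatures) {
        push @all, $feat, $feat->get_all_SeqFeatures;
    }
    return @all;
}

# remove_{ClassName}s: detach the direct children and hand them back.
sub remove_SeqFeatures {
    my ($self) = @_;
    my @removed = @{ $self->{subfeatures} };
    $self->{subfeatures} = [];
    return @removed;
}

package main;

my $gene = Toy::Feature->new(-name => 'gene');
my $mrna = Toy::Feature->new(-name => 'mRNA');
my $exon = Toy::Feature->new(-name => 'exon');
$mrna->add_SeqFeature($exon);
$gene->add_SeqFeature($mrna);

my @top = map { $_->name } $gene->get_SeqFeatures;      # direct children only
my @all = map { $_->name } $gene->get_all_SeqFeatures;  # nested ones as well
print "top: @top\n";
print "all: @all\n";
```

Here the Title Case part of each name signals that objects come back, while name() stays lower case because it returns a string.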

	-hilmar

On Jun 13, 2007, at 12:14 PM, Chris Fields wrote:

> Some quick questions on method naming.  I couldn't find this on the
> mail list previously and just want some opinions.
>
> 1) Is there any preference on how to name a method that returns a
> list of class instances vs. data?  I have seen 'each' (each_Location,
> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
> simple (hits, hsps).
>
> 2) Do we want methods that return objects to have the object name
> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
> matter?
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 13 17:19:43 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 12:19:43 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
Message-ID: <B7E2E5CA-3027-4D25-B9EA-998D2BC59DBB@uiuc.edu>

Sounds good.  I also agree with Dave on the use of 'each', as it's a  
bit ambiguous (seems to imply iteration as opposed to returning a  
whole list).

We probably need to post this somewhere on the wiki for future  
reference; maybe in Advanced BioPerl?  I'll add this in shortly.

chris

On Jun 13, 2007, at 12:03 PM, Hilmar Lapp wrote:

> We set a convention a while back on how to name these. It is  
> implemented in the bioperl.lisp file (too bad no one is using emacs  
> any more these days - it's a great editor), and in fact we started  
> a renaming campaign (not sure when that was) on the SeqI and  
> SeqFeatureI classes (you'll still see the old names aliased).
>
> However, we never got to finish the clean up.
>
> The convention was to use get_{ClassName}s, and  
> get_all_{ClassName}s if there is a difference to the former (mostly  
> because of hierarchical data; for example features can be nested, and  
> get_all_SeqFeatures returns them all flattened out, while  
> get_SeqFeatures returns only the top objects), and for modifying  
> add_{ClassName} and remove_{ClassName}s.
>
> The class name was to be in title case to emphasize the fact that  
> it is an array of objects you'd be getting back (and what kind of  
> objects). If it is strings or any other scalar type, the name would  
> be in lower case.
>
> 	-hilmar
>
> On Jun 13, 2007, at 12:14 PM, Chris Fields wrote:
>
>> Some quick questions on method naming.  I couldn't find this on the
>> mail list previously and just want some opinions.
>>
>> 1) Is there any preference on how to name a method that returns a
>> list of class instances vs. data?  I have seen 'each' (each_Location,
>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
>> simple (hits, hsps).
>>
>> 2) Do we want methods that return objects to have the object name
>> in Title Case (each_Location, get_Seq_by_id, etc) or does it really
>> matter?
>>
>> chris
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Wed Jun 13 18:43:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 13:43:41 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <467036FC.8000505@watson.wustl.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
	<467036FC.8000505@watson.wustl.edu>
Message-ID: <286EE81C-0926-4AAE-9110-02948DFADF36@uiuc.edu>


On Jun 13, 2007, at 1:27 PM, Michael Kiwala wrote:

>
> David Messina wrote:
>>> 1) Is there any preference on how to name a method that returns a
>>> list of class instances vs. data?  I have seen  
>>> 'each' (each_Location,
>>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures)  
>>> vs.
>>> simple (hits, hsps).
>>>
>>
>> I'd prefer 'get_all' because it's more intuitive to me what the   
>> method is doing. 'Each' is too programmer-y.
>>
>>
>>
> When I think 'get_all', I think of a method that returns a list of  
> objects at once. When I think of 'each', I think of a method that  
> returns a scalar but can be called multiple times to iterate over a  
> set of objects.

Yep, hence the ambiguity issue (and my confusion).  I think it was so  
you could both iterate and return a list using this:

for my $obj ($seq->each_Class) {...}
my @objs = $seq->each_Class;

I use 'next' and 'get/get_all' as an iterator and get accessor  
(similar to how it's used in Bio::SearchIO):

while (my $obj = $seq->next_Class) {...}
my @objs = $seq->get_Class; # or get_all_Class for flattened lists

which to me is much clearer.
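
To make the distinction concrete, here is a rough sketch with a made-up container class (Toy::Container, next_Item, get_Items and rewind are all hypothetical names chosen for this example, not existing BioPerl methods):

```perl
use strict;
use warnings;

# Hypothetical container, not a BioPerl class.
package Toy::Container;

sub new {
    my ($class, @items) = @_;
    return bless { items => [@items], cursor => 0 }, $class;
}

# get_Items: list accessor, returns everything at once.
sub get_Items { return @{ $_[0]->{items} } }

# next_Item: iterator, returns one element per call and undef when exhausted.
sub next_Item {
    my ($self) = @_;
    return if $self->{cursor} >= @{ $self->{items} };
    return $self->{items}[ $self->{cursor}++ ];
}

# rewind: reset the iterator so the set can be walked again.
sub rewind { $_[0]->{cursor} = 0; return }

package main;

my $box = Toy::Container->new(qw(hitA hitB hitC));

my @seen;
while (defined(my $item = $box->next_Item)) {   # one element per call
    push @seen, $item;
}
my @all = $box->get_Items;                       # the whole list in one go
print "iterated: @seen\n";
print "listed:   @all\n";
```

The iterator keeps state between calls (hence the rewind helper), whereas the list accessor does not, which is one reason to give the two behaviours different names.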

chris


From mkiwala at watson.wustl.edu  Wed Jun 13 18:27:08 2007
From: mkiwala at watson.wustl.edu (Michael Kiwala)
Date: Wed, 13 Jun 2007 13:27:08 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<9A51046D-4827-4CF7-A2B7-7880E03129E9@wustl.edu>
Message-ID: <467036FC.8000505@watson.wustl.edu>


David Messina wrote:
>> 1) Is there any preference on how to name a method that returns a
>> list of class instances vs. data?  I have seen 'each' (each_Location,
>> each_tag_value) vs. 'get_all' (get_all_tags, get_all_SeqFeatures) vs.
>> simple (hits, hsps).
>>     
>
> I'd prefer 'get_all' because it's more intuitive to me what the  
> method is doing. 'Each' is too programmer-y.
>
>
>   
When I think 'get_all', I think of a method that returns a list of 
objects at once. When I think of 'each', I think of a method that 
returns a scalar but can be called multiple times to iterate over a set 
of objects.


From sac at bioperl.org  Wed Jun 13 21:17:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Wed, 13 Jun 2007 14:17:27 -0700
Subject: [Bioperl-l] method naming
In-Reply-To: <4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
Message-ID: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>

On 6/13/07, Hilmar Lapp <hlapp at gmx.net> wrote:
> We set a convention a while back on how to name these. It is
> implemented in the bioperl.lisp file (too bad no one is using emacs
> any more these days - it's a great editor),

As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we
could improve the visibility of bioperl.lisp. In truth, I had
forgotten about it, though it turns out I was loading an old version
of it. (Btw, using the latest version of bioperl.lisp with xemacs
21.4.17, I don't get a bioperl menu item, though I can access bioperl
functions via M-x. Suggestions?)

I see bioperl.lisp is mentioned twice parenthetically in the advanced
bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here
would help. While we're at it, maybe we could add a bioperl.vi file to
the distribution (if you can do such things with vi/vim).

On 6/13/07, Chris Fields <cjfields at uiuc.edu> wrote:
> We probably need to post this somewhere on the wiki for future
> reference; maybe in Advanced BioPerl?  I'll add this in shortly.

Another idea: Add a method naming check to the set of audits we
perform on CVS committed code. It could check for agreement with our
conventions and warn if nothing was found (may not be a problem
though).

Steve


From arareko at campus.iztacala.unam.mx  Wed Jun 13 22:03:34 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 13 Jun 2007 17:03:34 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
Message-ID: <467069B6.7080003@campus.iztacala.unam.mx>

By the time of the 1.5.2 release, I jumped onto the idea of creating a 
BioPerl template for Komodo. Chris F handed me one he had already made 
but in the end I didn't have enough spare time to get into it. If someone 
wants to give it a try please let ChrisF/me know.

Regards,
Mauricio.

Steve Chervitz wrote:
> On 6/13/07, Hilmar Lapp <hlapp at gmx.net> wrote:
>> We set a convention a while back on how to name these. It is
>> implemented in the bioperl.lisp file (too bad no one is using emacs
>> any more these days - it's a great editor),
> 
> As a staunch xemacs-ophile I couldn't let that one slip by. Maybe we
> could improve the visibility of bioperl.lisp. In truth, I had
> forgotten about it, though it turns out I was loading an old version
> of it. (Btw, using the latest version of bioperl.lisp with xemacs
> 21.4.17, I don't get a bioperl menu item, though I can access bioperl
> functions via M-x. Suggestions?)
> 
> I see bioperl.lisp is mentioned twice parenthetically in the advanced
> bioperl wiki page. Perhaps a separate 'Editor/IDE support' item here
> would help. While we're at it, maybe we could add a bioperl.vi file to
> the distribution (if you can do such things with vi/vim).
> 
> On 6/13/07, Chris Fields <cjfields at uiuc.edu> wrote:
>> We probably need to post this somewhere on the wiki for future
>> reference; maybe in Advanced BioPerl?  I'll add this in shortly.
> 
> Another idea: Add a method naming check to the set of audits we
> perform on CVS committed code. It could check for agreement with our
> conventions and warn if nothing was found (may not be a problem
> though).
> 
> Steve
> 

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From hlapp at gmx.net  Wed Jun 13 22:41:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 18:41:45 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
Message-ID: <FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>


On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:

> using the latest version of bioperl.lisp with xemacs 21.4.17, I  
> don't get a bioperl menu item

I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item  
is showing up just beautifully. (BTW it also has very nice icons for  
various functions - though I always feel guilty for using keystrokes  
instead.)

Is GNU Emacs finally winning this? ;)

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jason at bioperl.org  Wed Jun 13 22:58:51 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 13 Jun 2007 15:58:51 -0700
Subject: [Bioperl-l] method naming
In-Reply-To: <FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
Message-ID: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>

Post your dueling screenshots to the wiki!

I had started a couple of IDE pages on the wiki a while ago:
  http://bioperl.org/wiki/Emacs
  http://bioperl.org/wiki/Emacs_template
  http://bioperl.org/wiki/Vi

If anyone is feeling excited enough to write a few more IDE pages and  
link them into a common article that would be great.

-jason
On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote:

>
> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:
>
>> using the latest version of bioperl.lisp with xemacs 21.4.17, I
>> don't get a bioperl menu item
>
> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item
> is showing up just beautifully. (BTW it also has very nice icons for
> various functions - though I always feel guilty for using keystrokes
> instead.)
>
> Is GNU Emacs finally winning this? ;)
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From cjfields at uiuc.edu  Wed Jun 13 23:08:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 18:08:17 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
	<4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
Message-ID: <E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>

Would probably be worth writing one up for Komodo since Mauricio,  
Sendu, and I use it.

I updated the Advanced BioPerl page with Hilmar's methods suggestions/ 
rules (as well as a few I found dating back a number of years on the  
mail list).  It might be worth a glance in case there are any changes  
needed:

http://www.bioperl.org/wiki/Advanced_BioPerl

chris

On Jun 13, 2007, at 5:58 PM, Jason Stajich wrote:

> Post your dueling screenshots to the wiki!
>
> I had started a couple of IDE pages on the wiki a while ago:
>  http://bioperl.org/wiki/Emacs
>  http://bioperl.org/wiki/Emacs_template
>  http://bioperl.org/wiki/Vi
>
> If anyone is feeling excited enough to write a few more IDE pages  
> and link them into a common article that would be great.
>
> -jason
> On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote:
>
>>
>> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:
>>
>>> using the latest version of bioperl.lisp with xemacs 21.4.17, I
>>> don't get a bioperl menu item
>>
>> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item
>> is showing up just beautifully. (BTW it also has very nice icons for
>> various functions - though I always feel guilty for using keystrokes
>> instead.)
>>
>> Is GNU Emacs finally winning this? ;)
>>
>> 	-hilmar
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Wed Jun 13 23:28:17 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 13 Jun 2007 19:28:17 -0400
Subject: [Bioperl-l] method naming
In-Reply-To: <E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
	<4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
	<E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>
Message-ID: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net>

Thanks Chris for doing this - looks great. The only comment that I  
have is that method names should never start with a capital letter.  
If the getter/setter is for a single object (as opposed to a list),  
the name should probably be similar (if not identical) to the class  
being expected and returned, but lower-case.

E.g., $feature->location(), $seq->species() etc
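
As a rough illustration of such a lower-case getter/setter (Toy::Feature and Toy::Location are hypothetical stand-ins written for this example, not the real BioPerl classes):

```perl
use strict;
use warnings;

# Hypothetical classes for illustration only; neither is part of BioPerl.
package Toy::Location;

sub new {
    my ($class, %args) = @_;
    return bless { start => $args{-start}, end => $args{-end} }, $class;
}
sub start { return $_[0]->{start} }
sub end   { return $_[0]->{end} }

package Toy::Feature;
use Scalar::Util qw(blessed);

sub new { my ($class) = @_; return bless {}, $class }

# Lower-case accessor named after the class it stores and returns.
# Called with an argument it sets the value; it always returns the current one.
sub location {
    my ($self, $loc) = @_;
    if (defined $loc) {
        die "location() expects a Toy::Location object\n"
            unless blessed($loc) && $loc->isa('Toy::Location');
        $self->{location} = $loc;
    }
    return $self->{location};
}

package main;

my $feature = Toy::Feature->new;
$feature->location( Toy::Location->new(-start => 10, -end => 42) );
printf "feature spans %d..%d\n",
    $feature->location->start, $feature->location->end;
```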

	-hilmar

On Jun 13, 2007, at 7:08 PM, Chris Fields wrote:

> Would probably be worth writing one up for Komodo since Mauricio,  
> Sendu, and I use it.
>
> I updated the Advanced BioPerl page with Hilmar's methods  
> suggestions/rules (as well as a few I found dating back a number of  
> years on the mail list).  It might be worth a glance in case there  
> are any changes needed:
>
> http://www.bioperl.org/wiki/Advanced_BioPerl
>
> chris
>
> On Jun 13, 2007, at 5:58 PM, Jason Stajich wrote:
>
>> Post your dueling screenshots to the wiki!
>>
>> I had started a couple of IDE pages on the wiki a while ago:
>>  http://bioperl.org/wiki/Emacs
>>  http://bioperl.org/wiki/Emacs_template
>>  http://bioperl.org/wiki/Vi
>>
>> If anyone is feeling excited enough to write a few more IDE pages  
>> and link them into a common article that would be great.
>>
>> -jason
>> On Jun 13, 2007, at 3:41 PM, Hilmar Lapp wrote:
>>
>>>
>>> On Jun 13, 2007, at 5:17 PM, Steve Chervitz wrote:
>>>
>>>> using the latest version of bioperl.lisp with xemacs 21.4.17, I
>>>> don't get a bioperl menu item
>>>
>>> I'm using GNU Emacs 22.0.50.1 (as Aquamacs) and the BioPerl menu item
>>> is showing up just beautifully. (BTW it also has very nice icons for
>>> various functions - though I always feel guilty for using keystrokes
>>> instead.)
>>>
>>> Is GNU Emacs finally winning this? ;)
>>>
>>> 	-hilmar
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> Jason Stajich
>> jason at bioperl.org
>> http://jason.open-bio.org/
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 13 23:44:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 18:44:08 -0500
Subject: [Bioperl-l] method naming
In-Reply-To: <06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net>
References: <724E5A3F-22CF-41B6-AC33-CD5EAD7D1251@uiuc.edu>
	<4E61A33B-39A9-4F43-997A-3CD7D51B69A6@gmx.net>
	<8f200b4c0706131417y3b089058sae076e6d00adf81a@mail.gmail.com>
	<FF228FE7-491E-4C21-BD82-28CCF082B029@gmx.net>
	<4D2FAB6B-49DE-443A-A040-25579B2E7212@bioperl.org>
	<E22EE74E-7C6F-46B0-9D05-BA5D02F4F6C6@uiuc.edu>
	<06AE29E7-6FFA-4F92-8BC8-39D9E48549E4@gmx.net>
Message-ID: <91AF2018-EC27-49FD-A4D1-C31C0E73DEFB@uiuc.edu>

Agreed.  We can definitely add that in.

As we edge towards another release we can try another round of cleaning  
up.  I wouldn't mind pushing out another 1.5 point release before  
summer's up if possible; most of the tough work was done for v.1.5.2  
by Sendu.

chris

On Jun 13, 2007, at 6:28 PM, Hilmar Lapp wrote:

> Thanks Chris for doing this - looks great. The only comment that I
> have is that method names should never start with a capital letter.
> If the getter/setter is for a single object (as opposed to a list),
> the name should probably be similar (if not identical) to the class
> being expected and returned, but lower-case.
>
> E.g., $feature->location(), $seq->species() etc
>
> 	-hilmar
>
> On Jun 13, 2007, at 7:08 PM, Chris Fields wrote:
>
>> Would probably be worth writing one up for Komodo since Mauricio,
>> Sendu, and I use it.
>>
>> I updated the Advanced BioPerl page with Hilmar's methods
>> suggestions/rules (as well as a few I found dating back a number of
>> years on the mail list).  It might be worth a glance in case there
>> are any changes needed:
>>
>> http://www.bioperl.org/wiki/Advanced_BioPerl
>>
>> chris
...


From johncumbers at gmail.com  Thu Jun 14 00:20:42 2007
From: johncumbers at gmail.com (John Cumbers)
Date: Wed, 13 Jun 2007 20:20:42 -0400
Subject: [Bioperl-l] How can I pull out all instances of a motif from a
	genome sequence and output them as a BED file?
Message-ID: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>

Hello,

I have a simple problem, I'm trying to search a genome sequence for a motif,
I then want to output a BED file to display all the locations of this motif
on the UCSC Genome Browser.  I could not find a script to do this, so I
started to write my own.   I'm new to perl and my code below was my attempt
to read the sequence string and output the index bp of the start of each
motif.  With this I could build the BED file myself, which requires start
and finish base pairs.

For the first motif I can output the start index, but when I try and read
the next one off the sequence it does not work.  Instead I just get an
output of a list of 1's.  I realise that this is more a request for some
simple perl help, but any help much appreciated.

Best wishes,
John


$seq_object = read_sequence("Drosophila.Chr3.test.AE014296.fasta");
# turn my FASTA file into a seq object
$sequence_as_a_string = $seq_object->seq();  #turn it into a string
# search $sequence_as_a_string  string for motif AAA as example
# if found, return the index that it is found at

while ($sequence_as_a_string =~ m/AAA/g) {
  print "Found '$&'.  Next attempt at character " .
pos($sequence_as_a_string)+1 . "\n";
}


-- 
John Cumbers,  Graduate Student
Biology and Medicine
Brown University, Box G-W
Providence, Rhode Island, 02912, USA
Tel USA: +1 401 523 8190,  Fax: +1 401 863-2166
UK to USA: 0207 617 7824


From cjfields at uiuc.edu  Thu Jun 14 01:58:37 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 13 Jun 2007 20:58:37 -0500
Subject: [Bioperl-l] How can I pull out all instances of a motif from a
	genome sequence and output them as a BED file?
In-Reply-To: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>
References: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>
Message-ID: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>

This is answered in the FAQ (sorry if the URL wraps, but we don't  
like tinyurls):

http://www.bioperl.org/wiki/ 
FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_. 
22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F
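
On the "list of 1's": in the quoted loop this looks like an operator precedence effect rather than a regex problem. In Perl, "." and "+" have the same precedence and associate to the left, so the string is concatenated with pos() first and only then is 1 added; the text numifies to 0, leaving just 1. A minimal plain-Perl sketch follows, assuming a short stand-in string instead of the real chromosome and an assumed chromosome label; BED lines take a 0-based start and an exclusive end, which is what @- and @+ provide:

```perl
use strict;
use warnings;

# Stand-in data; the real string would come from $seq_object->seq().
my $sequence = 'CCAAAGGTTAAAAC';
my $motif    = 'AAA';
my $chrom    = 'chr3L';    # assumed label; use whatever name the browser expects

my @bed_lines;
while ($sequence =~ /\Q$motif\E/g) {
    # Do the addition on its own so "." cannot swallow the left operand first.
    my $next = pos($sequence) + 1;
    print "Found '$motif'.  Next attempt at character $next\n";

    # $-[0] and $+[0] are the 0-based start and end offsets of the last match,
    # i.e. exactly the chromStart/chromEnd pair a BED line needs.
    push @bed_lines, join("\t", $chrom, $-[0], $+[0]);
}
print "$_\n" for @bed_lines;
```

Note that a plain /g match like this skips overlapping occurrences; whether that matters depends on the motif.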

chris

On Jun 13, 2007, at 7:20 PM, John Cumbers wrote:

> Hello,
>
> I have a simple problem, I'm trying to search a genome sequence for  
> a motif,
> I then want to output a BED file to display all the locations of  
> this motif
> on the UCSC Genome Browser.  I could not find a script to do this,  
> so I
> started to write my own.   I'm new to perl and my code below was my  
> attempt
> to read the sequence string and output the index bp of the start of  
> each
> motif.  With this I could build the BED file myself, which requires  
> start
> and finish base pairs.
>
> For the first motif I can output the start index, but when I try  
> and read
> the next one off the sequence it does not work.  Instead I just get an
> output of a list of 1's.  I realise that this is more a request for  
> some
> simple perl help, but any help much appreciated.
>
> Best wishes,
> John
>
>
> $seq_object = read_sequence 
> ("Drosophila.Chr3.test.AE014296.fasta");  #turn
> my FASTA file into a seq object.
> $sequence_as_a_string = $seq_object->seq();  #turn it into a string
> # search $sequence_as_a_string  string for motif AAA as example
> # if found, return the index that it is found at
>
> while ($sequence_as_a_string =~ m/AAA/g) {
>   print "Found '$&'.  Next attempt at character " .
> pos($sequence_as_a_string)+1 . "\n";
> }
>
>
>
> -- 
> John Cumbers,  Graduate Student
> Biology and Medicine
> Brown University, Box G-W
> Providence, Rhode Island, 02912, USA
> Tel USA: +1 401 523 8190,  Fax: +1 401 863-2166
> UK to USA: 0207 617 7824

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Thu Jun 14 04:08:04 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 13 Jun 2007 21:08:04 -0700
Subject: [Bioperl-l] wiki bulk update
Message-ID: <992B2C7A-E944-4C69-BDE0-B0B0F6D1274D@bioperl.org>

I did a bulk update of Module pages for new modules that had  
been created since we last setup these pages:
I outlined a little bit of what it requires behind the scenes.

http://bioperl.org/wiki/BioPerl:Module_pages

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From bix at sendu.me.uk  Thu Jun 14 09:35:00 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 10:35:00 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
Message-ID: <46710BC4.3060302@sendu.me.uk>

It is preferable to have ->new syntax over new Object syntax, as 
outlined here: 
http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules

I propose making this syntax change in all Bioperl POD documentation, so 
that the bad syntax is no longer suggested/encouraged. Any objections? 
If not, I'll go ahead and commit the changes.

(affects 907 modules in live)
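
For anyone unsure what the two forms look like side by side, a small sketch with a hypothetical Toy::Seq class standing in for any Bio::* module (the class and its -id argument are made up for this example):

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Bio::* class.
package Toy::Seq;

sub new {
    my ($class, %args) = @_;
    return bless { id => $args{-id} }, $class;
}
sub id { return $_[0]->{id} }

package main;

# Discouraged (indirect object syntax); Perl has to guess at compile time
# whether "new" is a method call or an ordinary function:
#   my $seq = new Toy::Seq(-id => 'seq1');

# Preferred: the class name and the method call are unambiguous.
my $seq = Toy::Seq->new(-id => 'seq1');
print $seq->id, "\n";
```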


Cheers,
Sendu.


From bix at sendu.me.uk  Thu Jun 14 10:01:02 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 11:01:02 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46710BC4.3060302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
Message-ID: <467111DE.6060800@sendu.me.uk>

Sendu Bala wrote:
> It is preferable to have ->new syntax over new Object syntax, as 
> outlined here: 
> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules 
> 
> 
> I propose making this syntax change in all Bioperl POD documentation,

Actually, I propose making the change to code as well.


From hlapp at gmx.net  Thu Jun 14 12:47:47 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 14 Jun 2007 08:47:47 -0400
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <467111DE.6060800@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk>
Message-ID: <0D7CD74F-DCB3-44F8-9AC7-144B1BD58946@gmx.net>

Sounds fine to me. People do go by working examples, and I've seen  
inconsistent examples leading to confusion on the end of newbies.

	-hilmar

On Jun 14, 2007, at 6:01 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
>>
>>
>> I propose making this syntax change in all Bioperl POD documentation,
>
> Actually, I propose making the change to code as well.
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Jun 14 12:55:18 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 07:55:18 -0500
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <467111DE.6060800@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <467111DE.6060800@sendu.me.uk>
Message-ID: <EC0DB8AB-F7C8-423B-9566-34B3FD24B3EC@uiuc.edu>

Sounds fine by me.  I may actually start tackling some of the feature/ 
annotation overloading stuff myself to see what happens (I'll drop a  
notice when that occurs).

chris

On Jun 14, 2007, at 5:01 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
>>
>>
>> I propose making this syntax change in all Bioperl POD documentation,
>
> Actually, I propose making the change to code as well.


From tanzeem.mb at gmail.com  Thu Jun 14 06:27:19 2007
From: tanzeem.mb at gmail.com (tanzeem)
Date: Wed, 13 Jun 2007 23:27:19 -0700 (PDT)
Subject: [Bioperl-l] Problem working with remoteblast submit method in
	webbrowser.
Message-ID: <11114623.post@talk.nabble.com>


 I have a program which uses the BioPerl RemoteBlast module to compare an
amino acid FASTA file with the Swiss-Prot database. The submit_blast() method
works successfully when run from the command line, but when the program is run
from a web browser it returns -1. I was trying to adapt the code from the
RemoteBlast synopsis for my needs.
-- 
View this message in context: http://www.nabble.com/Problem-working-with-remoteblast-submit-method-in-webbrowser.-tf3919886.html#a11114623
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From bix at sendu.me.uk  Thu Jun 14 15:34:27 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 16:34:27 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46710BC4.3060302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
Message-ID: <46716003.2030302@sendu.me.uk>

Sendu Bala wrote:
> It is preferable to have ->new syntax over new Object syntax, as 
> outlined here: 
> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
> 
> I propose making this syntax change in all Bioperl POD documentation, so 
> that the bad syntax is no longer suggested/encouraged. Any objections? 
> If not, I'll go ahead and commit the changes.
> 
> (affects 907 modules in live)

It was actually 515 modules & test scripts from live, 48 from run, 21
from db and 2 from network.

Now committed. Before and after my changes these were failing:


Failed Test     Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/BioGraphics.t    3   768    38    3  3-5
t/PodSyntax.t      9  2304  2195    9  378 614 660 1023 1197 1512 1558
                                        1932 2106
t/Sopma.t          2   512    16    2  8 15
t/genbank.t        2   512   247    2  122-123


BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
(unintentional?).

Sopma may not be a bug: results from server might have changed.

genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163
-> 1.164 not doing what the new tests expect.

PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
you working on that, or can I fix those errors?

Anyone care to look into those things?

Cheers,
Sendu.


From cjfields at uiuc.edu  Thu Jun 14 16:35:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 11:35:21 -0500
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <AAFC1021-9E3A-4C31-A9B8-4B0046F907A1@uiuc.edu>

The genbank commit was mine, so I'll look into it; it may be that I
hadn't finished up the bug work.  If I have time I'll look into
Sopma as well (unless you get to it first).

chris

On Jun 14, 2007, at 10:34 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as
>> outlined here:
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
>>
>> I propose making this syntax change in all Bioperl POD documentation, so
>> that the bad syntax is no longer suggested/encouraged. Any objections?
>> If not, I'll go ahead and commit the changes.
>>
>> (affects 907 modules in live)
>
> It was actually 515 modules & test scripts from live, 48 from run, 21
> from db and 2 from network.
>
> Now committed. Before and after my changes these were failing:
>
>
> Failed Test     Stat Wstat Total Fail  List of Failed
> -------------------------------------------------------------------------------
> t/BioGraphics.t    3   768    38    3  3-5
> t/PodSyntax.t      9  2304  2195    9  378 614 660 1023 1197 1512 1558
>                                         1932 2106
> t/Sopma.t          2   512    16    2  8 15
> t/genbank.t        2   512   247    2  122-123
>
>
> BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
> (unintentional?).
>
> Sopma may not be a bug: results from server might have changed.
>
> genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163
> -> 1.164 not doing what the new tests expect.
>
> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
> you working on that, or can I fix those errors?
>
> Anyone care to look into those things?
>
> Cheers,
> Sendu.
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Thu Jun 14 16:43:43 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 17:43:43 +0100
Subject: [Bioperl-l] Perltidy
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <4671703F.4010109@sheffield.ac.uk>

I'm just wondering if anyone passes their modules through perltidy in
order for them to have the same look/feel? If so, do you have a
.perltidyrc file? Also, is it worth running the Bioperl modules through it?

Nath


From n.haigh at sheffield.ac.uk  Thu Jun 14 16:36:37 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 17:36:37 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716003.2030302@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
Message-ID: <46716E95.3090604@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Sendu Bala wrote:
>> It is preferable to have ->new syntax over new Object syntax, as 
>> outlined here: 
>> http://www.bioperl.org/wiki/Bioperl_Best_Practices#BioPerl_Object-oriented_programming_and_modules
>>
>> I propose making this syntax change in all Bioperl POD documentation, so 
>> that the bad syntax is no longer suggested/encouraged. Any objections? 
>> If not, I'll go ahead and commit the changes.
>>
>> (affects 907 modules in live)
> 
> It was actually 515 modules & test scripts from live, 48 from run, 21
> from db and 2 from network.
> 
> Now committed. Before and after my changes these were failing:
> 
> 
> Failed Test     Stat Wstat Total Fail  List of Failed
> -------------------------------------------------------------------------------
> t/BioGraphics.t    3   768    38    3  3-5
> t/PodSyntax.t      9  2304  2195    9  378 614 660 1023 1197 1512 1558
>                                         1932 2106
> t/Sopma.t          2   512    16    2  8 15
> t/genbank.t        2   512   247    2  122-123
> 
> 
> BioGraphics failure caused by Graphics Panel.pm 1.135 -> 1.136
> (unintentional?).
> 
> Sopma may not be a bug: results from server might have changed.
> 
> genbank caused by genbank.t 1.20 -> 1.21 and presumably genbank.pm 1.163
> -> 1.164 not doing what the new tests expect.
> 
> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
> you working on that, or can I fix those errors?
> 

I can fix these - although I'm still trying to get my new Debian 4.0
system up-to-speed so it might take me a little while! RE the
PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't
installed. However, would it be better to have Test::Pod in t/lib so
that it runs on the user's system during installation or leave it as is?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGcW6VczuW2jkwy2gRAv3dAKCURgd4F881MhbessKxNh/cPrJu2wCeLwnS
7olroF2e6+4I0biz6fWRmu4=
=s3hK
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Thu Jun 14 17:15:24 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 18:15:24 +0100
Subject: [Bioperl-l] Perltidy
In-Reply-To: <4671703F.4010109@sheffield.ac.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk>
Message-ID: <467177AC.8060104@sendu.me.uk>

Nathan S. Haigh wrote:
> I'm just wondering if anyone passes their modules through perltidy in
> order for them to have the same look/feel? If so, do you have a
> .perltidyrc file? Also, is it worth running the Bioperl modules through it?

I don't use it, but I was contemplating the same thing. Chris uses it 
from time to time and I think we have a similar taste in style.

But we'd have to hammer something out that was agreeable to everyone.


From mmokrejs at ribosome.natur.cuni.cz  Thu Jun 14 17:19:42 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Thu, 14 Jun 2007 19:19:42 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
Message-ID: <467178AE.5040905@ribosome.natur.cuni.cz>


David Messina wrote:
> Hi Martin,
> 
> You're in luck -- the BioPerl core distribution includes two scripts  
> for doing just that:
> 
> 	genbank2gff

Somehow these scripts were not installed for me on Gentoo, but I have them in the
cvs copy. ;-) Anyway, the one above is not for me: I do not need the GFF database,
or rather, I have no intention of installing that unknown thing; it seems like overkill
for my case. I just want to render a plasmid map.

> 	genbank2gff3

This one seems more promising but still with current cvs checkout I get...

$ perl /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl --in stdin --out stdout < ~/99.gb 
# Input: stdin
Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, <FH> line 7.
Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, <FH> line 7.
Can't call method "binomial" on an undefined value at /home/mmokrejs/proj/bioperl/bioperl-live/blib/script/bp_genbank2gff3.pl line 675, <FH> line 125.
$
$ bp_seqconvert.pl --from genbank --to embl < ~/IRESite/gb/99.gb 
Use of uninitialized value in split at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1335, <STDIN> line 7.
Use of uninitialized value in quotemeta at /usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO/genbank.pm line 1338, <STDIN> line 7.
ID   unknown; SV 1; circular; unassigned DNA; STD; UNC; 5391 BP.
XX
AC   unknown;
XX
XX
XX
CC   ApEinfo:methylated:0
...

Oh dear, I have just manually edited the files and still they are wrong? Oh no. :(

> 
> Look in the scripts directory of the distro.
> 
> Also, there is a *huge* amount of documentation and examples on the  
> BioPerl website.
> 
> 	http://www.bioperl.org/wiki/HOWTOs

You mean http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File ? ;-)

> 
> Reading those, reading the FAQ, and searching the mailing list  
> archives are where I look first when I don't know how to do something  
> in BioPerl.
> 
> 
> Dave
> 
> --
> Dave Messina
> Senior Analyst, Assembly Group
> Genome Sequencing Center
> Washington University
> St. Louis, MO
> 
> 
> 
> 
> 
> 

-- 
Dr. Martin Mokrejs
Dept. of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 99.gb
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070614/fc6e601a/attachment.ksh>

From mmokrejs at ribosome.natur.cuni.cz  Thu Jun 14 17:23:28 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Thu, 14 Jun 2007 19:23:28 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <467178AE.5040905@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
Message-ID: <46717990.6040509@ribosome.natur.cuni.cz>

Martin MOKREJŠ wrote:

>> Also, there is a *huge* amount of documentation and examples on the  
>> BioPerl website.
>>
>>     http://www.bioperl.org/wiki/HOWTOs
> 
> You mean 
> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File 
> ? ;-)

$ perl embl2picture.pl ~/99.gb | display -
Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature Bio::Location::Simple=HASH(0x893ebac): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature Bio::Location::Simple=HASH(0x893e720): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.

Error returned while evaluating value of 'description' option for glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 141, <GEN0> line 125.
$

The plasmid is circular DNA, so why is the diagram linear? ;-)

Martin


From bix at sendu.me.uk  Thu Jun 14 17:03:34 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 14 Jun 2007 18:03:34 +0100
Subject: [Bioperl-l] new Bio::Object -> Bio::Object->new()
In-Reply-To: <46716E95.3090604@sheffield.ac.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<46716E95.3090604@sheffield.ac.uk>
Message-ID: <467174E6.1090001@sendu.me.uk>

Nathan S. Haigh wrote:
>> PodSyntax is new and obviously a bunch of POD needs fixing: Nathan, are
>> you working on that, or can I fix those errors?
> 
> I can fix these - although I'm still trying to get my new Debian 4.0
> system up-to-speed so it might take me a little while! RE the
> PodSyntax.t, I've got it silently skipping the tests if Test::Pod isn't
> installed. However, would it be better to have Test::Pod in t/lib so
> that it runs on the user's system during installation or leave it as is?

Leave it as is. Every-day users don't need to check the syntax of the 
pod. In fact, it really only needs to be done once, prior to packaging 
up a new release.
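
For illustration, a minimal sketch of a POD syntax test that skips itself
when Test::Pod is missing. This is an assumed shape, not the actual
contents of t/PodSyntax.t; the choice to check files given on the command
line (or the script itself) is purely for the example:

```perl
use strict;
use warnings;
use Test::More;

# Skip the whole file, without failing, when Test::Pod is unavailable.
unless ( eval { require Test::Pod; Test::Pod->VERSION('1.00'); 1 } ) {
    plan skip_all => 'Test::Pod 1.00 or later is required to check POD';
}

# Check the files named on the command line, or this script itself
# (an assumption made so the sketch runs anywhere).
my @files = @ARGV ? @ARGV : ($0);
plan tests => scalar @files;

# One test per file; a file that contains no POD at all still passes.
Test::Pod::pod_file_ok( $_, "POD syntax ok in $_" ) for @files;
```

Because the skip happens before any plan is declared, a user without
Test::Pod sees the test reported as skipped rather than failed, which
matches the "leave it as is" approach: the check only does real work on a
developer machine that has the module.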


From n.haigh at sheffield.ac.uk  Thu Jun 14 17:32:37 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 18:32:37 +0100
Subject: [Bioperl-l] Perltidy
In-Reply-To: <467177AC.8060104@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
Message-ID: <46717BB5.8000706@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> I'm just wondering if anyone passes their modules through perltidy in
>> order for them to have the same look/feel? If so, do you have a
>> .perltidyrc file? Also, is it worth running the Bioperl modules
>> through it?
> 
> I don't use it, but I was contemplating the same thing. Chris uses it
> from time to time and I think we have a similar taste in style.
> 
> But we'd have to hammer something out that was agreeable to everyone.

A starting place may be Perl Best Practices by Damian Conway:
http://www.oreilly.com/catalog/perlbp/


The perltidyrc file can be found here:
http://www.perlmonks.org/?node_id=485885

I also found this nice thread with some ideas, including some code that causes
emacs to auto-perltidy everything you use cperl-mode with. I don't use
emacs myself, but here's the link if anyone is interested:
http://www.perlmonks.org/?node_id=516501

Nath


From johnsonm at gmail.com  Thu Jun 14 17:38:31 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Thu, 14 Jun 2007 12:38:31 -0500
Subject: [Bioperl-l] Perltidy
In-Reply-To: <467177AC.8060104@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
Message-ID: <ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>

    The nice thing about Perl Tidy is that everybody can have their
own config file.  There could be a bioperl default config that gets
applied at checkin time.  Anybody that didn't like it could script
checkouts to get run through their own config.  Diffs might get a
little hairy, but as long as you tidy before diffing, it shouldn't be
too bad.  Speaking of which....coding style is controversial enough,
but since that's already been opened, what about CVS vs Subversion? 8)
 Some of the scripting for this sort of thing might be easier in
Subversion.  Though maybe something like Git would fit the developer
model better (more support for distributed development).
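
As a rough sketch of what applying a shared set of perltidy options from a
script could look like: the option values below are assumptions, not an
agreed BioPerl style, and the helper name tidy_source is hypothetical. The
sketch uses the Perl::Tidy module's perltidy() routine on an in-memory
string and falls back gracefully if the module is not installed:

```perl
use strict;
use warnings;

# Deliberately cramped input to reformat.
my $untidy = <<'END';
sub greet{my($name)=@_;if($name){print "Hello, $name\n";}}
END

# Options a shared project profile might contain (assumed values):
# ignore any personal .perltidyrc, 78-column lines, 4-column indentation.
my $shared_options = '-npro -l=78 -i=4';

my $tidied = tidy_source( $untidy, $shared_options );
print defined $tidied
    ? $tidied
    : "Perl::Tidy is not installed; source left unchanged\n";

# Run Perl::Tidy in memory; returns undef if the module is unavailable
# or produced no output.
sub tidy_source {
    my ( $source, $options ) = @_;
    return unless eval { require Perl::Tidy; 1 };

    my ( $output, $errors );
    Perl::Tidy::perltidy(
        source      => \$source,
        destination => \$output,
        errorfile   => \$errors,
        argv        => $options,
    );
    return ( defined $output && length $output ) ? $output : undef;
}
```

Passing -npro keeps the result independent of whoever runs it, which is
the point of a project-wide default; a developer who prefers another
layout could run the same helper with their own option string after
checkout.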

On 6/14/07, Sendu Bala <bix at sendu.me.uk> wrote:
> Nathan S. Haigh wrote:
> > I'm just wondering if anyone passes their modules through perltidy in
> > order for them to have the same look/feel? If so, do you have a
> > .perltidyrc file? Also, is it worth running the Bioperl modules through it?
>
> I don't use it, but I was contemplating the same thing. Chris uses it
> from time to time and I think we have a similar taste in style.
>
> But we'd have to hammer something out that was agreeable to everyone.
>


From n.haigh at sheffield.ac.uk  Thu Jun 14 17:39:39 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 14 Jun 2007 18:39:39 +0100
Subject: [Bioperl-l] cvs changes in working copy
Message-ID: <46717D5B.5040108@sheffield.ac.uk>

Not sure if I'm being dense or if it's because I've been working with
svn recently, but - how do I get a list of files that are different in
my working copy compared to the repository?

Cheers
Nath


From cjfields at uiuc.edu  Thu Jun 14 17:46:38 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 12:46:38 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
Message-ID: <CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>

Is 99.gb supposed to be a GenBank file?  And you're loading it into  
embl2picture (which I assume takes EMBL format files)?  Without  
example code we can easily make the wrong assumptions (i.e. that this  
is user error and not a BioPerl problem).

Also, I don't believe the feature plotting scripts plot circular  
chromosomes/plasmids.  If you want this functionality you'll have to  
code it for yourself.

chris

On Jun 14, 2007, at 12:23 PM, Martin MOKREJŠ wrote:

> Martin MOKREJŠ wrote:
>
>>> Also, there is a *huge* amount of documentation and examples on the
>>> BioPerl website.
>>>
>>>     http://www.bioperl.org/wiki/HOWTOs
>>
>> You mean
>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
>> ? ;-)
>
> $ perl embl2picture.pl ~/99.gb | display -
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature  
> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature  
> Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature  
> Bio::Location::Simple=HASH(0x893ebac): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature  
> Bio::Location::Simple=HASH(0x893e720): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
>
> Error returned while evaluating value of 'description' option for  
> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature  
> Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method  
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl  
> line 141, <GEN0> line 125.
> $
>
> The plasmid is a circular DNA, why is the diagram in linear? ;-)
>
> Martin
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From arareko at campus.iztacala.unam.mx  Thu Jun 14 17:57:35 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 14 Jun 2007 12:57:35 -0500
Subject: [Bioperl-l] Perltidy
In-Reply-To: <46717BB5.8000706@sheffield.ac.uk>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk> <46717BB5.8000706@sheffield.ac.uk>
Message-ID: <4671818F.5040902@campus.iztacala.unam.mx>

I think a consensus .perltidyrc could be placed in the source distribution.

Mauricio.

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> I'm just wondering if anyone passes their modules through perltidy in
>>> order for them to have the same look/feel? If so, do you have a
>>> .perltidyrc file? Also, is it worth running the Bioperl modules
>>> through it?
>> I don't use it, but I was contemplating the same thing. Chris uses it
>> from time to time and I think we have a similar taste in style.
>>
>> But we'd have to hammer something out that was agreeable to everyone.
> 
> A starting place may be Perl Best Practices by Damian Conway:
> http://www.oreilly.com/catalog/perlbp/
> 
> 
> The perltidyrc file can be found here:
> http://www.perlmonks.org/?node_id=485885
> 
> I also found this nice thread with some ideas, including some code that causes
> emacs to auto-perltidy everything you use cperl-mode with. I don't use
> emacs myself, but here's the link if anyone is interested:
> http://www.perlmonks.org/?node_id=516501
> 
> Nath
> 

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From cjfields at uiuc.edu  Thu Jun 14 18:32:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 13:32:41 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
Message-ID: <BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>

To chip in on this, I only use perltidy when I need to clean bioperl  
code up for debugging (particularly if blocks are hard to see) and  
just use the defaults.  I agree it would be nice to have everything  
tidied up but it'll definitely need to be a consensus config file.

About svn, I like the idea of eventually migrating to using it over  
CVS (I think BioPython and BioJava have plans to but I'm not sure)  
but I don't really know enough to say how feasible/difficult the  
migration path would be.  Anyone know?

chris

On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote:

>     The nice thing about Perl Tidy is that everybody can have their
> own config file.  There could be a bioperl default config that gets
> applied at checkin time.  Anybody that didn't like it could script
> checkouts to get run through their own config.  Diffs might get a
> little hairy, but as long as you tidy before diffing, it shouldn't be
> too bad.  Speaking of which....coding style is controversial enough,
> but since that's already been opened, what about CVS vs Subversion? 8)
>  Some of the scripting for this sort of thing might be easier in
> Subversion.  Though maybe something like Git would fit the developer
> model better (more support for distributed development).
>
> On 6/14/07, Sendu Bala <bix at sendu.me.uk> wrote:
>> Nathan S. Haigh wrote:
>>> I'm just wondering if anyone passes their modules through  
>>> perltidy in
>>> order for them to have the same look/feel? If so, do you have a
>>> .perltidyrc file? Also, is it worth running the Bioperl modules  
>>> through it?
>>
>> I don't use it, but I was contemplating the same thing. Chris uses it
>> from time to time and I think we have a similar taste in style.
>>
>> But we'd have to hammer something out that was agreeable to everyone.
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnsonm at gmail.com  Thu Jun 14 18:46:24 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Thu, 14 Jun 2007 13:46:24 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
Message-ID: <ebf5eb170706141146r6e07efffhbb98a6d101c45ccd@mail.gmail.com>

    If there was a default/standard/consensus bioperl perltidy config
file, I would probably use it prior to checkin, on my own, so I could
code in my schizophrenic style without worrying about starting any
format wars.  When I'm fixing or enhancing somebody else's code, I
always try and adapt to whatever style they used, even if it grates on
my nerves.  I'd love to not have to worry about that with Bioperl.  Of
course, nobody will ever agree on a standard, so it's probably a moot
point.  8)

On 6/14/07, Chris Fields <cjfields at uiuc.edu> wrote:
> To chip in on this, I only use perltidy when I need to clean bioperl
> code up for debugging (particularly if blocks are hard to see) and
> just use the defaults.  I agree it would be nice to have everything
> tidied up but it'll definitely need to be a consensus config file.
>
> About svn, I like the idea of eventually migrating to using it over
> CVS (I think BioPython and BioJava have plans to but I'm not sure)
> but I don't really know enough to say how feasible/difficult the
> migration path would be.  Anyone know?
>
> chris


From jason at bioperl.org  Thu Jun 14 19:00:09 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 14 Jun 2007 12:00:09 -0700
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
Message-ID: <CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>


On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:

> To chip in on this, I only use perltidy when I need to clean bioperl
> code up for debugging (particularly if blocks are hard to see) and
> just use the defaults.  I agree it would be nice to have everything
> tidied up but it'll definitely need to be a consensus config file.
>

Can we do any sort of massive conversion at some logical timepoint,
probably after a branch release or something?  Because it basically
means we're going to have differences on nearly every line, which is
going to make diffing difficult when debugging old/new versions.
Maybe it is not a problem because we aren't introducing any new bugs!

> About svn, I like the idea of eventually migrating to using it over
> CVS (I think BioPython and BioJava have plans to but I'm not sure)
> but I don't really know enough to say how feasible/difficult the
> migration path would be.  Anyone know?
>

It's doable but non-trivial.  A cvs2svn script (python, gah!) exists to
help with this.  There are pros and cons to converting.  There is a
fair amount of documentation and other pointers out there that point
to the CVS server for getting the latest code, so we'd need to think
about whether we'd support some sort of backwards-compatible SVN -> CVS
for read-only access or what.

Mostly it will need someone to lead the charge - I made a go at doing  
it in the winter, but I really don't have the SVN-foo to make this  
work.  We'd need someone with SVN experience to step up and help.   
You can always try and we can play with the converted repository for  
a while without making it the new code base.

-j

> chris
>
> On Jun 14, 2007, at 12:38 PM, Mark Johnson wrote:
>
>>     The nice thing about Perl Tidy is that everybody can have their
>> own config file.  There could be a bioperl default config that gets
>> applied at checkin time.  Anybody that didn't like it could script
>> checkouts to get run through their own config.  Diffs might get a
>> little hairy, but as long as you tidy before diffing, it shouldn't be
>> too bad.  Speaking of which....coding style is controversial enough,
>> but since that's already been opened, what about CVS vs Subversion? 8)
>>  Some of the scripting for this sort of thing might be easier in
>> Subversion.  Though maybe something like Git would fit the developer
>> model better (more support for distributed development).
>>
>> On 6/14/07, Sendu Bala <bix at sendu.me.uk> wrote:
>>> Nathan S. Haigh wrote:
>>>> I'm just wondering if anyone passes their modules through perltidy
>>>> in order for them to have the same look/feel? If so, do you have a
>>>> .perltidyrc file? Also, is it worth running the Bioperl modules
>>>> through it?
>>>
>>> I don't use it, but I was contemplating the same thing. Chris uses it
>>> from time to time and I think we have a similar taste in style.
>>>
>>> But we'd have to hammer something out that was agreeable to everyone.
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Thu Jun 14 19:01:27 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 14 Jun 2007 12:01:27 -0700
Subject: [Bioperl-l] cvs changes in working copy
In-Reply-To: <46717D5B.5040108@sheffield.ac.uk>
References: <46717D5B.5040108@sheffield.ac.uk>
Message-ID: <EE64F124-7DA2-4FB1-BE9B-C267126FCF6F@bioperl.org>

cvs update | grep '^M'

On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote:

> Not sure if I'm being dense or if it's because I've been working with
> svn recently, but - how do I get a list of files that are different in
> my working copy compared to the repository?
>
> Cheers
> Nath

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From cjfields at uiuc.edu  Thu Jun 14 19:20:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 14 Jun 2007 14:20:46 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
References: <46710BC4.3060302@sendu.me.uk> <46716003.2030302@sendu.me.uk>
	<4671703F.4010109@sheffield.ac.uk> <467177AC.8060104@sendu.me.uk>
	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>
	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>
	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
Message-ID: <C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>


On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote:

>
> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:
>
>> To chip in on this, I only use perltidy when I need to clean bioperl
>> code up for debugging (particularly if blocks are hard to see) and
>> just use the defaults.  I agree it would be nice to have everything
>> tidied up but it'll definitely need to be a consensus config file.
>>
>
> Can we do any sort of massive conversion at some logical timepoint?
> Probably after a branch release or something?  Because it basically
> means we're going to have differences on nearly every line, which is
> going to make diff-ing difficult when debugging old/new versions.
> Maybe it is not a problem because we aren't introducing any new bugs!

I agree; if we intend on doing this it should be all at once, maybe  
on a branch dedicated to ensure that code changes don't tank tests  
(they shouldn't but one never knows).  We would then need a script
up-and-running that tidies everything up prior to commits (though what
happens if perltidy tanks?...).

Sendu, up for it?

>> About svn, I like the idea of eventually migrating to using it over
>> CVS (I think BioPython and BioJava have plans to but I'm not sure)
>> but I don't really know enough to say how feasible/difficult the
>> migration path would be.  Anyone know?
>>
>
> It's doable but non-trivial.  cvs2svn (python gah!) script exists to
> help in this.  There are pros and cons to converting.   There is a
> fair amount of documentation and other pointers out there that point
> to the CVS server for getting latest code so we'd need to think about
> whether we'd support some sort of backwards compatible SVN -> CVS for
> read-only or what.
>
> Mostly it will need someone to lead the charge - I made a go at doing
> it in the winter, but I really don't have the SVN-foo to make this
> work.  We'd need someone with SVN experience to step up and help.
> You can always try and we can play with the converted repository for
> a while without making it the new code base.
>
> -j

Stepped into that one, didn't I!  I'll look into how much effort is  
involved and try getting something going in the next month or two,  
maybe sooner if time permits.  I'm lacking on SVN-foo as well but it  
might be worth looking into.

chris


From arareko at campus.iztacala.unam.mx  Thu Jun 14 19:50:39 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 14 Jun 2007 14:50:39 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
Message-ID: <46719C0F.5010706@campus.iztacala.unam.mx>

Chris Fields wrote:
> On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote:
> 
>> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:
>>
>>> About svn, I like the idea of eventually migrating to using it over
>>> CVS (I think BioPython and BioJava have plans to but I'm not sure)
>>> but I don't really know enough to say how feasible/difficult the
>>> migration path would be.  Anyone know?
>>>
>> It's doable but non-trivial.  cvs2svn (python gah!) script exists to
>> help in this.  There are pros and cons to converting.   There is a
>> fair amount of documentation and other pointers out there that point
>> to the CVS server for getting latest code so we'd need to think about
>> whether we'd support some sort of backwards compatible SVN -> CVS for
>> read-only or what.
>>
>> Mostly it will need someone to lead the charge - I made a go at doing
>> it in the winter, but I really don't have the SVN-foo to make this
>> work.  We'd need someone with SVN experience to step up and help.
>> You can always try and we can play with the converted repository for
>> a while without making it the new code base.
>>
>> -j
> 
> Stepped into that one, didn't I!  I'll look into how much effort is  
> involved and try getting something going in the next month or two,  
> maybe sooner if time permits.  I'm lacking on SVN-foo as well but it  
> might be worth looking into.
> 
> chris
> 

Chris D has worked with CVS-SVN transitioning for other projects, maybe 
he can shed some light on this.

Mauricio.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From sac at bioperl.org  Thu Jun 14 21:33:39 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Thu, 14 Jun 2007 14:33:39 -0700
Subject: [Bioperl-l] How can I pull out all instances of a motif from a
	genome sequence and output them as a BED file?
In-Reply-To: <5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>
References: <cfd758050706131720v4638f8d6la65d31a18c324127@mail.gmail.com>
	<5F3F49B4-CE07-4DD7-B82E-9DDE42B516D0@uiuc.edu>
Message-ID: <8f200b4c0706141433i37267774u1dc2193d8508c47b@mail.gmail.com>

This issue was discussed recently here. Check out this thread:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15046/focus=15048

Some of the tools mentioned in the FAQ item Chris mentioned do not
report where the match occurred, only that a match occurred
(String::Approx, agrep), though some do report match
locations (fuzznuc, fuzzprot; not sure about TFBS).

My Bio::Tools::SeqPattern module does not even perform any matches, it
just encapsulates a regular expression for a nuc or protein motif and
knows how to handle ambiguity code expansion and reverse
complementing. The idea is that you can use this to convert a
biological sequence motif into a string suitable for use in a perl
regex. Adding a match() method to this module would be handy.

There is an example script for it in examples/tools of the distro (which,
btw, references an obsolete module, so it won't run as is -- I'll fix).
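
On the list of 1's in John's snippet: that looks like operator
precedence rather than a problem with the regex.  In Perl, '.' and '+'
have the same precedence and associate left to right, so the print
argument is evaluated as ("Found ... " . pos(...)) + 1; the string
numifies to 0 and you get 1 every time.  Putting parentheses around
(pos(...) + 1) fixes the print.  For BED output, the @- and @+ arrays
give the match offsets directly, and they already fit BED's 0-based,
end-exclusive coordinates.  A minimal sketch, assuming the sequence is
already in a plain string; the motif_hits name, the toy sequence and the
'chr3L' label are placeholders of mine, not anything from BioPerl:

```perl
use strict;
use warnings;

# Return one [start, end] pair per non-overlapping match of $motif in $seq.
# Coordinates follow the BED convention: 0-based start, end-exclusive.
sub motif_hits {
    my ($seq, $motif) = @_;
    my @hits;
    while ($seq =~ /$motif/g) {
        # $-[0] and $+[0] are the start and end offsets of the last match
        push @hits, [ $-[0], $+[0] ];
    }
    return @hits;
}

# Toy sequence standing in for $seq_object->seq(); 'chr3L' is a
# hypothetical chromosome label for the first BED column.
my $seq = 'CCAAAGGAAAATT';
for my $hit (motif_hits($seq, 'AAA')) {
    print join("\t", 'chr3L', @$hit), "\n";
}
# prints:
# chr3L   2       5
# chr3L   7       10
```

Note that /g only reports non-overlapping hits (the second match above
is found once, not twice).  If overlapping motifs matter, a lookahead
pattern would be needed and the end would have to be computed from the
motif length instead of $+[0].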

Steve

On 6/13/07, Chris Fields <cjfields at uiuc.edu> wrote:
> This is answered in the FAQ (sorry if the URL wraps, but we don't
> like tinyurls):
>
> http://www.bioperl.org/wiki/FAQ#How_do_I_do_motif_searches_with_BioPerl.3F_Can_I_do_.22find_all_sequences_that_are_75.25_identical.22_to_a_given_motif.3F
>
> chris
>
> On Jun 13, 2007, at 7:20 PM, John Cumbers wrote:
>
> > Hello,
> >
> > I have a simple problem, I'm trying to search a genome sequence for
> > a motif, I then want to output a BED file to display all the
> > locations of this motif on the UCSC Genome Browser.  I could not find
> > a script to do this, so I started to write my own.   I'm new to perl
> > and my code below was my attempt to read the sequence string and
> > output the index bp of the start of each motif.  With this I could
> > build the BED file myself, which requires start and finish base
> > pairs.
> >
> > For the first motif I can output the start index, but when I try and
> > read the next one off the sequence it does not work.  Instead I just
> > get an output of a list of 1's.  I realise that this is more a
> > request for some simple perl help, but any help much appreciated.
> >
> > Best wishes,
> > John
> >
> >
> > $seq_object = read_sequence("Drosophila.Chr3.test.AE014296.fasta");  # turn my FASTA file into a seq object.
> > $sequence_as_a_string = $seq_object->seq();  # turn it into a string
> > # search $sequence_as_a_string  string for motif AAA as example
> > # if found, return the index that it is found at
> >
> > while ($sequence_as_a_string =~ m/AAA/g) {
> >   print "Found '$&'.  Next attempt at character " . pos($sequence_as_a_string)+1 . "\n";
> > }
> >
> >
> >
> > --
> > John Cumbers,  Graduate Student
> > Biology and Medicine
> > Brown University, Box G-W
> > Providence, Rhode Island, 02912, USA
> > Tel USA: +1 401 523 8190,  Fax: +1 401 863-2166
> > UK to USA: 0207 617 7824
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>


From hlapp at gmx.net  Thu Jun 14 23:04:11 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 14 Jun 2007 19:04:11 -0400
Subject: [Bioperl-l] cvs changes in working copy
In-Reply-To: <EE64F124-7DA2-4FB1-BE9B-C267126FCF6F@bioperl.org>
References: <46717D5B.5040108@sheffield.ac.uk>
	<EE64F124-7DA2-4FB1-BE9B-C267126FCF6F@bioperl.org>
Message-ID: <3B262E6A-2C90-49FA-BCA1-BF1900C5AC3A@gmx.net>

Actually, that will update your working copy. If you just wanted to
take a peek you would use cvs status:

$ cvs status | grep 'Locally Modified'

	-hilmar

On Jun 14, 2007, at 3:01 PM, Jason Stajich wrote:

> cvs update | grep '^M'
>
> On Jun 14, 2007, at 10:39 AM, Nathan S. Haigh wrote:
>
>> Not sure if I'm being dense or if it's because I've been working with
>> svn recently, but - how do I get a list of files that are different in
>> my working copy compared to the repository?
>>
>> Cheers
>> Nath
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From mmokrejs at ribosome.natur.cuni.cz  Fri Jun 15 07:28:17 2007
From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ)
Date: Fri, 15 Jun 2007 09:28:17 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
	<CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
Message-ID: <46723F91.60501@ribosome.natur.cuni.cz>

Chris Fields wrote:
> Is 99.gb supposed to be a GenBank file?  And you're loading it into 

Yes, it was attached to the email. ;)

> embl2picture (which I assume takes EMBL format files)?  Without example 
> code we can easily make the wrong assumptions (i.e. that this is user 
> error and not a BioPerl problem).

use constant USAGE =><<END;
Usage: $0 <file>
   Render a GenBank/EMBL entry into drawable form.
   Return as a GIF or PNG image on standard output.
 
   File must be in embl, genbank, or another SeqIO-
   recognized format.  Only the first entry will be
   rendered.
 
Example to try:
   embl2picture.pl factor7.embl | display -
 
END

> 
> Also, I don't believe the feature plotting scripts plot circular 
> chromosomes/plasmids.  If you want this functionality you'll have to 
> code it for yourself.

It's a pity it does not, but at least someone could improve the docs. ;)
Unfortunately I don't have the time to rewrite the code myself now,
I need a working, standalone, already available tool. :(
M.

> 
> chris
> 
> On Jun 14, 2007, at 12:23 PM, Martin MOKREJŠ wrote:
> 
>> Martin MOKREJŠ wrote:
>>
>>>> Also, there is a *huge* amount of documentation and examples on the
>>>> BioPerl website.
>>>>
>>>>     http://www.bioperl.org/wiki/HOWTOs
>>>
>>> You mean
>>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File 
>>>
>>> ? ;-)
>>
>> $ perl embl2picture.pl ~/99.gb | display -
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature 
>> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa57f0), feature 
>> Bio::Location::Simple=HASH(0x893ebb8): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5880), feature 
>> Bio::Location::Simple=HASH(0x893ebac): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa5430), feature 
>> Bio::Location::Simple=HASH(0x893e720): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>>
>> Error returned while evaluating value of 'description' option for 
>> glyph Bio::Graphics::Glyph::generic=HASH(0x8aa546c), feature 
>> Bio::Location::Simple=HASH(0x893e6b4): Can't locate object method 
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl line 
>> 141, <GEN0> line 125.
>> $
>>
>> The plasmid is a circular DNA, why is the diagram in linear? ;-)
>>
>> Martin
>>
>>
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> 
> 

-- 
Dr. Martin Mokrejs
Dept. of Genetics and Microbiology
Faculty of Science, Charles University
Vinicna 5, 128 43 Prague, Czech Republic
http://www.iresite.org
http://www.iresite.org/~mmokrejs


From dhoworth at mrc-lmb.cam.ac.uk  Fri Jun 15 08:59:09 2007
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Fri, 15 Jun 2007 09:59:09 +0100
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <46717990.6040509@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
Message-ID: <467254DD.3010505@mrc-lmb.cam.ac.uk>

Martin MOKREJŠ wrote:
>>> Also, there is a *huge* amount of documentation and examples on
>>> the BioPerl website.
>>> 
>>> http://www.bioperl.org/wiki/HOWTOs
>> You mean 
>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
>>  ? ;-)
> 
> $ perl embl2picture.pl ~/99.gb | display - Error returned while
> evaluating value of 'description' option for glyph
> Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature
> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method
> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl
> line 141, <GEN0> line 125.

Hmm an error at line 141 of a 69 line script? Methinks you're not
actually running the script that's presented on the wiki page you
quoted. I cut-and-pasted the script and your file and it worked for me
(at least, it produced an image, along with a bunch of OOPS lines)

HTH, Dave


From n.haigh at sheffield.ac.uk  Fri Jun 15 10:21:38 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 15 Jun 2007 11:21:38 +0100
Subject: [Bioperl-l] Installation using --install_base
Message-ID: <46726832.7080601@sheffield.ac.uk>

I'm setting up a new installation of Debian 4.0 at home and thought I'd
try to install BioPerl as a normal user rather than root. So in CPAN
options I set the --install_base to /home/username/perl and set PERL5LIB
to point to the same place.

Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root
user and ask to install all optional modules, it tries to install them
through CPAN - however it seems to fail because some dependencies don't
seem to want to install in a user directory.

Has anyone else found this or might I be doing something wrong?

Nath


From bix at sendu.me.uk  Fri Jun 15 10:07:04 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 15 Jun 2007 11:07:04 +0100
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
Message-ID: <467264C8.4020202@sendu.me.uk>

Chris Fields wrote:
> On Jun 14, 2007, at 2:00 PM, Jason Stajich wrote:
> 
>> On Jun 14, 2007, at 11:32 AM, Chris Fields wrote:
>>
>>> To chip in on this, I only use perltidy when I need to clean bioperl
>>> code up for debugging (particularly if blocks are hard to see) and
>>> just use the defaults.  I agree it would be nice to have everything
>>> tidied up but it'll definitely need to be a consensus config file.
>>>
>> Can we do any sort of massive conversion at some logical timepoint?
>> Probably after a branch release or something?  Because it basically
>> means we're going to have differences on nearly every line, which is
>> going to make diff-ing difficult when debugging old/new versions.
>> Maybe it is not a problem because we aren't introducing any new bugs!

Sorry, can you clarify the problem you envisage? And why would making a 
branch release help?


> I agree; if we intend on doing this it should be all at once, maybe
> on a branch dedicated to ensure that code changes don't tank tests
> (they shouldn't but one never knows).  We would then need a script
> up-and-running that tidies everything up prior to commits (though what
> happens if perltidy tanks?...).
> 
> Sendu, up for it?

If it's going to be difficult and a hassle, for such an unnecessary thing
I'm not sure it's worth it. There are more pressing things to be done for
Bioperl.

If I can just run perltidy on the entire package and commit, I'd do it. 
If that's not appropriate, I won't.


>>> About svn
[snip]
> Stepped into that one, didn't I!  I'll look into how much effort is  
> involved and try getting something going in the next month or two,  
> maybe sooner if time permits.  I'm lacking on SVN-foo as well but it  
> might be worth looking into.

I'd put this in the unnecessary-but-nice category as well. If it will be 
as easy as my ->new change, go ahead. If not, there are more pressing 
matters (POD fixing, test script updating and finishing...).


From bix at sendu.me.uk  Fri Jun 15 10:45:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 15 Jun 2007 11:45:48 +0100
Subject: [Bioperl-l] Installation using --install_base
In-Reply-To: <46726832.7080601@sheffield.ac.uk>
References: <46726832.7080601@sheffield.ac.uk>
Message-ID: <46726DDC.8090202@sendu.me.uk>

Nathan S. Haigh wrote:
> I'm setting up a new installation of Debian 4.0 at home and thought I'd
> try to install BioPerl as a normal user rather than root. So in CPAN
> options I set the --install_base to /home/username/perl and set PERL5LIB
> to point to the same place.
> 
> Now, when I run "perl Build.PL" on the HEAD of BioPerl as a non root
> user and ask to install all optional modules, it tries to install them
> through CPAN - however it seems to fail because some dependencies don't
> seem to want to install in a user directory.
> 
> Has anyone else found this or might I be doing something wrong?

You'll need to configure CPAN to install into your user directory. 
Upgrade to the latest version, then go read the docs on the various 
configurable options. I thought I at least mentioned this in the Bioperl 
INSTALL doc. If not, can someone come up with a concise clarification?
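
For reference, a sketch of the kind of CPAN settings meant here, entered
at the cpan> prompt.  The option names (makepl_arg, mbuildpl_arg) are
CPAN.pm's own; the path is just the one from your mail, and INSTALL_BASE
on the MakeMaker side assumes a reasonably recent ExtUtils::MakeMaker:

```
o conf makepl_arg "INSTALL_BASE=/home/username/perl"
o conf mbuildpl_arg "--install_base /home/username/perl"
o conf commit
```

With an install base of /home/username/perl the modules themselves end
up under /home/username/perl/lib/perl5, so it is worth checking that
PERL5LIB points there rather than at the base directory itself.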


From sdavis2 at mail.nih.gov  Fri Jun 15 10:56:08 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 15 Jun 2007 06:56:08 -0400
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <467264C8.4020202@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
Message-ID: <46727048.3080904@mail.nih.gov>

Sendu Bala wrote:
> If it's going to be difficult and a hassle, for such an unnecessary thing
> I'm not sure it's worth it. There are more pressing things to be done for
> Bioperl.
> 
> If I can just run perltidy on the entire package and commit, I'd do it. 
> If that's not appropriate, I won't.

I agree with the sentiment noted above.  I'm a bit of an outsider here,
but bioperl is a collaborative project.  Not everyone has the same
sentiments about what "correct" style means.  As a programmer, I really
wouldn't want significant changes on the style of my code.  And perl
happily puts up with many styles.  I would say leave things as they
are--let the individual programmers choose.  It reduces the amount of
work of questionable importance and allows the coding style freedom that
perl supports.

Just my $.02.

Sean


From cjfields at uiuc.edu  Fri Jun 15 14:05:07 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 09:05:07 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <46723F91.60501@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
	<CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
	<46723F91.60501@ribosome.natur.cuni.cz>
Message-ID: <A2212781-75F3-4BB7-967F-1668B682E84E@uiuc.edu>


On Jun 15, 2007, at 2:28 AM, Martin MOKREJŠ wrote:

> Chris Fields wrote:
>> Is 99.gb supposed to be a GenBank file?  And you're loading it into
>
> Yes, it was attached to the email. ;)

<bring foot to mouth and insert>

Sorry about that.  I notice that '.' was added, but the spacing  
seemed off.  I think bioperl catches that fine but it's something  
Wayne should consider.

>> embl2picture (which I assume takes EMBL format files)?  Without  
>> example
>> code we can easily make the wrong assumptions (i.e. that this is user
>> error and not a BioPerl problem).
>
> use constant USAGE =><<END;
> Usage: $0 <file>
>    Render a GenBank/EMBL entry into drawable form.
>    Return as a GIF or PNG image on standard output.
>
>    File must be in embl, genbank, or another SeqIO-
>    recognized format.  Only the first entry will be
>    rendered.
>
> Example to try:
>    embl2picture.pl factor7.embl | display -
>
> END

Horribly named script (should be seq2picture, since it converts both  
gb/embl).  The use of 'all_tags' makes me think the script version  
you are using is old, as those methods have long since been renamed.   
Dave has it working though, so maybe your version has been updated?   
The 'Use of uninitialized value in' warnings are probably from inclusion
of mandatory fields with no data or '.'.

>> Also, I don't believe the feature plotting scripts plot circular
>> chromosomes/plasmids.  If you want this functionality you'll have to
>> code it for yourself.
>
> It's a pity it does not, but at least someone could improve the
> docs. ;)
> Unfortunately I don't have the time to rewrite the code myself now,
> I need a working, standalone, already available tool. :(
> M.

As I said, unless someone shows interest and codes it, it just won't get
done.  We have had very little interest in this, either b/c there are
tools already out there to do this very thing (multitudes of plasmid
drawing programs, some free like ApE) or b/c nobody's bothered to
write it up.

chris


From cjfields at uiuc.edu  Fri Jun 15 14:22:23 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 09:22:23 -0500
Subject: [Bioperl-l] Perltidy and... SVN and ...Re:  Perltidy
In-Reply-To: <46727048.3080904@mail.nih.gov>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov>
Message-ID: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>


On Jun 15, 2007, at 5:56 AM, Sean Davis wrote:

> Sendu Bala wrote:
>> If it's going to be difficult and a hassle, for such an unnecessary
>> thing I'm not sure it's worth it. There are more pressing things to be
>> done for Bioperl.
>>
>> If I can just run perltidy on the entire package and commit, I'd do
>> it. If that's not appropriate, I won't.
>
> I agree with the sentiment noted above.  I'm a bit of an outsider here,
> but bioperl is a collaborative project.  Not everyone has the same
> sentiments about what "correct" style means.  As a programmer, I really
> wouldn't want significant changes on the style of my code.  And perl
> happily puts up with many styles.  I would say leave things as they
> are--let the individual programmers choose.  It reduces the amount of
> work of questionable importance and allows the coding style freedom that
> perl supports.
>
> Just my $.02.
>
> Sean

I tend to run it on modules that need some reformatting  
(SearchIO::blast comes to mind).  I believe you're correct when this  
comes down to programming style, but I think this echoes a sentiment  
(frustration, perhaps) that some of us have with long-term  
maintenance of said code.

Maybe a compromise:  include a copy of .perltidyrc with the  
distribution that goes by what a consensus wants or by the general  
rules laid out in Perl Best Practices (spaced settings, use of spaces  
over tabs, etc).  Conversion would be encouraged but voluntary, with  
the caveat that if someone needs to clean up code down the road (bug  
fixes, enhancements, etc) and if the original author isn't able to  
add it in themselves, it could be perltidy'd in order to help the  
developer (locate and fix the issue)|(add relevant enhancement where  
needed).
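
A minimal sketch of what such a shared profile might look like. The option
values are assumptions loosely following Perl Best Practices, not an agreed
BioPerl style, and sample.pl is a made-up stand-in for a real module:

```shell
#!/bin/sh
# Sketch: write a project-level .perltidyrc and tidy one file with it.
# All option values below are illustrative assumptions, not a consensus.
set -eu

workdir=$(mktemp -d)

cat > "$workdir/.perltidyrc" <<'EOF'
-l=78   # maximum line width of 78 columns
-i=4    # four-column indentation (perltidy indents with spaces by default)
-ci=4   # four-column continuation indentation
-nolq   # do not outdent long quoted strings
EOF

# A deliberately untidy sample standing in for a real module.
printf 'sub f{my($x)=@_;return $x+1}\n' > "$workdir/sample.pl"

if command -v perltidy >/dev/null 2>&1; then
    # -pro names the profile explicitly; -b edits in place and keeps a .bak copy.
    perltidy -pro="$workdir/.perltidyrc" -b "$workdir/sample.pl"
else
    echo "perltidy not installed; profile written to $workdir/.perltidyrc"
fi
```

Keeping the profile in the distribution and passing it with -pro would let
anyone who opts in get the same formatting, without forcing a mass conversion.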

chris


From cjfields at uiuc.edu  Fri Jun 15 14:56:23 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 09:56:23 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <467264C8.4020202@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
Message-ID: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>


On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:

>>>> ...
>>> Can we do any sort of massive conversion at some logical timepoint.
>>> Probably after a branch release or something?  Because it basically
>>> means we're going to have differences on nearly every line which is
>>> going to make diff-ing difficult when debugging old/new versions.
>>> Maybe it is not a problem because we aren't introducing and new  
>>> bugs!
>
> Sorry, can you clarify the problem you envisage? And why would  
> making a branch release help?

Maybe the worry is that mass conversion in such a large codebase  
could lead to hard-to-locate bugs.  Shouldn't occur but who knows w/o  
trying?

>> I agree; if we intend on doing this it should be all at once,  
>> maybe  on a branch dedicated to ensure that code changes don't  
>> tank tests  (they shouldn't but one never knows).  We would then  
>> need a script up- and-running that tidies everything up prior to  
>> commits (though what  happens if perltidy tanks?...).
>> Sendu, up for it?
>
> If its going to be difficult and a hassle, for such an unnecessary  
> thing I'm not sure its worth it. There are more pressing things to  
> be done for Bioperl.
>
> If I can just run perltidy on the entire package and commit, I'd do  
> it. If that's not appropriate, I won't.

The choices aren't necessarily all or nothing.  What about voluntary,  
recommended use of a perltidy config file included with the  
distribution, with additional 'caveats'?  See my response to Sean.

>>>> About svn
> [snip]
>> Stepped into that one, didn't I!  I'll look into how much effort  
>> is  involved and try getting something going in the next month or  
>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as  
>> well but it  might be worth looking into.
>
> I'd put this in the unnecessary-but-nice category as well. If it  
> will be as easy as my ->new change, go ahead. If not, there are  
> more pressing matters (POD fixing, test script updating and  
> finishing...).

A few other open-bio projects have actively discussed a CVS->SVN  
migration (BioRuby and I think BioPython, though I could be wrong  
about the latter).  As I said, "it might be worth looking into" to  
weigh the pros/cons, get opinions from others who have made the  
transition, etc.  We could, as Jason suggested, even set up a tester  
SVN w/o making it the default codebase (lock it off to a few testers,  
have CVS commits automatically/manually carry over to SVN, etc).

I agree with you that it's not feasible to switch over prior to a  
release and that there are more pressing issues, but it doesn't hurt  
having an open discussion about it.

chris


From sdavis2 at mail.nih.gov  Fri Jun 15 15:15:57 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 15 Jun 2007 11:15:57 -0400
Subject: [Bioperl-l] Perltidy and... SVN and ...Re:  Perltidy
In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk> <46727048.3080904@mail.nih.gov>
	<78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
Message-ID: <4672AD2D.2090001@mail.nih.gov>

Chris Fields wrote:
> 
> On Jun 15, 2007, at 5:56 AM, Sean Davis wrote:
> 
>> Sendu Bala wrote:
>>> If its going to be difficult and a hassle, for such an unnecessary thing
>>> I'm not sure its worth it. There are more pressing things to be done for
>>> Bioperl.
>>>
>>> If I can just run perltidy on the entire package and commit, I'd do it.
>>> If that's not appropriate, I won't.
>>
>> I agree with the sentiment noted above.  I'm a bit of an outsider here,
>> but bioperl is a collaborative project.  Not everyone has the same
>> sentiments about what "correct" style means.  As a programmer, I really
>> wouldn't want significant changes on the style of my code.  And perl
>> happily puts up with many styles.  I would say leave things as they
>> are--let the individual programmers choose.  It reduces the amount of
>> work of questionable importance and allows the coding style freedom that
>> perl supports.
>>
>> Just my $.02.
>>
>> Sean
> 
> I tend to run it on modules that need some reformatting (SearchIO::blast
> comes to mind).  I believe you're correct when this comes down to
> programming style, but I think this echoes a sentiment (frustration,
> perhaps) that some of us have with long-term maintenance of said code.
> 
> Maybe a compromise:  include a copy of .perltidyrc with the distribution
> that goes by what a consensus wants or by the general rules laid out in
> Perl Best Practices (spaced settings, use of spaces over tabs, etc). 
> Conversion would be encouraged but voluntary, with the caveat that if
> someone needs to clean up code down the road (bug fixes, enhancements,
> etc) and if the original author isn't able to add it in themselves, it
> could be perltidy'd in order to help the developer (locate and fix the
> issue)|(add relevant enhancement where needed).

Don't get me wrong--I think whatever makes bioperl a better, more
maintainable beast should be what is done.  The bioperl gurus should
absolutely do what is best for them for code maintainability.

Sean


From n.haigh at sheffield.ac.uk  Fri Jun 15 15:17:15 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 15 Jun 2007 16:17:15 +0100
Subject: [Bioperl-l] Perltidy and... SVN and ...Re:  Perltidy
In-Reply-To: <78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>	<467264C8.4020202@sendu.me.uk>
	<46727048.3080904@mail.nih.gov>
	<78117FE3-6051-423C-A481-32F2DE9A05AC@uiuc.edu>
Message-ID: <4672AD7B.4050109@sheffield.ac.uk>

Chris Fields wrote:
> On Jun 15, 2007, at 5:56 AM, Sean Davis wrote:
> 
>> Sendu Bala wrote:
>>> If its going to be difficult and a hassle, for such an unnecessary  
>>> thing
>>> I'm not sure its worth it. There are more pressing things to be  
>>> done for
>>> Bioperl.
>>>
>>> If I can just run perltidy on the entire package and commit, I'd  
>>> do it.
>>> If that's not appropriate, I won't.
>> I agree with the sentiment noted above.  I'm a bit of an outsider  
>> here,
>> but bioperl is a collaborative project.  Not everyone has the same
>> sentiments about what "correct" style means.  As a programmer, I  
>> really
>> wouldn't want significant changes on the style of my code.  And perl
>> happily puts up with many styles.  I would say leave things as they
>> are--let the individual programmers choose.  It reduces the amount of
>> work of questionable importance and allows the coding style freedom  
>> that
>> perl supports.
>>
>> Just my $.02.
>>
>> Sean
> 
> I tend to run it on modules that need some reformatting  
> (SearchIO::blast comes to mind).  I believe you're correct when this  
> comes down to programming style, but I think this echoes a sentiment  
> (frustration, perhaps) that some of us have with long-term  
> maintenance of said code.
> 
> Maybe a compromise:  include a copy of .perltidyrc with the  
> distribution that goes by what a consensus wants or by the general  
> rules laid out in Perl Best Practices (spaced settings, use of spaces  
> over tabs, etc).  

RE spaces, tabs etc - how well are the different coding styles handled
for display in html and via the online browsable cvs?

> Conversion would be encouraged but voluntary, with
> the caveat that if someone needs to clean up code down the road (bug  
> fixes, enhancements, etc) and if the original author isn't able to  
> add it in themselves, it could be perltidy'd in order to help the  
> developer (locate and fix the issue)|(add relevant enhancement where  
> needed).
> 
> chris
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From johnsonm at gmail.com  Fri Jun 15 19:37:26 2007
From: johnsonm at gmail.com (Mark Johnson)
Date: Fri, 15 Jun 2007 14:37:26 -0500
Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap
	start and stop coordinates??
In-Reply-To: <E22A8442-E00D-4732-9D80-EE61C75732B7@uiuc.edu>
References: <CED81D34E37D5043A1211565277A51E507E23161@exchkc02.stowers-institute.org>
	<79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu>
	<ebf5eb170705161211m6fb570b5r86ee055299993172@mail.gmail.com>
	<B012903E-7C0F-4E34-9BFE-E551855B6C62@uiuc.edu>
	<ebf5eb170705211348w57c37f18oeb128656c446cff@mail.gmail.com>
	<62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu>
	<ebf5eb170705211421w244933fcu4db8ba748653c090@mail.gmail.com>
	<9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu>
	<a79f6a4b0705211729j3ff17d60v610fab7f5e135303@mail.gmail.com>
	<E22A8442-E00D-4732-9D80-EE61C75732B7@uiuc.edu>
Message-ID: <ebf5eb170706151237x1eeda0e6y728384715cb6a21a@mail.gmail.com>

Patches waiting in Bugzilla (Bug #2299).  Changes:

-Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for
prokaryotic reports (Glimmer2/Glimmer3)
-Bio::Tools::Glimmer now produces features with Fuzzy or Split
locations as appropriate (partial or circular/wraparound predictions)
-Bio::Tools::Glimmer goes through the Glimmer3 .detail file to pull out
sequence lengths
-Bio::Tools::Run::Glimmer passes along the sequence length to
Bio::Tools::Glimmer for Glimmer2

I should probably modify Bio::Tools::Genemark to use
Bio::SeqFeature::Generic features for prokaryotic reports, to be
consistent, but this is more likely to surprise people.  If nobody
screams about the change to Bio::Tools::Glimmer, I'll do it at some
point.

On 5/21/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On May 21, 2007, at 7:29 PM, Torsten Seemann wrote:
>
> >> glimmer2/3 both assume the genome is circular by default (I'm
> >> assuming since Glimmer2/3 are used for bacterial genomes).  Acc. to
> >> the Glimmer3 release notes the detail file has the information in the
> >> header; from the Glimmer3 data used for tests:
> >
> > You beat me to the reply Chris - yes, Glimmer2/3 assume circular
> > chromosome by default. I had forgotten about this in earlier
> > discussions of the new Glimmer parsers as I normally run it in
> > --linear / -L mode (even if I know it is circular) because it is
> > easier to handle, and our sequencer/assembler team usually gets the
> > origin of replication right.
> >
> >> Command:  /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../BCTDNA
> >> Glimmer3.icm Glimmer3
> >
> > I did a double-take here - that's the path to my Glimmer3
> > installation! It took me a couple of minutes to realise that you got
> > it from the bioperl test data I created. D'oh! :-)
>
> Yep, I forgot about that!
>
> >> There are options available for glimmer3 (-L, -X) that specify a
> >> linear sequence or allow ORFs to extend past the end of the sequence
> >> analyzed (the latter assumes a linear sequence).
> >
> > If the -L mode should produce Bio::Location::Split objects, I guess if
> > -X is used
> > it should produce Bio::Location::Fuzzy objects too...
> >
> > --Torsten
>
> True, didn't think about that one.  Def. something to consider adding
> in.
>
> chris
>
>
>


From cjfields at uiuc.edu  Fri Jun 15 20:55:06 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 15:55:06 -0500
Subject: [Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap
	start and stop coordinates??
In-Reply-To: <ebf5eb170706151237x1eeda0e6y728384715cb6a21a@mail.gmail.com>
References: <CED81D34E37D5043A1211565277A51E507E23161@exchkc02.stowers-institute.org>
	<79FDA731-CC37-42B0-8200-0865F52C1CAC@uiuc.edu>
	<ebf5eb170705161211m6fb570b5r86ee055299993172@mail.gmail.com>
	<B012903E-7C0F-4E34-9BFE-E551855B6C62@uiuc.edu>
	<ebf5eb170705211348w57c37f18oeb128656c446cff@mail.gmail.com>
	<62034FE5-C375-49F3-9A4E-2545F93615F4@uiuc.edu>
	<ebf5eb170705211421w244933fcu4db8ba748653c090@mail.gmail.com>
	<9FAD90F3-79B3-4002-9A11-6C11F7D00614@uiuc.edu>
	<a79f6a4b0705211729j3ff17d60v610fab7f5e135303@mail.gmail.com>
	<E22A8442-E00D-4732-9D80-EE61C75732B7@uiuc.edu>
	<ebf5eb170706151237x1eeda0e6y728384715cb6a21a@mail.gmail.com>
Message-ID: <D09AF2F1-1459-4B6B-A3ED-85CEDE34D7B6@uiuc.edu>

I'll try to get to that tonight.  Been pretty tied up lately...

chris

On Jun 15, 2007, at 2:37 PM, Mark Johnson wrote:

> Patches waiting in Bugzilla (Bug #2299).  Changes:
>
> -Bio::Tools::Glimmer now emits Bio::SeqFeature::Generic features for
> prokaryotic reports (Glimmer2/Glimmer3)
> -Bio::Tools::Glimmer now produces features with Fuzzy or Split
> locations as appropriate (partial or circular/wraparound predictions)
> -Bio::Tools::Glimmer goes through the Glimmer3 .detail file to pull out
> sequence lengths
> -Bio::Tools::Run::Glimmer passes along the sequence length to
> Bio::Tools::Glimmer for Glimmer2
>
> I should probably modify Bio::Tools::Genemark to use
> Bio::SeqFeature::Generic features for prokaryotic reports, to be
> consistent, but this is more likely to surprise people.  If nobody
> screams about the change to Bio::Tools::Glimmer, I'll do it at some
> point.
>
> On 5/21/07, Chris Fields <cjfields at uiuc.edu> wrote:
>>
>> On May 21, 2007, at 7:29 PM, Torsten Seemann wrote:
>>
>>>> glimmer2/3 both assume the genome is circular by default (I'm
>>>> assuming since Glimmer2/3 are used for bacterial genomes).  Acc. to
>>>> the Glimmer3 release notes the detail file has the information  
>>>> in the
>>>> header; from the Glimmer3 data used for tests:
>>>
>>> You beat me to the reply Chris - yes, Glimmer2/3 assume circular
>>> chromosome by default. I had forgotten about this in earlier
>>> discussions of the new Glimmer parsers as I normally run it in
>>> --linear / -L mode (even if I know it is circular) because it is
>>> easier to handle, and our sequencer/assembler team usually gets the
>>> origin of replication right.
>>>
>>>> Command:  /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../ 
>>>> BCTDNA
>>>> Glimmer3.icm Glimmer3
>>>
>>> I did a double-take here - that's the path to my Glimmer3
>>> installation! It took me a couple of minutes to realise that you got
>>> it from the bioperl test data I created. D'oh! :-)
>>
>> Yep, I forgot about that!
>>
>>>> There are options available for glimmer3 (-L, -X) that specify a
>>>> linear sequence or allow ORFs to extend past the end of the  
>>>> sequence
>>>> analyzed (the latter assumes a linear sequence).
>>>
>>> If the -L mode should produce Bio::Location::Split objects, I  
>>> guess if
>>> -X is used
>>> it should produce Bio::Location::Fuzzy objects too...
>>>
>>> --Torsten
>>
>> True, didn't think about that one.  Def. something to consider adding
>> in.
>>
>> chris
>>
>>
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From rvos at interchange.ubc.ca  Fri Jun 15 21:08:17 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Fri, 15 Jun 2007 14:08:17 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
Message-ID: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>

Hi,

I would very much prefer it if bioperl moved to svn. I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and moving it to an OBF repository, but I'd rather not go back to CVS.

Rutger


-----Original Message-----

> Date: Fri Jun 15 07:56:23 PDT 2007
> From: "Chris Fields" <cjfields at uiuc.edu>
> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
> To: "Sendu Bala" <bix at sendu.me.uk>
>
> 
> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
> 
> >>>> ...
> >>> Can we do any sort of massive conversion at some logical timepoint.
> >>> Probably after a branch release or something?  Because it basically
> >>> means we're going to have differences on nearly every line which is
> >>> going to make diff-ing difficult when debugging old/new versions.
> >>> Maybe it is not a problem because we aren't introducing and new  
> >>> bugs!
> >
> > Sorry, can you clarify the problem you envisage? And why would  
> > making a branch release help?
> 
> Maybe the worry is that mass conversion in such a large codebase  
> could lead to hard-to-locate bugs.  Shouldn't occur but who knows w/o  
> trying?
> 
> >> I agree; if we intend on doing this it should be all at once,  
> >> maybe  on a branch dedicated to ensure that code changes don't  
> >> tank tests  (they shouldn't but one never knows).  We would then  
> >> need a script up- and-running that tidies everything up prior to  
> >> commits (though what  happens if perltidy tanks?...).
> >> Sendu, up for it?
> >
> > If its going to be difficult and a hassle, for such an unnecessary  
> > thing I'm not sure its worth it. There are more pressing things to  
> > be done for Bioperl.
> >
> > If I can just run perltidy on the entire package and commit, I'd do  
> > it. If that's not appropriate, I won't.
> 
> The choices aren't necessarily all or nothing.  What about voluntary,  
> recommended use of a perltidy config file included with the  
> distribution, with additional 'caveats'?  See my response to Sean.
> 
> >>>> About svn
> > [snip]
> >> Stepped into that one, didn't I!  I'll look into how much effort  
> >> is  involved and try getting something going in the next month or  
> >> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as  
> >> well but it  might be worth looking into.
> >
> > I'd put this in the unnecessary-but-nice category as well. If it  
> > will be as easy as my ->new change, go ahead. If not, there are  
> > more pressing matters (POD fixing, test script updating and  
> > finishing...).
> 
> A few other open-bio projects have actively discussed a CVS->SVN  
> migration (BioRuby and I think BioPython, though the latter could be  
> wrong).  As I said, "it might be worth looking into" to weigh the  
> pros/cons, get others opinions from others who have made the  
> transition, etc.  We could, as Jason suggested, even set up a tester  
> SVN w/o making it the default codebase (lock it off to a few testers,  
> have CVS commits automatically/manually carry over to SVN, etc).
> 
> I agree with you that it's not feasible to switch over prior to a  
> release and that there are more pressing issues, but it doesn't hurt  
> having an open discussion about it.
> 
> chris
> 


From spiros at lokku.com  Fri Jun 15 21:40:32 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Fri, 15 Jun 2007 22:40:32 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>

On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
> Hi,
>
> I would very much prefer it if bioperl moved to svn. I'm considering merging Bio::Phylo (to the extent that that's possible/practical) with bioperl and move it to an OBF repository, but I'd rather not go back to CVS.
>
> Rutger
>

I second that, SVN seems like the reasonable choice. I would be more
than happy to help out as well.

Spiros

>
> -----Original Message-----
>
> > Date: Fri Jun 15 07:56:23 PDT 2007
> > From: "Chris Fields" <cjfields at uiuc.edu>
> > Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
> > To: "Sendu Bala" <bix at sendu.me.uk>
> >
> >
> > On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
> >
> > >>>> ...
> > >>> Can we do any sort of massive conversion at some logical timepoint.
> > >>> Probably after a branch release or something?  Because it basically
> > >>> means we're going to have differences on nearly every line which is
> > >>> going to make diff-ing difficult when debugging old/new versions.
> > >>> Maybe it is not a problem because we aren't introducing and new
> > >>> bugs!
> > >
> > > Sorry, can you clarify the problem you envisage? And why would
> > > making a branch release help?
> >
> > Maybe the worry is that mass conversion in such a large codebase
> > could lead to hard-to-locate bugs.  Shouldn't occur but who knows w/o
> > trying?
> >
> > >> I agree; if we intend on doing this it should be all at once,
> > >> maybe  on a branch dedicated to ensure that code changes don't
> > >> tank tests  (they shouldn't but one never knows).  We would then
> > >> need a script up- and-running that tidies everything up prior to
> > >> commits (though what  happens if perltidy tanks?...).
> > >> Sendu, up for it?
> > >
> > > If its going to be difficult and a hassle, for such an unnecessary
> > > thing I'm not sure its worth it. There are more pressing things to
> > > be done for Bioperl.
> > >
> > > If I can just run perltidy on the entire package and commit, I'd do
> > > it. If that's not appropriate, I won't.
> >
> > The choices aren't necessarily all or nothing.  What about voluntary,
> > recommended use of a perltidy config file included with the
> > distribution, with additional 'caveats'?  See my response to Sean.
> >
> > >>>> About svn
> > > [snip]
> > >> Stepped into that one, didn't I!  I'll look into how much effort
> > >> is  involved and try getting something going in the next month or
> > >> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
> > >> well but it  might be worth looking into.
> > >
> > > I'd put this in the unnecessary-but-nice category as well. If it
> > > will be as easy as my ->new change, go ahead. If not, there are
> > > more pressing matters (POD fixing, test script updating and
> > > finishing...).
> >
> > A few other open-bio projects have actively discussed a CVS->SVN
> > migration (BioRuby and I think BioPython, though the latter could be
> > wrong).  As I said, "it might be worth looking into" to weigh the
> > pros/cons, get others opinions from others who have made the
> > transition, etc.  We could, as Jason suggested, even set up a tester
> > SVN w/o making it the default codebase (lock it off to a few testers,
> > have CVS commits automatically/manually carry over to SVN, etc).
> >
> > I agree with you that it's not feasible to switch over prior to a
> > release and that there are more pressing issues, but it doesn't hurt
> > having an open discussion about it.
> >
> > chris
> >


From hlapp at gmx.net  Fri Jun 15 22:10:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 18:10:25 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
Message-ID: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>

So should we set up a sandbox svn repository and those who would like  
to help out

- take shots at migrating bioperl (any current cvs snapshot will do)  
to svn

- you document what you find yourself having to do in trying to make  
it work

- you report back when you think you have a working repository

- we all get a defined amount of time to test to our hearts' content,  
say 2 weeks

- you fix issues that were encountered

- report back when done, followed by retesting for, say 1 week

- iterate previous 2 steps until no issues and no objections to  
migration

- two more weeks of warning period to all developers to commit all  
outstanding changes, or reapply them to a future svn checkout

- pull the trigger by locking down cvs, applying the migration as  
worked out before, and announcing that BioPerl is now on svn

- get free beer at next BOSC (I'll pay if no one else does)

This may not be precisely the plan that needs to be executed, but  
it's probably somewhere along those lines.
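
The first step (a trial conversion into a throwaway sandbox) could be
sketched roughly as below. This assumes cvs2svn and the Subversion tools are
installed and that CVS_REPO points at a local snapshot of the CVS repository;
the paths and variable names are hypothetical, and nothing here is the agreed
procedure:

```shell
#!/bin/sh
# Sketch of one trial CVS -> SVN conversion in a throwaway sandbox.
# Assumption: CVS_REPO is a local snapshot, never the live repository.
set -eu

CVS_REPO=${CVS_REPO:-/path/to/cvs-snapshot}   # hypothetical location
SANDBOX=${SANDBOX:-$(mktemp -d)}

if command -v cvs2svn >/dev/null 2>&1 && [ -d "$CVS_REPO" ]; then
    # Convert the CVS history into a Subversion dump file.
    cvs2svn --dumpfile="$SANDBOX/bioperl.dump" "$CVS_REPO"

    # Create an empty sandbox repository and load the dump into it.
    svnadmin create "$SANDBOX/svn"
    svnadmin load "$SANDBOX/svn" < "$SANDBOX/bioperl.dump"

    # Check out trunk so testers can run the test suite against it.
    svn checkout "file://$SANDBOX/svn/trunk" "$SANDBOX/wc"
else
    echo "cvs2svn or CVS snapshot not available; nothing converted"
fi

echo "sandbox directory: $SANDBOX"
```

Each retest round would simply repeat this against a fresh snapshot, so the
final cutover is the same commands run once more after CVS is locked down.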

If there are volunteers who would like to spearhead this, then power  
to you - I think everyone is in favor and the advantages of svn don't  
need to be debated. The only reason it hasn't happened yet is because  
no one has stepped forward who would have the energy.

I'm sure ChrisD will gladly create the svn sandbox if we have  
volunteers lined up to get going.

	-hilmar

On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:

> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>> Hi,
>>
>> I would very much prefer it if bioperl moved to svn. I'm  
>> considering merging Bio::Phylo (to the extent that that's possible/ 
>> practical) with bioperl and move it to an OBF repository, but I'd  
>> rather not go back to CVS.
>>
>> Rutger
>>
>
> I second that, SVN seems like the reasonable choice. I would be more
> than happy to help out as well.
>
> Spiros
>
>>
>> -----Original Message-----
>>
>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>
>>>
>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>
>>>>>>> ...
>>>>>> Can we do any sort of massive conversion at some logical  
>>>>>> timepoint.
>>>>>> Probably after a branch release or something?  Because it  
>>>>>> basically
>>>>>> means we're going to have differences on nearly every line  
>>>>>> which is
>>>>>> going to make diff-ing difficult when debugging old/new versions.
>>>>>> Maybe it is not a problem because we aren't introducing and new
>>>>>> bugs!
>>>>
>>>> Sorry, can you clarify the problem you envisage? And why would
>>>> making a branch release help?
>>>
>>> Maybe the worry is that mass conversion in such a large codebase
>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows  
>>> w/o
>>> trying?
>>>
>>>>> I agree; if we intend on doing this it should be all at once,
>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>> need a script up- and-running that tidies everything up prior to
>>>>> commits (though what  happens if perltidy tanks?...).
>>>>> Sendu, up for it?
>>>>
>>>> If its going to be difficult and a hassle, for such an unnecessary
>>>> thing I'm not sure its worth it. There are more pressing things to
>>>> be done for Bioperl.
>>>>
>>>> If I can just run perltidy on the entire package and commit, I'd do
>>>> it. If that's not appropriate, I won't.
>>>
>>> The choices aren't necessarily all or nothing.  What about  
>>> voluntary,
>>> recommended use of a perltidy config file included with the
>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>
>>>>>>> About svn
>>>> [snip]
>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>> is  involved and try getting something going in the next month or
>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>> well but it  might be worth looking into.
>>>>
>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>> more pressing matters (POD fixing, test script updating and
>>>> finishing...).
>>>
>>> A few other open-bio projects have actively discussed a CVS->SVN
>>> migration (BioRuby and I think BioPython, though the latter could be
>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>> pros/cons, get others opinions from others who have made the
>>> transition, etc.  We could, as Jason suggested, even set up a tester
>>> SVN w/o making it the default codebase (lock it off to a few  
>>> testers,
>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>
>>> I agree with you that it's not feasible to switch over prior to a
>>> release and that there are more pressing issues, but it doesn't hurt
>>> having an open discussion about it.
>>>
>>> chris
>>>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jason at bioperl.org  Fri Jun 15 22:23:15 2007
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 15 Jun 2007 15:23:15 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <AB7E0918-0EBA-47C9-8A64-FB8709230F2A@bioperl.org>

Sounds like a plan. I'll be curious to see if we can still keep  
anonymous CVS working, as I'd like not to have to pull the plug on  
that.  There are some threads out on the web about how to do this  
with a commit rule on SVN.

Also, can someone who is close enough to all the SVN benefits please  
elaborate on how it is going to help _this_ project?
Perhaps you would be willing to put a few words up -- say, on a (to  
be created) page:
http://bioperl.org/wiki/BioPerl:Version_control_changeover

This way if anonymous CVS is broken and/or developers who haven't  
been paying attention come back to commit code and ask why things  
changed, we don't have to compose long emails... =)

-jason
On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote:

> So should we set up a sandbox svn repository and those who would like
> to help out
>
> - take shots at migrating bioperl (any current cvs snapshot will do)
> to svn
>
> - you document what you find yourself having to do in trying to make
> it work
>
> - you report back when you think you have a working repository
>
> - we all get a defined amount of time to test to our hearts' content,
> say 2 weeks
>
> - you fix issues that were encountered
>
> - report back when done, followed by retesting for, say 1 week
>
> - iterate previous 2 steps until no issues and no objections to
> migration
>
> - two more weeks of warning period to all developers to commit all
> outstanding changes, or reapply them to a future svn checkout
>
> - pull the trigger by locking down cvs, applying the migration as
> worked out before, and announcing that BioPerl is now on svn
>
> - get free beer at next BOSC (I'll pay if no one else does)
>
> This may not be precisely the plan that needs to be executed, but
> it's probably somewhere along those lines.
>
> If there are volunteers who would like to spearhead this, then power
> to you - I think everyone is in favor and the advantages of svn don't
> need to be debated. The only reason it hasn't happened yet is because
> no one has stepped forward who would have the energy.
>
> I'm sure ChrisD will gladly create the svn sandbox if we have
> volunteers lined up to get going.
>
> 	-hilmar
>
> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
>
>> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>>> Hi,
>>>
>>> I would very much prefer it if bioperl moved to svn. I'm
>>> considering merging Bio::Phylo (to the extent that that's possible/
>>> practical) with bioperl and move it to an OBF repository, but I'd
>>> rather not go back to CVS.
>>>
>>> Rutger
>>>
>>
>> I second that, SVN seems like the reasonable choice. I would be more
>> than happy to help out as well.
>>
>> Spiros
>>
>>>
>>> -----Original Message-----
>>>
>>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>>
>>>>
>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>>
>>>>>>>> ...
>>>>>>> Can we do any sort of massive conversion at some logical
>>>>>>> timepoint.
>>>>>>> Probably after a branch release or something?  Because it
>>>>>>> basically
>>>>>>> means we're going to have differences on nearly every line
>>>>>>> which is
>>>>>>> going to make diff-ing difficult when debugging old/new  
>>>>>>> versions.
>>>>>>> Maybe it is not a problem because we aren't introducing any new
>>>>>>> bugs!
>>>>>
>>>>> Sorry, can you clarify the problem you envisage? And why would
>>>>> making a branch release help?
>>>>
>>>> Maybe the worry is that mass conversion in such a large codebase
>>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows
>>>> w/o
>>>> trying?
>>>>
>>>>>> I agree; if we intend on doing this it should be all at once,
>>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>>> need a script up- and-running that tidies everything up prior to
>>>>>> commits (though what  happens if perltidy tanks?...).
>>>>>> Sendu, up for it?
>>>>>
>>>>> If it's going to be difficult and a hassle, for such an unnecessary
>>>>> thing I'm not sure it's worth it. There are more pressing things to
>>>>> be done for Bioperl.
>>>>>
>>>>> If I can just run perltidy on the entire package and commit,  
>>>>> I'd do
>>>>> it. If that's not appropriate, I won't.
>>>>
>>>> The choices aren't necessarily all or nothing.  What about
>>>> voluntary,
>>>> recommended use of a perltidy config file included with the
>>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>>
>>>>>>>> About svn
>>>>> [snip]
>>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>>> is  involved and try getting something going in the next month or
>>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>>> well but it  might be worth looking into.
>>>>>
>>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>>> more pressing matters (POD fixing, test script updating and
>>>>> finishing...).
>>>>
>>>> A few other open-bio projects have actively discussed a CVS->SVN
>>>> migration (BioRuby and I think BioPython, though the latter  
>>>> could be
>>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>>> pros/cons, get opinions from others who have made the
>>>> transition, etc.  We could, as Jason suggested, even set up a  
>>>> tester
>>>> SVN w/o making it the default codebase (lock it off to a few
>>>> testers,
>>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>>
>>>> I agree with you that it's not feasible to switch over prior to a
>>>> release and that there are more pressing issues, but it doesn't  
>>>> hurt
>>>> having an open discussion about it.
>>>>
>>>> chris
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From sheris at eps.berkeley.edu  Fri Jun 15 22:58:12 2007
From: sheris at eps.berkeley.edu (Sheri Simmons)
Date: Fri, 15 Jun 2007 15:58:12 -0700
Subject: [Bioperl-l] seq doesn't validate error
Message-ID: <200706151558.12911.sheris@eps.berkeley.edu>

Hi,
I'm getting an error as follows when I try to reverse complement a sequence 
string stored in a hash of arrays. The storage code is: 

		$nstarthash{$key} = [$sortchecks[0], join("", @nseq), join("", @{$seqhash{$key}})];

the sequence of interest is the element at index 1. 

Later, I try to retrieve this string for a subset of keys so I can reverse 
complement it based on input from another hash (%complement):

			my %revcomphash = map { my $read = $_;
			grep $complement{$read} eq 'C', %complement;
			{$_, (Bio::Seq->new(-seq =>$nstarthash{$_}[1]))->revcom->seq()};}
			 keys(%nstarthash); 


I get the following warning (long sequence edited for clarity):

-- -------------------- WARNING ---------------------
MSG: seq doesn't validate, mismatch is 1
---------------------------------------------------

------------- EXCEPTION  -------------
MSG: Attempting to set the sequence to [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC] 
which does not look healthy
STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498
STACK toplevel ../quality_wrapper.pl:103

I cannot find any non-allowed characters in the sequence, and the 
de-referencing appears to work correctly. Can anyone help me?
I'm using the latest Bioperl installation (1.5.2) with ActivePerl5.8 on a 
Mepis 6.5 system. 

Thanks
Sheri

---------------------------------------------------------------------
Sheri Simmons
Department of Earth and Planetary Sciences
University of California, Berkeley
Berkeley, CA 94720-4767


From Kevin.M.Brown at asu.edu  Fri Jun 15 23:11:34 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 15 Jun 2007 16:11:34 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <200706151558.12911.sheris@eps.berkeley.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
Message-ID: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>

> I'm getting an error as follows when I try to reverse 
> complement a sequence string stored in a hash of arrays. The 
> storage code is: 
> 
> 		$nstarthash{$key} = [$sortchecks[0], join("", 
> @nseq), 		
> join("",@{$seqhash{$key}})];
> 
> the sequence of interest is the element at index 1. 
> 
> Later, I try to retrieve this string for a subset of keys so 
> I can reverse complement it based on input from another hash 
> (%complement):
> 
> 			my %revcomphash = map { my $read = $_;
> 			grep $complement{$read} eq 'C', %complement;
> 			{$_, (Bio::Seq->new(-seq 
> =>$nstarthash{$_}[1]))->revcom->seq()};}
> 			 keys(%nstarthash); 
> 
> 
> I get the following warning (long sequence edited for clarity):
> 
> -- -------------------- WARNING ---------------------
> MSG: seq doesn't validate, mismatch is 1
> ---------------------------------------------------
> 
> ------------- EXCEPTION  -------------
> MSG: Attempting to set the sequence to 
> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
> which does not look healthy
> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498
> STACK toplevel ../quality_wrapper.pl:103
> 
> I cannot find any non-allowed characters in the sequence, and 
> the de-referencing appears to work correctly. Can anyone help me?
> I'm using the latest Bioperl installation (1.5.2) with 
> ActivePerl5.8 on a Mepis 6.5 system. 

Try telling the Bio::Seq object what alphabet to use when creating it.
I tend to create them like:

Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')
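
If setting the alphabet does not help, two other things may be worth
checking; this is only a sketch, not a diagnosis. First, the grep
inside the quoted map block is a statement whose result is discarded,
so every key of %nstarthash is processed, not only the reads marked
'C'. Second, a string can look clean when printed yet carry a hidden
character such as a carriage return. The data, the bad_chars() and
revcom() helpers, and the character class below are all hypothetical
stand-ins of my own; the reverse complement is done in plain Perl so
the example runs without BioPerl.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the poster's %nstarthash and %complement.
my %nstarthash = (
    read1 => [ 5, "GCCCCTGTAATC", "..." ],
    read2 => [ 9, "ATGC\r",       "..." ],    # hidden carriage return
    read3 => [ 2, "TTGACA",       "..." ],
);
my %complement = ( read1 => 'C', read2 => 'C', read3 => 'U' );

# Return the hex codes of characters outside an assumed nucleotide set
# (IUPAC codes plus '.' and '-'); this is not BioPerl's own check.
sub bad_chars {
    my ($seq) = @_;
    my %seen;
    $seen{ sprintf '0x%02X', ord $_ }++
        for $seq =~ /([^ACGTUMRWSYKVHDBXN.\-])/gi;
    return sort keys %seen;
}

# Plain-Perl reverse complement for unambiguous DNA bases.
sub revcom {
    my ($seq) = @_;
    my $rc = reverse $seq;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Select the reads to flip first, then build the result hash.
my @to_flip = grep { defined $complement{$_} && $complement{$_} eq 'C' }
              sort keys %nstarthash;

my %revcomphash;
for my $read (@to_flip) {
    my $seq = $nstarthash{$read}[1];
    if ( my @bad = bad_chars($seq) ) {
        warn "$read: unexpected characters @bad\n";
        $seq =~ s/\s+//g;    # strip whitespace, a common culprit
    }
    $revcomphash{$read} = revcom($seq);
}

print "$_ => $revcomphash{$_}\n" for sort keys %revcomphash;
```

With BioPerl installed, the revcom() call could presumably be swapped
for Bio::Seq->new(-seq => $seq, -alphabet => 'dna')->revcom->seq once
the string is known to be clean.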


From sheris at eps.berkeley.edu  Fri Jun 15 23:53:04 2007
From: sheris at eps.berkeley.edu (Sheri Simmons)
Date: Fri, 15 Jun 2007 16:53:04 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
	<1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
Message-ID: <200706151653.04135.sheris@eps.berkeley.edu>

Thanks for the suggestion, but that still gives the same error as before.

On Friday 15 June 2007 4:11 pm, Kevin Brown wrote:
> > I'm getting an error as follows when I try to reverse
> > complement a sequence string stored in a hash of arrays. The
> > storage code is:
> >
> > 		$nstarthash{$key} = [$sortchecks[0], join("",
> > @nseq),
> > join("",@{$seqhash{$key}})];
> >
> > the sequence of interest is the element at index 1.
> >
> > Later, I try to retrieve this string for a subset of keys so
> > I can reverse complement it based on input from another hash
> > (%complement):
> >
> > 			my %revcomphash = map { my $read = $_;
> > 			grep $complement{$read} eq 'C', %complement;
> > 			{$_, (Bio::Seq->new(-seq
> > =>$nstarthash{$_}[1]))->revcom->seq()};}
> > 			 keys(%nstarthash);
> >
> >
> > I get the following warning (long sequence edited for clarity):
> >
> > -- -------------------- WARNING ---------------------
> > MSG: seq doesn't validate, mismatch is 1
> > ---------------------------------------------------
> >
> > ------------- EXCEPTION  -------------
> > MSG: Attempting to set the sequence to
> > [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
> > which does not look healthy
> > STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
> > STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
> > STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498
> > STACK toplevel ../quality_wrapper.pl:103
> >
> > I cannot find any non-allowed characters in the sequence, and
> > the de-referencing appears to work correctly. Can anyone help me?
> > I'm using the latest Bioperl installation (1.5.2) with
> > ActivePerl5.8 on a Mepis 6.5 system.
>
> Try telling the Bio::Seq object what alphabet to use when creating it.
> I tend to create them like:
>
> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')

-- 
Sheri Simmons
Department of Earth and Planetary Sciences
University of California, Berkeley
Berkeley, CA 94720-4767


From hlapp at gmx.net  Sat Jun 16 01:27:42 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 21:27:42 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
Message-ID: <EDC569BF-2E4B-4BFC-916A-665CC2FFABAF@gmx.net>

Could you post a ticket to the helpdesk: support at open-bio.org?

	-hilmar

On Jun 15, 2007, at 9:08 PM, George Hartzell wrote:

> Hilmar Lapp writes:
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>
> Free Beer, huh?  Do you deliver?
>
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
>
> thanks!
>
> g.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Sat Jun 16 01:08:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Fri, 15 Jun 2007 21:08:32 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <18035.14352.963113.473274@almost.alerce.com>

Hilmar Lapp writes:
 > So should we set up a sandbox svn repository and those who would like  
 > to help out
 > 
 > - take shots at migrating bioperl (any current cvs snapshot will do)  
 > to svn

Free Beer, huh?  Do you deliver?

Can you package up a tarball of the cvs repository (bzip or gzip would
save some time) itself?

thanks!

g.


From cjfields at uiuc.edu  Sat Jun 16 01:42:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:42:05 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
Message-ID: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>

The browsable CVS has a 'Download tarball' link if that helps.

http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?cvsroot=bioperl

chris

On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:

> Hilmar Lapp writes:
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>
> Free Beer, huh?  Do you deliver?
>
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
>
> thanks!
>
> g.


From cjfields at uiuc.edu  Sat Jun 16 01:50:09 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 20:50:09 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
Message-ID: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>

I'll help out to the extent I can w/o having the SVN know-how.  We  
need (as Jason points out) someone who can detail the benefits and  
maybe keep an updated journal on the wiki.

I believe at least one or two of the other Bio* projects have  
contemplated moving over to SVN, which may be worth checking out.

chris

On Jun 15, 2007, at 5:10 PM, Hilmar Lapp wrote:

> So should we set up a sandbox svn repository and those who would like
> to help out
>
> - take shots at migrating bioperl (any current cvs snapshot will do)
> to svn
>
> - you document what you find yourself having to do in trying to make
> it work
>
> - you report back when you think you have a working repository
>
> - we all get a defined amount of time to test to our hearts' content,
> say 2 weeks
>
> - you fix issues that were encountered
>
> - report back when done, followed by retesting for, say 1 week
>
> - iterate previous 2 steps until no issues and no objections to
> migration
>
> - two more weeks of warning period to all developers to commit all
> outstanding changes, or reapply them to a future svn checkout
>
> - pull the trigger by locking down cvs, applying the migration as
> worked out before, and announcing that BioPerl is now on svn
>
> - get free beer at next BOSC (I'll pay if no one else does)
>
> This may not be precisely the plan that needs to be executed, but
> it's probably somewhere along those lines.
>
> If there are volunteers who would like to spearhead this, then power
> to you - I think everyone is in favor and the advantages of svn don't
> need to be debated. The only reason it hasn't happened yet is because
> no one has stepped forward who would have the energy.
>
> I'm sure ChrisD will gladly create the svn sandbox if we have
> volunteers lined up to get going.
>
> 	-hilmar
>
> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
>
>> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>>> Hi,
>>>
>>> I would very much prefer it if bioperl moved to svn. I'm
>>> considering merging Bio::Phylo (to the extent that that's possible/
>>> practical) with bioperl and move it to an OBF repository, but I'd
>>> rather not go back to CVS.
>>>
>>> Rutger
>>>
>>
>> I second that, SVN seems like the reasonable choice. I would be more
>> than happy to help out as well.
>>
>> Spiros
>>
>>>
>>> -----Original Message-----
>>>
>>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>>
>>>>
>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>>
>>>>>>>> ...
>>>>>>> Can we do any sort of massive conversion at some logical
>>>>>>> timepoint.
>>>>>>> Probably after a branch release or something?  Because it
>>>>>>> basically
>>>>>>> means we're going to have differences on nearly every line
>>>>>>> which is
>>>>>>> going to make diff-ing difficult when debugging old/new  
>>>>>>> versions.
>>>>>>> Maybe it is not a problem because we aren't introducing any new
>>>>>>> bugs!
>>>>>
>>>>> Sorry, can you clarify the problem you envisage? And why would
>>>>> making a branch release help?
>>>>
>>>> Maybe the worry is that mass conversion in such a large codebase
>>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows
>>>> w/o
>>>> trying?
>>>>
>>>>>> I agree; if we intend on doing this it should be all at once,
>>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>>> need a script up- and-running that tidies everything up prior to
>>>>>> commits (though what  happens if perltidy tanks?...).
>>>>>> Sendu, up for it?
>>>>>
>>>>> If it's going to be difficult and a hassle, for such an unnecessary
>>>>> thing I'm not sure it's worth it. There are more pressing things to
>>>>> be done for Bioperl.
>>>>>
>>>>> If I can just run perltidy on the entire package and commit,  
>>>>> I'd do
>>>>> it. If that's not appropriate, I won't.
>>>>
>>>> The choices aren't necessarily all or nothing.  What about
>>>> voluntary,
>>>> recommended use of a perltidy config file included with the
>>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>>
>>>>>>>> About svn
>>>>> [snip]
>>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>>> is  involved and try getting something going in the next month or
>>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>>> well but it  might be worth looking into.
>>>>>
>>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>>> more pressing matters (POD fixing, test script updating and
>>>>> finishing...).
>>>>
>>>> A few other open-bio projects have actively discussed a CVS->SVN
>>>> migration (BioRuby and I think BioPython, though the latter  
>>>> could be
>>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>>> pros/cons, get opinions from others who have made the
>>>> transition, etc.  We could, as Jason suggested, even set up a  
>>>> tester
>>>> SVN w/o making it the default codebase (lock it off to a few
>>>> testers,
>>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>>
>>>> I agree with you that it's not feasible to switch over prior to a
>>>> release and that there are more pressing issues, but it doesn't  
>>>> hurt
>>>> having an open discussion about it.
>>>>
>>>> chris
>>>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Sat Jun 16 02:12:55 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 15 Jun 2007 22:12:55 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
Message-ID: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>

I think he meant the cvs repository itself, containing all the change  
data. -hilmar

On Jun 15, 2007, at 9:42 PM, Chris Fields wrote:

> The browsable CVS has a 'Download tarball' link if that helps.
>
> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?cvsroot=bioperl
>
> chris
>
> On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:
>
>> Hilmar Lapp writes:
>>> So should we set up a sandbox svn repository and those who would  
>>> like
>>> to help out
>>>
>>> - take shots at migrating bioperl (any current cvs snapshot will do)
>>> to svn
>>
>> Free Beer, huh?  Do you deliver?
>>
>> Can you package up a tarball of the cvs repository (bzip or gzip  
>> would
>> save some time) itself?
>>
>> thanks!
>>
>> g.
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Jun 16 02:37:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 15 Jun 2007 21:37:55 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
	<6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
Message-ID: <F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>

Ah, got it.  Sorry.

George, planning on taking this up?

chris

On Jun 15, 2007, at 9:12 PM, Hilmar Lapp wrote:

> I think he meant the cvs repository itself, containing all the  
> change data. -hilmar
>
> On Jun 15, 2007, at 9:42 PM, Chris Fields wrote:
>
>> The browsable CVS has a 'Download tarball' link if that helps.
>>
>> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/?cvsroot=bioperl
>>
>> chris
>>
>> On Jun 15, 2007, at 8:08 PM, George Hartzell wrote:
>>
>>> Hilmar Lapp writes:
>>>> So should we set up a sandbox svn repository and those who would  
>>>> like
>>>> to help out
>>>>
>>>> - take shots at migrating bioperl (any current cvs snapshot will  
>>>> do)
>>>> to svn
>>>
>>> Free Beer, huh?  Do you deliver?
>>>
>>> Can you package up a tarball of the cvs repository (bzip or gzip  
>>> would
>>> save some time) itself?
>>>
>>> thanks!
>>>
>>> g.
>>
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Sat Jun 16 08:20:57 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 16 Jun 2007 09:20:57 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18035.14352.963113.473274@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
Message-ID: <46739D69.4090204@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George Hartzell wrote:
> Hilmar Lapp writes:
>  > So should we set up a sandbox svn repository and those who would like  
>  > to help out
>  > 
>  > - take shots at migrating bioperl (any current cvs snapshot will do)  
>  > to svn
> 
> Free Beer, huh?  Do you deliver?
> 
> Can you package up a tarball of the cvs repository (bzip or gzip would
> save some time) itself?
> 
> thanks!
> 
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Sounds like George might know what he's doing! I have a question about
setting up svn access. I believe access can be done in several ways,
over webdav, over ssh and probably others too. Do you have any knowledge
about the benefits of one over the other? I suppose I'm thinking of what
to implement to allow anonymous read access for users and authenticated
access for developers.

Nath

p.s. if you need any monkeys to do some work I'm happy to help out as
much as possible.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGc51pczuW2jkwy2gRAmi9AJ0XojVdh4ckXoc3bwVSmeNw95cR7QCfV+G9
Lb9NUEe4dkCakQ+Gc7Py98A=
=BG9m
-----END PGP SIGNATURE-----


From rvos at interchange.ubc.ca  Sat Jun 16 10:37:11 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Sat, 16 Jun 2007 03:37:11 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <15232024.1181990231860.JavaMail.myubc2@handel.my.ubc.ca>

I can volunteer some time to help out with this.

Rutger

-----Original Message-----

> Date: Fri Jun 15 15:10:25 PDT 2007
> From: "Hilmar Lapp" <hlapp at gmx.net>
> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
> To: spiros at lokku.com
>
> So should we set up a sandbox svn repository and those who would like  
> to help out
> 
> - take shots at migrating bioperl (any current cvs snapshot will do)  
> to svn
> 
> - you document what you find yourself having to do in trying to make  
> it work
> 
> - you report back when you think you have a working repository
> 
> - we all get a defined amount of time to test to our hearts' content,  
> say 2 weeks
> 
> - you fix issues that were encountered
> 
> - report back when done, followed by retesting for, say 1 week
> 
> - iterate previous 2 steps until no issues and no objections to  
> migration
> 
> - two more weeks of warning period to all developers to commit all  
> outstanding changes, or reapply them to a future svn checkout
> 
> - pull the trigger by locking down cvs, applying the migration as  
> worked out before, and announcing that BioPerl is now on svn
> 
> - get free beer at next BOSC (I'll pay if no one else does)
> 
> This may not be precisely the plan that needs to be executed, but  
> it's probably somewhere along those lines.
> 
> If there are volunteers who would like to spearhead this, then power  
> to you - I think everyone is in favor and the advantages of svn don't  
> need to be debated. The only reason it hasn't happened yet is because  
> no one has stepped forward who would have the energy.
> 
> I'm sure ChrisD will gladly create the svn sandbox if we have  
> volunteers lined up to get going.
> 
> 	-hilmar
> 
> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
> 
> > On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
> >> Hi,
> >>
> >> I would very much prefer it if bioperl moved to svn. I'm  
> >> considering merging Bio::Phylo (to the extent that that's possible/ 
> >> practical) with bioperl and move it to an OBF repository, but I'd  
> >> rather not go back to CVS.
> >>
> >> Rutger
> >>
> >
> > I second that, SVN seems like the reasonable choice. I would be more
> > than happy to help out as well.
> >
> > Spiros
> >
> >>
> >> -----Original Message-----
> >>
> >>> Date: Fri Jun 15 07:56:23 PDT 2007
> >>> From: "Chris Fields" <cjfields at uiuc.edu>
> >>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
> >>> To: "Sendu Bala" <bix at sendu.me.uk>
> >>>
> >>>
> >>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
> >>>
> >>>>>>> ...
> >>>>>> Can we do any sort of massive conversion at some logical  
> >>>>>> timepoint.
> >>>>>> Probably after a branch release or something?  Because it  
> >>>>>> basically
> >>>>>> means we're going to have differences on nearly every line  
> >>>>>> which is
> >>>>>> going to make diff-ing difficult when debugging old/new versions.
> >>>>>> Maybe it is not a problem because we aren't introducing any new
> >>>>>> bugs!
> >>>>
> >>>> Sorry, can you clarify the problem you envisage? And why would
> >>>> making a branch release help?
> >>>
> >>> Maybe the worry is that mass conversion in such a large codebase
> >>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows  
> >>> w/o
> >>> trying?
> >>>
> >>>>> I agree; if we intend on doing this it should be all at once,
> >>>>> maybe  on a branch dedicated to ensure that code changes don't
> >>>>> tank tests  (they shouldn't but one never knows).  We would then
> >>>>> need a script up- and-running that tidies everything up prior to
> >>>>> commits (though what  happens if perltidy tanks?...).
> >>>>> Sendu, up for it?
> >>>>
> >>>> If it's going to be difficult and a hassle, for such an unnecessary
> >>>> thing I'm not sure it's worth it. There are more pressing things to
> >>>> be done for Bioperl.
> >>>>
> >>>> If I can just run perltidy on the entire package and commit, I'd do
> >>>> it. If that's not appropriate, I won't.
> >>>
> >>> The choices aren't necessarily all or nothing.  What about  
> >>> voluntary,
> >>> recommended use of a perltidy config file included with the
> >>> distribution, with additional 'caveats'?  See my response to Sean.
> >>>
> >>>>>>> About svn
> >>>> [snip]
> >>>>> Stepped into that one, didn't I!  I'll look into how much effort
> >>>>> is  involved and try getting something going in the next month or
> >>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
> >>>>> well but it  might be worth looking into.
> >>>>
> >>>> I'd put this in the unnecessary-but-nice category as well. If it
> >>>> will be as easy as my ->new change, go ahead. If not, there are
> >>>> more pressing matters (POD fixing, test script updating and
> >>>> finishing...).
> >>>
> >>> A few other open-bio projects have actively discussed a CVS->SVN
> >>> migration (BioRuby and I think BioPython, though the latter could be
> >>> wrong).  As I said, "it might be worth looking into" to weigh the
> >>> pros/cons, get others opinions from others who have made the
> >>> transition, etc.  We could, as Jason suggested, even set up a tester
> >>> SVN w/o making it the default codebase (lock it off to a few  
> >>> testers,
> >>> have CVS commits automatically/manually carry over to SVN, etc).
> >>>
> >>> I agree with you that it's not feasible to switch over prior to a
> >>> release and that there are more pressing issues, but it doesn't hurt
> >>> having an open discussion about it.
> >>>
> >>> chris
> >>>
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================


From sdavis2 at mail.nih.gov  Sat Jun 16 11:21:47 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Sat, 16 Jun 2007 07:21:47 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
Message-ID: <4673C7CB.1030709@mail.nih.gov>

Chris Fields wrote:
> I'll help out to the extent I can w/o having the SVN know-how.  We  
> need (as Jason points out) someone who can detail the benefits and  
> maybe keep an updated journal on the wiki.
>
> I believe at least one or two of the other Bio* contemplated moving  
> over to SVN, which may be worth checking out.
>   
The bioconductor project is on SVN.  The project includes over 200 
packages (the equivalent of perl modules) with something around 150-200 
ACTIVE developers.  They also have a build system for several OSes that 
operates on a cron-like system with builds of several versions 
approximately daily.  Their system is running at something like revision 
30,000, so they have significant experience.  If anyone would like 
technical support, I can certainly ask the folks maintaining their site 
if they can give some input.  Let me know if anyone would like a contact 
person.

As for access, the typical access is over http (or https).  Access 
controls can be set up on the server side while allowing anonymous 
access for checkout.  There are many excellent SVN clients for every OS,
so that should not be a problem.

Sean


From cjfields at uiuc.edu  Sat Jun 16 14:02:35 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 16 Jun 2007 09:02:35 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4673C7CB.1030709@mail.nih.gov>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
Message-ID: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>


On Jun 16, 2007, at 6:21 AM, Sean Davis wrote:

> Chris Fields wrote:
>> I'll help out to the extent I can w/o having the SVN know-how.  We
>> need (as Jason points out) someone who can detail the benefits and
>> maybe keep an updated journal on the wiki.
>>
>> I believe at least one or two of the other Bio* contemplated moving
>> over to SVN, which may be worth checking out.
>>
> The bioconductor project is on SVN.  The project includes over 200
> packages (the equivalent of perl modules) with something around  
> 150-200
> ACTIVE developers.  They also have a build system for several OSes  
> that
> operates on a cron-like system with builds of several versions
> approximately daily.  Their system is running at something like  
> revision
> 30,000, so they have significant experience.  If anyone would like
> technical support, I can certainly ask the folks maintaining their  
> site
> if they can give some input.  Let me know if anyone would like a  
> contact
> person.
>
> As for access, the typical access is over http (or https).  Access
> controls can be set up on the server side while allowing anonymous
> access for checkout.  There are many excellent SVN for every OS, so  
> that
> should not be a problem.
>
> Sean

It looks like George Hartzell may be taking a crack at it, with  
Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
could have something testable relatively soon.  After that we'll need  
to work out a few other issues, basically what's on Hilmar's list.

chris


From hlapp at gmx.net  Sat Jun 16 14:40:08 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 16 Jun 2007 10:40:08 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <AB7E0918-0EBA-47C9-8A64-FB8709230F2A@bioperl.org>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<AB7E0918-0EBA-47C9-8A64-FB8709230F2A@bioperl.org>
Message-ID: <51E89347-4AF7-482E-98DB-BE1AA0138A91@gmx.net>

Just as an aside, even if we can't keep anonymous cvs working, I
would think that with Apache URL rewriting and a small CGI script
that returns an appropriate page redirect, we can keep the hyperlinks
that people may have bookmarked functional without too much trouble.

	-hilmar

On Jun 15, 2007, at 6:23 PM, Jason Stajich wrote:

> Sounds like a plan, I'll be curious to see if we can still get keep  
> anonymous CVS working as I'd like to not have to pull the plug on  
> that.  There are some threads out on the web about how to do this  
> with a commit rule on SVN.
>
> Also, can someone who is close enough to all the SVN benefits  
> please elaborate how it is going to help _this_ project?
> Perhaps you would be willing to put a few words up -- like on (a to  
> be created):
> http://bioperl.org/wiki/BioPerl:Version_control_changeover
>
> This way if anonymous CVS is broken and/or developers who haven't  
> been paying attention come back to commit code ask why things  
> changed we don't have to compose long emails... =)
>
> -jason
> On Jun 15, 2007, at 3:10 PM, Hilmar Lapp wrote:
>
>> So should we set up a sandbox svn repository and those who would like
>> to help out
>>
>> - take shots at migrating bioperl (any current cvs snapshot will do)
>> to svn
>>
>> - you document what you find yourself having to do in trying to make
>> it work
>>
>> - you report back when you think you have a working repository
>>
>> - we all get a defined amount of time to test to our hearts' content,
>> say 2 weeks
>>
>> - you fix issues that were encountered
>>
>> - report back when done, followed by retesting for, say 1 week
>>
>> - iterate previous 2 steps until no issues and no objections to
>> migration
>>
>> - two more weeks of warning period to all developers to commit all
>> outstanding changes, or reapply them to a future svn checkout
>>
>> - pull the trigger by locking down cvs, applying the migration as
>> worked out before, and announcing that BioPerl is now on svn
>>
>> - get free beer at next BOSC (I'll pay if no one else does)
>>
>> This may not be precisely the plan that needs to be executed, but
>> it's probably somewhere along those lines.
>>
>> If there are volunteers who would like to spearhead this, then power
>> to you - I think everyone is in favor and the advantages of svn don't
>> need to be debated. The only reason it hasn't happened yet is because
>> no one has stepped forward who would have the energy.
>
>>
>> I'm sure ChrisD will gladly create the svn sandbox if we have
>> volunteers lined up to get going.
>>
>> 	-hilmar
>>
>> On Jun 15, 2007, at 5:40 PM, Spiros Denaxas wrote:
>>
>>> On 6/15/07, rvos <rvos at interchange.ubc.ca> wrote:
>>>> Hi,
>>>>
>>>> I would very much prefer it if bioperl moved to svn. I'm
>>>> considering merging Bio::Phylo (to the extent that that's possible/
>>>> practical) with bioperl and move it to an OBF repository, but I'd
>>>> rather not go back to CVS.
>>>>
>>>> Rutger
>>>>
>>>
>>> I second that, SVN seems like the reasonable choice. I would be more
>>> than happy to help out as well.
>>>
>>> Spiros
>>>
>>>>
>>>> -----Original Message-----
>>>>
>>>>> Date: Fri Jun 15 07:56:23 PDT 2007
>>>>> From: "Chris Fields" <cjfields at uiuc.edu>
>>>>> Subject: Re: [Bioperl-l] SVN and ...Re:  Perltidy
>>>>> To: "Sendu Bala" <bix at sendu.me.uk>
>>>>>
>>>>>
>>>>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>>>>
>>>>>>>>> ...
>>>>>>>> Can we do any sort of massive conversion at some logical
>>>>>>>> timepoint.
>>>>>>>> Probably after a branch release or something?  Because it
>>>>>>>> basically
>>>>>>>> means we're going to have differences on nearly every line
>>>>>>>> which is
>>>>>>>> going to make diff-ing difficult when debugging old/new  
>>>>>>>> versions.
>>>>>>>> Maybe it is not a problem because we aren't introducing and new
>>>>>>>> bugs!
>>>>>>
>>>>>> Sorry, can you clarify the problem you envisage? And why would
>>>>>> making a branch release help?
>>>>>
>>>>> Maybe the worry is that mass conversion in such a large codebase
>>>>> could lead to hard-to-locate bugs.  Shouldn't occur but who knows
>>>>> w/o
>>>>> trying?
>>>>>
>>>>>>> I agree; if we intend on doing this it should be all at once,
>>>>>>> maybe  on a branch dedicated to ensure that code changes don't
>>>>>>> tank tests  (they shouldn't but one never knows).  We would then
>>>>>>> need a script up- and-running that tidies everything up prior to
>>>>>>> commits (though what  happens if perltidy tanks?...).
>>>>>>> Sendu, up for it?
>>>>>>
>>>>>> If its going to be difficult and a hassle, for such an  
>>>>>> unnecessary
>>>>>> thing I'm not sure its worth it. There are more pressing  
>>>>>> things to
>>>>>> be done for Bioperl.
>>>>>>
>>>>>> If I can just run perltidy on the entire package and commit,  
>>>>>> I'd do
>>>>>> it. If that's not appropriate, I won't.
>>>>>
>>>>> The choices aren't necessarily all or nothing.  What about
>>>>> voluntary,
>>>>> recommended use of a perltidy config file included with the
>>>>> distribution, with additional 'caveats'?  See my response to Sean.
>>>>>
>>>>>>>>> About svn
>>>>>> [snip]
>>>>>>> Stepped into that one, didn't I!  I'll look into how much effort
>>>>>>> is  involved and try getting something going in the next  
>>>>>>> month or
>>>>>>> two,  maybe sooner if time permits.  I'm lacking on SVN-foo as
>>>>>>> well but it  might be worth looking into.
>>>>>>
>>>>>> I'd put this in the unnecessary-but-nice category as well. If it
>>>>>> will be as easy as my ->new change, go ahead. If not, there are
>>>>>> more pressing matters (POD fixing, test script updating and
>>>>>> finishing...).
>>>>>
>>>>> A few other open-bio projects have actively discussed a CVS->SVN
>>>>> migration (BioRuby and I think BioPython, though the latter  
>>>>> could be
>>>>> wrong).  As I said, "it might be worth looking into" to weigh the
>>>>> pros/cons, get others opinions from others who have made the
>>>>> transition, etc.  We could, as Jason suggested, even set up a  
>>>>> tester
>>>>> SVN w/o making it the default codebase (lock it off to a few
>>>>> testers,
>>>>> have CVS commits automatically/manually carry over to SVN, etc).
>>>>>
>>>>> I agree with you that it's not feasible to switch over prior to a
>>>>> release and that there are more pressing issues, but it doesn't  
>>>>> hurt
>>>>> having an open discussion about it.
>>>>>
>>>>> chris
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Sat Jun 16 14:55:09 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 16 Jun 2007 10:55:09 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4673C7CB.1030709@mail.nih.gov>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
Message-ID: <B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>


On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:

> As for access, the typical access is over http (or https).

We're using svn+ssh here (NESCent) so the password is the same as the  
one you set for your account on the server, and you can use public/ 
private key negotiation for authentication.

I think the ability to not provide a password for every single
interaction is a requirement. Whether that requires svn+ssh or can
be made to work through https too, I don't know. On sf.net I have to
use https for svn and it doesn't ask me for the password each time.
Not sure how this works though, maybe some local caching?

We should not be using http, or any other protocol that sends
unencrypted passwords.

>   Access controls can be set up on the server side while allowing  
> anonymous access for checkout.  There are many excellent SVN for  
> every OS, so that should not be a problem.

On Mac OSX the most convenient way I have found is through fink. It  
does ask to install 30 other dependencies, which had me balk at  
first, but me doing it by hand is even worse than fink doing it, so I  
finally gave in and it's really a breeze. I've not had a single issue.

  From a sysadmin perspective, what might be worth keeping in mind is  
that svn is going to store everything in a database (BerkeleyDB I  
think). I.e., there is no such thing anymore as restoring individual  
source code files from backup if one gets accidentally corrupted on  
the server. It seems you have to restore the entire database, i.e.,  
the entire repository. I vaguely recall though that how svn manages  
the repository is actually configurable and that other storage than  
DB is possible too. Don't ask me for the pros and cons of one vs the  
other.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From rvos at interchange.ubc.ca  Sat Jun 16 17:09:18 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Sat, 16 Jun 2007 10:09:18 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>

CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already; for my own purposes I like the integration with Eclipse (through the Subclipse plugin) and Komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)).

For standalone use on OS X I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On Windows there is a really nice standalone one (TortoiseSVN) that integrates with the Explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement).

Rutger


-----Original Message-----

> Date: Sat Jun 16 07:55:09 PDT 2007
> From: "Hilmar Lapp" <hlapp at gmx.net>
> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
> To: "Sean Davis" <sdavis2 at mail.nih.gov>
>
> 
> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
> 
> > As for access, the typical access is over http (or https).
> 
> We're using svn+ssh here (NESCent) so the password is the same as the  
> one you set for your account on the server, and you can use public/ 
> private key negotiation for authentication.
> 
> I think the ability to not provide a password for every single  
> interaction is a requirement. If that requires using svn+ssh or can  
> be made to work through https too I don't know. On sf.net I have to  
> use https for svn and it doesn't ask me for the password each time.  
> Not sure how this works though, maybe some local caching?
> 
> We should not be using http, or whatever other protocol that sends  
> unencrypted passwords.
> 
> >   Access controls can be set up on the server side while allowing  
> > anonymous access for checkout.  There are many excellent SVN for  
> > every OS, so that should not be a problem.
> 
> On Mac OSX the most convenient way I have found is through fink. It  
> does ask to install 30 other dependencies, which had me balk at  
> first, but me doing it by hand is even worse than fink doing it, so I  
> finally gave in and it's really a breeze. I've not had a single issue.
> 
>   From a sysadmin perspective, what might be worth keeping in mind is  
> that svn is going to store everything in a database (BerkeleyDB I  
> think). I.e., there is no such thing anymore as restoring individual  
> source code files from backup if one gets accidentally corrupted on  
> the server. It seems you have to restore the entire database, i.e.,  
> the entire repository. I vaguely recall though that how svn manages  
> the repository is actually configurable and that other storage than  
> DB is possible too. Don't ask me for the pros and cons of one vs the  
> other.
> 
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================


From rvos at interchange.ubc.ca  Sat Jun 16 17:15:45 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Sat, 16 Jun 2007 10:15:45 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca>

A brief word on the topic of perltidy: no. I like what it does, and I sort of follow one of its settings (-syn -sob -b), but if you run it on a whole source tree it'll screw up the diffs, and I'm still worried about it breaking things (though really it shouldn't, it creates a *.bak if something doesn't compile anymore).

Rutger


-----Original Message-----

> Date: Sat Jun 16 10:09:18 PDT 2007
> From: "rvos" <rvos at interchange.ubc.ca>
> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
> To: "Hilmar Lapp" <hlapp at gmx.net>, "Sean Davis" <sdavis2 at mail.nih.gov>
>
> CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)).
> 
> For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement).
> 
> Rutger
> 
> 
> -----Original Message-----
> 
> > Date: Sat Jun 16 07:55:09 PDT 2007
> > From: "Hilmar Lapp" <hlapp at gmx.net>
> > Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
> > To: "Sean Davis" <sdavis2 at mail.nih.gov>
> >
> > 
> > On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
> > 
> > > As for access, the typical access is over http (or https).
> > 
> > We're using svn+ssh here (NESCent) so the password is the same as the  
> > one you set for your account on the server, and you can use public/ 
> > private key negotiation for authentication.
> > 
> > I think the ability to not provide a password for every single  
> > interaction is a requirement. If that requires using svn+ssh or can  
> > be made to work through https too I don't know. On sf.net I have to  
> > use https for svn and it doesn't ask me for the password each time.  
> > Not sure how this works though, maybe some local caching?
> > 
> > We should not be using http, or whatever other protocol that sends  
> > unencrypted passwords.
> > 
> > >   Access controls can be set up on the server side while allowing  
> > > anonymous access for checkout.  There are many excellent SVN for  
> > > every OS, so that should not be a problem.
> > 
> > On Mac OSX the most convenient way I have found is through fink. It  
> > does ask to install 30 other dependencies, which had me balk at  
> > first, but me doing it by hand is even worse than fink doing it, so I  
> > finally gave in and it's really a breeze. I've not had a single issue.
> > 
> >   From a sysadmin perspective, what might be worth keeping in mind is  
> > that svn is going to store everything in a database (BerkeleyDB I  
> > think). I.e., there is no such thing anymore as restoring individual  
> > source code files from backup if one gets accidentally corrupted on  
> > the server. It seems you have to restore the entire database, i.e.,  
> > the entire repository. I vaguely recall though that how svn manages  
> > the repository is actually configurable and that other storage than  
> > DB is possible too. Don't ask me for the pros and cons of one vs the  
> > other.
> > 
> > 	-hilmar
> > -- 
> > ===========================================================
> > : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> > ===========================================================


From george.heller at yahoo.com  Sat Jun 16 17:29:26 2007
From: george.heller at yahoo.com (George Heller)
Date: Sat, 16 Jun 2007 10:29:26 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
Message-ID: <959624.48556.qm@web56502.mail.re3.yahoo.com>

Hi all,

I am looking at extracting the taxonomy hierarchy for some taxon ids. What I plan to do is, for a given taxon id, say 33090, extract all taxon ids that are children of this taxon. I do not just want the immediate children, but the children's children and so on.

Any ideas on how I can go about doing this?

George

       


From bix at sendu.me.uk  Sat Jun 16 18:21:38 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 16 Jun 2007 19:21:38 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <959624.48556.qm@web56502.mail.re3.yahoo.com>
References: <959624.48556.qm@web56502.mail.re3.yahoo.com>
Message-ID: <46742A32.90305@sendu.me.uk>

George Heller wrote:
> Hi all,
> 
> I am looking at extracting the taxonomy hierarchy for some taxon ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
> 
> Any ideas on the way I can go about doing this?

Well, you'll presumably use Bio::DB::Taxonomy, calling each_Descendent in
some kind of looping structure, most easily a recursive sub.

If you happen to code up something neat and efficient, why not share it 
with us and we could add it to the Taxonomy module(s).
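
As a rough illustration of the recursive approach, here is a minimal
sketch. The sub name all_descendants, the callback argument and the toy
tree are made up for this example and are not part of BioPerl; the
commented-out lines at the end only suggest how each_Descendent might be
plugged in, and are untested assumptions.

```perl
use strict;
use warnings;

# Collect every descendant (children, grandchildren, ...) of $node.
# $children_of is a code ref that returns the immediate children of a
# node, so the same sub works on a toy tree or on a taxonomy database.
sub all_descendants {
    my ($node, $children_of) = @_;
    my @found;
    for my $child ($children_of->($node)) {
        # Record the child, then recurse into its own children.
        push @found, $child, all_descendants($child, $children_of);
    }
    return @found;
}

# Toy parent => children table (made-up ids, NOT real NCBI taxon ids).
my %toy_tree = (
    1 => [2, 3],
    2 => [4, 5],
    3 => [],
    4 => [6],
);

my @ids = all_descendants(1, sub { @{ $toy_tree{ $_[0] } || [] } });
print join(',', @ids), "\n";    # depth-first order: 2,4,6,5,3

# With BioPerl the callback would wrap each_Descendent, roughly
# (assumption: $db is a Bio::DB::Taxonomy object from a source that
# implements each_Descendent):
#   my $taxon = $db->get_taxon(-taxonid => 33090);
#   my @taxa  = all_descendants($taxon, sub { $db->each_Descendent($_[0]) });
#   my @ids   = map { $_->id } @taxa;
```

Passing the child lookup in as a code ref keeps the traversal separate
from the database, so it can be tried out without any taxonomy files.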


From cjfields at uiuc.edu  Sat Jun 16 19:23:43 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 16 Jun 2007 14:23:43 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>
Message-ID: <A59B3FA2-6732-4DB2-9C9C-223DFF41D1E9@uiuc.edu>


On Jun 16, 2007, at 9:55 AM, Hilmar Lapp wrote:

>
> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>
>> As for access, the typical access is over http (or https).
>
> We're using svn+ssh here (NESCent) so the password is the same as the
> one you set for your account on the server, and you can use public/
> private key negotiation for authentication.
>
> I think the ability to not provide a password for every single
> interaction is a requirement. If that requires using svn+ssh or can
> be made to work through https too I don't know. On sf.net I have to
> use https for svn and it doesn't ask me for the password each time.
> Not sure how this works though, maybe some local caching?
>
> We should not be using http, or whatever other protocol that sends
> unencrypted passwords.

Agreed; it should be through ssh.

>>   Access controls can be set up on the server side while allowing
>> anonymous access for checkout.  There are many excellent SVN for
>> every OS, so that should not be a problem.
>
> On Mac OSX the most convenient way I have found is through fink. It
> does ask to install 30 other dependencies, which had me balk at
> first, but me doing it by hand is even worse than fink doing it, so I
> finally gave in and it's really a breeze. I've not had a single issue.
>
>   From a sysadmin perspective, what might be worth keeping in mind is
> that svn is going to store everything in a database (BerkeleyDB I
> think). I.e., there is no such thing anymore as restoring individual
> source code files from backup if one gets accidentally corrupted on
> the server. It seems you have to restore the entire database, i.e.,
> the entire repository. I vaguely recall though that how svn manages
> the repository is actually configurable and that other storage than
> DB is possible too. Don't ask me for the pros and cons of one vs the
> other.

MacPorts/DarwinPorts also has subversion, various language bindings,  
cvs2svn, and various perl modules.  There are also a few SVN GUIs  
lingering around (including live folders within Komodo).

chris


From cjfields at uiuc.edu  Sat Jun 16 19:18:06 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 16 Jun 2007 14:18:06 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca>
References: <27462805.1182014145280.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <1A314D08-8F3C-4A4B-B58D-64AC7952F149@uiuc.edu>

I think it's viable as an option if the code really needs it.  After
100+ commits some of the code has inconsistent coding styles, so cleaning
it up helps.  In those cases having a perltidy config file present
wouldn't hurt.  However, I agree that it shouldn't be applied across
every module and should be done judiciously (the commit message, for
instance, should actually state that the code was tidied).

chris

PS - Nice to see the ball is rolling on SVN!

On Jun 16, 2007, at 12:15 PM, rvos wrote:

> A brief word on the topic of perltidy: no. I like what it does, and  
> I sort of follow one of its settings (-syn -sob -b), but if you run  
> it on a whole source tree it'll screw up the diffs, and I'm still  
> worried about it breaking things (though really it shouldn't, it  
> creates a *.bak if something doesn't compile anymore).
>
> Rutger
>
>
>
> -----Original Message-----
>
>> Date: Sat Jun 16 10:09:18 PDT 2007
>> From: "rvos" <rvos at interchange.ubc.ca>
>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
>> To: "Hilmar Lapp" <hlapp at gmx.net>, "Sean Davis"  
>> <sdavis2 at mail.nih.gov>
>>
>> CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales  
>> talk has been expended over it already, for my own purpose I like  
>> the integration with eclipse (through subclipse plugin) and  
>> komodo, in addition to the atomic commits (so I can ctrl+c if I  
>> goof up (again)).
>>
>> For standalone use on osx I didn't use the fink one, but I forgot  
>> where I did get it from. It was very easy to set up, though. On  
>> windows there is a really nice standalone one (tortoisesvn) that  
>> integrates with the explorer so you can see on the file icons what  
>> the state of a file is. I know that there's a cvs2svn utility that  
>> converts your revision history (seems a requirement).
>>
>> Rutger
>>
>>
>> -----Original Message-----
>>
>>> Date: Sat Jun 16 07:55:09 PDT 2007
>>> From: "Hilmar Lapp" <hlapp at gmx.net>
>>> Subject: Re: [Bioperl-l] SVN and ...Re: Perltidy
>>> To: "Sean Davis" <sdavis2 at mail.nih.gov>
>>>
>>>
>>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>>>
>>>> As for access, the typical access is over http (or https).
>>>
>>> We're using svn+ssh here (NESCent) so the password is the same as  
>>> the
>>> one you set for your account on the server, and you can use public/
>>> private key negotiation for authentication.
>>>
>>> I think the ability to not provide a password for every single
>>> interaction is a requirement. If that requires using svn+ssh or can
>>> be made to work through https too I don't know. On sf.net I have to
>>> use https for svn and it doesn't ask me for the password each time.
>>> Not sure how this works though, maybe some local caching?
>>>
>>> We should not be using http, or whatever other protocol that sends
>>> unencrypted passwords.
>>>
>>>>   Access controls can be set up on the server side while allowing
>>>> anonymous access for checkout.  There are many excellent SVN for
>>>> every OS, so that should not be a problem.
>>>
>>> On Mac OSX the most convenient way I have found is through fink. It
>>> does ask to install 30 other dependencies, which had me balk at
>>> first, but me doing it by hand is even worse than fink doing it,  
>>> so I
>>> finally gave in and it's really a breeze. I've not had a single  
>>> issue.
>>>
>>>   From a sysadmin perspective, what might be worth keeping in  
>>> mind is
>>> that svn is going to store everything in a database (BerkeleyDB I
>>> think). I.e., there is no such thing anymore as restoring individual
>>> source code files from backup if one gets accidentally corrupted on
>>> the server. It seems you have to restore the entire database, i.e.,
>>> the entire repository. I vaguely recall though that how svn manages
>>> the repository is actually configurable and that other storage than
>>> DB is possible too. Don't ask me for the pros and cons of one vs the
>>> other.
>>>
>>> 	-hilmar
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hartzell at alerce.com  Sat Jun 16 17:47:01 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 16 Jun 2007 10:47:01 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
	<6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
	<F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>
Message-ID: <18036.8725.29073.619527@almost.alerce.com>

Chris Fields writes:
 > Ah, got it.  Sorry.
 > 
 > George, planning on taking this up?

I'm going to take a *peek*.  I just finished (unless someone finds
another issue) moving someone's cvs repository over to svn, so I have
some tools cobbled together and some knowledge in the cache.

I don't have too much idle time at the moment though, so if it gets
gooey I'll just summarize what I learn.  Either way it seems worth a
peek.

I will need the repository itself though.  I'll post a note to
support at open-bio.org.

g.


From jason at bioperl.org  Sat Jun 16 23:54:18 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 16 Jun 2007 16:54:18 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18036.8725.29073.619527@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<6486C307-6682-4AB4-9CFC-D4BB317F0A98@uiuc.edu>
	<6CBDAADF-75FC-41C8-95AA-DB07A0C65A62@gmx.net>
	<F9A17504-CAD4-4B4F-A388-F0365B288FCB@uiuc.edu>
	<18036.8725.29073.619527@almost.alerce.com>
Message-ID: <6F57475B-715F-49D1-B6D2-F3FD3ACCB728@bioperl.org>

Thanks George.
I'll respond to your support ticket as well but I put up tarballs of  
the repository as of today.

I had thought at one point ChrisD might have set up rsync-able access
to the whole repository through code.open-bio.org, but for now I have
put up tarballs of most of the CVS dirs from bioperl:
http://bioperl.org/uploads/

Just to say, I already went through all the steps of running cvs2svn
myself and had problems getting the branches and all the tags back
out when I tried it.  You may want to start with a smaller repository
like bioperl-network or bioperl-db, as the initial cvs2svn conversion
script took quite a long time to run on bioperl-live.

Regarding ssh/https:
We have already gone through some of this for the blipkit and biojava
projects.  I think we'll still keep separate anonymous read-only
(code.open-bio.org) and writeable (dev.open-bio.org) repositories, as
I think we are resisting any webapps on the development server because
we want that to be as locked down as possible.  For the newly created
svn repositories that I've been creating/using I just use svn+ssh, and
that worked okay.


-jason

On Jun 16, 2007, at 10:47 AM, George Hartzell wrote:

> Chris Fields writes:
>> Ah, got it.  Sorry.
>>
>> George, planning on taking this up?
>
> I'm going to take a *peek*.  I just finished (unless someone finds
> another issue) moving someone's cvs repository over to svn, so I have
> some tools cobbled together and some knowledge in the cache.
>
> I don't have too much idle time at the moment though, so if it gets
> gooey I'll just summarize what I learn.  Either way it seems worth a
> peek.
>
> I will need the repository itself though.  I'll post a note to
> support at open-bio.org.
>
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From hartzell at alerce.com  Sat Jun 16 23:56:09 2007
From: hartzell at alerce.com (George Hartzell)
Date: Sat, 16 Jun 2007 16:56:09 -0700
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <46739D69.4090204@sheffield.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<18035.14352.963113.473274@almost.alerce.com>
	<46739D69.4090204@sheffield.ac.uk>
Message-ID: <18036.30873.609341.181853@almost.alerce.com>

Nathan S. Haigh writes:
 > [...]
 > Sounds like George might know what he's doing! 

Hey, I've been looking for a Marketing Director.  Want a job?

 > I have a question about
 > setting up svn access. I believe access can be done in several ways,
 > over webdav, over ssh and probably others too. Do you have any knowledge
 > about the benefits of one over the other? I suppose I'm thinking of what
 > to implement to allow anonymous read access for users and authenticated
 > access for developers.

There are two and a half ways to talk to the repository:

  - You can put it behind a web server (e.g. apache) and get at it
    using http/https.  Authentication and authorization happen using
    the normal web server tricks, so as long as you don't do anything
    silly (e.g. don't use basic auth, stick with mod_auth_digest),
    even http connections won't send passwords in the clear.  You can
    define users in .htpasswd files or use any of the fancier setups
    (e.g. sql databases, etc...).

  - You can talk to it via subversion's simple server, svnserve.
    There are two ways you usually talk to svnserve (neither of which
    send passwords in the clear):

      * directly, using a URL like
          svn://svn.example.com/repo/proj/trunk
        when you do this the client either talks directly to a copy of
        svnserve running as a daemon, or possibly to something like
        inetd that'll start an svnserve as necessary.

        In this case, you define authen. and author. info in an
        svnserve.conf file.

      * indirectly, using a URL like
          svn+ssh://svn.example.com/repo/proj/trunk/
        in which case you make an ssh connection to the server machine
        (and authenticate via ssh mechanisms, anything other than a
        key-pair will drive you nuts with repeated password requests)
        and then an svnserve process is started up for you in "tunnel
        mode".  Access control is coarse-grained and done via OS-level
        access permissions.

        Generally in this case you need to give out shell accounts to
        everyone involved, or (tsk, tsk) have them use a common
        account.  There's a cute trick in the svn book that shows how
        to use a shared ssh account but still have all of the changes
        in the repo keep track of the real user.  I've never tried
        it.... 

   - If you're on the same machine as the repo, you can simply use:
        file:///path/to/repo/proj/trunk

The biggest deciding factor is how you want to manage your users and
whether you're already messing around with a web server.  I've
generally worked in small groups where everyone's had ssh access, but
I've set it up the other ways too.

You can even access via multiple paths.  The only trick is that the
repository needs to be writable by whoever's committing, and if
they're running svnserve themselves (file: or svn+ssh:) and things
aren't set up right (all the dirs in the repo need to be group
writable and have the magic bit set so that any new stuff created is
also writable, users' umasks and group membership need to be aligned)
then things go fubar.  Google's your friend here, and each of the
OSes/distros has a standard hack for making this work, usually
involving a wrapper app that takes care of things.
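
As a rough illustration of the permissions point above, here is a
minimal Perl sketch that lists directories lacking the two bits.  It
assumes "the magic bit" means the setgid bit on directories, which is
a reading of the text rather than something stated in it, and the
function names are made up for the example.

```perl
use strict;
use warnings;
use Fcntl ':mode';    # S_IWGRP (group write), S_ISGID (setgid)
use File::Find ();

# True if a mode has both the group-write bit and the setgid bit.
# Assumption: "magic bit" in the text above refers to setgid.
sub mode_is_group_shareable {
    my ($mode) = @_;
    return ( ( $mode & S_IWGRP ) && ( $mode & S_ISGID ) ) ? 1 : 0;
}

# List every directory under $root that lacks either bit.
# This only reports; it does not change any permissions.
sub dirs_needing_fix {
    my ($root) = @_;
    my @bad;
    File::Find::find(
        sub {
            return unless -d $_;
            my $mode = ( stat $_ )[2];
            return unless defined $mode;
            push @bad, $File::Find::name
                unless mode_is_group_shareable($mode);
        },
        $root,
    );
    return @bad;
}
```

Calling dirs_needing_fix('/path/to/repo') on the server would show
which directories to look at before allowing commits over file: or
svn+ssh:.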

Feel free to ask any particular questions.

Phew,

g.


From jason at bioperl.org  Sun Jun 17 00:17:58 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 16 Jun 2007 17:17:58 -0700
Subject: [Bioperl-l] seq doesn't validate error
In-Reply-To: <200706151653.04135.sheris@eps.berkeley.edu>
References: <200706151558.12911.sheris@eps.berkeley.edu>
	<1A4207F8295607498283FE9E93B775B4034C6E7B@EX02.asurite.ad.asu.edu>
	<200706151653.04135.sheris@eps.berkeley.edu>
Message-ID: <6A369DE9-943A-4DF1-9DF0-F68E361C8C20@bioperl.org>

The error is clearly saying there must be a symbol or letter in
your sequence that violates the regexp.
I had modified the code in CVS to actually provide a more informative
mismatch error in the error message, but this is probably not in the
release you are using.

Anyway, add this to see what is causing the problem:

print join(",",($nstarthash{$_}[1] =~ /([^$Bio::PrimarySeq::MATCHPATTERN]+)/g)), "\n";
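
If the offending characters turn out to be invisible (whitespace, a
stray newline, and so on), printing them raw will not help much.  A
small sketch of the same check that prints hex codes instead is below.
The $ALLOWED string is only a stand-in for
$Bio::PrimarySeq::MATCHPATTERN (an assumption; substitute the real
variable from your installation), and offending_chars is a made-up
helper name.

```perl
use strict;
use warnings;

# Stand-in for $Bio::PrimarySeq::MATCHPATTERN. This value is an
# assumption for the sketch; use the real variable in practice.
my $ALLOWED = 'A-Za-z\-\.\*\?';

# Return each run of characters that falls outside the allowed class,
# rendered as hex codes so that whitespace becomes visible.
sub offending_chars {
    my ( $seq, $allowed ) = @_;
    $allowed = $ALLOWED unless defined $allowed;
    my @runs = $seq =~ /([^$allowed]+)/g;
    my @out;
    for my $run (@runs) {
        push @out,
            join ' ', map { sprintf( '0x%02X', ord($_) ) } split //, $run;
    }
    return @out;
}

# A sequence with a trailing newline that was never chomped.
print join( ',', offending_chars("GCCCCTGTAATC\n") ), "\n";
```

An empty line of output means every character matched the pattern.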

-jason
On Jun 15, 2007, at 4:53 PM, Sheri Simmons wrote:

> Thanks for the suggestion, but that still gives the same error as  
> before.
>
> On Friday 15 June 2007 4:11 pm, Kevin Brown wrote:
>>> I'm getting an error as follows when I try to reverse
>>> complement a sequence string stored in a hash of arrays. The
>>> storage code is:
>>>
>>> 		$nstarthash{$key} = [$sortchecks[0], join("",
>>> @nseq),
>>> join("",@{$seqhash{$key}})];
>>>
>>> the sequence of interest is the element at index 1.
>>>
>>> Later, I try to retrieve this string for a subset of keys so
>>> I can reverse complement it based on input from another hash
>>> (%complement):
>>>
>>> 			my %revcomphash = map { my $read = $_;
>>> 			grep $complement{$read} eq 'C', %complement;
>>> 			{$_, (Bio::Seq->new(-seq
>>> =>$nstarthash{$_}[1]))->revcom->seq()};}
>>> 			 keys(%nstarthash);
>>>
>>>
>>> I get the following warning (long sequence edited for clarity):
>>>
>>> -- -------------------- WARNING ---------------------
>>> MSG: seq doesn't validate, mismatch is 1
>>> ---------------------------------------------------
>>>
>>> ------------- EXCEPTION  -------------
>>> MSG: Attempting to set the sequence to
>>> [GCCCCTGTAATCGCTTTTATATCGTCAGCGATC]
>>> which does not look healthy
>>> STACK Bio::PrimarySeq::seq /usr/share/perl5/Bio/PrimarySeq.pm:268
>>> STACK Bio::PrimarySeq::new /usr/share/perl5/Bio/PrimarySeq.pm:217
>>> STACK Bio::Seq::new /usr/share/perl5/Bio/Seq.pm:498 STACK
>>> toplevel ../quality_wrapper.pl:103
>>>
>>> I cannot find any non-allowed characters in the sequence, and
>>> the de-referencing appears to work correctly. Can anyone help me?
>>> I'm using the latest Bioperl installation (1.5.2) with
>>> ActivePerl5.8 on a Mepis 6.5 system.
>>
>> Try telling the Bio::Seq object what alphabet to use when creating  
>> it.
>> I tend to create them like:
>>
>> Bio::Seq->new(-seq=> $seqvar, -alphabet=>'dna')
>
> -- 
> Sheri Simmons
> Department of Earth and Planetary Sciences
> University of California, Berkeley
> Berkeley, CA 94720-4767
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From n.haigh at sheffield.ac.uk  Sun Jun 17 11:45:11 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 17 Jun 2007 12:45:11 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>
References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <46751EC7.8020609@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

rvos wrote:
> CIPRES and Bio::Phylo use svn. As for the benefits, a lot of sales talk has been expended over it already, for my own purpose I like the integration with eclipse (through subclipse plugin) and komodo, in addition to the atomic commits (so I can ctrl+c if I goof up (again)).
> 
> For standalone use on osx I didn't use the fink one, but I forgot where I did get it from. It was very easy to set up, though. On windows there is a really nice standalone one (tortoisesvn) that integrates with the explorer so you can see on the file icons what the state of a file is. I know that there's a cvs2svn utility that converts your revision history (seems a requirement).
> 
> Rutger
> 
> 

Just to clarify, subversion is available as command line for windows:
http://subversion.tigris.org/project_packages.html

TortoiseSVN is another svn client with a GUI that integrates into the
shell. I tried setting this up a while back to use ssh (via PUTTY), but
I wasn't successful. This may have been due to me just starting out with
svn or that it was harder to setup in an earlier version of TortoiseSVN.

Does anyone have experience of setting up svn on Windows to use ssh? If
the changeover takes place, I'm happy to write some howto's for setting
up svn clients for Windows.

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGdR7HczuW2jkwy2gRAmgOAJ96wLzVYbjqEPborZTsw6gwU6UitgCfV02v
8xHJvn/Eqf9LePR3Ei0ZaIw=
=t5pN
-----END PGP SIGNATURE-----


From george.heller at yahoo.com  Sun Jun 17 18:41:55 2007
From: george.heller at yahoo.com (George Heller)
Date: Sun, 17 Jun 2007 11:41:55 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46742A32.90305@sendu.me.uk>
Message-ID: <148654.15952.qm@web56511.mail.re3.yahoo.com>

Hi all,
   
  Can anyone point me to some example that uses the get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at this, and I am not quite sure how to implement it. 
   
  Thanks.
  George

Sendu Bala <bix at sendu.me.uk> wrote:
  George Heller wrote:
> Hi all,
> 
> I am looking at extracting the taxonomy hierarchy for some taxon ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
> 
> Any ideas on the way I can go about doing this?

Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
some kind of looping structure. Most easily a recursing sub.

If you happen to code up something neat and efficient, why not share it 
with us and we could add it to the Taxonomy module(s).




From jason at bioperl.org  Sun Jun 17 20:48:05 2007
From: jason at bioperl.org (Jason Stajich)
Date: Sun, 17 Jun 2007 13:48:05 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <148654.15952.qm@web56511.mail.re3.yahoo.com>
References: <148654.15952.qm@web56511.mail.re3.yahoo.com>
Message-ID: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org>

I assume you already figured out how to set up a local taxonomydb?

You just want the extant species/leaves of the tree

my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;


-jason
On Jun 17, 2007, at 11:41 AM, George Heller wrote:

> Hi all,
>
>   Can anyone point me to some example that uses the  
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at  
> this, and I am not quite sure how to implement it.
>
>   Thanks.
>   George
>
> Sendu Bala <bix at sendu.me.uk> wrote:
>   George Heller wrote:
>> Hi all,
>>
>> I am looking at extracting the taxonomy hierarchy for some taxon ids.
>> What I plan to do is, for a given taxon id, say 33090, I want to
>> extract all taxon ids that are children of this species. I do not
>> just want the immediate children, but the children's children and so
>> on.
>>
>> Any ideas on the way I can go about doing this?
>
> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
>
> If you happen to code up something neat and efficient, why not  
> share it
> with us and we could add it to the Taxonomy module(s).
>
>
>
> ---------------------------------
> Shape Yahoo! in your own image.  Join our Network Research Panel  
> today!
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From aaron.j.mackey at gsk.com  Mon Jun 18 02:35:42 2007
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Sun, 17 Jun 2007 22:35:42 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46742A32.90305@sendu.me.uk>
Message-ID: <OF9A874C93.CFF12016-ON852572FE.000E328D-852572FE.000E463E@gsk.com>

To do so efficiently, you might want to check out:

  http://www.oreillynet.com/pub/a/network/2002/11/27/bioconf.html

-Aaron

bioperl-l-bounces at lists.open-bio.org wrote on 06/16/2007 02:21:38 PM:

> George Heller wrote:
> > Hi all,
> > 
> > I am looking at extracting the taxonomy hierarchy for some taxon ids.
> > What I plan to do is, for a given taxon id, say 33090, I want to
> > extract all taxon ids that are children of this species. I do not
> > just want the immediate children, but the children's children and so
> > on.
> > 
> > Any ideas on the way I can go about doing this?
> 
> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
> 
> If you happen to code up something neat and efficient, why not share it 
> with us and we could add it to the Taxonomy module(s).
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From aaron.j.mackey at gsk.com  Mon Jun 18 02:34:12 2007
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Sun, 17 Jun 2007 22:34:12 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <B44C5404-3440-4DBB-8653-FBC46540249B@gmx.net>
Message-ID: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>

> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
> 
> > As for access, the typical access is over http (or https).
> 
> We're using svn+ssh here (NESCent)

Let me just note that https is preferable to ssh for those poor slobs 
stuck behind a corporate firewall (svn happily prompts me for my proxy 
server's user/pass, then my https authentication realm's user/pass - all 
then get cached in some .svn/ file that I don't have to worry about again 
until my proxy server password changes once a month ...)

-Aaron


From george.heller at yahoo.com  Mon Jun 18 04:21:45 2007
From: george.heller at yahoo.com (George Heller)
Date: Sun, 17 Jun 2007 21:21:45 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <617EAA37-D502-42AE-8820-C9514DBF7ACD@bioperl.org>
Message-ID: <487845.37410.qm@web56510.mail.re3.yahoo.com>

Thanks. And how can I assign the $node here in the below code, such that I can reference it to a particular taxon id record? I want to retrieve all the descendents from the taxonomy hierarchy, given a particular taxon id. 
   
  I have a local db setup, in which I have uploaded data using the load_ncbi_taxonomy.pl script. 
   
  Thanks.
  George
   
Jason Stajich <jason at bioperl.org> wrote:
> I assume you already figured out how to setup a local taxonomydb?
>
> You just want the extant species/leaves of the tree
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> -jason
> On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
>> Hi all,
>>
>>   Can anyone point me to some example that uses the
>> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
>> this, and I am not quite sure how to implement it.
>>
>>   Thanks.
>>   George
>>
>> Sendu Bala <bix at sendu.me.uk> wrote:
>>> George Heller wrote:
>>>> Hi all,
>>>>
>>>> I am looking at extracting the taxonomy hierarchy for some taxon ids.
>>>> What I plan to do is, for a given taxon id, say 33090, I want to
>>>> extract all taxon ids that are children of this species. I do not
>>>> just want the immediate children, but the children's children and so
>>>> on.
>>>>
>>>> Any ideas on the way I can go about doing this?
>>>
>>> Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
>>> some kind of looping structure. Most easily a recursing sub.
>>>
>>> If you happen to code up something neat and efficient, why not share it
>>> with us and we could add it to the Taxonomy module(s).
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/



From bix at sendu.me.uk  Mon Jun 18 10:44:00 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 11:44:00 +0100
Subject: [Bioperl-l] Network tests overhaul
Message-ID: <467661F0.2060703@sendu.me.uk>

When the test suite runs currently, most (the intent is all) tests skip 
if the test would require network (internet) access. This is to avoid 
tests failing not due to bugs in Bioperl code, but due to temporarily 
inaccessible servers. This is also to make running the test suite faster.

To do a complete test you currently have to set BIOPERLDEBUG to true,
which activates the network tests but also increases verbosity. This
actually causes a problem: when running the entire test suite, the
additional debug information is more a hindrance than a help, because
the reams of printed information can hide significant warnings that may
also get printed. It's also ugly.

The solution is to divorce activation of network tests from the request 
for verbosity. The obvious implementation is to have another environment 
variable, perhaps BIOPERLNETWORK. However, there is an opportunity to do 
something more appropriate. The running of networking tests should be a 
choice given to every end-user installing Bioperl. Debugging 
information, on the other hand, is only of interest to the developer 
working on a specific module under test, so can be left as a 'hidden' 
env var.


I have just committed one possible implementation along these lines.

You say:
perl Build.PL
as normal, and if you seem to have internet access it asks you if you'd 
like to run network tests. The default answer is no. If you answer yes, 
network tests will be enabled.

You can alternatively say:
perl Build.PL --network
and if you seem to have internet access, network tests will be enabled.

Then you run the tests:
./Build test
Any tests written to support the new system will then skip network tests 
if they haven't been enabled.

The only test I've written to support the new system is t/RemoteBlast.t:
./Build test --test_files t/RemoteBlast.t --verbose


Adding support to test scripts consists of the following changes:

+ use Module::Build;
+ my $build = Module::Build->current(get_options => { network => {} });
+ my $do_network_tests = $build->notes('network');

! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests
---
! if (!$do_network_tests) { # skip network tests


I propose adding this support to all test scripts that carry out network 
tests. Does anyone have objections? Does anyone have alternate 
implementations that may be superior?

I specifically suggest we don't use an env var in addition to the above, 
because the multiple ways of doing things could lead to confusion. Which 
takes priority? Did a user really have the networking tests turned on 
when he reported his test results?


The one thing I need help with is identifying which tests attempt to 
access the internet. I think we caught most of them for the 1.5.2 
release, but I think there are more lurking around. Can anyone offer a 
way to systematically find at least the test scripts which access the 
internet, if not the specific tests within?

Cheers,
Sendu.


From bix at sendu.me.uk  Mon Jun 18 10:46:17 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 11:46:17 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID: <46766279.7050202@sendu.me.uk>

Sendu Bala wrote:
> Adding support to test scripts consists of the following changes:
> 
> + use Module::Build;
> + my $build = Module::Build->current(get_options => { network => {} });

That should read:
+ my $build = Module::Build->current();

> + my $do_network_tests = $build->notes('network');


From cjfields at uiuc.edu  Mon Jun 18 11:45:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 06:45:10 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <46766279.7050202@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk>
Message-ID: <C3AD4CC8-4B55-4613-B751-99E18C7A87B5@uiuc.edu>

The idea sounds good, though if we plan on doing this we need to  
update the Test HOWTO as well.

Some modules require only a few (<50% of the total) network tests; I  
think SeqFeature.t may be one, though I'm not sure.  Does this handle  
those cases?

chris

On Jun 18, 2007, at 5:46 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Adding support to test scripts consists of the following changes:
>>
>> + use Module::Build;
>> + my $build = Module::Build->current(get_options => { network =>  
>> {} });
>
> That should read:
> + my $build = Module::Build->current();
>
>> + my $do_network_tests = $build->notes('network');
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Mon Jun 18 11:49:18 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 12:49:18 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <C3AD4CC8-4B55-4613-B751-99E18C7A87B5@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk> <46766279.7050202@sendu.me.uk>
	<C3AD4CC8-4B55-4613-B751-99E18C7A87B5@uiuc.edu>
Message-ID: <4676713E.1000508@sendu.me.uk>

Chris Fields wrote:
> The idea sounds good, though if we plan on doing this we need to update 
> the Test HOWTO as well.
> 
> Some modules require only a few (<50% of the total) network tests; I 
> think SeqFeature.t may be one, though I'm not sure.  Does this handle 
> those cases?

Yes, the system just gives the test script a boolean describing whether
network tests should be run. The script can then do whatever it wants
with the boolean: skip all tests, skip no tests, skip just some tests...
It's a drop-in replacement for the current 'debug' boolean that is based
on BIOPERLDEBUG.
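
For the "skip just some tests" case, a test script might look roughly
like the sketch below.  Module::Build->current and notes('network')
are as described earlier in this thread; the eval fallback, the
network_tests_enabled helper and the placeholder tests are assumptions
made so the sketch runs outside a build directory.

```perl
use strict;
use warnings;
use Test::More;

# Ask Module::Build whether network tests were enabled at
# 'perl Build.PL' time. Outside a build directory (or without
# Module::Build installed) the eval fails and we treat them as disabled.
sub network_tests_enabled {
    my $build = eval { require Module::Build; Module::Build->current };
    return 0 unless $build;
    return $build->notes('network') ? 1 : 0;
}

my $do_network_tests = network_tests_enabled();

# Local tests always run.
is( length('ACGT'), 4, 'local test: no network needed' );

SKIP: {
    skip 'network tests not enabled', 2 unless $do_network_tests;

    # Placeholders standing in for real remote checks
    # (e.g. a RemoteBlast submission).
    pass('network test 1 placeholder');
    pass('network test 2 placeholder');
}

done_testing();
```

Only the block guarded by SKIP depends on the boolean, so a script with
a handful of network tests among many local ones keeps running the
local ones either way.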


From hlapp at gmx.net  Mon Jun 18 12:38:25 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:38:25 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <487845.37410.qm@web56510.mail.re3.yahoo.com>
References: <487845.37410.qm@web56510.mail.re3.yahoo.com>
Message-ID: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net>

I'm a bit confused - it sounds like you have set up a local BioSQL
database and loaded the NCBI taxonomy into the database. You can now
use simple SQL to retrieve all descendants of a node in the tree
given its NCBI taxon ID, such as:

	SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, taxon n
	WHERE
	    n.ncbi_taxon_id = :taxonID
	AND tn.left_value > n.left_value
	AND tn.right_value < n.right_value
	AND tn.taxon_id = tnm.taxon_id
	AND tnm.name_class = 'scientific name'
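
The left_value/right_value comparison works because each node's
interval encloses the intervals of all its descendants.  A toy sketch
of the same test in plain Perl is below; the ids and interval numbers
are made up for illustration and do not come from any real taxonomy
load.

```perl
use strict;
use warnings;

# Toy nested-set tree. All values are invented for the example:
#   1 is the root, 2 and 5 are its children, 3 and 4 are children of 2.
my %taxon = (
    1 => { left => 1, right => 10 },
    2 => { left => 2, right => 7 },
    3 => { left => 3, right => 4 },
    4 => { left => 5, right => 6 },
    5 => { left => 8, right => 9 },
);

# Return the ids whose interval lies strictly inside the interval of
# $id, which is the same condition as in the WHERE clause above.
sub descendants_by_interval {
    my ( $taxa, $id ) = @_;
    my $n = $taxa->{$id} or return;
    return sort { $a <=> $b }
        grep {
               $taxa->{$_}{left} > $n->{left}
            && $taxa->{$_}{right} < $n->{right}
        } keys %$taxa;
}

print join( ',', descendants_by_interval( \%taxon, 2 ) ), "\n";
```

No recursion is needed: one pass over the rows (or one indexed range
scan in the database) finds the whole subtree.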

BioPerl doesn't have a Taxonomy::biosql module yet (though this would  
seem like a worthwhile thing to add), so you can't use the  
Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the
NCBI taxonomy database and indexes it, so you can simply use
Taxonomy::{get_taxon,get_all_Descendents} with the flatfile download
to achieve what you wanted to do in less than 5 lines of perl.

Although the recursive implementation of Taxonomy::get_all_Descendents()
won't be lightning fast, it may still be perfectly fine for your
application - are you sure it is not?
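
To give a feel for what a recursive traversal does, here is a
self-contained sketch over a toy parent-to-children hash.  It is not
the BioPerl implementation; the child ids are invented and the
all_descendants name is made up for the example.

```perl
use strict;
use warnings;

# Toy tree: parent id => list of child ids. 33090 is the id used as an
# example in this thread; the child ids are invented.
my %children = (
    33090 => [ 101, 102 ],
    102   => [103],
);

# Collect every descendant of $id, depth first: each child is followed
# immediately by its own descendants.
sub all_descendants {
    my ( $children, $id ) = @_;
    my @out;
    for my $child ( @{ $children->{$id} || [] } ) {
        push @out, $child, all_descendants( $children, $child );
    }
    return @out;
}

my @all = all_descendants( \%children, 33090 );

# Leaves are the descendants that have no children of their own.
my @leaves = grep { !@{ $children{$_} || [] } } @all;

print "all:    @all\n";
print "leaves: @leaves\n";
```

Each call visits one node, so the cost grows with the size of the
subtree; for a modest subtree that is usually acceptable.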

	-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

> Thanks. And how can I assign the $node here in the below code, such  
> that I can reference it to a particular taxon id record? I want to  
> retrieve all the descendents from the taxonomy hierarchy, given a  
> particular taxon id.
>
>   I have a local db setup, in which I have uploaded data using the  
> load_ncbi_taxonomy.pl script.
>
>   Thanks.
>   George
>
>   Jason Stajich <jason at bioperl.org> wrote:
>     I assume you already figured out how to setup a local taxonomydb?
>
>
>   You just want the extant species/leaves of the tree
>
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descedents;
>
>
>
>   -jason
>     On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
>     Hi all,
>
>
>     Can anyone point me to some example that uses the  
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at  
> this, and I am not quite sure how to implement it.
>
>
>     Thanks.
>     George
>
>
>   Sendu Bala <bix at sendu.me.uk> wrote:
>     George Heller wrote:
>     Hi all,
>
>
>   I am looking at extracting the taxonomy hierarchy for some taxon  
> ids.
>   What I plan to do is, for a given taxon id, say 33090, I want to
>   extract all taxon ids that are children of this species. I do not
>   just want the immediate children, but the children's children and so
>   on.
>
>
>   Any ideas on the way I can go about doing this?
>
>
>   Well, you'll use Bio::DB::Taxonomy presumably, and  
> each_Descendent in
>   some kind of looping structure. Most easily a recursing sub.
>
>
>   If you happen to code up something neat and efficient, why not  
> share it
>   with us and we could add it to the Taxonomy module(s).
>
>
>
>
>
>
>   ---------------------------------
>   Shape Yahoo! in your own image.  Join our Network Research Panel  
> today!
>   _______________________________________________
>   Bioperl-l mailing list
>   Bioperl-l at lists.open-bio.org
>   http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>     --
>   Jason Stajich
>   jason at bioperl.org
>   http://jason.open-bio.org/
>
>
>
>
>
>
>
> ---------------------------------
> Need a vacation? Get great deals to amazing places on Yahoo! Travel.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Mon Jun 18 12:44:22 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:44:22 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
Message-ID: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>

Just curious - how do you cvs commit then to an external repository?  
Is that open in the firewall?

It is true though that corporations typically will not permit any  
encrypted outgoing traffic through their firewall except https.  
sf.net only supports https for svn, AFAIK.

	-hilmar

On Jun 17, 2007, at 10:34 PM, aaron.j.mackey at gsk.com wrote:

>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>>
>>> As for access, the typical access is over http (or https).
>>
>> We're using svn+ssh here (NESCent)
>
> Let me just note that https is preferable to ssh for those poor slobs
> stuck behind a corporate firewall (svn happily prompts me for my proxy
> server's user/pass, then my https authentication realm's user/pass  
> - all
> then get cached in some .svn/ file that I don't have to worry about  
> again
> until my proxy server password changes once a month ...)
>
> -Aaron
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Mon Jun 18 12:47:56 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 08:47:56 -0400
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID: <B9BDBD4A-962D-4E83-8151-5D6EA8B69D3B@gmx.net>

Sounds like a great idea to me. -hilmar

On Jun 18, 2007, at 6:44 AM, Sendu Bala wrote:

> When the test suite runs currently, most (the intent is all) tests  
> skip
> if the test would require network (internet) access. This is to avoid
> tests failing not due to bugs in Bioperl code, but due to temporarily
> inaccessible servers. This is also to make running the test suite  
> faster.
>
> To do a complete test you currently have to set BIOPERLDEBUG to true,
> which activates the network test but also increases verbosity. This
> actually causes a problem, since when running the entire test suite  
> the
> additional debug information is more a hindrance than a help, since  
> the
> reams of printed information can hide significant warnings that may  
> also
> get printed. It's also ugly.
>
> The solution is to divorce activation of network tests from the  
> request
> for verbosity. The obvious implementation is to have another  
> environment
> variable, perhaps BIOPERLNETWORK. However, there is an opportunity  
> to do
> something more appropriate. The running of networking tests should  
> be a
> choice given to every end-user installing Bioperl. Debugging
> information, on the other hand, is only of interest to the developer
> working on a specific module under test, so can be left as a 'hidden'
> env var.
>
>
> I have just committed one possible implementation along these lines.
>
> You say:
> perl Build.PL
> as normal, and if you seem to have internet access it asks you if  
> you'd
> like to run network tests. The default answer is no. If you answer  
> yes,
> network tests will be enabled.
>
> You can alternatively say:
> perl Build.PL --network
> and if you seem to have internet access, network tests will be  
> enabled.
>
> Then you run the tests:
> ./Build test
> Any tests written to support the new system will then skip network  
> tests
> if they haven't been enabled.
>
> The only test I've written to support the new system is
> t/RemoteBlast.t:
> ./Build test --test_files t/RemoteBlast.t --verbose
>
>
> Adding support to test scripts consists of the following changes:
>
> + use Module::Build;
> + my $build = Module::Build->current(get_options => { network => {} });
> + my $do_network_tests = $build->notes('network');
>
> ! if (!$ENV{'BIOPERLDEBUG'}) { # skip network tests
> ---
> ! if (!$do_network_tests) { # skip network tests
>
>
> I propose adding this support to all test scripts that carry out  
> network
> tests. Does anyone have objections? Does anyone have alternate
> implementations that may be superior?
>
> I specifically suggest we don't use an env var in addition to the  
> above,
> because the multiple ways of doing things could lead to confusion.  
> Which
> takes priority? Did a user really have the networking tests turned on
> when he reported his test results?
>
>
> The one thing I need help with is identifying which tests attempt to
> access the internet. I think we caught most of them for the 1.5.2
> release, but I think there are more lurking around. Can anyone offer a
> way to systematically find at least the test scripts which access the
> internet, if not the specific tests within?
>
> Cheers,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
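
For reference, the changes listed above might look roughly like this in a complete test script. This is only a minimal sketch: the eval wrapper and the remote_lookup() placeholder are assumptions added for illustration and are not part of the committed code.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Ask Module::Build whether network tests were enabled when Build.PL ran.
# The eval keeps the script usable outside a build directory, where
# Module::Build->current() dies; in that case network tests stay off.
my $do_network_tests = eval {
    require Module::Build;
    my $build = Module::Build->current(get_options => { network => {} });
    $build->notes('network');
} || 0;

# Hypothetical stand-in for a call that would contact a remote server.
sub remote_lookup { return 1 }

ok(1, 'local test always runs');

SKIP: {
    skip('network tests have not been requested', 1) unless $do_network_tests;
    ok(remote_lookup(), 'remote server answered');
}
```

The SKIP block keeps the test count constant whether or not network tests were requested, so the plan does not need to change.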

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 18 12:55:53 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 07:55:53 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
Message-ID: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>

On Jun 18, 2007, at 7:44 AM, Hilmar Lapp wrote:

> Just curious - how do you cvs commit then to an external repository?
> Is that open in the firewall?
>
> It is true though that corporations typically will not permit any
> encrypted outgoing traffic through their firewall except https.
> sf.net only supports https for svn, AFAIK.
>
> 	-hilmar

If so it may be better to allow https, though I don't know how Chris  
D. and others feel about it.

Did we make a decision as to the fate of cvs if we get svn
up-and-running?  Keep it around (assuming svn commits would be carried
over to cvs and vice versa)?  Or see what happens over time?

chris


From sdavis2 at mail.nih.gov  Mon Jun 18 13:05:50 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 18 Jun 2007 09:05:50 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
Message-ID: <4676832E.5080704@mail.nih.gov>

aaron.j.mackey at gsk.com wrote:
>> On Jun 16, 2007, at 7:21 AM, Sean Davis wrote:
>>
>>> As for access, the typical access is over http (or https).
>> We're using svn+ssh here (NESCent)
> 
> Let me just note that https is preferable to ssh for those poor slobs 
> stuck behind a corporate firewall (svn happily prompts me for my proxy 
> server's user/pass, then my https authentication realm's user/pass - all 
> then get cached in some .svn/ file that I don't have to worry about again 
> until my proxy server password changes once a month ...)

That would be my suggestion as well (although I added it only
parenthetically).

Sean


From hlapp at gmx.net  Mon Jun 18 13:13:27 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 09:13:27 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
Message-ID: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net>


On Jun 18, 2007, at 8:55 AM, Chris Fields wrote:

> Did we make a decision as to the fate of cvs if we get svn up-and- 
> running?  Keep it around (assuming svn commits would be carried  
> over to cvs and vice versa)?  Or see what happens over time?

Let's not plan for having cvs and svn writable repositories in  
parallel - that would create an administrative nightmare. Once the  
tests complete, there'll be a clean cut-over.

What Jason suggested is to try and continue a read-only (anonymous)  
cvs repository, updated from the svn repository that the developers  
use, aside from an anonymous svn repository mirroring the writable  
one. This would primarily be for maintaining working URLs for those  
folks who http-linked into the anonymous cvs repository. What I added  
earlier is that even if that fails to be feasible, you can achieve  
the goal using some small CGI script and apache redirect to map CVS- 
style links to the anonymous svn repository.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 18 13:31:35 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 08:31:35 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>
	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<6759E154-CB02-4D76-8AB8-26C19FB952D2@gmx.net>
Message-ID: <0E64DBD0-BBE9-411A-A146-70236EF558BB@uiuc.edu>


On Jun 18, 2007, at 8:13 AM, Hilmar Lapp wrote:

>
> On Jun 18, 2007, at 8:55 AM, Chris Fields wrote:
>
>> Did we make a decision as to the fate of cvs if we get svn up-and- 
>> running?  Keep it around (assuming svn commits would be carried  
>> over to cvs and vice versa)?  Or see what happens over time?
>
> Let's not plan for having cvs and svn writable repositories in  
> parallel - that would create an administrative nightmare. Once the  
> tests complete, there'll be a clean cut-over.

My thoughts as well.  Much simpler.

> What Jason suggested is to try and continue a read-only (anonymous)  
> cvs repository, updated from the svn repository that the developers  
> use, aside from an anonymous svn repository mirroring the writable  
> one. This would primarily be for maintaining working URLs for those  
> folks who http-linked into the anonymous cvs repository. What I  
> added earlier is that even if that fails to be feasible, you can  
> achieve the goal using some small CGI script and apache redirect to  
> map CVS-style links to the anonymous svn repository.
>
> 	-hilmar

I like the idea of a read-only cvs or a 'faux' cvs, though the former  
would initially be easier as we already have it available.  We could  
just lock it down at some switchover point to read-only (something I  
think Jason also suggested).

chris


From bix at sendu.me.uk  Mon Jun 18 13:13:33 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 14:13:33 +0100
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
Message-ID: <467684FD.3080300@sendu.me.uk>

Chris Fields wrote:
> 
> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>> If it's going to be difficult and a hassle, for such an unnecessary 
>> thing I'm not sure its worth it. There are more pressing things to be 
>> done for Bioperl.
>>
>> If I can just run perltidy on the entire package and commit, I'd do 
>> it. If that's not appropriate, I won't.
> 
> The choices aren't necessarily all or nothing.  What about voluntary, 
> recommended use of a perltidy config file included with the 
> distribution, with additional 'caveats'?

I'm happy with that idea. Why not come up with something and make it 
available for us to try out?


Cheers,
Sendu.


From bix at sendu.me.uk  Mon Jun 18 13:26:36 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 14:26:36 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
Message-ID: <4676880C.9030009@sendu.me.uk>

Chris Fields wrote:
> If so it may be better to allow https, though I don't know how Chris  
> D. and others feel about it.

If it makes no difference to me as an end-user, I won't mind. But I 
won't want to enter my password even once, at the beginning of a 
session. If that's not possible with https, then ssh should be an option 
as well.


Unrelated, but it randomly just occurred to me: what happens to all the 
id lines at the top of modules? Eg:

$Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $

That's a cvs-specific thing, right? Do we delete them all? (Regardless, 
I wish we would, since they caused me no end of hassles during the 1.5.2 
release, doing updates across branches.)
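
If the answer turns out to be "neutralise them" rather than "delete them", something along these lines could collapse the expanded keywords. It is only a sketch; the sub name is made up, and it has been tried on nothing but the example line above.

```perl
use strict;
use warnings;

# Collapse an expanded CVS keyword such as
#   $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $
# back to the bare, unexpanded form '$Id$'.
sub collapse_id_keyword {
    my ($text) = @_;
    (my $out = $text) =~ s/\$Id:[^\$\n]*\$/\$Id\$/g;
    return $out;
}

my $line = '# $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $';
print collapse_id_keyword($line), "\n";    # prints: # $Id$
```

A collapsed keyword no longer differs between branches, which is what caused the trouble when merging.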


> Did we make a decision as to the fate of cvs if we get svn up-and- 
> running?  Keep it around (assuming svn commits would be carried over  
> to cvs and vice versa)?  Or see what happens over time?

Well, I don't think hard decisions are possible until we know how it's 
going to work in practice. I tried setting up my own svn repository 
once, but didn't keep it and can't remember much about it.

So, I suppose we'll play it by ear and decide things later. Is someone 
out there actively doing something leading toward a demonstration of how 
it will be?


From cjfields at uiuc.edu  Mon Jun 18 13:58:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 08:58:34 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <467684FD.3080300@sendu.me.uk>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
	<467684FD.3080300@sendu.me.uk>
Message-ID: <DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>


On Jun 18, 2007, at 8:13 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>> On Jun 15, 2007, at 5:07 AM, Sendu Bala wrote:
>>> If it's going to be difficult and a hassle, for such an unnecessary
>>> thing I'm not sure its worth it. There are more pressing things  
>>> to be
>>> done for Bioperl.
>>>
>>> If I can just run perltidy on the entire package and commit, I'd do
>>> it. If that's not appropriate, I won't.
>>
>> The choices aren't necessarily all or nothing.  What about voluntary,
>> recommended use of a perltidy config file included with the
>> distribution, with additional 'caveats'?
>
> I'm happy with that idea. Why not come up with something and make it
> available for us to try out?
>
>
> Cheers,
> Sendu.

Will do.  Maybe something that conforms to PBP; there's a PBP  
perltidy config on perlmonks, along with some emacs/vim related bits:

http://www.perlmonks.org/?node_id=516501

chris


From sdavis2 at mail.nih.gov  Mon Jun 18 14:03:35 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Mon, 18 Jun 2007 10:03:35 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4676880C.9030009@sendu.me.uk>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<4676880C.9030009@sendu.me.uk>
Message-ID: <467690B7.7090105@mail.nih.gov>

Sendu Bala wrote:
> Chris Fields wrote:
>> If so it may be better to allow https, though I don't know how Chris  
>> D. and others feel about it.
> 
> If it makes no difference to me as an end-user, I won't mind. But I 
> won't want to enter my password even once, at the beginning of a 
> session. If that's not possible with https, then ssh should be an option 
> as well.
> 
> 
> Unrelated, but it randomly just occurred to me: what happens to all the 
> id lines at the top of modules? Eg:
> 
> $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $
> 
> That's a cvs-specific thing, right? Do we delete them all? (Regardless, 
> I wish we would, since they caused me no end of hassles during the 1.5.2 
> release, doing updates across branches.)

See here:

http://svnbook.red-bean.com/en/1.0/ch07s02.html

Check out the section at the bottom having to do with svn:keywords.

Sean


From akarger at CGR.Harvard.edu  Mon Jun 18 14:10:57 2007
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 18 Jun 2007 10:10:57 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <46751EC7.8020609@sheffield.ac.uk>
References: <10517356.1182013758160.JavaMail.myubc2@brahms.my.ubc.ca>
	<46751EC7.8020609@sheffield.ac.uk>
Message-ID: <B9182BFF5B004245BABC12956EA6322E04AFA6BC@huls5.nucleus.harvard.edu>

 
> Just to clarify, subversion is available as command line for windows:
> http://subversion.tigris.org/project_packages.html
> 
> TortoiseSVN is another svn client with a GUI that integrates into the
> shell. I tried setting this up a while back to use ssh (via 
> PUTTY), but
> I wasn't successful. This may have been due to me just 
> starting out with
> svn or that it was harder to setup in an earlier version of 
> TortoiseSVN.
> 
> Does anyone have experience of setting up svn on Windows to 
> use ssh? If
> the changeover takes place, I'm happy to write some howto's 
> for setting
> up svn clients for Windows.

Here are some notes I wrote recently. I'm using this with command-line
svn, not TortoiseSVN. I would hope that it would work with Tortoise,
too, but I can't guarantee.

1. Run PuTTYgen (installed with PuTTY, probably in Start
menu->Programs->PuTTY) and follow directions to create a private key
file like C:\someplace\private_key.ppk and a public key. At this point,
you'll pick an ssh password, which is separate from your login password.

2. Get an account with the appropriate .ssh/authorized_keys file on the
host machine. (This is not Windows-specific. By the way, if you change
the lines of the authorized_keys file to start with, e.g., 
	command="svnserve -t -r /main/repos/dir",no-pty ssh-rsa AAAAB... comment
then (a) you're more secure because users can't open a real shell on the
computer, and (b) users don't need to type the repository directory in
their svn co commands.)

3. Set your environment variables (My Computer->Properties, Advanced
tab, click on Environment Variables. In the top half ("User variables
for ..."), click "New" and put in the variable name and value).

3a. Set the SVN_EDITOR environment variable to your favorite editor,
such as vim or emacs, or a full path to some other editor. If it's not
set, then either VISUAL or EDITOR must be set.

3b. Set the SVN_SSH environment variable to run PuTTY's "plink" program,
which is the Windows equivalent of command-line ssh. If you installed
PuTTY in the default location, set it to "C:/Program
Files/PuTTY/plink.exe". Note 1: use FORWARD slashes. Note 2: Include the
quotation marks in the environment variable.

4. When you want to start using svn, you'll need to run Pageant (Start
menu->Programs->PuTTY), select "Add Key", browse to your private key
file, and enter the ssh password you chose in step 1 (not your login
password). Pageant will stay running until you quit it or logout, so you
can have multiple svn checkins etc., and you only need to type in your
password once.

5. Now just run command-line svn commands the same way you would on UNIX
(modulo Windows' brain-dead shell).
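
A quick sanity check of steps 3a and 3b could be scripted along these lines. This is a rough sketch only: the sub name and the messages are invented, and it checks nothing beyond the rules written in the notes above.

```perl
use strict;
use warnings;

# Return a list of human-readable problems with the svn-related
# environment described in steps 3a and 3b; an empty list means OK.
sub check_svn_env {
    my (%env) = @_;
    my @problems;

    # svn needs an editor from one of these variables for commit messages.
    push @problems, 'set SVN_EDITOR (or VISUAL or EDITOR)'
        unless grep { defined $env{$_} && length $env{$_} }
               qw(SVN_EDITOR VISUAL EDITOR);

    my $ssh = $env{SVN_SSH};
    if (!defined $ssh || !length $ssh) {
        push @problems, 'set SVN_SSH to the plink program';
    }
    else {
        push @problems, 'use forward slashes in SVN_SSH' if $ssh =~ /\\/;
        push @problems, 'wrap a path containing spaces in quotation marks'
            if $ssh =~ /\s/ && $ssh !~ /^".*"$/;
    }
    return @problems;
}

my @problems = check_svn_env(%ENV);
print @problems
    ? join("\n", map { "Problem: $_" } @problems) . "\n"
    : "svn environment looks OK\n";
```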

-Amir Karger


From cjfields at uiuc.edu  Mon Jun 18 14:24:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 09:24:00 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <4676880C.9030009@sendu.me.uk>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<4676880C.9030009@sendu.me.uk>
Message-ID: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>

On Jun 18, 2007, at 8:26 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> If so it may be better to allow https, though I don't know how  
>> Chris  D. and others feel about it.
>
> If it makes no difference to me as an end-user, I won't mind. But I  
> won't want to enter my password even once, at the beginning of a  
> session. If that's not possible with https, then ssh should be an  
> option as well.

Aaron pointed out in a related post that https access is the  
preferred option behind a corporate firewall (svn prompts for proxy  
user/pass, then caches it).  Not sure how Jason/Hilmar/Chris D. feel  
about https or supporting both https+ssh.

...

>> Did we make a decision as to the fate of cvs if we get svn up-and-  
>> running?  Keep it around (assuming svn commits would be carried  
>> over  to cvs and vice versa)?  Or see what happens over time?
>
> Well, I don't think hard decisions are possible until we know how  
> it's going to work in practice. I tried setting up my own svn  
> repository once, but didn't keep it and can't remember much about it.

Agree; we'll need to work out specifics once we know how things work  
out using cvs2svn.  I think the idea is to test using a smaller  
distribution (maybe network or db) and move up from there.

> So, I suppose we'll play it by ear and decide things later. Is  
> someone out there actively doing something leading toward a  
> demonstration of how it will be?

George Hartzell is going to test it out, I believe, and will post  
something when he can.

chris


From dmessina at wustl.edu  Mon Jun 18 14:54:31 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 18 Jun 2007 09:54:31 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
	<467684FD.3080300@sendu.me.uk>
	<DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>
Message-ID: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu>

[Chris F]
> Will do.  Maybe something that conforms to PBP; there's a PBP
> perltidy config on perlmonks, along with some emacs/vim related bits:
>
> http://www.perlmonks.org/?node_id=516501


FYI, perltidy now has a built-in -pbp flag:

[from perltidy-20070508]
> -pbp, --perl-best-practices
> -pbp is an abbreviation for the parameters in the book Perl Best  
> Practices by Damian Conway:
>
>     -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1  
> -nsfs -nolq
>     -wbb="% + - * / x != == >= <= =~ !~ < > | & =
>           **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x="
> Note that the -st and -se flags make perltidy act as a filter on  
> one file only. These can be overridden with -nst and -nse if  
> necessary.
>
[full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/bin/perltidy]


Dave


From dmessina at wustl.edu  Mon Jun 18 15:04:10 2007
From: dmessina at wustl.edu (David Messina)
Date: Mon, 18 Jun 2007 10:04:10 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <467661F0.2060703@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
Message-ID: <C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>

Awesome, Sendu! Really glad you implemented this.


> Can anyone offer a
> way to systematically find at least the test scripts which access the
> internet, if not the specific tests within?

I think tests would be accessing the net indirectly through a BioPerl  
module (which may also be using indirect access), so it'd be hard to  
come up with a universal glob for that.

However:

	% grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l
	108

	% ls -1 bioperl-live/t | wc -l
	248

Less than half of the test files use BIOPERLDEBUG, so that narrows  
down the possibilities...
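
The same search can be done from Perl if the list of files is wanted inside a script. A minimal sketch follows; the sub name is made up, and it can only find scripts that mention the variable at all.

```perl
use strict;
use warnings;
use File::Find;

# Return the sorted list of files under $dir whose contents match $pattern.
sub files_matching {
    my ($dir, $pattern) = @_;
    my @hits;
    find(
        {
            no_chdir => 1,    # keep $_ as the full path of each file
            wanted   => sub {
                return unless -f $_;
                open my $fh, '<', $_ or return;    # skip unreadable files
                while (my $line = <$fh>) {
                    if ($line =~ $pattern) { push @hits, $_; last; }
                }
                close $fh;
            },
        },
        $dir
    );
    return sort @hits;
}

# Directory defaults to the path used in the grep command above.
my $dir = @ARGV ? $ARGV[0] : 'bioperl-live/t';
if (-d $dir) {
    print "$_\n" for files_matching($dir, qr/BIOPERLDEBUG/);
}
```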

Dave


From bix at sendu.me.uk  Mon Jun 18 15:09:19 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 16:09:19 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
Message-ID: <4676A01F.30205@sendu.me.uk>

David Messina wrote:
>> Can anyone offer a
>> way to systematically find at least the test scripts which access the
>> internet, if not the specific tests within?
> 
> I think tests would be accessing the net indirectly through a BioPerl 
> module (which may also be using indirect access), so it'd be hard to 
> come up with a universal glob for that.
> 
> However:
> 
>     % grep -Rl BIOPERLDEBUG bioperl-live/t | wc -l
>     108
> 
>     % ls -1 bioperl-live/t | wc -l
>     248
> 
> Less than half of the test files use BIOPERLDEBUG, so that narrows down 
> the possibilities...

Not necessarily. The problem is that there may be test scripts that have 
never even tried to skip network tests, and therefore don't use 
BIOPERLDEBUG. (Or that chose their own way to decide when to skip.)

I was thinking along the lines of, does anyone know how to monitor 
accesses to the network card (or equivalent), getting information on 
which program (test script) requested the access?


From cjfields at uiuc.edu  Mon Jun 18 15:41:28 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 10:41:28 -0500
Subject: [Bioperl-l] SVN and ...Re:  Perltidy
In-Reply-To: <67E635BD-FC19-4046-949B-358B671299E6@wustl.edu>
References: <46710BC4.3060302@sendu.me.uk>
	<46716003.2030302@sendu.me.uk>	<4671703F.4010109@sheffield.ac.uk>
	<467177AC.8060104@sendu.me.uk>	<ebf5eb170706141038s10415f21tb421470e53e62268@mail.gmail.com>	<BEB29F2B-743A-441B-8087-FB8E33CD069D@uiuc.edu>	<CE09FC06-196E-4AB9-98E3-FA498A2DD410@bioperl.org>
	<C5A3A21D-E7FF-42F1-85F8-6704D74DBCF9@uiuc.edu>
	<467264C8.4020202@sendu.me.uk>
	<3B091467-8776-40AB-8A8E-DCB81A2B3252@uiuc.edu>
	<467684FD.3080300@sendu.me.uk>
	<DB6121AD-2F8A-4FEF-8CA9-822777C4BFF4@uiuc.edu>
	<67E635BD-FC19-4046-949B-358B671299E6@wustl.edu>
Message-ID: <B3EDFCDD-0F3D-47C8-B3A8-A428F24B265A@uiuc.edu>


On Jun 18, 2007, at 9:54 AM, David Messina wrote:

> [Chris F]
>> Will do.  Maybe something that conforms to PBP; there's a PBP
>> perltidy config on perlmonks, along with some emacs/vim related bits:
>>
>> http://www.perlmonks.org/?node_id=516501
>
>
> FYI, perltidy now has a built-in -pbp flag:
>
> [from perltidy-20070508]
>> -pbp, --perl-best-practices
>> -pbp is an abbreviation for the parameters in the book Perl Best
>> Practices by Damian Conway:
>>
>>     -l=78 -i=4 -ci=4 -st -se -vt=2 -cti=0 -pt=1 -bt=1 -sbt=1 -bbt=1
>> -nsfs -nolq
>>     -wbb="% + - * / x != == >= <= =~ !~ < > | & =
>>           **= += *= &= <<= &&= -= /= |= >>= ||= //= .= %= ^= x="
>> Note that the -st and -se flags make perltidy act as a filter on
>> one file only. These can be overridden with -nst and -nse if
>> necessary.
>>
> [full docs here: http://search.cpan.org/~shancock/Perl-Tidy-20070508/bin/perltidy]
>
>
> Dave

<slaps head>  Makes sense that would eventually be incorporated.

If so there's no need to include a config (unless we want to sway  
away from PBP-style).  We can just recommend everyone use that setting.

chris


From cjfields at uiuc.edu  Mon Jun 18 16:06:26 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 11:06:26 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676A01F.30205@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
Message-ID: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>


On Jun 18, 2007, at 10:09 AM, Sendu Bala wrote:

> David Messina wrote:
>>> ...
>> Less than half of the test files use BIOPERLDEBUG, so that narrows  
>> down
>> the possibilities...
>
> Not necessarily. The problem is that there may be test scripts that  
> have
> never even tried to skip network tests, and therefore don't use
> BIOPERLDEBUG. (Or that chose their own way to decide when to skip.)
>
> I was thinking along the lines of, does anyone know how to monitor
> accesses to the network card (or equivalent), getting information on
> which program (test script) requested the access?

EUtilities.t uses network tests predominately.  I'll switch over when  
I commit everything from the overhaul.

Couldn't you enable BIOPERLDEBUG, disable network access, then  
iterate through tests checking for those which fail or skip?  I think  
Test::Harness has a way to do this, using execute_tests().

chris


From bix at sendu.me.uk  Mon Jun 18 16:34:38 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 18 Jun 2007 17:34:38 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
	<082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
Message-ID: <4676B41E.3050706@sendu.me.uk>

Chris Fields wrote:
> Couldn't you enable BIOPERLDEBUG, disable network access, then iterate 
> through tests checking for those which fail or skip?

Yes, good idea, though my dev machine is also my email/webserver so I'd 
rather come up with an alternate solution than one involving 'disable 
network access'.

Still, that's what I'll probably end up doing. Cheers!


Oh, Chris, Spiros, how goes the Test::More conversion? I might want to 
wait for you to finish, or join in? If you're not going to have time to 
do any more in the next few weeks, can you please update 
http://www.bioperl.org/wiki/TestMoreProgress removing your name (or in 
the opposite case, add your name in)? It's not quite clear to me which 
tests are assigned to whom. Can someone clarify what the markings mean?

Cheers,
Sendu.


From cjfields at uiuc.edu  Mon Jun 18 16:43:31 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 11:43:31 -0500
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676B41E.3050706@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
	<082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
	<4676B41E.3050706@sendu.me.uk>
Message-ID: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>


On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> Couldn't you enable BIOPERLDEBUG, disable network access, then  
>> iterate through tests checking for those which fail or skip?
>
> Yes, good idea, though my dev machine is also my email/webserver so  
> I'd rather come up with an alternate solution than one involving  
> 'disable network access'.
>
> Still, that's what I'll probably end up doing. Cheers!
>
>
> Oh, Chris, Spiros, how goes the Test::More conversion? I might want  
> to wait for you to finish, or join in? If you're not going to have  
> time to do any more in the next few weeks, can you please update  
> http://www.bioperl.org/wiki/TestMoreProgress removing your name (or  
> in the opposite case, add your name in)? It's not quite clear to me  
> which tests are assigned to whom. Can someone clarify what the  
> markings mean?
>
> Cheers,
> Sendu.

Not sure how far along Spiros is; I handed it over after I finished  
up to the 'Q' tests.  In general, the ones marked out have been  
converted over, and the ones with names next to them have been claimed.  
If you need help I'll probably start back up again to finish them off;  
we just need to divvy them up.

chris


From george.heller at yahoo.com  Mon Jun 18 17:07:59 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 10:07:59 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <12C87737-B6D5-46E5-BCF4-5388A949009B@gmx.net>
Message-ID: <218165.62089.qm@web56505.mail.re3.yahoo.com>

What exactly is the "node n" in the query below? When I issue this
query, it says:

  relation "node" does not exist.

I tried to use the get_all_Descendents method, but it looks like it
calls the method each_Descendent in order to recurse. That method is
not implemented in Bio::DB::Taxonomy; its body is a single line:

  shift->throw_not_implemented();

Thanks.
George.

Hilmar Lapp <hlapp at gmx.net> wrote:
  I'm a bit confused - it sounds like you have set up a local BioSQL 
database and loaded the NCBI taxonomy into the database. You can now 
use simple SQL to retrieve all descendants of a node in the tree 
given its NCBI taxonID such as

SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
WHERE
n.ncbi_taxon_id = :taxonID
AND tn.left_value > n. left_value
AND tn.right_value < n.right_value
AND tn.taxon_id = tnm.taxon_id
AND tn.name_class = 'scientific_name'

BioPerl doesn't have a Taxonomy::biosql module yet (though this would 
seem like a worthwhile thing to add), so you can't use the 
Bio::DB::Taxonomy interface to do this against a BioSQL instance.

However, BioPerl does have support for the flat-file download of the 
NCBI taxonomy database and indexes it, so you can simply use 
Taxonomy::{get_taxon,get_all_Descendants} using the flatfile download 
to achieve what you wanted to do in a less than 5 lines of perl.

Although the recursive implementation of Taxonomy::get_all_Descendants 
() won't be lightning fast, it may still be perfectly fine for your 
application - are you sure it is not?

-hilmar

On Jun 18, 2007, at 12:21 AM, George Heller wrote:

> Thanks. And how can I assign the $node here in the below code, such 
> that I can reference it to a particular taxon id record? I want to 
> retrieve all the descendents from the taxonomy hierarchy, given a 
> particular taxon id.
>
> I have a local db setup, in which I have uploaded data using the 
> load_ncbi_taxonomy.pl script.
>
> Thanks.
> George
>
> Jason Stajich wrote:
> I assume you already figured out how to setup a local taxonomydb?
>
>
> You just want the extant species/leaves of the tree
>
>
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>
>
> -jason
> On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>
> Hi all,
>
>
> Can anyone point me to some example that uses the 
> get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at 
> this, and I am not quite sure how to implement it.
>
>
> Thanks.
> George
>
>
> Sendu Bala wrote:
> George Heller wrote:
> Hi all,
>
>
> I am looking at extracting the taxonomy hierarchy for some taxon 
> ids.
> What I plan to do is, for a given taxon id, say 33090, I want to
> extract all taxon ids that are children of this species. I do not
> just want the immediate children, but the children's children and so
> on.
>
>
> Any ideas on the way I can go about doing this?
>
>
> Well, you'll use Bio::DB::Taxonomy presumably, and 
> each_Descendent in
> some kind of looping structure. Most easily a recursing sub.
>
>
> If you happen to code up something neat and efficient, why not 
> share it
> with us and we could add it to the Taxonomy module(s).
>
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/

-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================




From jason at bioperl.org  Mon Jun 18 17:53:28 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 10:53:28 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com>
References: <218165.62089.qm@web56505.mail.re3.yahoo.com>
Message-ID: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>

It is implemented in the implementing class - DB::Taxonomy is just  
the base class. For example see the flatfile implementation  
Bio::DB::Taxonomy::flatfile
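
The base-class / implementing-class split described here can be
sketched in plain Perl. This is only an illustration under stated
assumptions: the package names My::TaxonomyBase and
My::TaxonomyInMemory and the toy %children table are hypothetical, and
the sketch puts the recursion on the database object for brevity,
whereas BioPerl's own classes are organised differently. It shows why
calling the interface class directly fails while a concrete
implementation works.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an interface class (not BioPerl code).
package My::TaxonomyBase;

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

# Interface method: declared here, implemented only by subclasses.
sub each_Descendent {
    my ($self, $id) = @_;
    die ref($self) . " does not implement each_Descendent\n";
}

# Generic recursion written once; it relies on the subclass method.
sub get_all_Descendents {
    my ($self, $id) = @_;
    my @all;
    for my $child ($self->each_Descendent($id)) {
        push @all, $child, $self->get_all_Descendents($child);
    }
    return @all;
}

# Hypothetical concrete implementation backed by an in-memory hash.
package My::TaxonomyInMemory;
our @ISA = ('My::TaxonomyBase');

sub each_Descendent {
    my ($self, $id) = @_;
    return @{ $self->{children}{$id} || [] };
}

package main;

# Toy parent => children table (invented ids).
my %children = (1 => [2, 3], 2 => [4, 5], 3 => [6]);
my $db = My::TaxonomyInMemory->new(children => \%children);

my @descendants = $db->get_all_Descendents(1);
print join(',', @descendants), "\n";    # prints 2,4,5,3,6

# Calling the interface class directly dies, as George observed.
my $base_error =
    eval { My::TaxonomyBase->new->get_all_Descendents(1); 1 } ? '' : $@;
print $base_error;
```

The recursion is depth-first, so each child is followed immediately by
its own descendants.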

See scripts/taxa/local_taxonomydb_query.PLS for an example of using it;
the nodes and names files come from the NCBI taxonomy database.

Here is an un-debugged copy+paste for your question that *should* work.

use Bio::DB::Taxonomy;

my $idx_dir = '/tmp';   # directory where the flatfile index is written

# nodes.dmp and names.dmp come from the NCBI taxonomy dump
my ($nodesfile, $namesfile) = ('nodes.dmp', 'names.dmp');
my $db = Bio::DB::Taxonomy->new(-source    => 'flatfile',
                                -nodesfile => $nodesfile,
                                -namesfile => $namesfile,
                                -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
# keep only the leaves below the node
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;


-jason


--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From hlapp at gmx.net  Mon Jun 18 22:10:00 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:10:00 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>
References: <OFACBFC1C8.353997A3-ON852572FE.000DD469-852572FE.000E22E5@gsk.com>	<3F0D41AB-24CE-42E9-BDB8-0953AEB34066@gmx.net>
	<20B16E5E-3357-4F9C-A336-EB87C1AB92EF@uiuc.edu>
	<4676880C.9030009@sendu.me.uk>
	<278C510F-8873-4F3D-A399-955979DD3DA5@uiuc.edu>
Message-ID: <989DBD68-896E-4FB9-9413-4A1060E88ABD@gmx.net>

https is working fine for me for sf.net repositories, and I only have  
to enter the password upon first commit (since checkout doesn't even  
need a password).

	-hilmar

On Jun 18, 2007, at 10:24 AM, Chris Fields wrote:

> Not sure how Jason/Hilmar/Chris D. feel about https or supporting  
> both https+ssh

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From george.heller at yahoo.com  Mon Jun 18 22:18:21 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 15:18:21 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <6F7C6223-2DD7-46B4-A637-EC6AFFC6BDBC@bioperl.org>
Message-ID: <904670.24974.qm@web56513.mail.re3.yahoo.com>

I tried running the script mentioned below and I get the following error:

Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
Compilation failed in require at my.pl line 7.
BEGIN failed--compilation aborted at my.pl line 7.

My script looks something like this:

#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;

my $idx_dir = '/tmp';

my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodesfile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

foreach my $field (@extant_children) {
    print "$field";
    print "|";
    print "\n";
}

I am running the script using the command

  perl myscript.pl -v --names names.dmp --nodes nodes.dmp

and I have the nodes.dmp and names.dmp files in the current directory.

Thanks,
George



From hlapp at gmx.net  Mon Jun 18 22:27:19 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:27:19 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <218165.62089.qm@web56505.mail.re3.yahoo.com>
References: <218165.62089.qm@web56505.mail.re3.yahoo.com>
Message-ID: <DEB0D23B-4FEC-418A-8AAB-FF4CBF4DAF65@gmx.net>


On Jun 18, 2007, at 1:07 PM, George Heller wrote:

> What exactly is the "node n" in the query below. When I issue this  
> query, it says,

Sorry, replace "node" with "taxon" in the FROM clause. Jason answered the rest.
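
With that correction the query joins the taxon table to itself (aliases
tn and n) and compares left_value and right_value. The idea behind
those two comparisons can be sketched without a database. This is a
minimal illustration only: the rows, names and numbers below are
invented, and descendants_of is a hypothetical helper, not part of
BioSQL or BioPerl.

```perl
use strict;
use warnings;

# Toy rows shaped like nested-set columns of a taxon table (values invented).
my @taxon = (
    { id => 1, name => 'root',   left => 1, right => 10 },
    { id => 2, name => 'plants', left => 2, right => 7  },
    { id => 3, name => 'mosses', left => 3, right => 4  },
    { id => 4, name => 'ferns',  left => 5, right => 6  },
    { id => 5, name => 'fungi',  left => 8, right => 9  },
);

# Hypothetical helper: names of all rows nested strictly inside the
# given row, i.e. left > n.left AND right < n.right, as in the query.
sub descendants_of {
    my ($id) = @_;
    my ($n) = grep { $_->{id} == $id } @taxon;
    return () unless $n;
    return map  { $_->{name} }
           sort { $a->{left} <=> $b->{left} }
           grep { $_->{left} > $n->{left} && $_->{right} < $n->{right} }
           @taxon;
}

my @below_plants = descendants_of(2);
print join(',', @below_plants), "\n";    # prints mosses,ferns
```

Because every descendant's interval lies inside its ancestor's
interval, one range comparison returns the whole subtree with no
recursion.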

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Jun 18 22:33:40 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 18 Jun 2007 17:33:40 -0500
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <904670.24974.qm@web56513.mail.re3.yahoo.com>
References: <904670.24974.qm@web56513.mail.re3.yahoo.com>
Message-ID: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>

As the error implies, your local version of perl doesn't seem to support  
weak references, which means it doesn't have Scalar::Util (which was  
added to core after perl 5.6.1, I think).  Try installing  
Scalar::Util to see what happens.
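
A quick way to check this before reinstalling anything is a small probe
script. This is a sketch, not a definitive diagnosis: weaken_supported
is a hypothetical helper name, and the assumption (as far as I know) is
that a copy of Scalar::Util without working weak references fails when
weaken is imported or called. It also prints which Scalar/Util.pm file
perl actually loaded, which helps spot a stray copy shadowing the one
shipped with perl.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if Scalar::Util loads and weaken()
# really produces a weak reference, 0 otherwise.
sub weaken_supported {
    my $ok = eval {
        require Scalar::Util;
        Scalar::Util->import(qw(weaken isweak));
        my $target = { name => 'probe' };
        my $copy   = $target;
        Scalar::Util::weaken($copy);
        Scalar::Util::isweak($copy) ? 1 : 0;
    };
    return $ok ? 1 : 0;
}

my $supported = weaken_supported();
printf "perl %vd\n", $^V;
printf "Scalar::Util loaded from: %s\n",
    $INC{'Scalar/Util.pm'} || '(not found)';
printf "weak references: %s\n", $supported ? 'available' : 'NOT available';
```

Run it with the same interpreter the failing script uses (here
/usr/bin/perl) so that the same module search path applies.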

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Mon Jun 18 22:50:38 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 18 Jun 2007 18:50:38 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>
References: <904670.24974.qm@web56513.mail.re3.yahoo.com>
	<707D8821-68CB-40AE-A190-AD0F8F2C3915@uiuc.edu>
Message-ID: <F433CCB4-781D-480E-8EF5-CD68E70B27B8@gmx.net>

The perl version appears to be 5.8.5 though, so something strange  
appears to be going on.

George, can you please post the output of

	$ /usr/bin/perl -V

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:

> As the error implies your local version of perl doesn't seem support
> weak references, which means it doesn't have Scalar::Utils (which was
> added to core after perl 5.6.1, I think).  Try installing
> Scalar::Utils to see what happens.
>
> chris
>
> On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>
>> I tried running the below mentioned script and I seem to be getting
>> the following error:
>>
>>   Weak references are not implemented in the version of perl at /
>> usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
>> BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/
>> Bio/Tree/Node.pm line 76.
>> Compilation failed in require at my.pl line 7.
>> BEGIN failed--compilation aborted at my.pl line 7.
>>
>>   My script looks something like,
>>
>>   #!/usr/bin/perl
>>   use strict;
>> #use warnings;
>> use DBI;
>>   use Bio::Tree::Node;
>> use Bio::DB::Taxonomy;
>> use Bio::DB::Taxonomy::flatfile;
>>   my $idx_dir = '/tmp';
>>
>> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
>> my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
>>                                -nodesfile => $nodesfile,
>>                                -namesfile => $namesfile,
>>                                -directory => $idx_dir);
>>  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>>  my @extant_children = grep { $_->is_Leaf } $node-
>>> get_all_Descendents;
>>
>>       foreach $field (@extant_children) {
>>          print "$field";
>>          print "|";
>>          print "\n";
>>       }
>>
>>   And I am running the script using the command,
>>
>>   perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>>
>>   and I have the nodes.dmp and names.dmp files in the current
>> directory.
>>
>>   Thanks,
>>   George
>>
>>
>> Jason Stajich <jason at bioperl.org> wrote:
>>   It is implemented in the implementing class - DB::Taxonomy is
>> just the base class. For example see the flatfile implementation
>> Bio::DB::Taxonomy::flatfile
>>
>>   See the scripts/taxa/local_taxonomydb_query.PLS for example using
>> it:
>>   nodes and names are from NCBI taxonomy database.
>>
>>
>>   Here is an un-debugged copy+paste for your question that *should*
>> work.
>>
>>
>>   use Bio::DB::Taxonomy
>>   my $idx_dir = '/tmp';
>>
>>
>>   my ($nodefile,$namesfile) = ('nodes.dmp,'names.dmp');
>>     my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
>>                                  -nodesfile => $nodesfile,
>>                                  -namesfile => $namesfile,
>>                                  -directory => $idx_dir);
>>      my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>>  my @extant_children = grep { $_->is_Leaf } $node-
>>> get_all_Descendents;
>>
>>
>>
>>
>>   -jason
>>
>>     On Jun 18, 2007, at 10:07 AM, George Heller wrote:
>>
>>     What exactly is the "node n" in the query below. When I issue
>> this query, it says,
>>
>>
>>     relation "node" does not exist.
>>
>>
>>     I tried to use the get_all_Descendents method but it looks like
>> in order to do a recursive call it calls the method
>> each_Descendent. This method is not implemented in
>> Bio::DB::Taxonomy. It just has a single line,
>>
>>
>>     shift->throw_not_implemented();
>>
>>
>>     Thanks.
>>     George.
>>
>>
>>   Hilmar Lapp <hlapp at gmx.net> wrote:
>>     I'm a bit confused - it sounds like you have set up a local  
>> BioSQL
>>   database and loaded the NCBI taxonomy into the database. You can  
>> now
>>   use simple SQL to retrieve all descendants of a node in the tree
>>   given its NCBI taxonID such as
>>
>>
>>   SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
>>   WHERE
>>   n.ncbi_taxon_id = :taxonID
>>   AND tn.left_value > n. left_value
>>   AND tn.right_value < n.right_value
>>   AND tn.taxon_id = tnm.taxon_id
>>   AND tn.name_class = 'scientific_name'
>>
>>
>>   BioPerl doesn't have a Taxonomy::biosql module yet (though this would
>>   seem like a worthwhile thing to add), so you can't use the
>>   Bio::DB::Taxonomy interface to do this against a BioSQL instance.
>>
>>
>>   However, BioPerl does have support for the flat-file download of the
>>   NCBI taxonomy database and indexes it, so you can simply use
>>   Taxonomy::{get_taxon,get_all_Descendents} with the flatfile download
>>   to achieve what you wanted to do in fewer than 5 lines of Perl.
>>
>>
>>   Although the recursive implementation of Taxonomy::get_all_Descendents()
>>   won't be lightning fast, it may still be perfectly fine for your
>>   application - are you sure it is not?
>>
>>
>>   -hilmar
>>
>>
>>   On Jun 18, 2007, at 12:21 AM, George Heller wrote:
>>
>>
>>     Thanks. And how can I assign the $node in the code below, such
>>   that I can reference it to a particular taxon id record? I want to
>>   retrieve all the descendants from the taxonomy hierarchy, given a
>>   particular taxon id.
>>
>>
>>   I have a local db setup, in which I have uploaded data using the
>>   load_ncbi_taxonomy.pl script.
>>
>>
>>   Thanks.
>>   George
>>
>>
>>   Jason Stajich wrote:
>>   I assume you already figured out how to setup a local taxonomydb?
>>
>>
>>
>>
>>   You just want the extant species/leaves of the tree
>>
>>
>>
>>
>>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>
>>
>>
>>
>>
>>
>>   -jason
>>   On Jun 17, 2007, at 11:41 AM, George Heller wrote:
>>
>>
>>   Hi all,
>>
>>
>>
>>
>>   Can anyone point me to some example that uses the
>>   get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
>>   this, and I am not quite sure how to implement it.
>>
>>
>>
>>
>>   Thanks.
>>   George
>>
>>
>>
>>
>>   Sendu Bala wrote:
>>   George Heller wrote:
>>   Hi all,
>>
>>
>>
>>
>>   I am looking at extracting the taxonomy hierarchy for some taxon ids.
>>   What I plan to do is, for a given taxon id, say 33090, I want to
>>   extract all taxon ids that are children of this species. I do not
>>   just want the immediate children, but the children's children and so
>>   on.
>>
>>
>>
>>
>>   Any ideas on the way I can go about doing this?
>>
>>
>>
>>
>>   Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
>>   some kind of looping structure. Most easily a recursing sub.
>>
>>
>>
>>
>>   If you happen to code up something neat and efficient, why not share it
>>   with us and we could add it to the Taxonomy module(s).
>>
>>   _______________________________________________
>>   Bioperl-l mailing list
>>   Bioperl-l at lists.open-bio.org
>>   http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>
>>
>>   --
>>   Jason Stajich
>>   jason at bioperl.org
>>   http://jason.open-bio.org/
>>
>>   --
>>   ===========================================================
>>   : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
>>   ===========================================================
>>
>>     --
>>   Jason Stajich
>>   jason at bioperl.org
>>   http://jason.open-bio.org/
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From george.heller at yahoo.com  Mon Jun 18 23:05:42 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 16:05:42 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <F433CCB4-781D-480E-8EF5-CD68E70B27B8@gmx.net>
Message-ID: <706979.34648.qm@web56509.mail.re3.yahoo.com>

This is the output of /usr/bin/perl -V

Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
  Platform:
    osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
    uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.3.4'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
  
Characteristics of this binary (from libperl):
  Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
  Built under linux
  Compiled at Jul 24 2006 18:28:10
  @INC:
    /usr/lib/perl5/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/5.8.5
    /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.5
    /usr/lib/perl5/site_perl/5.8.4
    /usr/lib/perl5/site_perl/5.8.3
    /usr/lib/perl5/site_perl/5.8.2
    /usr/lib/perl5/site_perl/5.8.1
    /usr/lib/perl5/site_perl/5.8.0
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.5
    /usr/lib/perl5/vendor_perl/5.8.4
    /usr/lib/perl5/vendor_perl/5.8.3
    /usr/lib/perl5/vendor_perl/5.8.2
    /usr/lib/perl5/vendor_perl/5.8.1
    /usr/lib/perl5/vendor_perl/5.8.0
    /usr/lib/perl5/vendor_perl
   
  Thanks.
  George
    .

Hilmar Lapp <hlapp at gmx.net> wrote:
  The perl version appears to be 5.8.5 though, so something strange 
appears to be going on too.

George, can you please post the output of

$ /usr/bin/perl -V

-hilmar

On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:

> As the error implies, your local version of perl doesn't seem to support
> weak references, which means it doesn't have Scalar::Util (which was
> added to core after perl 5.6.1, I think). Try installing
> Scalar::Util to see what happens.
>
> chris
>
> On Jun 18, 2007, at 5:18 PM, George Heller wrote:
>
>> I tried running the below mentioned script and I seem to be getting
>> the following error:
>>
>> Weak references are not implemented in the version of perl at
>> /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
>> BEGIN failed--compilation aborted at
>> /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
>> Compilation failed in require at my.pl line 7.
>> BEGIN failed--compilation aborted at my.pl line 7.
>>
>> My script looks something like,
>>
>> #!/usr/bin/perl
>> use strict;
>> #use warnings;
>> use DBI;
>> use Bio::Tree::Node;
>> use Bio::DB::Taxonomy;
>> use Bio::DB::Taxonomy::flatfile;
>> my $idx_dir = '/tmp';
>>
>> my ($nodesfile,$namesfile) = ('nodes.dmp','names.dmp');
>> my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
>> -nodesfile => $nodesfile,
>> -namesfile => $namesfile,
>> -directory => $idx_dir);
>> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>>
>> foreach my $field (@extant_children) {
>> print "$field";
>> print "|";
>> print "\n";
>> }
>>
>> And I am running the script using the command,
>>
>> perl myscript.pl -v --names names.dmp --nodes nodes.dmp
>>
>> and I have the nodes.dmp and names.dmp files in the current
>> directory.
>>
>> Thanks,
>> George
>>
>>
>> Jason Stajich wrote:
>> It is implemented in the implementing class - DB::Taxonomy is
>> just the base class. For example see the flatfile implementation
>> Bio::DB::Taxonomy::flatfile
>>


From jason at bioperl.org  Mon Jun 18 23:22:08 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 16:22:08 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <706979.34648.qm@web56509.mail.re3.yahoo.com>
References: <706979.34648.qm@web56509.mail.re3.yahoo.com>
Message-ID: <C93DF7A1-20AC-4474-BBC6-0C2598406EEB@bioperl.org>

Try installing the latest Scalar::Util
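
To see which Scalar::Util perl actually picks up, and whether that copy supports weak references, a check along these lines may help. It is only a sketch in core Perl: it prints the module version and the path it was loaded from, then tries weaken() inside an eval so that a failure is reported rather than fatal. The variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util ();    # import nothing, so loading cannot fail on 'weaken'

# Which Scalar::Util did perl pick up, and from where?
my $version = Scalar::Util->VERSION;
my $path    = $INC{'Scalar/Util.pm'};

# Try to weaken a reference; a build without weak reference support
# dies here, and eval traps the error.
my $target    = { name => 'probe' };
my $copy      = $target;
my $weaken_ok = eval { Scalar::Util::weaken($copy); 1 } ? 1 : 0;

print "Scalar::Util $version loaded from $path\n";
print $weaken_ok
    ? "weaken() works\n"
    : "weaken() is not available: $@\n";
```

If the path printed is not the one you expected, an older copy earlier in @INC may be shadowing the newly installed module.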

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From george.heller at yahoo.com  Tue Jun 19 00:04:00 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:04:00 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <C93DF7A1-20AC-4474-BBC6-0C2598406EEB@bioperl.org>
Message-ID: <424035.72876.qm@web56507.mail.re3.yahoo.com>

Ok, I installed the latest Scalar::Util and the script seems to be working. But I am confused about where exactly I need to look for the descendant taxon ids once the script is run. I did look into the /tmp/ directory, but I couldn't understand much.

  Sorry to bother you, I really appreciate your patience.
   
  Thanks.
  George
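
A note on where the ids end up: the -directory given to the flatfile backend presumably only holds its index files, so the descendant ids are whatever the script itself prints. The sketch below shows the same idea (collect every descendant of a taxon id, then keep the leaves) in core Perl without BioPerl. It assumes the NCBI nodes.dmp layout, where the first two '|'-separated fields are tax_id and parent tax_id. The sample rows and the helper name all_descendants are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy rows in nodes.dmp layout: tax_id | parent tax_id | rank | ...
# (the ids below are invented for the example, not real NCBI records)
my @rows = (
    "1\t|\t1\t|\tno rank\t|",
    "10\t|\t1\t|\tkingdom\t|",
    "20\t|\t10\t|\tphylum\t|",
    "30\t|\t10\t|\tphylum\t|",
    "40\t|\t20\t|\tspecies\t|",
    "50\t|\t20\t|\tspecies\t|",
);

# Build a parent => [children] lookup.
my %children;
for my $row (@rows) {
    my ($id, $parent) = split /\s*\|\s*/, $row;
    next if $id == $parent;    # the root lists itself as its parent
    push @{ $children{$parent} }, $id;
}

# Recursively collect every descendant of $id (children, grandchildren, ...).
sub all_descendants {
    my ($id) = @_;
    my @found;
    for my $child (@{ $children{$id} || [] }) {
        push @found, $child, all_descendants($child);
    }
    return @found;
}

my @descendants = all_descendants(10);
my @leaves      = grep { !$children{$_} } @descendants;

print "descendants: @descendants\n";
print "leaves: @leaves\n";
```

With the toy rows above this prints "descendants: 20 40 50 30" and "leaves: 40 50 30". In the BioPerl version the equivalent step would be printing an identifier from each returned node rather than the node object itself.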

Jason Stajich <jason at bioperl.org> wrote:
  Try installing the latest Scalar::Util  
    On Jun 18, 2007, at 4:05 PM, George Heller wrote:

    This is the output of /usr/bin/perl -V
  

  Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
    Platform:
      osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
      uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
      config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
      hint=recommended, useposix=true, d_sigaction=define
      usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
      useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
      use64bitint=undef use64bitall=undef uselongdouble=undef
      usemymalloc=n, bincompat5005=undef
    Compiler:
      cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
      optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
      cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
      ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
      intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
      d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
      ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
      alignbytes=4, prototype=define
    Linker and Libraries:
      ld='gcc', ldflags =' -L/usr/local/lib'
      libpth=/usr/local/lib /lib /usr/lib
      libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
      perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
      libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
      gnulibc_version='2.3.4'
    Dynamic Linking:
      dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
      cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
  

  Characteristics of this binary (from libperl):
    Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
    Built under linux
    Compiled at Jul 24 2006 18:28:10
    @INC:
      /usr/lib/perl5/5.8.5/i386-linux-thread-multi
      /usr/lib/perl5/5.8.5
      /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
      /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
      /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
      /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
      /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
      /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
      /usr/lib/perl5/site_perl/5.8.5
      /usr/lib/perl5/site_perl/5.8.4
      /usr/lib/perl5/site_perl/5.8.3
      /usr/lib/perl5/site_perl/5.8.2
      /usr/lib/perl5/site_perl/5.8.1
      /usr/lib/perl5/site_perl/5.8.0
      /usr/lib/perl5/site_perl
      /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
      /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
      /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
      /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
      /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
      /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
      /usr/lib/perl5/vendor_perl/5.8.5
      /usr/lib/perl5/vendor_perl/5.8.4
      /usr/lib/perl5/vendor_perl/5.8.3
      /usr/lib/perl5/vendor_perl/5.8.2
      /usr/lib/perl5/vendor_perl/5.8.1
      /usr/lib/perl5/vendor_perl/5.8.0
      /usr/lib/perl5/vendor_perl
  

    Thanks.
    George
      .
  

  Hilmar Lapp <hlapp at gmx.net> wrote:
    The perl version appears to be 5.8.5 though, so something strange 
  appears to be going on too.
  

  George, can you please post the output of
  

  $ /usr/bin/perl -V
  

  -hilmar
  

  On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
  

    As the error implies, your local version of perl doesn't seem to support
  weak references, which means it doesn't have Scalar::Util (which was
  added to core after perl 5.6.1, I think). Try installing
  Scalar::Util to see what happens.
  

  chris
  

  On Jun 18, 2007, at 5:18 PM, George Heller wrote:
  

    I tried running the below mentioned script and I seem to be getting
  the following error:
  

  Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
  BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
  Compilation failed in require at my.pl line 7.
  BEGIN failed--compilation aborted at my.pl line 7.
  

  My script looks something like,
  

  #!/usr/bin/perl
  use strict;
  #use warnings;
  use DBI;
  use Bio::Tree::Node;
  use Bio::DB::Taxonomy;
  use Bio::DB::Taxonomy::flatfile;
  my $idx_dir = '/tmp';
  

  my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
  my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
  -nodesfile => $nodefile,
  -namesfile => $namesfile,
  -directory => $idx_dir);
  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

  foreach my $field (@extant_children) {
  print "$field";
  print "|";
  print "\n";
  }
  

  And I am running the script using the command,
  

  perl myscript.pl -v --names names.dmp --nodes nodes.dmp
  

  and I have the nodes.dmp and names.dmp files in the current
  directory.
  

  Thanks,
  George
  

  Jason Stajich wrote:
  It is implemented in the implementing class - DB::Taxonomy is
  just the base class. For example see the flatfile implementation
  Bio::DB::Taxonomy::flatfile
  

  See scripts/taxa/local_taxonomydb_query.PLS for an example that uses
  it; the nodes and names files come from the NCBI taxonomy database.
  

  Here is an un-debugged copy+paste for your question that *should*
  work.
  

  use Bio::DB::Taxonomy;
  my $idx_dir = '/tmp';
  

  my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
  my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
  -nodesfile => $nodefile,
  -namesfile => $namesfile,
  -directory => $idx_dir);
  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

  -jason
  

  On Jun 18, 2007, at 10:07 AM, George Heller wrote:
  

  What exactly is the "node n" in the query below. When I issue
  this query, it says,
  

  relation "node" does not exist.
  

  I tried to use the get_all_Descendents method but it looks like
  in order to do a recursive call it calls the method
  each_Descendent. This method is not implemented in
  Bio::DB::Taxonomy. It just has a single line,
  

  shift->throw_not_implemented();
  

  Thanks.
  George.
  

  Hilmar Lapp wrote:
  I'm a bit confused - it sounds like you have set up a local BioSQL
  database and loaded the NCBI taxonomy into the database. You can now
  use simple SQL to retrieve all descendants of a node in the tree
  given its NCBI taxonID such as
  

  SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
  WHERE
  n.ncbi_taxon_id = :taxonID
  AND tn.left_value > n.left_value
  AND tn.right_value < n.right_value
  AND tn.taxon_id = tnm.taxon_id
  AND tn.name_class = 'scientific_name'
  

  BioPerl doesn't have a Taxonomy::biosql module yet (though this would
  seem like a worthwhile thing to add), so you can't use the
  Bio::DB::Taxonomy interface to do this against a BioSQL instance.
  

  However, BioPerl does support the flat-file download of the NCBI
  taxonomy database and indexes it, so you can simply use
  Taxonomy::{get_taxon,get_all_Descendents} on the flatfile download
  to achieve what you want in fewer than 5 lines of perl.
  

  Although the recursive implementation of Taxonomy::get_all_Descendents()
  won't be lightning fast, it may still be perfectly fine for your
  application - are you sure it is not?
  

  -hilmar
  

  On Jun 18, 2007, at 12:21 AM, George Heller wrote:
  

  Thanks. And how can I assign the $node in the code below so that it
  refers to a particular taxon id record? I want to retrieve all the
  descendants from the taxonomy hierarchy, given a particular taxon id.
  

  I have a local db setup, in which I have uploaded data using the
  load_ncbi_taxonomy.pl script.
  

  Thanks.
  George
  

  Jason Stajich wrote:
  I assume you already figured out how to set up a local taxonomydb?
  

  You just want the extant species/leaves of the tree
  

  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

  -jason
  On Jun 17, 2007, at 11:41 AM, George Heller wrote:
  

  Hi all,
  

  Can anyone point me to some example that uses the
  get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
  this, and I am not quite sure how to implement it.
  

  Thanks.
  George
  

  Sendu Bala wrote:
  George Heller wrote:
  Hi all,
  

  I am looking at extracting the taxonomy hierarchy for some taxon ids.
  What I plan to do is, for a given taxon id, say 33090, I want to
  extract all taxon ids that are children of this taxon. I do not
  just want the immediate children, but the children's children and so
  on.
  

  Any ideas on the way I can go about doing this?
  

  Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
  some kind of looping structure. Most easily a recursing sub.
  

  If you happen to code up something neat and efficient, why not share it
  with us and we could add it to the Taxonomy module(s).
  

  

  --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/
  

  

  --
  ===========================================================
  : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
  ===========================================================
  

  

  --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/
  

  

  Christopher Fields
  Postdoctoral Researcher
  Lab of Dr. Robert Switzer
  Dept of Biochemistry
  University of Illinois Urbana-Champaign
  

  

  -- 
  ===========================================================
  : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
  ===========================================================
  



    --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/




From jason at bioperl.org  Tue Jun 19 00:17:34 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 17:17:34 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <424035.72876.qm@web56507.mail.re3.yahoo.com>
References: <424035.72876.qm@web56507.mail.re3.yahoo.com>
Message-ID: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>

All the children are in the @extant_children array.

You get to decide what you want to do with them. In the following  
example I print the id, rank, and scientific name out to the screen.
Because this is a taxonomy db query you are getting back  
Bio::Taxonomy::Taxon objects so read the documentation for this  
module to see what you can do with the object.
I would also suggest spending a little time with the Getting started  
and HOWTO:Trees documentation on the website to get familiar with the  
objects and nomenclature.


my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

for my $child ( @extant_children ) {
   print "id is ", $child->id, "\n"; # NCBI taxa id
   print "rank is ", $child->rank, "\n"; # e.g. species
   print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}
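
Putting the pieces from this thread together, a complete script might look like the sketch below. It has not been run against a real index. The constructor arguments and the get_Taxonomy_Node, get_all_Descendents, is_Leaf, id, rank and scientific_name calls are the ones already used in this thread; describe_taxa, main, the script name leaves.pl and the command-line handling are hypothetical names added for illustration. The counts printed to STDERR are there so that an empty result can be told apart from a script that never reached the loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: one "id|rank|name" line per taxon object.
# Works on any object that answers id, rank and scientific_name.
sub describe_taxa {
    my @lines;
    for my $taxon (@_) {
        my @fields = (scalar $taxon->id,
                      scalar $taxon->rank,
                      scalar $taxon->scientific_name);
        push @lines, join '|', map { defined($_) ? $_ : 'NA' } @fields;
    }
    return @lines;
}

# Hypothetical driver: look up one taxon id and report its leaf descendants.
sub main {
    my ($taxonid) = @_;

    # Loaded at run time so describe_taxa can be tried without BioPerl.
    require Bio::DB::Taxonomy;

    my $db = Bio::DB::Taxonomy->new(-source    => 'flatfile',
                                    -nodesfile => 'nodes.dmp',
                                    -namesfile => 'names.dmp',
                                    -directory => '/tmp');

    my $node = $db->get_Taxonomy_Node(-taxonid => $taxonid);
    die "taxon id $taxonid was not found in the index\n" unless defined $node;

    my @descendants = $node->get_all_Descendents;
    my @leaves      = grep { $_->is_Leaf } @descendants;

    # Counts go to STDERR so an empty result is visible rather than silent.
    warn scalar(@descendants), " descendants, ", scalar(@leaves), " leaves\n";

    print "$_\n" for describe_taxa(@leaves);
}

# Run only when a taxon id is given, e.g.:  perl leaves.pl 33090
main($ARGV[0]) if @ARGV;
```

Keeping the formatting in its own sub means it can be checked with any small stand-in object before the index has been built.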

On Jun 18, 2007, at 5:04 PM, George Heller wrote:

> Ok, I installed the latest Scalar::Util and the script now seems to
> run. But I am confused about where exactly I need to look for the
> descendant taxon ids once the script has run. I did look in the
> /tmp/ directory, but I couldn't make much sense of what I found there.
>
>   Sorry to bother you; I really appreciate your patience.
>
>   Thanks.
>   George

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From george.heller at yahoo.com  Tue Jun 19 00:29:31 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 17:29:31 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <1FE460CB-F001-4EAE-9E83-FDF52AFFA5D0@bioperl.org>
Message-ID: <369098.81077.qm@web56507.mail.re3.yahoo.com>

But the problem is that I don't really get any output on the screen. In the /tmp directory I get 4 files, namely parents, nodes, id2names and names2id, but I don't know what to make of them. This is what my script looks like:
   
#!/usr/bin/perl
use strict;
#use warnings;
use DBI;
use Bio::Tree::Node;
use Bio::DB::Taxonomy;
use Bio::DB::Taxonomy::flatfile;

# directory where the flatfile index files are written
my $idx_dir = '/tmp';

my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                               -nodesfile => $nodefile,
                               -namesfile => $namesfile,
                               -directory => $idx_dir);
my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

for my $child ( @extant_children ) {
  print "id is ", $child->id, "\n"; # NCBI taxa id
  print "rank is ", $child->rank, "\n"; # e.g. species
  print "scientific name is ", $child->scientific_name, "\n"; # scientific name
}
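
One way to cross-check the result without going through the index files in /tmp is to walk nodes.dmp directly. The sketch below is only an illustration and is not how BioPerl does it. It assumes the usual NCBI dump layout (fields separated by tab, pipe, tab, with the tax id first and the parent tax id second), and the sub names build_child_map and all_descendants as well as the script name descendants.pl are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a parent id -> [child ids] map from nodes.dmp-style lines.
sub build_child_map {
    my ($fh) = @_;
    my %children;
    while (my $line = <$fh>) {
        chomp $line;
        # Assumed layout: tax_id <tab>|<tab> parent_tax_id <tab>|<tab> rank ...
        my ($id, $parent) = split /\t\|\t/, $line;
        next unless defined $id && defined $parent;
        s/\t\|\s*$// for $id, $parent;   # drop a trailing field terminator
        next if $id eq $parent;          # the root lists itself as its parent
        push @{ $children{$parent} }, $id;
    }
    return \%children;
}

# Collect children, children's children and so on, without recursion.
sub all_descendants {
    my ($children, $root) = @_;
    my @found;
    my @todo = ($root);
    while (defined(my $id = shift @todo)) {
        my @kids = @{ $children->{$id} || [] };
        push @found, @kids;
        push @todo,  @kids;
    }
    return @found;
}

# Run only when arguments are given, e.g.:
#   perl descendants.pl nodes.dmp 33090
if (@ARGV == 2) {
    my ($file, $taxid) = @ARGV;
    open my $fh, '<', $file or die "cannot open $file: $!\n";
    my $children = build_child_map($fh);
    close $fh;
    print "$_\n" for all_descendants($children, $taxid);
}
```

With this map, the leaves are simply the ids that never appear as a parent, for example grep { !exists $children->{$_} } over the returned list. If this prints ids for 33090 while the BioPerl script prints nothing, the problem is more likely in the index or the lookup than in the dump files.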

Thanks.
  George
  
Jason Stajich <jason at bioperl.org> wrote:
    All the children are in this array.  
  

  You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name out to the screen.  
  Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects so read the documentation for this module to see what you can do with the object.
    I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.
  

  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

  for my $child ( @extant_children ) {
    print "id is ", $child->id, "\n"; # NCBI taxa id
    print "rank is ", $child->rank, "\n"; # e.g. species
    print "scientific name is ", $child->scientific_name, "\n"; # scientific name
  }


  

    Hilmar Lapp <hlapp at gmx.net> wrote:
      The perl version appears to be 5.8.5 though, so something strange 
    appears to be going on too.
  

    George, can you please post the output of
  

    $ /usr/bin/perl -V
  

    -hilmar
  

    On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
  

      As the error implies, your local version of perl doesn't seem to
    support weak references, which means it doesn't have Scalar::Util
    (which was added to core after perl 5.6.1, I think). Try installing
    Scalar::Util to see what happens.
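
A quick way to check this on a given perl is to try the import and the weaken call inside an eval. This is only a sketch: weaken_supported is a made-up helper name, and it rests on the assumption (as far as I know, true of older pure-Perl installs of Scalar::Util) that the import of weaken itself is what dies with the "Weak references are not implemented" message.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if this perl's Scalar::Util can
# create weak references, 0 otherwise. Everything is wrapped in eval
# because the failure may already happen at import time.
sub weaken_supported {
    return eval {
        require Scalar::Util;
        Scalar::Util->import(qw(weaken isweak));
        my $target = {};
        my $copy   = $target;
        Scalar::Util::weaken($copy);
        Scalar::Util::isweak($copy);
    } ? 1 : 0;
}

my $ok = weaken_supported();
print $ok
    ? "weak references work\n"
    : "weak references are NOT available: $@\n";
```

If this reports that weak references are not available, installing a compiled Scalar::Util from CPAN is the usual next step.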
  

    chris
  

    On Jun 18, 2007, at 5:18 PM, George Heller wrote:
  

      I tried running the below mentioned script and I seem to be getting
    the following error:
  

    Weak references are not implemented in the version of perl at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
    BEGIN failed--compilation aborted at /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
    Compilation failed in require at my.pl line 7.
    BEGIN failed--compilation aborted at my.pl line 7.
  

    My script looks something like,
  

    #!/usr/bin/perl
    use strict;
    #use warnings;
    use DBI;
    use Bio::Tree::Node;
    use Bio::DB::Taxonomy;
    use Bio::DB::Taxonomy::flatfile;
    my $idx_dir = '/tmp';
  

    my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
    my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                                   -nodesfile => $nodefile,
                                   -namesfile => $namesfile,
                                   -directory => $idx_dir);
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

    foreach my $field (@extant_children) {
        print "$field";
        print "|";
        print "\n";
    }
  

    And I am running the script using the command,
  

    perl myscript.pl -v --names names.dmp --nodes nodes.dmp
  

    and I have the nodes.dmp and names.dmp files in the current
    directory.
  

    Thanks,
    George
  

    Jason Stajich wrote:
    It is implemented in the implementing class - DB::Taxonomy is
    just the base class. For example see the flatfile implementation
    Bio::DB::Taxonomy::flatfile
  

    See the scripts/taxa/local_taxonomydb_query.PLS for example using
    it:
    nodes and names are from NCBI taxonomy database.
  

    Here is an un-debugged copy+paste for your question that *should*
    work.
  

    use Bio::DB::Taxonomy;
    my $idx_dir = '/tmp';
  

    my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
    my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                                   -nodesfile => $nodefile,
                                   -namesfile => $namesfile,
                                   -directory => $idx_dir);
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

    -jason
  

    On Jun 18, 2007, at 10:07 AM, George Heller wrote:
  

    What exactly is the "node n" in the query below. When I issue
    this query, it says,
  

    relation "node" does not exist.
  

    I tried to use the get_all_Descendents method but it looks like
    in order to do a recursive call it calls the method
    each_Descendent. This method is not implemented in
    Bio::DB::Taxonomy. It just has a single line,
  

    shift->throw_not_implemented();
  

    Thanks.
    George.
  

    Hilmar Lapp wrote:
    I'm a bit confused - it sounds like you have set up a local BioSQL
    database and loaded the NCBI taxonomy into the database. You can now
    use simple SQL to retrieve all descendants of a node in the tree
    given its NCBI taxonID such as
  

    SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
    WHERE
    n.ncbi_taxon_id = :taxonID
    AND tn.left_value > n. left_value
    AND tn.right_value < n.right_value
    AND tn.taxon_id = tnm.taxon_id
    AND tn.name_class = 'scientific_name'
  

    BioPerl doesn't have a Taxonomy::biosql module yet (though this would
    seem like a worthwhile thing to add), so you can't use the
    Bio::DB::Taxonomy interface to do this against a BioSQL instance.
  

    However, BioPerl does support the flat-file download of the NCBI
    taxonomy database and indexes it, so you can simply use
    Taxonomy::{get_taxon,get_all_Descendents} on the flatfile download
    to achieve what you wanted in fewer than 5 lines of perl.
  

    Although the recursive implementation of
    Taxonomy::get_all_Descendents() won't be lightning fast, it may
    still be perfectly fine for your application - are you sure it is not?
  

    -hilmar
  

    On Jun 18, 2007, at 12:21 AM, George Heller wrote:
  

    Thanks. And how can I assign the $node in the code below, such that
    I can reference it to a particular taxon id record? I want to
    retrieve all the descendents from the taxonomy hierarchy, given a
    particular taxon id.
  

    I have a local db setup, in which I have uploaded data using the
    load_ncbi_taxonomy.pl script.
  

    Thanks.
    George
  

    Jason Stajich wrote:
    I assume you already figured out how to setup a local taxonomydb?
  

    You just want the extant species/leaves of the tree
  

    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

    -jason
    On Jun 17, 2007, at 11:41 AM, George Heller wrote:
  

    Hi all,
  

    Can anyone point me to some example that uses the
    get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
    this, and I am not quite sure how to implement it.
  

    Thanks.
    George
  

    Sendu Bala wrote:
    George Heller wrote:
    Hi all,
  

    I am looking at extracting the taxonomy hierarchy for some taxon ids.
    What I plan to do is, for a given taxon id, say 33090, extract all
    taxon ids that are children of this species. I do not just want the
    immediate children, but the children's children and so on.
  

    Any ideas on the way I can go about doing this?
  

    Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
    some kind of looping structure. Most easily a recursing sub.
  

    If you happen to code up something neat and efficient, why not share
    it with us and we could add it to the Taxonomy module(s).
  

    _______________________________________________
    Bioperl-l mailing list
    Bioperl-l at lists.open-bio.org
    http://lists.open-bio.org/mailman/listinfo/bioperl-l
  

    --
    Jason Stajich
    jason at bioperl.org
    http://jason.open-bio.org/
  

  

    --
    ===========================================================
    : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
    ===========================================================
  

  

    Christopher Fields
    Postdoctoral Researcher
    Lab of Dr. Robert Switzer
    Dept of Biochemistry
    University of Illinois Urbana-Champaign
  



From jason at bioperl.org  Tue Jun 19 01:05:43 2007
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 18 Jun 2007 18:05:43 -0700
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <369098.81077.qm@web56507.mail.re3.yahoo.com>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
Message-ID: <F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>

The files are indexes because you are indexing a flatfile - this  
speeds up the lookup so the second time you run the script it doesn't  
have to index.
You don't need to look at the files, they won't make sense to a human!

The reason it isn't printing anything is that someone didn't write the
implementation quite right. This code was overhauled by Sendu before the
last release, and I guess something didn't quite get connected.

I checked in code that now has Bio::Taxon delegating to a DB handle for
the each_Descendent call.
You can either patch your code or just use the code listed here:
  http://bioperl.org/wiki/Module:Bio::DB::Taxonomy
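
For anyone wondering why each_Descendent matters here: get_all_Descendents is essentially a recursive walk over each_Descendent, so if that call returns nothing, you get an empty list. Below is a minimal sketch of that recursion, not the actual BioPerl code. TinyNode and all_descendents are hypothetical names made up for illustration, and the ids are invented, so it runs without BioPerl or the dump files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# TinyNode is a hypothetical stand-in for a taxon/tree node; it only
# mimics the three methods used in this thread.
package TinyNode;

sub new {
    my ($class, %args) = @_;
    return bless { id => $args{id}, children => $args{children} || [] }, $class;
}
sub id              { $_[0]->{id} }
sub each_Descendent { @{ $_[0]->{children} } }
sub is_Leaf         { !@{ $_[0]->{children} } }

package main;

# Depth-first walk: each child, followed by everything below it.
sub all_descendents {
    my ($node) = @_;
    return map { ($_, all_descendents($_)) } $node->each_Descendent;
}

# Made-up ids, not real NCBI taxon ids.
my $root = TinyNode->new(id => 1, children => [
    TinyNode->new(id => 2, children => [
        TinyNode->new(id => 4),
        TinyNode->new(id => 5),
    ]),
    TinyNode->new(id => 3),
]);

my @all    = all_descendents($root);
my @leaves = grep { $_->is_Leaf } @all;

print "descendents: ", join(',', map { $_->id } @all),    "\n"; # 2,4,5,3
print "leaves: ",      join(',', map { $_->id } @leaves), "\n"; # 4,5,3
```

With a real Bio::DB::Taxonomy handle, the same grep on is_Leaf applies to whatever $node->get_all_Descendents returns.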

On Jun 18, 2007, at 5:29 PM, George Heller wrote:

> But the problem is that I don't really get any output on the
> screen. In the /tmp directory I get 4 files, namely parents, nodes,
> id2names and names2id, but I don't know what to make of them. This
> is what my script looks like,
>
> #!/usr/bin/perl
> use strict;
> #use warnings;
> use DBI;
> use Bio::Tree::Node;
> use Bio::DB::Taxonomy;
> use Bio::DB::Taxonomy::flatfile;
> my $idx_dir = '/tmp';
>
> my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
> my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
>                                -nodesfile => $nodefile,
>                                -namesfile => $namesfile,
>                                -directory => $idx_dir);
> my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
> my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
> for my $child ( @extant_children ) {
>   print "id is ", $child->id, "\n"; # NCBI taxa id
>   print "rank is ", $child->rank, "\n"; # e.g. species
>   print "scientific name is ", $child->scientific_name, "\n"; # scientific name
> }
>
> Thanks.
>   George
>
> Jason Stajich <jason at bioperl.org> wrote:
>     All the children are in this array.
>
>
>   You get to decide what you want to do with them. In the following  
> example I print the id, rank, and scientific name out to the screen.
>   Because this is a taxonomy db query you are getting back  
> Bio::Taxonomy::Taxon objects so read the documentation for this  
> module to see what you can do with the object.
>     I would also suggest spending a little time with the Getting  
> started and HOWTO:Trees documentation on the website to get  
> familiar with the objects and nomenclature.
>
>
>
>
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
>
>
>   for my $child ( @extant_children ) {
>     print "id is ", $child->id, "\n"; # NCBI taxa id
>     print "rank is ", $child->rank, "\n"; # e.g. species
>     print "scientific name is ", $child->scientific_name, "\n"; # scientific name
>   }
>
>
>     On Jun 18, 2007, at 5:04 PM, George Heller wrote:
>
>     Ok, I installed the latest Scalar::Util and the script seems
> to be working. But I am confused about where exactly I need to look
> for the descendent taxon ids once the script is run. I did look into
> the /tmp/ directory, but I couldn't understand much.
>
>
>     Sorry to bother you; I really appreciate your patience.
>
>
>     Thanks.
>     George
>
>
>   Jason Stajich <jason at bioperl.org> wrote:
>     Try installing the latest Scalar::Util
>       On Jun 18, 2007, at 4:05 PM, George Heller wrote:
>
>
>       This is the output of /usr/bin/perl -V
>
>
>
>
>     Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
>       Platform:
>         osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
>         uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
>         config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
>         hint=recommended, useposix=true, d_sigaction=define
>         usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
>         useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
>         use64bitint=undef use64bitall=undef uselongdouble=undef
>         usemymalloc=n, bincompat5005=undef
>       Compiler:
>         cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
>         optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
>         cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
>         ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
>         intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
>         d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
>         ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
>         alignbytes=4, prototype=define
>       Linker and Libraries:
>         ld='gcc', ldflags =' -L/usr/local/lib'
>         libpth=/usr/local/lib /lib /usr/lib
>         libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
>         perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
>         libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
>         gnulibc_version='2.3.4'
>       Dynamic Linking:
>         dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
>         cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
>
>
>
>
>     Characteristics of this binary (from libperl):
>       Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
>       Built under linux
>       Compiled at Jul 24 2006 18:28:10
>       @INC:
>         /usr/lib/perl5/5.8.5/i386-linux-thread-multi
>         /usr/lib/perl5/5.8.5
>         /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
>         /usr/lib/perl5/site_perl/5.8.5
>         /usr/lib/perl5/site_perl/5.8.4
>         /usr/lib/perl5/site_perl/5.8.3
>         /usr/lib/perl5/site_perl/5.8.2
>         /usr/lib/perl5/site_perl/5.8.1
>         /usr/lib/perl5/site_perl/5.8.0
>         /usr/lib/perl5/site_perl
>         /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
>         /usr/lib/perl5/vendor_perl/5.8.5
>         /usr/lib/perl5/vendor_perl/5.8.4
>         /usr/lib/perl5/vendor_perl/5.8.3
>         /usr/lib/perl5/vendor_perl/5.8.2
>         /usr/lib/perl5/vendor_perl/5.8.1
>         /usr/lib/perl5/vendor_perl/5.8.0
>         /usr/lib/perl5/vendor_perl
>
>
>
>
>       Thanks.
>       George
>         .

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From torsten.seemann at infotech.monash.edu.au  Tue Jun 19 01:21:04 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:21:04 +1000
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4676A01F.30205@sendu.me.uk>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
Message-ID: <a79f6a4b0706181821p12a2e138xade9c30895e45068@mail.gmail.com>

Sendu,

> >> Can anyone offer a
> >> way to systematically find at least the test scripts which access the
> >> internet, if not the specific tests within?

Perhaps you could use 'strace' to list network system calls for each
test script, and grep out AF_INET connections?

% strace -e trace=network command_to_test 2>&1 | grep AF_INET

I'm not an strace expert but it might do what you need.
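
If it helps, the same idea could be wrapped in a small Perl loop over the test scripts. This is only a sketch: it assumes strace is installed and on the PATH, and uses_network and the script name find_net_tests.pl are names I made up here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# True if the strace output mentions an internet socket family.
# Like the grep above, this also matches AF_INET6.
sub uses_network {
    my ($strace_output) = @_;
    return $strace_output =~ /AF_INET/ ? 1 : 0;
}

# Usage (hypothetical file name): perl find_net_tests.pl t/*.t
for my $script (@ARGV) {
    # -f follows child processes; strace writes its trace to stderr,
    # so stderr is merged into the captured output.
    my $trace = `strace -f -e trace=network $^X "$script" 2>&1`;
    print "$script\n" if uses_network($trace);
}
```

This only flags whole scripts, not the individual tests inside them, and a script that skips its network tests when run this way would go unnoticed.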

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University
--Tel +61 3 9905 9010


From george.heller at yahoo.com  Tue Jun 19 01:16:10 2007
From: george.heller at yahoo.com (George Heller)
Date: Mon, 18 Jun 2007 18:16:10 -0700 (PDT)
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
Message-ID: <815364.33231.qm@web56512.mail.re3.yahoo.com>

Works perfectly. Thanks so much Jason, Hilmar, Chris. You've been a great help!
   
  Thanks.
  George

Jason Stajich <jason at bioperl.org> wrote:
  The files are indexes because you are indexing a flatfile - this speeds up the lookup so the second time you run the script it doesn't have to index.  You don't need to look at the files, they won't make sense to a human!
  

  The reason it isn't printing anything is that someone didn't write the implementation quite right. This code was overhauled by Sendu before the last release, and I guess something didn't quite get connected.
  

  I checked in code that has the Bio::Taxon delegating now to a DB handle for the each_Descendent call.
  You can either patch your code  or just use the code listed here:
     http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

  
    On Jun 18, 2007, at 5:29 PM, George Heller wrote:

    But the problem is that I don't really get any output on the screen. In the /tmp directory I get 4 files, namely parents, nodes, id2names and names2id, but I don't know what to make of them. This is what my script looks like:
  

    #!/usr/bin/perl
    use strict;
    #use warnings;
    use DBI;
    use Bio::Tree::Node;
    use Bio::DB::Taxonomy;
    use Bio::DB::Taxonomy::flatfile;

    my $idx_dir = '/tmp';
    my ($nodefile, $namesfile) = ('nodes.dmp', 'names.dmp');
    my $db = new Bio::DB::Taxonomy(-source    => 'flatfile',
                                   -nodesfile => $nodefile,
                                   -namesfile => $namesfile,
                                   -directory => $idx_dir);
    my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

    for my $child ( @extant_children ) {
        print "id is ", $child->id, "\n"; # NCBI taxa id
        print "rank is ", $child->rank, "\n"; # e.g. species
        print "scientific name is ", $child->scientific_name, "\n"; # scientific name
    }
  

  Thanks.
    George
  

  Jason Stajich <jason at bioperl.org> wrote:
      All the children are in this array.  
  

    You get to decide what you want to do with them. In the following example I print the id, rank, and scientific name out to the screen.  
    Because this is a taxonomy db query you are getting back Bio::Taxonomy::Taxon objects so read the documentation for this module to see what you can do with the object.
      I would also suggest spending a little time with the Getting started and HOWTO:Trees documentation on the website to get familiar with the objects and nomenclature.
  

    my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

    for my $child ( @extant_children ) {
        print "id is ", $child->id, "\n"; # NCBI taxa id
        print "rank is ", $child->rank, "\n"; # e.g. species
        print "scientific name is ", $child->scientific_name, "\n"; # scientific name
    }
  

      On Jun 18, 2007, at 5:04 PM, George Heller wrote:
  

      Ok, I installed the latest Scalar::Util and the script seems to be working. But I am confused about where exactly I need to look for the descendent taxon ids once the script is run. I did look into the /tmp/ directory, but I couldn't understand much.

      Sorry to bother you; I really appreciate your patience.
  

      Thanks.
      George
  

    Jason Stajich <jason at bioperl.org> wrote:
      Try installing the latest Scalar::Util  
        On Jun 18, 2007, at 4:05 PM, George Heller wrote:
  

        This is the output of /usr/bin/perl -V
  

      Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
        Platform:
          osname=linux, osvers=2.6.9-22.18.bz155725.elsmp, archname=i386-linux-thread-multi
          uname='linux hs20-bc1-4.build.redhat.com 2.6.9-22.18.bz155725.elsmp #1 smp thu nov 17 15:34:08 est 2005 i686 i686 i386 gnulinux '
          config_args='-des -Doptimize=-O2 -g -pipe -m32 -march=i386 -mtune=pentium4 -Dversion=5.8.5 -Dmyhostname=localhost -Dperladmin=root at localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dinc_version_list=5.8.4 5.8.3 5.8.2 5.8.1 5.8.0'
          hint=recommended, useposix=true, d_sigaction=define
          usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
          useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
          use64bitint=undef use64bitall=undef uselongdouble=undef
          usemymalloc=n, bincompat5005=undef
        Compiler:
          cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
          optimize='-O2 -g -pipe -m32 -march=i386 -mtune=pentium4',
          cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBUGGING -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
          ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-2)', gccosandvers=''
          intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
          d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
          ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
          alignbytes=4, prototype=define
        Linker and Libraries:
          ld='gcc', ldflags =' -L/usr/local/lib'
          libpth=/usr/local/lib /lib /usr/lib
          libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
          perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
          libc=/lib/libc-2.3.4.so, so=so, useshrplib=true, libperl=libperl.so
          gnulibc_version='2.3.4'
        Dynamic Linking:
          dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.5/i386-linux-thread-multi/CORE'
          cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
  

      Characteristics of this binary (from libperl):
        Compile-time options: DEBUGGING MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
        Built under linux
        Compiled at Jul 24 2006 18:28:10
        @INC:
          /usr/lib/perl5/5.8.5/i386-linux-thread-multi
          /usr/lib/perl5/5.8.5
          /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.3/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.2/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
          /usr/lib/perl5/site_perl/5.8.5
          /usr/lib/perl5/site_perl/5.8.4
          /usr/lib/perl5/site_perl/5.8.3
          /usr/lib/perl5/site_perl/5.8.2
          /usr/lib/perl5/site_perl/5.8.1
          /usr/lib/perl5/site_perl/5.8.0
          /usr/lib/perl5/site_perl
          /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.3/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.2/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.8.5
          /usr/lib/perl5/vendor_perl/5.8.4
          /usr/lib/perl5/vendor_perl/5.8.3
          /usr/lib/perl5/vendor_perl/5.8.2
          /usr/lib/perl5/vendor_perl/5.8.1
          /usr/lib/perl5/vendor_perl/5.8.0
          /usr/lib/perl5/vendor_perl
  

        Thanks.
        George
          .
  

      Hilmar Lapp <hlapp at gmx.net> wrote:
        The perl version appears to be 5.8.5 though, so something strange 
      appears to be going on too.
  

      George, can you please post the output of
  

      $ /usr/bin/perl -V
  

      -hilmar
  

      On Jun 18, 2007, at 6:33 PM, Chris Fields wrote:
  

        As the error implies, your local version of perl doesn't seem to support
      weak references, which means it doesn't have Scalar::Util (which was
      added to core after perl 5.6.1, I think). Try installing
      Scalar::Util to see what happens.
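
A minimal check along those lines, assuming only that Scalar::Util can be loaded at all. The sub name weak_refs_supported is made up for this sketch; weaken() and isweak() are the functions Bio::Tree::Node relies on:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util ();    # import nothing, so a build without weaken() still loads

# Return 1 if this perl's Scalar::Util can actually weaken a reference.
sub weak_refs_supported {
    my $target = [];
    my $copy   = $target;
    my $ok = eval { Scalar::Util::weaken($copy); 1 };
    return $ok && Scalar::Util::isweak($copy) ? 1 : 0;
}

my $version = defined $Scalar::Util::VERSION ? $Scalar::Util::VERSION : 'unknown';
print "Scalar::Util $version, weak references ",
      (weak_refs_supported() ? "supported" : "NOT supported"), "\n";
```

If this reports NOT supported on a 5.8.x perl, the installed Scalar::Util is probably the pure-perl fallback, and reinstalling it from CPAN with a compiler available would be the thing to try.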
  

      chris
  

      On Jun 18, 2007, at 5:18 PM, George Heller wrote:
  

        I tried running the below mentioned script and I seem to be getting
      the following error:
  

      Weak references are not implemented in the version of perl at
      /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76
      BEGIN failed--compilation aborted at
      /usr/lib/perl5/site_perl/5.8.5/Bio/Tree/Node.pm line 76.
      Compilation failed in require at my.pl line 7.
      BEGIN failed--compilation aborted at my.pl line 7.
  

      My script looks something like,
  

      #!/usr/bin/perl
      use strict;
      #use warnings;
      use DBI;
      use Bio::Tree::Node;
      use Bio::DB::Taxonomy;
      use Bio::DB::Taxonomy::flatfile;
      my $idx_dir = '/tmp';
  

      my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
      my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
                                     -nodesfile => $nodefile,
                                     -namesfile => $namesfile,
                                     -directory => $idx_dir);
      my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
      my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

      foreach my $field (@extant_children) {
          print "$field";
          print "|";
          print "\n";
      }
  

      And I am running the script using the command,
  

      perl myscript.pl -v --names names.dmp --nodes nodes.dmp
  

      and I have the nodes.dmp and names.dmp files in the current
      directory.
  

      Thanks,
      George
  

      Jason Stajich wrote:
      It is implemented in the implementing class - DB::Taxonomy is
      just the base class. For example see the flatfile implementation
      Bio::DB::Taxonomy::flatfile
  

      See the scripts/taxa/local_taxonomydb_query.PLS for example using
      it:
      nodes and names are from NCBI taxonomy database.
  

      Here is an un-debugged copy+paste for your question that *should*
      work.
  

      use Bio::DB::Taxonomy;
      my $idx_dir = '/tmp';

      my ($nodefile,$namesfile) = ('nodes.dmp','names.dmp');
      my $db = new Bio::DB::Taxonomy(-source => 'flatfile',
                                     -nodesfile => $nodefile,
                                     -namesfile => $namesfile,
                                     -directory => $idx_dir);
      my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
      my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

      -jason
  

      On Jun 18, 2007, at 10:07 AM, George Heller wrote:
  

      What exactly is the "node n" in the query below. When I issue
      this query, it says,
  

      relation "node" does not exist.
  

      I tried to use the get_all_Descendents method but it looks like
      in order to do a recursive call it calls the method
      each_Descendent. This method is not implemented in
      Bio::DB::Taxonomy. It just has a single line,
  

      shift->throw_not_implemented();
  

      Thanks.
      George.
  

      Hilmar Lapp wrote:
      I'm a bit confused - it sounds like you have set up a local BioSQL
      database and loaded the NCBI taxonomy into the database. You can now
      use simple SQL to retrieve all descendants of a node in the tree
      given its NCBI taxonID such as
  

      SELECT tn.*, tnm.name FROM taxon tn, taxon_name tnm, node n
      WHERE
      n.ncbi_taxon_id = :taxonID
      AND tn.left_value > n. left_value
      AND tn.right_value < n.right_value
      AND tn.taxon_id = tnm.taxon_id
      AND tn.name_class = 'scientific_name'
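
The left_value/right_value comparison in that WHERE clause is the usual nested-set test: a descendant's interval lies strictly inside its ancestor's. A small sketch of the same arithmetic in plain perl, with an invented four-node table (the ids, numbers and the sub name nested_set_descendants are all hypothetical, not NCBI or BioSQL data):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy nested-set table: id => [left_value, right_value].
# The ids and numbers are invented for illustration only.
my %taxon = (
    A => [1, 8],    # root
    B => [2, 5],    # child of A
    D => [3, 4],    # child of B
    C => [6, 7],    # child of A
);

# Same test as the WHERE clause: strictly inside the ancestor's interval.
sub nested_set_descendants {
    my ($table, $id) = @_;
    my ($left, $right) = @{ $table->{$id} };
    my @inside = grep {
        $table->{$_}[0] > $left && $table->{$_}[1] < $right
    } keys %$table;
    return sort @inside;
}

print join(',', nested_set_descendants(\%taxon, 'A')), "\n";    # B,C,D
print join(',', nested_set_descendants(\%taxon, 'B')), "\n";    # D
```

No recursion is needed, which is why the SQL version can fetch a whole subtree in one query.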
  

      BioPerl doesn't have a Taxonomy::biosql module yet (though this would
      seem like a worthwhile thing to add), so you can't use the
      Bio::DB::Taxonomy interface to do this against a BioSQL instance.

      However, BioPerl does have support for the flat-file download of the
      NCBI taxonomy database and indexes it, so you can simply use
      Taxonomy::{get_taxon,get_all_Descendents} using the flatfile download
      to achieve what you wanted to do in fewer than 5 lines of perl.

      Although the recursive implementation of Taxonomy::get_all_Descendents()
      won't be lightning fast, it may still be perfectly fine for your
      application - are you sure it is not?
  

      -hilmar
  

      On Jun 18, 2007, at 12:21 AM, George Heller wrote:
  

      Thanks. And how can I assign the $node here in the below code,
      such
      that I can reference it to a particular taxon id record? I want to
      retrieve all the descendents from the taxonomy hierarchy, given a
      particular taxon id.
  

      I have a local db setup, in which I have uploaded data using the
      load_ncbi_taxonomy.pl script.
  

      Thanks.
      George
  

      Jason Stajich wrote:
      I assume you already figured out how to setup a local taxonomydb?
  

      You just want the extant species/leaves of the tree
  

      my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
  

      -jason
      On Jun 17, 2007, at 11:41 AM, George Heller wrote:
  

      Hi all,
  

      Can anyone point me to some example that uses the
      get_all_Descendents method from Bio::DB::Taxonomy? I am a newbie at
      this, and I am not quite sure how to implement it.
  

      Thanks.
      George
  

      Sendu Bala wrote:
      George Heller wrote:
      Hi all,
  

      I am looking at extracting the taxonomy hierarchy for some taxon ids.
      What I plan to do is, for a given taxon id, say 33090, I want to
      extract all taxon ids that are children of this species. I do not
      just want the immediate children, but the children's children and so
      on.
  

      Any ideas on the way I can go about doing this?
  

      Well, you'll use Bio::DB::Taxonomy presumably, and each_Descendent in
      some kind of looping structure. Most easily a recursing sub.

      If you happen to code up something neat and efficient, why not share it
      with us and we could add it to the Taxonomy module(s).
  

      _______________________________________________
      Bioperl-l mailing list
      Bioperl-l at lists.open-bio.org
      http://lists.open-bio.org/mailman/listinfo/bioperl-l
  

      --
      Jason Stajich
      jason at bioperl.org
      http://jason.open-bio.org/
  

  

      --
      ===========================================================
      : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
      ===========================================================
  

  

      --
      Jason Stajich
      jason at bioperl.org
      http://jason.open-bio.org/
  

  

      Christopher Fields
      Postdoctoral Researcher
      Lab of Dr. Robert Switzer
      Dept of Biochemistry
      University of Illinois Urbana-Champaign
  

  

      -- 
      ===========================================================
      : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
      ===========================================================
  

  

        --
      Jason Stajich
      jason at bioperl.org
      http://jason.open-bio.org/
  

  

      --
    Jason Stajich
    jason at bioperl.org
    http://jason.open-bio.org/
  



    --
  Jason Stajich
  jason at bioperl.org
  http://jason.open-bio.org/




From torsten.seemann at infotech.monash.edu.au  Tue Jun 19 01:26:41 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 19 Jun 2007 11:26:41 +1000
Subject: [Bioperl-l] gff2xml
In-Reply-To: <a79f6a4b0706121718g4b0ca6a4m97f253b2e2b84059@mail.gmail.com>
References: <462784640706120904g25a6550dsc56a22af64ca98cd@mail.gmail.com>
	<a79f6a4b0706121718g4b0ca6a4m97f253b2e2b84059@mail.gmail.com>
Message-ID: <a79f6a4b0706181826x4ccc4ee5n8ddafa703ad162a3@mail.gmail.com>

(Sean, please reply to the bioperl-l list rather than to me personally
so everyone can read it. I'm reposting it here.)

> > I posted this on the gbrowse list earlier. I'm looking to convert gff
> > data files into xml. Does anyone know of a module written to do this
> > already?
>
> What DTD do you want the XML to conform to?
> eg. ChadoXML, TinySeq XML, TIGR XML ... ?

Hi Torsten,
I'm collaborating with other groups and want web-service compatible
functionality for various tools. Normally the analysis tools I'm using
generate gff output. I'm going to have to wrap this output in XML with
XSL stylesheet for end-users to view. Haven't done it before and don't
know what DTD to use. The bp_seqconvert.pl doesn't accept gff format.
I would imagine the DTD would be quite short as the gff files are very
standard, I just don't have any experience with these DTD
requirements.
--Sean O'Keeffe <limericksean at gmail.com>


From sac at bioperl.org  Tue Jun 19 06:42:27 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Mon, 18 Jun 2007 23:42:27 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re: Perltidy)
Message-ID: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>

On 6/16/07, Jason Stajich <jason at bioperl.org> wrote:
> [...]
> Just to say I already went through all the steps of running cvs2svn
> myself and had problems gathering back out the branches and all the
> tags when I tried it.  If you want to start with a smaller repository
> like bioperl-network or bioperl-db as the initial cvs2svn conversion
> script took quite a long time to run on bioperl-live.

Might this be a good opportunity to investigate partitioning
bioperl-live into sub-repositories? There has been talk in the past of
defining a set of "core" modules separate from other functionally
related groups of modules that would be viewed as optional extensions.
The goal being to help manage growth and simplify releases. There are
currently 892 modules under Bio/.

In addition to simplifying the migration to SVN, it would also have
other benefits. Say some new functionality or a slew of fixes were
added to Bio::Graphics. We could turn around a new Bio::Graphics
release quickly without having to work on getting various other parts
up to snuff that aren't related to graphics (Biblio, DB, PopGen,
Search etc.). Maintenance and releases of the various extensions would
be more parallelizable, orchestrated by separate ring leaders.

Over time, as a set of functionality matures, it would see fewer
updates and there would be less of a need for users to
download/install/test it. This could make bioperl easier to customize,
extend, and grok in general.

Long term, it should ease development and release cycles, but it will
involve a bit of near term bullet-biting. We'd need to get clear on
how to partition things, including modules, tests, docs, installation
logic, etc. and we'd probably need new integration tests to verify
that the subsets continue working together.

What do folks think? Would this SVN-based, re-partitioned bioperl-live
constitute a 2.0 release? Any volunteers to help assemble a roadmap
and milestones? Should I go on dreaming?

Cheers,
Steve


From bix at sendu.me.uk  Tue Jun 19 07:01:05 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 08:01:05 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
Message-ID: <46777F31.7030402@sendu.me.uk>

Jason Stajich wrote:
> The reason it isn't printing anything is someone didn't really write  
> the implementation quite right. This code was overhauled by Sendu  
> before the last release I guess something didn't quite get connected.
> 
> I checked in code that has the Bio::Taxon delegating now to a DB  
> handle for the each_Descendent call.
> You can either patch your code  or just use the code listed here:
>   http://bioperl.org/wiki/Module:Bio::DB::Taxonomy

I've reverted that change.

For some reason the docs for Bio::Taxon::each_Descendent aren't showing 
up on the website, but they state:

---
Note that this method never asks the database for the descendents; it 
will only return objects you have manually set with add_Descendent(), or 
where this was done for you by making a Bio::Tree::Tree with this object 
as an argument to new().

To get the database descendents use 
$taxon->db_handle->each_Descendent($taxon).
---


I also have a note in the Synopsis for the module:

---
# Though be careful with each_Descendent - unless you add_Descendent()
# yourself, you won't get an answer because unlike for ancestor(),
# Bio::Taxon does not ask the database for the answer. You can ask the
# database yourself using the same method:
($human) = $homo->db_handle->each_Descendent($homo);
---


This is quite deliberate and is to prevent Bad Things from happening. 
(Can't exactly remember the reasoning now, but I know it was good.)
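
To make the gotcha concrete, here is a self-contained toy model of the behaviour described above. Toy::Taxon and Toy::DB are stand-ins invented for this sketch, not the BioPerl classes; only the method names each_Descendent, add_Descendent, db_handle and get_all_Descendents are borrowed from the real API:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy stand-ins, NOT the BioPerl classes: names and data are hypothetical.
package Toy::Taxon;
sub new {
    my ($class, %args) = @_;
    return bless { id => $args{id}, db => $args{db}, desc => [] }, $class;
}
sub id              { $_[0]{id} }
sub db_handle       { $_[0]{db} }
sub add_Descendent  { push @{ $_[0]{desc} }, $_[1]; return }
# Only returns what was added by hand; never consults the database.
sub each_Descendent { @{ $_[0]{desc} } }

package Toy::DB;
sub new {
    my ($class, %parent_of) = @_;
    return bless { parent_of => {%parent_of} }, $class;
}
sub get_taxon { Toy::Taxon->new(id => $_[1], db => $_[0]) }
# Builds the immediate children on demand from the child => parent table.
sub each_Descendent {
    my ($self, $taxon) = @_;
    my $parent_of = $self->{parent_of};
    my @kids = grep { $parent_of->{$_} eq $taxon->id } keys %$parent_of;
    return map { $self->get_taxon($_) } sort @kids;
}
# Recursive walk: each child followed by that child's own descendents.
sub get_all_Descendents {
    my ($self, $taxon) = @_;
    return map { ($_, $self->get_all_Descendents($_)) }
           $self->each_Descendent($taxon);
}

package main;

my $db   = Toy::DB->new(B => 'A', C => 'A', D => 'B');    # child => parent
my $root = $db->get_taxon('A');

my @via_node = $root->each_Descendent;                    # nothing was added by hand
my @via_db   = $root->db_handle->each_Descendent($root);  # asks the "database"
my @all      = $db->get_all_Descendents($root);

print "node: ", scalar(@via_node), "\n";                  # node: 0
print "db:   ", join(',', map { $_->id } @via_db), "\n";  # db:   B,C
print "all:  ", join(',', map { $_->id } @all), "\n";     # all:  B,D,C
```

The point of the sketch is only the contrast between the first two calls: the taxon object answers from what it holds in memory, the database handle answers from its own records.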


From bix at sendu.me.uk  Tue Jun 19 07:41:57 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 08:41:57 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
Message-ID: <467788C5.6070406@sendu.me.uk>

Steve Chervitz wrote:
> Might this be a good opportunity to investigate partitioning
> bioperl-live into sub-repositories? There has been talk in the past of
> defining a set of "core" modules separate from other functionally
> related groups of modules that would be viewed as optional extensions.
> The goal being to help manage growth and simplify releases. There are
> currently 892 modules under Bio/.
> 
> In addition to simplifying the migration to SVN, it would also have
> other benefits. Say some new functionality or a slew of fixes were
> added to Bio::Graphics. We could turn around a new Bio::Graphics
> release quickly without having to work on getting various other parts
> up to snuff that aren't related to graphics (Biblio, DB, PopGen,
> Search etc.). Maintenance and releases of the various extensions would
> be more parallelizable, orchestrated by separate ring leaders.
> 
> Over time, as a set of functionality matures, it would see fewer
> updates and there would be less of a need for users to
> download/install/test it. This could make bioperl easier to customize,
> extend, and grok in general.
> 
> Long term, it should ease development and release cycles

I actually take the opposite view. Breaking things up makes testing and 
releases more difficult.

If one person acts as pumpkin for all the sub-parts, his work-load 
increases almost linearly with the number of sub-parts. If each sub-part 
gets its own pumpkin, where do all these pumpkins come from? It seems to 
me that frequently authors will write modules but inevitably their 
circumstance changes and they can no longer devote the time to look 
after them. Having a single pumpkin and 'forcing' him to make sure 
everything works (regardless of his personal interest in the module) 
seems more reliable than hoping there will be a person interested enough 
in each sub-part to handle its release.

Since all sub-parts will at the least interact with the 'true' core set 
of Bioperl modules, they need to be tested and potentially re-released 
every time the true core is updated. And since some sub-parts will 
interact with other sub-parts, there will need to be coordinated 
joint-testing and release of multiple sub-parts.

What happens when users report problems? We ask them what version 
they're running. Right now '1.5.2' means a specific thing, and it's 
trivial for someone to confirm the same problem by installing 1.5.2. 
What happens when users have to list out all the versions of all the 
sub-parts they have? Who is going to consistently recreate a user's 
hodge-podge of versions in order to confirm a bug? Won't the advice 
instead be: "update all versions to the latest and get back to us"?

So, as I see it, all sub-parts would best be tested and released with a 
single new version number every time one sub-part is updated 
(significantly). In which case, why have sub-parts at all? Keeping 
things the way they are now means ease of release for the pumpkin and 
ease of installation for end-users (only one install command to issue to 
CPAN). Having 'true' sub-parts (each with its own pumpkin), in my 
fatalistic view, is just going to lead to some useful sub-parts being 
abandoned and never updated, even where updates may be desirable.

Each and every Bio:: module could have been released separately by its 
respective author. As I see it, one of the main values of 'Bioperl' is 
that it's one (reasonably) consistent collection of modules that lowers 
the barrier of entry for new Bioinformaticians, giving them extremely 
easy access to a whole host of functionality with a single install.


From hlapp at gmx.net  Tue Jun 19 12:47:02 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 08:47:02 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <46777F31.7030402@sendu.me.uk>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
Message-ID: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>

So the real mistake was to write

  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;

instead of

  my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
  my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents($node);

I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the  
database?

If this is correct, can we highlight this in the documentation? It's  
a small difference that everyone failed to spot.

If it is not correct, then maybe we need to revisit the rationale for  
why a Bio::DB::Taxonomy::get_all_Descendents may not query the  
underlying database.

Also, in my reading of Bio::Taxonomy::Taxon it won't use the database  
either for ancestor(). Which would be consistent with its other methods.

I.e., the bottom line is don't use Node or Taxon objects for  
hierarchy queries that you expect to use an underlying database, use  
the Bio::DB::Taxonomy object instead. It makes sense, but is it true?

	-hilmar

On Jun 19, 2007, at 3:01 AM, Sendu Bala wrote:

> Jason Stajich wrote:
>> The reason it isn't printing anything is someone didn't really write
>> the implementation quite right. This code was overhauled by Sendu
>> before the last release I guess something didn't quite get connected.
>>
>> I checked in code that has the Bio::Taxon delegating now to a DB
>> handle for the each_Descendent call.
>> You can either patch your code  or just use the code listed here:
>>   http://bioperl.org/wiki/Module:Bio::DB::Taxonomy
>
> I've reverted that change.
>
> For some reason the docs for Bio::Taxon::each_Descendent aren't showing
> up on the website, but they state:
>
> ---
> Note that this method never asks the database for the descendents; it
> will only return objects you have manually set with add_Descendent(), or
> where this was done for you by making a Bio::Tree::Tree with this object
> as an argument to new().
>
> To get the database descendents use
> $taxon->db_handle->each_Descendent($taxon).
> ---
>
>
> I also have a note in the Synopsis for the module:
>
> ---
> # Though be careful with each_Descendent - unless you add_Descendent()
> # yourself, you won't get an answer because unlike for ancestor(),
> # Bio::Taxon does not ask the database for the answer. You can ask the
> # database yourself using the same method:
> ($human) = $homo->db_handle->each_Descendent($homo);
> ---
>
>
> This is quite deliberate and is to prevent Bad Things from happening.
> (Can't exactly remember the reasoning now, but I know it was good.)

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From rvos at interchange.ubc.ca  Tue Jun 19 13:05:25 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Tue, 19 Jun 2007 06:05:25 -0700 (PDT)
Subject: [Bioperl-l] SVN and ...Re: Perltidy
Message-ID: <15433211.1182258325544.JavaMail.myubc2@brahms.my.ubc.ca>


> Unrelated, but it randomly just occurred to me: what happens to all the 
> id lines at the top of modules? Eg:
> 
> $Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $
> 
> That's a cvs-specific thing, right? Do we delete them all? (Regardless, 
> I wish we would, since they caused me no end of hassles during the 1.5.2 
> release, doing updates across branches.)

If you run something like 'svn propset svn:keywords Id' on the file/folder/recursively, svn picks up on the $Id tag. The structure of the resulting string would be a little different, because svn revision numbers are simply auto-increasing integers (afaik) - so any regular expressions that cleverly want to include the revision number in $VERSION would need to be updated.
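
As a sketch of what such an updated regular expression might look like, the sub below accepts either form of the keyword. The sub name revision_from_id is made up, and the SVN sample string is invented to follow the layout svn uses (file name, integer revision, ISO date, time with a trailing Z, author):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Extract a revision from either a CVS-style or an SVN-style Id keyword.
# Returns undef in scalar context when nothing matches, e.g. for an
# unexpanded keyword.
sub revision_from_id {
    my ($id) = @_;
    # CVS layout: file,v dotted-revision yyyy/mm/dd time author state
    return $1 if $id =~ /\$Id: \S+,v (\d+(?:\.\d+)+) /;
    # SVN layout: file integer-revision yyyy-mm-dd timeZ author
    return $1 if $id =~ /\$Id: \S+ (\d+) \d{4}-\d{2}-\d{2} /;
    return;
}

# Single quotes keep perl from trying to interpolate the dollar signs.
my $cvs = '$Id: bl2seq.pm,v 1.28 2007/06/14 14:16:10 sendu Exp $';
my $svn = '$Id: bl2seq.pm 12345 2007-06-19 13:05:25Z sendu $';    # made-up example

print revision_from_id($cvs), "\n";    # 1.28
print revision_from_id($svn), "\n";    # 12345
```

Whether an integer repository revision is a sensible thing to fold into $VERSION is a separate question; the sketch only shows that the pattern has to change.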


From bix at sendu.me.uk  Tue Jun 19 14:25:26 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 15:25:26 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
	<5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
Message-ID: <4677E756.6050200@sendu.me.uk>

Hilmar Lapp wrote:
> So the real mistake was to write
> 
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $node->get_all_Descendents;
> 
> instead of
> 
>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>   my @extant_children = grep { $_->is_Leaf } $db->get_all_Descendents($node);
> 
> I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask the  
> database?

Yes, the database object methods use the database. I don't even think it 
makes sense to question that. What else would it do?


> If this is correct, can we highlight this in the documentation? It's  
> a small difference that everyone failed to spot.

The documentation for what? I've already clearly pointed out the gotcha 
in Bio::Taxon.


> Also, in my reading of Bio::Taxonomy::Taxon it won't use the database  
> either for ancestor(). Which would be consistent with its other methods.

Bio::Taxonomy::Taxon? It's deprecated. Bio::Taxon is what we're dealing 
with, and it /does/ use the db to get the ancestor, unless the ancestor 
is manually set (see below for explanation).


> I.e., the bottom line is don't use Node or Taxon objects for  
> hierarchy queries that you expect to use an underlying database, use  
> the Bio::DB::Taxonomy object instead. It makes sense, but is it true?

Almost. It happens to be true but ideally wouldn't be the case. The 
confusion and problems arise, I guess, because we have two ways to 
access/create hierarchies and both of them are built from the same 
building block (Bio::Taxon objects).

On the one hand we have Bio::DB::Taxonomy and the other we have 
Bio::Tree::Tree.

Tree objects are easy: you have a Taxon object created in memory for 
each and every node in the tree. Each Taxon knows its ancestor and 
descendants by storing references to the relevant Taxon objects in the 
tree. You 'navigate' through the tree by grabbing a Taxon inside it and 
asking the Taxon itself for its ancestor or descendant.

This leaves us with the Taxon object having the methods ancestor() and 
each_Descendent(), which we'll expect to work in other circumstances.

Bio::DB::Taxonomy returns single Taxon objects from the database on 
request. Now we still expect our ancestor() and each_Descendent() 
methods to work, but if things were set up like Bio::Tree::Tree we'd end 
up pulling the entire database into memory because we'd have to create 
all the Taxon objects that are ancestors and descendants, recursively, 
every time we request a single Taxon (which is wasteful in the case of 
Bio::DB::Taxonomy::flatfile and slow/not allowed in the case of 
Bio::DB::Taxonomy::entrez).

The solution? We simply don't create the immediate ancestor or 
descendant Taxon objects of the requested Taxon, and instead implement 
the Taxon methods to ask the database to create them on demand, if they 
don't already exist. Well, that idea is fine (and necessary) for the 
ancestor method, but we run into problems with each_Descendent().

The problem arises when we create Bio::Tree::Tree objects from a Taxon 
we got from the database. Being able to do that is why Bio::Taxon is 
shared between them, as it is a very desirable thing to do: you can 
instantly create a lineage tree for a Taxon of interest and then use all 
the Bio::Tree::Tree methods on it. Unfortunately one of those methods is 
get_nodes() which is implemented using each_Descendent() and 
get_all_Descendents(). If each_Descendent() asked the database for the 
real answer, we'd end up pulling the entire database into the tree.

So my implementation was to not ask the database and just warn people in 
the docs. Ideally it /would/ use the database, because that's what a 
user would expect. Can anyone see an alternate way around the problem?
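
To make the asymmetry concrete, here is a toy model of the behaviour described above. ToyDB and ToyTaxon are hypothetical stand-ins written for this sketch, not the real Bio::DB::Taxonomy or Bio::Taxon code: ancestor() is fetched from the database on demand, while each_Descendent() only reports what is already in memory.

```perl
use strict;
use warnings;

# Toy stand-in for a taxonomy database: child name => parent name.
package ToyDB;
sub new {
    my ($class, %parent_of) = @_;
    return bless { parent_of => {%parent_of} }, $class;
}
sub get_taxon {
    my ($self, $name) = @_;
    return ToyTaxon->new(name => $name, db => $self);
}
sub ancestor_of {
    my ($self, $taxon) = @_;
    my $parent = $self->{parent_of}{ $taxon->name };
    return defined $parent ? $self->get_taxon($parent) : undef;
}
sub each_Descendent {
    my ($self, $taxon) = @_;
    return map { $self->get_taxon($_) }
           sort grep { $self->{parent_of}{$_} eq $taxon->name }
           keys %{ $self->{parent_of} };
}

package ToyTaxon;
sub new {
    my ($class, %args) = @_;
    return bless { %args, children => [] }, $class;
}
sub name { $_[0]{name} }
# ancestor(): ask the database on first use, unless one was set by hand.
sub ancestor {
    my $self = shift;
    if (@_) {
        $self->{ancestor} = shift;
    }
    elsif (!exists $self->{ancestor} && $self->{db}) {
        $self->{ancestor} = $self->{db}->ancestor_of($self);
    }
    return $self->{ancestor};
}
# each_Descendent(): in-memory children only; never asks the database.
sub each_Descendent { @{ $_[0]{children} } }

package main;

my $db = ToyDB->new(
    'Homo sapiens' => 'Homo',
    'Homo'         => 'Primates',
    'Primates'     => 'Mammalia',
    'Mus'          => 'Mammalia',    # abbreviated lineage
    'Mammalia'     => 'Eukaryota',
);

my $homo = $db->get_taxon('Homo');
my @kids = $homo->each_Descendent;        # node method: memory only
my @real = $db->each_Descendent($homo);   # database method: real answer

print $homo->ancestor->name, "\n";    # prints Primates
print scalar(@kids), "\n";            # prints 0
print scalar(@real), "\n";            # prints 1
```

The node-level call returns nothing even though the database knows about 'Homo sapiens', which is the gotcha that started this thread.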


From hlapp at gmx.net  Tue Jun 19 16:14:38 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 12:14:38 -0400
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <4677E756.6050200@sendu.me.uk>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
	<5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
	<4677E756.6050200@sendu.me.uk>
Message-ID: <C2348A85-2F44-4AD5-8996-DDA19B79F994@gmx.net>

Sorry I was accidentally looking at an older branch.

Reading through the Taxon module, though, I am more confused than I am  
comfortable with.

Here's what I understand of your description of the problem:

- We would like nodes returned from Bio::DB::Taxonomy to use the  
database for all hierarchical queries.

- We would like nodes used in a Bio::Tree::Tree not to use the  
database for any hierarchical query.

What I understand that we have is

- Taxon node objects that have a db_handle set will use the database  
for ancestor(), unless it has been set manually (?), but not for  
each_Descendent().

- Taxon node objects that don't have a db_handle set won't use a  
database but will function normally otherwise.

- This is needed to prevent Bio::Tree::Tree methods from pulling the  
entire tree into memory.

If this is correct (I'm not sure it is), it sounds like we want to  
temporarily divorce taxonomy nodes from their database capabilities  
while they are being queried in a tree context?

I'm still trying to understand - if I create a Bio::Tree::Tree from a  
single node, will the tree automatically contain all nodes along the  
lineage of ancestors up to the root? So, even if extracting this  
lineage involved querying a database it would be acceptable, but not  
for querying descendents?

It sounds to me like what is needed is that nodes that get added to a  
tree need to be stripped of their database capabilities. This could  
be achieved by creating a wrapper class that delegates all non- 
hierarchical methods to the wrapped Taxon object, and overriding all  
hierarchical queries to not use a database. I'm not sure I fully  
understand yet though, but the inconsistent behavior will be sure to  
throw people off track.
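
A minimal sketch of that wrapper idea, assuming nothing about the real classes: ToyDbTaxon stands in for a database-backed Taxon and TreeOnlyTaxon is a hypothetical wrapper name. Hierarchy methods are answered from links held in the wrapper, and every other method is delegated through AUTOLOAD.

```perl
use strict;
use warnings;

# Stand-in for a database-backed taxon (hypothetical, not Bio::Taxon).
package ToyDbTaxon;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub scientific_name { $_[0]{name} }
sub rank            { $_[0]{rank} }
sub ancestor        { die "would query the database\n" }
sub each_Descendent { die "would query the database\n" }

# Wrapper: hierarchy links live in memory, everything else is delegated.
package TreeOnlyTaxon;
our $AUTOLOAD;
sub new {
    my ($class, $taxon) = @_;
    return bless { taxon => $taxon, ancestor => undef, children => [] }, $class;
}
sub ancestor        { $_[0]{ancestor} }
sub each_Descendent { @{ $_[0]{children} } }
sub add_Descendent {
    my ($self, $child) = @_;
    # Parent and child now reference each other; a real version would
    # probably weaken one side to avoid a reference cycle.
    $child->{ancestor} = $self;
    push @{ $self->{children} }, $child;
    return scalar @{ $self->{children} };
}
# Any method not defined above goes to the wrapped object.
sub AUTOLOAD {
    my $self = shift;
    (my $method = $AUTOLOAD) =~ s/.*:://;
    return if $method eq 'DESTROY';
    return $self->{taxon}->$method(@_);
}

package main;

my $primates = TreeOnlyTaxon->new(
    ToyDbTaxon->new(name => 'Primates', rank => 'order'));
my $homo = TreeOnlyTaxon->new(
    ToyDbTaxon->new(name => 'Homo', rank => 'genus'));
$primates->add_Descendent($homo);

print $homo->scientific_name, "\n";              # prints Homo (delegated)
print $homo->ancestor->scientific_name, "\n";    # prints Primates (in memory)
print scalar($primates->each_Descendent), "\n";  # prints 1
```

The wrapped object's own ancestor() and each_Descendent() are never reached, so a tree built from such wrappers cannot trigger a database query by accident.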

	-hilmar

On Jun 19, 2007, at 10:25 AM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> So the real mistake was to write
>>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>>   my @extant_children = grep { $_->is_Leaf } $node- 
>> >get_all_Descendents;
>> instead of
>>   my $node = $db->get_Taxonomy_Node(-taxonid => '33090');
>>   my @extant_children = grep { $_->is_Leaf } $db- 
>> >get_all_Descendents ($node);
>> I.e., the Bio::DB::Taxonomy object *will* (or is allowed to) ask  
>> the  database?
>
> Yes, the database object methods use the database. I don't even  
> think it makes sense to question that. What else would it do?
>
>
>> If this is correct, can we highlight this in the documentation?  
>> It's  a small difference that everyone failed to spot.
>
> The documentation for what? I've already clearly pointed out the  
> gotcha in Bio::Taxon.
>
>
>> Also, in my reading of Bio::Taxonomy::Taxon it won't use the  
>> database  either for ancestor(). Which would be consistent with  
>> its other methods.
>
> Bio::Taxonomy::Taxon? It's deprecated. Bio::Taxon is what we're  
> dealing with, and it /does/ use the db to get the ancestor, unless  
> the ancestor is manually set (see below for explanation).
>
>
>> I.e., the bottom line is don't use Node or Taxon objects for   
>> hierarchy queries that you expect to use an underlying database,  
>> use  the Bio::DB::Taxonomy object instead. It makes sense, but is  
>> it true?
>
> Almost. It happens to be true but ideally wouldn't be the case. The  
> confusion and problems arise, I guess, because we have two ways to  
> access/create hierarchies and both of them are built from the same  
> building block (Bio::Taxon objects).
>
> On the one hand we have Bio::DB::Taxonomy and the other we have  
> Bio::Tree::Tree.
>
> Tree objects are easy: you have a Taxon object created in memory  
> for each and every node in the tree. Each Taxon knows its ancestor  
> and descendants by storing references to the relevant Taxon objects  
> in the tree. You 'navigate' through the tree by grabbing a Taxon  
> inside it and asking the Taxon itself for its ancestor or descendant.
>
> This leaves us with the Taxon object having the methods ancestor()  
> and each_Descendent(), which we'll expect to work in other  
> circumstances.
>
> Bio::DB::Taxonomy returns single Taxon objects from the database on  
> request. Now we still expect our ancestor() and each_Descendent()  
> methods to work, but if things were set up like Bio::Tree::Tree  
> we'd end up pulling the entire database into memory because we'd  
> have to create all the Taxon objects that are ancestors and  
> descendants, recursively, every time we request a single Taxon  
> (which is wasteful in the case of Bio::DB::Taxonomy::flatfile and  
> slow/not allowed in the case of Bio::DB::Taxonomy::entrez).
>
> The solution? We simply don't create the immediate ancestor or  
> descendant Taxon objects of the requested Taxon, and instead  
> implement the Taxon methods to ask the database to create them on  
> demand, if they don't already exist. Well, that idea is fine (and  
> necessary) for the ancestor method, but we run into problems with  
> each_Descendent().
>
> The problem arises when we create Bio::Tree::Tree objects from a  
> Taxon we got from the database. Being able to do that is why  
> Bio::Taxon is shared between them, as it is a very desirable thing  
> to do: you can instantly create a lineage tree for a Taxon of  
> interest and then use all the Bio::Tree::Tree methods on it.  
> Unfortunately one of those methods is get_nodes() which is  
> implemented using each_Descendent() and get_all_Descendents(). If  
> each_Descendent() asked the database for the real answer, we'd end  
> up pulling the entire database into the tree.
>
> So my implementation was to not ask the database and just warn  
> people in the docs. Ideally it /would/ use the database, because  
> that's what a user would expect. Can anyone see an alternate way  
> around the problem?

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cain.cshl at gmail.com  Tue Jun 19 18:41:52 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Tue, 19 Jun 2007 14:41:52 -0400
Subject: [Bioperl-l] [Gmod-gbrowse] is this a bp_genbank2gff3.pl bug?
In-Reply-To: <18039.61086.829726.809888@gargle.gargle.HOWL>
References: <18039.61086.829726.809888@gargle.gargle.HOWL>
Message-ID: <1182278512.2592.42.camel@localhost.localdomain>

Hi Alessandra,

I cc'ed your message to the bioperl and sequence ontology mailing lists,
since your question is relevant to both.

Converting genbank files to GFF3 is excruciatingly difficult; I
generally find that I can use the genbank2gff3 script to get me most of
the way there, but then I need to do some manual fixing to make it
'right'.

I am using bioperl-live, since there have been several fixes to the
script since bioperl 1.5.2 was released, including the most recent fixes
from me today (when I started working on this); I would suggest you use
bioperl-live as well.  I ran the script on chrY.

Most (perhaps all) of the errors fit into a few categories:

  - CDS doesn't have a phase, where the GFF3 spec requires CDSes to have
a phase.  Since it can be a little bit of a hassle to calculate, I
understand why it was left out, but I'll submit a bug report to have
those calculated.  If you are planning on loading the GFF file into
Chado, you can use the --noCDS option to get exons instead of CDSes,
which makes the problem go away (the validator has a bug here though--it
reports the polypeptide derives_from mRNA as invalid, but it is correct;
I'm reporting that directly to the author).  Here's the bioperl bug
report:

  http://bugzilla.open-bio.org/show_bug.cgi?id=2322

  - "invalid type pair" is caused by the genbank file using feature
types in a way that conflicts with the Sequence Ontology.  For example,
it has STS features that are part_of a gene, pseudogenic_region as
part_of pseudogene.  I don't know if there would be an easy way to catch
this in the conversion script.  You may need to fix these by hand.  If
the problems occur for features that you don't care about, you can use
the --filter option to leave them out of the resulting GFF file (for
example, adding '--filter STS' would leave all STS features out of the
file).  Also, if you don't plan on loading these into Chado (which does
require SO-compliance) but instead plan on using a Bio::DB::SeqFeature
database, these errors may not be a problem.

  - "invalid type" is caused by feature types that are not in SOFA
(Sequence Ontology for Feature Annotation), though the terms probably
are in SO.  I thought at one point we discussed allowing any SO type to
appear in the GFF3 type column, but that is not what the spec says now.
I don't see this type of error as causing a problem for either
Bio::DB::SeqFeature or Chado.  Chado allows features to be typed with
anything that is in SO and does not restrict to SOFA.
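
For the missing CDS phases in the first item above, the calculation could look roughly like the sketch below. It assumes the CDS segments of one transcript are supplied in translation order and that the first segment begins on a codon boundary; cds_phases is a hypothetical helper, not part of bp_genbank2gff3.pl.

```perl
use strict;
use warnings;

# Given the lengths of one transcript's CDS segments in translation order
# (5' to 3' on the coding strand), return the GFF3 phase of each segment.
# Phase = number of bases to skip to reach the first base of the next codon.
sub cds_phases {
    my @lengths = @_;
    my @phases;
    my $phase = 0;    # assumes the first segment starts on a codon boundary
    for my $len (@lengths) {
        push @phases, $phase;
        # Bases left over after the last complete codon in this segment.
        my $leftover = ($len - $phase) % 3;
        # Bases the next segment must contribute to finish that codon.
        $phase = (3 - $leftover) % 3;
    }
    return @phases;
}

print join(',', cds_phases(100, 50, 80)), "\n";    # prints 0,2,0
```

For minus-strand features the segments would need to be ordered from the highest coordinate downwards before being passed in.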

Scott


On Tue, 2007-06-19 at 16:56 +0200, Alessandra Bilardi wrote:
> Hi all,
> 
> I used bp_genbank2gff3.pl from CVS bioperl to create a GFF3 file from a
> human genbank file. When I ran the online validate_gff3 on human.gff, it
> reported non-unique IDs, so loading the file into the gbrowse database
> produces errors.
> 
> I attach the error files for hs_ref_chrY.gbk and hs_ref_chr1.gbk, which
> I downloaded from ftp://ftp.ncbi.nih.gov/genomes/H_sapiens
> Elements with non-unique IDs are:
> - CDS or pseudo*exon without mRNA and parent 
> - STS with equal start and end
> - tRNA with equal name
> 
> If this is a bp_genbank2gff3.pl bug, can you rectify bp_genbank2gff3.pl?
> If I'm mistaken, can you help me?
> 
> Thanks very much for the help in advance,
> 
> Alessandra.
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From sac at bioperl.org  Tue Jun 19 18:54:39 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Tue, 19 Jun 2007 11:54:39 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <467788C5.6070406@sendu.me.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
Message-ID: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>

Valid points, Sendu. I wonder if there might be a best-of-both-worlds
approach here. I would not be advocating for a major slice and dice,
but just identifying a few large, reasonably well established and
encapsulated blocks of functionality that could be managed more
independently and segregating them away from the rest. For example:
DB, Graphics, Search+SearchIO, Tools.

Once per year, we could have a "whole caboodle" release where the core
and all sub parts are tested and released as a group, as we currently
do. Then, updates to the sub parts can occur as-needed but without
necessarily involving updates to other sub parts or the core.

The onus would be on the pumpkin for the sub part release to make sure
it continues to work with the last whole caboodle release. This would
minimize the number of release clashes, since sub part updates would
only be sanctioned relative to the last caboodle release, and it would
ensure that the whole set continues to interoperate.

Perhaps it would be worth experimenting with such an approach so we
can judge it based on actual experience. We could identify one
functional sub part and segregate it out, do a release cycle or two,
along with a sub part release, and decide if this makes things easier
or harder, for devs as well as users. We could always bring it back
into the fold if it doesn't work out.

My fear is that as bioperl continues to grow, the monolithic approach
will become increasingly onerous for a single release pumpkin to
manage, and harder to find someone who feels up to the task. It could
also discourage new developers from diving into the codebase if it
looks too deep. And they are our lifeblood.

A more functionally segregated bioperl codebase could lower the
activation energy needed to recruit release pumpkins and new devs,
leading to more release iterations, fewer bugs, more features, and
more sustainable growth.

When I first discovered Bioperl in 1996, it had three modules. At
~900, I  probably wouldn't have joined ranks as a developer (well, I
probably would, but it would have taken a while to digest it and
become a contributor).

Steve

On 6/19/07, Sendu Bala <bix at sendu.me.uk> wrote:
> Steve Chervitz wrote:
> > Might this be a good opportunity to investigate partitioning
> > bioperl-live into sub-repositories? There has been talk in the past of
> > defining a set of "core" modules separate from other functionally
> > related groups of modules that would be viewed as optional extensions.
> > The goal being to help manage growth and simplify releases. There are
> > currently 892 modules under Bio/.
> >
> > In addition to simplifying the migration to SVN, it would also have
> > other benefits. Say some new functionality or a slew of fixes were
> > added to Bio::Graphics. We could turn around a new Bio::Graphics
> > release quickly without having to work on getting various other parts
> > up to snuff that aren't related to graphics (Biblio, DB, PopGen,
> > Search etc.). Maintenance and releases of the various extensions would
> > be more parallelizable, orchestrated by separate ring leaders.
> >
> > Over time, as a set of functionality matures, it would see fewer
> > updates and there would be less of a need for users to
> > download/install/test it. This could make bioperl easier to customize,
> > extend, and grok in general.
> >
> > Long term, it should ease development and release cycles
>
> I actually take the opposite view. Breaking things up makes testing and
> releases more difficult.
>
> If one person acts as pumpkin for all the sub-parts, his work-load
> increases almost linearly with the number of sub-parts. If each sub-part
> gets its own pumpkin, where do all these pumpkins come from? It seems to
> me that frequently authors will write modules but inevitably their
> circumstance changes and they can no longer devote the time to look
> after them. Having a single pumpkin and 'forcing' him to make sure
> everything works (regardless of his personal interest in the module)
> seems more reliable than hoping there will be a person interested enough
> in each sub-part to handle its release.
>
> Since all sub-parts will at the least interact with the 'true' core set
> of Bioperl modules, they need to be tested and potentially re-released
> every time the true core is updated. And since some sub-parts will
> interact with other sub-parts, there will need to be coordinated
> joint-testing and release of multiple sub-parts.
>
> What happens when users report problems? We ask them what version
> they're running. Right now '1.5.2' means a specific thing, and it's
> trivial for someone to confirm the same problem by installing 1.5.2.
> What happens when users have to list out all the versions of all the
> sub-parts they have? Who is going to consistently recreate a user's
> hodge-podge of versions in order to confirm a bug? Won't the advice
> instead be: "update all versions to the latest and get back to us"?
>
> So, as I see it, all sub-parts would best be tested and released with a
> single new version number every time one sub-part is updated
> (significantly). In which case, why have sub-parts at all? Keeping
> things the way they are now means ease of release for the pumpkin and
> ease of installation for end-users (only one install command to issue to
> CPAN). Having 'true' sub-parts (each with its own pumpkin), in my
> fatalistic view, is just going to lead to some useful sub-parts being
> abandoned and never updated, even where updates may be desirable.
>
> Each and every Bio:: module could have been released separately by its
> respective author. As I see it, one of the main values of 'Bioperl' is
> that it's one (reasonably) consistent collection of modules that lowers
> the barrier of entry for new Bioinformaticians, giving them extremely
> easy access to a whole host of functionality with a single install.
>


From bix at sendu.me.uk  Tue Jun 19 19:13:39 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 20:13:39 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
Message-ID: <46782AE3.2090703@sendu.me.uk>

Steve Chervitz wrote:
> Valid points, Sendu. I wonder if there might be a best-of-both-worlds
> approach here.
[snip]

You haven't convinced me, but I'd go along with the majority decision if 
best-of-both-worlds was picked.


> DB, Graphics, Search+SearchIO, Tools.

I will, however, say that DB interleaves into too many core modules. It 
should stay in core. Tools? It's hardly touched anyway, so I don't see 
the value of taking it out, what with Bio::Tools::Run already being its 
own package. Most Bioperl users probably get Bioperl just to do 
something Blast related, so all Blast stuff really ought to stay in core.

Graphics is an obvious choice and I agree. Updated frequently, and has 
its own release needs. It also has some of the trickier dependencies, so 
would make installing core simpler.

I can imagine plucking Search+SearchIO out, and it's something that needs 
regular updating. Another good candidate.


> Perhaps it would be worth experimenting with such an approach so we
> can judge it based on actual experience. We could identify one
> functional sub part and segregate it out, do a release cycle or two,
> along with a sub part release, and decide if this makes things easier
> or harder, for devs as well as users.

Well, we already have the run package. It's a split-off subpart that gets 
updated. The only 'experiment' left to do is finding it its own pumpkin.


From bix at sendu.me.uk  Tue Jun 19 19:48:50 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 19 Jun 2007 20:48:50 +0100
Subject: [Bioperl-l] Taxonomy hierarchy extraction
In-Reply-To: <C2348A85-2F44-4AD5-8996-DDA19B79F994@gmx.net>
References: <369098.81077.qm@web56507.mail.re3.yahoo.com>
	<F4F4954A-7457-470A-B723-5E33B0A5924E@bioperl.org>
	<46777F31.7030402@sendu.me.uk>
	<5C565157-0415-469A-AB3D-E343D3D6455D@gmx.net>
	<4677E756.6050200@sendu.me.uk>
	<C2348A85-2F44-4AD5-8996-DDA19B79F994@gmx.net>
Message-ID: <46783322.30309@sendu.me.uk>

Hilmar Lapp wrote:
> Here's what I understand of your description of the problem:
> 
> - We would like nodes returned from Bio::DB::Taxonomy to use the  
> database for all hierarchical queries.
> 
> - We would like nodes used in a Bio::Tree::Tree not to use the  
> database for any hierarchical query.

Correct.


> What I understand that we have is
> 
> - Taxon node objects that have a db_handle set will use the database  
> for ancestor(), unless it has been set manually (?), but not for  
> each_Descendent().
> 
> - Taxon node objects that don't have a db_handle set won't use a  
> database but will function normally otherwise.
> 
> - This is needed to prevent Bio::Tree::Tree methods from pulling the  
> entire tree into memory.

Correct.


> If this is correct (I'm not sure it is), it sounds like we want to  
> temporarily divorce taxonomy nodes from their database capabilities  
> while they are being queried in a tree context?

Yes.


> I'm still trying to understand - if I create a Bio::Tree::Tree from a  
> single node, will the tree automatically contain all nodes along the  
> lineage of ancestors up to the root? So, even if extracting this  
> lineage involved querying a database it would be acceptable, but not  
> for querying descendents?

Yes. Asking the database for all the ancestors up to root only pulls a 
couple of nodes into the tree and is exactly what the user would want to 
happen. But if nodes are allowed to get their descendants from the 
database, when we get the root node from the database, we'd get all the 
root's descendants, and then for each of those we'd get all /their/ 
descendants... that's when the whole db gets sucked in.


> It sounds to me like what is needed is that nodes that get added to a  
> tree need to be stripped of their database capabilities. This could  
> be achieved by creating a wrapper class that delegates all non- 
> hierarchical methods to the wrapped Taxon object, and overriding all  
> hierarchical queries to not use a database. I'm not sure I fully  
> understand yet though, but the inconsistent behavior will be sure to  
> throw people off track.

When we're making a tree from a db Taxon we need db access to find all 
the ancestors; we just don't want to get any descendants outside our 
initiating Taxon's direct lineage.


use Test::More 'no_plan';
use Bio::DB::Taxonomy;
use Bio::Tree::Tree;

my @names = ('Eukaryota', 'Mammalia', 'Primates', 'Homo', 'Homo sapiens');
my @ranks = qw(superkingdom class order genus species);
my $db = Bio::DB::Taxonomy->new(-source => 'list', -names => \@names,
                                                    -ranks => \@ranks);

@names = ('Eukaryota', 'Mammalia', 'Rodentia', 'Mus', 'Mus musculus');
$db->add_lineage(-names => \@names, -ranks => \@ranks);


my $homo = $db->get_taxon(-name => 'Homo');
isa_ok($homo, 'Bio::Taxon'); # PASS

is $homo->ancestor->scientific_name, 'Primates'; # PASS
my @descs = $homo->each_Descendent;
is scalar(@descs), 1; # FAIL, we wanted it to contain the 'Homo sapiens' node


my $lineage = Bio::Tree::Tree->new(-node => $homo);
is $lineage->get_root_node->scientific_name, 'Eukaryota'; # PASS
my @nodes = $lineage->get_nodes;
is scalar(@nodes), 4; # PASS: we didn't pull in Rodentia which would be 8

(on that last test I can't remember if the answer might actually be 5 
because our lineage does contain 'Homo sapiens')


If anyone can figure out how to get all those to pass, please let me know.


From cjfields at uiuc.edu  Tue Jun 19 21:15:00 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 19 Jun 2007 16:15:00 -0500
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
Message-ID: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>


On Jun 19, 2007, at 1:54 PM, Steve Chervitz wrote:

> Valid points, Sendu. I wonder if there might be a best-of-both-worlds
> approach here. I would not be advocating for a major slice and dice,
> but just identifying a few large, reasonably well established and
> encapsulated blocks of functionality that could be managed more
> independently and segregating them away from the rest. For example:
> DB, Graphics, Search+SearchIO, Tools.

There should also be a consensus between the core devs on this; I  
don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing  
their opinions as it will directly impact projects which rely on core  
functionality (GBrowse/GMOD, bioperl-db, etc).  I also agree with  
George that this should be postponed until after svn issues are taken  
care of.

Stating that, I think this is a good idea in general, though we'll  
need to be careful which ones we segregate out as non-core.  I agree  
with your choices; I would add in Bio::Restriction, Bio::Assembly,  
Bio::Structure, and a few more.  As long as the distribution required  
installation of 'core' prior to test runs it shouldn't be too much of  
a problem.

In order for this to work we would need to delineate what defines  
'core' (how broad the definition should be), then identify those  
modules that don't fit and decide what to do with them.  Would we  
want to split the others into separate packages or lump together as a  
bioperl-auxiliary (horrid name, but you get my point)?  Too many  
could be a logistical nightmare, as Sendu has pointed out.

> Once per year, we could have a "whole caboodle" release where the core
> and all sub parts are tested and released as a group, as we currently
> do. Then, updates to the sub parts can occur as-needed but without
> necessarily involving updates to other sub parts or the core.

Sounds fine by me.  Actually, my thought was we could reimplement  
Bundle::BioPerl on CPAN (which Module::Build effectively obsoleted)  
to install all the necessary subpackages in order to emulate an old- 
style 'core' installation, or act as an 'install everything BioPerl- 
related' Bundle.  Regular updates of the subpackages to CPAN should  
just require updating the Bundle (which would update only the  
relevant parts, at least I believe it would).
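
For anyone who hasn't looked inside one, a CPAN Bundle is just a small module whose POD has a CONTENTS section listing what to install. A rough sketch follows; the version number and the choice of modules listed are only illustrative assumptions, not an agreed definition of the sub-parts.

```perl
package Bundle::BioPerl;

use strict;
use warnings;

our $VERSION = '0.01';    # hypothetical version number

1;

__END__

=head1 NAME

Bundle::BioPerl - install the BioPerl core plus its split-out sub-parts

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerl'

=head1 CONTENTS

Bio::Root::Version - stands in for the core distribution

Bio::Graphics - stands in for a split-out graphics package

Bio::SearchIO - stands in for a split-out Search+SearchIO package

=cut
```

Each CONTENTS line names one module, and the CPAN shell installs the distribution that provides it, so listing one module per sub-part would be enough.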

> The onus would be on the pumpkin for the sub part release to make sure
> it continues to work with the last whole caboodle release. This would
> minimize the number of release clashes, since sub part updates would
> only be sanctioned relative to the last caboodle release, and it would
> ensure that the whole set continues to interoperate.
>
> Perhaps it would be worth experimenting with such an approach so we
> can judge it based on actual experience. We could identify one
> functional sub part and segregate it out, do a release cycle or two,
> along with a sub part release, and decide if this makes things easier
> or harder, for devs as well as users. We could always bring it back
> into the fold if it doesn't work out.
>
> My fear is that as bioperl continues to grow, the monolithic approach
> will become increasingly onerous for a single release pumpkin to
> manage, and harder to find someone who feels up to the task. It could
> also discourage new developers from diving into the codebase if it
> looks too deep. And they are our lifeblood.

Agreed!

> A more functionally segregated bioperl codebase could lower the
> activation energy needed to recruit release pumpkins and new devs,
> leading to more release iterations, fewer bugs, more features, and
> more sustainable growth.

'Activation energy.'  Hmm.  Spoken like a true biologist.

> When I first discovered Bioperl in 1996, it had three modules. At
> ~900, I  probably wouldn't have joined ranks as a developer (well, I
> probably would, but it would have taken a while to digest it and
> become a contributor).
>
> Steve

I pretty much agree, though this will require quite a bit more  
discussion.

chris


From hlapp at gmx.net  Tue Jun 19 21:57:54 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 19 Jun 2007 17:57:54 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
Message-ID: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>


On Jun 19, 2007, at 5:15 PM, Chris Fields wrote:

> There should also be a consensus between the core devs on this; I
> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing
> their opinions

The problem I have increasingly had with BioPerl (aside from the fact  
that it's written in Perl ;) is the plethora of dependencies I need  
to install, not the number of modules.

But every time I've been told that that's what Perl is all about, and  
I should shut up and install the bundle. Idiosyncratically I don't  
like bundles that clutter up my hard disk with stuff I'll never use,  
and in this sense if BioPerl is divided into 10 packages I will have  
to think about each one whether I need it, and do a separate CVS  
checkout - and regular update - of each one (though granted, I  
believe there are ways the multiple checkout and update thing can be  
taken care of).

In reality, this may be a rapidly disappearing trait though of those  
who have grown up in a time when they proudly spent all their savings  
to buy that new computer because it had a 20MB hard disk, compared to  
the two 360k floppy drives the previous one had.

So don't ask me, just don't make it too hard for the dinosaurs.

> as it will directly impact projects which rely on core
> functionality (GBrowse/GMOD, bioperl-db, etc).

Well, I hope there are ways to limit that?

> I also agree with George that this should be postponed until after  
> svn issues are taken care of.

I agree entirely. Please don't throw this in the same bin or tie one  
to the other. The migration is neither easier nor faster nor better  
testable with a partitioned BioPerl.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 20 01:48:20 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 19 Jun 2007 20:48:20 -0500
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
Message-ID: <D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>


On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote:

> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote:
>
>> There should also be a consensus between the core devs on this; I
>> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing
>> their opinions
>
> The problem I have increasingly had with BioPerl (aside from the fact
> that it's written in Perl ;) is the plethora of dependencies I need
> to install, not the number of modules.
>
> But every time I've been told that that's what Perl is all about, and
> I should shut up and install the bundle. Idiosyncratically I don't
> like bundles that clutter up my hard disk with stuff I'll never use,
> and in this sense if BioPerl is divided into 10 packages I will have
> to think about each one whether I need it, and do a separate CVS
> checkout - and regular update - of each one (though granted, I
> believe there are ways the multiple checkout and update thing can be
> taken care of).

I agree; the fewer dependencies the better.  We could divide it up  
into a small, focused core package with only a few dependencies, and  
1-3 more containing the focused bits which require the most  
maintenance (Graphics, SearchIO/Tools, etc).  I worry about having  
too many more.

> In reality, this may be a rapidly disappearing trait though of those
> who have grown up in a time when they proudly spent all their savings
> to buy that new computer because it had a 20MB hard disk, compared to
> the two 360k floppy drives the previous one had.
>
> So don't ask me, just don't make it too hard for the dinosaurs.

There would need to be some way of getting an old-style full-blown  
core installation regardless of how many subdistros we would divvy  
core up into.  My thought for CPAN was having Bundle::BioPerl take  
over this but I'm not sure if it's still being used.  Maybe there are  
other ways for svn/cvs.

>> as it will directly impact projects which rely on core
>> functionality (GBrowse/GMOD, bioperl-db, etc).
>
> Well, I hope there are ways to limit that?

I believe so, yes, particularly for bioperl-db.  I would think  
splitting off Bio::Graphics or Bio::DB* will have some effect on  
GBrowse/GFF.

>> I also agree with George that this should be postponed until after
>> svn issues are taken care of.
>
> I agree entirely. Please don't throw this in the same bin or tie one
> to the other. The migration is neither easier nor faster nor better
> testable with a partitioned BioPerl.
>
> 	-hilmar

We def. have to complete transition to subversion first, then think  
about this some more.

chris


From n.haigh at sheffield.ac.uk  Wed Jun 20 06:31:24 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 20 Jun 2007 07:31:24 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
Message-ID: <4678C9BC.10206@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> On Jun 19, 2007, at 4:57 PM, Hilmar Lapp wrote:
> 
>> On Jun 19, 2007, at 5:15 PM, Chris Fields wrote:
>>
>>> There should also be a consensus between the core devs on this; I
>>> don't see it going very far w/o Lincoln, Jason, Hilmar, etc. voicing
>>> their opinions
>> The problem I have increasingly had with BioPerl (aside from the fact
>> that it's written in Perl ;) is the plethora of dependencies I need
>> to install, not the number of modules.
>>
>> But every time I've been told that that's what Perl is all about, and
>> I should shut up and install the bundle. Idiosyncratically I don't
>> like bundles that clutter up my hard disk with stuff I'll never use,
>> and in this sense if BioPerl is divided into 10 packages I will have
>> to think about each one whether I need it, and do a separate CVS
>> checkout - and regular update - of each one (though granted, I
>> believe there are ways the multiple checkout and update thing can be
>> taken care of).
> 
> I agree; the fewer dependencies the better.  We could divide it up  
> into a small, focused core package with only a few dependencies, and  
> 1-3 more containing the focused bits which require the most  
> maintenance (Graphics, SearchIO/Tools, etc).  I worry about having  
> too many more.
> 
>> In reality, this may be a rapidly disappearing trait though of those
>> who have grown up in a time when they proudly spent all their savings
>> to buy that new computer because it had a 20MB hard disk, compared to
>> the two 360k floppy drives the previous one had.
>>
>> So don't ask me, just don't make it too hard for the dinosaurs.
> 
> There would need to be some way of getting an old-style full-blown  
> core installation regardless of how many subdistros we would divy  
> core up into.  My thought for CPAN was having Bundle::BioPerl take  
> over this but I'm not sure if it's still being used.  Maybe there are  
> other ways for svn/cvs.

Personally, I think this use of Bundle::Bioperl is more in line with
what CPAN Bundles were meant to do - "a bundle is a collection of
modules that comprise a cohesive unit". Under that definition you could
probably put the whole of Bioperl but I won't go there! When a package
is updated and a new release is made, this should be
installable/updatable via cpan as well as updating the bundle with the
correct version. This way you can get all of Bioperl via the bundle, or
just install the sub-packages on their own.
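
As a rough illustration of the bundle idea: a CPAN bundle is just a small
module whose POD has a CONTENTS section with one module per entry, and
CPAN.pm installs whichever distribution provides each listed module. The
sketch below assumes a split has already happened. The three entries, their
grouping, and the version number are placeholders for illustration, not the
contents of any released Bundle::BioPerl.

```perl
package Bundle::BioPerl;

use strict;
use warnings;

our $VERSION = '0.01';    # placeholder version, not a real release number

1;

=head1 NAME

Bundle::BioPerl - install every BioPerl sub-distribution in one go (sketch)

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerl'

=head1 CONTENTS

Bio::Root::Root - stands in for a hypothetical "core" distribution

Bio::Graphics - stands in for a hypothetical "graphics" distribution

Bio::SearchIO - stands in for a hypothetical "search/tools" distribution

=cut
```

Someone who only wants one piece would install that distribution directly
and never touch the bundle.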

If the switch over to svn takes place, will all the Bioperl-* projects
move over at the same time? If so, will they go into their own svn
repository or into the same one? Since with svn you can check out any
subtree of the repository, I'm not clear on the pros and cons of either
of these options.

Am I right in thinking that there is a way for cvs to define a "project"
such that when you checkout that "project" it actually checks out
multiple projects behind the scene? I'm sure I've seen this somewhere,
possibly when the project is dependent on some 3rd party code that is
also in cvs. If this is possible, I'm sure it will also be possible with
svn. This could then allow something like the following to happen after
the split up of Bioperl. The following projects could be defined:
bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project"
called "bioperl" would actually check out the real projects called
bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems
that this ought to be possible, doesn't it?


> 
>>> as it will directly impact projects which rely on core
>>> functionality (GBrowse/GMOD, bioperl-db, etc).
>> Well, I hope there are ways to limit that?
> 
> I believe so, yes, particularly for bioperl-db.  I would think  
> splitting off Bio::Graphics or Bio::DB* will have some effect on  
> GBrowse/GFF.
> 
>>> I also agree with George that this should be postponed until after
>>> svn issues are taken care of.
>> I agree entirely. Please don't throw this in the same bin or tie one
>> to the other. The migration is neither easier nor faster nor better
>> testable with a partitioned BioPerl.
>>
>> 	-hilmar
> 
> We def. have to complete transition to subversion first, then think  
> about this some more.
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeMm7czuW2jkwy2gRAi+CAJ9cNZ70GojV7eviRjdWTFLk/MKYoACg2Ls4
op9sQTZyeK6G6taFhTAPMYc=
=7NRw
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Wed Jun 20 11:46:16 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 07:46:16 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <4678C9BC.10206@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
Message-ID: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote:

> If the switch over to svn takes place, will all the Bioperl-* projects
> move over at the same time?

They are under the same CVSROOT right now. Locking down some sub- 
repositories but not others may be odd or impossible.

> If so, will they go into their own svn repository or into the same  
> one?

Good question, I'm not sure about the pros and cons one way or the  
other either. The fewer repositories the less sysadmin work in fine- 
graining permissions.

	-hilmar

- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFGeRONuV6N2JxL7qsRAoYTAJ9GVuC0j4szCcWTg7yWGoxN3YFucQCgogJ8
Ims4d150lsX0vXtDwGI1lKg=
=K4++
-----END PGP SIGNATURE-----


From n.haigh at sheffield.ac.uk  Wed Jun 20 11:57:22 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 20 Jun 2007 12:57:22 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
Message-ID: <46791622.6080409@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hilmar Lapp wrote:
> 
> On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote:
> 
>> If the switch over to svn takes place, will all the Bioperl-* projects
>> move over at the same time?
> 
> They are under the same CVSROOT right now. Locking down some
> sub-repositories but not others may be odd or impossible.
> 
>> If so, will they go into their own svn repository or into the same one?
> 
> Good question, I'm not sure about the pros and cons one way or the other
> either. The fewer repositories the less sysadmin work in fine-graining
> permissions.
> 
>     -hilmar
> 


I don't think there is any major reason why the following single repos
wouldn't do the trick:

/--
  |-bioperl-live
  |     |--- trunk
  |     |--- branches
  |     |--- tags
  |
  |-bioperl-run
        |--- trunk
        |--- branches
        |--- tags

Any reason why this couldn't be used?

I know some people don't like the idea of the revision number
incrementing for the whole repository if it contains several "projects".
However, revision numbers are really only a way for svn to keep track of
things and a very large revision number shouldn't really "upset" anyone.

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeRYiczuW2jkwy2gRApS5AJsHl73MWZP8aMfOqlLgTYuzpMWmQgCg3VqA
1Vj8BSUnanpdjYYLE6eGanU=
=bOqK
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Wed Jun 20 12:08:33 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 08:08:33 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <46791622.6080409@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
	<46791622.6080409@sheffield.ac.uk>
Message-ID: <DBFDD481-4377-4E7C-A4F6-B1B57A4D0A9F@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote:

> I don't think there is any major reason why the following single repos
> wouldn't do the trick:
>
> /--
>   |-bioperl-live
>   |     |--- trunk
>   |     |--- branches
>   |     |--- tags
>   |
>   |-bioperl-run
>         |--- trunk
>         |--- branches
>         |--- tags
>
> Any reason why this couldn't be used?

That would work fine except that there are several more sub-projects  
(bioperl-db, bioperl-graphics, bioperl-microarray, and a few more).

That should still be fine. I think what needs to be recognized is the  
limitations it puts on permission granularity. If it's all the same  
repository (as is now) then having commit rights to one (subproject)  
will mean commit rights to all. From my perspective that's fine, it  
has worked great so far.

	-hilmar

- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFGeRjFuV6N2JxL7qsRAj3dAJ42r1C8By29DNTUP9Ts0Lf5dOcS9QCgjSE1
hckjT7LBtHcmwGI8B+BKQIM=
=gYfA
-----END PGP SIGNATURE-----


From hartzell at alerce.com  Tue Jun 19 19:53:39 2007
From: hartzell at alerce.com (George Hartzell)
Date: Tue, 19 Jun 2007 12:53:39 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and ...Re:
	Perltidy)
In-Reply-To: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
Message-ID: <18040.13379.217277.992742@almost.alerce.com>

Steve Chervitz writes:
 > On 6/16/07, Jason Stajich <jason at bioperl.org> wrote:
 > > [...]
 > > Just to say I already went through all the steps of running cvs2svn
 > > myself and had problems gathering back out the branches and all the
 > > tags when I tried it.  If you want to start with a smaller repository
 > > like bioperl-network or bioperl-db as the initial cvs2svn conversion
 > > script took quite a long time to run on bioperl-live.
 > 
 > Might this be a good opportunity to investigate partitioning
 > bioperl-live into sub-repositories? [...]

I'd say that the time to do this kind of rearrangement would be
*after* the svn repo's set up.  That way you'll be able to track stuff
back through to the beginning of time.

g.


From sdavis2 at mail.nih.gov  Wed Jun 20 12:44:08 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 20 Jun 2007 08:44:08 -0400
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN
	and	...Re:	Perltidy)
In-Reply-To: <DBFDD481-4377-4E7C-A4F6-B1B57A4D0A9F@gmx.net>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>	<4678C9BC.10206@sheffield.ac.uk>	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>	<46791622.6080409@sheffield.ac.uk>
	<DBFDD481-4377-4E7C-A4F6-B1B57A4D0A9F@gmx.net>
Message-ID: <46792118.4030205@mail.nih.gov>

Hilmar Lapp wrote:
> 
> On Jun 20, 2007, at 7:57 AM, Nathan S. Haigh wrote:
> 
>> I don't think there is any major reason why the following single repos
>> wouldn't do the trick:
> 
>> /--
>>   |-bioperl-live
>>   |     |--- trunk
>>   |     |--- branches
>>   |     |--- tags
>>   |
>>   |-bioperl-run
>>         |--- trunk
>>         |--- branches
>>         |--- tags
> 
>> Any reason why this couldn't be used?
> 
> That would work fine except that there are several more sub-projects  
> (bioperl-db, bioperl-graphics, bioperl-microarray, and a few more).
> 
> That should still be fine. I think what needs to be recognized is the  
> limitations it puts on permission granularity. If it's all the same  
> repository (as is now) then having commit rights to one (subproject)  
> will mean commit rights to all. From my perspective that's fine, it  
> has worked great so far.

Actually, I think there are ways of creating per-directory access
control.  See here:

http://svnbook.red-bean.com/en/1.2/svn-book.html#svn.serverconfig.svnserve.auth.general

With Apache-based https access, such access control is relatively
straightforward, it appears.  With the standalone svn server over ssh,
one needs to use "commit hook scripts" to limit access.  But I think it
is possible (admitting that I have not tried to do this...).

Sean


From hartzell at alerce.com  Wed Jun 20 13:23:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 20 Jun 2007 06:23:32 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <4678C9BC.10206@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
Message-ID: <18041.10836.728079.835572@almost.alerce.com>

Nathan S. Haigh writes:
 > [...]
 > If the switch over to svn takes place, will all the Bioperl-* projects
 > move over at the same time? If so, will they go into their own svn
 > repository or into the same one? Since with svn you can checkout any
 > subtree of the repository I'm not clear on the pro's and cons of either
 > of these options.

I'm planning to drop the projects from the top of the CVSROOT into a
single svn repository:

    bioperl-ext bioperl-pipeline biodata bioperl-gui
    bioperl-run bioperl-cookbook bioperl-live biosql-schema
    bioperl-corba-client bioperl-microarray html bioperl-corba-server
    bioperl-network task-manager bioperl-das-client bioperl-papers
    xml-html bioperl-db bioperl-pedigree

although that's open to feedback from the core members.

As a progress report, I've built a demo repos with -run, -ext, and
-live in it and asked a couple of folks to take a peek at it.  When
I get a bit further along I'll figure out how to get something for the
public to test.

 > Am I right in thinking that there is a way for cvs to define a "project"
 > such that when you checkout that "project" it actually checks out
 > multiple projects behind the scene? I'm sure I've seen this somewhere,
 > possibly when the project is dependent on some 3rd party code that is
 > also in cvs. If this is possible, I'm sure it will also be possible with
 > svn. This could then allow something like the following to happen after
 > the split up of Bioperl. The following projects could be defined:
 > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project"
 > called "bioperl" would actually checkout the real projects call
 > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems
 > that this ought to be possible, doesn't it?
 > [...]

I don't think that there's any functionality like that in svn.

g.


From hartzell at alerce.com  Wed Jun 20 13:26:04 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 20 Jun 2007 06:26:04 -0700
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <46791622.6080409@sheffield.ac.uk>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>
	<467788C5.6070406@sendu.me.uk>
	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>
	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>
	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>
	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>
	<4678C9BC.10206@sheffield.ac.uk>
	<5AB0035E-B3FA-44A3-9595-1C1196FF7594@gmx.net>
	<46791622.6080409@sheffield.ac.uk>
Message-ID: <18041.10988.375946.833182@almost.alerce.com>

Nathan S. Haigh writes:
 > -----BEGIN PGP SIGNED MESSAGE-----
 > Hash: SHA1
 > 
 > Hilmar Lapp wrote:
 > > 
 > > On Jun 20, 2007, at 2:31 AM, Nathan S. Haigh wrote:
 > > 
 > >> If the switch over to svn takes place, will all the Bioperl-* projects
 > >> move over at the same time?
 > > 
 > > They are under the same CVSROOT right now. Locking down some
 > > sub-repositories but not others may be odd or impossible.
 > > 
 > >> If so, will they go into their own svn repository or into the same one?
 > > 
 > > Good question, I'm not sure about the pros and cons one way or the other
 > > either. The fewer repositories the less sysadmin work in fine-graining
 > > permissions.
 > > 
 > >     -hilmar
 > > 
 > 
 > 
 > I don't think there is any major reason why the following single repos
 > wouldn't do the trick:
 > 
 > /--
 >   |-bioperl-live
 >   |     |--- trunk
 >   |     |--- branches
 >   |     |--- tags
 >   |
 >   |-bioperl-run
 >         |--- trunk
 >         |--- branches
 >         |--- tags
 > 
 > Any reason why this couldn't be used?
 > [...]

That's exactly the way that I'm setting it up.

g.


From n.haigh at sheffield.ac.uk  Wed Jun 20 13:33:33 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 20 Jun 2007 14:33:33 +0100
Subject: [Bioperl-l] Bioperl partitioning (was Re: SVN and
	...Re:	Perltidy)
In-Reply-To: <18041.10836.728079.835572@almost.alerce.com>
References: <8f200b4c0706182342m2de15cd0l4aa12dcfb78da766@mail.gmail.com>	<467788C5.6070406@sendu.me.uk>	<8f200b4c0706191154x84de66dk1e0a8be13cd747c1@mail.gmail.com>	<3E65196E-4D08-4D8D-80B2-4CECBBC8CE92@uiuc.edu>	<62CA07B5-3278-4EC4-B495-F2275ECC954F@gmx.net>	<D744A5D8-CD60-42C0-AD7A-EA9C991796E0@uiuc.edu>	<4678C9BC.10206@sheffield.ac.uk>
	<18041.10836.728079.835572@almost.alerce.com>
Message-ID: <46792CAD.5060700@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George Hartzell wrote:
> Nathan S. Haigh writes:
>  > [...]
>  > If the switch over to svn takes place, will all the Bioperl-* projects
>  > move over at the same time? If so, will they go into their own svn
>  > repository or into the same one? Since with svn you can checkout any
>  > subtree of the repository I'm not clear on the pro's and cons of either
>  > of these options.
> 
> I'm planning to drop the projects from the top of the CVSROOT into a
> single svn repository:
> 
>     bioperl-ext bioperl-pipeline biodata bioperl-gui
>     bioperl-run bioperl-cookbook bioperl-live biosql-schema
>     bioperl-corba-client bioperl-microarray html bioperl-corba-server
>     bioperl-network task-manager bioperl-das-client bioperl-papers
>     xml-html bioperl-db bioperl-pedigree
> 
> although that's open to feedback from the core members.
> 
> As a progress report, I've built a demo repos with -run, -ext, and
> -live in it and asked a couple of folks to to take a peek at it.  When
> I get a bit further along I'll figure out how to get something for the
> public to test.

Could I take a peek??

> 
>  > Am I right in thinking that there is a way for cvs to define a "project"
>  > such that when you checkout that "project" it actually checks out
>  > multiple projects behind the scene? I'm sure I've seen this somewhere,
>  > possibly when the project is dependent on some 3rd party code that is
>  > also in cvs. If this is possible, I'm sure it will also be possible with
>  > svn. This could then allow something like the following to happen after
>  > the split up of Bioperl. The following projects could be defined:
>  > bioperl-core, bioperl-graphics etc. Issuing a checkout of a "project"
>  > called "bioperl" would actually checkout the real projects call
>  > bioperl-core, bioperl-graphics etc. I may just be dreaming, but it seems
>  > that this ought to be possible, doesn't it?
>  > [...]
> 
> I don't think that there's any functionality like that in svn.


I did come across this which might help:
http://subversion.tigris.org/servlets/ReadMsg?listName=users&msgNo=43561

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGeSytczuW2jkwy2gRAnlUAJ4pjhPlYlqOm+M882Ni116MJVzPCwCbB3Su
sWDAmqFhGgtlyeawaIGSV14=
=zeAY
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Wed Jun 20 15:38:20 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 20 Jun 2007 16:38:20 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
Message-ID: <467949EC.9040100@sendu.me.uk>

In considering updating all the test scripts to take advantage of the 
new network option, and/or reimplementing them in Test::More, I thought 
now would be a good time to standardize all the test scripts and reduce 
the possibility of having to alter them all in the future if something 
changes.

For example we could decide on an alternate way of choosing to run 
network tests, or a new way of deciding to output debug information. 
There are also some inconsistencies in the messages produced by tests 
skipping all, and even an unfortunate mistake that has been copy/pasted 
through a lot of test scripts.

My solution is t/lib/BioperlTest.pm (documented with perldoc)

We go from this:

----
use strict;
our $DEBUG;

BEGIN {
   $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;
	
   eval { require Test::More; };
   if( $@ ) {
     use lib 't/lib';
   }
   use Test::More; # the mistake!
	
   use Module::Build;
   my $build = Module::Build->current();
   my $do_network_tests = $build->notes('network');

   eval {
     require IO::String;
     require LWP;
     require LWP::UserAgent;
   };
   if ($@) {
     plan skip_all => 'IO::String or LWP or LWP::UserAgent not installed.
This means Bio::Tools::Run::RemoteBlast is not usable. Skipping tests';
   }
   elsif (!$do_network_tests) {
     plan skip_all => 'Network tests have not been requested, skipping
all';
   }
   else {
     plan tests => 21;
   }

   #...
}

my $obj = Bio::Object->new(-verbose => $DEBUG);
#...
----

To this:

----
use strict;

BEGIN {
   use lib 't/lib';
   use BioperlTest;

   test_begin(-requires_modules => [qw(IO::String LWP LWP::UserAgent)],
              -requires_networking => 1,
              -tests => 21);

   #...
}

my $obj = Bio::Object->new(-verbose => test_debug());
#...
----
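
For anyone curious what could sit behind an interface like this, here is a
minimal sketch of how test_begin() and test_debug() might be written. It is
not the code in t/lib/BioperlTest.pm. The skip_reason() helper and the
BIOPERL_NETWORK_TESTS environment variable are made up for illustration (the
old-style script above reads the network setting from Module::Build notes
instead).

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper: return a skip message, or undef if the tests should run.
sub skip_reason {
    my %args = @_;

    # Try to load each required module and collect the ones that fail.
    my @missing = grep { !eval "require $_; 1" } @{ $args{-requires_modules} || [] };
    return "The optional module(s) @missing not installed, skipping all" if @missing;

    # Assumption: network tests are opt-in through an environment variable.
    return 'Network tests have not been requested, skipping all'
        if $args{-requires_networking} && !$ENV{BIOPERL_NETWORK_TESTS};

    return;
}

# Declare the plan: either skip the whole script or announce the test count.
sub test_begin {
    my %args   = @_;
    my $reason = skip_reason(%args);
    if ( defined $reason ) {
        plan skip_all => $reason;
    }
    else {
        plan tests => $args{-tests};
    }
}

# Debug level, taken from the same environment variable the old scripts used.
sub test_debug { return $ENV{BIOPERLDEBUG} || 0 }

# Usage: File::Spec ships with Perl, so this plans one test and runs it.
test_begin( -requires_modules => [qw(File::Spec)], -tests => 1 );
ok( defined( test_debug() ), 'test_debug() returns a defined value' );
```

Keeping the skip decision in one place means a later change to how network
tests are requested would touch the helper only, not every test script.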


Can anyone identify problems with this approach? Is the interface 
presented by BioperlTest flexible enough that any changes would only be 
additions for new functionality (and therefore all test scripts wouldn't 
need to be altered)? Is BioperlTest missing anything you'd like?

Are there any objections to me updating all tests in this manner? For an 
example, see t/RemoteBlast.t


Cheers,
Sendu.


From spiros at lokku.com  Wed Jun 20 15:49:48 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Wed, 20 Jun 2007 16:49:48 +0100
Subject: [Bioperl-l] Network tests overhaul
In-Reply-To: <4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
References: <467661F0.2060703@sendu.me.uk>
	<C5F2C2F6-5CB0-40EC-9CFE-22E9224EC268@wustl.edu>
	<4676A01F.30205@sendu.me.uk>
	<082DD0A6-6E01-4032-9D19-191F9213A6BD@uiuc.edu>
	<4676B41E.3050706@sendu.me.uk>
	<4D4061A2-E937-4F97-85C3-8937B597C64F@uiuc.edu>
Message-ID: <bba689ec0706200849p3d32ffb8wee14bbeb2027e905@mail.gmail.com>

Yep, they are not all done. Some still need to be ported over; I'm doing
some here and there at home. However, the recent email Sendu sent, the
one about abstracting the setup of testing, is actually something I was
thinking about myself, so it might be a better way to tackle the problem.
For one, it would save us from duplicating the same 30 lines of code
across all tests.

As far as network tests are concerned, I've always been an avid hater of
them. I believe they bring more trouble than they are worth, due to the
diversity of setups people have. My way of tackling them was always to
group all the tests that required live access into one file and then run
just that file, if and only if needed and not by default. Like I said,
that's just my opinion; I've been bitten by them one time too many.

Spiros

On 6/18/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On Jun 18, 2007, at 11:34 AM, Sendu Bala wrote:
>
> > Chris Fields wrote:
> >> Couldn't you enable BIOPERLDEBUG, disable network access, then
> >> iterate through tests checking for those which fail or skip?
> >
> > Yes, good idea, though my dev machine is also my email/webserver so
> > I'd rather come up with an alternate solution than one involving
> > 'disable network access'.
> >
> > Still, that's what I'll probably end up doing. Cheers!
> >
> >
> > Oh, Chris, Spiros, how goes the Test::More conversion? I might want
> > to wait for you to finish, or join in? If you're not going to have
> > time to do any more in the next few weeks, can you please update
> > http://www.bioperl.org/wiki/TestMoreProgress removing your name (or
> > in the opposite case, add your name in)? Its not quite clear to me
> > which tests are assigned to whom. Can someone clarify what the
> > markings mean?
> >
> > Cheers,
> > Sendu.
>
> Not sure how far along spiros is; I handed it over after I finished
> up to the 'Q' tests.  In general the ones marked out have been
> converted over, ones with names next to them have been claimed.  If
> you need help I'll prob. start back up again to finish them off; we
> just need to divy them up.
>
> chris
>


From hlapp at gmx.net  Wed Jun 20 16:27:47 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 20 Jun 2007 12:27:47 -0400
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467949EC.9040100@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
Message-ID: <A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>

Very cool! Sounds like a no-brainer to me to adopt this in all the  
tests. -hilmar

On Jun 20, 2007, at 11:38 AM, Sendu Bala wrote:

> In considering updating all the test scripts to take advantage of the
> new network option, and/or reimplementing them in Test::More, I  
> thought
> now would be a good time to standardize all the test scripts and  
> reduce
> the possibility of having to alter them all in the future if something
> changes.
>
> For example we could decide on an alternate way of choosing to run
> network tests, or a new way of deciding to output debug information.
> There are also some inconsistencies in the messages produced by tests
> skipping all, and even an unfortunate mistake that has been copy/pasted
> through a lot of test scripts.
>
> My solution is t/lib/BioperlTest.pm (documented with perldoc)
>
> We go from this:
>
> ----
> use strict;
> our $DEBUG;
>
> BEGIN {
>    $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;
> 	
>    eval { require Test::More; };
>    if( $@ ) {
>      use lib 't/lib';
>    }
>    use Test::More; # the mistake!
> 	
>    use Module::Build;
>    my $build = Module::Build->current();
>    my $do_network_tests = $build->notes('network');
>
>    eval {
>      require IO::String;
>      require LWP;
>      require LWP::UserAgent;
>    };
>    if ($@) {
>      plan skip_all => 'IO::String or LWP or LWP::UserAgent not installed. This means Bio::Tools::Run::RemoteBlast is not usable. Skipping tests';
>    }
>    elsif (!$do_network_tests) {
>      plan skip_all => 'Network tests have not been requested, skipping all';
>    }
>    else {
>      plan tests => 21;
>    }
>
>    #...
> }
>
> my $obj = Bio::Object->new(-verbose => $DEBUG);
> #...
> ----
>
> To this:
>
> ----
> use strict;
>
> BEGIN {
>    use lib 't/lib';
>    use BioperlTest;
>
>    test_begin(-requires_modules => [qw(IO::String LWP LWP::UserAgent)],
>               -requires_networking => 1,
>               -tests => 21);
>
>    #...
> }
>
> my $obj = Bio::Object->new(-verbose => test_debug());
> #...
> ----
>
>
> Can anyone identify problems with this approach? Is the interface
> presented by BioperlTest flexible enough that any changes would only be
> additions for new functionality (and therefore all test scripts wouldn't
> need to be altered)? Is BioperlTest missing anything you'd like?
>
> Are there any objections to me updating all tests in this manner? For an
> example, see t/RemoteBlast.t
>
>
> Cheers,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================
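
For readers following the thread: the line marked "the mistake" in the
quoted script is presumably a problem because Perl executes 'use'
statements at compile time, so both "use lib 't/lib'" (inside the if) and
"use Test::More" take effect whether or not the preceding eval succeeded.
The sketch below shows, in plain Perl, the kind of run-time checks a helper
such as test_begin() could centralize. It is not the BioperlTest.pm
implementation: missing_modules(), skip_reason() and the -network_enabled
argument are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

# Return the names of the given modules that cannot be loaded.
sub missing_modules {
    my @missing;
    for my $module (@_) {
        # Only accept well-formed package names before using string eval.
        if ($module !~ /\A\w+(?:::\w+)*\z/) {
            push @missing, $module;
            next;
        }
        # String eval so that 'require' sees a package name at run time.
        push @missing, $module unless eval "require $module; 1";
    }
    return @missing;
}

# Hypothetical helper: return a reason to skip all tests, or nothing
# (undef in scalar context) if the tests can go ahead.
sub skip_reason {
    my %args = @_;
    my @missing = missing_modules(@{ $args{-requires_modules} || [] });
    return "Required module(s) not installed: @missing" if @missing;
    return 'Network tests have not been requested'
        if $args{-requires_networking} && !$args{-network_enabled};
    return;
}
```

A test script could then call Test::More's plan(skip_all => $reason) when
skip_reason() returns a message, and plan(tests => N) otherwise.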


From cjfields at uiuc.edu  Wed Jun 20 16:44:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 20 Jun 2007 11:44:01 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>
References: <467949EC.9040100@sendu.me.uk>
	<A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>
Message-ID: <BF4BB95D-4B4F-4336-9FA4-AE7B0C961C96@uiuc.edu>

Agreed!  You've already created an example case so there's something  
to go off of.

I plan on changing some EUtilities tests soon so I'll try  
implementing this, basing off your RemoteBlast.t implementation.   
Seems clear enough on the surface; if I run into problems I'll post.

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From wollenbergk at mail.nih.gov  Wed Jun 20 18:11:04 2007
From: wollenbergk at mail.nih.gov (Wollenberg, Kurt (NIH/NIAID))
Date: Wed, 20 Jun 2007 14:11:04 -0400
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
Message-ID: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>

Greetings:

I am working on a script to take a list of sequence IDs, extract the
sequences from GenPept, and then run a BLAST search for each of the
retrieved sequences. I am having a problem with the sequence retrieval,
where some sequences are found and others are not and it's not obvious to me
why this is. 

For example, using a text file containing the two following IDs as input:
SKG3_YEAST
NEM1_YEAST

My script 

while( <IN> ) {
  chomp;
  my $seqid = $_;
  my $seq_obj = get_sequence( 'genpept', $seqid );
}

will create a sequence object for the first ID, (print "Accession of
",$seqid," is ",$seq_obj->accession, "\n"; gives me the correct accession
number) but for the second I am told

-------------------- WARNING ---------------------
MSG: id (NEM1_YEAST) does not exist
---------------------------------------------------

When I pull up these records using the Entrez cross-database search in my web
browser I find genpept records for both SKG3_YEAST and NEM1_YEAST (using
these search terms). In both records these IDs reside in the same field
("DBSOURCE    swissprot: locus") so I'm mystified why get_sequence finds one
but not the other. Any advice would be greatly appreciated.

Cheers,
Kurt Wollenberg, Ph.D.
Phylogenetics and Sequence Analysis Consultant
Biocomputing Research Consulting Section
Bioinformatics and Scientific IT Program (BSIP)
NIH/NIAID/OTIS
Contractor, Lockheed Martin
http://bioinformatics.niaid.nih.gov

Disclaimer:
The information in this e-mail and any of its attachments is confidential
and may contain sensitive information. It should not be used by anyone who
is not the original intended recipient. If you have received this e-mail in
error please inform the sender and delete it from your mailbox or any other
storage devices. National Institute of Allergy and Infectious Diseases shall
not accept liability for any statements made that are sender's own and not
expressly made on behalf of the NIAID by one of its representatives.


From bosborne11 at verizon.net  Wed Jun 20 18:59:39 2007
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 20 Jun 2007 14:59:39 -0400
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
In-Reply-To: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>
Message-ID: <C29EF15B.EAF7%bosborne11@verizon.net>

Kurt,

I can't answer your question but I wouldn't use Bio::Perl myself, I'd use
Bio::DB::GenPept:

501 ~>perl -e 'use Bio::DB::GenPept; $db = Bio::DB::GenPept->new; $seq =
$db->get_Seq_by_acc('NEM1_YEAST'); print $seq->seq;'
MNALKYFSNHLITTKKQKKINVEVTKNQDLLGPSKEVSNKYTSHSENDCVSEVDQQYDHSSSHLKESDQNQERKNS
VPKKPKALRSILIEKIASILWALLLFLPYYLIIKPLMSLWFVFTFPLSVIERRVKHTDKRNRGSNASENELPVSSS
NINDSSEKTNPKNCNLNTIPEAVEDDLNASDEIILQRDNVKGSLLRAQSVKSRPRSYSKSELSLSNHSSSNTVFGT
KRMGRFLFPKKLIPKSVLNTQKKKKLVIDLDETLIHSASRSTTHSNSSQGHLVEVKFGLSGIRTLYFIHKRPYCDL
FLTKVSKWYDLIIFTASMKEYADPVIDWLESSFPSSFSKRYYRSDCVLRDGVGYIKDLSIVKDSEENGKGSSSSLD
DVIIIDNSPVSYAMNVDNAIQVEGWISDPTDTDLLNLLPFLEAMRYSTDVRNILALKHGEKAFNIN502 ~>

It's true that Bio::Perl is easy-to-use but it's also _very_ limited.

Brian O.




From cjfields at uiuc.edu  Wed Jun 20 20:11:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 20 Jun 2007 15:11:34 -0500
Subject: [Bioperl-l] get_sequence() gets some sequences but not others
In-Reply-To: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>
References: <C29EE5F8.17F7%wollenbergk@mail.nih.gov>
Message-ID: <F9F5A58E-4767-49C4-80F2-DEE3CA474C01@uiuc.edu>

I'm assuming you are using the Bio::Perl exported sub get_sequence().
I am able to reproduce the issue using bioperl-live; it's an odd issue,
as direct use of Bio::DB::GenPept works fine:

use Bio::DB::GenPept;

my $factory = Bio::DB::GenPept->new();

my @accs = qw(SKG3_YEAST NEM1_YEAST);

my $io = $factory->get_Stream_by_acc(\@accs);

while (my $seq = $io->next_seq) {
     print "Accession:",$seq->accession,"\n";
}

chris



Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
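
A minimal sketch combining the ID-reading loop from the original question
with the batch retrieval shown above. It assumes one identifier per line;
read_ids(), fetch_accessions() and the file name ids.txt are hypothetical
names, and the retrieval part needs BioPerl plus network access, so the
module is only loaded when that function is actually called.

```perl
use strict;
use warnings;

# Read one identifier per line, trimming whitespace and skipping blanks.
sub read_ids {
    my ($fh) = @_;
    my @ids;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;    # also strips a trailing "\r"
        push @ids, $line if length $line;
    }
    return @ids;
}

# Fetch all records in one request; returns a hash reference mapping each
# returned record's display_id to its accession number.
sub fetch_accessions {
    my (@ids) = @_;
    require Bio::DB::GenPept;       # loaded lazily: needs BioPerl
    my $factory = Bio::DB::GenPept->new();
    my $stream  = $factory->get_Stream_by_acc(\@ids);
    my %found;
    while (my $seq = $stream->next_seq) {
        $found{ $seq->display_id } = $seq->accession_number;
    }
    return \%found;
}

# Example use (requires BioPerl and a network connection):
#   open my $in, '<', 'ids.txt' or die "Cannot open ids.txt: $!";
#   my $found = fetch_accessions(read_ids($in));
#   print "$_ => $found->{$_}\n" for sort keys %$found;
```

Whether display_id matches the identifier that was requested depends on
the record, so treat the hash keys as what the server returned rather than
as an echo of the input list.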


From sac at bioperl.org  Thu Jun 21 06:32:47 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Wed, 20 Jun 2007 23:32:47 -0700
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <BF4BB95D-4B4F-4336-9FA4-AE7B0C961C96@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<A6F7E5D6-8904-439A-B676-BFED9991836B@gmx.net>
	<BF4BB95D-4B4F-4336-9FA4-AE7B0C961C96@uiuc.edu>
Message-ID: <8f200b4c0706202332w25a09547k1de20f24466877d9@mail.gmail.com>

Looks like a nice refactor. After it's in place, don't forget to
update the wiki:
http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests

Steve



From staffa at niehs.nih.gov  Thu Jun 21 18:36:12 2007
From: staffa at niehs.nih.gov (Staffa, Nick (NIH/NIEHS))
Date: Thu, 21 Jun 2007 14:36:12 -0400
Subject: [Bioperl-l] BIO::DB::FASTA  ID
Message-ID: <C2A03D5E.4DE9%staffa@niehs.nih.gov>

This program below returns only  1527 IDs from a fasta file that I have
constructed, which has
mildred> grep -c "^>Dpse" T_orthologs_Dpse_genes.fa
1820
.
It actually does not return the first 3 ids,
nor the 5th, nor 7..36, 38,39,41..44......
The header lines are of variable length and the sequence lines are 80
characters except at the ends when they might be shorter.
Is there some caveat that I am ignoring in my format that breaks
bio::db::fasta?


#!/usr/bin/perl
#
#
#
use strict;
use Bio::DB::Fasta;
use Bio::Tools::SeqWords;
use Bio::Seq;
use Bio::SeqIO;
$|=1;
#
#
my $Dpse_UTR_file_for_T_orthologs =
"/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa";
my $db = Bio::DB::Fasta->new
('/home/staffa/clients/Kari/D_pse_genome/testit/T_orthologs_Dpse_genes.fa',
  -reindex,  -makeid => \&make_my_id);
my @ids = $db->ids;
my $number_in = @ids;
print "number of Dpse IDs = $number_in\n";
foreach my $id (@ids){
print "$id\n";
}
sub make_my_id {
#       parse header line:
#       >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0 TTATTTATT
    my $line = shift;
#    print "line = $line\n";
    $line =~ />(\w+) /;
    my $ID = $1;
#    print "ID = $ID\n";
    return $ID;
      }

-------------- next part --------------
A non-text attachment was scrubbed...
Name: T_orthologs_Dpse_genes.fa
Type: application/octet-stream
Size: 5033676 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070621/07c354d0/attachment-0004.obj>

From jason at bioperl.org  Thu Jun 21 21:19:14 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 21 Jun 2007 14:19:14 -0700
Subject: [Bioperl-l] BIO::DB::FASTA  ID
In-Reply-To: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
References: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
Message-ID: <F3A92546-08EE-4AD5-BFCE-BF006D153AD7@bioperl.org>

Hey Nick -
I think
a) your IDs are not unique
b) you need to declare the function make_my_id BEFORE you call
Bio::DB::Fasta->new if you want your function to be used.

$ grep "^>" T_orthologs_Dpse_genes.fa | awk '{print $1}' | sort | uniq | wc -l
1527


-jason

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From mkiwala at watson.wustl.edu  Thu Jun 21 21:23:46 2007
From: mkiwala at watson.wustl.edu (Michael Kiwala)
Date: Thu, 21 Jun 2007 16:23:46 -0500
Subject: [Bioperl-l] BIO::DB::FASTA  ID
In-Reply-To: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
References: <C2A03D5E.4DE9%staffa@niehs.nih.gov>
Message-ID: <467AEC62.2040508@watson.wustl.edu>

You only have 1527 unique id's in the file.

~$ grep '^>' Desktop/T_orthologs_Dpse_genes.fa|cut -d\  -f1|sort -u|wc -l
1527


Change your make_id function to make sure the id's are unique.
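
One way this could look, as a sketch only: the version of make_my_id below
joins the first two words of the header line, on the assumption (not
checked against the attached file) that this pair is unique per entry.
duplicate_ids() is a hypothetical helper for checking any ID-building
callback against the header lines before indexing.

```perl
use strict;
use warnings;

# Hypothetical ID builder: join the first two words of the header, e.g.
# ">Dpse_GA13134 CG14636 NO UTR ..." becomes "Dpse_GA13134_CG14636".
sub make_my_id {
    my ($line) = @_;
    return join('_', $1, $2) if $line =~ /^>(\S+)\s+(\S+)/;
    return $1                if $line =~ /^>(\S+)/;
    return;
}

# Return the IDs that the callback generates more than once for the
# header lines read from a FASTA filehandle.
sub duplicate_ids {
    my ($fh, $make_id) = @_;
    my %count;
    while (my $line = <$fh>) {
        next unless $line =~ /^>/;
        chomp $line;
        my $id = $make_id->($line);
        $count{$id}++ if defined $id;
    }
    return grep { $count{$_} > 1 } sort keys %count;
}
```

When handing the callback to Bio::DB::Fasta->new it is probably also worth
writing "-reindex => 1, -makeid => \&make_my_id". In the posted call,
"-reindex," has no value of its own, so as far as I can tell Perl pairs it
with the following '-makeid' string and the callback is never seen as the
-makeid option.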




From bix at sendu.me.uk  Mon Jun 25 13:06:27 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 25 Jun 2007 14:06:27 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467949EC.9040100@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
Message-ID: <467FBDD3.8050009@sendu.me.uk>

Sendu Bala wrote:
> In considering updating all the test scripts to [... use] t/lib/BioperlTest.pm

I'm now in the process of converting all test scripts. In addition to 
those things mentioned previously, BioperlTest now also provides the 
methods test_input_file() and test_output_file().


This:
----
use Bio::Root::IO;
my $output_file = Bio::Root::IO->catfile(qw(t data temp.file));
$obj->new(-file => ">$output_file");

END {
   unlink($output_file);
}

...

$obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file)));
----


Becomes this:
----
my $output_file = test_output_file();
$obj->new(-file => ">$output_file");

...

$obj->new(-file => test_input_file('input.file'));
----


I should think the benefits are obvious, especially for the output
files: END blocks are used inconsistently (or not at all), so some tests
leave output data behind on occasion.

test_input_file() is helpful for the shorthand, but also gets rid of 
many tests' usage of Bio::Root::IO (relying on something you're 
installing and testing in another test script to work in the current 
test script, without testing it in your own test script seems like a 
no-no to me).
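
For illustration, helpers of this kind could be built from core modules
alone. The following is a rough sketch, not the code in BioperlTest.pm:
sketch_input_file() and sketch_output_file() are made-up names, and using
File::Temp for the cleanup is an assumption about one possible approach.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp ();

# Hypothetical stand-in for test_input_file(): build a path below t/data
# from its components, using the current platform's separator.
sub sketch_input_file {
    return File::Spec->catfile('t', 'data', @_);
}

# Keep the File::Temp objects alive so the files exist until the script ends.
my @keep_alive;

# Hypothetical stand-in for test_output_file(): create a temporary file and
# return its name. With UNLINK => 1 the file is removed when its object is
# destroyed, which here happens at program exit, so no END block is needed.
sub sketch_output_file {
    my $tmp = File::Temp->new(UNLINK => 1);
    push @keep_alive, $tmp;
    return $tmp->filename;
}
```

Centralizing the cleanup this way means an individual test script cannot
forget it, which is the inconsistency described above.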


From cjfields at uiuc.edu  Mon Jun 25 13:39:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 08:39:21 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467FBDD3.8050009@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
Message-ID: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>

On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> In considering updating all the test scripts to [... use] t/lib/BioperlTest.pm
>
> I'm now in the process of converting all test scripts. In addition to
> those things mentioned previously, BioperlTest now also provides the
> methods test_input_file() and test_output_file().
>
>
> This:
> ----
> use Bio::Root::IO;
> my $output_file = Bio::Root::IO->catfile(qw(t data temp.file));
> $obj->new(-file => ">$output_file");
>
> END {
>    unlink($output_file);
> }
>
> ...
>
> $obj->new(-file => Bio::Root::IO->catfile(qw(t data input.file)));
> ----
>
>
> Becomes this:
> ----
> my $output_file = test_output_file();
> $obj->new(-file => ">$output_file");
>
> ...
>
> $obj->new(-file => test_input_file('input.file'));
> ----
>
>
> I should think the benefits are obvious, especially for the output
> files: END blocks are used inconsistently (or not at all), so some tests
> leave output data behind on occasion.

Sounds fine by me, though it's a lot of work.  BTW, did we ever  
decide whether to finish up with Test::More conversion?  I haven't  
heard back yet; let me know what you want to do.

> test_input_file() is helpful for the shorthand, but also gets rid of
> many tests' usage of Bio::Root::IO (relying on something you're
> installing and testing in another test script to work in the current
> test script, without testing it in your own test script seems like a
> no-no to me).

Well, in a way isn't that itself a test of the class (whether it  
breaks or not)?  ; >

Do test_input_file() and test_output_file() handle directory
structures in an OS-safe way like catfile()?  For instance, I plan on
adding test data to a new directory similar to Bio::Graphics
(t/data/eutil) to prevent cluttering of the t/data directory.  I could
use '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the
base directory is 't/data' but that may not be cross-platform compatible
with win32 file systems, which may still expect something like
't\data\eutil\input.xml'.

chris


From bix at sendu.me.uk  Mon Jun 25 13:45:23 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 25 Jun 2007 14:45:23 +0100
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
Message-ID: <467FC6F3.6080705@sendu.me.uk>

Chris Fields wrote:
> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:
>> I should think the benefits are obvious, especially for the output
>> files: END blocks are used inconsistently (or not at all), so some tests
>> leave output data behind on occasion.
> 
> Sounds fine by me, though it's a lot of work.  BTW, did we ever decide 
> whether to finish up with Test::More conversion?  I haven't heard back 
> yet; let me know what you want to do.

I'm doing the remaining Test::More conversions at the same time.


> Do test_input_file() and test_output_file() handle directory structures 
> in an OS-safe way like catfile()?  For instance, I plan on adding test 
> data to a new directory similar to Bio::Graphics (t/data/eutil) to 
> prevent cluttering of the t/data directory.  I could use 
> '$obj->new(-file => test_input_file('/eutil/input.xml'))' if the base 
> directory is 't/data' but that may not be cross-platform compatible with 
> win32 file systems, which may still expect something like 
> 't\data\eutil\input.xml'.

It's platform-independent, currently implemented using File::Spec. So 
you'll say:

$obj->new(-file => test_input_file('eutil', 'input.xml'));

It's all documented in the POD of BioperlTest.
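
To see why passing the path components separately matters, the
platform-specific File::Spec classes (both ship with Perl) can be called
directly. This is only a demonstration of File::Spec itself; path_for()
is a hypothetical wrapper and not part of BioperlTest.

```perl
use strict;
use warnings;
use File::Spec::Unix;
use File::Spec::Win32;

# Join path components using the rules of the given File::Spec class.
sub path_for {
    my ($class, @parts) = @_;
    return $class->catfile(@parts);
}

my @parts = ('t', 'data', 'eutil', 'input.xml');

# Same components, joined with each platform's own separator.
print 'Unix:  ', path_for('File::Spec::Unix',  @parts), "\n";
print 'Win32: ', path_for('File::Spec::Win32', @parts), "\n";
```

A test script normally just calls File::Spec->catfile (or a wrapper around
it), which picks the right class for the system it is running on.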


From cjfields at uiuc.edu  Mon Jun 25 13:49:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 08:49:51 -0500
Subject: [Bioperl-l] New testing base: BioperlTest.pm
In-Reply-To: <467FC6F3.6080705@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<53825978-65F8-49F4-8836-2F6A26A5CAC4@uiuc.edu>
	<467FC6F3.6080705@sendu.me.uk>
Message-ID: <679B8E76-C090-4A29-B843-99B5853FE2FB@uiuc.edu>


On Jun 25, 2007, at 8:45 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 25, 2007, at 8:06 AM, Sendu Bala wrote:
>>> I should think the benefits are obvious, especially for the output
>>> files: END blocks are used inconsistently (or not at all), so some tests
>>> leave output data behind on occasion.
>> Sounds fine by me, though it's a lot of work.  BTW, did we ever  
>> decide whether to finish up with Test::More conversion?  I haven't  
>> heard back yet; let me know what you want to do.
>
> I'm doing the remaining Test::More conversions at the same time.

Okay.  Just didn't want to do any redundant work if it's already  
being/been done.

>> Do test_input_file() and test_output_file() handle directory  
>> structures in an OS-safe way like catfile()?  For instance, I plan  
>> on adding test data to a new directory similar to Bio::Graphics (t/ 
>> data/eutil) to prevent cluttering of the t/data directory.  I  
>> could use '$obj->new(-file => test_input_file('/eutil/ 
>> input.xml'))' if the base directory is 't/data' but that may not  
>> be cross-platform compatible with win32 file systems, which may  
>> still expect something like 't\data\eutil\input.xml'.
>
> Its platform-independent, currently implemented using File::Spec.  
> So you'll say:
>
> $obj->new(-file => test_input_file('eutil', 'input.xml'));
>
> Its all documented in the POD of BioperlTest.

yay!

chris


From mmokrejs at ribosome.natur.cuni.cz  Mon Jun 25 16:06:24 2007
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Mon, 25 Jun 2007 18:06:24 +0200
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
 file?
In-Reply-To: <467254DD.3010505@mrc-lmb.cam.ac.uk>
References: <466938F6.7050903@ribosome.natur.cuni.cz>	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>	<467178AE.5040905@ribosome.natur.cuni.cz>	<46717990.6040509@ribosome.natur.cuni.cz>
	<467254DD.3010505@mrc-lmb.cam.ac.uk>
Message-ID: <467FE800.4010300@ribosome.natur.cuni.cz>


Dave Howorth wrote:
> Martin MOKREJŠ wrote:
>>>> Also, there is a *huge* amount of documentation and examples on
>>>> the BioPerl website.
>>>>
>>>> http://www.bioperl.org/wiki/HOWTOs
>>> You mean 
>>> http://www.bioperl.org/wiki/HOWTO:Graphics#Rendering_Features_from_a_GenBank_or_EMBL_File
>>>  ? ;-)
>> $ perl embl2picture.pl ~/99.gb | display - Error returned while
>> evaluating value of 'description' option for glyph
>> Bio::Graphics::Glyph::generic=HASH(0x8aa5790), feature
>> Bio::Location::Simple=HASH(0x893ebc4): Can't locate object method
>> "all_tags" via package "Bio::Location::Simple" at embl2picture.pl
>> line 141, <GEN0> line 125.
> 
> Hmm an error at line 141 of a 69 line script? Methinks you're not
> actually running the script that's presented on the wiki page you
> quoted. I cut-and-pasted the script and your file and it worked for me
> (at least, it produced an image, along with a bunch of OOPS lines)

Maybe you used the first version of the script?  There are two or more
scripts, I used the very last one.

M.


From cjfields at uiuc.edu  Mon Jun 25 16:48:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 25 Jun 2007 11:48:30 -0500
Subject: [Bioperl-l] How to draw a plasmid map from a genbank-formatted
	file?
In-Reply-To: <467FE7B0.3010904@ribosome.natur.cuni.cz>
References: <466938F6.7050903@ribosome.natur.cuni.cz>
	<56BAE06F-2FDF-4FA4-B6A0-96D89470AF4C@wustl.edu>
	<467178AE.5040905@ribosome.natur.cuni.cz>
	<46717990.6040509@ribosome.natur.cuni.cz>
	<CDE92699-3FEC-483D-B33F-18AA17301812@uiuc.edu>
	<46723F91.60501@ribosome.natur.cuni.cz>
	<A2212781-75F3-4BB7-967F-1668B682E84E@uiuc.edu>
	<467FE7B0.3010904@ribosome.natur.cuni.cz>
Message-ID: <B9DB370F-FB17-4DEF-9664-37489D84FC05@uiuc.edu>

Martin,

Keep bioperl-related discussion on the bioperl mail list.  The large  
majority of this isn't biopython-related, but maybe some devs there  
can add to this?

On Jun 25, 2007, at 11:05 AM, Martin MOKREJŠ wrote:

...

> Would you please tell me exactly what is wrong with the spacing?

Here's a section of the seq record attached to your previous email:

DEFINITION .
ACCESSION .
VERSION .
SOURCE .
  ORGANISM .

Normally there is a fixed column width for any data present in a  
field, so it would look more like this:

DEFINITION  PYR4 (DIHYDROOROTASE, PYRIMIDIN 4, dihydroorotase); dihydroorotase
            [Arabidopsis thaliana].
ACCESSION   NP_194024
VERSION     NP_194024.1  GI:15235865
DBSOURCE    REFSEQ: accession NM_118422.3
KEYWORDS    .
SOURCE      Arabidopsis thaliana (thale cress)
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons;
            rosids; eurosids II; Brassicales; Brassicaceae; Arabidopsis.

Here's the relevant bit in the latest release notes:

"The second part of each sequence entry record contains the information
appropriate to its keyword, in positions 13 to 80 for keywords and
positions 11 to 80 for the sequence."

The bioperl devs try to make our parsers as flexible as possible but  
others may not, so it's something in ApE that should probably be  
fixed.  And as mentioned to you several times in the past on the mail  
list and on bugzilla, don't expect sequence records which sway from  
the standard (in this case, the release notes) to parse correctly in  
all cases.  We can try supporting some that sway from that standard  
but only up to a point.  If it causes additional bugs, headaches, or  
degrades performance it won't be supported.
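
A quick way to spot this kind of spacing problem before handing a record to a parser is to check that the keyword data really starts at column 13. The following is only a sketch: the function name misaligned_header_lines is made up for the example, it assumes the keyword and its padding fill columns 1 to 12 as in the release-notes excerpt above, and it is only meant for the header block (the FEATURES table and the sequence use other column layouts).

```perl
use strict;
use warnings;

# Return the header lines whose keyword data does not start at column 13.
# Assumption: keyword plus padding occupy columns 1-12. Not valid for the
# FEATURES table or the sequence block, which use different columns.
sub misaligned_header_lines {
    my @lines = @_;
    my @bad;
    for my $line (@lines) {
        # Skip anything that does not start with a (possibly indented) keyword,
        # e.g. continuation lines, which begin with 12 spaces.
        next unless $line =~ /^ {0,3}[A-Z]+(?: |$)/;
        my $prefix = substr($line, 0, 12);
        my $data   = length($line) > 12 ? substr($line, 12) : '';
        # Columns 1-12 may hold only the keyword and its padding ...
        my $prefix_ok = $prefix =~ /^ {0,3}[A-Z]+ *$/;
        # ... and any data must start right at column 13.
        my $data_ok = $data eq '' || $data =~ /^\S/;
        push @bad, $line unless $prefix_ok && $data_ok;
    }
    return @bad;
}

my @record = (
    'DEFINITION .',
    'ACCESSION .',
    'VERSION     NP_194024.1  GI:15235865',
    '  ORGANISM  Arabidopsis thaliana',
);
print "Misaligned: $_\n" for misaligned_header_lines(@record);
```

A check like this only reports the problem; whether to repair the file or fix the program that wrote it is a separate decision.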

> ...
> Well, I just copy&pasted the script from the bioperl webpages, I think
> from a tutorial or FAQ, don't remember anymore.

Well, can't help you if you can't point out where the code originated  
from.  We would like to know so it can be corrected.

> ...
> Well, my search for such tools available on Unix to be used in a  
> script,
> non-interactively, completely failed. My last hope except getting  
> improved
> ApE is to use the GenomeDiagram under biopython, but so far my .gb  
> files
> cannot be parsed yet. :(
> Martin

As mentioned previously you will likely have to code for it yourself  
(perl or python) or help debug the relevant biopython code to get it  
working.  We can't/won't do this for you unless/until it's something  
we feel warrants implementation.  Judging by the bug list, we also  
haven't the time nor inclination to code for it.  Sorry but we have  
other priorities besides doing your work for you.

chris


From jesper at krogh.cc  Tue Jun 26 07:05:32 2007
From: jesper at krogh.cc (Jesper Krogh)
Date: Tue, 26 Jun 2007 09:05:32 +0200 (CEST)
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
Message-ID: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>

Hi List.

Trying to parse the embl database, the embl-parser fails on: AB019196
http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196


------------- EXCEPTION: Bio::Root::Exception -------------
MSG: AB019196 seems to have an invalid species classification.
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359
STACK: Bio::SeqIO::embl::_read_EMBL_Species
/usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
STACK: Bio::SeqIO::embl::next_seq
/usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
STACK: -e:1
-----------------------------------------------------------


It seems to be dissatisfied with this:
OS   Acetobacter aceti
OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;
OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.

Thanks.
-- 
Jesper Krogh


From cjfields at uiuc.edu  Tue Jun 26 13:13:50 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 08:13:50 -0500
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
Message-ID: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>

I can verify this using bioperl-live.  Can you file this as a bug?

http://bugzilla.open-bio.org/

chris

On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote:

> Hi List.
>
> Trying to parse the embl database, the embl-parser fails on: AB019196
> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196
>
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: AB019196 seems to have an invalid species classification.
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359
> STACK: Bio::SeqIO::embl::_read_EMBL_Species
> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
> STACK: Bio::SeqIO::embl::next_seq
> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
> STACK: -e:1
> -----------------------------------------------------------
>
>
> It seems to be dissatisfied with this:
> OS   Acetobacter aceti
> OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;
> OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.
>
> Thanks.
> -- 
> Jesper Krogh
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From suji_ramin at yahoo.com  Tue Jun 26 04:58:36 2007
From: suji_ramin at yahoo.com (SujiBala)
Date: Mon, 25 Jun 2007 21:58:36 -0700 (PDT)
Subject: [Bioperl-l] Error in constructing Phylogenetic tree using
	BioPerl
Message-ID: <571051.26423.qm@web51107.mail.re2.yahoo.com>

Hello,
  This is Sujatha from Singapore. I am trying to construct a phylogenetic tree using DNAStatistics and the Kimura method, but I am getting the following error message. It would be nice if you could help me resolve this problem as soon as possible.
   
  Error message
    Must supply  a valid Bio::Align::AlignI for the _align parameter  in the distance 
  My program
  use Bio::AlignIO;
use Bio::Align::DNAStatistics;
use Bio::Tree::DistanceFactory;
# for a dna alignment  can also use ProteinStatistics
@aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
$stats = Bio::Align::DNAStatistics->new;
$mat = $stats->distance( -align  => @aln,-method => 'Kimura');
$dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ');
$tree = $dfactory->make_tree($mat);
   
  I am using a clustalw-formatted fasta file with more than one sequence.
   

SujiBala



From bartels.stefan at mh-hannover.de  Tue Jun 26 09:26:03 2007
From: bartels.stefan at mh-hannover.de (don esteban)
Date: Tue, 26 Jun 2007 02:26:03 -0700 (PDT)
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To: <BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>
References: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
	<BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>
Message-ID: <11302459.post@talk.nabble.com>


Try using the proxy configuration in your script:

$ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080";


L Xu wrote:
> 
> I do have an internet connection but do not use a proxy server.
> I tested the network connection with the ping command (below). The NCBI
> website does not respond. Is there any special network setting needed for
> connecting to the NCBI website?
> Thank you so much.
> 
> C:\>ping www.yahoo.com
> 
> Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:
> 
> Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
> Reply from 69.147.114.210: bytes=32 time=360ms TTL=45
> 
> Ping statistics for 69.147.114.210:
>     Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
> Approximate round trip times in milli-seconds:
>     Minimum = 312ms, Maximum = 363ms, Average = 338ms
> 
> C:\>ping www.ncbi.nlm.nih.gov
> 
> Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:
> 
> Request timed out.
> Request timed out.
> Request timed out.
> Request timed out.
> 
> Ping statistics for 130.14.29.110:
>     Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
> 
> 
> 
> = = = Original message = = =
> 
> Judging by the output it looks like you have no network access or can't 
> connect to the server (what remoteblast needs). Make sure you don't need 
> proxy settings.
> 
> To preempt the next question, no, I'm not going to explain what a proxy 
> is. The RemoteBlast docs show how to set them, and Google is a wonderful 
> tool...
> 
> chris
> 
> On Jun 13, 2007, at 7:16 AM, L Xu wrote:
> 
> 
>    ...
> -------------------- WARNING ---------------------
> MSG: <HTML>
> <HEAD><TITLE>An Error Occurred</TITLE></HEAD>
> <BODY>
> <H1>An Error Occurred</H1>
> 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
> </BODY>
> </HTML>
> 
> ---------------------------------------------------
> ...
> 



From rahall2 at ualr.edu  Tue Jun 26 13:51:08 2007
From: rahall2 at ualr.edu (Roger Hall)
Date: Tue, 26 Jun 2007 08:51:08 -0500
Subject: [Bioperl-l] Tuesday: ill
Message-ID: <000001c7b7f9$0d029040$4601a8c0@LIBERAL2>

Well I guess I won't be in today after all.
 
Michael, Stephen, and Ames: please call me from the grad office at 10 on
my cell phone (744-8514). 
 
Phil: please go ahead and meet with Tim, and let me know what questions
remain afterwards.
 
Thanks!
 
Roger Hall
Technical Director
MidSouth Bioinformatics Center
University of Arkansas at Little Rock
(501) 569-8074
 

From cjfields at uiuc.edu  Tue Jun 26 14:02:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 09:02:29 -0500
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <4681185D.5030402@cam.ac.uk>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
	<246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
	<4681185D.5030402@cam.ac.uk>
Message-ID: <EC86EE5C-02DF-4E4F-AF25-6E53925CBC1F@uiuc.edu>

I'll try getting to that ASAP (as well as a few bugs).  The problem is  
we have to patch this in 2-3 places (SeqIO::swiss, SeqIO::embl) due  
to repeated code issues, something I'm trying to rectify with a new  
set of parsers.  Just haven't had the time to work on them lately  
unfortunately.

chris

On Jun 26, 2007, at 8:45 AM, Roy Chaudhuri wrote:

> Sorry, replied to this but forgot to cc the list.
>
> It looks like a related problem to bug 2288 that I filed about  
> Bio::SeqIO::swiss - the period after subgen. is what causes the  
> problems since it is interpreted as a separator between nodes. I  
> put a patch in for Bio::SeqIO::swiss that works for me, but I guess  
> it might have side effects.
>
> Roy.
> --
> Dr. Roy Chaudhuri
> Department of Veterinary Medicine
> University of Cambridge, U.K.
>
> Chris Fields wrote:
>> I can verify this using bioperl-live.  Can you file this as a bug?
>> http://bugzilla.open-bio.org/
>> chris
>> On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote:
>>> Hi List.
>>>
>>> Trying to parse the embl database, the embl-parser fails on:  
>>> AB019196
>>> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196
>>>
>>>
>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>> MSG: AB019196 seems to have an invalid species classification.
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/ 
>>> Root.pm:359
>>> STACK: Bio::SeqIO::embl::_read_EMBL_Species
>>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
>>> STACK: Bio::SeqIO::embl::next_seq
>>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
>>> STACK: -e:1
>>> -----------------------------------------------------------
>>>
>>>
>>> It seems to be dissatisfied with this:
>>> OS   Acetobacter aceti
>>> OC   Bacteria; Proteobacteria; Alphaproteobacteria;  
>>> Rhodospirillales;
>>> OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.
>>>
>>> Thanks.
>>> -- 
>>> Jesper Krogh
>>>
>>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From rrc22 at cam.ac.uk  Tue Jun 26 13:45:01 2007
From: rrc22 at cam.ac.uk (Roy Chaudhuri)
Date: Tue, 26 Jun 2007 14:45:01 +0100
Subject: [Bioperl-l] Possible bug in Bio::SeqIO::embl.pm
In-Reply-To: <246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
References: <10139.195.41.66.226.1182841532.squirrel@mail.jabbernet.dk>
	<246C49A7-D91E-4E74-921D-4B9C174AB41F@uiuc.edu>
Message-ID: <4681185D.5030402@cam.ac.uk>

Sorry, replied to this but forgot to cc the list.

It looks like a related problem to bug 2288 that I filed about 
Bio::SeqIO::swiss - the period after subgen. is what causes the problems 
since it is interpreted as a separator between nodes. I put a patch in 
for Bio::SeqIO::swiss that works for me, but I guess it might have side 
effects.
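
To make the problem concrete, here is a small sketch of splitting the joined OC text so that only the final period is treated as a terminator and nodes are separated on ';' alone. This is not the patch attached to bug 2288, and the function name split_classification is invented for the example; it just shows why "subgen." need not break the lineage.

```perl
use strict;
use warnings;

# Split the joined text of EMBL/Swiss-Prot OC lines into taxonomy nodes.
# Sketch only: this is not the patch attached to bug 2288.
sub split_classification {
    my ($oc_text) = @_;
    $oc_text =~ s/\s+/ /g;         # collapse line breaks and runs of spaces
    $oc_text =~ s/^\s+|\s+$//g;    # trim both ends
    $oc_text =~ s/\.$//;           # drop only the terminating period
    # Nodes are separated by ';', so "subgen." keeps its internal period.
    return split /\s*;\s*/, $oc_text;
}

my @nodes = split_classification(
    "Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;\n"
  . "Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter."
);
print scalar(@nodes), " nodes, last: $nodes[-1]\n";
# prints "7 nodes, last: Acetobacter subgen. Acetobacter"
```

Whether a node such as "Acetobacter subgen. Acetobacter" should then be kept in the lineage or folded into the genus is a separate question for the parser.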

Roy.
--
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.

Chris Fields wrote:
> I can verify this using bioperl-live.  Can you file this as a bug?
> 
> http://bugzilla.open-bio.org/
> 
> chris
> 
> On Jun 26, 2007, at 2:05 AM, Jesper Krogh wrote:
> 
>> Hi List.
>>
>> Trying to parse the embl database, the embl-parser fails on: AB019196
>> http://www.ebi.ac.uk/cgi-bin/expasyfetch?AB019196
>>
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: AB019196 seems to have an invalid species classification.
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:359
>> STACK: Bio::SeqIO::embl::_read_EMBL_Species
>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:1091
>> STACK: Bio::SeqIO::embl::next_seq
>> /usr/share/perl/5.8/Bio/SeqIO/embl.pm:322
>> STACK: -e:1
>> -----------------------------------------------------------
>>
>>
>> It seems to be dissatisfied with this:
>> OS   Acetobacter aceti
>> OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rhodospirillales;
>> OC   Acetobacteraceae; Acetobacter; Acetobacter subgen. Acetobacter.
>>
>> Thanks.
>> -- 
>> Jesper Krogh
>>
>>
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 


From bix at sendu.me.uk  Tue Jun 26 14:13:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 26 Jun 2007 15:13:48 +0100
Subject: [Bioperl-l] Error in constructing Phylogenetic tree
	using	BioPerl
In-Reply-To: <571051.26423.qm@web51107.mail.re2.yahoo.com>
References: <571051.26423.qm@web51107.mail.re2.yahoo.com>
Message-ID: <46811F1C.3020307@sendu.me.uk>

SujiBala wrote:
> Hi Hello
>   This is sujatha from singapore. I am trying to construct phylo tree using DNAStatistics and Kirma method. But I am getting the following error message. It would be nice if you could help me resolve this problem asap. 
>    
>   Error messasge
>     Must supply  a valid Bio::Align::AlignI for the _align parameter  in the distance 
>   My program
>   use Bio::AlignIO;
> use Bio::Align::DNAStatistics;
> use Bio::Tree::DistanceFactory;
> # for a dna alignment  can also use ProteinStatistics
> @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
> $stats = Bio::Align::DNAStatistics->new;
> $mat = $stats->distance( -align  => @aln,-method => 'Kimura');

Without looking at the docs for these modules, it is immediately obvious 
that Bio::AlignIO->new() is going to return an instance of Bio::AlignIO 
and not an array of alignments. It is also obvious that the -align => 
parameter for the distance() method can't take an array of anything (but 
probably an array ref?).

Check the documentation and make sure you know what objects you're 
generating and passing around.


From schlesi at ebi.ac.uk  Tue Jun 26 14:59:13 2007
From: schlesi at ebi.ac.uk (Felix Schlesinger)
Date: Tue, 26 Jun 2007 15:59:13 +0100
Subject: [Bioperl-l] PAML parser
Message-ID: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>

Hello,

I am trying to use the PAML result parser (BioPerl
Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15.
However on all outputs I have tested no result object is returned
(next_result is undef). This includes the HIV and Lysin datasets
included with PAML.
My code is:

my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir =>
"/.");
my $result = $codemlp->next_result;
foreach my $model ( $result->get_NSSite_results ) {
...

and the error is: Can't call method "get_NSSite_results" on an
undefined value ...

I can include the mlc file if needed. Is this supposed to work? Or do
I have to run paml from bioperl to parse the results?

Thanks
  Felix


From Xianjun.Dong at bccs.uib.no  Tue Jun 26 14:35:17 2007
From: Xianjun.Dong at bccs.uib.no (Xianjun Dong)
Date: Tue, 26 Jun 2007 16:35:17 +0200
Subject: [Bioperl-l] bug for PAML::Baseml
Message-ID: <46812425.8000509@ii.uib.no>

An HTML attachment was scrubbed...
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070626/cb3d8193/attachment-0004.html>

From Xianjun.Dong at bccs.uib.no  Tue Jun 26 15:40:47 2007
From: Xianjun.Dong at bccs.uib.no (Xianjun Dong)
Date: Tue, 26 Jun 2007 17:40:47 +0200
Subject: [Bioperl-l] bug for PAML::Baseml
In-Reply-To: <46812425.8000509@ii.uib.no>
References: <46812425.8000509@ii.uib.no>
Message-ID: <4681337F.1000902@ii.uib.no>

An HTML attachment was scrubbed...
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070626/604ce866/attachment-0004.html>

From hartzell at alerce.com  Tue Jun 26 18:12:04 2007
From: hartzell at alerce.com (George Hartzell)
Date: Tue, 26 Jun 2007 14:12:04 -0400
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
Message-ID: <18049.22260.967524.353173@almost.alerce.com>


There don't seem to be any .cvsignore files in the repository, or in
CVSROOT/cvsignore.

Am I missing something, or don't we use them?

g.


From cjfields at uiuc.edu  Tue Jun 26 19:54:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 26 Jun 2007 14:54:25 -0500
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18049.22260.967524.353173@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
Message-ID: <74515C87-5553-4AF0-9B83-26F3E71E15C8@uiuc.edu>

Not sure.  You may want to email support at open-bio.org; my guess is  
Chris D or Jason would have an answer.

chris

On Jun 26, 2007, at 1:12 PM, George Hartzell wrote:

>
> There don't seem to be any .cvsignore files in the repository, or in
> CVSROOT/cvsignore.
>
> Am I missing something, or don't we use them?
>
> g.
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Tue Jun 26 19:55:21 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 26 Jun 2007 16:55:21 -0300
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18049.22260.967524.353173@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
Message-ID: <E6FC4C83-7C71-4D3D-902A-3DE79E02A57C@gmx.net>

Maybe we've been using the default?

On Jun 26, 2007, at 3:12 PM, George Hartzell wrote:

>
> There don't seem to be any .cvsignore files in the repository, or in
> CVSROOT/cvsignore.
>
> Am I missing something, or don't we use them?
>
> g.
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Tue Jun 26 20:21:30 2007
From: hartzell at alerce.com (George Hartzell)
Date: Tue, 26 Jun 2007 16:21:30 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
Message-ID: <18049.30026.61328.134490@almost.alerce.com>

Chris Fields writes:
 > [...]
 > It looks like George Hartzell may be taking a crack at it, with  
 > Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
 > could have something testable relatively soon.  After that we'll need  
 > to work out a few other issues, basically what's on Hilmar's list.

There's a repository at file:///home/hartzell/bioperl with all of the
component projects in place.

If you have a dev.open-bio.org account and you're in the bioperl
group, you're good to get at it via:

  file:///home/hartzell/bioperl

or 

  svn+ssh://dev.open-bio.org/home/hartzell/bioperl

There are a couple of things to think about:

  - how are we going to provide access.  I *think* that I heard a
    decision to use http:// and https://.  Who gets to set that up?

  - what do we want to do about keywords.  The cvs2svn tool guesses
    and automatically sets the svn:keywords property to Author Date
    Revision and Id on many of the files in the tree.  If it looks
    like it got it right, we can stick with it.  Or, we can disable
    that conversion and I've cribbed a little script that'll grep out
    files using Id and set the svn:keywords property accordingly.

  - what do we want to do about svn:ignore?  I haven't seen any
    .cvsignore files.
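
For the keywords item, a script of the kind described could look roughly like the sketch below. It is not the script referred to above: the function name files_using_id_keyword is invented for the example, and it only prints the svn propset commands so they can be reviewed before anything is changed.

```perl
use strict;
use warnings;
use File::Find;

# Return the files under $root that contain a CVS-style Id keyword.
sub files_using_id_keyword {
    my ($root) = @_;
    my @hits;
    find({
        no_chdir => 1,
        wanted   => sub {
            my $path = $File::Find::name;
            # Skip version-control metadata directories entirely.
            if (-d $path && $path =~ m{(?:^|/)(?:CVS|\.svn)$}) {
                $File::Find::prune = 1;
                return;
            }
            return unless -f $path && -T $path;    # plain text files only
            open my $fh, '<', $path or return;
            while (my $line = <$fh>) {
                # Matches the bare keyword and the expanded "Id: ..." form.
                if ($line =~ /\$Id(?::[^\$]*)?\$/) {
                    push @hits, $path;
                    last;
                }
            }
            close $fh;
        },
    }, $root);
    return sort @hits;
}

# Print the commands instead of running them, so they can be checked first.
if (@ARGV) {
    print qq{svn propset svn:keywords Id "$_"\n}
        for files_using_id_keyword($ARGV[0]);
}
```

Run with the working copy root as the only argument; with no argument the script does nothing.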

Beyond that, how does the repo look?

How are we going to cut over?

Are we going to try to push svn commits to the read-mostly CVS repo,
or just keep it around for history's sake?  (I lean towards the latter.)

g.


From jason at bioperl.org  Tue Jun 26 23:22:20 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:22:20 -0300
Subject: [Bioperl-l] PAML parser
In-Reply-To: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>
References: <7317d50c0706260759r3bdda445lf0acf10a31ee5765@mail.gmail.com>
Message-ID: <D536496C-D716-42DF-B614-DD43C1B13A67@bioperl.org>

Can you make sure you have the latest and greatest version of these  
modules from the CVS repository?  We had to fix things to parse 3.15  
-- I can't tell if this is the problem or something else.
You can also add -verbose => 1 when you initialize the object and it  
may spit out more warnings about whether it is having problems.
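
As a sketch of both suggestions together, the snippet below turns on verbose output and checks the result before using it, so a failed parse gives a clear message instead of the "undefined value" error. The wrapper name first_paml_result is invented for the example, and the 'dir' argument from the original post is left out here; 'mlc' is the file name used in that post.

```perl
use strict;
use warnings;
use Bio::Tools::Phylo::PAML;

# Parse a codeml output file and return the first result object,
# dying with a readable message if the parser returns nothing.
sub first_paml_result {
    my ($file) = @_;
    my $parser = Bio::Tools::Phylo::PAML->new(
        -file    => $file,
        -verbose => 1,    # ask the parser to report more warnings
    );
    my $result = $parser->next_result;
    die "No result parsed from '$file'; check the PAML and BioPerl versions\n"
        unless defined $result;
    return $result;
}

# Only runs when a file name is given on the command line, e.g. 'mlc'.
if (@ARGV) {
    my $result = first_paml_result($ARGV[0]);
    my @models = $result->get_NSSite_results;
    printf "Parsed %d NSsites model result(s)\n", scalar @models;
}
```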


-jason

On Jun 26, 2007, at 11:59 AM, Felix Schlesinger wrote:

> Hello,
>
> I am trying to use the PAML result parser (BioPerl
> Bio::Tools::Phylo::PAML) on output files generated by PAML 3.15.
> However on all outputs I have tested no result object is returned
> (next_result is undef). This includes the HIV and Lysin datasets
> included with PAML.
> My code is:
>
> my $codemlp = Bio::Tools::Phylo::PAML->new(-file => "mlc",dir =>
> "/.");
> my $result = $codemlp->next_result;
> foreach my $model ( $result->get_NSSite_results ) {
> ...
>
> and the error is: Can't call method "get_NSSite_results" on an
> undefined value ...
>
> I can include the mlc file if needed. Is this supposed to work? Or do
> I have to run paml from bioperl to parse the results?
>
> Thanks
>   Felix

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Tue Jun 26 23:27:05 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:27:05 -0300
Subject: [Bioperl-l] Error in constructing Phylogenetic tree
	using	BioPerl
In-Reply-To: <46811F1C.3020307@sendu.me.uk>
References: <571051.26423.qm@web51107.mail.re2.yahoo.com>
	<46811F1C.3020307@sendu.me.uk>
Message-ID: <A99815DC-0FC2-4019-B0C4-CA8EA713FEB0@bioperl.org>


On Jun 26, 2007, at 11:13 AM, Sendu Bala wrote:

> SujiBala wrote:
>> Hi Hello
>>   This is sujatha from singapore. I am trying to construct phylo  
>> tree using DNAStatistics and Kirma method. But I am getting the  
>> following error message. It would be nice if you could help me  
>> resolve this problem asap.
>>
>>   Error messasge
>>     Must supply  a valid Bio::Align::AlignI for the _align  
>> parameter  in the distance
>>   My program
>>   use Bio::AlignIO;
>> use Bio::Align::DNAStatistics;
>> use Bio::Tree::DistanceFactory;
>> # for a dna alignment  can also use ProteinStatistics
>> @aln = Bio::AlignIO->new(-file => 'out4.fa', -format=>'clustalw');
>> $stats = Bio::Align::DNAStatistics->new;
>> $mat = $stats->distance( -align  => @aln,-method => 'Kimura');
>

Yep, you want to call next_aln on the Bio::AlignIO object.
I fixed the example code in the HOWTO so it should work properly now:
http://bioperl.org/wiki/HOWTO:Trees#Constructing_Trees
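
For reference, the quoted script with that change applied would look something like the following sketch. It is wrapped in a function whose name (nj_tree_from_alignment) is made up for the example, and it assumes the -format argument matches the actual format of the input file; 'out4.fa' and 'clustalw' are the values from the original post.

```perl
use strict;
use warnings;
use Bio::AlignIO;
use Bio::Align::DNAStatistics;
use Bio::Tree::DistanceFactory;

# Build a neighbor-joining tree from the first alignment in a file.
sub nj_tree_from_alignment {
    my ($file, $format) = @_;

    # new() returns a stream object, not the alignments themselves.
    my $alnio = Bio::AlignIO->new(-file => $file, -format => $format);

    # next_aln() returns one alignment object (or undef at end of input).
    my $aln = $alnio->next_aln
        or die "No alignment found in $file\n";

    my $stats = Bio::Align::DNAStatistics->new;
    # -align takes a single alignment object, passed as a scalar.
    my $mat = $stats->distance(-align => $aln, -method => 'Kimura');

    my $dfactory = Bio::Tree::DistanceFactory->new(-method => 'NJ');
    return $dfactory->make_tree($mat);
}

# Example: perl script.pl out4.fa clustalw
if (@ARGV == 2) {
    my $tree = nj_tree_from_alignment(@ARGV);
    print "Built a tree object of class ", ref($tree), "\n";
}
```

The key difference from the original is that the alignment is stored in a scalar ($aln) obtained from next_aln, instead of assigning the AlignIO object to an array.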

> Without looking at the docs for these modules, it is immediately  
> obvious
> that Bio::AlignIO->new() is going to return an instance of  
> Bio::AlignIO
> and not an array of alignments. It is also obvious that the -align =>
> parameter for the distance() method can't take an array of anything  
> (but
> probably an array ref?).
>
> Check the documentation and make sure you know what objects you're
> generating and passing around.

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Tue Jun 26 23:29:11 2007
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 26 Jun 2007 20:29:11 -0300
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <E6FC4C83-7C71-4D3D-902A-3DE79E02A57C@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
	<E6FC4C83-7C71-4D3D-902A-3DE79E02A57C@gmx.net>
Message-ID: <5A8FD8A3-9593-4925-AA74-D4B03CDC1C34@bioperl.org>

We don't have one. I have one on my local machine that basically  
defines *~ and .#* so I never had a problem.

Feel free to propose one if you think it is important; I never really  
thought it was.

On Jun 26, 2007, at 4:55 PM, Hilmar Lapp wrote:

> Maybe we've been using the default?
>
> On Jun 26, 2007, at 3:12 PM, George Hartzell wrote:
>
>>
>> There don't seem to be any .cvsignore files in the repository, or in
>> CVSROOT/cvsignore.
>>
>> Am I missing something, or don't we use them?
>>
>> g.
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From j_martin at lbl.gov  Wed Jun 27 01:01:29 2007
From: j_martin at lbl.gov (Joel Martin)
Date: Tue, 26 Jun 2007 18:01:29 -0700
Subject: [Bioperl-l] Example code in Bioperl Tutorial
In-Reply-To: <11302459.post@talk.nabble.com>
References: <BAY106-F20A751BADBCA1A976262B7B4180@phx.gbl>
	<BAY106-F32ED418DF7AF0CB4E47AD2B4180@phx.gbl>
	<11302459.post@talk.nabble.com>
Message-ID: <20070627010129.GA8628@eniac.jgi-psf.org>

Hello, 
  The tutorial code snippet is an endless loop, I think it's supposed
to remove the rid.  As the only print statement you added is after the
endless loop, you aren't seeing anything happen.   

Use the code from this instead,

perldoc Bio::Tools::Run::RemoteBlast

  The bptutorial.pl does have a note that it's not useful and to read the pod
for Bio::Tools::Run::RemoteBlast, it's in the next sentences after the code
snippet you used.  

  Though, as it's a tutorial example it might be nice to remove the while
loop .. or at least add the sleep(5) part.
http://www.bioperl.org/wiki/Bptutorial.pl#Running_BLAST_.28using_RemoteBlast.pm.29
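
For what it's worth, here is a rough offline sketch of the polling
pattern I mean. The three method names (each_rid, retrieve_blast,
remove_rid) are modelled on the Bio::Tools::Run::RemoteBlast pod, but
FakeFactory is a made-up stand-in so the loop can run without a network;
check the pod for the real thing.

```perl
use strict;
use warnings;

# Stand-in for Bio::Tools::Run::RemoteBlast so the loop runs offline.
# Only the three method names mirror the real module; the bodies are made up.
package FakeFactory;

sub new {
    my ( $class, %polls_needed ) = @_;
    return bless { pending => {%polls_needed} }, $class;
}
sub each_rid   { my $self = shift; return sort keys %{ $self->{pending} } }
sub remove_rid { my ( $self, $rid ) = @_; delete $self->{pending}{$rid}; return }

sub retrieve_blast {
    my ( $self, $rid ) = @_;
    # 0 means "not ready yet"; a reference means the report has arrived
    return 0 if --$self->{pending}{$rid} > 0;
    return { rid => $rid };
}

package main;

sub collect_reports {
    my ( $factory, $wait ) = @_;
    my @done;
    while ( my @rids = $factory->each_rid ) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( !ref $rc ) {
                # a negative code signals an error: drop the rid or loop forever
                $factory->remove_rid($rid) if $rc < 0;
                sleep $wait;    # pause between polls instead of hammering the server
            }
            else {
                push @done, $rid;
                $factory->remove_rid($rid);    # this is what lets the while loop end
            }
        }
    }
    return @done;
}

my @finished = collect_reports( FakeFactory->new( RID_A => 2, RID_B => 1 ), 0 );
print "finished: @finished\n";
```

Without the remove_rid calls, each_rid never returns an empty list and
the while loop spins forever, which is what the tutorial snippet does.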

  Aside from that, you may have network issues but www.ncbi.nlm.nih.gov
doesn't respond to ping as far as I can tell. 

Joel


On Tue, Jun 26, 2007 at 02:26:03AM -0700, don esteban wrote:
> 
> Try using the Proxyconfiguration in your script:
> 
> $ENV{"HTTP_PROXY"}="http://proxy.somewhere.org:8080";
> 
> 
> 
> 
> L Xu wrote:
> > 
> > I do have an internet connection but do not use a proxy server.
> > I tested the network connection with the ping command (below). The ncbi
> > website 
> > does not respond. Is there any special network setting needed for 
> > connecting to the ncbi website?
> > Thank you so much.
> > 
> > C:\>ping www.yahoo.com
> > 
> > Pinging www.yahoo-ht3.akadns.net [69.147.114.210] with 32 bytes of data:
> > 
> > Reply from 69.147.114.210: bytes=32 time=363ms TTL=45
> > Reply from 69.147.114.210: bytes=32 time=319ms TTL=45
> > Reply from 69.147.114.210: bytes=32 time=312ms TTL=45
> > Reply from 69.147.114.210: bytes=32 time=360ms TTL=45
> > 
> > Ping statistics for 69.147.114.210:
> >     Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
> > Approximate round trip times in milli-seconds:
> >     Minimum = 312ms, Maximum = 363ms, Average = 338ms
> > 
> > C:\>ping www.ncbi.nlm.nih.gov
> > 
> > Pinging www.ncbi.nlm.nih.gov [130.14.29.110] with 32 bytes of data:
> > 
> > Request timed out.
> > Request timed out.
> > Request timed out.
> > Request timed out.
> > 
> > Ping statistics for 130.14.29.110:
> >     Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
> > 
> > 
> > 
> > = = = Original message = = =
> > 
> > Judging by the output it looks like you have no network access or can't 
> > connect to the server (what remoteblast needs). Make sure you don't need 
> > proxy settings.
> > 
> > To preempt the next question, no, I'm not going to explain what a proxy 
> > is. The RemoteBlast docs show how to set them, and Google is a wonderful 
> > tool...
> > 
> > chris
> > 
> > On Jun 13, 2007, at 7:16 AM, L Xu wrote:
> > 
> > 
> >    ...
> > -------------------- WARNING ---------------------
> > MSG: <HTML>
> > <HEAD><TITLE>An Error Occurred</TITLE></HEAD>
> > <BODY>
> > <H1>An Error Occurred</H1>
> > 500 Can't connect to www.ncbi.nlm.nih.gov:80 (connect: Unknown error)
> > </BODY>
> > </HTML>
> > 
> > ---------------------------------------------------
> > ...
> > 


From melvinp at pacific.net.sg  Wed Jun 27 05:25:08 2007
From: melvinp at pacific.net.sg (Melvin P)
Date: Wed, 27 Jun 2007 13:25:08 +0800
Subject: [Bioperl-l] finding statistics on AA
Message-ID: <4681F4B4.8010609@pacific.net.sg>

Hi, I am new to BioPerl. I am trying to find out if there is any class 
that I can use for occupancy/occurrence counts, pseudocounts, 
observed frequencies etc. given a few amino acid sequences. For example, 
what is the observed frequency of residue i at position p? My objective 
is to analyze the information content. Thanks.
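
To make the quantities concrete, here is a rough plain-Perl sketch of
what I mean (it does not use any BioPerl class). It assumes the
sequences are already aligned and of equal length, a uniform pseudocount
over the 20 amino acids, and information content defined as log2(20)
minus the Shannon entropy of the column; all sub names are made up.

```perl
use strict;
use warnings;

# Per-column residue counts for pre-aligned, equal-length sequences.
# Gap characters ('-') are skipped, so they do not add to the occupancy.
sub column_counts {
    my @seqs = @_;
    my @counts;
    for my $seq (@seqs) {
        my @res = split //, uc $seq;
        for my $p ( 0 .. $#res ) {
            $counts[$p] ||= {};
            next if $res[$p] eq '-';
            $counts[$p]{ $res[$p] }++;
        }
    }
    return @counts;
}

# Occupancy: number of non-gap residues observed in one column.
sub occupancy {
    my ($col) = @_;
    my $n = 0;
    $n += $_ for values %$col;
    return $n;
}

# Observed frequency of $residue, with $pseudo added to each of the
# 20 amino acid counts: (n_i + b) / (N + 20 * b).
sub frequency {
    my ( $col, $residue, $pseudo ) = @_;
    my $denom = occupancy($col) + 20 * $pseudo;
    return 0 unless $denom;
    return ( ( $col->{$residue} || 0 ) + $pseudo ) / $denom;
}

# Information content in bits: log2(20) minus the Shannon entropy of the
# raw (no pseudocount) frequencies. An all-gap column is reported as 0.
sub information {
    my ($col) = @_;
    my $n = occupancy($col) or return 0;
    my $entropy = 0;
    for my $count ( values %$col ) {
        my $f = $count / $n;
        $entropy -= $f * log($f) / log(2);
    }
    return log(20) / log(2) - $entropy;
}

my @counts = column_counts(qw(MKV MRV MKL MKV));
printf "pos %d: f(K)=%.3f  info=%.3f bits\n",
    $_ + 1, frequency( $counts[$_], 'K', 0 ), information( $counts[$_] )
    for 0 .. $#counts;
```

Here position 2 has K three times and R once, so the observed frequency
of K is 3/4 without a pseudocount and (3 + 1) / (4 + 20) with a
pseudocount of 1.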


From bix at sendu.me.uk  Wed Jun 27 10:23:58 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 11:23:58 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <467FBDD3.8050009@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
Message-ID: <46823ABE.2080300@sendu.me.uk>

Sendu Bala wrote:
> Sendu Bala wrote:
>> In considering updating all the test scripts to [... use] 
>> t/lib/BioperlTest.pm
> 
> I'm now in the process of converting all test scripts.

And I've now completed that job (for bioperl-live at least), except for 
t/EUtilities.t since I know Chris is working on it.


In addition to converting to Test::More where necessary, I've also made 
all pseudo-TODO blocks real ones. Previously I had advised to use SKIP 
blocks instead since TODO blocks need a Test::Harness upgrade. However I 
think in the next release we ought to make such upgrading compulsory 
(which should be automatic when combined with compulsory usage of 
Module::Build and Test::More in turn: users simply have to update CPAN).
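
For anyone who hasn't used them, a minimal sketch of the difference
between the two block types, using only Test::More. The test names and
the HYPOTHETICAL_NETWORK_TESTS environment variable are made up for
illustration and are not what BioperlTest uses.

```perl
use strict;
use warnings;
use Test::More tests => 3;

ok( 1 + 1 == 2, 'implemented behaviour is tested normally' );

TODO: {
    local $TODO = 'feature not implemented yet';

    # This test runs and fails, but is reported as an expected failure,
    # so it does not make the script fail.
    ok( 0, 'unimplemented feature' );
}

SKIP: {
    # The test below is not run at all unless the (made-up) variable is set.
    skip 'no network access', 1 unless $ENV{HYPOTHETICAL_NETWORK_TESTS};
    ok( 1, 'network-dependent test' );
}
```

The practical difference is that a TODO test is still executed, so the
harness can tell you when it unexpectedly starts passing, whereas a
skipped test is never run.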


The conversion to BioperlTest directly led to the discovery and fixing 
of 6 minor bugs, so was certainly not without merit.


No user or developer needs to have BIOPERLDEBUG permanently set to true 
anymore. To run all tests you just have to answer yes to the BioDBGFF 
and networking questions of 'perl Build.PL'. With './Build test' you 
then get clean, easy-to-read output where it is obvious to see that we 
currently have these issues:

t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in 
another thread.

t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, 
t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and 
t/Annotation.t all have TODO tests. If you know about those modules, now 
would be a great time to implement those TODOs!

Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are 
deprecated' warnings.


To debug a particular test you could say:
BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t


I've updated the HOWTO for writing test scripts:
http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests


From cjfields at uiuc.edu  Wed Jun 27 11:55:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 06:55:47 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <46823ABE.2080300@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk>
Message-ID: <DC0F57B9-D733-4C89-9B7A-65E1ADFCFDD2@uiuc.edu>


On Jun 27, 2007, at 5:23 AM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Sendu Bala wrote:
>>> In considering updating all the test scripts to [... use]
>>> t/lib/BioperlTest.pm
>>
>> I'm now in the process of converting all test scripts.
>
> And I've now completed that job (for bioperl-live at least), except  
> for
> t/EUtilities.t since I know Chris is working on it.

The network tests will be much shorter; the bulk will be transferred  
to a new suite for the backend Bio::Tools::EUtilities parser (which  
will test static files in t/data/eutils, so no dynamic changes).

> In addition to converting to Test::More where necessary, I've also  
> made
> all pseudo-TODO blocks real ones. Previously I had advised to use SKIP
> blocks instead since TODO blocks need a Test::Harness upgrade.  
> However I
> think in the next release we ought to make such upgrading compulsory
> (which should be automatic when combined with compulsory usage of
> Module::Build and Test::More in turn: users simply have to update  
> CPAN).

Sounds good to me, but there may be some grumblings out there.

Having specific TODOs is nice b/c we can test them w/o fails.  Handy.

> The conversion to BioperlTest directly led to the discovery and fixing
> of 6 minor bugs, so was certainly not without merit.
>
>
> No user or developer needs to have BIOPERLDEBUG permanently set to  
> true
> anymore. To run all tests you just have to answer yes to the BioDBGFF
> and networking questions of 'perl Build.PL'. With './Build test' you
> then get clean, easy-to-read output where it is obvious to see that we
> currently have these issues:
>
> t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in
> another thread.
>
> t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t,
> t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and
> t/Annotation.t all have TODO tests. If you know about those  
> modules, now
> would be a great time to implement those TODOs!

The RNA_SearchIO.t is from ERPIN output; there's no easy way to  
generate it beyond having the user supply the info (or having the  
program author change the output).

Will have to look at the others to see what's involved; maybe  
something for the priority list?

> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are
> deprecated' warnings.

I ran into this with XML::Simple data structures recently; there was  
an easy way around it using XML::Simple's ForceArray option.  It has to  
do with attempting to assign data to/from a hash in a specific way  
involving array references (though I can't remember exactly how; I've  
slept since then).

> To debug a particular test you could say:
> BIOPERLDEBUG=1 ./Build test --verbose --test_files t/Sopma.t
>
>
> I've updated the HOWTO for writing test scripts:
> http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests

Good work!

chris


From schlesi at ebi.ac.uk  Wed Jun 27 11:57:27 2007
From: schlesi at ebi.ac.uk (Felix Schlesinger)
Date: Wed, 27 Jun 2007 12:57:27 +0100
Subject: [Bioperl-l] Selecting columns from alignment
Message-ID: <7317d50c0706270457i1c3d92a8hb124fa663f51b837@mail.gmail.com>

Hi,

is there an elegant way to select columns from an alignment object
fulfilling a certain property (for example less than x gaps)?
Everything I can see from Align::AlignI seems to involve looking at
the individual sequences, creating lots of slices and appending them.
Is there a better way in bioperl or, failing that, does anyone know of a
software package with similar functionality (t-coffee has lots of
filters for alignments, but nothing to select columns besides by
position it seems). Ideally this would also return a mapping from old
to new positions in one of the sequences of course.
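
To show the kind of result I am after, here is a rough plain-Perl sketch
that works on the aligned strings directly (so not on an alignment
object). It assumes equal-length rows with '-' or '.' as gap characters;
the sub names are made up.

```perl
use strict;
use warnings;

# Keep the columns that have fewer than $max_gaps gap characters.
# Returns the filtered rows and the 0-based indices of the kept columns.
sub select_columns {
    my ( $rows, $max_gaps ) = @_;
    my $width = length $rows->[0];
    my @keep;
    for my $col ( 0 .. $width - 1 ) {
        # grep in scalar context counts the rows with a gap in this column
        my $gaps = grep { substr( $_, $col, 1 ) =~ /[-.]/ } @$rows;
        push @keep, $col if $gaps < $max_gaps;
    }
    my @filtered = map {
        my $row = $_;
        join '', map { substr( $row, $_, 1 ) } @keep
    } @$rows;
    return ( \@filtered, \@keep );
}

# Map 1-based residue numbers of one row (gaps not counted) from the
# original alignment to the filtered one. Dropped residues get no entry.
sub residue_map {
    my ( $row, $keep ) = @_;
    my %kept = map { $_ => 1 } @$keep;
    my ( $old, $new, %map ) = ( 0, 0 );
    for my $col ( 0 .. length($row) - 1 ) {
        next if substr( $row, $col, 1 ) =~ /[-.]/;
        $old++;
        $map{$old} = ++$new if $kept{$col};
    }
    return \%map;
}

my @rows = qw(AC-GT A--GT ACTG-);
my ( $filtered, $keep ) = select_columns( \@rows, 2 );
my $map = residue_map( $rows[2], $keep );
print "$_\n" for @$filtered;
print "old $_ -> new $map->{$_}\n" for sort { $a <=> $b } keys %$map;
```

In this toy case the third column has two gaps and is dropped, so the
third residue of the last row has no new position and its fourth residue
becomes the third.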

Thanks
  Felix


From cjfields at uiuc.edu  Wed Jun 27 14:36:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 09:36:41 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>


On Jun 26, 2007, at 3:21 PM, George Hartzell wrote:

> ...
> If you have a dev.open-bio.org account and you're in the bioperl
> group, you're good to get at it via:
>
>   file:///home/hartzell/bioperl
>
> or
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl

I managed to get it working using file://.  Haven't tried svn+ssh yet  
but I've had persistent problems getting ssh to work properly on my  
macbook; not sure why yet but I haven't had time to play around with it.

> There are a couple of things to think about:
>
>   - how are we going to provide access.  I *think* that I heard a
>     decision to use http:// and https://.  Who gets to set that up?

That hasn't been decided yet and will be up to a consensus of the  
core devs, but I think the odds are in favor of allowing https:// but  
against allowing http://.

As for setup that could be anyone with admin privs, though it may be  
best left up to Chris D, Jason, or Mauricio.

>   - what do we want to do about keywords.  The cvs2svn tool guesses
>     and automatically sets the svn:keywords property to Author Date
>     Revision and Id on many of the files in the tree.  If it looks
>     like it got it right, we can stick with it.  Or, we can disable
>     that conversion and I've cribbed a little script that'll grep out
>     files using Id and set the svn:keywords property accordingly.

Probably again a consensus issue, but you can choose one route.  My  
inclination is the former if it's easier.

>   - what do we want to do about svn:ignore?  I haven't seen any
>     .cvsignore files.

Not sure.  I've never used one personally, but (as Jason suggests) if  
you have ideas for one you can propose them, or we can suggest devs  
set up svn:ignore locally.

> Beyond that, how does the repo look?

Seems fine, though a simple 'svn co file:///home/hartzell/bioperl'  
checkout gets everything (all distros, branches, etc).  We need to  
make sure everyone uses
'svn co file:///home/hartzell/bioperl/bioperl-live/trunk /live'
or similar if they just want the latest core/db/etc.

We'll also need to start a svn wiki page to show how to get relevant  
distros (similar in style probably to the cvs page, with dev  
information, how to set up ssh keys, https stuff, etc).

> How are we going to cut over?
>
> Are we going to try to push svn commits to the read-mostly CVS repo,
> or just keep it around for history's sake (I lean towards the latter).

I think a clean cut-over.  Everyone would be warned to hold commits  
for a day (lest they be lost), then probably do something in this order:

- switch cvs to read-only except for svn commits
- run a clean cvs2svn
- set up svn as read/write
- set up test commits to cvs via svn
- disable cvs commit messages to bioperl-guts, enable svn commit  
messages in its place.
- push svn commits over to read-only cvs

cvs >>must<< be read-only after that point (no cvs->svn commits),  
with write access only available through svn.  If at some future  
point there is no reason to keep it around or that it is more trouble  
than it's worth, we can make a decision then on cvs's fate.

> g.

chris


From rvos at interchange.ubc.ca  Wed Jun 27 14:23:25 2007
From: rvos at interchange.ubc.ca (rvos)
Date: Wed, 27 Jun 2007 07:23:25 -0700 (PDT)
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
Message-ID: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>

 
> Are we going to try to push svn commits to the read-mostly CVS repo,
> or just keep it around for history's sake (I lean towards the latter).

I'm a little confused - surely once the svn is up and running we'll want *no more* cvs commits? Parallel repositories that each accumulate stuff will be a nightmare. I'm probably just not getting your point.

Rutger


From cjfields at uiuc.edu  Wed Jun 27 15:18:03 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 10:18:03 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>


On Jun 27, 2007, at 9:23 AM, rvos wrote:

>
>> Are we going to try to push svn commits to the read-mostly CVS repo,
>> or just keep it around for history's sake (I lean towards the  
>> latter).
>
> I'm a little confused - surely once the svn is up and running we'll  
> want *no more* cvs commits? Parallel repositories that each  
> accumulate stuff will be a nightmare. I'm probably just not getting  
> your point.
>
> Rutger

Most projects make a clean break with cvs (no more commits) for the  
reasons you point out.  Not sure how the other core devs feel about  
that but I could go for that; it would def. prevent headaches.  We  
could keep cvs for the time being as read-only, with no svn->cvs  
syncing.

There are a few projects which have (as a phase-out plan) old read-only  
cvs repositories available, with an automatic svn->cvs commit  
following every new svn commit.  Not sure how that works, esp. for  
branching/merging and so on which I could see potentially getting hairy.

chris


From cjfields at uiuc.edu  Wed Jun 27 16:05:49 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 11:05:49 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <5EA56270-3427-4995-B3C1-2789229AACF1@uiuc.edu>


On Jun 26, 2007, at 3:21 PM, George Hartzell wrote:

> ...If you have a dev.open-bio.org account and you're in the bioperl
> group, you're good to get at it via:
>
>   file:///home/hartzell/bioperl
>
> or
>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl

Did manage to get svn+ssh working (with some password harassment);  
core tests passed enough that I think everything's okay.  If ssh keys  
are set up correctly (mine aren't) it should work fine.

chris


From dmessina at wustl.edu  Wed Jun 27 16:27:32 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 27 Jun 2007 11:27:32 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
Message-ID: <BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>

> [Chris]
>
> I managed to get it working using file://.  Haven't tried svn+ssh yet
> but I've had persistent problems getting ssh to work properly on my
> macbook; not sure why yet but I haven't had time to play around  
> with it.

I just did a checkout and a test commit, both via svn+ssh -- works  
great for me.


>> [George]
>>
>>   - what do we want to do about keywords.  The cvs2svn tool guesses
>>     and automatically sets the svn:keywords property to Author Date
>>     Revision and Id on many of the files in the tree.  If it looks
>>     like it got it right, we can stick with it.  Or, we can disable
>>     that conversion and I've cribbed a little script that'll grep out
>>     files using Id and set the svn:keywords property accordingly.


I would think we would want "Author Date Id Rev URL" set on  
everything, no? So either cvs2svn or your tool (whichever you think  
is better), followed by

	svn propset svn:keywords "Author Date Id Rev URL" *

from the root of a working copy would take care of all of the  
existing files in the repository, I think.

George knows more about this than I do, but I think you can set up a  
global config file with

	[miscellany]
	enable-auto-props = yes
	[auto-props]
	* = svn:keywords="Author Date Id Rev URL"

to ensure it gets set on any future additions to the repository.


>>   - what do we want to do about svn:ignore?  I haven't seen any
>>     .cvsignore files.
>
> Not sure.  I've never used one personally, but (as Jason suggests) if
> you have ideas for one you can propose them, or we can suggest devs
> set up svn:ignore locally.

I use the default global-ignores

	global-ignores = *.o *.lo *.la #*# .*.rej *.rej .*~ *~ .#* .DS_Store

(again, in my system-wide config file), but I'm not tied to that. I  
do think we should have one, though; individuals can easily override  
any settings in the system-wide config with their own ~/.subversion/ 
config.


>> Beyond that, how does the repo look?

Looks great, George! Thanks for doing this.


Dave


From hartzell at alerce.com  Wed Jun 27 17:00:53 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 13:00:53 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
Message-ID: <18050.38853.526224.791878@almost.alerce.com>

rvos writes:
 >  
 > > Are we going to try to push svn commits to the read-mostly CVS repo,
 > > or just keep it around for history's sake (I lean towards the latter).
 > 
 > I'm a little confused - surely once the svn is up and running we'll
 > want *no more* cvs commits? Parallel repositories that each
 > accumulate stuff will be a nightmare. I'm probably just not getting
 > your point. 

There had been some talk of keeping a CVS repository around as a
read-only mirror of the svn repo, presumably for people whose habits
or setup won't let them use svn.

In theory, each commit to the svn repo can be automagically pushed
down into CVS w/out user intervention, google will tell you how but
I've never run anything that way.

g.


From dmessina at wustl.edu  Wed Jun 27 17:27:01 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 27 Jun 2007 12:27:01 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
Message-ID: <99969FC2-479E-408C-AADB-7664EBE937CF@wustl.edu>

> [Chris]
> We'll also need to start a svn wiki page to show how to get relevant
> distros (similar in style probably to the cvs page, with dev
> information, how to set up ssh keys, https stuff, etc).

I cloned the CVS page and have started adapting it for Subversion:

	http://www.bioperl.org/wiki/Using_Subversion

I'll do some more on it later today, but if anyone wants to fiddle  
with it in the interim, please do.


Dave


From n.haigh at sheffield.ac.uk  Wed Jun 27 18:44:16 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 19:44:16 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <46823ABE.2080300@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk>
Message-ID: <4682B000.2050707@sheffield.ac.uk>

Sendu Bala wrote:
> Sendu Bala wrote:
>> Sendu Bala wrote:
>>> In considering updating all the test scripts to [... use] 
>>> t/lib/BioperlTest.pm
>> I'm now in the process of converting all test scripts.
> 
> And I've now completed that job (for bioperl-live at least), except for 
> t/EUtilities.t since I know Chris is working on it.
> 
> 
> In addition to converting to Test::More where necessary, I've also made 
> all pseudo-TODO blocks real ones. Previously I had advised to use SKIP 
> blocks instead since TODO blocks need a Test::Harness upgrade. However I 
> think in the next release we ought to make such upgrading compulsory 
> (which should be automatic when combined with compulsory usage of 
> Module::Build and Test::More in turn: users simply have to update CPAN).
> 
> 
> The conversion to BioperlTest directly led to the discovery and fixing 
> of 6 minor bugs, so was certainly not without merit.
> 
> 
> No user or developer needs to have BIOPERLDEBUG permanently set to true 
> anymore. To run all tests you just have to answer yes to the BioDBGFF 
> and networking questions of 'perl Build.PL'. With './Build test' you 
> then get clean, easy-to-read output where it is obvious to see that we 
> currently have these issues:
> 
> t/Sopma.t and t/BioGraphics.t still have fails that I mentioned in 
> another thread.
> 
> t/protgraph.t, t/blast_pull.t, t/SearchIO.t, t/RestrictionIO.t, 
> t/RNA_SearchIO.t, t/PopGen.t, t/Genewise.t, t/Assembly.t and 
> t/Annotation.t all have TODO tests. If you know about those modules, now 
> would be a great time to implement those TODOs!
> 
> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are 
> deprecated' warnings.

Ah, that reminds me!

I recently tried to do an install of the cvs head (a week or two ago) on
a clean installation of Debian 4.0 (etch). During the installation of
dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
Bioperl. I seem to remember this circular dependency cropping up before
- am I correct - and can you remind me how this was "fixed"?

Cheers
Nath


From bix at sendu.me.uk  Wed Jun 27 18:52:01 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 19:52:01 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B000.2050707@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
Message-ID: <4682B1D1.3080206@sendu.me.uk>

Nathan S. Haigh wrote:
> I recently tried to do an install of the cvs head (a week or two ago) on
> a clean installation of Debian 4.0 (etch). During the installation of
> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
> Bioperl. I seem to remember this circular dependency cropping up before
> - am I correct - and can you remind me how this was "fixed"?

Yes, it always happens. It was 'fixed' by being completely ignored by 
me. Installation is guaranteed to fail, but if you really want it, 
trying to install again after you already have Bioperl installed will 
result in success.

Clearly something nicer could be done. Suggestions on a postcard...


From cjfields at uiuc.edu  Wed Jun 27 19:01:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 14:01:01 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B000.2050707@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
Message-ID: <A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>


On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote:

> Sendu Bala wrote:
>> ...
>> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are
>> deprecated' warnings.
>
> Ah, that reminds me!
>
> I recently tried to do an install of the cvs head (a week or two  
> ago) on
> a clean installation of Debian 4.0 (etch). During the installation of
> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
> Bioperl. I seem to remember this circular dependency cropping up  
> before
> - am I correct - and can you remind me how this was "fixed"?
>
> Cheers
> Nath

Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part  
of Bioperl (and he could become a dev).  That would solve it.

chris


From n.haigh at sheffield.ac.uk  Wed Jun 27 19:16:40 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 20:16:40 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
Message-ID: <4682B798.1010409@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> 
> On Jun 27, 2007, at 1:44 PM, Nathan S. Haigh wrote:
> 
>> Sendu Bala wrote:
>>> ...
>>> Bio::SeqIO::entrezgene is still generating 'Pseudo-hashes are
>>> deprecated' warnings.
>>
>> Ah, that reminds me!
>>
>> I recently tried to do an install of the cvs head (a week or two ago) on
>> a clean installation of Debian 4.0 (etch). During the installation of
>> dependencies, Bio::ASN1::EntrezGene threw an error as it depends on
>> Bioperl. I seem to remember this circular dependency cropping up before
>> - am I correct - and can you remind me how this was "fixed"?
>>
>> Cheers
>> Nath
> 
> Wonder if Mingyi Liu would allow Bio::ASN1::EntrezGene to become part of
> Bioperl (and he could become a dev).  That would solve it.
> 
> chris

Just to put the feelers out to see what people think.

It seems (to me at least) that Bioperl modules could/should? be released
as individual modules and that "bioperl" would really constitute a
"bundle" of all these modules - in terms of CPAN anyway. Am I correct in
this thinking? The Bio::ASN1::EntrezGene could simply require a
particular module rather than the whole of bioperl - might get out of
the circular dependency theoretically!?

I'm not suggesting moving in this direction, but just wondered what
others thought about this concept?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgreYczuW2jkwy2gRAi5IAJ9/Alq1fktEmAF16DlKcBVcy7d+jQCeIj+X
tOFQUQ7cGJLUITEDw1+QLxc=
=Yc+g
-----END PGP SIGNATURE-----


From cjfields at uiuc.edu  Wed Jun 27 19:31:44 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 14:31:44 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B798.1010409@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
	<4682B798.1010409@sheffield.ac.uk>
Message-ID: <33C76559-4771-4FDC-9EEA-1645BC3C576C@uiuc.edu>


On Jun 27, 2007, at 2:16 PM, Nathan S. Haigh wrote:

> ...
>
> Just to put the feelers out to see what people think.
>
> It seems (to me at least) that Bioperl modules could/should? be  
> released
> as individual modules and that "bioperl" would really constitute a
> "bundle" of all these modules - in terms of CPAN anyway. Am I  
> correct in
> this thinking? The Bio::ASN1::EntrezGene could simply require a
> particular module rather than the whole of bioperl - might get out of
> the circular dependency theoretically!?
>
> I'm not suggesting moving in this direction, but just wondered what
> others thought about this concept?
>
> Nath

Well, Steve suggested splitting some of core into distinct groups,  
which I tend to agree with in some respects (speed up releases for  
those modules, such as SearchIO, DB, Graphics).  The problem we have  
yet to solve is what we consider 'core'.  Is it Bio::Seq and  
related?  Should it include Bio::DB*?  Should it just be Bio::*  
modules with no or very few external dependencies?  And so on...  It's
probably not a decision we want to make immediately (not until after the
svn migration, finished tests, maybe a release or two, and a beer)...

The Bioperl module dependency that Bio::ASN1::EntrezGene has is  
Bio::Index::AbstractSeq.  You could try a test build of  
Bio::ASN1::EntrezGene to see what happens.

chris


From hlapp at gmx.net  Wed Jun 27 19:49:15 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 16:49:15 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
Message-ID: <E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>


On Jun 27, 2007, at 1:27 PM, David Messina wrote:

> I would think we would want "Author Date Id Rev URL" set on
> everything, no?. So either cvs2svn or your tool (whichever you think
> is better), followed by
>
> 	svn propset svn:keywords "Author Date Id Rev URL" *

Shouldn't this be done recursively?
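
For reference, `svn propset` takes `-R` (`--recursive`), which avoids relying on the shell's `*` (the glob only matches top-level entries). A minimal sketch, written in Python purely for illustration: the helper name `propset_keywords_argv` and the default target `.` are assumptions, and the script only prints the command so it can be reviewed before anyone runs it.

```python
import shlex

KEYWORDS = "Author Date Id Rev URL"  # keyword list quoted from David's message


def propset_keywords_argv(target=".", recursive=True):
    """Build the argv for setting svn:keywords on `target` (hypothetical helper)."""
    argv = ["svn", "propset"]
    if recursive:
        argv.append("-R")  # descend into subdirectories instead of relying on '*'
    argv += ["svn:keywords", KEYWORDS, target]
    return argv


if __name__ == "__main__":
    # Print the command rather than executing it, so it can be checked first.
    print(" ".join(shlex.quote(arg) for arg in propset_keywords_argv()))
```

Whether every file type should get keyword expansion (binary files, for instance) is a separate question this sketch does not address.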

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Jun 27 19:50:27 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 16:50:27 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
Message-ID: <E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>


On Jun 27, 2007, at 12:18 PM, Chris Fields wrote:

> Most projects make a clean break with cvs (no more commits) for the
> reasons you point out.  Not sure how the other core devs feel about
> that but I could go for that; it would def. prevent headaches.

There shouldn't be any cvs write support after the cut-over I think.  
I don't see the benefit that would justify the huge headache potential.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 27 20:01:40 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:01:40 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
Message-ID: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>


On Jun 27, 2007, at 2:50 PM, Hilmar Lapp wrote:

>
> On Jun 27, 2007, at 12:18 PM, Chris Fields wrote:
>
>> Most projects make a clean break with cvs (no more commits) for the
>> reasons you point out.  Not sure how the other core devs feel about
>> that but I could go for that; it would def. prevent headaches.
>
> There shouldn't be any cvs write support after the cut-over I  
> think. I don't see the benefit that would justify the huge headache  
> potential.
>
> 	-hilmar

Agreed, so maybe we should set that in stone.  That means no svn->cvs  
syncing post-migration as well, I assume.

Now how about a quick straw poll, what kind of access?  svn+ssh is  
already available, but some (Aaron among them) have indicated they  
would like https as well (not sure how involved it would be to set up).

chris


From hlapp at gmx.net  Wed Jun 27 20:08:40 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 17:08:40 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
Message-ID: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net>


On Jun 27, 2007, at 5:01 PM, Chris Fields wrote:

> That means no svn->cvs syncing post-migration as well, I assume.

That's a bit of a different story. People out there have URL links  
into our anonymous CVS repository. If it's not too troublesome (and
I tend to think it's not) I'd like to maintain those in working
order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi  
script that maps between the URL flavors (i.e., that maps a CVS-style  
URL to the equivalent SVN link).
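
A minimal sketch of the URL-mapping idea, in Python rather than Perl and under loudly stated assumptions: `OLD_PREFIX`, `NEW_BASE`, and the `<module>/trunk/` layout are hypothetical placeholders, not the real open-bio.org paths. It maps a CVS-style web URL to an SVN-style one and builds the CGI response with a redirect status.

```python
from urllib.parse import urlsplit

# Both layouts are assumptions for illustration only; the real CVS and SVN
# web front ends may use different paths.
OLD_PREFIX = "/cgi-bin/viewcvs.cgi/"         # hypothetical CVS-style prefix
NEW_BASE = "http://svn.example.org/viewvc/"  # hypothetical SVN web base


def map_cvs_url(old_url):
    """Return the equivalent SVN-style URL, or None if the URL is not recognised."""
    path = urlsplit(old_url).path  # the query (e.g. a CVS revision) is dropped
    if not path.startswith(OLD_PREFIX):
        return None
    module, _, rest = path[len(OLD_PREFIX):].partition("/")
    if not module:
        return None
    # Assumes each CVS module ends up under <module>/trunk/ after conversion.
    return f"{NEW_BASE}{module}/trunk/{rest}"


def redirect_response(old_url):
    """Build the CGI response text: a 301 redirect, or a 404 for unknown URLs."""
    new_url = map_cvs_url(old_url)
    if new_url is None:
        return ("Status: 404 Not Found\r\n"
                "Content-Type: text/plain\r\n\r\nUnknown CVS link\n")
    return f"Status: 301 Moved Permanently\r\nLocation: {new_url}\r\n\r\n"
```

CVS per-file revision numbers have no direct SVN equivalent, which is why the sketch discards the query string instead of trying to translate it.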

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Wed Jun 27 20:15:10 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 16:15:10 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
Message-ID: <18050.50510.84363.355034@almost.alerce.com>

David Messina writes:
 > > [Chris]
 > >
 > > I managed to get it working using file://.  Haven't tried svn+ssh yet
 > > but I've had persistent problems getting ssh to work properly on my
 > > macbook; not sure why yet but I haven't had time to play around  
 > > with it.
 > 
 > I just did a checkout and a test commit, both via svn+ssh -- works  
 > great for me.

Is there anyone working outside of bioperl-{run,live,ext}?

g.


From bix at sendu.me.uk  Wed Jun 27 20:22:13 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 21:22:13 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682B798.1010409@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk> <467FBDD3.8050009@sendu.me.uk>
	<46823ABE.2080300@sendu.me.uk> <4682B000.2050707@sheffield.ac.uk>
	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>
	<4682B798.1010409@sheffield.ac.uk>
Message-ID: <4682C6F5.4020406@sendu.me.uk>

Nathan S. Haigh wrote:
> It seems (to me at least) that Bioperl modules could/should? be released
> as individual modules and that "bioperl" would really constitute a
> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
> this thinking? The Bio::ASN1::EntrezGene could simply require a
> particular module rather than the whole of bioperl - might get out of
> the circular dependency theoretically!?

No, it wouldn't. The 'problem' only arises because the user is 
/choosing/ to install both Bioperl and Bio::ASN1::EntrezGene at the same 
time. So even if Bioperl was released as separate modules there would 
still be that 'bundle' and users would still choose to do the same 
thing: install all the Bioperl modules as well as all its /optional/ 
recommended modules. And there lies the problem: Bio::ASN1::EntrezGene 
requires  Bioperl modules, and one Bioperl module requires 
Bio::ASN1::EntrezGene, so the circularity isn't solved.


(FYI:
Bio::ASN1::EntrezGene requires Bio::Index::AbstractSeq
Bio::Index::AbstractSeq requires a couple of Bioperl modules, including 
Bio::Root::Root

Bio::SeqIO::entrezgene requires Bio::ASN1::EntrezGene and a bunch of 
Bioperl modules, including Bio::Root::Root.
)


You only avoid circularity by choosing not to install everything in one 
go. Which is something you can do right now with no problems.
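
To make the circularity concrete, here is a minimal sketch (Python, for illustration only) that looks at the dependencies at the level of whole distributions. The distribution names and the reduction of "optional" to a plain edge are simplifying assumptions; `find_cycle` is a hypothetical helper, not anything in Bioperl or CPAN.

```python
# Distribution-level view of the requirements listed above.
DIST_REQUIRES = {
    "bioperl": ["Bio-ASN1-EntrezGene"],  # via Bio::SeqIO::entrezgene (optional)
    "Bio-ASN1-EntrezGene": ["bioperl"],  # via Bio::Index::AbstractSeq
}


def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Walked back onto the current path: report the loop.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    print(find_cycle(DIST_REQUIRES))
```

As long as both are treated as indivisible units, the depth-first walk finds the loop between them.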


From n.haigh at sheffield.ac.uk  Wed Jun 27 20:24:18 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 21:24:18 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
In-Reply-To: <E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
Message-ID: <4682C772.5070502@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hilmar Lapp wrote:
> On Jun 27, 2007, at 12:18 PM, Chris Fields wrote:
> 
>> Most projects make a clean break with cvs (no more commits) for the
>> reasons you point out.  Not sure how the other core devs feel about
>> that but I could go for that; it would def. prevent headaches.
> 
> There shouldn't be any cvs write support after the cut-over I think.  
> I don't see the benefit that would justify the huge headache potential.
> 
> 	-hilmar

I agree. A clean switch from cvs read/write to svn read/write plus cvs
read only sounds the least problematic!

However, how will links to cvs be dealt with? Links on Bioperl could be
switched over to point to svn, but what about possible links from
external sources? Maybe a more generic approach of redirection could
work? Or a simple warning page stating the fact that we have moved from
cvs to svn and provide a common link to follow?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgsdyczuW2jkwy2gRAtuyAKDIpN0TNX0U7sTuE3i+fj6WFZ1K0QCfcX7Y
81KurFwJlRtYFxSmLZP56Sk=
=pp7b
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Wed Jun 27 20:30:19 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 17:30:19 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>


On Jun 26, 2007, at 5:21 PM, George Hartzell wrote:

>
>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>

Cool - this works for me.

One thing I notice is that in cvs log you see which version is in  
which branch which is useful to answer user queries that might be a  
version problem. svn log doesn't seem to want to show that. Does  
anyone have ideas for how to do this in svn?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Jun 27 20:32:18 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 17:32:18 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <4682C772.5070502@sheffield.ac.uk>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<4682C772.5070502@sheffield.ac.uk>
Message-ID: <D080DC49-A2A4-44E4-9027-A63C1772CD85@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Jun 27, 2007, at 5:24 PM, Nathan S. Haigh wrote:

> However, how will links to cvs be dealt with?

Well, as I said before, one can probably write a cgi script in a couple
of lines of Perl that returns the appropriate redirect URL with a
redirect status code.

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFGgslWuV6N2JxL7qsRAvsTAKDjR18NzWzlj74mCF+diNpe2dLV2ACgn/4Y
f6sJ/ngeKEGpKHgyAHM1DAA=
=8n0E
-----END PGP SIGNATURE-----


From cjfields at uiuc.edu  Wed Jun 27 20:50:11 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:50:11 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>
Message-ID: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu>


On Jun 27, 2007, at 3:30 PM, Hilmar Lapp wrote:

>
> On Jun 26, 2007, at 5:21 PM, George Hartzell wrote:
>
>>
>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>>
>
> Cool - this works for me.
>
> One thing I notice is that in cvs log you see which version is in  
> which branch which is useful to answer user queries that might be a  
> version problem. svn log doesn't seem to want to show that. Does  
> anyone have ideas for how to do this in svn?
>
> 	-hilmar

We should probably move it ASAP to a new directory that George can
write to when he needs to update.  cvs is in /home/repository/bioperl,
so maybe something similar, like /home/svn/repository/bioperl?

chris


From cjfields at uiuc.edu  Wed Jun 27 20:51:37 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:51:37 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<4FDAE951-7CFF-40B1-9CE3-9BCEAD37E58F@gmx.net>
Message-ID: <4D8CAAD9-4774-47FB-84E0-7FBA50EC377B@uiuc.edu>


On Jun 27, 2007, at 3:08 PM, Hilmar Lapp wrote:

>
> On Jun 27, 2007, at 5:01 PM, Chris Fields wrote:
>
>> That means no svn->cvs syncing post-migration as well, I assume.
>
> That's a bit of a different story. People out there have URL links  
> into our anonymous CVS repository. If it's not too troublesome (and  
> tend to I think it's not) I'd like to maintain those in working  
> order, either b/c there is still a sync'ed r/o cvs, or b/c of a cgi  
> script that maps between the URL flavors (i.e., that maps a CVS- 
> style URL to the equivalent SVN link).
>
> 	-hilmar

I'll try getting a wiki page up as a checklist for this, including  
what direction we're heading in, ideas (your list and CGI redirect  
ideas, svn:ignore issues, etc).  Dave has already started on the  
'getting bioperl using svn' wiki page.

If we intend to sync cvs with svn we need to find the right tools or  
at least check for other projects which have done something similar.   
I haven't googled on that yet but I'll attempt to tonight.

chris


From cjfields at uiuc.edu  Wed Jun 27 20:53:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 15:53:08 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <C2A83EA3.EC27%bosborne11@verizon.net>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
Message-ID: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>

bioperl-run also.  I think the run CVS repo has some binary files, so  
if there are any problems with cvs2svn it'll be there.

chris

On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote:

> George,
>
> bioperl-db and bioperl-network should be included, I think.
>
> Brian O
>
>
> On 6/27/07 4:15 PM, "George Hartzell" <hartzell at alerce.com> wrote:
>
>> David Messina writes:
>>>> [Chris]
>>>>
>>>> I managed to get it working using file://.  Haven't tried svn+ssh yet
>>>> but I've had persistent problems getting ssh to work properly on my
>>>> macbook; not sure why yet but I haven't had time to play around
>>>> with it.
>>>
>>> I just did a checkout and a test commit, both via svn+ssh -- works
>>> great for me.
>>
>> Is there anyone working outside of bioperl-{run,live,ext}?
>>
>> g.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Wed Jun 27 21:05:50 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 22:05:50 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682C6F5.4020406@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk>
Message-ID: <4682D12E.3000803@sendu.me.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> It seems (to me at least) that Bioperl modules could/should? be released
>> as individual modules and that "bioperl" would really constitute a
>> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
>> this thinking? The Bio::ASN1::EntrezGene could simply require a
>> particular module rather than the whole of bioperl - might get out of
>> the circular dependency theoretically!?
> 
> No, it wouldn't.
[snip]
> You only avoid circularity by choosing not to install everything in one 
> go.

Errr... I take that back. Since CPAN bundles install things in a certain 
order, you just have to make sure that everything Bio::ASN1::EntrezGene 
needs is installed first, then Bio::ASN1::EntrezGene, then 
Bio::SeqIO::entrezgene.
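
A minimal sketch of that ordering (Python, for illustration only), using the module-level requirements from my previous mail. "A bunch of Bioperl modules" is reduced to Bio::Root::Root for brevity, and `install_order` is a hypothetical helper; real CPAN bundles simply install in the order they list things.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each module maps to the set of modules it requires.
REQUIRES = {
    "Bio::Root::Root": set(),
    "Bio::Index::AbstractSeq": {"Bio::Root::Root"},
    "Bio::ASN1::EntrezGene": {"Bio::Index::AbstractSeq"},
    "Bio::SeqIO::entrezgene": {"Bio::ASN1::EntrezGene", "Bio::Root::Root"},
}


def install_order(requires):
    """Return the modules ordered so that each one follows its prerequisites."""
    # static_order() raises graphlib.CycleError if the graph is circular.
    return list(TopologicalSorter(requires).static_order())


if __name__ == "__main__":
    for module in install_order(REQUIRES):
        print(module)
```

At the level of individual modules the graph is a simple chain, so a valid install order exists; the circularity only shows up when the modules are grouped into two distributions.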

But the main problem with this approach is that maintenance, 
global-style code improvements and releases become a nightmare. I could, 
perhaps, imagine a scenario where the repository stayed as-is (one 
monolithic collection), but the dist action of Build.PL could be altered 
to generate a release package per module instead of one big release 
package of all modules, as is currently the case.

Is there much value in doing that? Does anyone want me to look into the 
feasibility of such a thing?


From bosborne11 at verizon.net  Wed Jun 27 20:19:47 2007
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 27 Jun 2007 16:19:47 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
In-Reply-To: <18050.50510.84363.355034@almost.alerce.com>
Message-ID: <C2A83EA3.EC27%bosborne11@verizon.net>

George,

bioperl-db and bioperl-network should be included, I think.

Brian O


On 6/27/07 4:15 PM, "George Hartzell" <hartzell at alerce.com> wrote:

> David Messina writes:
>>> [Chris]
>>> 
>>> I managed to get it working using file://.  Haven't tried svn+ssh yet
>>> but I've had persistent problems getting ssh to work properly on my
>>> macbook; not sure why yet but I haven't had time to play around
>>> with it.
>> 
>> I just did a checkout and a test commit, both via svn+ssh -- works
>> great for me.
> 
> Is there anyone working outside of bioperl-{run,live,ext}?
> 
> g.


From n.haigh at sheffield.ac.uk  Wed Jun 27 21:25:53 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 22:25:53 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682D12E.3000803@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
Message-ID: <4682D5E1.2030507@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> It seems (to me at least) that Bioperl modules could/should? be released
>>> as individual modules and that "bioperl" would really constitute a
>>> "bundle" of all these modules - in terms of CPAN anyway. Am I correct in
>>> this thinking? The Bio::ASN1::EntrezGene could simply require a
>>> particular module rather than the whole of bioperl - might get out of
>>> the circular dependency theoretically!?
>>
>> No, it wouldn't.
> [snip]
>> You only avoid circularity by choosing not to install everything in
>> one go.
> 
> Errr... I take that back. Since CPAN bundles install things in a certain
> order, you just have to make sure that everything Bio::ASN1::EntrezGene
> needs is installed first, then Bio::ASN1::EntrezGene, then
> Bio::SeqIO::entrezgene.
> 
> But the main problem with this approach is that maintenance,
> global-style code improvements and releases become a nightmare. I could,
> perhaps, imagine a scenario where the repository stayed as-is (one
> monolithic collection), but the dist action of Build.PL could be altered
> to generate a release package per module instead of one big release
> package of all modules, as is currently the case.
> 
> Is there much value in doing that? Does anyone want me to look into the
> feasibility of such a thing?


I think the value would be in other external modules being able to use
bioperl modules with more ease (I'm not sure how many modules have
depended, or currently depend, on bioperl), as they would depend on a
single module rather than the whole package. However, how would the
dependencies of each module be handled? I'm clearly thinking aloud,
but maybe this would tease apart "cliques" of interdependent modules
that could themselves be shipped as bundles (e.g. Bio::Graphics), with a
"master" bioperl bundle that installs all the bioperl modules.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgtXhczuW2jkwy2gRAiftAKDZQGDpaq5saEyE3ZfPyFqli4j+8QCfXbIB
2EZjccEFEzfFlx4H47gzwLk=
=nobl
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Wed Jun 27 21:35:28 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 27 Jun 2007 18:35:28 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
	<4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
Message-ID: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>

Is there a reason not to port every subproject over?

	-hilmar

On Jun 27, 2007, at 5:53 PM, Chris Fields wrote:

> bioperl-run also.  I think the run CVS repo has some binary files, so
> if there are any problems with cvs2svn it'll be there.
>
> chris
>
> On Jun 27, 2007, at 3:19 PM, Brian Osborne wrote:
>
>> George,
>>
>> bioperl-db and bioperl-network should be included, I think.
>>
>> Brian O
>>
>>
>> On 6/27/07 4:15 PM, "George Hartzell" <hartzell at alerce.com> wrote:
>>
>>> David Messina writes:
>>>>> [Chris]
>>>>>
>>>>> I managed to get it working using file://.  Haven't tried svn+ssh yet
>>>>> but I've had persistent problems getting ssh to work properly  
>>>>> on my
>>>>> macbook; not sure why yet but I haven't had time to play around
>>>>> with it.
>>>>
>>>> I just did a checkout and a test commit, both via svn+ssh -- works
>>>> great for me.
>>>
>>> Is there anyone working outside of bioperl-{run,live,ext}?
>>>
>>> g.
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Jun 27 21:36:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 16:36:29 -0500
Subject: [Bioperl-l] Splits again, formerly  Test overhaul complete
In-Reply-To: <4682D12E.3000803@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
Message-ID: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>


On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote:

> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> It seems (to me at least) that Bioperl modules could/should? be  
>>> released
>>> as individual modules and that "bioperl" would really constitute a
>>> "bundle" of all these modules - in terms of CPAN anyway. Am I  
>>> correct in
>>> this thinking? The Bio::ASN1::EntrezGene could simply require a
>>> particular module rather than the whole of bioperl - might get  
>>> out of
>>> the circular dependency theoretically!?
>> No, it wouldn't.
> [snip]
>> You only avoid circularity by choosing not to install everything  
>> in one go.
>
> Errr... I take that back. Since CPAN bundles install things in a  
> certain order, you just have to make sure that everything  
> Bio::ASN1::EntrezGene needs is installed first, then  
> Bio::ASN1::EntrezGene, then Bio::SeqIO::entrezgene.
>
> But the main problem with this approach is that maintenance, global- 
> style code improvements and releases become a nightmare. I could,  
> perhaps, imagine a scenario where the repository stayed as-is (one  
> monolithic collection), but the dist action of Build.PL could be  
> altered to generate a release package per module instead of one big  
> release package of all modules, as is currently the case.
>
> Is there much value in doing that? Does anyone want me to look into  
> the feasibility of such a thing?

Not for the time being, at least in my opinion.  Too much on our  
plate at this point with svn migration, test conversion, bugzilla  
running over (next point of attack!), etc.  Maybe something to think  
about after, though I like the idea of a few splits to core as Steve  
suggested (SearchIO, Graphics, some LWP-related DB modules).

My (albeit extreme) thought is to have a lean-and-mean set of 'core'  
modules with as few external dependencies as possible, which could  
work around the circular dependency issue in this case:

                dep.on                  dep.on
Bio::Auxiliary -----> ASN1::EntrezGene -----> core
(with EntrezGene)                            (basic SeqIO, Index, DB, etc)
       \---->------>--- dep.on ->----->----->----/

Bioperl auxiliary modules would list core as a required dependency  
along with anything else needed for that particular aux. section  
(i.e. XML parsers, LWP, GD, etc.).  The whole mess, if needed, would  
be installed using Bundle::BioPerl or similar, with no part released  
w/o testing on the whole 'base' to ensure proper interaction.

If a fix needed to be made in one set, make the fix, test against  
bioperl 'base' as a whole, and release when possible.  No need to  
wait for a full-fledged 1.5.3 release.

Maybe wishful thinking...

chris


From cjfields at uiuc.edu  Wed Jun 27 21:44:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 16:44:47 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
	<4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
	<9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>
Message-ID: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>

We should port them all, yes.

chris

On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote:

> Is there a reason not to port every subproject over?
>
> 	-hilmar


From cjfields at uiuc.edu  Wed Jun 27 21:53:02 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 16:53:02 -0500
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <4682D5E1.2030507@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<4682D5E1.2030507@sheffield.ac.uk>
Message-ID: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu>


On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote:

>> ...
>> Is there much value in doing that? Does anyone want me to look  
>> into the
>> feasibility of such a thing?
>
>
> I think the value would be in other external modules being able to use
> bioperl modules with more ease (not sure how many modules have, or
> currently depend on bioperl) as they would depend on a single module,
> rather than the whole package. However, how would the dependencies of
> each module be handled? I'm clearly thinking aloud, but....Maybe this
> would tease apart "cliques" of modules that are interdependent? and
> could in themselves be shipped as bundles e.g. Bio::Graphics and  
> have a
> "master" bioperl bundle that installs all the bioperl modules.

See my response to Sendu, and Steve Chervitz's original post and  
related thread:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315

which pretty much covers the same ground.  I think at most 4-5 split  
'cliques', including core, with the fewest possible dependencies in  
core.  If we do any of this, it prob. should wait until after an svn  
migration and bugzilla bug stomping unless there is a (well-argued)  
advantage to doing it now.

chris


From n.haigh at sheffield.ac.uk  Wed Jun 27 22:07:31 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 27 Jun 2007 23:07:31 +0100
Subject: [Bioperl-l] Test overhaul complete
In-Reply-To: <1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<4682D5E1.2030507@sheffield.ac.uk>
	<1DF8175A-5212-40CE-A2C5-B9C34E057D00@uiuc.edu>
Message-ID: <4682DFA3.9090100@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> 
> On Jun 27, 2007, at 4:25 PM, Nathan S. Haigh wrote:
> 
>>> ...
>>> Is there much value in doing that? Does anyone want me to look into the
>>> feasibility of such a thing?
>>
>>
>> I think the value would be in other external modules being able to use
>> bioperl modules with more ease (not sure how many modules have, or
>> currently depend on bioperl) as they would depend on a single module,
>> rather than the whole package. However, how would the dependencies of
>> each module be handled? I'm clearly thinking aloud, but....Maybe this
>> would tease apart "cliques" of modules that are interdependent? and
>> could in themselves be shipped as bundles e.g. Bio::Graphics and have a
>> "master" bioperl bundle that installs all the bioperl modules.
> 
> See my response to Sendu, and Steve Chervitz's original post and related
> thread:
> 
> http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315
> 
> which pretty much covers the same ground.  I think at most 4-5 split
> 'cliques', including core, with the fewest possible dependencies in
> core.  If we do any of this, it prob. should wait until after an svn
> migration and bugzilla bug stomping unless there is a (well-argued)
> advantage to doing it now.
> 
> chris


That's fine by me - or should I say, the best way forward - I was really
just thinking aloud :)

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGgt+jczuW2jkwy2gRAhPmAKDCgI1BOp/MOQVUQhQGqWaRRfPTaACfTPix
TSi/e8PtYTwpxn6x+ewrjBs=
=7Vp1
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Wed Jun 27 22:43:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 27 Jun 2007 23:43:48 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
Message-ID: <4682E824.1050507@sendu.me.uk>

Chris Fields wrote:
> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote:
>> But the main problem with this approach is that maintenance,
>> global-style code improvements and releases become a nightmare. I could,
>> perhaps, imagine a scenario where the repository stayed as-is (one  
>> monolithic collection), but the dist action of Build.PL could be  
>> altered to generate a release package per module instead of one big  
>> release package of all modules, as is currently the case.
>>
>> Is there much value in doing that? Does anyone want me to look into  
>> the feasibility of such a thing?
> 
> Not for the time being, at least in my opinion.  Too much on our  
> plate at this point with svn migration, test conversion, bugzilla  
> running over (next point of attack!), etc.  Maybe something to think  
> about after, though I like the idea of a few splits to core as Steve  
> suggested (SearchIO, Graphics, some LWP-related DB modules).
[snip]
> If a fix needed to be made in one set, make the fix, test against  
> bioperl 'base' as a whole, and release when possible.  No need to  
> wait for a full-fledged 1.5.3 release.

What advantage is there of these defined splits instead of individual 
modules? As I see it you lose some of the potential benefits of breaking 
Bioperl up completely, whilst also suffering the maintenance problems I 
outlined in my objection to Steve's post.

Being able to work on all Bioperl from a single cvs (ne svn)
check out/archive, whilst distributing it as individual modules on CPAN
seems like the best of both worlds to me. What am I missing?


From hartzell at alerce.com  Thu Jun 28 00:41:01 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 20:41:01 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<834150E7-9307-4B02-B5F0-6FB01393435D@gmx.net>
	<9DD44BC0-95D5-43E1-83F3-E3ACC6825563@uiuc.edu>
Message-ID: <18051.925.23313.932916@almost.alerce.com>

Chris Fields writes:
 > [...]
 > We prob. should move it to a new directory ASAP which george can
 > write to when he needs to update.  cvs is in /home/repository/bioperl,
 > so maybe something similar, like /home/svn/repository/bioperl?

I'd be parsimonious (lazy...) and go for /home/svn/bioperl.

g.


From hartzell at alerce.com  Thu Jun 28 00:46:29 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 20:46:29 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
Message-ID: <18051.1253.87485.235496@almost.alerce.com>

Chris Fields writes:
 > [...]
 > Now how about a quick straw poll, what kind of access?  svn+ssh is  
 > already available, but some (Aaron among them) have indicated they  
 > would like https as well (not sure how involved it would be to set up).

What we do here, in large part, depends on what our host machine makes
available to us.

Is there an apache instance that we can use?  Maybe a separate one?

May someone among us configure it, or do we need to ask for help?  (in
other words, does anyone have sudo?)

Is there some reason to not include http: (using Digest authentication
so that passwords aren't passed in the clear?)?  Maybe even go so far
as to ask why bother with https:, it's not like we need to transfer
any data encrypted....

g.


From dmessina at wustl.edu  Thu Jun 28 03:02:25 2007
From: dmessina at wustl.edu (David Messina)
Date: Wed, 27 Jun 2007 22:02:25 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
Message-ID: <D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>


On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote:

>
> On Jun 27, 2007, at 1:27 PM, David Messina wrote:
>
>> I would think we would want "Author Date Id Rev URL" set on
>> everything, no?. So either cvs2svn or your tool (whichever you think
>> is better), followed by
>>
>> 	svn propset svn:keywords "Author Date Id Rev URL" *
>
> Shouldn't this be done recursively?


Yep, good catch! Thanks, Hilmar.

Should be:

	svn propset --recursive svn:keywords "Author Date Id Rev URL" *
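
A minimal sketch of the recursive form, assuming it is run from the top
of a checked-out Subversion working copy. The set_keywords helper name is
hypothetical; it passes the directory itself instead of the shell glob,
so that versioned dot-files at the top level are covered too.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): set svn:keywords on every
# versioned file under a working-copy path, defaulting to the current
# directory.
set_keywords() {
    target="${1:-.}"
    # Passing the directory rather than "*" also reaches versioned
    # dot-files at the top level, which the glob would not expand to.
    svn propset --recursive svn:keywords "Author Date Id Rev URL" "$target"
}

# Example, from the top of a working copy:
#   set_keywords .
#   svn status
#   svn commit -m "Set svn:keywords on all files"
```

Nothing is sent to the repository until the change is committed, so
svn status and svn diff can be used to review it first.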


From jason at bioperl.org  Thu Jun 28 03:29:09 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 28 Jun 2007 00:29:09 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <18051.1253.87485.235496@almost.alerce.com>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
Message-ID: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>

I think Chris D and I will need to confer a bit on https+svn.  I
don't know when we'll have a good chance to discuss everything.  At
some point this discussion may need to be taken off the bioperl list
and continued among just the interested parties, as we're delving into
hardware geek land.

The repository machine (dev) is a locked down machine, meaning it
really only runs ssh and few other services; httpd is not among them.  We have
anonymous CVS (client and through httpd browsing) running on a  
separate machine (code) that has the info rsynced over every 10 or 15  
minutes. The foundation websites and mailing lists run on a third  
machine (portal).


If we decide to support https we'll need to spend a little time  
deciding how well we can keep it locked down - it will only be https  
not http for example and we may want to see about limiting ssh access  
to everyone if we migrate all OBF projects over to SVN and only  
support https.

Again to re-iterate what I think we would do:
  - SVN read/write will live on 'dev'; _WHEN_ we switch over there will
be no more writes to the CVS repository. It will be available by svn+ssh
and potentially by https+svn
  - SVN read-only will live on 'code', it will be accessible by http+svn
  - CVS read-only will live on 'code', this will only be a sync from  
the SVN to the CVS.  See http://svn2cvs.tigris.org/ for details


As I tried to ask for in the past, would someone also illustrate the  
importance of why _WE_ need to switch to SVN on a wiki page on  
Bioperl so that when someone complains/asks about this in the future  
the arguments are already laid out.  I am basically fine with it, but  
I don't honestly see a compelling reason beyond what has been  
mentioned wrt better integration in IDEs.
http://bioperl.org/wiki/Why_SVN

-jason
On Jun 27, 2007, at 9:46 PM, George Hartzell wrote:

> Chris Fields writes:
>> [...]
>> Now how about a quick straw poll, what kind of access?  svn+ssh is
>> already available, but some (Aaron among them) have indicated they
>> would like https as well (not sure how involved it would be to set up).
>
> What we do here, in large part, depends on what our host machine makes
> available to us.
>
> Is there an apache instance that we can use?  Maybe a separate one?
>
> May someone among us configure it, or do we need to ask for help?  (in
> other words, does anyone have sudo?)
>
> Is there some reason to not include http: (using Digest authentication
> so that passwords aren't passed in the clear?)?  Maybe even go so far
> as to ask why bother with https:, it's not like we need to transfer
> any data encrypted....
>
> g.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From jason at bioperl.org  Thu Jun 28 03:51:32 2007
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 28 Jun 2007 00:51:32 -0300
Subject: [Bioperl-l] Splits again
In-Reply-To: <4682E824.1050507@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
Message-ID: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>

Hey guys - I'm wading in a bit late as I haven't had time to keep up  
with whole discussion.

So you are suggesting 800+ individual CPAN modules?  I don't think  
that is a good idea.  Why would you split up Bio::Seq::RichSeq and  
Bio::Seq into two separate packages for example? I think if you  
really want to move away from the monolithic install it has to be  
more logical by function - but I am not that optimistic that this is  
going to actually be easier for people.  Maybe I'm misunderstanding.

What are the arguments for separating things -- to make it so people
aren't scared by the number of modules so they'll code?  It seems
like some people just want it to be installed and run scripts - does
having them install dozens of modules work?  Do we need to consider
how much this would suck for people who can't use CPAN or
Module::Build to automate dependency tracking and installation?  How
does it work when modules are deprecated?

I'm not sure I have made up my mind on what I'd like to see, but at  
some point I think we need to get a clearer idea of what audience we  
are trying to serve best.  If we want it to be easy to install, maybe we
should invest time into making OSX double-click installers, RPMs, and
the Windows stuff easily installable.  Or do we want to serve the
developers who aren't using SVN, so that we push out releases of
modules ASAP?  I just am not clear on the motivation for some of the
proposed changes.

Also - the main point I wanted to make - Can I suggest we spend a  
little time discussing what it will take to get a stable release for  
the current code as it stands (bioperl-live and bioperl-run)?  It  
seems like we really need to do this first so that we have a stable  
release that can be followed by CVS -> SVN migration, then consider  
major changes to the repository structure and release packaging, and  
potential deprecation and incorporation of other modules.


I assume there is no chance that we'd have a 1.6 candidate by BOSC  
next month?

Will it be productive to schedule a fair amount of time at BOSC
discussing how to partition out the packages into separate
sub-packages after we've done a successful release rather than trying to
change things right now? I realize not everyone will be there but
maybe it will be easier to interact on this then.

I think it will also be time to talk with Lincoln/Scott about how  
Gbrowse is structured and if that is working for them.  There is too  
much code in different places that I think we need to figure out how  
to structure it properly so those packages can be released.  It would  
probably mean moving Bio::Graphics, Bio::DB::GFF and  
Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages  
so they could be released more regularly on par with Gbrowse  
schedules.   Also I think someone needs to figure out Bio::Tools::GFF  
vs Bio::FeatureIO -- what do we want to do?  I don't think we really  
fully support GFF3 that well -- the X2GFF scripts probably need some
more good testing (where X is BLAST, FASTA, Sim4, GenBank, EMBL,
etc...) and/or migration to the proper GFF writing.


-jason
On Jun 27, 2007, at 7:43 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 27, 2007, at 4:05 PM, Sendu Bala wrote:
>>> But the main problem with this approach is that maintenance,
>>> global-style code improvements and releases become a nightmare. I could,
>>> perhaps, imagine a scenario where the repository stayed as-is (one
>>> monolithic collection), but the dist action of Build.PL could be
>>> altered to generate a release package per module instead of one big
>>> release package of all modules, as is currently the case.
>>>
>>> Is there much value in doing that? Does anyone want me to look into
>>> the feasibility of such a thing?
>>
>> Not for the time being, at least in my opinion.  Too much on our
>> plate at this point with svn migration, test conversion, bugzilla
>> running over (next point of attack!), etc.  Maybe something to think
>> about after, though I like the idea of a few splits to core as Steve
>> suggested (SearchIO, Graphics, some LWP-related DB modules).
> [snip]
>> If a fix needed to be made in one set, make the fix, test against
>> bioperl 'base' as a whole, and release when possible.  No need to
>> wait for a full-fledged 1.5.3 release.
>
> What advantage is there of these defined splits instead of individual
> modules? As I see it you lose some of the potential benefits of breaking
> Bioperl up completely, whilst also suffering the maintenance problems I
> outlined in my objection to Steve's post.
>
> Being able to work on all Bioperl from a single cvs (ne svn) check out/
> archive, whilst distributing it as individual modules on CPAN seems like
> the best of both worlds to me. What am I missing?

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/


From chris at bioteam.net  Thu Jun 28 04:08:25 2007
From: chris at bioteam.net (Chris Dagdigian)
Date: Thu, 28 Jun 2007 00:08:25 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
Message-ID: <97A3257B-8E00-48D7-8B7D-51AD728CB8F7@bioteam.net>


My understanding of "https+svn" is that it is actually
WebDAV-over-HTTP which means that not only would we need to light up a HTTPD
server on the developer box we'd also have to get a stable mod_dav  
module installed (sometimes not trivial) and then we would have to  
figure out how to handle the authentication bits. Right now with SSH  
we use Unix group permissions to figure out who can write to what  
repository -- WebDAV makes this a lot more complicated.

Forcing encryption over https will prevent someone from sniffing a  
developer password which removes the main security issue. The next  
problem is going to be integrating the DAV module with Linux PAM so  
that existing usernames and passwords can be used, -OR- we have to  
set up and maintain an entirely separate set of username and password  
maps for each developer and each SVN project.

I'm not super concerned about this -- BioTeam runs svn internally and  
we expose our SVN for employees both via WebDAV and SVN+SSH - it's  
not that hard to set up.

My biggest concern really has to do with how much extra work this  
will mean for the OBF sysadmin team. If there is an easy way to get a  
stable Apache/DAV/SVN integration going with authentication coming  
from Linux PAM then this is no big deal. If we have to manually  
maintain separate authentication lists then it will be kind of a hassle.

Like Jason mentioned, the OBF currently segregates "stuff" onto three  
different servers with three levels of security:

- dev.open-bio.org -- Developers only, SSH access only (main  
sourcecode repository for OBF)
- portal.open-bio.org -- Websites, Wikis, Blogs, Mailing list servers  
and helpdesk.open-bio.org
- code.open-bio.org -- "Disposable" anonymous access server that we  
can easily burn/wipe/reinstall if it ever gets hacked

Everything else that Jason mentioned is fine and easy to set up (if  
not already running):

  - SVN+SSH for developers
  - Anonymous SVN and Anonymous RSYNC for community access on  
code.open-bio.org
  - svn2cvs for whomever wants it on code.open-bio.org
  - web based SVN code browser installed on http://code.open-bio.org


Regards,
Chris


On Jun 27, 2007, at 11:29 PM, Jason Stajich wrote:

> I think Chris D and I will need to confer a bit on https+svn.  I
> don't know when we'll have a good chance to discuss everything.  At
> some point this discussion may need to be taken off the bioperl list
> and continued among just the interested parties, as we're delving into
> hardware geek land.
>
> The repository machine (dev) is a locked down machine, meaning it
> really only runs ssh and few other services; httpd is not among them.  We have
> anonymous CVS (client and through httpd browsing) running on a  
> separate machine (code) that has the info rsynced over every 10 or  
> 15 minutes. The foundation websites and mailing lists run on a  
> third machine (portal).
>
>
> If we decide to support https we'll need to spend a little time  
> deciding how well we can keep it locked down - it will only be  
> https not http for example and we may want to see about limiting  
> ssh access to everyone if we migrate all OBF projects over to SVN  
> and only support https.
>
> Again to re-iterate what I think we would do:
>  - SVN read/write will live on 'dev'; _WHEN_ we switch over there will
> be no more writes to the CVS repository. It will be available by svn+ssh
> and potentially by https+svn
>  - SVN read-only will live on 'code', it will be accessible by http+svn
>  - CVS read-only will live on 'code', this will only be a sync from  
> the SVN to the CVS.  See http://svn2cvs.tigris.org/ for details
>
>
> As I tried to ask for in the past, would someone also illustrate  
> the importance of why _WE_ need to switch to SVN on a wiki page on  
> Bioperl so that when someone complains/asks about this in the  
> future the arguments are already laid out.  I am basically fine  
> with it, but I don't honestly see a compelling reason beyond what  
> has been mentioned wrt better integration in IDEs.
> http://bioperl.org/wiki/Why_SVN
>
> -jason
> On Jun 27, 2007, at 9:46 PM, George Hartzell wrote:
>
>> Chris Fields writes:
>>> [...]
>>> Now how about a quick straw poll, what kind of access?  svn+ssh is
>>> already available, but some (Aaron among them) have indicated they
>>> would like https as well (not sure how involved it would be to set up).
>>
>> What we do here, in large part, depends on what our host machine makes
>> available to us.
>>
>> Is there an apache instance that we can use?  Maybe a separate one?
>>
>> May someone among us configure it, or do we need to ask for help?  (in
>> other words, does anyone have sudo?)
>>
>> Is there some reason to not include http: (using Digest authentication
>> so that passwords aren't passed in the clear?)?  Maybe even go so far
>> as to ask why bother with https:, it's not like we need to transfer
>> any data encrypted....
>>
>> g.
>
> --
> Jason Stajich
> jason at bioperl.org
> http://jason.open-bio.org/
>
>


From cjfields at uiuc.edu  Thu Jun 28 04:18:03 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 27 Jun 2007 23:18:03 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4682E824.1050507@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
Message-ID: <FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>


On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:

> Chris Fields wrote:
> ...
>> If a fix needed to be made in one set, make the fix, test against   
>> bioperl 'base' as a whole, and release when possible.  No need to   
>> wait for a full-fledged 1.5.3 release.
>
> What advantage is there of these defined splits instead of  
> individual modules? As I see it you lose some of the potential  
> benefits of breaking Bioperl up completely, whilst also suffering  
> the maintenance problems I outlined in my objection to Steve's post.
>
> Being able to work on all Bioperl from a single cvs (ne svn) check  
> out/ archive, whilst distributing it as individual modules on CPAN  
> seems like the best of both worlds to me. What am I missing?

Okay, forewarned, but here's my long-winded reasoning.  The short and  
sweet version: I (very) respectfully don't agree with you, at least  
re: the idea we should commit all modules to CPAN independently.  It  
doesn't make any sense to me, but maybe you can elaborate more?   
Maybe I'm misinterpreting what you mean?

Also, I agree with Steve C. that core is anything but a  
representation of a 'core' set of modules, and some sections could  
(should?) be split off into discrete, cohesive units.  We may be  
alone in that camp, though it doesn't seem so (it's popped up more  
than a few times, in one form or another).  If you want an in-depth  
explanation for both opinions, read on (below my sig), or feel free  
to bypass it.  I'll understand.

Finally, all of this should wait until later.  Much later, like after  
a decent release, after svn, etc kind of 'later'.  I think we can  
agree on that.

.
.
.
.
.

Still here?  Okay... each issue (skip as needed):

Individual CPAN modules:

CPAN is not our personal versioning system; it may be if a  
distribution consists of only a few modules, but not when it's one of  
the largest distros present.  If someone wants to update an  
individual bioperl module for a quick bug fix they are more than  
welcome to download it via cvs, svn, or even using a web browser, and  
replace the one they have.  In most cases, it works w/o problems.   
With Module::Build you have even made it easier if a full  
installation is necessary.

I'm trying to reason how one could break up the individual
SeqIO/SearchIO/otherIO modules into single module distributions.  They are
intrinsically tied together (SeqIO::genbank won't work w/o SeqIO,  
which relies on the various interfaces, RootIO, and on down).  How  
would tests be run off CPAN when the modules are distributed  
independently?  Would they also be individually distributed?  What  
would you use to tie all the individual modules together?  How would  
you explain to the CPAN maintainers that you want to split bioperl  
into 990 individual modules, all updated independently, but intend on  
bundling them afterwards anyway?

I'm failing to see the advantages to this approach, but if you can  
find an example where this was done successfully on CPAN or elsewhere  
maybe I could see what you mean.

Splitting up core:

Here are the advantages of a defined split as Steve and I see them
(off the top of my head).  Some of this probably reiterates
my previous points, as well as Steve's, so apologies in advance.

- A lean, mean, focused set of bioperl base modules (core) w/o or  
with very few external deps, minimal installation issues, etc.  The  
very basic stuff to get up and running.

- BioPerl bundled modules (Nathan's 'cliques') with defined, focused  
functionality, code, and tests, which add a bit more 'sugar' to the  
base functionality of the core.  If you only care about parsing BLAST  
reports, get SearchIO, which requires core and optionally other  
modules (XML::SAX).  If you want additional DB functionality apart  
from the very basic ones in core, install DB (with its additional
requirements, including core, DBI, and so on).  Same with Graphics,  
Tools, Tree/Phylo, etc.  We just need to define and limit the number  
of splits.

- Easier to add additional bundled modules.  For instance, I could
focus all of my RNA work into a discrete set of modules (say,
bioperl-rna) which I maintain, I ensure works with the latest core code, I
ensure also plays well with the other children =) , and I distribute
via CPAN.  Same with EUtilities, which could go into a separated
DB-related set or stay in core.

- If we want a full-fledged 'install everything', the CPAN Bundle  
system is available.  I think it's easier to use a Bundle for 4-5,  
even 10 groups of modules as opposed to over 900.

- A Bundle or a build file where discrete distributions are listed  
(Bio::SearchIO, etc) wouldn't need to be updated every time a new  
module is added to a distribution.  I suppose this could be  
automated, but why have the additional headache?

- A chance to cut out some cruft.  We all know that particular areas
need work or a complete overhaul (Restriction, Structure, maybe a few
others).  Smaller, concentrated sets of modules would, I believe, be
easier to maintain, and those that don't get used will eventually fall
out of favor and may be dropped or replaced by better-maintained
modules.  Survival of the fittest.

- We already have had practice; bioperl-db, bioperl-run,
bioperl-network, and others.  Those that have been routinely maintained and
enjoy wide use (db, run, network) have survived; others not so much
(corba-related stuff, microarray, ext, etc., though the code is still
available if someone else wants to take it up and revive it!).

Disadvantages of a defined split:

- The initial headache of identifying which groups go where,  
coordinating with those who rely on bioperl (GMOD, etc) on how this  
will be set up, so on...

- Separate groups of modules require testing together to ensure  
functionality is consistent and maintained (something I think you  
pointed out previously).

- I think there is an increased possibility of branching.

- Extra headaches for devs, who have to keep track of the various  
critical distributions and make sure they work well together.

- Maybe others, but it's getting late here.  Add more as needed; I'm  
sure there are a number more.


chris


From cjfields at uiuc.edu  Thu Jun 28 05:17:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 00:17:01 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
Message-ID: <671B8432-28DA-47DA-9E0C-66AF0E3D5973@uiuc.edu>

D'oh!  Just when I wanted to go to bed.  It's not fair, you're in  
California...

On Jun 27, 2007, at 10:51 PM, Jason Stajich wrote:

> Hey guys - I'm wading in a bit late as I haven't had time to keep up
> with whole discussion.
>
> So you are suggesting 800+ individual CPAN modules?  I don't think
> that is a good idea.  Why would you split up Bio::Seq::RichSeq and
> Bio::Seq into two separate packages for example? I think if you
> really want to move away from the monolithic install it has to be
> more logical by function - but I am not that optimistic that this is
> going to actually be easier for people.  Maybe I'm misunderstanding.

Okay, so maybe it wasn't just me.

> What are the arguments for separating things -- to make it so people
> aren't scared by the number of modules so they'll code?  It seems
> like some people just want it to be installed and run scripts - does
> having them install dozens of modules work?  Do we need to consider
> how much this would suck for people who can't use CPAN or
> Module::Build to automate dependency tracking and installation?  How
> does it work when modules are deprecated?

What I envision for core is maybe not just one distribution, but a  
cluster of distributions:

base - Bio::Seq; Bio::SeqIO; Bio::AlignIO, some Bio::DB, associated  
modules.  Bare bones, with as few dependencies as possible.
aux - Any Bio::SeqIO, Bio::AlignIO, Bio::DB etc. that requires  
additional modules.
search - Bio::Search and SearchIO
tools - Bio::Tools, Bio::Restriction, maybe DB modules, GFF-related  
stuff?
graphics - Bio::Graphics.  Maybe GMOD-related stuff here?

The last four would list bioperl-core as a dependency themselves  
along with any other modules necessary.  We could also have the core  
Build.PL ask the user if they want to install the other non-base  
distros, and maybe include bioperl-db, bioperl-network, and
bioperl-run in the loop if requested.

All would be installed as a bundle similar to Bundle::BioPerl, but  
have regular CPAN point releases (1.x.x) independently from one  
another i.e. for bug fixes, with a yearly/biyearly timed full release  
(1.x) of the whole shebang.  Any point release for any 'core'  
distribution would have to be tested against the others prior to  
release.

This is basically following Steve's train of thought, though more  
elaborated:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/15315/focus=15315

> I'm not sure I have made up my mind on what I'd like to see, but at
> some point I think we need to get a clearer idea of what audience we
> are trying to serve best.  If want it to be easy to install maybe we
> should invest time into making OSX double-click installers, RPMs, and
> the Windows stuff easily installable.  If we want to serve the
> developers who aren't using SVN so we want to push out releases of
> modules ASAP?  I just am not clear on the motivation for some of the
> proposed changes.

I think regular CPAN releases with updated PPMs hosted via portal  
work fine for the most part, but it would be nice to host RPMs.   
Others (Allen Day, for instance) have donated time to generate RPMs  
but they seem to lag behind a bit more.

The original idea for svn arose from an unrelated thread with Mark  
Johnson discussing something (Glimmer maybe?) and took off from  
there.  I was actually pretty surprised it took on a life of its
own.  As for the motivation to switch, I haven't specifically used it  
myself, but the large number of responses seem to indicate others  
have and seem happy with it.  Rutger Vos had also indicated he would  
move Bio::Phylo over to the repo if we used svn.  We def. should  
address the issues you bring up (why _WE_ need svn) more succinctly  
but that shouldn't be an issue.

> Also - the main point I wanted to make - Can I suggest we spend a
> little time discussing what it will take to get a stable release for
> the current code as it stands (bioperl-live and bioperl-run)?  It
> seems like we really need to do this first so that we have a stable
> release that can be followed by CVS -> SVN migration, then consider
> major changes to the repository structure and release packaging, and
> potential deprecation and incorporation of other modules.

Agreed.  We prob. need to schedule a good couple of days (or so) to  
squash bugs.

> I assume there is no chance that we'd have a 1.6 candidate by BOSC
> next month?

Um, not likely as nothing has been addressed Feature/Annotation-wise  
(overloads are still there, methods have not been deprecated, etc).   
There was an underlying assumption these would have an effect on
GMOD-related stuff (I remember reading a post from Scott Cain in the
mail archive mentioning something along these lines after the 1.5
release hubbub).

Maybe a quick 1.5.3 for BOSC, with a 1.6 for fall?

> Will it be productive to schedule a fair amount of time at BOSC
> discussing how to partition out the packages into separate sub-
> packages after we've done a successful release rather than trying to
> change things right now? I realize not everyone will be there but
> maybe it will be easier to interact on this then.

How many are going to be there?  I can't go this year except on my  
own dime (which I don't have many of, student loans and all, sorry),  
though I'll likely be in a new lab by spring which is likely more  
amenable to funding.  If there is a hackathon in the late fall
(post-Sept) I'll make it a point to go regardless.

> I think it will also be time to talk with Lincoln/Scott about how
> Gbrowse is structured and if that is working for them.  There is too
> much code in different places that I think we need to figure out how
> to structure it properly so those packages can be released.  It would
> probably mean moving Bio::Graphics, Bio::DB::GFF and
> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages
> so they could be released more regularly on par with Gbrowse
> schedules.   Also I think someone needs to figure out Bio::Tools::GFF
> vs Bio::FeatureIO -- what do we want to do?  I don't think we really
> fully support GFF3 that well -- the X2GFF scripts probably need some
> more good testing (where X is BLAST,FASTA,Sim4, GenBank, EMBL,
> etc... ) and or migration to the proper GFF writing.
>
>
> -jason

Will Lincoln or Scott be at BOSC?

chris


From dmessina at wustl.edu  Thu Jun 28 05:21:58 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 00:21:58 -0500
Subject: [Bioperl-l] finding statistics on AA
In-Reply-To: <4681F4B4.8010609@pacific.net.sg>
References: <4681F4B4.8010609@pacific.net.sg>
Message-ID: <F57E70E8-BBDA-45CF-B2C7-E05AED04F4C6@wustl.edu>

Hi Melvin,

I don't think BioPerl has any information content-related code. I'm  
not terribly familiar with it myself, but the usual recommendation is  
to look at the EMBOSS package:

	http://en.wikipedia.org/wiki/EMBOSS

Dave


From bix at sendu.me.uk  Thu Jun 28 06:38:48 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 07:38:48 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
Message-ID: <46835778.5070901@sendu.me.uk>

Jason Stajich wrote:
> So you are suggesting 800+ individual CPAN modules?
> I don't think that is a good idea.  Why would you split up
> Bio::Seq::RichSeq and Bio::Seq into two separate packages for
> example? I think if you really want to move away from the monolithic
> install it has to be more logical by function - but I am not that
> optimistic that this is going to actually be easier for people.
> Maybe I'm misunderstanding.
> 
> What are the arguments for separating things -- to make it so people
>  aren't scared by the number of modules so they'll code?  It seems
> like some people just want it to be installed and run scripts - does
> having them install dozens of modules work.  Do we need to consider
> people how much this would suck if someone can't use CPAN or
> Module::Builder to automate dependancy tracking installation?  How
> does it work when modules are deprecated?

See my upcoming reply to Chris. Briefly, if the only change is to the
dist action of Build.PL, we can make a single archive of all modules
available to non-CPAN users, and individual modules available to CPAN
users. No problems.


> Also - the main point I wanted to make - Can I suggest we spend a
> little time discussing what it will take to get a stable release for
> the current code as it stands (bioperl-live and bioperl-run)?  It
> seems like we really need to do this first so that we have a stable
> release that can be followed by CVS -> SVN migration, then consider
> major changes to the repository structure and release packaging, and
> potential deprecation and incorporation of other modules.

I'd recommend that a 'stable' release shouldn't happen until we resolve
all the missing tests and bugzilla bugs (because I think the opportunity
should be taken to have it stable both in terms of interface /and/
bugs). Which is a lot of work.


> I assume there is no chance that we'd have a 1.6 candidate by BOSC
> next month?

None.


From bix at sendu.me.uk  Thu Jun 28 07:25:03 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 08:25:03 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
Message-ID: <4683624F.6020402@sendu.me.uk>

Chris Fields wrote:
> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:
>> What advantage is there of these defined splits instead of  
>> individual modules? As I see it you lose some of the potential  
>> benefits of breaking Bioperl up completely, whilst also suffering  
>> the maintenance problems I outlined in my objection to Steve's post.
>>
>> Being able to work on all Bioperl from a single cvs (ne svn) check  
>> out/ archive, whilst distributing it as individual modules on CPAN  
>> seems like the best of both worlds to me. What am I missing?
> 
> Okay, forewarned, but here's my long-winded reasoning.  The short and  
> sweet version: I (very) respectfully don't agree with you, at least  
> re: the idea we should commit all modules to CPAN independently. It  
> doesn't make any sense to me, but maybe you can elaborate more?   
> Maybe I'm misinterpreting what you mean?

The short and sweet version: my proposal has all the benefits of yours, 
but none of the disadvantages. What's not to like?


> Finally, all of this should wait until later.  Much later, like after  
> a decent release, after svn, etc kind of 'later'.  I think we can  
> agree on that.

Hmm, not really. If it can be implemented by a change in just Build.PL 
and ModuleBuildBioperl, it's really independent of everything else.
That's the beauty of it: the only thing that changes is how things are 
uploaded to and downloaded from CPAN. The only person that normally 
deals with that issue is the pumpkin for a release, and he only cares 
about it at release time.

In fact, if we're going to do it at all it makes sense to try it out on 
a minor release like 1.5.3. We've already got experience of doing it 
split-style from 1.5.2. (And let me tell you: splits at the code-base 
level suck.)


> Individual CPAN modules:
> 
> CPAN is not our personal versioning system; it may be if a  
> distribution consists of only a few modules, but not when it's one of  
> the largest distros present.  If someone wants to update an  
> individual bioperl module for a quick bug fix they are more than  
> welcome to download it via cvs, svn, or even using a web browser, and  
> replace the one they have.

And where is the harm in letting them do it via CPAN as well? In fact, 
there are significant benefits:


> I'm trying to reason how one could break up the individual SeqIO/ 
> SearchIO/otherIO modules into single module distributions.  They are  
> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO,  
> which relies on the various interfaces, RootIO, and on down).  How  
> would tests be run off CPAN when the modules are distributed  
> independently?

Bio::SeqIO::genbank would have a dependency on the latest version of 
Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies.

So when a user wants to get the latest version of Bio::SeqIO::genbank, 
they no longer have to worry about what other modules in its dependency 
hierarchy they should also install.

Instead they just request Bio::SeqIO::genbank, which itself ensures they
have the latest version of all its dependencies before installing itself
and running its tests.

When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank 
users should have, he could just call './Build dist Bio::SeqIO::genbank' 
which would generate a new package for Bio::SeqIO::genbank suitable for 
uploading to CPAN. No more long release cycles and having to constantly 
tell people to 'use CVS' to get working Bioperl code.
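
A minimal sketch of what a per-module Build.PL might declare under this
scheme. The version numbers, and the idea that Bio::SeqIO would be indexed
with its own version, are assumptions for illustration only; the 'requires'
parameter is the usual Module::Build way of stating such a dependency.

```perl
use strict;
use warnings;
use Module::Build;

# Hypothetical stand-alone distribution for one parser module.
my $build = Module::Build->new(
    module_name  => 'Bio::SeqIO::genbank',
    dist_version => '1.5.3',          # assumed per-module version
    license      => 'perl',
    requires     => {
        # Assumed minimum version; Bio::SeqIO would in turn list its
        # own dependencies (interfaces, RootIO, and so on).
        'Bio::SeqIO' => '1.5.3',
    },
);

$build->create_build_script;
```

With a declaration like this, a CPAN client would be expected to fetch and
install Bio::SeqIO and everything below it before building and testing the
genbank parser.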


> Would they also be individually distributed?  What  
> would you use to tie all the individual modules together?  How would  
> you explain to the CPAN maintainers that you want to split bioperl  
> into 990 individual modules, all updated independently, but intend on  
> bundling them afterwards anyway?

They would be tied together by a CPAN bundle. You don't have to 
'explain' anything to the CPAN maintainers because you're not doing 
anything wrong. In fact, you're using it the way you're supposed to.


> Splitting up core:
> 
> As I see it, here are the advantages of a defined split as Steve and  
> I see it (off the top of my head).  Some of this probably reiterates  
> my previous points, as well as Steve's, so apologies in advance.

Below I answer with how it would be with my single-module approach 
compared to the defined splits.


> - A lean, mean, focused set of bioperl base modules (core) w/o or  
> with very few external deps, minimal installation issues, etc.  The  
> very basic stuff to get up and running.

Even leaner, even more focused.


> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused  
> functionality, code, and tests, which add a bit more 'sugar' to the  
> base functionality of the core.  If you only care about parsing BLAST  
> reports, get SearchIO, which requires core and optionally other  
> modules (XML::SAX).  If you want additional DB functionality apart  
> from the very basic ones in core, install DB (with it's additional  
> requirements, including core, DBI, and so on).  Same with Graphics,  
> Tools, Tree/Phylo, etc.  We just need to define and limit the number  
> of splits.

The same can be achieved with CPAN bundles for each kind of functional 
grouping you can think of. And since it's just a single text file that
defines such a grouping, it's easy to change or add new ones as you feel
like it, as opposed to the rather more permanent and substantial effort 
of creating one of your splits on the code-base level.

Also, the world doesn't have to rely on /our/ ideas of what a useful 
functional split is. If someone just wants to parse Blast results, they 
can just use CPAN to install Bio::SearchIO::blast_pull instead of having 
to install all of SearchIO.
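
As a rough illustration of the "single text file", a CPAN bundle is a small
module whose POD has a CONTENTS section listing one module per line. The
bundle name below is hypothetical, and the sketch assumes the listed modules
would each be installable on their own from CPAN, which is the proposal
rather than the current state.

```perl
package Bundle::BioPerl::BlastParsing;   # hypothetical bundle name

use strict;
use warnings;

our $VERSION = '0.01';

1;

__END__

=head1 NAME

Bundle::BioPerl::BlastParsing - modules for parsing BLAST reports

=head1 SYNOPSIS

  perl -MCPAN -e 'install Bundle::BioPerl::BlastParsing'

=head1 CONTENTS

Bio::SearchIO

Bio::SearchIO::blast

Bio::SearchIO::blast_pull

=cut
```

Changing the grouping would then mean editing the CONTENTS list and
uploading a new version of this one file.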


> - Easier to add additional bundled modules.  For instance, I could  
> focus all of my RNA work into a discrete set of modules (say, bioperl- 
> rna) which I maintain, I ensure works with the latest core code, I  
> ensure also plays well with the other children =) , and I distribute  
> via CPAN.  Same with EUtilities, which could go into a separated DB- 
> related set or stay in core.

And if you lose interest in them? They eventually die because they no 
longer have someone looking after them by default (the pumpkin and other 
devs). Alternatively you could just make a CPAN bundle. One text file! 
Easy! No duplication of modules in CPAN, no new hassle for you or the 
Bioperl 'core' pumpkin to ensure that the latest version of each work 
with each other and other splits.


> - If we want a full-fledged 'install everything', the CPAN Bundle  
> system is available.  I think it's easier to use a Bundle for 4-5,  
> even 10 groups of modules as opposed to over 900.

No, it isn't any easier. It's /equally/ easy to install a bundle of 900
packages of 900 modules as it is to install 5 packages of 900 modules.

When not installing absolutely everything, but perhaps 'most' things, 
there's the additional benefit that it would be easier to skip a 
particular Bio::module because you didn't want to install its external 
dependencies and weren't that interested in it anyway.


> - A Bundle or a build file where discrete distributions are listed  
> (Bio::SearchIO, etc) wouldn't need to be updated every time a new  
> module is added to a distribution.  I suppose this could be  
> automated, but why have the additional headache?

Yes, it would be automated, and no, it wouldn't at all be any kind of 
additional headache. I'm proposing a fully-automated system that the
pumpkin wouldn't even have to think about. Much /less/ of a headache
than dealing with splits. Orders of magnitude easier to deal with.


> - A chance to cut out some cruft.  We all know that particular areas  
> need work or a complete overhaul (Restriction, Structure, maybe a few  
> others).  Smaller, concentrated sets of modules I believe would be  
> easier to maintain, and those that don't get use will eventually fall  
> out of favor and may be lost or replaced from the more maintained  
> group of modules.  Survival of the fittest.

And the smallest, most concentrated set of modules is the individual module.


> - We already have had practice; bioperl-db, bioperl-run, bioperl- 
> network, and others.  Those that have been routinely maintained and  
> enjoy wide use (db, run, network) have survived; others not so much  
> (corba-related stuff, microarray, ext, etc., though the code is still  
> available if someone else wants to take it up and revive it!).

The reason some of these existing splits (microarray, ext) have fallen by
the way-side? /Because/ they're splits. If they had been part of
bioperl-live all along, they'd have been kept in a working, compatible
state and would have been released along with everything else in 1.5.2.


> Disadvantages of a defined split:
> 
> - The initial headache of identifying which groups go where,  
> coordinating with those who rely on bioperl (GMOD, etc) on how this  
> will be set up, so on...

No need to worry about this with individual modules.


> - Separate groups of modules require testing together to ensure  
> functionality is consistent and maintained (something I think you  
> pointed out previously).

No need to worry.


> - I think an increased possibility of branching is possible.
> 
> - Extra headaches for devs, who have to keep track of the various  
> critical distributions and make sure they work well together.

No headaches.


From charles-listes+bioperl at plessy.org  Thu Jun 28 07:40:04 2007
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Thu, 28 Jun 2007 16:40:04 +0900
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
Message-ID: <20070628074004.GD6338@kunpuu.plessy.org>

Dear developers,

I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if
it would make sense to call it "bioperl-live" and distribute it in
parallel with the stable 1.4.0 version, if bioperl-live means "the
current developer version".

If I am wrong, can somebody explain to me what exactly bioperl-live refers
to?

Have a nice day,

-- 
Charles Plessy
Debian-med packaging team
Wako, Saitama, Japan


From n.haigh at sheffield.ac.uk  Thu Jun 28 08:23:10 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 09:23:10 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683624F.6020402@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
Message-ID: <46836FEE.5030203@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Chris Fields wrote:
>> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:
>>> What advantage is there of these defined splits instead of 
>>> individual modules? As I see it you lose some of the potential 
>>> benefits of breaking Bioperl up completely, whilst also suffering 
>>> the maintenance problems I outlined in my objection to Steve's post.
>>>
>>> Being able to work on all Bioperl from a single cvs (ne svn) check 
>>> out/ archive, whilst distributing it as individual modules on CPAN 
>>> seems like the best of both worlds to me. What am I missing?
>>
>> Okay, forewarned, but here's my long-winded reasoning.  The short and 
>> sweet version: I (very) respectfully don't agree with you, at least 
>> re: the idea we should commit all modules to CPAN independently. It 
>> doesn't make any sense to me, but maybe you can elaborate more?  
>> Maybe I'm misinterpreting what you mean?
> 
> The short and sweet version: my proposal has all the benefits of yours,
> but none of the disadvantages. What's not to like?
> 
> 
>> Finally, all of this should wait until later.  Much later, like after 
>> a decent release, after svn, etc kind of 'later'.  I think we can 
>> agree on that.
> 
> Hmm, not really. If it can be implemented by a change in just Build.PL
> and ModuleBuildBioperl, its really independent of everything else.
> That's the beauty of it: the only thing that changes is how things are
> uploaded to and downloaded from CPAN. The only person that normally
> deals with that issue is the pumpkin for a release, and he only cares
> about it at release time.
> 
> In fact, if we're going to do it at all it makes sense to try it out on
> a minor release like 1.5.3. We've already got experience of doing it
> split-style from 1.5.2. (And let me tell you: splits at the code-base
> level suck.)
> 
> 
>> Individual CPAN modules:
>>
>> CPAN is not our personal versioning system; it may be if a 
>> distribution consists of only a few modules, but not when it's one of 
>> the largest distros present.  If someone wants to update an 
>> individual bioperl module for a quick bug fix they are more than 
>> welcome to download it via cvs, svn, or even using a web browser, and 
>> replace the one they have.
> 
> And where is the harm in letting them do it via CPAN as well? In fact,
> there are significant benefits:
> 
> 
>> I'm trying to reason how one could break up the individual SeqIO/
>> SearchIO/otherIO modules into single module distributions.  They are 
>> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, 
>> which relies on the various interfaces, RootIO, and on down).  How 
>> would tests be run off CPAN when the modules are distributed 
>> independently?
> 
> Bio::SeqIO::genbank would have a dependency on the latest version of
> Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies.
> 
> So when a user wants to get the latest version of Bio::SeqIO::genbank,
> they no longer have to worry about what other modules in its dependency
> hierarchy they should also install.
> 
> Instead they just request Bio::SeqIO::genbank which itself ensures you
> have the latest version of all its dependencies before installing itself
> and running its tests.

This was my thinking when I first brought this up at the
beginning/splitting of this thread. Thinking of modules as the
constituent parts of a larger package should make it far easier for
people to define dependencies, and users would only need to install
those parts they require. As Sendu points out, if the user
wants to convert seqs from genbank to fasta they could simply install
Bio::SeqIO::genbank and Bio::SeqIO::fasta and they would get all the
other modules that are the dependencies of Bio::SeqIO::genbank and
Bio::SeqIO::fasta.

> 
> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank
> users should have, he could just call './Build dist Bio::SeqIO::genbank'
> which would generate a new package for Bio::SeqIO::genbank suitable for
> uploading to CPAN. No more long release cycles and having to constantly
> tell people to 'use CVS' to get working Bioperl code.

However, how would the test suite work out with this? e.g. when someone
installs Bio::SeqIO::genbank they want the tests associated with
Bio::SeqIO::genbank to be run. Would there be tests that would be run
redundantly if, for example, someone installed Bio::SeqIO::genbank and
Bio::SeqIO::fasta?

> 
> 
>> Would they also be individually distributed?  What  would you use to
>> tie all the individual modules together?  How would  you explain to
>> the CPAN maintainers that you want to split bioperl  into 990
>> individual modules, all updated independently, but intend on  bundling
>> them afterwards anyway?
> 
> They would be tied together by a CPAN bundle. You don't have to
> 'explain' anything to the CPAN maintainers because you're not doing
> anything wrong. In fact, you're using it the way you're supposed to.

Yep. Real modules are released as modules, each with their own set of
dependencies. Then use CPAN bundles the way they were supposed to be used:
for distributing a set of CPAN modules that make a coherent set of
functionality. You "could" also bundle in other authors' modules e.g.
Bio::ASN1::EntrezGene?

> 
> 
>> Splitting up core:
>>
>> As I see it, here are the advantages of a defined split as Steve and 
>> I see it (off the top of my head).  Some of this probably reiterates 
>> my previous points, as well as Steve's, so apologies in advance.
> 
> Below I answer with how it would be with my single-module approach
> compared to the defined splits.
> 
> 
>> - A lean, mean, focused set of bioperl base modules (core) w/o or 
>> with very few external deps, minimal installation issues, etc.  The 
>> very basic stuff to get up and running.
> 
> Even leaner, even more focused.
> 
> 
>> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused 
>> functionality, code, and tests, which add a bit more 'sugar' to the 
>> base functionality of the core.  If you only care about parsing BLAST 
>> reports, get SearchIO, which requires core and optionally other 
>> modules (XML::SAX).  If you want additional DB functionality apart 
>> from the very basic ones in core, install DB (with it's additional 
>> requirements, including core, DBI, and so on).  Same with Graphics, 
>> Tools, Tree/Phylo, etc.  We just need to define and limit the number 
>> of splits.
> 
> The same can be achieved with CPAN bundles for each kind of functional
> grouping you can think of. And since its just a single text file that
> defines such a grouping, its easy to change or add new ones as you feel
> like it, as opposed to the rather more permanent and substantial effort
> of creating one of your splits on the code-base level.
> 
> Also, the world doesn't have to rely on /our/ ideas of what a useful
> functional split is. If someone just wants to parse Blast results, they
> can just use CPAN to install Bio::SearchIO::blast_pull instead of having
> to install all of SearchIO.
> 
> 
>> - Easier to add additional bundled modules.  For instance, I could 
>> focus all of my RNA work into a discrete set of modules (say, bioperl-
>> rna) which I maintain, I ensure works with the latest core code, I 
>> ensure also plays well with the other children =) , and I distribute 
>> via CPAN.  Same with EUtilities, which could go into a separated DB-
>> related set or stay in core.
> 
> And if you lose interest in them? They eventually die because they no
> longer have someone looking after them by default (the pumpkin and other
> devs). Alternatively you could just make a CPAN bundle. One text file!
> Easy! No duplication of modules in CPAN, no new hassle for you or the
> Bioperl 'core' pumpkin to ensure that the latest version of each work
> with each other and other splits.

Hmm, how would module versions be handled? Wouldn't this approach
require each module to have its own independent version number, which
could then be used for building the dependencies? Each new release of
that module would only bump that module's version number.

Bundles can specify the minimum version of a module to be installed,
such that bug fixes to individual modules can be released into CPAN and
would automatically get picked up when installing bundles etc.

I'm not quite sure how the current stable/dev releases would work. I
assume bug fixes would have to be made on a branch e.g. branch 1.6 and
released to CPAN from there. Then when the next stable release is made,
all module versions would be bumped and released to CPAN, with any
modifications to the content of the bundle made at the same time. Is it
possible to have stable and developer release bundles that are able to
specify the minimum stable and developer module versions respectively?


> 
> 
>> - If we want a full-fledged 'install everything', the CPAN Bundle 
>> system is available.  I think it's easier to use a Bundle for 4-5, 
>> even 10 groups of modules as opposed to over 900.
> 
> No, it isn't any easier. Its /equally/ easy to install a bundle of 900
> packages of 900 modules as it is to install 5 packages of 900 modules.
> 
> When not installing absolutely everything, but perhaps 'most' things,
> there's the additional benefit that it would be easier to skip a
> particular Bio::module because you didn't want to install its external
> dependencies and weren't that interested in it anyway.
> 
> 
>> - A Bundle or a build file where discrete distributions are listed 
>> (Bio::SearchIO, etc) wouldn't need to be updated every time a new 
>> module is added to a distribution.  I suppose this could be 
>> automated, but why have the additional headache?
> 
> Yes, it would be automated, and no, it wouldn't at all be any kind of
> additional headache. I'm proposing a fully-automated system that the
> pumpkin wouldn't even have to think about it. Much /less/ of a headache
> than dealing with splits. Orders of magnitude easier to deal with.
> 
> 
>> - A chance to cut out some cruft.  We all know that particular areas 
>> need work or a complete overhaul (Restriction, Structure, maybe a few 
>> others).  Smaller, concentrated sets of modules I believe would be 
>> easier to maintain, and those that don't get use will eventually fall 
>> out of favor and may be lost or replaced from the more maintained 
>> group of modules.  Survival of the fittest.
> 
> And the smallest, most concentrated set of modules is the individual
> module.
> 
> 
>> - We already have had practice; bioperl-db, bioperl-run, bioperl-
>> network, and others.  Those that have been routinely maintained and 
>> enjoy wide use (db, run, network) have survived; others not so much 
>> (corba-related stuff, microarray, ext, etc., though the code is still 
>> available if someone else wants to take it up and revive it!).
> 
> The reason some of these existing splits (micoarray, ext) have fallen by
> the way-side? /Because/ they're splits. If they had been part of
> bioperl-live all along, they'd have been kept in a working, compatible
> state and would have been released along with everything else in 1.5.2
> 
> 
>> Disadvantages of a defined split:
>>
>> - The initial headache of identifying which groups go where, 
>> coordinating with those who rely on bioperl (GMOD, etc) on how this 
>> will be set up, so on...
> 
> No need to worry about this with individual modules.
> 
> 
>> - Separate groups of modules require testing together to ensure 
>> functionality is consistent and maintained (something I think you 
>> pointed out previously).
> 
> No need to worry.

Maybe we need to worry about how the tests are run when installing
individual modules etc?

> 
> 
>> - I think an increased possibility of branching is possible.
>>
>> - Extra headaches for devs, who have to keep track of the various 
>> critical distributions and make sure they work well together.
> 
> No headaches.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg2/uczuW2jkwy2gRAlR4AJ44kHIXWWapNVGOIrkFBJdP9rn3vwCdErhT
VkymyXNshguE44/RilEXWDA=
=O5ex
-----END PGP SIGNATURE-----


From n.haigh at sheffield.ac.uk  Thu Jun 28 08:27:54 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 09:27:54 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683624F.6020402@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
Message-ID: <4683710A.9010808@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> Chris Fields wrote:
>> On Jun 27, 2007, at 5:43 PM, Sendu Bala wrote:
>>> What advantage is there of these defined splits instead of 
>>> individual modules? As I see it you lose some of the potential 
>>> benefits of breaking Bioperl up completely, whilst also suffering 
>>> the maintenance problems I outlined in my objection to Steve's post.
>>>
>>> Being able to work on all Bioperl from a single cvs (ne svn) check 
>>> out/ archive, whilst distributing it as individual modules on CPAN 
>>> seems like the best of both worlds to me. What am I missing?
>>
>> Okay, forewarned, but here's my long-winded reasoning.  The short and 
>> sweet version: I (very) respectfully don't agree with you, at least 
>> re: the idea we should commit all modules to CPAN independently. It 
>> doesn't make any sense to me, but maybe you can elaborate more?  
>> Maybe I'm misinterpreting what you mean?
> 
> The short and sweet version: my proposal has all the benefits of yours,
> but none of the disadvantages. What's not to like?
> 
> 
>> Finally, all of this should wait until later.  Much later, like after 
>> a decent release, after svn, etc kind of 'later'.  I think we can 
>> agree on that.
> 
> Hmm, not really. If it can be implemented by a change in just Build.PL
> and ModuleBuildBioperl, it's really independent of everything else.
> That's the beauty of it: the only thing that changes is how things are
> uploaded to and downloaded from CPAN. The only person that normally
> deals with that issue is the pumpkin for a release, and he only cares
> about it at release time.
> 
> In fact, if we're going to do it at all it makes sense to try it out on
> a minor release like 1.5.3. We've already got experience of doing it
> split-style from 1.5.2. (And let me tell you: splits at the code-base
> level suck.)
> 
> 
>> Individual CPAN modules:
>>
>> CPAN is not our personal versioning system; it may be if a 
>> distribution consists of only a few modules, but not when it's one of 
>> the largest distros present.  If someone wants to update an 
>> individual bioperl module for a quick bug fix they are more than 
>> welcome to download it via cvs, svn, or even using a web browser, and 
>> replace the one they have.
> 
> And where is the harm in letting them do it via CPAN as well? In fact,
> there are significant benefits:
> 
> 
>> I'm trying to reason how one could break up the individual SeqIO/
>> SearchIO/otherIO modules into single module distributions.  They are 
>> intrinsically tied together (SeqIO::genbank won't work w/o SeqIO, 
>> which relies on the various interfaces, RootIO, and on down).  How 
>> would tests be run off CPAN when the modules are distributed 
>> independently?
> 
> Bio::SeqIO::genbank would have a dependency on the latest version of
> Bio::SeqIO (etc.), and Bio::SeqIO would have its own dependencies.
> 
> So when a user wants to get the latest version of Bio::SeqIO::genbank,
> they no longer have to worry about what other modules in its dependency
> hierarchy they should also install.
> 
> Instead they just request Bio::SeqIO::genbank which itself ensures you
> have the latest version of all its dependencies before installing itself
> and running its tests.
> 
> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank
> users should have, he could just call './Build dist Bio::SeqIO::genbank'
> which would generate a new package for Bio::SeqIO::genbank suitable for
> uploading to CPAN. No more long release cycles and having to constantly
> tell people to 'use CVS' to get working Bioperl code.
> 
> 
>> Would they also be individually distributed?  What  would you use to
>> tie all the individual modules together?  How would  you explain to
>> the CPAN maintainers that you want to split bioperl  into 990
>> individual modules, all updated independently, but intend on  bundling
>> them afterwards anyway?
> 
> They would be tied together by a CPAN bundle. You don't have to
> 'explain' anything to the CPAN maintainers because you're not doing
> anything wrong. In fact, you're using it the way you're supposed to.
> 


The successor to Bundles - may prove interesting:
http://search.cpan.org/~adamk/Task-1.01/lib/Task.pm


> 
>> Splitting up core:
>>
>> As I see it, here are the advantages of a defined split as Steve and 
>> I see it (off the top of my head).  Some of this probably reiterates 
>> my previous points, as well as Steve's, so apologies in advance.
> 
> Below I answer with how it would be with my single-module approach
> compared to the defined splits.
> 
> 
>> - A lean, mean, focused set of bioperl base modules (core) w/o or 
>> with very few external deps, minimal installation issues, etc.  The 
>> very basic stuff to get up and running.
> 
> Even leaner, even more focused.
> 
> 
>> - BioPerl bundled modules (Nathan's 'cliques') with defined, focused 
>> functionality, code, and tests, which add a bit more 'sugar' to the 
>> base functionality of the core.  If you only care about parsing BLAST 
>> reports, get SearchIO, which requires core and optionally other 
>> modules (XML::SAX).  If you want additional DB functionality apart 
>> from the very basic ones in core, install DB (with its additional 
>> requirements, including core, DBI, and so on).  Same with Graphics, 
>> Tools, Tree/Phylo, etc.  We just need to define and limit the number 
>> of splits.
> 
> The same can be achieved with CPAN bundles for each kind of functional
> grouping you can think of. And since it's just a single text file that
> defines such a grouping, it's easy to change or add new ones as you feel
> like it, as opposed to the rather more permanent and substantial effort
> of creating one of your splits on the code-base level.
> 
> Also, the world doesn't have to rely on /our/ ideas of what a useful
> functional split is. If someone just wants to parse Blast results, they
> can just use CPAN to install Bio::SearchIO::blast_pull instead of having
> to install all of SearchIO.
> 
> 
>> - Easier to add additional bundled modules.  For instance, I could 
>> focus all of my RNA work into a discrete set of modules (say, bioperl-
>> rna) which I maintain, I ensure works with the latest core code, I 
>> ensure also plays well with the other children =) , and I distribute 
>> via CPAN.  Same with EUtilities, which could go into a separated DB-
>> related set or stay in core.
> 
> And if you lose interest in them? They eventually die because they no
> longer have someone looking after them by default (the pumpkin and other
> devs). Alternatively you could just make a CPAN bundle. One text file!
> Easy! No duplication of modules in CPAN, no new hassle for you or the
> Bioperl 'core' pumpkin to ensure that the latest version of each work
> with each other and other splits.
> 
> 
>> - If we want a full-fledged 'install everything', the CPAN Bundle 
>> system is available.  I think it's easier to use a Bundle for 4-5, 
>> even 10 groups of modules as opposed to over 900.
> 
> No, it isn't any easier. It's /equally/ easy to install a bundle of 900
> packages of 900 modules as it is to install 5 packages of 900 modules.
> 
> When not installing absolutely everything, but perhaps 'most' things,
> there's the additional benefit that it would be easier to skip a
> particular Bio::module because you didn't want to install its external
> dependencies and weren't that interested in it anyway.
> 
> 
>> - A Bundle or a build file where discrete distributions are listed 
>> (Bio::SearchIO, etc) wouldn't need to be updated every time a new 
>> module is added to a distribution.  I suppose this could be 
>> automated, but why have the additional headache?
> 
> Yes, it would be automated, and no, it wouldn't at all be any kind of
> additional headache. I'm proposing a fully-automated system that the
> pumpkin wouldn't even have to think about. Much /less/ of a headache
> than dealing with splits. Orders of magnitude easier to deal with.
> 
> 
>> - A chance to cut out some cruft.  We all know that particular areas 
>> need work or a complete overhaul (Restriction, Structure, maybe a few 
>> others).  Smaller, concentrated sets of modules I believe would be 
>> easier to maintain, and those that don't get use will eventually fall 
>> out of favor and may be lost or replaced from the more maintained 
>> group of modules.  Survival of the fittest.
> 
> And the smallest, most concentrated set of modules is the individual
> module.
> 
> 
>> - We already have had practice; bioperl-db, bioperl-run, bioperl-
>> network, and others.  Those that have been routinely maintained and 
>> enjoy wide use (db, run, network) have survived; others not so much 
>> (corba-related stuff, microarray, ext, etc., though the code is still 
>> available if someone else wants to take it up and revive it!).
> 
> The reason some of these existing splits (microarray, ext) have fallen by
> the way-side? /Because/ they're splits. If they had been part of
> bioperl-live all along, they'd have been kept in a working, compatible
> state and would have been released along with everything else in 1.5.2
> 
> 
>> Disadvantages of a defined split:
>>
>> - The initial headache of identifying which groups go where, 
>> coordinating with those who rely on bioperl (GMOD, etc) on how this 
>> will be set up, so on...
> 
> No need to worry about this with individual modules.
> 
> 
>> - Separate groups of modules require testing together to ensure 
>> functionality is consistent and maintained (something I think you 
>> pointed out previously).
> 
> No need to worry.
> 
> 
>> - I think an increased possibility of branching is possible.
>>
>> - Extra headaches for devs, who have to keep track of the various 
>> critical distributions and make sure they work well together.
> 
> No headaches.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg3EKczuW2jkwy2gRAriiAJ47Qz9jTshEXuaG0XMYrUTI0hHqAwCeL45r
r/BykCKbM9lqJM0khARuEms=
=NB4B
-----END PGP SIGNATURE-----


From n.haigh at sheffield.ac.uk  Thu Jun 28 08:51:19 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 09:51:19 +0100
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org>
References: <20070628074004.GD6338@kunpuu.plessy.org>
Message-ID: <46837687.7010101@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Charles Plessy wrote:
> Dear developers,
> 
> I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if
> it would make sense to call it "bioperl-live" and distribute it in
> parallel with the stable 1.4.0 version, if bioperl-live means "the
> current developer version".
> 
> If I am wrong, can somebody explain to me what bioperl-live exactly refers
> to ?
> 
> Have a nice day,
> 

bioperl-live really means the HEAD of the cvs repository so is the most
bleeding-edge code available.

Version 1.5.* is the developer release, while 1.4.* is the stable
release. However, there have been few updates to the 1.4.* release, which
means that it is more unstable than the 1.5.* dev release. I think the
consensus was to have more rapid release cycles of the stable branch in
future in order to avoid this. I'm sure there are others more qualified
to expand on or correct this if need be.

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg3aHczuW2jkwy2gRAo5pAJ95BGqrA5bLwRKNfUQi/HfBnkUJjwCg0mYB
/fHFyYkqAvcmOSxu4djPll0=
=KwVH
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Thu Jun 28 09:11:39 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 10:11:39 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <46836FEE.5030203@sheffield.ac.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk> <46836FEE.5030203@sheffield.ac.uk>
Message-ID: <46837B4B.7060705@sendu.me.uk>

Nathan S. Haigh wrote:
(Please try and snip more: don't quote whole posts just to reply to 
certain paragraphs)

> Sendu Bala wrote:
>> Chris Fields wrote:
>> When a dev makes a major bugfix to Bio::SeqIO::genbank that all genbank
>> users should have, he could just call './Build dist Bio::SeqIO::genbank'
>> which would generate a new package for Bio::SeqIO::genbank suitable for
>> uploading to CPAN. No more long release cycles and having to constantly
>> tell people to 'use CVS' to get working Bioperl code.
> 
> However, how would the test suite work out with this? e.g. when someone
> installs Bio::SeqIO::genbank they want to have the tests associated with
> Bio::SeqIO::genbank to be run. Would there be tests that would be run
> redundantly if for example someone installed Bio::SeqIO::genbank and
> Bio::SeqIO::fasta?

We would want to move to a strict test-script-per-module system. But 
that's desirable in any case, as it would greatly ease reaching our goal 
of complete test coverage, and subsequent maintenance of those tests.

The genbank test would only run tests specific to genbank parsing, and 
likewise for fasta. They would both have a dependency on Bio::SeqIO, and 
if that was also recently updated, it would get installed prior to you 
installing genbank (and therefore run its own generic SeqIO tests), but 
wouldn't get installed again (wouldn't run its tests again) when you 
install fasta afterwards.
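
A rough sketch of that install order, assuming a made-up dependency
table. The module names come from this thread, but the edges and the
install_module helper are purely illustrative; this is not how CPAN.pm
is implemented, only the "install dependencies first, test each module
once" idea:

```shell
#!/bin/sh
# Hypothetical dependency table; the edges are illustrative only.
deps_of() {
    case "$1" in
        Bio::SeqIO::genbank|Bio::SeqIO::fasta) echo "Bio::SeqIO" ;;
        Bio::SeqIO)                            echo "Bio::Root::IO" ;;
        *)                                     echo "" ;;
    esac
}

INSTALLED=""    # space-separated list of modules already installed

# Install dependencies first, then test and install the module itself.
# A module that is already installed is skipped, so its tests run once.
install_module() {
    case " $INSTALLED " in
        *" $1 "*) return 0 ;;
    esac
    for dep in $(deps_of "$1"); do
        install_module "$dep"
    done
    echo "test+install $1"
    INSTALLED="$INSTALLED $1"
}

install_module Bio::SeqIO::genbank
install_module Bio::SeqIO::fasta
```

The first call reports Bio::Root::IO, Bio::SeqIO and then genbank; the
second call reports only fasta, because Bio::SeqIO is already in place.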


On the subject of tests, I'm reminded of another benefit of the 
individual-module approach. Currently if a test fails during a CPAN 
install, nothing gets installed. Users do one of:

# refuse to install at all (strict sys-admins)
# cry and give up (newbies)
# cry and seek help (newbies who really really need Bioperl)
# force install, leaving them in some undefined state because they 
didn't understand the problems (most remaining users)
# force install, happy that the problems are ok (some Bioperl devs)

With a bundle of individual modules you would install virtually all 
Bioperl modules with no problems, and the problems with the remainder 
would be clear to everyone. No one would need to force install since the 
test results would now be meaningful: the thing you're trying to 
install really isn't going to work if the tests are failing. If you 
really needed that particular Bioperl module you could then pay 
particular attention to why it's failing (most likely some problem with 
an external dependency).


>>> Would they also be individually distributed?  What  would you use to
>>> tie all the individual modules together?
>>
>> They would be tied together by a CPAN bundle. You don't have to
>> 'explain' anything to the CPAN maintainers because you're not doing
>> anything wrong. In fact, you're using it the way you're supposed to.
> 
> Yep. Real modules are released as modules, each with their own set of
> dependencies. Then use CPAN bundles the way they were supposed to be used:
> distributing a set of CPAN modules that make a coherent set of
> functionality. You "could" also bundle in other authors' modules, e.g.
> Bio::ASN1::EntrezGene?

Any bundle featuring Bio::SeqIO::entrezgene would necessarily include 
Bio::ASN1::EntrezGene in the bundle.


> Hmm, how would module versions be handled? Wouldn't this approach
> require each module to have its own independent version number, which
> could then be used for building the dependencies? Each new release of
> that module would only bump that module's version number.

Yes, that's how it would work. No more global version number.


> Bundles can specify the minimum version of a module to be installed,
> such that bug fixes to individual modules can be released into CPAN and
> would automatically get picked up when installing bundles etc.

Yes.


> I'm not quite sure how the current stable/dev releases would work. I
> assume bug fixes would have to be made on a branch e.g. branch 1.6 and
> released to cpan from there. Then when the next stable release is made,
> all module versions would be bumped and released to CPAN, with any
> modifications to the content of the bundle to be made. Is it possible to
> have a stable and developer release bundles that are able to specify the
> minimum stable and developer modules versions respectively?

No, the distinction becomes pretty meaningless. We could still do big 
major releases, but modules wouldn't be version-bumped. The big release 
would just be an update of the bundle that specifies the latest version 
of all Bioperl modules.

Remember that bundles only specify the minimum version, not the required 
version: in this brave new world users would end up with the same 
versions of modules if they installed a 1.8 bundle compared to 1.7 bundle.
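
To make the minimum-version point concrete, here is a small sketch.
Everything in it is an assumption for illustration: the version numbers
are invented, and latest_on_cpan and resolve_bundle are hypothetical
helpers, not part of CPAN.pm. It only models the rule "install the
latest version as long as it meets the bundle's minimum":

```shell
#!/bin/sh
# Hypothetical "latest version on CPAN" lookup; the numbers are invented.
latest_on_cpan() {
    case "$1" in
        Bio::SeqIO)          echo "1.30" ;;
        Bio::SeqIO::genbank) echo "1.42" ;;
    esac
}

# Read "Module minimum_version" lines on stdin and print what would end up
# installed: always the latest version, provided it meets the minimum.
resolve_bundle() {
    while read -r module min; do
        latest=$(latest_on_cpan "$module")
        if awk -v l="$latest" -v m="$min" \
               'BEGIN { exit !(l + 0 >= m + 0) }' </dev/null; then
            echo "$module $latest"
        else
            echo "$module UNSATISFIED (needs >= $min)"
        fi
    done
}

# An older and a newer bundle that differ only in their minimums.
bundle_1_7() { printf '%s\n' "Bio::SeqIO 1.10" "Bio::SeqIO::genbank 1.20"; }
bundle_1_8() { printf '%s\n' "Bio::SeqIO 1.30" "Bio::SeqIO::genbank 1.42"; }

bundle_1_7 | resolve_bundle
bundle_1_8 | resolve_bundle
```

Both bundles resolve to the same installed versions, which is why an
old bundle cannot serve as a snapshot of an old release.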

The only way to get a true snapshot of 1.7 after it was released would 
be if we took snapshots and archived them, making them available from 
bioperl.org (or by checking out the 1.7 tag from cvs/svn).

I don't see that as a significant problem. You lose the trivial benefit 
of being able to install old snapshots from CPAN. The people who have a 
great need to install old snapshots can find their way to bioperl.org no 
problem.


From bix at sendu.me.uk  Thu Jun 28 08:50:09 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 09:50:09 +0100
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <20070628074004.GD6338@kunpuu.plessy.org>
References: <20070628074004.GD6338@kunpuu.plessy.org>
Message-ID: <46837641.8050106@sendu.me.uk>

Charles Plessy wrote:
> I am considering bringing bioperl 1.5.2 into Debian, and I am wondering if
> it would make sense to call it "bioperl-live" and distribute it in
> parallel with the stable 1.4.0 version, if bioperl-live means "the
> current developer version".
> 
> If I am wrong, can somebody explain to me what bioperl-live exactly refers
> to ?

bioperl-live is the name of the CVS repository containing what is 
currently considered the 'Core package' or core modules.
http://www.bioperl.org/wiki/Using_CVS

If you want to call it something to distinguish it from stable, call it 
'developer' vs 'stable' or '1.5.2' vs '1.4.0'.

To distinguish them both from the other packages, call them 'core' vs 
'run' etc.


From hlapp at gmx.net  Thu Jun 28 10:31:29 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 28 Jun 2007 07:31:29 -0300
Subject: [Bioperl-l] Splits again
In-Reply-To: <5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
Message-ID: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>


On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote:

> [...] Also - the main point I wanted to make - Can I suggest we  
> spend a
> little time discussing what it will take to get a stable release for
> the current code as it stands (bioperl-live and bioperl-run)?  It
> seems like we really need to do this first so that we have a stable
> release that can be followed by CVS -> SVN migration, then consider
> major changes to the repository structure and release packaging, and
> potential deprecation and incorporation of other modules.

I agree we need to discuss a path towards 1.6, but I think that  
should be kept separate from the cvs->svn migration. Otherwise one  
stalls the other (by stopping people who seem to have the energy and  
motivation right now to do one but not the other) for no really good  
reason.

> I assume there is no chance that we'd have a 1.6 candidate by BOSC
> next month?

I'm not sure that's feasible, but if someone steps up, maybe it is.

>
> Will it be productive to schedule a fair amount of time at BOSC
> discussing how to partition out the packages into separate sub-
> packages after we've done a successful release rather than trying to
> change things right now?

I agree. I also don't think that people are partitioning right now  
(other than the existing partitioning), though maybe I'm mistaken.

> [...]
> It would  probably mean moving Bio::Graphics, Bio::DB::GFF and
> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages
> so they could be released more regularly on par with Gbrowse
> schedules.

Possibly. I'm not fully sure why those modules couldn't also be  
released more often out of the "main trunk" of modules. In Java/ant,  
it'd be relatively easy to write build script filters that select the  
appropriate modules and package them on the fly. I'm not sure whether  
the build tools for Perl can do that too, though.

>   Also I think someone needs to figure out Bio::Tools::GFF
> vs Bio::FeatureIO -- what do we want to do?

I believe FeatureIO has the ontology download tied into it?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Thu Jun 28 10:47:39 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 28 Jun 2007 07:47:39 -0300
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
Message-ID: <F2858007-63BC-4E72-B5BD-5420BE39E6D2@gmx.net>


On Jun 28, 2007, at 12:29 AM, Jason Stajich wrote:

> As I tried to ask for in the past, would someone also illustrate the
> importance of why _WE_ need to switch to SVN on a wiki page on
> Bioperl so that when someone complains/asks about this in the future
> the arguments are already laid out.  I am basically fine with it, but
> I don't honestly see a compelling reason beyond what has been
> mentioned wrt better integration in IDEs.
> http://bioperl.org/wiki/Why_SVN

I guess at the end of the day svn is just the system of choice for  
new developers. I've had people tell me who started with svn that cvs  
seems a lot harder to use. The newer projects are all on svn and for  
example integrating Bio::Phylo into BioPerl should not become a question  
of the revision control system.

At the end of the day if being on svn makes it easier for new people  
to contribute it's enough of an argument for me, whether it's  
rational or not.

IMHO, there's two advantages that svn has over cvs. First,  
directories are versioned, have properties, and generally are the  
same class of citizens as files. They can be added, renamed, and  
removed from the repository. In cvs, we all know what a hassle it is  
to rename or even retire directories. Second, svn log gives you the  
commits, i.e., the set of changes that constituted one particular  
commit (and therefore version increase). In cvs that's hard or  
impossible to reconstruct.

Bottom line - I don't think many people if any will question why we  
moved from cvs to svn ...

My $0.02 ...

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hartzell at alerce.com  Thu Jun 28 00:34:37 2007
From: hartzell at alerce.com (George Hartzell)
Date: Wed, 27 Jun 2007 20:34:37 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>
References: <C2A83EA3.EC27%bosborne11@verizon.net>
	<4F928479-AEDF-44C7-88CE-A8A71628D78A@uiuc.edu>
	<9FEF16FD-47F6-4D7B-A1E6-1A92CC82570C@gmx.net>
	<1CB82DD7-A997-47E8-A6A5-2AC9B375E875@uiuc.edu>
Message-ID: <18051.541.684705.567954@almost.alerce.com>

Chris Fields writes:
 > We should port them all, yes.
 > 
 > chris
 > 
 > On Jun 27, 2007, at 4:35 PM, Hilmar Lapp wrote:
 > 
 > > Is there a reason not to port every subproject over?
 > >
 > > 	-hilmar

They're all there.  At least everything that I found in the CVS repo.
Some of the directories were empty, some had very little content, I
was just mechanical about it.

Here's what I have:

  [hartzell at dev ~]$ svn ls file://`pwd`/bioperl
  biodata/
  bioperl-cookbook/
  bioperl-corba-client/
  bioperl-corba-server/
  bioperl-das-client/
  bioperl-db/
  bioperl-ext/
  bioperl-gui/
  bioperl-live/
  bioperl-microarray/
  bioperl-network/
  bioperl-papers/
  bioperl-pedigree/
  bioperl-pipeline/
  bioperl-run/
  biosql-schema/
  html/
  task-manager/
  xml-html/

I wasn't very clear in my original request, but I was hoping that
someone out there who's familiar with the various out-of-the-way bits
and pieces could take a look at them.  I was afraid that everyone was
just checking out bioperl-live and doing 'make test'.

Someone (chris?) made a point about binary files in bioperl-run.  It'd
be great if someone in the know could check on them.

Also, to the degree that it's possible, look around at various tags
and branches and see if they're what you'd expect.

Thanks!

g.


From bix at sendu.me.uk  Thu Jun 28 12:21:37 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 13:21:37 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18049.30026.61328.134490@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
Message-ID: <4683A7D1.8070403@sendu.me.uk>

George Hartzell wrote:
> Chris Fields writes:
>  > [...]
>  > It looks like George Hartzell may be taking a crack at it, with  
>  > Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
>  > could have something testable relatively soon.  After that we'll need  
>  > to work out a few other issues, basically what's on Hilmar's list.
> 
> There's a repository on file:///home/hartzell/bioperl with all of the
> components projects in place.
> 
> If you have a dev.open-bio.org account and you're in the bioperl
> group, you're good to get at it via:
> 
>   file:///home/hartzell/bioperl

I'm confused. Presumably that only works whilst logged into 
dev.open-bio.org?


>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl

I just tried:

svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl

on Mac OS X and things seemed to go well, except for this error message 
at the end:


svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
svn: Can't move source to dest
svn: Can't move 
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
to 
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
No such file or directory

I also ended up with only:
bioperl-corba-server    bioperl-db              bioperl-live 
bioperl-network         bioperl-papers          biosql-schema


Am I doing something totally wrong here?


From hartzell at alerce.com  Thu Jun 28 12:32:36 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:32:36 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN
	and	...Re:	Perltidy]
In-Reply-To: <E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
Message-ID: <18051.43620.481558.447399@almost.alerce.com>

Jason Stajich writes:
 > [...]
 > The repository machine (dev) is a locked down machine meaning it only  
 > really runs ssh and not many servers include httpd.  We have  
 > anonymous CVS (client and through httpd browsing) running on a  
 > separate machine (code) that has the info rsynced over every 10 or 15  
 > minutes.

A great way to provide a read-only mirror of the repos. for anonymous
users is to have svnsync running out of cron on code.open-bio.org,
configured to pull from the dev.open-bio.org repository.  It might
actually work to have rsync mirror the fsfs-backed repository, but
that's scary-poking-into-the-internals.
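
A minimal sketch of what that svnsync setup might look like. The mirror
path, the function names and the 15-minute schedule are assumptions
(the schedule just mirrors the rsync cadence mentioned above); the
source URL is the demo repository from this thread:

```shell
#!/bin/sh
# One-time setup on the mirror host. The mirror directory passed in is a
# hypothetical path chosen by whoever administers that machine.
setup_mirror() {
    mirror_dir=$1
    source_url=$2
    svnadmin create "$mirror_dir"
    # svnsync keeps its bookkeeping in revision properties, so the
    # mirror repository must permit revprop changes via this hook.
    printf '#!/bin/sh\nexit 0\n' > "$mirror_dir/hooks/pre-revprop-change"
    chmod +x "$mirror_dir/hooks/pre-revprop-change"
    svnsync initialize "file://$mirror_dir" "$source_url"
}

# Print a crontab entry that pulls new revisions every 15 minutes.
mirror_cron_line() {
    printf '*/15 * * * * svnsync synchronize file://%s\n' "$1"
}

# Usage (not run here):
#   setup_mirror /srv/svn/bioperl \
#       svn+ssh://dev.open-bio.org/home/hartzell/bioperl
#   mirror_cron_line /srv/svn/bioperl
```

Nothing but svnsync should ever commit to the mirror, otherwise it will
drift from the source repository.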

g.


From hartzell at alerce.com  Thu Jun 28 12:43:37 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:43:37 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
Message-ID: <18051.44281.831316.749586@almost.alerce.com>

David Messina writes:
 > 
 > On Jun 27, 2007, at 2:49 PM, Hilmar Lapp wrote:
 > 
 > >
 > > On Jun 27, 2007, at 1:27 PM, David Messina wrote:
 > >
 > >> I would think we would want "Author Date Id Rev URL" set on
 > >> everything, no?. So either cvs2svn or your tool (whichever you think
 > >> is better), followed by
 > >>
 > >> 	svn propset svn:keywords "Author Date Id Rev URL" *
 > >
 > > Shouldn't this be done recursively?
 > 
 > 
 > Yep, good catch! Thanks, Hilmar.
 > 
 > Should be:
 > 
 > 	svn propset --recursive svn:keywords "Author Date Id Rev URL" *

That's not quite what you want either.  It'll set the keyword
property on all of the files, including things where you probably
don't want expansion to happen (e.g. images, someone said there are
binary wads in bioperl-run, etc...).

The Right Thing To Do is to grub around (grep) for '\$Id:' (and the
others) and set svn:keywords only on files that are already using
keywords.  I have a Bourne shell hack that'll do this, although it's
painful because it has to run in working directories....
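
Something along these lines would do it. This is only a sketch, not
the actual script mentioned above: the function names are made up, the
keyword list is the one proposed earlier in the thread (files converted
from CVS may also carry others, such as $Revision$), and it prints the
propset commands rather than running them:

```shell
#!/bin/sh
# List regular files under $1 that already contain a keyword anchor such
# as $Id$ or $Id: ... $, skipping Subversion's own .svn directories.
# -I (skip binary files) is supported by GNU and BSD grep but is not POSIX.
list_keyword_files() {
    find "$1" -name .svn -prune -o -type f \
        -exec grep -lIE '\$(Author|Date|Id|Rev|URL)(:[^$]*)?\$' {} +
}

# Dry run: print the propset command for each such file instead of
# running it, so the list can be reviewed first.
print_propset_commands() {
    list_keyword_files "$1" | while IFS= read -r file; do
        printf "svn propset svn:keywords 'Author Date Id Rev URL' '%s'\n" "$file"
    done
}

# Usage, from the top of a working copy (not run here):
#   print_propset_commands . > propset.sh
```

Reviewing the generated list before executing it is the point of the
dry run: anything that should not be expanded can be deleted from it.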

Once we settle on a list of keywords to use, I'll take a whack at the
demo repository.

Likewise, you probably DON'T want to use this in your config file:

	  enable-auto-props = yes
	  * = svn:keywords="Author Date Id Rev URL"

since it'll do the same thing.

The Right Thing To Do is a more tedious 

	  *.pl = svn:keywords="Author Date Id Rev URL"
	  *.pm = svn:keywords="Author Date Id Rev URL"
	  *.c = svn:keywords="Author Date Id Rev URL"

A bit of googling will give you a good starting point for the list,
and we should probably maintain a common one somewhere in the repo.

I don't think that there's a server side way of doing this, short of
running some script via a hook around commit time.

g.


From hartzell at alerce.com  Thu Jun 28 12:54:40 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 08:54:40 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN
	and	...Re:	Perltidy]
In-Reply-To: <F2858007-63BC-4E72-B5BD-5420BE39E6D2@gmx.net>
References: <13347041.1182954205903.JavaMail.myubc2@brahms.my.ubc.ca>
	<79775122-E7CA-43FA-A6E0-E7E5CB31C13F@uiuc.edu>
	<E535001B-26DB-4F15-8ED7-427F92DB3E94@gmx.net>
	<5B8B5D67-855C-4B47-B33F-68A29A1A4E2E@uiuc.edu>
	<18051.1253.87485.235496@almost.alerce.com>
	<E0DCED5F-1B99-4B49-A13D-76088A718B71@bioperl.org>
	<F2858007-63BC-4E72-B5BD-5420BE39E6D2@gmx.net>
Message-ID: <18051.44944.982207.37624@almost.alerce.com>

Hilmar Lapp writes:
 > [...]
 > IMHO, there's two advantages that svn has over cvs. First,  
 > directories are versioned, have properties, and generally are the  
 > same class of citizens as files. They can be added, renamed, and  
 > removed from the repository. In cvs, we all know what a hassle it is  
 > to rename or even retire directories. Second, svn log gives you the  
 > commits, i.e., the set of changes that constituted one particular  
 > commit (and therefore version increase). In cvs that's hard or  
 > impossible to reconstruct.

Two more:

  - svn groups changes into revisions, so that they can be considered
    together; CVS versions individual files.
  - subversion tracks renames/moves correctly,
  - subversion commits are atomic, so you never have to worry about
    whether all of your stuff made it into the repos. at the same time
    [if you've never had to un-muck this, count yourself blessed!] ,
  - svk, which allows disconnected development while still committing
    your work to a repo at natural points along the way (you can
    revert, branch, etc.... to your heart's content).

[yeah, that's 3, err, 4. Math is hard.]

g.


From cjfields at uiuc.edu  Thu Jun 28 13:07:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 08:07:24 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<5141C933-99D4-4DFF-A82F-09AE220F2AF1@bioperl.org>
	<23C3729D-6E13-445F-99AF-54B4DC0BB4E8@gmx.net>
Message-ID: <01812F01-9409-49FB-9061-330FA52177C1@uiuc.edu>


On Jun 28, 2007, at 5:31 AM, Hilmar Lapp wrote:

>
> On Jun 28, 2007, at 12:51 AM, Jason Stajich wrote:
>
>> ...It
>> seems like we really need to do this first so that we have a stable
>> release that can be followed by CVS -> SVN migration, then consider
>> major changes to the repository structure and release packaging, and
>> potential deprecation and incorporation of other modules.
>
> I agree we need to discuss a path towards 1.6, but I think that
> should be kept separate from the cvs->svn migration. Otherwise one
> stalls the other (by stopping people who seem to have the energy and
> motivation right now to do one but not the other) for no really good
> reason.

It's good to discuss it as long as it doesn't take time and energy  
away from other priorities.

>> I assume there is no chance that we'd have a 1.6 candidate by BOSC
>> next month?
>
> I'm not sure that's feasible to be happening but if someone steps up
> it maybe it is.

Maybe a 1.5.3 and (if we work hard on it) a 1.6 soon after.  Then  
maybe work on partitioning if everyone's up for it and a scheme is  
worked out.

>> Will it be productive to schedule a fair amount of time at BOSC
>> discussing how to partition out the packages into separate sub-
>> packages after we've done a successful release rather than trying to
>> change things right now?
>
> I agree. I also don't think that people are partitioning right now
> (other than the existing partitioning), though maybe I'm mistaken.

The original proposal was based on Steve's idea of splitting up  
core.  I don't think a partition is feasible at this point, at least  
until we put more thought into it  (our energy should be focused  
elsewhere), but it's well worth discussing as a future path.

At this time there are two proposals:

1)  Steve's and my 'split into discrete sections' proposal, where we  
split core into self-sustaining sections with a common core listed as  
a dependency, tying installation of all together with a Bundle or  
similar.

2)  Sendu's 'break everything up' approach where all modules are  
submitted independently to CPAN, with their own tests, dependencies,  
etc.

There are advantages and disadvantages to both approaches.  CPAN might
not go for the latter (it's pretty drastic), but I don't know for
sure.  If you want in on that discussion (in this thread)
feel free to join in!  The more the merrier!

>> [...]
>> It would  probably mean moving Bio::Graphics, Bio::DB::GFF and
>> Bio::DB::SeqFeature and gff tools for Gbrowse into separate packages
>> so they could be released more regularly on par with Gbrowse
>> schedules.
>
> Possibly. I'm not fully sure why those modules couldn't also be
> released more often out of the "main trunk" of modules. In Java/ant,
> it'd be relatively easy to write build script filters that select the
> appropriate modules and package them on the fly. I'm not sure whether
> the build tools for Perl can do that too, though.

Both approaches above would probably use Module::Build to install
other bioperl dependencies, each of which could have its own
dependency set, possibly using a Bundle to tie everything together.

>>   Also I think someone needs to figure out Bio::Tools::GFF
>> vs Bio::FeatureIO -- what do we want to do?
>
> I believe FeatureIO has the ontology download tied into it?
>
> 	-hilmar

 From recent posts here and on the gbrowse mail list by Scott and  
Lincoln, it seemed like they were moving away from using Bio::DB::GFF  
and were trying to get users to switch to Bio::DB::SeqFeature.  Maybe
we should get a more direct response?

chris


From hartzell at alerce.com  Thu Jun 28 13:16:18 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 09:16:18 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683A7D1.8070403@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
Message-ID: <18051.46242.942184.758493@almost.alerce.com>

Sendu Bala writes:
 > George Hartzell wrote:
 > > Chris Fields writes:
 > >  > [...]
 > >  > It looks like George Hartzell may be taking a crack at it, with  
 > >  > Rutger Vos, Nathan Haigh, and moi helping out where needed.  If so we  
 > >  > could have something testable relatively soon.  After that we'll need  
 > >  > to work out a few other issues, basically what's on Hilmar's list.
 > > 
 > > There's a repository on file:///home/hartzell/bioperl with all of the
 > > components projects in place.
 > > 
 > > If you have a dev.open-bio.org account and you're in the bioperl
 > > group, you're good to get at it via:
 > > 
 > >   file:///home/hartzell/bioperl
 > 
 > I'm confused. Presumably that only works whilst logged into 
 > dev.open-bio.org?

Yes, that only works if you're actually on the machine.

 > >   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
 > 
 > I just tried:
 > 
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
 > 
 > on Mac OS X and things seemed to go well, except for this error message 
 > at the end:
 > 
 > 
 > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
 > svn: Can't move source to dest
 > svn: Can't move 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
 > to 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
 > No such file or directory
 > 
 > I also ended up with only:
 > bioperl-corba-server    bioperl-db              bioperl-live 
 > bioperl-network         bioperl-papers          biosql-schema
 > 
 > 
 > Am I doing something totally wrong here?

It looks like you tried to check out the *entire* repository.  It
never occurred to me to try that.  I'll take a look at what you
reported.

g.


From bix at sendu.me.uk  Thu Jun 28 13:20:19 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 14:20:19 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18051.46242.942184.758493@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.46242.942184.758493@almost.alerce.com>
Message-ID: <4683B593.3050108@sendu.me.uk>

George Hartzell wrote:
> Sendu Bala writes:
>> I just tried:
>> 
>> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
[snip]
> It looks like you tried to check out the *entire* repository.

Yes. If you don't want everything, how does one 'browse' the repository
to find out the address of the thing you /do/ want?


> It never occured to me to try that.  I'll take a look at what you 
> reported.

Cheers.


From bix at sendu.me.uk  Thu Jun 28 13:27:29 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 14:27:29 +0100
Subject: [Bioperl-l] SVN and ...Re: Perltidy
In-Reply-To: <18049.22260.967524.353173@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.22260.967524.353173@almost.alerce.com>
Message-ID: <4683B741.5020600@sendu.me.uk>

George Hartzell wrote:
> There don't seem to be any .cvsignore files in the repository, or in
> CVSROOT/cvsignore.
> 
> Am I missing something, or don't we use them?

It would be great to have the following files svn:ignored :

In all package roots:
? Build
? MANIFEST
? MANIFEST.SKIP
? META.yml
? _build
? bioperl-*.tar.bz2
? bioperl-*.tar.gz
? bioperl-*.zip
? blib
? cover_db

In any and all directories:
? .DS_Store
? .DAV

In bioperl-live:
? t/BioDBSeqFeature.t
? t/BioDBSeqFeature_BDB.t
? t/BioDBSeqFeature_mysql.t


Can't think of anything else right now.
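
One possible way to apply the package-root part of this list once a
working copy exists, sketched under assumptions: package_root_ignores
is a made-up helper name, and the commented commands show the standard
"svn propset svn:ignore -F FILE PATH" form without running it.  The
any-directory entries (.DS_Store, .DAV) might be better handled by each
user's global-ignores client setting, since svn:ignore only applies to
the directory it is set on.

```sh
#!/bin/sh
# Sketch: emit the package-root ignore patterns, one per line, in the form
# that the svn:ignore property expects.  The helper name is hypothetical.
package_root_ignores() {
    cat <<'EOF'
Build
MANIFEST
MANIFEST.SKIP
META.yml
_build
bioperl-*.tar.bz2
bioperl-*.tar.gz
bioperl-*.zip
blib
cover_db
EOF
}

# In a package root of a working copy (not executed here):
#   tmp=$(mktemp)
#   package_root_ignores > "$tmp"
#   svn propset svn:ignore -F "$tmp" .
#   svn commit -m "Ignore build products" .
```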

Thanks for your efforts,
Sendu.


From cjfields at uiuc.edu  Thu Jun 28 13:30:43 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 08:30:43 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683A7D1.8070403@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
Message-ID: <A2B0A715-BEF7-4632-91B3-1A215FBFE3D5@uiuc.edu>


On Jun 28, 2007, at 7:21 AM, Sendu Bala wrote:

>> ...
>>   file:///home/hartzell/bioperl
>
> I'm confused. Presumably that only works whilst logged into
> dev.open-bio.org?

Yes, it's just a tester.

>>   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>
> I just tried:
>
> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl

Try

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk /mybiodir

to check out the main trunk for core.

chris


From hartzell at alerce.com  Thu Jun 28 13:57:00 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 09:57:00 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683A7D1.8070403@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
Message-ID: <18051.48684.996884.134046@almost.alerce.com>

Sendu Bala writes:
 > [...]
 > I just tried:
 > 
 > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
 > 
 > on Mac OS X and things seemed to go well, except for this error message 
 > at the end:
 > 
 > 
 > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
 > svn: Can't move source to dest
 > svn: Can't move 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
 > to 
 > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
 > No such file or directory
 > 
 > I also ended up with only:
 > bioperl-corba-server    bioperl-db              bioperl-live 
 > bioperl-network         bioperl-papers          biosql-schema
 > 
 > 
 > Am I doing something totally wrong here?

So, you probably wanted something like

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk

to pick up the head of the bioperl live tree (or
/.../bioperl-run/trunk, etc...).

I just checked out

  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/

and it ran to completion and gave me 

   (delicious)[6:50am]~/tmp>>ls bioperl | cat
   biodata
   bioperl-cookbook
   bioperl-corba-client
   bioperl-corba-server
   bioperl-das-client
   bioperl-db
   bioperl-ext
   bioperl-gui
   bioperl-live
   bioperl-microarray
   bioperl-network
   bioperl-papers
   bioperl-pedigree
   bioperl-pipeline
   bioperl-run
   biosql-schema
   html
   task-manager
   xml-html

Can another Mac OS X user out there give the Great Big Checkout a try
and see if it runs to completion?  Potential problems that come to
mind are:

  - the "Macs are case-insensitive, sort of" problem
  - you filled up your disk
  - something else.
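
The first hypothesis could be checked without doing a checkout at all.
The sketch below is only an illustration: case_collisions is a made-up
name, and whether a case collision is really what breaks the checkout
is not established here.  It reads one path per line and prints every
path that differs from another only by case.

```sh
#!/bin/sh
# Sketch: read one path per line on stdin and print, in input order, every
# path whose lower-cased form occurs more than once.  Name is hypothetical.
case_collisions() {
    awk '
        { path[NR] = $0; seen[tolower($0)]++ }
        END {
            for (i = 1; i <= NR; i++)
                if (seen[tolower(path[i])] > 1)
                    print path[i]
        }
    '
}

# Possible use against the test repository (not executed here):
#   svn ls -R svn+ssh://dev.open-bio.org/home/hartzell/bioperl | case_collisions
```

Empty output would argue against the case-insensitivity explanation;
any pair printed would be worth a closer look on a Mac.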

g.


From charles-listes+bioperl at plessy.org  Thu Jun 28 13:44:56 2007
From: charles-listes+bioperl at plessy.org (Charles Plessy)
Date: Thu, 28 Jun 2007 22:44:56 +0900
Subject: [Bioperl-l] Is bioperl-live the nickname of the CVS head ?
In-Reply-To: <46837687.7010101@sheffield.ac.uk>
References: <20070628074004.GD6338@kunpuu.plessy.org>
	<46837687.7010101@sheffield.ac.uk>
Message-ID: <20070628134456.GB14492@kunpuu.plessy.org>

On Thu, Jun 28, 2007 at 09:51:19AM +0100, Nathan S. Haigh wrote:
> 
> Version 1.5.* is the developer release, while the 1.4.* is the stable
> release. However, there have been few updates to the 1.4.* release which
> means that it is more unstable than the 1.5.* dev release. I think the
> consensus, was to have more rapid release cycles of the stable branch in
> future in order to avoid this. I'm sure there are others more qualified
> to expand/correct me on this if needs e.

Ok, thank you all for the answers. I think that I will simply upgrade
bioperl to 1.5.2 in Debian testing, and maybe rename it bioperl-core
when I package other components.

Have a nice day,

-- 
Charles Plessy
Debian-Med packaging team
Wako, Saitama, Japan


From bix at sendu.me.uk  Thu Jun 28 14:19:49 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 15:19:49 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18051.48684.996884.134046@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
Message-ID: <4683C385.3050904@sendu.me.uk>

George Hartzell wrote:
> Sendu Bala writes:
>  > [...]
>  > I just tried:
>  > 
>  > svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>  > 
>  > on Mac OS X and things seemed to go well, except for this error message 
>  > at the end:
>  > 
>  > 
>  > svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
>  > svn: Can't move source to dest
>  > svn: Can't move 
>  > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' 
>  > to 
>  > 'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': 
>  > No such file or directory
>  > 
>  > I also ended up with only:
>  > bioperl-corba-server    bioperl-db              bioperl-live 
>  > bioperl-network         bioperl-papers          biosql-schema

I tried again in the same location and it told me I had to 'svn 
cleanup', which I did. But subsequently it kept complaining about files 
already being there.


> I just checked out
> 
>   svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/
> 
> and it ran to completion
[snip]
> Can another mac os x user out there give the Great Big Checkout a try
> and see if it runs to completion.  Potential problems that come to
> mind are:
> 
>   - the "mac's are case insensitive, sort of" problem
>   - you filled up your disk
>   - something else.

Well, I didn't run out of disc space. After a rm -fr * and trying again 
it failed at exactly the same point, in the same way.

svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data

causes this repeatable problem:

[...]
A    data/phredfile.phd
svn: In directory 'data'
svn: Can't move source to dest
svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 
'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory

That is with Mac OS X svn command-line client, version 1.4.4

I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with 
a linux svn command-line client, version 1.2.3.


Cheers,
Sendu.


From dmessina at wustl.edu  Thu Jun 28 15:08:59 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 10:08:59 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18051.44281.831316.749586@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
	<18051.44281.831316.749586@almost.alerce.com>
Message-ID: <F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>

> [George]
> Likewise, you probably DON'T want to use this in your config file:
>
> 	  enable-auto-props = yes
> 	  * = svn:keywords="Author Date Id Rev URL"
>
> since it'll do the same thing.

Ah, so I've been doing it wrong all along then. :) Thanks, George!


> The Right Thing To Do is a more tedious
>
> 	  *.pl = svn:keywords="Author Date Id Rev URL"
> 	  *.pm = svn:keywords="Author Date Id Rev URL"
>   	  *.c = svn:keywords="Author Date Id Rev URL"
>
> A bit of googling will give you a good starting point for the list,
> and we should probably maintain a common one somewhere in the repo.


I've googled around and gathered the following as a possible list for  
our repo. Since I obviously don't know what I'm doing :), of course  
adjust and refine as necessary.

Dave

-------
[auto-props]
# Code formats
*.c          = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cpp        = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.h          = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.java       = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.as         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.cgi        = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/plain
*.js         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/javascript
*.php        = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-php
*.pl         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl; svn:executable
*.pm         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-perl
*.py         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-python; svn:executable
*.sh         = svn:eol-style=native; svn:keywords="Author Date Id Rev URL"; svn:mime-type=text/x-sh; svn:executable

# Image formats
*.bmp        = svn:mime-type=image/bmp
*.gif        = svn:mime-type=image/gif
*.ico        = svn:mime-type=image/ico
*.jpeg       = svn:mime-type=image/jpeg
*.jpg        = svn:mime-type=image/jpeg
*.png        = svn:mime-type=image/png
*.tif        = svn:mime-type=image/tiff
*.tiff       = svn:mime-type=image/tiff

# Data formats
*.pdf        = svn:mime-type=application/pdf
*.avi        = svn:mime-type=video/avi
*.doc        = svn:mime-type=application/msword
*.eps        = svn:mime-type=application/postscript
*.gz         = svn:mime-type=application/gzip
*.mov        = svn:mime-type=video/quicktime
*.mp3        = svn:mime-type=audio/mpeg
*.ppt        = svn:mime-type=application/vnd.ms-powerpoint
*.ps         = svn:mime-type=application/postscript
*.psd        = svn:mime-type=application/photoshop
*.rtf        = svn:mime-type=text/rtf
*.swf        = svn:mime-type=application/x-shockwave-flash
*.tgz        = svn:mime-type=application/gzip
*.wav        = svn:mime-type=audio/wav
*.xls        = svn:mime-type=application/vnd.ms-excel
*.zip        = svn:mime-type=application/zip

# Text formats
.htaccess    = svn:mime-type=text/plain
*.css        = svn:mime-type=text/css
*.dtd        = svn:mime-type=text/xml
*.html       = svn:mime-type=text/html
*.ini        = svn:mime-type=text/plain
*.sql        = svn:mime-type=text/x-sql
*.txt        = svn:mime-type=text/plain
*.xhtml      = svn:mime-type=text/xhtml+xml
*.xml        = svn:mime-type=text/xml
*.xsd        = svn:mime-type=text/xml
*.xsl        = svn:mime-type=text/xml
*.xslt       = svn:mime-type=text/xml
*.xul        = svn:mime-type=text/xul
*.yml        = svn:mime-type=text/plain
CHANGES      = svn:mime-type=text/plain
COPYING      = svn:mime-type=text/plain
INSTALL      = svn:mime-type=text/plain
Makefile*    = svn:mime-type=text/plain
README       = svn:mime-type=text/plain
TODO         = svn:mime-type=text/plain
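
A list this long is easy to get subtly wrong, so a quick sanity check
may be worth having before it goes into anyone's config.  The sketch
below is an illustration, not an official tool: check_auto_props is a
made-up name, and it only verifies that every property name on each
pattern line starts with "svn:".  It would catch a misspelling such as
a hypothetical "svn-mime-type", or a line that was wrapped in transit,
but it does not validate the property values.

```sh
#!/bin/sh
# Sketch: read auto-props lines on stdin and report property names that do
# not start with "svn:".  Exit status is 1 if anything looks wrong.
# The function name is hypothetical.
check_auto_props() {
    awk '
        /^[ \t]*#/  { next }    # comment
        /^[ \t]*\[/ { next }    # section header such as [auto-props]
        /^[ \t]*$/  { next }    # blank line
        {
            eq = index($0, "=")
            if (eq == 0) {
                printf "line %d: no \"=\" found\n", NR
                bad = 1
                next
            }
            n = split(substr($0, eq + 1), props, ";")
            for (i = 1; i <= n; i++) {
                name = props[i]
                sub(/=.*/, "", name)          # drop the value, keep the name
                gsub(/^[ \t]+|[ \t]+$/, "", name)
                if (name != "" && name !~ /^svn:/) {
                    printf "line %d: suspicious property \"%s\"\n", NR, name
                    bad = 1
                }
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

Assuming the list is saved to a file, "check_auto_props < auto-props.txt"
prints one line per problem and stays silent when nothing is flagged.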


From dmessina at wustl.edu  Thu Jun 28 15:11:23 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 10:11:23 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.46242.942184.758493@almost.alerce.com>
	<4683B593.3050108@sendu.me.uk>
Message-ID: <F55A8B8A-B7B8-4354-85B7-E459B3679E41@wustl.edu>

> [Sendu]
>
> Yes. If you don't want everything, how does one 'browse' the  
> repository
> to find out the address of the thing you /do/ want?

svn ls file:///home/hartzell/bioperl

(while logged in to dev.open-bio.org), or

svn ls svn+ssh://dev.open-bio.org/home/hartzell/bioperl


From n.haigh at sheffield.ac.uk  Thu Jun 28 15:13:58 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 16:13:58 +0100
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683B593.3050108@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>	<18051.46242.942184.758493@almost.alerce.com>
	<4683B593.3050108@sendu.me.uk>
Message-ID: <4683D036.5060109@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> George Hartzell wrote:
>> Sendu Bala writes:
>>> I just tried:
>>>
>>> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl
> [snip]
>> It looks like you tried to check out the *entire* repository.
> 
> Yes. If you don't want everything, how does one 'browse' the repository
> to find out the address of the thing you /do/ want?
> 

You could try:
svn ls URL

or

svn ls -R URL

to list what is under a repository URL (-R recurses into subdirectories).

> 
>> It never occured to me to try that.  I'll take a look at what you 
>> reported.
> 
> Cheers.
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg9A2czuW2jkwy2gRAgirAKCnMAg6a7W7RM22O2rOi4vD5w3HPwCePsku
akLhIszoQbRc/aVX3d/Jp7w=
=mlHY
-----END PGP SIGNATURE-----


From cjfields at uiuc.edu  Thu Jun 28 15:20:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 10:20:46 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683C385.3050904@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
Message-ID: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu>

I can replicate the same problem (Mac OS X) with a full checkout:

svn: In directory 'bioperl/bioperl-live/tags/release-0-9-2/t/data'
svn: Can't move source to dest
svn: Can't move
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base'
to
'bioperl/bioperl-live/tags/release-0-9-2/t/data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base':
No such file or directory

What local (mac) svn version are you using?  I'm running off macports:

svn --version
svn, version 1.4.4 (r25188)
    compiled Jun 16 2007, 23:40:53

chris

On Jun 28, 2007, at 9:19 AM, Sendu Bala wrote:
...

> I tried again in the same location and it told me I had to 'svn
> cleanup', which I did. But subsequently it kept complaining about  
> files
> already being there.
>>
> [snip]
>> Can another mac os x user out there give the Great Big Checkout a try
>> and see if it runs to completion.  Potential problems that come to
>> mind are:
>>
>>   - the "mac's are case insensitive, sort of" problem
>>   - you filled up your disk
>>   - something else.
>
> Well, I didn't run out of disc space. After a rm -fr * and trying  
> again
> it failed at exactly the same point, in the same way.
>
> svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
>
> causes this repeatable problem:
>
> [...]
> A    data/phredfile.phd
> svn: In directory 'data'
> svn: Can't move source to dest
> svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to
> 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or  
> directory
>
> That is with Mac OS X svn command-line client, version 1.4.4
>
> I can get bioperl-live/tags/release-0-9-2/t/data to check out fine  
> with
> a linux svn command-line client, version 1.2.3.
>
>
> Cheers,
> Sendu.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Jun 28 15:37:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 10:37:27 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683624F.6020402@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
Message-ID: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>

On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> ...
>
> The short and sweet version: my proposal has all the benefits of  
> yours, but none of the disadvantages. What's not to like?

The short and sweet version: I'm more convinced after you laid out  
your argument in detail, which would have saved me some typing last  
night, BTW, thanks! ; >

The other core devs need to chip in and we need to openly (candidly)  
discuss it some more (I've added Hilmar to this).  There is also a  
tenable solution that allows both aspects ('cliques' and single mode)  
which might make everybody happy.

Let's say we only want to install Bio::SeqIO::genbank.  The
Bio::SeqIO::genbank Build.PL would only install what was needed (as
you indicated), only Bio::SeqIO::genbank-related tests would run
(along with dependency tests, if available), and life would go on.
However, what if we wanted to install everything in
SeqIO/DB/AlignIO/etc?

We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO
modules installed or a select few (maybe a quick 'install all (y/n)?'
followed by a list, which installs them one at a time along with
dependencies).  Alternatively, they could be named as arguments passed
to SeqIO's Build.PL, something like
'perl Build.PL -install-plugins genbank embl swiss' or
'perl Build.PL -install-plugins all'.  If a specific module
(Bio::SeqIO::genbank) is installed directly, then additional passed
args could let the installation Q&A of the modules further down the
dependency tree be bypassed.

This would, in effect, be a bioperl-specific mini-CPAN within CPAN.   
Nice!

Now, this doesn't address several related issues, such as how we
handle versioning of the independent modules (should be in a
controlled manner), what we do about deprecated modules which linger
about on CPAN, how we deal with PPMs/RPMs/packaging, and so on.  All
of these can be addressed in reasonable ways, I believe.
Also, I think we should still think about doing regular full-scale
'stable' (1.#) releases (sort of our stamp of approval for that batch
of modules at that point in time, with a reasonable 'sell-by' date).

Again, it should be seriously discussed among the core devs and the  
bioperl community at large prior to any serious work on it, and it  
would be quite a large-scale project, but possibly worth it.  It can  
only go forward if there is enough momentum behind it.

>> Finally, all of this should wait until later.  Much later, like  
>> after  a decent release, after svn, etc kind of 'later'.  I think  
>> we can  agree on that.
>
> Hmm, not really. If it can be implemented by a change in just  
> Build.PL and ModuleBuildBioperl, it's really independent of
> everything else. That's the beauty of it: the only thing that  
> changes is how things are uploaded to and downloaded from CPAN. The  
> only person that normally deals with that issue is the pumpkin for  
> a release, and he only cares about it at release time.
>
> In fact, if we're going to do it at all it makes sense to try it  
> out on a minor release like 1.5.3. We've already got experience of  
> doing it split-style from 1.5.2. (And let me tell you: splits at  
> the code-base level suck.)

BOSC is coming up, and I would like to focus on getting svn migration  
taken care of ASAP (which is sounding more and more like we plan on  
moving all open-bio over, unless I misread Jason's post?) and  
stomping of bugs (my next priority after EUtilities).  Maybe in the  
interim we should try focusing on bug squashing, get out a quick  
standard dev release (1.5.3) before BOSC, and then a few of us could  
all communicate there via email/text/IM/phone off-list?  Maybe post  
updates via the bioperl blog and list?

> And where is the harm in letting them do it via CPAN as well? In  
> fact, there are significant benefits:
...

I'm already pretty convinced...

> The same can be achieved with CPAN bundles for each kind of  
> functional grouping you can think of. And since it's just a single
> text file that defines such a grouping, it's easy to change or add
> new ones as you feel like it, as opposed to the rather more  
> permanent and substantial effort of creating one of your splits on  
> the code-base level.

... or it could be run right in Module::Build for specific parent  
classes (as I mention above).  Bundling could be instituted for  
something like a standard GBrowse release (Bundle::BioPerl::GBrowse)  
where the functionality might be more spread out (Bio::DB*,  
Bio::Graphics, Bio::FeatureIO, etc).  For a full-scale old-style core  
install, another Bundle (Bundle::BioPerl::Standard).

...

> Yes, it would be automated, and no, it wouldn't at all be any kind  
> of additional headache. I'm proposing a fully-automated system that  
> the pumpkin wouldn't even have to think about. Much /less/ of a
> headache than dealing with splits. Orders of magnitude easier to  
> deal with.

The 'headache' would be the initial setup (splitting test, individual  
Build.PL, etc), but this could be done stepwise or section-wise, I  
suppose.
...

> And the smallest, most concentrated set of modules is the  
> individual module.

Well, only if it runs correctly (i.e. has the entire dep. tree  
installed).  But the 'follow' tests would handle that.

> The reason some of these existing splits (microarray, ext) have
> fallen by the way-side? /Because/ they're splits. If they had been  
> part of bioperl-live all along, they'd have been kept in a working,  
> compatible state and would have been released along with everything  
> else in 1.5.2

microarray fell out of favor for other reasons (much faster ways to  
do the same thing via R), though I think it still could be salvaged  
if someone wanted to take it up.

the other bioperl distros (network, db, run, etc) would also  
necessitate following the same path as core, but I guess they could  
be bundled as well.

> ...
> No headaches.

I already have one, sorry!

chris


From n.haigh at sheffield.ac.uk  Thu Jun 28 15:53:52 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 28 Jun 2007 16:53:52 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
Message-ID: <4683D990.8090909@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Fields wrote:
> On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote:
> 
>> Chris Fields wrote:
>>> ...
>>
>> The short and sweet version: my proposal has all the benefits of
>> yours, but none of the disadvantages. What's not to like?
> 
> The short and sweet version: I'm more convinced after you laid out your
> argument in detail, which would have saved me some typing last night,
> BTW, thanks! ; >
> 
> The other core devs need to chip in and we need to openly (candidly)
> discuss it some more (I've added Hilmar to this).  There is also a
> tenable solution that allows both aspects ('cliques' and single mode)
> which might make everybody happy.

Couldn't "cliques" simply be satisfied with CPAN Bundles?

> 
> Let's say we only want to install Bio::SeqIO::genbank.  The
> Bio::SeqIO::genbank Build.PL would only install what was needed (as you
> indicated), only Bio::SeqIO::genbank-related tests would run (along with
> dependency test, if available), and life would go on.  However, what if
> we wanted to install everything in SeqIO/DB/AlignIO/etc?

I think this might be where Bundles come in for installing these
"cliques" of related modules?

- -- snip --

> 
>> Yes, it would be automated, and no, it wouldn't at all be any kind of
>> additional headache. I'm proposing a fully-automated system that the
>> pumpkin wouldn't even have to think about it. Much /less/ of a
>> headache than dealing with splits. Orders of magnitude easier to deal
>> with.
> 
> The 'headache' would be the initial setup (splitting test, individual
> Build.PL, etc), but this could be done stepwise or section-wise, I suppose.

Yes, I think this is where most of the labour will be. However, setting
the test suite up like this would be beneficial with or without
publishing modules individually.

- -- snip --
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGg9mQczuW2jkwy2gRAlfBAKCFP7XUvWXsjycSv0MVGN3Ru40D/wCcDiDg
UKE/Q/wA3gu1Gb7S6rarCQw=
=WQdY
-----END PGP SIGNATURE-----


From bix at sendu.me.uk  Thu Jun 28 16:03:54 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 17:03:54 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
Message-ID: <4683DBEA.90005@sendu.me.uk>

Chris Fields wrote:
> On Jun 28, 2007, at 2:25 AM, Sendu Bala wrote:
> Let's say we only want to install Bio::SeqIO::genbank.  The 
> Bio::SeqIO::genbank Build.PL would only install what was needed (as you 
> indicated), only Bio::SeqIO::genbank-related tests would run (along with 
> dependency test, if available), and life would go on.  However, what if 
> we wanted to install everything in SeqIO/DB/AlignIO/etc?
> 
> We could have the Bio::SeqIO Build.PL ask whether you want all SeqIO 
> modules installed or a select few (maybe a quick 'install all (y/n)?' 
> followed by a list, which installs them one at a time along with 
> dependencies), or have the option to specifically denote them as passed 
> args to SeqIO's Build.PL, something like 'perl Build.PL -install-plugins 
> genbank embl swiss', 'perl Build.PL -install-plugins all', etc.  If a 
> specific module (Bio::SeqIO::genbank) is installed directly then maybe 
> the installation q&a's of followed modules could be bypassed when 
> installing down the dependency tree with additional passed args.

I'd probably stay away from something like this. My primary reason
being that, off the top of my head, I don't see how to get it to work. If
you're installing Bio::SeqIO for the first time via CPAN you can't ask 
it to install Bio::SeqIO::genbank et al. at the same time because 
Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some circularity.
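
To make the circularity concrete, here is a rough sketch using the
standard tsort utility. The edge list is made up for illustration (it is
not taken from any real Build.PL), and has_loop is just a helper name
invented here; it relies on tsort complaining on stderr when its input
contains a cycle.

```shell
#!/bin/sh
# Each argument is one edge "A B", meaning A must be installed before B.
# has_loop is a hypothetical helper: it succeeds (exit 0) if the edges
# form a cycle, and fails (exit 1) otherwise.
has_loop() {
    # Keep only tsort's stderr: it warns there when the input has a loop.
    err=$(printf '%s\n' "$@" | tsort 2>&1 >/dev/null) || true
    [ -n "$err" ]
}

# genbank requires SeqIO, so SeqIO simply goes first: no problem.
if has_loop "Bio::SeqIO Bio::SeqIO::genbank"; then
    echo "loop"
else
    echo "no loop"
fi

# If SeqIO's Build.PL also insisted on genbank, we'd add the reverse edge.
if has_loop "Bio::SeqIO Bio::SeqIO::genbank" \
            "Bio::SeqIO::genbank Bio::SeqIO"; then
    echo "loop"
else
    echo "no loop"
fi
```

The second call is the situation described above: neither module can be
installed "first", so an installer would have nowhere to start.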

I also wouldn't want these things to be complicated. There should be 
little in the way of questions to ask during install. Each module's 
Build.PL should be ultra-simple with no advanced logic at all. It should 
just specify things that are absolute requirements. This simplicity 
helps avoid some of the problems we face by distributing the monolithic 
Bioperl.

No, much better for us and for users to provide a Bundle::Bio-SeqIO.
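
For anyone who hasn't looked inside one: a CPAN bundle is just a small
module whose POD has a CONTENTS section listing other modules, one per
line. A sketch of what generating one could look like follows. The
bundle name Bundle::BioPerl::SeqIO, the version number, the module list
and the write_bundle helper are all placeholders made up here, not an
agreed layout.

```shell
#!/bin/sh
# write_bundle NAME MODULE... : print a minimal CPAN bundle file to stdout.
# Hypothetical helper; the bundle name and contents are placeholders.
write_bundle() {
    name=$1; shift
    cat <<EOF
package Bundle::$name;
\$VERSION = '0.01';
1;
__END__

=head1 NAME

Bundle::$name - install a group of related Bioperl modules

=head1 CONTENTS

EOF
    # One module name per line; blank lines between entries keep the
    # POD tidy.
    for m in "$@"; do
        printf '%s\n\n' "$m"
    done
    printf '=cut\n'
}

write_bundle BioPerl::SeqIO Bio::SeqIO Bio::SeqIO::genbank Bio::SeqIO::embl
```

Once such a file is on CPAN, a user would install the whole group with
something like: perl -MCPAN -e 'install Bundle::BioPerl::SeqIO'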


> Now, this doesn't address several related issues, such as how we handle 
> versioning of the independent modules (should be in a controlled 
> manner),

When a module is changed, it gets a version bump. Nothing complicated 
needs to be done. Transparent and obvious, behaving like all other CPAN 
modules would be my choice.


> what we do about deprecated modules which linger about on CPAN,

Deleting them from CPAN seems appropriate.


> how we deal with PPMs/RPMs/packaging, and so on.  All have possible 
> reasonable ways they can be addressed, I believe.  Also, I think we 
> should still think about doing regular full-scale 'stable' (1.#) 
> releases (sort of our stamp of approval for that batch of modules at 
> that point in time, with a reasonable 'sell-by' date).

Yes, we can still choose to take a snapshot and announce it to the 
world, but at the module-level nothing special would happen. There would 
just be an updated Bundle::Bioperl-everything (or whatever).


> Again, it should be seriously discussed among the core devs and the 
> bioperl community at large prior to any serious work on it, and it would 
> be quite a large-scale project, but possibly worth it.  It can only go 
> forward if there is enough momentum behind it.

The requirement for this approach is per-module test scripts, which, as
I identified already, are very desirable anyway so we can hit 100% test
coverage.

So, regardless of anything else can we all agree that per-module test 
scripts are a good idea and should be worked on? If so, I'll look into 
the feasibility and figure out how much work will be involved.


From cjfields at uiuc.edu  Thu Jun 28 17:17:50 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 12:17:50 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <4683DBEA.90005@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
Message-ID: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>


On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote:

> ...
> I'd probably stay away from something like this. My primary reason  
> being, off-the-top-of-my-head I don't see how to get it to work. If  
> you're installing Bio::SeqIO for the first time via CPAN you can't  
> ask it to install Bio::SeqIO::genbank et al. at the same time  
> because Bio::SeqIO::genbank depends on Bio::SeqIO, giving us some  
> circularity.

True...

> I also wouldn't want these things to be complicated. There should  
> be little in the way of questions to ask during install. Each  
> module's Build.PL should be ultra-simple with no advanced logic at  
> all. It should just specify things that are absolute requirements.  
> This simplicity helps avoid some of the problems we face by  
> distributing the monolithic Bioperl.
>
> No, much better for us and for users to provide a Bundle::Bio-SeqIO.

I just don't want too much Bundle-itis as it'll get confusing for
newbies (i.e. Vista-itis, or AdobeCS-itis).  It should be limited to
functional grouping (SeqIO, AlignIO, DB, etc), 'install everything',  
or distribution-specific (GBrowse).

I also think (though Hilmar may veto this) that we should work on  
integrating bioperl-db, network, etc. into this if it goes forward.

Here's a question: how do we plan on handling uploading bioperl  
updates to CPAN via PAUSE?  Do we want to run every single module  
through one pumpkin?  Or do we want to have a core dev group PAUSE  
account?  I can see, for instance, removing everything EUtilities- 
related and submitting it independently using my own PAUSE account,  
but it would be nice to have it under an umbrella 'bioperl-devs'  
account instead.

>> Now, this doesn't address several related issues, such as how we  
>> handle versioning of the independent modules (should be in a  
>> controlled manner),
>
> When a module is changed, it gets a version bump. Nothing  
> complicated needs to be done. Transparent and obvious, behaving  
> like all other CPAN modules would be my choice.
>
>> what we do about deprecated modules which linger about on CPAN,
>
> Deleting them from CPAN seems appropriate.

I know you can do that via PAUSE, but I think it lingers about on  
search.cpan.org (unless that's been fixed).  This would prob. have to  
be used sparingly.

>> how we deal with PPMs/RPMs/packaging, and so on.  All have  
>> possible reasonable ways they can be addressed, I believe.  Also,  
>> I think we should still think about doing regular full-scale  
>> 'stable' (1.#) releases (sort of our stamp of approval for that  
>> batch of modules at that point in time, with a reasonable 'sell- 
>> by' date).
>
> Yes, we can still choose to take a snapshot and announce it to the  
> world, but at the module-level nothing special would happen. There  
> would just be an updated Bundle::Bioperl-everything (or whatever).

Right, it would basically be a stamp of certification.

>> Again, it should be seriously discussed among the core devs and  
>> the bioperl community at large prior to any serious work on it,  
>> and it would be quite a large-scale project, but possibly worth  
>> it.  It can only go forward if there is enough momentum behind it.
>
> The requirement for this approach is per-module test scripts. Which  
> as I identified already, is very desirable anyway so we can hit  
> 100% test coverage.
>
> So, regardless of anything else can we all agree that per-module  
> test scripts are a good idea and should be worked on? If so, I'll  
> look into the feasibility and figure out how much work will be  
> involved.

I think so, but the feasibility issue is critical.  Do we want cvs/ 
svn to be divided up into 900 subdirectories (one for each module),  
or do we want to have a similar directory structure as we have now,  
but with each module in its own directory?  Or leave everything as
is and generate Build.PL on-the-fly (prob. least feasible)?

This is where it might be wise to do it piece-meal at first (maybe  
starting with something somewhat segregated like Bio::Tools), then  
progress from there.

chris


From hartzell at alerce.com  Thu Jun 28 17:38:48 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 13:38:48 -0400
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
	<18051.44281.831316.749586@almost.alerce.com>
	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
Message-ID: <18051.61992.627473.323346@almost.alerce.com>

David Messina writes:
 > > [George]
 > > Likewise, you probably DON'T want to use this in your config file:
 > >
 > > 	  enable-auto-props = yes
 > > 	  * = svn:keywords="Author Date Id Rev URL"
 > >
 > > since it'll do the same thing.
 > 
 > Ah, so I've been doing it wrong all along then. :) Thanks, George!

It's not *wrong* if it's never done anything to you that you've
regretted.  The right answer depends on your situation....

 > [...]
 > I've googled around and gathered the following as a possible list for  
 > our repo. Since I obviously don't know what I'm doing :), of course  
 > adjust and refine as necessary.
 > 

That's a great starting point.  Do you have write access to the wiki?
Could you link it off of the instructions for using svn?

g.


From hartzell at alerce.com  Thu Jun 28 18:06:50 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 14:06:50 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <4683C385.3050904@sendu.me.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
Message-ID: <18051.63674.685297.426813@almost.alerce.com>

Sendu Bala writes:
 > [...]
 > I tried again in the same location and it told me I had to 'svn 
 > cleanup', which I did. But subsequently it kept complaining about files 
 > already being there.

You need to do the cleanup because svn exited gracelessly and you
need to help it get back on its feet.  The cleanup doesn't remove
the stuff that you did get checked out, so it's still there getting in
the way of your new checkout.

 > [...]
 > svn co 
 > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
 > 
 > causes this repeatable problem:
 > 
 > [...]
 > A    data/phredfile.phd
 > svn: In directory 'data'
 > svn: Can't move source to dest
 > svn: Can't move 'data/.svn/tmp/prop-base/HUMBETGLOA.FASTA.svn-base' to 
 > 'data/.svn/prop-base/HUMBETGLOA.FASTA.svn-base': No such file or directory
 > 
 > That is with Mac OS X svn command-line client, version 1.4.4
 > 
 > I can get bioperl-live/tags/release-0-9-2/t/data to check out fine with 
 > a linux svn command-line client, version 1.2.3.

I'm not 100% sure what's going on here, but I'm inclined to say "get a
real computer" (and yes, I'm typing this on a mac...).  I have a mac
pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony
the tiger used to say)....

I think that we're having trouble with case sensitivity.  My only
evidence is that I can see where there have been both HUMBETGLOA.FASTA
and HUMBETGLOA.fasta in the tree at various times.  I can't figure out
anything else that's weird about that file.  On the other hand, I
can't see how this would cause the error you're seeing though.
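
One quick way to test the theory: lower-case every file name and look
for duplicates. The names below are the two spellings mentioned above
plus one other file from the log; case_collisions is a made-up helper
name. On a real repository you'd feed it a file listing (e.g. from
svn ls -R) instead of a hand-typed list.

```shell
#!/bin/sh
# Print every name that occurs more than once when case is ignored.
# Reads one name per line on stdin.
case_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

printf '%s\n' HUMBETGLOA.FASTA HUMBETGLOA.fasta phredfile.phd | case_collisions
# prints: humbetgloa.fasta
```

Any name this prints is one that a case-insensitive filesystem cannot
hold twice in the same directory.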

The experiment would be to grab a usb or firewire disk (or even a
memory stick), partition/format it as case sensitive (or even *unix*)
and try to do

 svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data

into it.  If it works, voila.  If not, I'll keep making stuff up, err,
thinking about it.

g.


From dmessina at wustl.edu  Thu Jun 28 18:15:32 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 13:15:32 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<6830CBB0-70A1-4F84-9C52-F61AEDC4B83B@uiuc.edu>
Message-ID: <459D9BC0-4FBA-4560-80A8-E6243DE9D9CC@wustl.edu>

Same svn error here on the full checkout.


> What local (mac) svn version are you using?  I'm running off macports:
>
> svn --version
> svn, version 1.4.4 (r25188)
>     compiled Jun 16 2007, 23:40:53

I have svn 1.4.3.

% svn --version
svn, version 1.4.3 (r23084)
    compiled Apr  1 2007, 02:47:14

Copyright (C) 2000-2006 CollabNet.
Subversion is open source software, see http://subversion.tigris.org/
This product includes software developed by CollabNet (http://www.Collab.Net/).

The following repository access (RA) modules are available:

* ra_dav : Module for accessing a repository via WebDAV (DeltaV) protocol.
   - handles 'http' scheme
* ra_svn : Module for accessing a repository using the svn network protocol.
   - handles 'svn' scheme
* ra_local : Module for accessing a repository on local disk.
   - handles 'file' scheme


From cjfields at uiuc.edu  Thu Jun 28 18:54:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 13:54:15 -0500
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18051.63674.685297.426813@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
Message-ID: <D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>


On Jun 28, 2007, at 1:06 PM, George Hartzell wrote:

> ...
> I'm not 100% sure what's going on here, but I'm inclined to say "get a
> real computer" (and yes, I'm typing this on a mac...).  I have a mac
> pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony
> the tiger used to say)....

Ouch!  Though it could be worse (**coughwindowscough**).

> I think that we're having trouble with case sensitivity.  My only
> evidence is that I can see where there have been both HUMBETGLOA.FASTA
> and HUMBETGLOA.fasta in the tree at various times.  I can't figure out
> anything else that's weird about that file.  On the other hand, I
> can't see how this would cause the error you're seeing though.

Odd that other branches (including the main trunk) work but that one  
doesn't.

> The experiment would be to grab a usb or firewire disk (or even a
> memory stick), partition/format it as case sensitive (or even *unix*)
> and try to do
>
>  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
>
> into it.  If it works, voila.  If not, I'll keep making stuff up, err,
> thinking about it.
>
> g.

I'll have to figure out why I can't get ssh keys to work locally to  
test it out more (I have a usb drive to test with); just don't have  
time at the moment.

chris


From dmessina at wustl.edu  Thu Jun 28 18:47:04 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 13:47:04 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <18051.61992.627473.323346@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>
	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>
	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>
	<18051.44281.831316.749586@almost.alerce.com>
	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
Message-ID: <0027C4E0-26B1-41F3-8FD8-EAB5465CA80E@wustl.edu>

> That's a great starting point.  Do you have write access to the wiki?
> Could you link it off of the instructions for using svn?

Done.

http://www.bioperl.org/wiki/Svn_auto-props

linked from:
http://www.bioperl.org/wiki/Using_Subversion (bottom of page)


From bix at sendu.me.uk  Thu Jun 28 19:19:35 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 20:19:35 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
Message-ID: <468409C7.7020102@sendu.me.uk>

Chris Fields wrote:
> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote:
> Here's a question: how do we plan on handling uploading bioperl  
> updates to CPAN via PAUSE?  Do we want to run every single module  
> through one pumpkin?  Or do we want to have a core dev group PAUSE  
> account?  I can see, for instance, removing everything EUtilities- 
> related and submitting it independently using my own PAUSE account,  
> but it would be nice to have it under an umbrella 'bioperl-devs'  
> account instead.

All Bioperl modules (except the Bundle!) are owned by BIOPERLML on 
PAUSE. It's a little awkward since PAUSE is uploader-centric, but see my
notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release

And certainly, everything that wants to consider itself part of Bioperl
(and gain the benefit of lots of devs looking after it) should have
BIOPERLML as the primary owner.


> I think so, but the feasibility issue is critical.  Do we want cvs/ 
> svn to be divided up into 900 subdirectories (one for each module),  
> or do we want to have a similar directory structure as we have now,  
> but with each module in its own directory?  Or leave everything as
> is and generate Build.PL on-the-fly (prob. least feasible)?

Very definitely the latter. The key benefit of my approach is that the 
organisation stays as is and that a snapshot of the repository remains a 
single directory of modules in Bio so that people don't have to 
'install' Bioperl, they can still just uncompress the archive (or check 
out the package from svn) and point their PERL5LIB to the root dir of 
the package.

For that reason I very much like the idea of folding the current 
split-out packages (run, network etc.) back into the core package so 
everything is in one place. Folding them back in should obviously wait
until everything is in place and working with core already.


My proposal obviously wasn't very clear. As far as all other devs are 
concerned, nothing changes at all (except for lots of new improved test 
scripts). The pumpkin will, however, be able to say:

./Build dist

Right now that generates the distribution archives (in different 
compression formats) - one big archive containing everything.
My proposal is simply that instead it generates lots of archives, one 
archive per module. It will also generate some Bundles and whatever else 
might be needed.
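
Roughly, in shell terms. The real thing would live in the build code and
would also have to carry each module's tests, a Build.PL and a MANIFEST;
this only shows the one-archive-per-module idea. dist_per_module and the
Bio-SeqIO-genbank.tar.gz naming are placeholders, not a decided scheme.

```shell
#!/bin/sh
# dist_per_module SRC OUT: write one tarball per .pm file found under
# SRC/Bio. Hypothetical helper; the archive naming is an assumption.
dist_per_module() {
    src=$1
    mkdir -p "$2"
    out=$(cd "$2" && pwd)          # absolute path, safe to use with tar -C
    ( cd "$src" && find Bio -name '*.pm' ) | while read -r pm; do
        # Bio/SeqIO/genbank.pm -> Bio-SeqIO-genbank
        name=$(printf '%s' "${pm%.pm}" | tr '/' '-')
        tar -czf "$out/$name.tar.gz" -C "$src" "$pm"
    done
}

# Demo on a throwaway tree with two empty stand-in modules.
tmp=$(mktemp -d)
mkdir -p "$tmp/src/Bio/SeqIO"
touch "$tmp/src/Bio/SeqIO.pm" "$tmp/src/Bio/SeqIO/genbank.pm"
dist_per_module "$tmp/src" "$tmp/dist"
ls "$tmp/dist"
rm -rf "$tmp"
```

The source tree itself is never rearranged; only the output of the dist
step changes.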

I don't envisage any major difficulties in achieving this. The 
'feasibility' issue I was going to look into was strictly regarding 
doing all the new test scripts.


From hartzell at alerce.com  Thu Jun 28 19:43:38 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 15:43:38 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
Message-ID: <18052.3946.224905.415905@almost.alerce.com>

Chris Fields writes:
 > 
 > On Jun 28, 2007, at 1:06 PM, George Hartzell wrote:
 > 
 > > ...
 > > I'm not 100% sure what's going on here, but I'm inclined to say "get a
 > > real computer" (and yes, I'm typing this on a mac...).  I have a mac
 > > pro that runs FreeBSD-STABLE part time and it's ggrrreeaatt (as tony
 > > the tiger used to say)....
 > 
 > Ouch!  Though it could be worse (**coughwindowscough**).
 > 
 > > I think that we're having trouble with case sensitivity.  My only
 > > evidence is that I can see where there have been both HUMBETGLOA.FASTA
 > > and HUMBETGLOA.fasta in the tree at various times.  I can't figure out
 > > anything else that's weird about that file.  On the other hand, I
 > > can't see how this would cause the error you're seeing though.
 > 
 > Odd that other branches (including the main trunk) work but that one  
 > doesn't.
 > 
 > > The experiment would be to grab a usb or firewire disk (or even a
 > > memory stick), partition/format it as case sensitive (or even *unix*)
 > > and try to do
 > >
 > >  svn co svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data
 > >
 > > into it.  If it works, voila.  If not, I'll keep making stuff up, err,
 > > thinking about it.
 > >
 > > g.
 > 
 > I'll have to figure out why I can't get ssh keys to work locally to  
 > test it out more (I have a usb drive to test with); just don't have  
 > time at the moment.

I just did the experiment, and filename case-insensitivity seems to be
breaking something.

I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.

I reformatted a memory stick to be case sensitive and co of

  bioperl/bioperl-live/tags/release-0-9-2/t 

worked, then I made a directory in my home dir (normal mac thing) and
got the same error as above.
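
If anyone wants to check their own disk without reformatting anything, a
probe along these lines should tell you whether a directory sits on a
case-insensitive filesystem. fs_ignores_case is a name made up for this
sketch, and it assumes you can write to the directory you point it at.

```shell
#!/bin/sh
# fs_ignores_case DIR: exit 0 if DIR is on a case-insensitive filesystem,
# 1 if it is case-sensitive, 2 if the probe file could not be created.
fs_ignores_case() {
    probe="$1/CaseProbe.$$"
    touch "$probe" 2>/dev/null || return 2
    # Same name, different case: only visible if the filesystem folds case.
    if [ -e "$1/caseprobe.$$" ]; then result=0; else result=1; fi
    rm -f "$probe"
    return "$result"
}

if fs_ignores_case "${TMPDIR:-/tmp}"; then
    echo "case-insensitive: expect trouble with that tag"
else
    echo "case-sensitive (or not writable): checkout should be fine"
fi
```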

I can get a copy of the trunk, so I'm inclined to ask someone to
mention the problem on the wiki and then just ignore it.

g.


From cjfields at uiuc.edu  Thu Jun 28 20:29:09 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 15:29:09 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <468409C7.7020102@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
Message-ID: <026156F4-4C46-4CC6-82B5-07FC5326A244@uiuc.edu>


On Jun 28, 2007, at 2:19 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> On Jun 28, 2007, at 11:03 AM, Sendu Bala wrote:
>> Here's a question: how do we plan on handling uploading bioperl
>> updates to CPAN via PAUSE?  Do we want to run every single module
>> through one pumpkin?  Or do we want to have a core dev group PAUSE
>> account?  I can see, for instance, removing everything EUtilities-
>> related and submitting it independently using my own PAUSE account,
>> but it would be nice to have it under an umbrella 'bioperl-devs'
>> account instead.
>
> All Bioperl modules (except the Bundle!) are owned by BIOPERLML on
> PAUSE. It's a little awkward since PAUSE is uploader-centric, but see my
> notes at http://www.bioperl.org/wiki/Making_a_BioPerl_release
>
> And certainly, everything that wants to consider itself part of  
> Bioperl
> (and gain the benefit of lots of devs looking after it) should  
> certainly
>   have BIOPERLML as the primary owner.

Alrighty then.

>> I think so, but the feasibility issue is critical.  Do we want cvs/
>> svn to be divided up into 900 subdirectories (one for each module),
>> or do we want to have a similar directory structure as we have now,
>> but with each module in its own directory?  Or leave everything as
>> is and generate Build.PL on-the-fly (prob. least feasible)?
>
> Very definitely the latter. The key benefit of my approach is that the
> organisation stays as is and that a snapshot of the repository  
> remains a
> single directory of modules in Bio so that people don't have to
> 'install' Bioperl, they can still just uncompress the archive (or  
> check
> out the package from svn) and point their PERL5LIB to the root dir of
> the package.

Okay, makes sense.

> For that reason I very much like the idea of folding the current
> split-out packages (run, network etc.) back into the core package so
> everything is one place. Folding them back in should obviously wait
> until everything is in place and working with core already.

I agree, but that's up to Brian, Hilmar, and the others who donated  
the packages (or at least a consensus of core devs).  One thing at a  
time.

> My proposal obviously wasn't very clear. As far as all other devs are
> concerned, nothing changes at all (except for lots of new improved  
> test
> scripts). The pumpkin will, however, be able to say:
>
> ./Build dist
>
> Right now that generates the distribution archives (in different
> compression formats) - one big archive containing everything.
> My proposal is simply that instead it generates lots of archives, one
> archive per module. It will also generate some Bundles and whatever  
> else
> might be needed.

We'll need to define which tests and data go with each module and  
so on.

> I don't envisage any major difficulties in achieving this. The
> 'feasibility' issue I was going to look into was strictly regarding
> doing all the new test scripts.

Okay.  Maybe it's worth doing on a branch as a test run when 1.5.3  
is ready to go.  We'll still need to get thoughts on this from other  
core devs out there, and it prob. should wait until everybody is  
comfortable with the idea.

chris


From dmessina at wustl.edu  Thu Jun 28 22:13:48 2007
From: dmessina at wustl.edu (David Messina)
Date: Thu, 28 Jun 2007 17:13:48 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
Message-ID: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>

Coming late to this party, I'm replying to snippets from multiple  
emails.


> [Chris]
> what we do about deprecated modules which linger
> about on CPAN

> [Sendu]
> Delete them from CPAN seems appropriate.

I coulda sworn this was frowned upon, but a recent thread suggests  
it's totally kosher.

	http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html


> [Sendu]
> So, regardless of anything else can we all agree that per-module test
> scripts are a good idea and should be worked on?

I agree.


> [Sendu]
> people don't have to
> 'install' Bioperl, they can still just uncompress the archive (or  
> check
> out the package from svn) and point their PERL5LIB to the root dir of
> the package.

Could you elaborate a bit on how this works? How is XS code that  
needs compiling handled? Or the scripts directory? I would love to be  
able to do this.


> [Sendu]
> For that reason I very much like the idea of folding the current
> split-out packages (run, network etc.) back into the core package so
> everything is one place. Folding them back in should obviously wait
> until everything is in place and working with core already.

 From an organizational standpoint, I'm concerned that with ~900  
modules in core right now, adding all of the additional stuff from  
the split-out packages would make for a daunting directory.

But as you said, this is way down the road, so this proposal doesn't  
bear on the other, closer-to-now issues on the table.


> [Chris]
> Okay.  Maybe it's worth doing on a branch  as a test run when 1.5.3
> is ready to go.  We'll still need to get thoughts on this from other
> core devs out there, and it prob. should until everybody is
> comfortable with the idea.

If we go forward with the CPAN split plan, I like the idea of having  
a trial. We can foresee some of the issues that such a change may  
bring, and yet still more no doubt wait for us once we do it.


Dave


From bix at sendu.me.uk  Thu Jun 28 22:59:35 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 28 Jun 2007 23:59:35 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
Message-ID: <46843D57.2080409@sendu.me.uk>

David Messina wrote:
>> people don't have to 'install' Bioperl, they can still just
>> uncompress the archive (or check out the package from svn) and
>> point their PERL5LIB to the root dir of the package.
> 
> Could you elaborate a bit on how this works? How is XS code that 
> needs compiling handled? Or the scripts directory? I would love to be
> able to do this.

I meant for the most part. Core doesn't have any XS code so that's not 
an issue. Scripts can be run manually like any other perl script. When 
you discover something isn't working because of a missing external 
dependency, you just install it. (But that happens very rarely.)

Personally I've /never/ installed Bioperl and used that installed set of 
modules. I've always just pointed my PERL5LIB at the distribution folder 
or my cvs checkout.

Which makes me a strange candidate for advocating all these 
CPAN-specific changes, but there you go ;)


From cjfields at uiuc.edu  Thu Jun 28 23:03:02 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 28 Jun 2007 18:03:02 -0500
Subject: [Bioperl-l] Splits again
In-Reply-To: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
Message-ID: <8B6FBB52-5CCE-4122-876C-B9827C86E46E@uiuc.edu>


On Jun 28, 2007, at 5:13 PM, David Messina wrote:

> Coming late to this party, I'm replying to snippets from multiple  
> emails.
>
>
>> [Chris]
>> what we do about deprecated modules which linger
>> about on CPAN
>
>> [Sendu]
>> Delete them from CPAN seems appropriate.
>
> I coulda sworn this was frowned upon, but a recent thread suggests  
> it's totally kosher.
>
> 	http://www.nntp.perl.org/group/perl.qa/2007/03/msg8473.html

As long as it doesn't show up somewhere to confuse newbies I'm okay  
with it.

>> [Sendu]
>> people don't have to
>> 'install' Bioperl, they can still just uncompress the archive (or  
>> check
>> out the package from svn) and point their PERL5LIB to the root dir of
>> the package.
>
> Could you elaborate a bit on how this works? How is XS code that  
> needs compiling handled? Or the scripts directory? I would love to  
> be able to do this.

Maybe Sendu can add to this, but the XS code is limited to
bioperl-ext AFAIK.  We could keep that separate until it plays well with
bioperl itself.

Scripts and examples - maybe packaged along with a Bundle?

>> [Sendu]
>> For that reason I very much like the idea of folding the current
>> split-out packages (run, network etc.) back into the core package so
>> everything is one place. Folding them back in should obviously wait
>> until everything is in place and working with core already.
>
> From an organizational standpoint, I'm concerned that with ~900  
> modules in core right now, adding all of the additional stuff from  
> the split-out packages would make for a daunting directory.
>
> But as you said, this is way down the road, so this proposal  
> doesn't bear on the other, closer-to-now issues on the table.

Well, the code in bioperl-db and network complements code in core, so  
I agree with Sendu they belong there.  They should be under the same  
scrutiny as the rest anyway (code, tests, etc), but won't be bundled  
unless there is an 'install everything' Bundle.

>> [Chris]
>> Okay.  Maybe it's worth doing on a branch  as a test run when 1.5.3
>> is ready to go.  We'll still need to get thoughts on this from other
>> core devs out there, and it prob. should until everybody is
>> comfortable with the idea.
>
> If we go forward with the CPAN split plan, I like the idea of  
> having a trial. We can foresee some of the issues that such a  
> change may bring, and yet still more no doubt wait for us once we  
> do it.

That's what branches are for; testing stuff out like this.

chris


From hartzell at alerce.com  Thu Jun 28 23:05:32 2007
From: hartzell at alerce.com (George Hartzell)
Date: Thu, 28 Jun 2007 19:05:32 -0400
Subject: [Bioperl-l] problem with binary files.
Message-ID: <18052.16060.932502.183552@almost.alerce.com>


Ok, after pointing out the problem with setting the svn:keywords
property on binary files, it turns out that I *did* that.  Worse yet,
I set the svn:eol-style to 'native' on everything, including binary
files, so depending on your platform they're likely to be fubar.

For example, bioperl-run/t/data/H_pylori_J99.glimmer2.icm may or may
not be what you expect it to be, depending on whether your eol-style
matches the server's and whether any conversions were done.

I'll touch up the way that the little tool I'm using calls cvs2svn and
redo the repository.
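
As a rough illustration of the kind of guard that avoids this, here is a
minimal Python sketch that only nominates text-looking files for
svn:eol-style. The NUL-byte heuristic, the function names, and the
sample contents are my assumptions for the example; this is not how
cvs2svn decides, and a real conversion would rely on its own options.

```python
def looks_binary(data, sample_size=8192):
    """Heuristic: call data binary if the sampled prefix holds a NUL byte."""
    return b"\x00" in data[:sample_size]


def eol_style_candidates(files):
    """Return the sorted names of files that look like text.

    `files` maps a file name to its raw bytes. Anything flagged by
    looks_binary() is left out, so it would never be given
    svn:eol-style (or svn:keywords).
    """
    return sorted(name for name, data in files.items() if not looks_binary(data))


if __name__ == "__main__":
    # Hypothetical contents, for illustration only.
    files = {
        "model.icm": b"\x00\x10\x02binary payload",
        "seq.fasta": b">seq1\nACGT\n",
    }
    print(eol_style_candidates(files))
```

The heuristic is deliberately conservative about binaries: a single NUL
byte in the first 8 KB is enough to keep a file away from any
line-ending conversion.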

g.


From n.haigh at sheffield.ac.uk  Fri Jun 29 06:59:21 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 29 Jun 2007 07:59:21 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
References: <467949EC.9040100@sendu.me.uk>	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>	<4682C6F5.4020406@sendu.me.uk>
	<4682D12E.3000803@sendu.me.uk>	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>	<4682E824.1050507@sendu.me.uk>	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>	<4683624F.6020402@sendu.me.uk>	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<BAC2EF08-D48C-4CEF-8860-26A1D3C41B69@wustl.edu>
Message-ID: <4684ADC9.8040404@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

- -- split --
>> [Sendu]
>> For that reason I very much like the idea of folding the current
>> split-out packages (run, network etc.) back into the core package so
>> everything is one place. Folding them back in should obviously wait
>> until everything is in place and working with core already.
> 
>  From an organizational standpoint, I'm concerned that with ~900  
> modules in core right now, adding all of the additional stuff from  
> the split-out packages would make for a daunting directory.
> 
> But as you said, this is way down the road, so this proposal doesn't  
> bear on the other, closer-to-now issues on the table.
> 

I don't think this is an issue - it would simply mean everything is
under the same version control hierarchy. And with svn it's Soooooo much
easier to fiddle around with directory structures

> 
> 
>> [Chris]
>> Okay.  Maybe it's worth doing on a branch  as a test run when 1.5.3
>> is ready to go.  We'll still need to get thoughts on this from other
>> core devs out there, and it prob. should until everybody is
>> comfortable with the idea.
> 
> If we go forward with the CPAN split plan, I like the idea of having  
> a trial. We can foresee some of the issues that such a change may  
> bring, and yet still more no doubt wait for us once we do it.
> 

Under svn it would be easy to make an "svn copy" of run, network etc.
into a branch of live to test this out. Not that this might be a
problem, but since we are looking at the bioperl-* packages being under
the same svn repository, "svn copy" operations are cheap on disk space.

> 
> Dave
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGhK3JczuW2jkwy2gRAtI2AJ4kNrpGY8XMMh9KxOqs+l0PrEVcwgCfVFj6
BCvltmPyWF4ImueYmd7VFAc=
=ktl+
-----END PGP SIGNATURE-----


From n.haigh at sheffield.ac.uk  Fri Jun 29 07:05:33 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 29 Jun 2007 08:05:33 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
 Perltidy]
In-Reply-To: <18051.61992.627473.323346@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
Message-ID: <4684AF3D.5090907@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

George Hartzell wrote:

- -- snip --

>  > [...]
>  > I've googled around and gathered the following as a possible list for  
>  > our repo. Since I obviously don't know what I'm doing :), of course  
>  > adjust and refine as necessary.
>  > 
> 
> That's a great starting point.  Do you have write access to the wiki?
> Could you link it off of the instructions for using svn?
> 
> g.

Don't .t files need adding to the auto-props?

Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGhK89czuW2jkwy2gRAnRGAJ0VnBNVBAdQdfUnqPhmvsyQnD/bswCggSHC
/Iivb6Lc4/51bUdrTmRQYlE=
=V+t2
-----END PGP SIGNATURE-----


From sac at bioperl.org  Fri Jun 29 08:25:36 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Fri, 29 Jun 2007 01:25:36 -0700
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
Message-ID: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>

On 6/27/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> On Jun 26, 2007, at 3:21 PM, George Hartzell wrote:
>
> > ...
> > If you have a dev.open-bio.org account and you're in the bioperl
> > group, you're good to get at it via:
> >
> >   file:///home/hartzell/bioperl
> >
> > or
> >
> >   svn+ssh://dev.open-bio.org/home/hartzell/bioperl
>
> I managed to get it working using file://.  Haven't tried svn+ssh yet
> but I've had persistent problems getting ssh to work properly on my
> macbook; not sure why yet but I haven't had time to play around with it.

Are you using the ssh that comes installed with OSX? If so, I'd
recommend installing openssh from MacPorts. I recall having issues
with the stock version which were resolved by using the more
up-to-date version you can get via MacPorts.

BTW, I haven't been able to check out the new svn repository via
svn+ssh:// because I can't get svn to authenticate with an alternative
username. My username on dev.open-bio.org differs from what it is on
my local machine, so I issue a command such as:

steve at localhost $ svn --username sac checkout
svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk

but I get challenged with:
steve at dev.open-bio.org's password:

I also tried putting the --username argument after the subcommand, but
it still wants to use my local username. I can ssh -l sac into the dev
box no problem. Any suggestions?

Steve


From bix at sendu.me.uk  Fri Jun 29 08:52:42 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 29 Jun 2007 09:52:42 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
 Perltidy]
In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
Message-ID: <4684C85A.5030206@sendu.me.uk>

Steve Chervitz wrote:
> BTW, I haven't been able to check out the new svn repository via
> svn+ssh:// because I can't get svn to authenticate with an alternative
> username. My username on dev.open-bio.org differs from what it is on
> my local machine, so I issue a command such as:
> 
> steve at localhost $ svn --username sac checkout
> svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
> 
> but I get challenged with:
> steve at dev.open-bio.org's password:
> 
> I also tried putting the --username argument after the subcommand, but
> it still wants to use my local username. I can ssh -l sac into the dev
> box no problem. Any suggestions?

Set up your ssh key on the dev machine. I'm also on a machine with the 
wrong username and it works even without attempting to supply the 
correct one.

It does, however, show the 'Welcome to the new developer system' message 
2 or 3 times for every svn+ssh action, which freaks me out a little.


From N.Haigh at sheffield.ac.uk  Fri Jun 29 09:32:38 2007
From: N.Haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 29 Jun 2007 10:32:38 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and
	...Re:	Perltidy]
In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
Message-ID: <1183109558.4684d1b69bcec@webmail.shef.ac.uk>

Quoting Steve Chervitz <sac at bioperl.org>:

-- snip --

> BTW, I haven't been able to check out the new svn repository via
> svn+ssh:// because I can't get svn to authenticate with an alternative
> username. My username on dev.open-bio.org differs from what it is on
> my local machine, so I issue a command such as:
> 
> steve at localhost $ svn --username sac checkout
> svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
> 
> but I get challenged with:
> steve at dev.open-bio.org's password:
> 
> I also tried putting the --username argument after the subcommand, but
> it still wants to use my local username. I can ssh -l sac into the dev
> box no problem. Any suggestions?
> 
> Steve
> 


You could try:
svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk

Nath


From dmessina at wustl.edu  Fri Jun 29 12:28:26 2007
From: dmessina at wustl.edu (David Messina)
Date: Fri, 29 Jun 2007 07:28:26 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
Message-ID: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>

>
> BTW, I haven't been able to check out the new svn repository via
> svn+ssh:// because I can't get svn to authenticate with an alternative
> username.

I have the same issue. I set up a stanza in my ~/.ssh/config:

Host dev.open-bio.org
   User dave_messina

where dave_messina is my dev.open-bio.org username.


From cjfields at uiuc.edu  Fri Jun 29 17:00:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 29 Jun 2007 12:00:27 -0500
Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository]
In-Reply-To: <42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
Message-ID: <F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>


On Jun 29, 2007, at 7:28 AM, David Messina wrote:

>>
>> BTW, I haven't been able to check out the new svn repository via
>> svn+ssh:// because I can't get svn to authenticate with an  
>> alternative
>> username.
>
> I have the same issue. I set up a stanza in my ~/.ssh/config:
>
> Host dev.open-bio.org
>    User dave_messina
>
> where dave_messina is my dev.open-bio.org username.

I changed to the macports ssh w/o luck.  It appears the key is  
offered up, so maybe the problem is how I have everything set up on  
dev (though I followed everything on the wiki):

....
  Contact 'support at open-bio.org' for
your new login information.
======================================
debug1: Authentications that can continue: publickey,gssapi-with-mic,password
debug1: Next authentication method: publickey
debug1: Offering public key: /Users/cjfields/.ssh/id_dsa
debug2: we sent a publickey packet, wait for reply
debug1: Authentications that can continue: publickey,gssapi-with-mic,password
debug2: we did not send a packet, disable method
debug1: Next authentication method: password

It's odd; I can use passwordless logins for other servers (admittedly  
Mac servers) w/o problems using ssh keys, but dev.open-bio.org always  
prompts for a password regardless.

My feeling is it's something with my local ssh or sshd config; I'll  
try fiddling with it to see what happens.  Anyone have suggestions?   
I've lost enough hair as is; don't want to lose more!

chris


From sac at bioperl.org  Fri Jun 29 17:07:45 2007
From: sac at bioperl.org (Steve Chervitz)
Date: Fri, 29 Jun 2007 10:07:45 -0700
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and ...Re:
	Perltidy]
In-Reply-To: <1183109558.4684d1b69bcec@webmail.shef.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<1183109558.4684d1b69bcec@webmail.shef.ac.uk>
Message-ID: <8f200b4c0706291007x2b765323n75c9003a47fe7cbb@mail.gmail.com>

On 6/29/07, Nathan S. Haigh <N.Haigh at sheffield.ac.uk> wrote:
> Quoting Steve Chervitz <sac at bioperl.org>:
>
> -- snip --
>
> > BTW, I haven't been able to check out the new svn repository via
> > svn+ssh:// because I can't get svn to authenticate with an alternative
> > username. My username on dev.open-bio.org differs from what it is on
> > my local machine, so I issue a command such as:
> >
> > steve at localhost $ svn --username sac checkout
> > svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk
> >
> > but I get challenged with:
> > steve at dev.open-bio.org's password:
> >
> > I also tried putting the --username argument after the subcommand, but
> > it still wants to use my local username. I can ssh -l sac into the dev
> > box no problem. Any suggestions?
>
> [...]
> You could try:
> svn+ssh://USERNAME at dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk

Bingo. Thanks for the tips, guys.

BTW, setting up ssh keys was not the issue, since my key is already
set up on the dev machine. The svn --username setting appears to not
be operative at the ssh layer. I  suspected this might be the case
given that the usage info says:

 $ svn --help co
  --username arg           : specify a username ARG
  --password arg           : specify a password ARG

which seemed insecure. I didn't want to send my password in the clear,
and didn't know whether svn would hand it off to ssh. It wasn't
even sending my username to ssh, so I knew something was wrong. These
args are probably only intended for accessing local svn repositories,
or non-svn+ssh-based checkouts.
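
Since the remote username has to travel in the URL itself (or in
~/.ssh/config) rather than via --username, here is a minimal Python
sketch that rewrites a repository URL to carry it. The helper name
with_ssh_user is made up for illustration. Note that the list archive
renders "@" as " at ", whereas the code uses a literal "@".

```python
from urllib.parse import urlsplit, urlunsplit


def with_ssh_user(url, user):
    """Return url with `user@` placed in front of the host.

    Any user already present in the URL is replaced.
    """
    parts = urlsplit(url)
    # Drop an existing "someone@" prefix, keeping host (and port, if any).
    host = parts.netloc.rsplit("@", 1)[-1]
    return urlunsplit(parts._replace(netloc=f"{user}@{host}"))


if __name__ == "__main__":
    repo = "svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk"
    print(with_ssh_user(repo, "sac"))
```

The resulting URL can be passed straight to "svn checkout"; ssh then
logs in as the user embedded in the URL instead of the local account
name.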

BTW, the svn+ssh check out on Mac OS X works for me. I'm using svn and
openssh installed via MacPorts:

$ svn --version
svn, version 1.4.4 (r25188)
   compiled Jun 28 2007, 23:51:53

$ ssh -version
OpenSSH_4.6p1, OpenSSL 0.9.8e 23 Feb 2007

Steve


From hartzell at alerce.com  Fri Jun 29 19:19:31 2007
From: hartzell at alerce.com (George Hartzell)
Date: Fri, 29 Jun 2007 15:19:31 -0400
Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository]
In-Reply-To: <F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
	<F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>
Message-ID: <18053.23363.102371.602742@almost.alerce.com>

Chris Fields writes:
 > 
 > On Jun 29, 2007, at 7:28 AM, David Messina wrote:
 > 
 > >>
 > >> BTW, I haven't been able to check out the new svn repository via
 > >> svn+ssh:// because I can't get svn to authenticate with an  
 > >> alternative
 > >> username.
 > >
 > > I have the same issue. I set up a stanza in my ~/.ssh/config:
 > >
 > > Host dev.open-bio.org
 > >    User dave_messina
 > >
 > > where dave_messina is my dev.open-bio.org username.
 > 
 > I changed to the macports ssh w/o luck.  It appears the key is  
 > offered up, so maybe the problem is how I have everything set up on  
 > dev (though I followed everything on the wiki):

A couple of things to check.

  - make sure that you put your public key in ~/.ssh/authorized_keys2
    (not authorized_keys)

  - make sure that authorized_keys2 is chmod'ed 600 (644 might be
    enough...).

  - make sure that ~/.ssh is chmod'ed 700.

  - make sure that your home directory is 755.

Then see if it works.  You might be able to relax some of those
protections a bit, but ssh's uptight about letting other people mess
with that data.
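
The checklist above can be sketched as a small Python check. This is
only an illustration under assumptions: the function names are made up,
the ceilings are the strict values from the list (755, 700, 600), and
what sshd actually tolerates depends on its own configuration.

```python
import os
import stat


def too_permissive(path, max_mode):
    """True if path grants any permission bit that is not in max_mode."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & ~max_mode)


def check_ssh_layout(home):
    """Return (path, ceiling) pairs for entries looser than the checklist.

    Entries that do not exist are skipped rather than reported.
    """
    limits = [
        (home, 0o755),
        (os.path.join(home, ".ssh"), 0o700),
        (os.path.join(home, ".ssh", "authorized_keys2"), 0o600),
    ]
    return [
        (path, oct(ceiling))
        for path, ceiling in limits
        if os.path.exists(path) and too_permissive(path, ceiling)
    ]


if __name__ == "__main__":
    for path, ceiling in check_ssh_layout(os.path.expanduser("~")):
        print(f"{path}: tighten to at most {ceiling}")
```

It would need to be run on the server side (dev, in this case), since
that is where sshd inspects the home directory and key file.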

g.


From dmessina at wustl.edu  Fri Jun 29 22:47:14 2007
From: dmessina at wustl.edu (David Messina)
Date: Fri, 29 Jun 2007 17:47:14 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <4684AF3D.5090907@sheffield.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
Message-ID: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>

> [Nathan]
> Don't .t files need adding to the auto-props?

Yes -- thanks for reminding me. Please feel free to add it to the  
wiki page. I'll be tweaking it some more later on in any case.


Dave


From n.haigh at sheffield.ac.uk  Sat Jun 30 09:55:56 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 30 Jun 2007 10:55:56 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
 Perltidy]
In-Reply-To: <843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
Message-ID: <468628AC.9060200@sheffield.ac.uk>

David Messina wrote:
>> [Nathan]
>> Don't .t files need adding to the auto-props?
> 
> Yes -- thanks for reminding me. Please feel free to add it to the wiki 
> page. I'll be tweaking it some more later on in any case.
> 
> 
> Dave

I noticed this has already been done. I have just been through the 
t/data dir and added a list of extensions I found (without props). There 
are some files without extensions; how should these be dealt with? There 
seems to be a plethora of file naming styles which means there's a 
pretty long list of non-standard extensions. So at some point someone 
will commit a new data file with a new extension (often describing what 
program created the output or the test for which it's intended) that 
won't be in the auto-props file - can you think of a way around this?
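
In case it helps anyone repeat the survey, this is roughly how such a list
can be produced. It is only a sketch: the function name list_extensions is
invented here, and it treats everything after the last "." as the extension,
which is an assumption that will misclassify names using "." as a word
separator.

```shell
# Sketch: count files per extension under a directory (default t/data).
# Files with no "." after the first character are reported as "(none)".
list_extensions() {
    dir=${1:-t/data}

    # Skip svn's own .svn admin directories inside a working copy.
    find "$dir" -name .svn -prune -o -type f -print |
    while IFS= read -r path; do
        name=${path##*/}
        case $name in
            ?*.*) printf '%s\n' ".${name##*.}" ;;
            *)    printf '%s\n' "(none)" ;;
        esac
    done | sort | uniq -c | sort -rn
}

# Usage from the top of a checkout:
#   list_extensions t/data
```

Anything that shows up with a count of 1 or 2 is probably one of the
non-standard extensions in question.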

Nath


From cjfields at uiuc.edu  Sat Jun 30 12:48:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 07:48:10 -0500
Subject: [Bioperl-l] ssh keys [was Re: First cut svn repository]
In-Reply-To: <18053.23363.102371.602742@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>
	<8f200b4c0706290125w2394887cg4fbb0de769e8673d@mail.gmail.com>
	<42A2ACAF-645B-497F-8335-A8D72CCBEC73@wustl.edu>
	<F2320571-9B57-4671-90C8-E76182544DA3@uiuc.edu>
	<18053.23363.102371.602742@almost.alerce.com>
Message-ID: <3874B4EE-0119-40BC-8B92-11133A766417@uiuc.edu>


On Jun 29, 2007, at 2:19 PM, George Hartzell wrote:

> Chris Fields writes:
>>
>> On Jun 29, 2007, at 7:28 AM, David Messina wrote:
>>
>>>>
>>>> BTW, I haven't been able to check out the new svn repository via
>>>> svn+ssh:// because I can't get svn to authenticate with an
>>>> alternative
>>>> username.
>>>
>>> I have the same issue. I set up a stanza in my ~/.ssh/config:
>>>
>>> Host dev.open-bio.org
>>>    User dave_messina
>>>
>>> where dave_messina is my dev.open-bio.org username.
>>
>> I changed to the macports ssh w/o luck.  It appears the key is
>> offered up, so maybe the problem is how I have everything set up on
>> dev (though I followed everything on the wiki):
>
> A couple of things to check.
>
>   - make sure that you put your public key in ~/.ssh/authorized_keys2
>     (not authorized_keys)
>
>   - make sure that authorized_keys2 is chmod'ed 600 (644 might be
>     enough...).
>
>   - make sure that ~/.ssh is chmoded 700.
>
>   - make sure that your home directory is 755.
>
> Then see if it works.  You might be able to relax some of those
> protections a bit, but ssh's uptight about letting other people mess
> with that data.
>
> g.

Got it working; it was the permissions on my home dir (the last  
one).  Thanks George!

chris


From dmessina at wustl.edu  Sat Jun 30 15:37:44 2007
From: dmessina at wustl.edu (David Messina)
Date: Sat, 30 Jun 2007 10:37:44 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <468628AC.9060200@sheffield.ac.uk>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
Message-ID: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>

> I have just been through the t/data dir and added a list of  
> extensions I found

Thanks! That's a big help. I'll add prop definitions to those shortly.


>  There are some files without extensions, how should these be dealt  
> with?

If you look in the text files section, there are some files there  
which don't have extensions, e.g. AUTHORS, BUGS. There's also

	Makefile.*

so we have some flexibility in how svn knows to auto-prop a file. I  
haven't read up on the details yet to find out how it handles files  
that match multiple criteria -- it may be dependent simply on the  
order they're defined.
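
Until that is read up on, a quick way to at least see which file names would
hit more than one pattern is to test them against the pattern list with shell
globbing. This is a sketch under assumptions: the pattern list below is
illustrative (only AUTHORS, BUGS, Makefile.* and *.t come from this thread),
matching_patterns is a made-up name, and shell globs only approximate
whatever matching svn itself does. It says nothing about which property wins
when two patterns set the same one.

```shell
# Illustrative pattern list, one glob per line.
patterns='AUTHORS
BUGS
Makefile.*
*.t'

# Print every pattern in $patterns that matches the given file name.
matching_patterns() {
    name=$1
    printf '%s\n' "$patterns" | while IFS= read -r pat; do
        # $pat is deliberately unquoted so the shell treats it as a glob.
        case $name in
            $pat) printf '%s\n' "$pat" ;;
        esac
    done
}

# Usage:
#   matching_patterns Makefile.t    # lists both Makefile.* and *.t
```

Names that print more than one line are the ones to worry about.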


> There seems to be a plethora of file naming styles which means  
> there's a pretty long list of non-standard extensions. So at some  
> point someone will commit a new data file with a new extension  
> (often describing what program created the output or the test for  
> which it's intended) that won't be in the auto-props file - can you  
> think of a way around this?

I've been thinking about this a bit. How about this?

- We have just "standard" files and extensions (like *.blast,  
*.fasta) in the auto-props list.

- We manually add props for the files that have nonstandard,  
arbitrary extensions so all the files have now are prop'd.

- At some point we rename those nonstandard files to have standard  
extensions. Especially for the t/data/ files, we'll have to make sure  
to update the tests that rely on them.

- We can have a suggested list of extensions for new files that get  
added. I don't think we need to strictly enforce this just for the  
sake of svn (after all, its primary function of version control will  
work just fine without any properties set), but it would be nice if  
we could try to keep to it mostly.

Many distros come with an /etc/mime.types file which has the list of  
officially registered MIME types. I found a script that will take  
this list and convert it into auto-props format. I don't think we  
need to support *all* of the gazillion filetypes since most of  
them our repository will never see, but we certainly could.
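
To give an idea of what such a conversion involves, here is a minimal sketch.
It is not the script mentioned above. It assumes the usual mime.types layout
(a MIME type followed by zero or more extensions, "#" for comments) and emits
one auto-props line per extension, setting only svn:mime-type. The function
name mime_to_autoprops is made up.

```shell
# Sketch: turn mime.types entries into auto-props lines.
# Input file defaults to /etc/mime.types.
mime_to_autoprops() {
    awk '
        # Skip comment lines and types that list no extension.
        /^[[:space:]]*#/ || NF < 2 { next }
        # One "*.ext = svn:mime-type=type" line per listed extension.
        { for (i = 2; i <= NF; i++) printf "*.%s = svn:mime-type=%s\n", $i, $1 }
    ' "${1:-/etc/mime.types}"
}

# Usage:
#   mime_to_autoprops /etc/mime.types
```

The output could be filtered down to the handful of types we care about
before it goes anywhere near the [auto-props] section.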


Dave


From dmessina at wustl.edu  Sat Jun 30 16:26:27 2007
From: dmessina at wustl.edu (David Messina)
Date: Sat, 30 Jun 2007 11:26:27 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
	<461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
Message-ID: <D6917C62-FA0C-4261-ACFD-014DEF4D89E6@wustl.edu>


On Jun 30, 2007, at 10:37 AM, David Messina wrote:

> - We manually add props for the files that have nonstandard,
> arbitrary extensions so all the files have now are prop'd.

Er, that should be

- We manually add props for the files that have nonstandard,  
arbitrary extensions so that all the files now in the repository are  
prop'd.


From n.haigh at sheffield.ac.uk  Sat Jun 30 17:25:58 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 30 Jun 2007 18:25:58 +0100
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
 Perltidy]
In-Reply-To: <461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
	<461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
Message-ID: <46869226.70203@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

- -- snip --
> 
> 
>> There seems to be a plethora of file naming styles which means there's
>> a pretty long list of non-standard extensions. So at some point
>> someone will commit a new data file with a new extension (often
>> describing what program created the output or the test for which it's
>> intended) that won't be in the auto-props file - can you think of a
>> way around this?
> 
> Ive been thinking about this a bit. How about this?
> 
> - We have just "standard" files and extensions (like *.blast, *.fasta)
> in the auto-props list.

I think the list of seq formats recognised by Bioperl in Bio::SeqIO and
Bio::AlignIO would be a good start, as these are likely to be the ones
that are sensitive to file format recognition and thus could break tests
if renamed.

I think a lot of people have used "." in file names as an alternative to
a space. I think it would be beneficial to use an underscore "_" in
these cases and leave the "." to represent the beginning of the file
extension.
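
As a rough sketch of that convention: keep the last "." as the extension
separator and turn every earlier "." into "_". The function name
underscore_name and the example file name are invented for illustration, and
the sketch only computes the new name; it renames nothing.

```shell
# Sketch: compute the underscore form of a file name.
# Only the last "." is kept, as the start of the extension.
underscore_name() {
    name=$1
    case $name in
        ?*.*) ;;                                # has a stem and an extension
        *)    printf '%s\n' "$name"; return ;;  # nothing to split
    esac
    stem=${name%.*}      # everything before the last "."
    ext=${name##*.}      # everything after the last "."
    printf '%s.%s\n' "$(printf '%s' "$stem" | tr '.' '_')" "$ext"
}

# Usage (hypothetical file name):
#   underscore_name blast.report.out    # prints blast_report.out
```

The actual rename would then be an svn mv from the old name to the new one,
plus an update to whichever tests open the file.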

> 
> - We manually add props for the files that have nonstandard, arbitrary
> extensions so all the files that we currently have now are prop'd.
> 
> - At some point we rename those nonstandard files to have standard
> extensions. Especially for the t/data/ files, we'll have to make sure to
> update the tests that rely on them.

Nice and easy with svn :)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGhpHiczuW2jkwy2gRAuZ5AKCnd2MvCsvSn1NemDVMmabnieR2vACg1Qk0
pYVvXwxq0lpiGfM09RQ6A1I=
=3Lhw
-----END PGP SIGNATURE-----


From cjfields at uiuc.edu  Sat Jun 30 19:11:52 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 30 Jun 2007 14:11:52 -0500
Subject: [Bioperl-l] First cut svn repository [was Re: SVN and	...Re:
	Perltidy]
In-Reply-To: <D6917C62-FA0C-4261-ACFD-014DEF4D89E6@wustl.edu>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>	<4673C7CB.1030709@mail.nih.gov>	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>	<18049.30026.61328.134490@almost.alerce.com>	<5764264E-5C40-4C9E-B1C9-A70628AC1DD0@uiuc.edu>	<BFBA575A-E653-40F6-9242-D72655B6AE9C@wustl.edu>	<E83D9D3C-96F2-4B5A-B503-09C3860586D0@gmx.net>	<D7111143-D173-42DE-AAEF-C2365AA453A0@wustl.edu>	<18051.44281.831316.749586@almost.alerce.com>	<F5B048F4-CBA5-493A-8A5C-2033709D8A63@wustl.edu>
	<18051.61992.627473.323346@almost.alerce.com>
	<4684AF3D.5090907@sheffield.ac.uk>
	<843758CD-9C5B-4DDA-9FF4-B90AA225BDB3@wustl.edu>
	<468628AC.9060200@sheffield.ac.uk>
	<461F64B9-87FD-458A-8945-8238E7076109@wustl.edu>
	<D6917C62-FA0C-4261-ACFD-014DEF4D89E6@wustl.edu>
Message-ID: <C274666B-9771-4296-80BB-8DFFB036F29C@uiuc.edu>


On Jun 30, 2007, at 11:26 AM, David Messina wrote:

>
> On Jun 30, 2007, at 10:37 AM, David Messina wrote:
>
>> - We manually add props for the files that have nonstandard,
>> arbitrary extensions so all the files have now are prop'd.
>
> Er, that should be
>
> - We manually add props for the files that have nonstandard,
> arbitrary extensions so that all the files now in the repository are
> prop'd.

Do we need to define every filetype extension, or can there be a  
fallback (eg if it isn't on the list or has no extension it's plain  
text)?

chris


From hlapp at gmx.net  Sat Jun 30 21:26:22 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 17:26:22 -0400
Subject: [Bioperl-l] Splits again
In-Reply-To: <468409C7.7020102@sendu.me.uk>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
Message-ID: <A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>


On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote:

> [...]
> Very definitely the latter. The key benefit of my approach is that  
> the organisation stays as is and that a snapshot of the repository  
> remains a single directory of modules in Bio so that people don't  
> have to 'install' Bioperl, they can still just uncompress the  
> archive (or check out the package from svn) and point their  
> PERL5LIB to the root dir of the package.

I think this is absolutely key to keep in mind. Anything without this  
feature will likely be a non-starter.

I don't really have time to follow the discussion let alone  
participate, so really all I can contribute is to offer some sanity/ 
reality checks (such as the above).

In this sense, I understand a release pumpkin will generate ~900  
packages to upload to CPAN? How much hassle is that compared to what  
uploading a bioperl release means right now?

How brittle is all the Build.PL code that will be needed to automate  
all of this, and how difficult will it be to maintain? For example,  
if someone adds in 10 new modules, what Build.PL-related work will  
need to be done?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Sat Jun 30 21:32:52 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Sat, 30 Jun 2007 22:32:52 +0100
Subject: [Bioperl-l] Splits again
In-Reply-To: <A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>
References: <467949EC.9040100@sendu.me.uk>
	<467FBDD3.8050009@sendu.me.uk>	<46823ABE.2080300@sendu.me.uk>
	<4682B000.2050707@sheffield.ac.uk>	<A17327A5-A174-4110-B793-A80775D80623@uiuc.edu>	<4682B798.1010409@sheffield.ac.uk>
	<4682C6F5.4020406@sendu.me.uk> <4682D12E.3000803@sendu.me.uk>
	<2517AA40-9CDF-44F0-9665-107549DFD30C@uiuc.edu>
	<4682E824.1050507@sendu.me.uk>
	<FBAC5A51-B894-4508-996F-B0248CCF5022@uiuc.edu>
	<4683624F.6020402@sendu.me.uk>
	<CFF085C7-89F1-4DB7-BDA2-935E96AEEE5B@uiuc.edu>
	<4683DBEA.90005@sendu.me.uk>
	<904D660A-3A2F-46F5-A198-0C00CBBF14C1@uiuc.edu>
	<468409C7.7020102@sendu.me.uk>
	<A910978B-C0E9-40DE-B674-7B693520807E@gmx.net>
Message-ID: <4686CC04.6000403@sendu.me.uk>

Hilmar Lapp wrote:
> On Jun 28, 2007, at 3:19 PM, Sendu Bala wrote:
> 
>> [...]
>> Very definitely the latter. The key benefit of my approach is that  
>> the organisation stays as is and that a snapshot of the repository  
>> remains a single directory of modules in Bio so that people don't  
>> have to 'install' Bioperl, they can still just uncompress the  
>> archive (or check out the package from svn) and point their  
>> PERL5LIB to the root dir of the package.
[snip]
> In this sense, I understand a release pumpkin will generate ~900  
> packages to upload to CPAN? How much hassle is that compared to what  
> uploading a bioperl release means right now?

I'd have to investigate. I did my uploads using the PAUSE website, which 
for 900 packages would be unfeasible. Will have to see if the process 
can be automated.


> How brittle is all the Build.PL code that will be needed to automate  
> all of this, and how difficult will it be to maintain? For example,  
> if someone adds in 10 new modules, what Build.PL-related work will  
> need to be done?

Well, my plan will be that once the work is done, you won't need to 
touch the Build.PL code again. My intent is that the pumpkin can just 
type one command and not think about anything.

As for the reality, I won't know until I think about it properly and 
experiment.


From hlapp at gmx.net  Sat Jun 30 23:36:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 30 Jun 2007 19:36:45 -0400
Subject: [Bioperl-l] First cut svn repository
In-Reply-To: <18052.3946.224905.415905@almost.alerce.com>
References: <3097065.1181941697249.JavaMail.myubc2@brahms.my.ubc.ca>
	<bba689ec0706151440o56a7d6c6ncf72a37cd2b2cdc5@mail.gmail.com>
	<185BDA34-1449-49CA-B146-ADF27D2928CD@gmx.net>
	<8D3B697E-2072-46FE-A1C9-E546D9DEAA45@uiuc.edu>
	<4673C7CB.1030709@mail.nih.gov>
	<410EF5F9-A30E-4AB7-85F7-7E761E3890D5@uiuc.edu>
	<18049.30026.61328.134490@almost.alerce.com>
	<4683A7D1.8070403@sendu.me.uk>
	<18051.48684.996884.134046@almost.alerce.com>
	<4683C385.3050904@sendu.me.uk>
	<18051.63674.685297.426813@almost.alerce.com>
	<D554E628-AB22-44C2-B253-3CDDB3F71253@uiuc.edu>
	<18052.3946.224905.415905@almost.alerce.com>
Message-ID: <2159ED58-E6F4-4ED8-AC23-E8BAF69FE240@gmx.net>


On Jun 28, 2007, at 3:43 PM, George Hartzell wrote:

> I just did the experiment, and filename-insensitivity seems to be
> breaking something.
>
> I'm using an svn I picked up from http://www.codingmonkeys.de/mbo/.
>
> I reformatted a memory stick to be case sensitive and co of
>
>   bioperl/bioperl-live/tags/release-0-9-2/t
>
> worked, then I made a directory in my home dir (normal mac thing) and
> got the same error as above.

You picked up a rename of a file from lower case extension to upper  
case extension. Unfortunately, there are several months between  
adding the upper-case and removing the lower-case version.

We can reconstruct what happened with this using svn log on the  
directory (this does not require a checkout):

$ svn log --verbose svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/trunk/t/data

Searching for HUMBETGLOA yields the following two commits that added  
one and removed the other:

------------------------------------------------------------------------
r2245 | jason | 2001-12-08 11:59:05 -0500 (Sat, 08 Dec 2001) | 2 lines
Changed paths:
    M /bioperl-live/trunk/t/SearchIO.t
    A /bioperl-live/trunk/t/data/HUMBETGLOA.FASTA
    A /bioperl-live/trunk/t/data/cysprot1.FASTA

added tests for FASTA

------------------------------------------------------------------------
r2877 | jason | 2002-03-11 22:39:40 -0500 (Mon, 11 Mar 2002) | 2 lines
Changed paths:
    A /bioperl-live/trunk/t/data/HUMBETGLOA.fa
    D /bioperl-live/trunk/t/data/HUMBETGLOA.fasta

renaming file to avoid clobbering on windows

Unfortunately, both files are in the tag (again, no checkout required):

$ svn list svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data | grep HUMBETGLOA | grep -i fasta
HUMBETGLOA.FASTA
HUMBETGLOA.fasta

We can remove the offending version from the repository (again,  
without needing a checkout):

$ svn rm svn+ssh://dev.open-bio.org/home/hartzell/bioperl/bioperl-live/tags/release-0-9-2/t/data/HUMBETGLOA.fasta

I did this, and now the tag checks out fine on OSX. Can anyone confirm?

(BTW the ability to operate on the repository w/o needing a checkout  
is another advantage of svn)
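
To check the other tags and directories for the same kind of clash without
grepping for one file at a time, something along these lines should do. It
is a sketch: case_collisions is a made-up name, and it only reports names
that collide within a single listing.

```shell
# Sketch: read one file name per line on stdin and print, in lower case,
# every name that occurs more than once when case is ignored.
case_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Usage, with TAG_URL set to the directory to inspect:
#   svn list "$TAG_URL" | case_collisions
```

An empty result means the directory should check out cleanly on a
case-insensitive filesystem.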

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================