From cjfields at uiuc.edu  Sun Oct  1 13:05:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 12:05:25 -0500
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
References: <000001c6e3e6$81630010$15327e82@pyrimidine>	<6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net>
	<79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu>
	<451E3707.4090400@sendu.me.uk>
	<0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu>
	<84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
Message-ID: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>


On Sep 30, 2006, at 4:43 PM, Hilmar Lapp wrote:

>
> On Sep 30, 2006, at 10:57 AM, Chris Fields wrote:
>
>> There should be a failed test to let us know of the problem.  As
>> currently set up, the XEMBL server failure doesn't show up in
>> Test::Harness test summaries.  Biblio_biofetch.t had similar
>> problems before Brian's fixes.
>
> Just keep in mind that you may not want somebody's CPAN installation
> to fail (or require a 'forced' install) just because some server
> happens to be down for maintenance.
>
> 	-hilmar

I don't think this would be a problem unless users specifically set  
BIOPERLDEBUG to 1, which is something most people don't bother with  
before installation (and probably not something we should promote for  
normal installation anyway).  So, for CPAN installation we would  
suggest that BIOPERLDEBUG be 0 or not set at all, and outline the  
reasons why.

The idea is to retain the current behavior (remote DB access is not
attempted unless BIOPERLDEBUG is set to 1) and apply it to all tests
requiring such access.  When BIOPERLDEBUG is not set, only those tests
are skipped (not the rest of the tests in the file, as happens
currently).  If BIOPERLDEBUG is set, the next test checks the URL and
passes or fails based on the specific value of $@; the remaining remote
tests then run or are skipped based on the mere presence of $@, which
indicates some URL issue.  You can do this with Test::More, but I'm not
sure it can be done with Test.pm or Test::Simple.
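
A minimal sketch of that flow with Test::More, under stated assumptions:
fetch_remote() and the example.org URL are hypothetical stand-ins for
whatever remote call a real test makes, and the helper always dies here
to simulate a server that is down.  It is not taken from any existing
BioPerl test.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical stand-in for a real remote call (e.g. a database fetch).
# It always dies here, simulating a server that is down.
sub fetch_remote {
    my ($url) = @_;
    die "500 Can't connect to $url\n";
}

# Tests that need no network always run.
ok(1, 'local test');

SKIP: {
    # Retain current behavior: no remote access unless BIOPERLDEBUG is set.
    skip('remote tests not requested (BIOPERLDEBUG not set)', 2)
        unless $ENV{BIOPERLDEBUG};

    my $result = eval { fetch_remote('http://example.org/xembl') };
    my $err    = $@;    # copy at once; later calls may reset $@

    # The URL check is itself a test, so a dead server is reported
    # as a failure instead of disappearing into the skip count.
    ok(!$err, 'remote server responded') or diag("remote error: $err");

    # Dependent tests run only when the URL check found no error.
    skip('remote server unavailable', 1) if $err;
    ok(defined $result, 'remote call returned data');
}
```

Without BIOPERLDEBUG only the two remote tests are skipped and the local
test still runs; with it set, the failed URL check should appear in the
Test::Harness summary as a failure.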

The current behavior just skips all tests based on a single failed
URL.  Then Test::Harness, as currently set up, shows skipped tests as
passed.  For the run I posted previously, where the XEMBL_DB.t remote DB
tests failed, I also ran all tests (make test) and got this, which
doesn't tell us that the remote URL failed:

-----------------------------------------

...
t/WABA.......................ok
t/XEMBL_DB...................ok
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext  
is not installed or is installed incorrectly - skipping ztr.t tests
ok
All tests successful, 5 subtests skipped.

-----------------------------------------


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Oct  1 13:17:24 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 12:17:24 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880609301610w7b838543t5f8ba7313d285915@mail.gmail.com>
References: <b99962880609271039s75cc4af4nc109cd637b5b267@mail.gmail.com>
	<7A592EAB-A869-4A6C-BFA8-F73F3DFD8F5B@gmx.net>
	<09FB1EB0-2C1C-4FCF-8339-E78556EFEFF2@uiuc.edu>
	<b99962880609280842w47401efnd6d00ff2a6e7fd98@mail.gmail.com>
	<8D75FE6D-C02D-4A86-93FA-B7256050AF11@uiuc.edu>
	<b99962880609280910i68a649fw38a4a77d514eccf@mail.gmail.com>
	<40155903-555A-4662-BCCE-38E5E3784118@uiuc.edu>
	<54E79A5F-5446-4D8E-AD26-B70894048D60@gmx.net>
	<b99962880609301444h3e0a8bd2y5d3ecb2ca9e222e6@mail.gmail.com>
	<1D69005A-DF0E-4F37-93FE-7577A32CC625@gmx.net>
	<b99962880609301610w7b838543t5f8ba7313d285915@mail.gmail.com>
Message-ID: <CAD572AC-B108-4520-8335-6B2F138905C9@uiuc.edu>

The '-w' flag on the shebang line is the source of those errors.  I  
never set it anymore on Windows due to this; I just use the 'use  
warnings' pragma.

If you run a test directly with 'perl -I. t/test.t' you can normally
get around the '-w' that 'make test' imposes.
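
A rough sketch of the general mechanism, not a diagnosis of the
ActivePerl quirk itself: '-w' turns on the global $^W flag, which
applies to every file compiled without its own warnings pragma, while
'use warnings' is lexically scoped to the file that declares it.  The
Demo::Legacy package and the temporary file are made up for the
illustration, and the sketch assumes the module being recompiled does
not itself say 'use warnings'.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny module WITHOUT 'use warnings', standing in for older code.
my ($fh, $file) = tempfile(SUFFIX => '.pm', UNLINK => 1);
print {$fh} "package Demo::Legacy;\nsub new { return bless {}, shift }\n1;\n";
close $fh or die "close failed: $!";

# Collect warnings instead of printing them.
my @caught;
$SIG{__WARN__} = sub { push @caught, $_[0] };

# Compile the file twice with the global flag ($^W, i.e. '-w') forced
# on or off, and count the "Subroutine ... redefined" warnings.
sub redefine_warnings {
    my ($global_flag) = @_;
    local $^W = $global_flag;
    @caught = ();
    for (1 .. 2) {
        defined(do $file) or die "cannot load $file: " . ($@ || $!);
    }
    return scalar grep { /redefined/ } @caught;
}

my $without_w = redefine_warnings(0);
my $with_w    = redefine_warnings(1);
print "redefine warnings with \$^W off: $without_w\n";
print "redefine warnings with \$^W on:  $with_w\n";
```

The calling script has 'use warnings' in both runs; only the global
flag changes, which is why dropping '-w' from the shebang line can
silence the messages.  Why the modules get compiled twice in the first
place is a separate question that this sketch does not address.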

I will try running tests on bioperl-db and bioperl tomorrow on WinXP  
to confirm these.

Chris

On Sep 30, 2006, at 6:10 PM, Seth Johnson wrote:

> How do I get rid of all of the warnings for "redefined subroutines"
> during the test??  It clutters the output and I can't see the errors.
>
> On 9/30/06, Hilmar Lapp <hlapp at gmx.net> wrote:
>>
>> It doesn't shed more light but it does raise an alert flag. All tests
>> are supposed to pass. The fact that they don't means the problems you
>> are seeing have nothing to do with your specific data or script.
>>
>> First off - can anyone else confirm those errors using the latest
>> Bioperl-db and Bioperl?
>>
>> Second - Seth could you run those tests individually, e.g., using
>>
>>         $ make test test_02species TEST_VERBOSE=1
>>
>> and similarly for the other tests that have failures and post the
>> output. Let's start with 02species and 03simpleseq.
>>
>>         -hilmar
>>
>> On Sep 30, 2006, at 5:44 PM, Seth Johnson wrote:
>>
>>> There are errors during the test. Here's their summary:
>>> ____________________________
>>> Failed Test     Stat Wstat Total Fail  Failed  List of Failed
>>> -------------------------------------------------------------
>>> t\02species.t                 65    2   3.08%  63 65
>>> t\03simpleseq.t    1   256    59  106 179.66%  7-59
>>> t\04swiss.t                   52   14  26.92%  25 27-34 38-42
>>> t\12ontology.t     2   512   738 1471 199.32%  3-738
>>> t\16obda.t                    12    3  25.00%  10-12
>>> ____________________________
>>>
>>> Maybe that can shed some light on the problem?!?!
>>>
>>> On 9/29/06, Hilmar Lapp <hlapp at gmx.net> wrote:
>>>
>>> This may in fact be a knock-on effect of the fixes? <sigh>
>>>
>>> Seth, did you run the test suite that comes with bioperl-db, and did
>>> you get any errors?
>>>
>>>         -hilmar
>>>
>>> On Sep 28, 2006, at 2:26 PM, Chris Fields wrote:
>>>
>>>> Seth,
>>>>
>>>> The organism issue is a bug and has been reported, though I thought
>>>> it was fixed.
>>>>
>>>> The lack of the date and the version is a bit odd, but there have
>>>> been a lot of changes lately to bioperl-live (core bioperl in CVS),
>>>> and a few to bioperl-db.  How old are your bioperl and bioperl-db
>>>> installations?  Hilmar, any additional thoughts?
>>>>
>>>> Chris
>>>>
>>>> On Sep 28, 2006, at 11:10 AM, Seth Johnson wrote:
>>>>
>>>>> Thank you.  That takes care of that; however, I do have another
>>>>> gripe.  When running my script, quoted before, with "my $out =
>>>>> Bio::SeqIO->newFh('-format' => 'genbank');", I have several key
>>>>> pieces of information missing.  The most important one is the
>>>>> version number.  There's also a date missing, and the source
>>>>> organism name is corrupted.  Here's what I get:
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>> LOCUS       NM_014580               2145 bp    dna     linear   UNK
>>>>> DEFINITION  Homo sapiens solute carrier family 2, (facilitated glucose
>>>>>             transporter) member 8 (SLC2A8), mRNA.
>>>>> ACCESSION   NM_014580
>>>>> SOURCE      sapiens.
>>>>>   ORGANISM  sapiens
>>>>>             Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; Bilateria;
>>>>>             Coelomata; Deuterostomia; Chordata; Craniata; Vertebrata;
>>>>>             Gnathostomata; Teleostomi; Euteleostomi; Sarcopterygii; Tetrapoda;
>>>>>             Amniota; Mammalia; Theria; Eutheria; Euarchontoglires; Primates;
>>>>>             Haplorrhini; Simiiformes; Catarrhini; Hominoidea; Hominidae;
>>>>>             Homo/Pan/Gorilla group; Homo.
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>> All of the missing information is stored in BioSQL and theoretically
>>>>> should be in the output.  Here's how the NCBI GenBank file looks:
>>>>>
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>> LOCUS       NM_014580               2145 bp    mRNA    linear   PRI 17-OCT-2005
>>>>> DEFINITION  Homo sapiens solute carrier family 2, (facilitated glucose
>>>>>             transporter) member 8 (SLC2A8), mRNA.
>>>>> ACCESSION   NM_014580
>>>>> VERSION     NM_014580.3  GI:51870928
>>>>> KEYWORDS    .
>>>>> SOURCE      Homo sapiens (human)
>>>>>   ORGANISM  Homo sapiens
>>>>>             Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
>>>>>             Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
>>>>>             Catarrhini; Hominidae; Homo.
>>>>>
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>>
>>>>> On 9/28/06, Chris Fields <cjfields at uiuc.edu> wrote:
>>>>>>
>>>>>> Those are from the excessively paranoid '-w' flag on the shebang
>>>>>> line.  If you remove the flag but add the 'use warnings' pragma the
>>>>>> 'subroutine x redefined' warnings go away.  This, BTW, is one of the
>>>>>> quirks of the ActivePerl distribution; other OSs don't have the same
>>>>>> problem.
>>>>>>
>>>>>> The 'solution' described on that page is actually a workaround, not
>>>>>> a bugfix.  It causes problems with stack traces in error handling
>>>>>> but seems harmless beyond that.  I haven't been able to find a
>>>>>> satisfactory fix which works on all OSs.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>
>>>>>> On Sep 28, 2006, at 10:42 AM, Seth Johnson wrote:
>>>>>>
>>>>>>> This is under Windows, but using ActiveState Komodo 3.5 and their
>>>>>>> latest Perl for Windows and the latest BioPerl & BioPerl-db from CVS.
>>>>>>>
>>>>>>> I actually just stumbled upon a solution.  It's described in
>>>>>>> "Installing Bioperl on Windows": adding a comma after $class: in
>>>>>>> the Bio::Root::Root throw() subroutine.  Thanks for the hint about
>>>>>>> the platform I run it on.
>>>>>>>
>>>>>>> The code works now, BUT it spews a whole bunch of warnings about
>>>>>>> "Subroutine .... redefined":
>>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 88.
>>>>>>> Subroutine object_id redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 128.
>>>>>>> Subroutine version redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 150.
>>>>>>> Subroutine authority redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 171.
>>>>>>> Subroutine namespace redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 192.
>>>>>>> Subroutine display_name redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 217.
>>>>>>> Subroutine description redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 241.
>>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 201.
>>>>>>> Subroutine verbose redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 234.
>>>>>>> Subroutine _register_for_cleanup redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 246.
>>>>>>> Subroutine _unregister_for_cleanup redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 256.
>>>>>>> Subroutine _cleanup_methods redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 263.
>>>>>>> Subroutine throw redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 316.
>>>>>>> Subroutine debug redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 379.
>>>>>>> Subroutine _load_module redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 398.
>>>>>>> Subroutine DESTROY redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 426.
>>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\RootI.pm line 117.
>>>>>>> Subroutine _initialize redefined at c:/Perl/site/lib/Bio\Root\RootI.pm line 128.
>>>>>>> ...
>>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>>
>>>>>>>
>>>>>>> On 9/28/06, Chris Fields <cjfields at uiuc.edu> wrote:
>>>>>>>
>>>>>>> I had problems with bioperl-db on native WinXP (not cygwin), but I
>>>>>>> did manage to get it running in cygwin with some effort.  The issue
>>>>>>> on native WinXP was related to Bio::Root::Root::throw(), though.
>>>>>>>
>>>>>>> There is a bug and workaround filed on Bugzilla, but I haven't
>>>>>>> worked on it in a while (and the workaround has some problems as
>>>>>>> well).  I may try running it again to see what happens.
>>>>>>>
>>>>>>> http://bugzilla.open-bio.org/show_bug.cgi?id=1938
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On Sep 28, 2006, at 9:04 AM, Hilmar Lapp wrote:
>>>>>>>
>>>>>>>> Very odd. This is under Windows, presumably using Cygwin?
>>>>>>>>
>>>>>>>> The method Bio::Root::Root::throw() clearly exists, and
>>>>>>>> PersistentObject inherits from it. The exception it was trying to
>>>>>>>> throw has nothing to do with failure or success in finding the
>>>>>>>> database row (actually it did succeed since otherwise it wouldn't
>>>>>>>> construct the object) but with dynamically loading a class,
>>>>>>>> presumably Bio::DB::Persistent::Seq.
>>>>>>>>
>>>>>>>> Are you using the 1.5.x release of bioperl?
>>>>>>>>
>>>>>>>> Does anyone on the list have any experience with these sorts of
>>>>>>>> things on Windows?
>>>>>>>>
>>>>>>>> (Seth, I've moved this thread to the bioperl list, since this is
>>>>>>>> what the problem is about.)
>>>>>>>>
>>>>>>>>       -hilmar
>>>>>>>>
>>>>>>>> On Sep 27, 2006, at 1:39 PM, Seth Johnson wrote:
>>>>>>>>
>>>>>>>>> Hello guys,
>>>>>>>>>
>>>>>>>>> I successfully populated the biosql database, thanks to you.  Now
>>>>>>>>> I'm trying to retrieve a sequence from it following the example
>>>>>>>>> from the BOSC2003 slides and ran into an uninformative error (at
>>>>>>>>> least to me it doesn't mean anything).  I suspect that I'm missing
>>>>>>>>> something and hope you can point me in the right direction.
>>>>>>>>> Here's my source code:
>>>>>>>>>
>>>>>>>>> -----------------------------------------------------------------
>>>>>>>>> #!/usr/bin/perl -w
>>>>>>>>> use strict;
>>>>>>>>> use warnings;
>>>>>>>>>
>>>>>>>>> use Bio::Seq;
>>>>>>>>> use Bio::Seq::SeqFactory;
>>>>>>>>> use Bio::DB::SimpleDBContext;
>>>>>>>>> use Bio::DB::BioDB;
>>>>>>>>>
>>>>>>>>> my $dbc = Bio::DB::SimpleDBContext->new(
>>>>>>>>>     -driver => 'mysql',
>>>>>>>>>     -dbname => 'BioSQL_1',
>>>>>>>>>     -host => ' 192.168.1.3',
>>>>>>>>>     -user => 'xxxxx',
>>>>>>>>>     -pass => 'xxxxxx'
>>>>>>>>> );
>>>>>>>>>
>>>>>>>>> my $db = Bio::DB::BioDB->new(-database  => 'biosql',
>>>>>>>>>                             -dbcontext => $dbc);
>>>>>>>>>
>>>>>>>>> my $seq = Bio::Seq->new(-accession_number => 'NM_014580',
>>>>>>>>>                         -namespace        => 'refseq_H_sapiens');
>>>>>>>>> my $seqfact = Bio::Seq::SeqFactory->new(-type => 'Bio::Seq');
>>>>>>>>> my $adp = $db->get_object_adaptor($seq);
>>>>>>>>> my $dbseq = $adp->find_by_unique_key($seq, -obj_factory => $seqfact);
>>>>>>>>>
>>>>>>>>> my $out = Bio::SeqIO->newFh('-format' => 'EMBL');
>>>>>>>>> print $out $dbseq;
>>>>>>>>>
>>>>>>>>> exit;
>>>>>>>>>
>>>>>>>>> -----------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> Just when the "find_by_unique_key" function is executed I get the
>>>>>>>>> following error:
>>>>>>>>>
>>>>>>>>> ================================
>>>>>>>>> Undefined subroutine &Bio::Root::Root::throw called at
>>>>>>>>> c:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm line 199.
>>>>>>>>> ================================
>>>>>>>>>
>>>>>>>>> The sequence does exist in the database. I checked that.  Any
>>>>>>>>> ideas???
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Seth Johnson
>>>>>>>>> Senior Bioinformatics Associate
>>>>>>>>> _______________________________________________
>>>>>>>>> BioSQL-l mailing list
>>>>>>>>> BioSQL-l at lists.open-bio.org
>>>>>>>>> http://lists.open-bio.org/mailman/listinfo/biosql-l
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ===========================================================
>>>>>>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>>>>>>> ===========================================================
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Bioperl-l mailing list
>>>>>>>> Bioperl-l at lists.open-bio.org
>>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>>
>>>>>>> Christopher Fields
>>>>>>> Postdoctoral Researcher
>>>>>>> Lab of Dr. Robert Switzer
>>>>>>> Dept of Biochemistry
>>>>>>> University of Illinois Urbana-Champaign
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>>
>>>>>>>
>>>>>>> Seth Johnson
>>>>>>> Senior Bioinformatics Associate
>>>>>>>
>>>>>>> Ph: (202) 470-0900
>>>>>>> Fx: (775) 251-0358
>>>>>>
>>>>>> Christopher Fields
>>>>>> Postdoctoral Researcher
>>>>>> Lab of Dr. Robert Switzer
>>>>>> Dept of Biochemistry
>>>>>> University of Illinois Urbana-Champaign
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>>
>>>>>
>>>>> Seth Johnson
>>>>> Senior Bioinformatics Associate
>>>>>
>>>>> Ph: (202) 470-0900
>>>>> Fx: (775) 251-0358
>>>>
>>>> Christopher Fields
>>>> Postdoctoral Researcher
>>>> Lab of Dr. Robert Switzer
>>>> Dept of Biochemistry
>>>> University of Illinois Urbana-Champaign
>>>>
>>>>
>>>>
>>>
>>> --
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Best Regards,
>>>
>>>
>>> Seth Johnson
>>> Senior Bioinformatics Associate
>>>
>>> Ph: (202) 470-0900
>>> Fx: (775) 251-0358
>>
>> --
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>>
>
>
> -- 
> Best Regards,
>
>
> Seth Johnson
> Senior Bioinformatics Associate
>
> Ph: (202) 470-0900
> Fx: (775) 251-0358

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From osborne1 at optonline.net  Sun Oct  1 17:49:47 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Sun, 01 Oct 2006 17:49:47 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <20061001183214.GB12075@iucha.net>
Message-ID: <C145B03B.A8A5%osborne1@optonline.net>

Florin,

This is fixed in CVS now. What had happened is that the DIP file had some
minimal protein (node) entries where the only id available was DIP's
internal identifier. Not ideal to have to use these as accessions but
there's no other choice.

Thank you for the note, and in the future write to bioperl-l since there may
be others who are interested in hearing about what you've encountered.

Brian O.


On 10/1/06 2:32 PM, "Florin Iucha" <florin at iucha.net> wrote:

> Hello,
> 
> I have downloaded a CVS snapshot [1] of your module, bioperl-network, and
> I am using it to read the 20060402 edition release of the DIP [2] dataset.
> 
> Starting with the simple program you show in the man page:
> 
>    my $io = Bio::Network::IO->new(-format => 'psi',
>                                   -file   => $ARGV[0]);
> 
>    my $network = $io->next_network;
> 
> I get 772 instances of:
> 
>    Use of uninitialized value in string eq at
>    /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 326.
> 
> I don't know if it is just an annoyance or something bad, so you might
> want to take a look at it.
> 
> Thank you for your work,
> florin
> 
> [1] http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-network/
> [2] http://dip.doe-mbi.ucla.edu/


From osborne1 at optonline.net  Sun Oct  1 17:56:39 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Sun, 01 Oct 2006 17:56:39 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <20061001211844.GC12075@iucha.net>
Message-ID: <C145B1D7.A8A8%osborne1@optonline.net>

Florin,

I'm not seeing any segmentation fault using the same file you're using as
input (dip20060402.mif). I'm assuming you don't see this error when you use
smaller files as input, like those in the t/data directory.

When I watch the script in top I see Perl using about 135Mb (RSIZE) right
before the script exits. How much memory do you use?

Thank you for the note, and in the future write to bioperl-l since there may
be others who are interested in hearing about what you've encountered.

Brian O.


On 10/1/06 5:18 PM, "Florin Iucha" <florin at iucha.net> wrote:

> On Sun, Oct 01, 2006 at 01:32:14PM -0500, Florin Iucha wrote:
>> I have downloaded a CVS snapshot [1] of your module, bioperl-network, and
>> I am using it to read the 20060402 edition release of the DIP [2] dataset.
> 
> Using the attached script, I am getting a segmentation fault at the
> end, right after printing "That's all, Folks!"  Maybe some cleanup is
> going off in a wrong direction.
> 
> florin


From florin at iucha.net  Sun Oct  1 20:24:03 2006
From: florin at iucha.net (Florin Iucha)
Date: Sun, 1 Oct 2006 19:24:03 -0500
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <C145B1D7.A8A8%osborne1@optonline.net>
References: <20061001211844.GC12075@iucha.net>
	<C145B1D7.A8A8%osborne1@optonline.net>
Message-ID: <20061002002403.GD12075@iucha.net>

On Sun, Oct 01, 2006 at 05:56:39PM -0400, Brian Osborne wrote:
> I'm not seeing any segmentation fault using the same file you're using as
> input (dip20060402.mif). I'm assuming you don't see this error when you use
> smaller files as input, like those in the t/data directory.

The t/data files are fine.

Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the
MINT [1] database does not produce the crash.  It has a new warning, however:

   Can't call method "text" on an undefined value at
   /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290.

> When I watch the script in top I see Perl using about 135Mb (RSIZE) right
> before the script exits. How much memory do you use?

"ps ux" tells me VSZ = 272788 and RSZ = 254992. This is on x86-64 with
64 bit perl.  The box has 2 GB of physical memory so these numbers
don't seem to be a concern.

> Thank you for the note, and in the future write to bioperl-l since there may
> be others who are interested in hearing about what you've encountered.

Do'h! You have the list address loud and clear in three places, but I got
your contact info from the AUTHORS.  Will use the proper channel from now
on!

Thanks,
florin

[1] ftp://mint.bio.uniroma2.it/pub/release/psi1/

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra

From cjfields at uiuc.edu  Mon Oct  2 00:35:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 23:35:22 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880609301635w421fae0er3497ba655679f0bc@mail.gmail.com>
Message-ID: <000001c6e5dc$2eceabe0$15327e82@pyrimidine>

Seth,

What version of MySQL and perl are you using?  I'm using MySQL 5.0.18 (but
am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.8.819.

I ran into a few problems with bioperl-db tests which were unrelated to the
ones below, but I'm wondering if it is a difference in MySQL versions.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Seth Johnson
> Sent: Saturday, September 30, 2006 6:35 PM
> To: Hilmar Lapp
> Cc: Chris Fields; Bioperl List
> Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> 
> Here're complete test details:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

...

> FAILED tests 10-12
>     Failed 3/12 tests, 75.00% okay
> Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> -------------------------------------------------------------------------------
> t\02species.t                 65    2   3.08%  63 65
> t\03simpleseq.t    1   256    59  106 179.66%  7-59
> t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> t\12ontology.t     2   512   738 1471 199.32%  3-738
> t\16obda.t                    12    3  25.00%  10-12


From torsten.seemann at infotech.monash.edu.au  Mon Oct  2 02:06:50 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Mon, 02 Oct 2006 16:06:50 +1000
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
References: <451C8ED8.2060003@infotech.monash.edu.au>
	<451CC40D.2030401@sendu.me.uk>
	<2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
Message-ID: <4520AC7A.1050009@infotech.monash.edu.au>


 >>> I have removed all use/@ISA Bio::Root::Object references from
 >>> bioperl-live, except for those in Bio::Root::* itself:

 >> So I'd say they're both relics that can be removed. In fact I was
 >> planning on getting rid of all references to both of these modules
 >> before you did, so thanks! :)

> I think they can go. It's probably a pre-1.0 deprecation that somehow  
> was never followed through on.

Today I did a fresh CVS checkout of bioperl-live, and deleted the 
following modules and tests, and all tests passed with BIOPERLDEBUG=0

     * Bio::Root::Err
     * Bio::Root::Global
     * Bio::Root::IOManager
     * Bio::Root::Object
     * Bio::Root::Storable
     * Bio::Root::Utilities  # may be used by third parties?
     * Bio::Root::Vector
     * Bio::Root::Xref
     * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
     * t/RootStorable.t

Should we schedule them for deprecation, or deprecate them immediately,
since Hilmar suggested they were meant to be deprecated long ago?

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From bix at sendu.me.uk  Mon Oct  2 05:40:02 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 10:40:02 +0100
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
References: <000001c6e3e6$81630010$15327e82@pyrimidine>	<6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net>	<79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu>	<451E3707.4090400@sendu.me.uk>	<0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu>	<84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
	<3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
Message-ID: <4520DE72.4000603@sendu.me.uk>

Chris Fields wrote:
>
> The idea is to retain current behavior (remote DB access will not be  
> run unless BIOPERLDEBUG is set to 1) and apply it to all tests  
> requiring such access.  Otherwise, just those tests are skipped (and  
> not the rest of the tests, which occurs currently).  If BIOPERLDEBUG  
> is set, the next tests would check the URL, which passes/fails (based  
> on the specific value of $@), and runs/skips tests based on the mere  
> presence of $@, which indicates some URL issue.  You can do this with  
> Test::More, but I'm not sure this can be done with Test.pm or  
> Test::Simple.

Firstly, BIOPERLDEBUG should not be abused; it should be used only when 
you want to see extra debugging messages. There should be another 
variable that you can set to choose if network-requiring tests are run, 
and it should also be a configurable choice when you run perl Makefile.PL.

(But changing this isn't going to happen for 1.5.2)

When the server problem is ambiguous we should not fail the test. Just 
make the skip message visible and pass all ok...


> The current behavior just skips all tests based on a single failed  
> URL.  Then, Test::Harness, as currently set, shows skipped tests as  
> passed.  The last run I posted previously where XEMBL_DB.t remote DB  
> tests failed, I also ran all tests (make test) and get this, which  
> doesn't tell us that the remote URL failed:
> 
> -----------------------------------------
> 
> ...
> t/WABA.......................ok
> t/XEMBL_DB...................ok
> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext  
> is not installed or is installed incorrectly - skipping ztr.t tests
> ok
> All tests successful, 5 subtests skipped.

All you have to do to make it visible is start the skip message with the
word 'Skip':

skip('Skip server may be down',1);

...
t/WABA.......................ok
t/XEMBL_DB...................ok
         1/9 skipped: server may be down
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is
not installed or is installed incorrectly - skipping ztr.t tests
t/ztr........................ok


It's nicer when using Test::More.
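
As a rough illustration of both points with Test::More (a sketch only,
not code from any BioPerl test): BIOPERL_NETWORK_TESTS is a made-up
variable name standing in for the separate switch suggested above, and
query_server() is a hypothetical helper that always dies here to
simulate an unreachable server.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical switch, kept separate from BIOPERLDEBUG so that the
# debug variable only controls debugging output.
sub network_tests_wanted {
    return $ENV{BIOPERL_NETWORK_TESTS} ? 1 : 0;
}

# Hypothetical stand-in for a remote query; always fails in this sketch.
sub query_server {
    die "500 Can't connect (server may be down)\n";
}

ok(1, 'local test');

SKIP: {
    skip('network tests not requested', 1) unless network_tests_wanted();

    my $answer = eval { query_server() };
    my $err    = $@;
    chomp $err;

    # An ambiguous server problem is skipped with a visible reason
    # instead of being counted as a failure.
    skip("server may be down: $err", 1) if $err;

    ok(defined $answer, 'server returned a result');
}
```

With Test::More the reason is passed straight to skip(), so the
'Skip' prefix used for Test.pm above is not part of this sketch.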

From bix at sendu.me.uk  Mon Oct  2 05:55:27 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 10:55:27 +0100
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au>
References: <451C8ED8.2060003@infotech.monash.edu.au>	<451CC40D.2030401@sendu.me.uk>	<2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
	<4520AC7A.1050009@infotech.monash.edu.au>
Message-ID: <4520E20F.6040406@sendu.me.uk>

Torsten Seemann wrote:
>  >>> I have removed all use/@ISA Bio::Root::Object references from
>  >>> bioperl-live, except for those in Bio::Root::* itself:
> 
>  >> So I'd say they're both relics that can be removed. In fact I was
>  >> planning on getting rid of all references to both of these modules
>  >> before you did, so thanks! :)
> 
>> I think they can go. It's probably a pre-1.0 deprecation that somehow  
>> was never followed through on.
> 
> Today I did a fresh CVS checkout of bioperl-live, and deleted the 
> following modules and tests, and all tests passed with BIOPERLDEBUG=0
> 
>      * Bio::Root::Err
>      * Bio::Root::Global
>      * Bio::Root::IOManager
>      * Bio::Root::Object
>      * Bio::Root::Storable
>      * Bio::Root::Utilities  # may be used by third parties?
>      * Bio::Root::Vector
>      * Bio::Root::Xref
>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
>      * t/RootStorable.t
> 
> Should we schedule for deprecation, or deprecate immediately as Hilmar 
> suggested they were meant to be deprecated long ago ?

I'm happy to get rid of them all straight away. Does anyone object?

From florin at iucha.net  Sun Oct  1 21:40:07 2006
From: florin at iucha.net (Florin Iucha)
Date: Sun, 1 Oct 2006 20:40:07 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on
	AMD64
Message-ID: <20061002014007.GG12075@iucha.net>

Hello,

I am trying to install bioperl-network from CVS.  I found this to
require bioperl from CVS, which requires bioperl-ext from CVS.
I have compiled and installed io_lib 1.10.1.

After running "perl Makefile.PL; make test" in bioperl-ext I see a lot of
sources being compiled, then:

cc -c  -I./libs -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O2   -DVERSION=\"1.5.1\" -DXS_VERSION=\"1.5.1\" -fPIC "-I/usr/lib/perl/5.8/CORE"  -DPOSIX -DNOERROR Align.c
Running Mkbootstrap for Bio::Ext::Align ()
chmod 644 Align.bs
rm -f ../blib/arch/auto/Bio/Ext/Align/Align.so
cc  -shared -L/usr/local/lib Align.o  -o ../blib/arch/auto/Bio/Ext/Align/Align.so libs/libsw.a  \
           -lm          \

/usr/bin/ld: libs/libsw.a(aln.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
libs/libsw.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[1]: *** [../blib/arch/auto/Bio/Ext/Align/Align.so] Error 1
make[1]: Leaving directory `/scratch/dmbio/tools/bioperl-ext/Bio/Ext/Align'
make: *** [subdirs] Error 2

This is on a Debian AMD64 box:

florin at zeus $ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release x86_64-linux-gnu
Thread model: posix
gcc version 4.1.2 20060901 (prerelease) (Debian 4.1.1-13)
florin at zeus $ perl -V
Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.6.16-1-vserver-amd64-k8, archname=x86_64-linux-gnu-thread-multi
    uname='linux excelsior 2.6.16-1-vserver-amd64-k8 #2 smp tue apr 4 03:40:49 utc 2006 x86_64 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.8 -Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=define uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion='', gccversion='4.1.2 20060729 (prerelease) (Debian 4.1.1-10)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.3.6.so, so=so, useshrplib=true, libperl=libperl.so.5.8.8
    gnulibc_version='2.3.6'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'


Characteristics of this binary (from libperl):
  Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT
                        PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_64_BIT_ALL
                        USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES
                        USE_PERLIO USE_REENTRANT_API

The compiler command line for aln.o is lacking -fPIC:

cc -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN
-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -DPOSIX -DNOERROR   -c -o aln.o aln.c

Adding -fPIC to the CCFLAGS variable in Bio/Ext/Align/Makefile and
Makefile seems to take the build further, but it then fails with a similar
error in Bio/SeqIO/staden/_Inline/build/Bio/SeqIO/staden/read. That
Makefile seems to be regenerated every time I run 'make test' in the
top level directory.

The error in ../staden/read is:

rm -f blib/arch/auto/Bio/SeqIO/staden/read/read.so
cc  -shared -L/usr/local/lib read.o  -o blib/arch/auto/Bio/SeqIO/staden/read/read.so    \
           -L/usr/local/lib -lread -lz          \

/usr/bin/ld: /usr/local/lib/libread.a(libread_a-Read.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
/usr/local/lib/libread.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [blib/arch/auto/Bio/SeqIO/staden/read/read.so] Error 1

So, the questions appear to be:
   - should "-fPIC" be appended to CFLAGS in the generated Makefiles?
   - is there anything wrong with io_lib flags?
   - has anybody built bioperl-ext on AMD64?

I can help with debugging or testing if given a gentle nudge in the right
direction, but I have little experience with the interactions between perl
and static libraries on 64 bit.
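
For reference, a minimal sketch of how a build script could pick up the
position-independent-code flag that perl itself was configured with,
instead of hard-coding -fPIC.  It uses only the core Config module; the
helper name pic_ccflags is made up for illustration and is not taken from
the bioperl-ext Makefile.PL.  On the box described above, cccdlflags is
'-fPIC' according to the perl -V output.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: compiler flags for objects that will be linked
# into a shared object, built from the values perl was configured with.
sub pic_ccflags {
    my $ccflags = defined $Config{ccflags}    ? $Config{ccflags}    : '';
    my $pic     = defined $Config{cccdlflags} ? $Config{cccdlflags} : '';

    # Drop empty or whitespace-only entries before joining.
    return join ' ', grep { /\S/ } $ccflags, $pic;
}

print "CCFLAGS for static helper objects: ", pic_ccflags(), "\n";
```

Whether the generated Makefiles should do this for libsw.a, and whether
io_lib needs the same treatment, is exactly the open question above.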

Thanks,
florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra

From bix at sendu.me.uk  Mon Oct  2 06:52:47 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 11:52:47 +0100
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
 on	AMD64
In-Reply-To: <20061002014007.GG12075@iucha.net>
References: <20061002014007.GG12075@iucha.net>
Message-ID: <4520EF7F.40908@sendu.me.uk>

Florin Iucha wrote:
> Hello,
> 
> I am trying to install bioperl-network from CVS.  I found this to
> require bioperl from CVS, which requires bioperl-ext from CVS.

I can't help with the compile problems you encountered (other than to 
say I also have problems under AMD64), but from where did you get the 
idea that bioperl (live/core) requires bioperl-ext? It doesn't, though 
recent changes to Makefile.PL may give that impression...


From cjfields at uiuc.edu  Mon Oct  2 08:26:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 07:26:57 -0500
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <4520DE72.4000603@sendu.me.uk>
References: <000001c6e3e6$81630010$15327e82@pyrimidine>	<6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net>	<79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu>	<451E3707.4090400@sendu.me.uk>	<0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu>	<84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
	<3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
	<4520DE72.4000603@sendu.me.uk>
Message-ID: <DAAC7FDC-0C03-4345-9E09-DBF04D521628@uiuc.edu>


On Oct 2, 2006, at 4:40 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>> The idea is to retain current behavior (remote DB access will not be
>> run unless BIOPERLDEBUG is set to 1) and apply it to all tests
>> requiring such access.  Otherwise, just those tests are skipped (and
>> not the rest of the tests, which occurs currently).  If BIOPERLDEBUG
>> is set, the next tests would check the URL, which passes/fails (based
>> on the specific value of $@), and runs/skips tests based on the mere
>> presence of $@, which indicates some URL issue.  You can do this with
>> Test::More, but I'm not sure this can be done with Test.pm or
>> Test::Simple.
>
> Firstly, BIOPERLDEBUG should not be abused; it should be used only when
> you want to see extra debugging messages. There should be another
> variable that you can set to choose if network-requiring tests are run,
> and it should also be a configurable choice when you run perl Makefile.PL.
>
> (But changing this isn't going to happen for 1.5.2)
>
> When the server problem is ambiguous we should not fail the test. Just
> make the skip message visible and pass all ok...

I agree, as well as with your assessment of BIOPERLDEBUG (which I  
alluded to in a previous post).  Torsten suggested creating a new  
env. variable for network tests.

It's obvious this won't be done before 1.5.2, but we can make plans  
towards the next release.

>> The current behavior just skips all tests based on a single failed
>> URL.  Then, Test::Harness, as currently set, shows skipped tests as
>> passed.  The last run I posted previously where XEMBL_DB.t remote DB
>> tests failed, I also ran all tests (make test) and get this, which
>> doesn't tell us that the remote URL failed:
>>
>> -----------------------------------------
>>
>> ...
>> t/WABA.......................ok
>> t/XEMBL_DB...................ok
>> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext
>> is not installed or is installed incorrectly - skipping ztr.t tests
>> ok
>> All tests successful, 5 subtests skipped.
>
> All you have to do to make it visible is start the skip message with the
> word 'Skip':
>
> skip('Skip server may be down',1);
>
> ...
> t/WABA.......................ok
>
> t/XEMBL_DB...................ok
>
>          1/9 skipped: server may be down
> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is
> not installed or is installed incorrectly - skipping ztr.t tests
> t/ztr........................ok
>
>
> It's nicer when using Test::More.

Okay, if Test::Harness picks that up it would be okay.  We could use  
skip blocks to skip subsets of tests that require remote access (like  
SeqFeature.t) as opposed to skipping all tests.
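
A minimal sketch of what such a skip block could look like with
Test::More.  The environment variable name BIOPERL_NETWORK_TESTS and the
fetch_remote() stand-in are both assumptions made up for illustration;
the thread has only agreed that the switch should not be BIOPERLDEBUG.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical switch for network-requiring tests (name is an assumption).
my $run_network = $ENV{BIOPERL_NETWORK_TESTS} ? 1 : 0;

ok(1, 'local test that never needs the network');

SKIP: {
    # Skip only the two remote tests, not the whole script.
    skip('Skip network tests not requested', 2) unless $run_network;

    my $seq = eval { fetch_remote('J00522') };
    if ($@) {
        my $why = $@;
        chomp $why;
        # An ambiguous server problem is reported as a skip, not a failure.
        skip("Skip server may be down: $why", 2);
    }

    ok(defined $seq, 'got a record back');
    is($seq->{id}, 'J00522', 'record has the requested id');
}

# Stand-in for a real remote query; a real one would die on a bad URL.
sub fetch_remote {
    my ($id) = @_;
    return { id => $id };
}
```

With the variable unset, the script still reports three tests, two of
them skipped, so the rest of the file keeps running either way.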

I think we want to avoid promoting running tests with BIOPERLDEBUG
(or similar) set for everyday installation anyway (such as from CPAN,
as Hilmar points out).  It's not something everybody installing a new
BioPerl should be doing unless they run into problems.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From florin at iucha.net  Mon Oct  2 08:15:06 2006
From: florin at iucha.net (Florin Iucha)
Date: Mon, 2 Oct 2006 07:15:06 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
	on	AMD64
In-Reply-To: <4520EF7F.40908@sendu.me.uk>
References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk>
Message-ID: <20061002121506.GB14409@iucha.net>

On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
> Florin Iucha wrote:
> > I am trying to install bioperl-network from CVS.  I found this to
> > require bioperl from CVS, which requires bioperl-ext from CVS.
> 
> I can't help with the compile problems you encountered (other than to 
> say I also have problems under AMD64), but from where did you get the 
> idea that bioperl (live/core) requires bioperl-ext? It doesn't, though 
> recent changes to Makefile.PL may give that impression...

When running the tests for bioperl-live, some of them mention that 'this
test has been skipped since $foo is not available', and I found the
'foos' in bioperl-ext.

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra

From bix at sendu.me.uk  Mon Oct  2 10:05:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 15:05:11 +0100
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
 on	AMD64
In-Reply-To: <20061002121506.GB14409@iucha.net>
References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk>
	<20061002121506.GB14409@iucha.net>
Message-ID: <45211C97.2060800@sendu.me.uk>

Florin Iucha wrote:
> On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
>> Florin Iucha wrote:
>>> I am trying to install bioperl-network from CVS.  I found this to
>>> require bioperl from CVS, which requires bioperl-ext from CVS.
>> I can't help with the compile problems you encountered (other than to 
>> say I also have problems under AMD64), but from where did you get the 
>> idea that bioperl (live/core) requires bioperl-ext? It doesn't, though 
>> recent changes to Makefile.PL may give that impression...
> 
> Running the tests for bioperl-live mention in some places that 'this
> test has been skipped since $foo is not available' and I found the
> 'foos' in bioperl-ext.

Right, yes. The idea is, you'd only need to install bioperl-ext if you 
wanted to use the modules that the complaining tests test.
So if none of the things that were skipped matter to you, don't install ext.

I guess this needs to be clarified in documentation somewhere.
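
Until that is documented, a quick way to see which optional modules are
actually present is to try loading them.  This is only a sketch using
core Perl; have_module is a made-up helper name, and the module list is
just an example.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded.
sub have_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ('Bio::SeqIO::staden::read', 'File::Spec') {
    printf "%-28s %s\n", $module,
        have_module($module) ? 'available' : 'missing (related tests will skip)';
}
```

If everything reported as missing belongs to features you do not use,
there is no need to install bioperl-ext.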

From cjfields at uiuc.edu  Mon Oct  2 10:13:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:13:56 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au>
Message-ID: <001801c6e62d$02c883d0$15327e82@pyrimidine>


>  >>> I have removed all use/@ISA Bio::Root::Object references from
>  >>> bioperl-live, except for those in Bio::Root::* itself:
> 
>  >> So I'd say they're both relics that can be removed. In fact I was
>  >> planning on getting rid of all references to both of these modules
>  >> before you did, so thanks! :)
> 
> > I think they can go. It's probably a pre-1.0 deprecation that somehow
> > was never followed through on.
> 
> Today I did a fresh CVS checkout of bioperl-live, and deleted the
> following modules and tests, and all tests passed with BIOPERLDEBUG=0
> 
>      * Bio::Root::Err
>      * Bio::Root::Global
>      * Bio::Root::IOManager
>      * Bio::Root::Object
>      * Bio::Root::Storable
>      * Bio::Root::Utilities  # may be used by third parties?
>      * Bio::Root::Vector
>      * Bio::Root::Xref
>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
>      * t/RootStorable.t
> 
> Should we schedule for deprecation, or deprecate immediately as Hilmar
> suggested they were meant to be deprecated long ago ?

I vote for quick deprecation; I had also noticed that these were superfluous
and added them as possible deprecations to the wiki page.  However, we need
to be careful about that 'third-party use' caveat you have for
Bio::Root::Utilities; there's another one with Bio::Root::Storable and
Ensembl:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2924/focus=2924

and it seems to have its users:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/8242/focus=8242

The others (including Bio::Root::Utilities) haven't had any major threads on
the mail lists in a very long time.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


> --
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Mon Oct  2 10:16:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:16:31 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of
	bioperl-ext on AMD64
In-Reply-To: <20061002121506.GB14409@iucha.net>
Message-ID: <001901c6e62d$5c4fac80$15327e82@pyrimidine>

They're not absolutely necessary; the tests are skipped w/o failure because
bioperl-ext is optional.  These are only necessary if you want the ability
to read sequence trace files.  

BTW, you might have a rough time trying to install bioperl-ext, depending
on your platform.  Note the following bug report:

http://bugzilla.open-bio.org/show_bug.cgi?id=2074

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Florin Iucha
> Sent: Monday, October 02, 2006 7:15 AM
> To: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Failure to compile the CVS snapshot of
> bioperl-ext on AMD64
> 
> On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
> > Florin Iucha wrote:
> > > I am trying to install bioperl-network from CVS.  I found this to
> > > require bioperl from CVS, which requires bioperl-ext from CVS.
> >
> > I can't help with the compile problems you encountered (other than to
> > say I also have problems under AMD64), but from where did you get the
> > idea that bioperl (live/core) requires bioperl-ext? It doesn't, though
> > recent changes to Makefile.PL may give that impression...
> 
> Running the tests for bioperl-live mention in some places that 'this
> test has been skipped since $foo is not available' and I found the
> 'foos' in bioperl-ext.
> 
> florin
> 
> --
> If we wish to count lines of code, we should not regard them as lines
> produced but as lines spent.                       -- Edsger Dijkstra


From osborne1 at optonline.net  Mon Oct  2 10:14:13 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 10:14:13 -0400
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520E20F.6040406@sendu.me.uk>
Message-ID: <C14696F5.A903%osborne1@optonline.net>

Sendu,

No objection but someone should check the scripts in examples/root to make
sure that they are not used there.

Brian O.


On 10/2/06 5:55 AM, "Sendu Bala" <bix at sendu.me.uk> wrote:

> Torsten Seemann wrote:
>>>>> I have removed all use/@ISA Bio::Root::Object references from
>>>>> bioperl-live, except for those in Bio::Root::* itself:
>> 
>>>> So I'd say they're both relics that can be removed. In fact I was
>>>> planning on getting rid of all references to both of these modules
>>>> before you did, so thanks! :)
>> 
>>> I think they can go. It's probably a pre-1.0 deprecation that somehow
>>> was never followed through on.
>> 
>> Today I did a fresh CVS checkout of bioperl-live, and deleted the
>> following modules and tests, and all tests passed with BIOPERLDEBUG=0
>> 
>>      * Bio::Root::Err
>>      * Bio::Root::Global
>>      * Bio::Root::IOManager
>>      * Bio::Root::Object
>>      * Bio::Root::Storable
>>      * Bio::Root::Utilities  # may be used by third parties?
>>      * Bio::Root::Vector
>>      * Bio::Root::Xref
>>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
>>      * t/RootStorable.t
>> 
>> Should we schedule for deprecation, or deprecate immediately as Hilmar
>> suggested they were meant to be deprecated long ago ?
> 
> I'm happy to get rid of them all straight away. Does anyone object?


From johnson.biotech at gmail.com  Mon Oct  2 10:21:50 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 2 Oct 2006 10:21:50 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <000001c6e5dc$2eceabe0$15327e82@pyrimidine>
References: <b99962880609301635w421fae0er3497ba655679f0bc@mail.gmail.com>
	<000001c6e5dc$2eceabe0$15327e82@pyrimidine>
Message-ID: <b99962880610020721j776d3801m4f5b49cd1bdf66c6@mail.gmail.com>

I'm using MySQL 5.0.19 and Perl v5.8.7 [MSWin32-x86-multi-thread]

On 10/2/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> Seth,
>
> What version of MySQL and perl are you using?  I'm using MySQL 5.0.18 (but
> am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819.
>
> I ran into a few problems with bioperl-db tests which were unrelated to
> the ones below, but I'm wondering if it is a difference in MySQL versions.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
> > -----Original Message-----
> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > bounces at lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ...
>
> > FAILED tests 10-12
> >     Failed 3/12 tests, 75.00% okay
> > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> >
> > -------------------------------------------------------------------------------
> > t\02species.t                 65    2   3.08%  63 65
> > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > t\16obda.t                    12    3  25.00%  10-12
>
>


-- 
Best Regards,


Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358

From osborne1 at optonline.net  Mon Oct  2 10:08:50 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 10:08:50 -0400
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
 on AMD64
In-Reply-To: <20061002014007.GG12075@iucha.net>
Message-ID: <C14695B2.A900%osborne1@optonline.net>

Florin,

Minor correction here, the Bioperl package does not require bioperl-ext.
However we see there is a problem compiling bioperl-ext...

Brian O.


On 10/1/06 9:40 PM, "Florin Iucha" <florin at iucha.net> wrote:

> I am trying to install bioperl-network from CVS.  I found this to
> require bioperl from CVS, which requires bioperl-ext from CVS.


From JK at novozymes.com  Mon Oct  2 10:05:34 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Mon, 2 Oct 2006 16:05:34 +0200
Subject: [Bioperl-l] Blast parser.
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net>


Hi. 

I've tried to use the blast-parser but I cannot get the original alignment
out of the parser. Is it possible to get that out of the
Bio::Search::HSP::BlastHSP in some way? It seems quite odd to get a
clustalw alignment out when that isn't the type of alignment people are
used to getting from blast.

Thanks 

Jesper


From cjfields at uiuc.edu  Mon Oct  2 10:36:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:36:31 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <C14696F5.A903%osborne1@optonline.net>
Message-ID: <001d01c6e630$27792fb0$15327e82@pyrimidine>

> Sendu,
> 
> No objection but someone should check the scripts in examples/root to make
> sure that they are not used there.
> 
> Brian O.

I suppose it's also possible that the other bioperl distributions (like
bioperl-run) could use them as well.  

If they do we can take care of them as they pop up.  These are really old
and haven't been revised in a long time.  

The only one I worry about is Bio::Root::Storable b/c of Ensembl.  Does
anyone know where Will Spooner is?  He's the maintainer for
Bio::Root::Storable.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct  2 11:01:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 10:01:44 -0500
Subject: [Bioperl-l] Blast parser.
In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net>
Message-ID: <000001c6e633$ad0a6ce0$15327e82@pyrimidine>

The alignment that you get should come from GenericHSP, not BLASTHSP.
Either way, the HSP alignment that is retrieved using $hsp->get_aln() should
be a Bio::SimpleAlign object.  You can then output that to the proper
AlignIO format using an AlignIO stream object or use the Bio::SimpleAlign
methods for further analysis.  

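use Bio::AlignIO;

# $hsp is an HSP object taken from a parsed BLAST report
my $aln = $hsp->get_aln();    # a Bio::SimpleAlign object
my $alnout = Bio::AlignIO->new(-format => 'msf',
                               -fh     => \*STDOUT);
$alnout->write_aln($aln);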

Quick note: not all AlignIO formats have write_aln() support at this time,
but most do.
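
If what is wanted is the familiar pairwise BLAST layout rather than a
multiple-alignment format, one option is to format the three alignment
strings directly.  The sketch below is plain Perl with a made-up helper,
format_pairwise; it assumes the three strings are the same length.  As
far as I recall, HSP objects provide them via query_string(),
homology_string() and hit_string(), but check the HSP documentation for
your BioPerl version before relying on that.

```perl
use strict;
use warnings;

# Hypothetical helper: lay out query, midline and hit strings in
# BLAST-like blocks of $width columns.  Coordinates are left out.
sub format_pairwise {
    my ($query, $midline, $hit, $width) = @_;
    $width ||= 60;

    my @blocks;
    for (my $pos = 0; $pos < length($query); $pos += $width) {
        push @blocks, join "\n",
            'Query  ' . substr($query,   $pos, $width),
            '       ' . substr($midline, $pos, $width),
            'Sbjct  ' . substr($hit,     $pos, $width);
    }
    return join("\n\n", @blocks) . "\n";
}

# Toy strings standing in for the values taken from an HSP.
print format_pairwise('MKTAYIAK', 'MKT Y+AK', 'MKTSYLAK', 4);
```

The same function could be called inside a loop over HSPs, passing the
strings from each one.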

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of JK (Jesper Agerbo Krogh)
> Sent: Monday, October 02, 2006 9:06 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Blast parser.
> 
> 
> Hi.
> 
> I've tried to use the blast-parser but I cannot get the original alignment
> out of the parser. Is it possible to get that out of the
> Bio::Search::HSP::BlastHSP in some way. It seems quite odd to get a
> clustalw alignment out when it isn't that type of alignment people are
> used to get from blast.
> 
> Thanks
> 
> Jesper
> 


From whs at ebi.ac.uk  Mon Oct  2 12:00:19 2006
From: whs at ebi.ac.uk (Will Spooner)
Date: Mon, 2 Oct 2006 17:00:19 +0100 (BST)
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <001d01c6e630$27792fb0$15327e82@pyrimidine>
References: <001d01c6e630$27792fb0$15327e82@pyrimidine>
Message-ID: <Pine.LNX.4.64.0610021651550.1560@parrot.ebi.ac.uk>

On Mon, 2 Oct 2006, Chris Fields wrote:

>> Sendu,
>>
>> No objection but someone should check the scripts in examples/root to make
>> sure that they are not used there.
>>
>> Brian O.
>
> I suppose it's also possible that the other bioperl distributions (like
> bioperl-run) could use them as well.
>
> If they do we can take care of them as they pop up.  These are really old
> and haven't been revised in a long time.
>
> The only one I worry about is Bio::Root::Storable b/c of Ensembl.  Does
> anyone know where Will Spooner is?  He's the maintainer for
> Bio::Root::Storable.
>

Hi Chris,

I'm still lurking...

If the tests for Bio::Root::Storable still pass (I assume that they do), 
then the module is working as advertised.

The idea behind Storable is very simple; object instances of any 
inheriting class can be serialised/retrieved from disk. BioPerl objects 
will probably not want this functionality by default, but it is trivial to 
implement if needed.
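
To illustrate the general idea only, here is a sketch built on Perl's
core Storable module rather than on Bio::Root::Storable itself, whose
interface is not reproduced here.  The class names My::Persistent and
My::Feature and the methods save_to/load_from are invented for the
example.

```perl
use strict;
use warnings;

package My::Persistent;      # hypothetical mixin, not Bio::Root::Storable
use Storable ();

# Write the object to disk in network byte order.
sub save_to {
    my ($self, $file) = @_;
    Storable::nstore($self, $file);
    return $file;
}

# Read an object back; Storable restores the original class.
sub load_from {
    my ($class, $file) = @_;
    return Storable::retrieve($file);
}

package My::Feature;         # any inheriting class gains save/load
our @ISA = ('My::Persistent');

sub new  { my ($class, %args) = @_; return bless {%args}, $class }
sub name { return $_[0]{name} }

package main;
use File::Temp qw(tempfile);

my (undef, $file) = tempfile(UNLINK => 1);
My::Feature->new(name => 'exon1')->save_to($file);

my $copy = My::Feature->load_from($file);
print ref($copy), ' ', $copy->name, "\n";
```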

Will


From cjfields at uiuc.edu  Mon Oct  2 13:58:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 12:58:15 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <Pine.LNX.4.64.0610021651550.1560@parrot.ebi.ac.uk>
Message-ID: <000601c6e64c$5746f990$15327e82@pyrimidine>

> On Mon, 2 Oct 2006, Chris Fields wrote:
> 
> >> Sendu,
> >>
> >> No objection but someone should check the scripts in examples/root to
> make
> >> sure that they are not used there.
> >>
> >> Brian O.
> >
> > I suppose it's also possible that the other bioperl distributions (like
> > bioperl-run) could use them as well.
> >
> > If they do we can take care of them as they pop up.  These are really
> old
> > and haven't been revised in a long time.
> >
> > The only one I worry about is Bio::Root::Storable b/c of Ensembl.  Does
> > anyone know where Will Spooner is?  He's the maintainer for
> > Bio::Root::Storable.
> >
> 
> Hi Chris,
> 
> I'm still lurking...
> 
> If the tests for Bio::Root::Storable still pass (I assume that they do),
> then the module is working as advertised.
> 
> The idea behind Storable is very simple; object instances of any
> inheriting class can be serialised/retrieved from disk. BioPerl objects
> will probably not want this functionality by default, but it is trivial to
> implement if needed.
> 
> Will

Okay, nice to know you're listening in!  Based on that we should keep it in.
The rest that Torsten mentioned could probably be removed right away.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From osborne1 at optonline.net  Mon Oct  2 13:59:58 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 13:59:58 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <20061002002403.GD12075@iucha.net>
Message-ID: <C146CBDE.A938%osborne1@optonline.net>

Florin,

OK, this is fixed in CVS now. The problem is that there's some variability
in how the PSI MI "standard" is used. In this case there was a species that
was not given a value for its scientific name ("fullName"), I had to use
common name in its place. Fortunately there's an NCBI taxon id behind all
this.
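
The shape of the fallback, as a sketch only: this is not the code in
psi.pm, and the hash keys (fullName, commonName, taxon_id) and the helper
name species_label are invented for the example.  The point is to test
that a value is defined before using it, and to fall back in a fixed
order.

```perl
use strict;
use warnings;

# Hypothetical record layout; the real parser works on PSI MI XML.
sub species_label {
    my ($names) = @_;

    # Prefer the scientific name, then the common name.
    for my $key (qw(fullName commonName)) {
        my $value = $names->{$key};
        return $value if defined $value && length $value;
    }

    # Last resort: the taxon id, if there is one.
    return defined $names->{taxon_id} ? "taxon:$names->{taxon_id}" : undef;
}

print species_label({ fullName => 'Mus musculus', taxon_id => 10090 }), "\n";
print species_label({ commonName => 'mouse',       taxon_id => 10090 }), "\n";
```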

Thanks again,

Brian O.


On 10/1/06 8:24 PM, "Florin Iucha" <florin at iucha.net> wrote:

> Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the
> MINT [1] database does not produce the crash.  It has a new warning, however:
> 
>    Can't call method "text" on an undefined value at
>    /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290.


From mmacho at gmail.com  Mon Oct  2 13:43:13 2006
From: mmacho at gmail.com (ende)
Date: Mon, 2 Oct 2006 19:43:13 +0200
Subject: [Bioperl-l] Variable scope
Message-ID: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>


	Hi

This may be a general Perl question and so off-topic for this list.
My apologies for any inconvenience.

It is an annoying problem that is making me waste a lot of time.

I have a package with its new object, etc... and constants in it like:

#-----
use constant False => 0;
use constant True => 1;

our %CLRFG = (
               PLASMIDO      => RED,
               POLY_A        => GREEN,
               RESTR_SITES   => BLUE,
               CONECTORS     => MAGENTA,
               CONTAMINANTS  => CYAN,
           );

our %CLRBG = (
               PLASMIDO      => "",
               POLY_A        => "",
               RESTR_SITES   => "",
               CONECTORS     => "",
               CONTAMINANTS  => "",
           );
#------

These constants are included with require "h.pl" from the main package
file.

I use this module from the main command-line driver to test it.  In the
command-line driver I can use the constants False and True directly
without complaint, for example "return True", without any reference to
the origin of that constant.

But for the variables %CLRFG and %CLRBG (I would like them to be
constants too, but how?) I can't find a way of referring to them in the
module.  In the end I gave up and _copied_ the definitions to where I
needed them (in the sub where I print ANSI terminal colouring
sequences).  I can't work out how to refer to those variables from
outside the module.

I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Any help?
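
For comparison, a small self-contained sketch of the two usual patterns:
a package variable declared with "our" and reached by its fully
qualified name, and a hash reference wrapped in a constant.  The package
name My::Colors and the colour values are made up for the example.  Which
package name applies in the real code depends on the "package" statement
in effect where the hash is declared.

```perl
package My::Colors;          # hypothetical module name
use strict;
use warnings;

use constant { False => 0, True => 1 };

# A package variable: visible elsewhere as %My::Colors::CLRFG.
our %CLRFG = ( PLASMIDO => 'red', POLY_A => 'green' );

# A constant holding a hash reference.  The reference cannot be
# replaced, although the hash contents are not made read-only.
use constant CLRBG => { PLASMIDO => '', POLY_A => '' };

package main;

# Single elements use the $ sigil with the fully qualified name.
print "foreground for POLY_A: $My::Colors::CLRFG{POLY_A}\n";

# Constants are subs, so they are called through the package name.
print "True is ", My::Colors::True(), "\n";

my $bg = My::Colors::CLRBG()->{POLY_A};
print "background for POLY_A is ", (length($bg) ? $bg : 'unset'), "\n";
```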


--
     Juan Falgueras
     Profesor del Depto. de Lenguajes y Ciencias de la Computación
     Universidad de Málaga


From cjfields at uiuc.edu  Mon Oct  2 16:52:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 15:52:11 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <C14696F5.A903%osborne1@optonline.net>
Message-ID: <000001c6e664$a25538d0$15327e82@pyrimidine>

I have updated the Deprecation page with the Bio::Root::* modules that we
plan on deprecating (note that I have them being removed for rel. 1.5.2).  I
have left out Bio::Root::Storable for now based on Will's response.  

http://www.bioperl.org/wiki/Deprecated_modules

I'll update the DEPRECATED doc in CVS as well.  There is a tentative
schedule for when warnings are added for modules before they are removed.  

In relation to the recent trend for house-cleaning, I noticed that the
Bio::Tools::BP* BLAST-related modules are all still present but haven't
been modified or had deprecation warnings added.  BPlite was marked for
deprecation around rel 1.5, since its functionality (and that of the
others) is present in Bio::SearchIO.  Judging by the mail list, no one has
used these in quite a while, and everyone has been redirected to use
Bio::SearchIO instead.  Based on that I have added warnings in CVS for
deprecation to BPlite and the related modules BPpsilite and BPbl2seq.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Brian Osborne
> Sent: Monday, October 02, 2006 9:14 AM
> To: Sendu Bala; bioperl-l
> Subject: Re: [Bioperl-l] Do we need Bio::Root::Object anymore?
> 
> Sendu,
> 
> No objection but someone should check the scripts in examples/root to make
> sure that they are not used there.
> 
> Brian O.
> 
> 
> On 10/2/06 5:55 AM, "Sendu Bala" <bix at sendu.me.uk> wrote:
> 
> > Torsten Seemann wrote:
> >>>>> I have removed all use/@ISA Bio::Root::Object references from
> >>>>> bioperl-live, except for those in Bio::Root::* itself:
> >>
> >>>> So I'd say they're both relics that can be removed. In fact I was
> >>>> planning on getting rid off all references to both of these modules
> >>>> before you did, so thanks! :)
> >>
> >>> I think they can go. It's probably a pre-1.0 deprecation that somehow
> >>> was never followed through on.
> >>
> >> Today I did a fresh CVS checkout of bioperl-live, and deleted the
> >> following modules and tests, and all tests passed with BIOPERLDEBUG=0
> >>
> >>      * Bio::Root::Err
> >>      * Bio::Root::Global
> >>      * Bio::Root::IOManager
> >>      * Bio::Root::Object
> >>      * Bio::Root::Storable
> >>      * Bio::Root::Utilities  # may be used by third parties?
> >>      * Bio::Root::Vector
> >>      * Bio::Root::Xref
> >>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
> >>      * t/RootStorable.t
> >>
> >> Should we schedule for deprecation, or deprecate immediately as Hilmar
> >> suggested they were meant to be deprecated long ago ?
> >
> > I'm happy to get rid of them all straight away. Does anyone object?
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From florin at iucha.net  Mon Oct  2 16:47:01 2006
From: florin at iucha.net (Florin Iucha)
Date: Mon, 2 Oct 2006 15:47:01 -0500
Subject: [Bioperl-l] Variable scope
In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
Message-ID: <20061002204701.GG14409@iucha.net>

On Mon, Oct 02, 2006 at 07:43:13PM +0200, ende wrote:
> It is an annoying problem that is making me waste a lot of time.
> 
> I have a package with its new object, etc... and constants in it like:
> 
> #-----
> use constant False => 0;
> use constant True => 1;
> 
> our %CLRFG = (
>                PLASMIDO      => RED,
>                POLY_A        => GREEN,
>                RESTR_SITES   => BLUE,
>                CONECTORS     => MAGENTA,
>                CONTAMINANTS  => CYAN,
>            );
> 
> our %CLRBG = (
>                PLASMIDO      => "",
>                POLY_A        => "",
>                RESTR_SITES   => "",
>                CONECTORS     => "",
>                CONTAMINANTS  => "",
>            );
> #------
> 
> These constants are included with require "h.pl" from the main package
> file.
>
> I test this module from the main command-line driver by "using" it.
> In the driver I can use the constants False and True directly without
> complaint, for example "return True", with no reference to where the
> constants come from.

It is possible you get them from somewhere else.

> But with the variables %CLRFG and %CLRBG (I would like them to be
> constants too, but how?), I can't find a way to refer to them in
> the module.  In the end I gave up and _copied_ the definitions to
> wherever I needed them (in the sub where I print ANSI terminal colour
> sequences).  I can't work out how to refer to those variables from
> outside the module.
> 
> I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Did you actually declare a package name in "h.pl" ?
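
The package question matters because, as far as I know, a file loaded
with require is compiled in package main unless it has its own "package"
statement, so a qualified name such as %modulename::CLRFG would then
refer to a different, empty hash.  What follows is only a minimal sketch
of qualified access under that assumption; the package name SeqColours
and the plain-string colour values are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the contents of h.pl (or a .pm file).
# The package name "SeqColours" is an assumption for illustration.
{
    package SeqColours;

    # "our" creates package variables, which are reachable from
    # other packages under their fully qualified names.
    our %CLRFG = (
        PLASMIDO    => 'red',
        POLY_A      => 'green',
        RESTR_SITES => 'blue',
    );
}

# From the driver (package main), use the qualified name.
# The sigil follows the access: % for the whole hash,
# $ for a single element.
my @regions = sort keys %SeqColours::CLRFG;
my $colour  = $SeqColours::CLRFG{POLY_A};

print "regions: @regions\n";
print "POLY_A is $colour\n";
```

Fully qualified names are accepted under "use strict", so this works
without exporting anything.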

Is there any reason you don't call the file ".pm" and load it with
"use"?  I have attached a small example of importing that works.

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra
-------------- next part --------------
A non-text attachment was scrubbed...
Name: one.pm
Type: text/x-perl
Size: 118 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061002/cf339845/attachment.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: two.pl
Type: text/x-perl
Size: 69 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061002/cf339845/attachment-0001.bin 

From Kevin.M.Brown at asu.edu  Mon Oct  2 19:44:50 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 2 Oct 2006 16:44:50 -0700
Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module
Message-ID: <1A4207F8295607498283FE9E93B775B4021960CD@EX02.asurite.ad.asu.edu>

Well, for anyone that wants to know, I found a way to capture the output
of ClustalW to get at things like the score.

Copy STDOUT to another handle
open(OUTCOPY, ">&STDOUT") or die "Couldn't dup STDOUT: $!";

Change where STDOUT goes
open(STDOUT, ">log.test") or die "Couldn't open log.test: $!";

Run the alignment and its output will be captured by the STDOUT
redirection
$aln = $factory->align(\@seq);

Restore STDOUT to its normal location for the rest of the script
close STDOUT;
open(STDOUT, ">&OUTCOPY") or die "Couldn't restore STDOUT: $!";

I guess I can understand why most of this is just dropped by the
ClustalW.pm module since there doesn't seem to be a way to hold it all
in a SimpleAlign object.
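
The steps above can be put together as a self-contained sketch.  This is
not the ClustalW wrapper itself: a small child process stands in for
$factory->align(\@seq), and the "Alignment Score 42" line it prints is
invented so that there is something to parse.  Lexical filehandles are
used instead of OUTCOPY.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $log = 'log.test';    # same file name as in the steps above

# Duplicate STDOUT so it can be restored later.
open(my $saved, '>&', \*STDOUT) or die "Couldn't dup STDOUT: $!";

# Point STDOUT (file descriptor 1) at the log file.
open(STDOUT, '>', $log) or die "Couldn't open $log: $!";

# Stand-in for the alignment call: a child process inherits the
# redirected descriptor, so its output lands in the log file.
system($^X, '-e', 'print "Alignment Score 42\n"') == 0
    or die "child failed: $?";

# Restore STDOUT for the rest of the script.
close(STDOUT) or die "Couldn't close $log: $!";
open(STDOUT, '>&', $saved) or die "Couldn't restore STDOUT: $!";

# Read the captured text back and pull out the (invented) score line.
open(my $in, '<', $log) or die "Couldn't read $log: $!";
my ($score) = map { /Score\s+(\d+)/ ? $1 : () } <$in>;
close($in);

print "captured score: $score\n";
```

Because the redirection happens at the file-descriptor level, it also
catches output written by an external program, which is the case that
matters when the text comes from the clustalw executable.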

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Kevin Brown
> Sent: Thursday, September 28, 2006 2:48 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module
> 
> I've gotten a very simple script to run using bioperl that creates an
> alignment using clustalw of two sequences.  I see that clustal outputs
> to stdout information like the score, but I don't see any way to store
> that or retrieve that from the alignment object that is 
> returned (unless
> I'm just blind).  What follows is my very basic script which used code
> found in the Wiki.
> 
> print $aln->score() spits out an error about using an uninitialized
> value.
> 
> 
> #!/usr/bin/perl -w
> 
> use strict;
> use Bio::SeqIO;
> use Bio::Perl;
> use Bio::AlignIO;
> use Getopt::Long qw(:config no_ignore_case bundling pass_through);
> use POSIX;
> use Bio::Tools::Run::Alignment::Clustalw;
> 
> my $fileName   = "";         # filename(s) to be parsed for 
> information
> my $output_dir = "";
> my $format     = 'fasta';    # default format for SeqIO module
> 
> GetOptions(
>                    'file=s'   => \$fileName,
>                    'output=s' => \$output_dir,
>                   );
> 
> # Parse the input file for the needed information
> # SeqIO supports several normal formats including <tab>, <fasta> and
> <excel>
> 
> my @files = split(/\|/, $fileName);
> my @seq_array;
> 
> my $stream_out =
>   Bio::AlignIO->new(-file => '>test.msf', -format => 'msf', -flush =>
> 0);
> 
> foreach my $fileName (@files)
> {
>         my $file = Bio::SeqIO->new(-format => $format, -file =>
> $fileName);
>         my $seq;
>         while ($seq = $file->next_seq())
>         {
>                 push(@seq_array, $seq);
>         }
> }
> 
> my @params  = ('ktuple' => 2, 'matrix' => 'BLOSUM');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
> my $ktuple  = 3;
> $factory->ktuple($ktuple);    # change the parameter before executing
>     # where @seq_array is an array of {{PM|Bio::Seq}} objects
> 
> open my $out, ">seq.txt";
> 
> for (my $i = 1 ; $i <= $#seq_array ; $i++)
> {
>         my @seq = ($seq_array[0], $seq_array[$i]);
>         my $aln = $factory->align(\@seq);
>         $stream_out->write_aln($aln);
>         print $aln->score;
>         for my $seq ($aln->each_seq) {
>                 print $out $seq->display_id() ."\t". $seq->seq()."\n";
>         }
> }
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From bix at sendu.me.uk  Mon Oct  2 19:48:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 00:48:34 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
Message-ID: <4521A552.60301@sendu.me.uk>

Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
upload tar.gz files when I have access to the server, then reply here 
with links.

In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for 
instructions on getting and testing this RC.

Developers:
   Make sure you're in the AUTHORS file in all 4 packages, as
   appropriate.

Users:
   Even though 1.5.2 is a 'developer' release, we consider it the most
   stable and capable version of Bioperl, and recommend that you use
   it in all but the most critical production environments. Please
   try it out and let us know of any problems or difficulties you run
   into.


Thank you,
Sendu.

From lincoln.stein at gmail.com  Mon Oct  2 17:53:38 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 2 Oct 2006 21:53:38 +0000
Subject: [Bioperl-l] Variable scope
In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
Message-ID: <6dce9a0b0610021453va2132c7u73747b9253211a66@mail.gmail.com>

Hi,

Read the documentation for Exporter. It is much better to formally export
constants, variables and functions and to import them with "use" than to use
"require". Also be sure that you understand how namespaces and modules work.

This is not a BioPerl topic and should have been directed to a general Perl
discussion list, such as Perl Monks.
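
On the point about exporting and importing with "use", here is a minimal
sketch of the Exporter approach.  The package name SeqColours and the
colour values are assumptions for illustration.  In real use the package
would sit in its own SeqColours.pm and be loaded with
"use SeqColours qw(%CLRFG True False);"; the BEGIN blocks below only
imitate that inside a single file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical module, normally kept in SeqColours.pm.
BEGIN {
    package SeqColours;
    use Exporter 'import';          # borrow Exporter's import method

    use constant { False => 0, True => 1 };

    our %CLRFG = (
        PLASMIDO => 'red',
        POLY_A   => 'green',
    );

    # Nothing is exported unless the caller asks for it by name.
    our @EXPORT_OK = qw(%CLRFG True False);
}

# Same-file equivalent of "use SeqColours qw(%CLRFG True False);".
BEGIN { SeqColours->import(qw(%CLRFG True False)) }

print "POLY_A => $CLRFG{POLY_A}\n";
print "True is ", True, "\n";
```

On making the hashes constant: "use constant CLRFG => { ... }" gives a
constant hash reference, but the contents of the hash it points to can
still be modified.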

Lincoln

On 10/2/06, ende <mmacho at gmail.com> wrote:
>
>
>         Hi
>
> This may be a general Perl topic and so off-topic for this list.
> My apologies for any inconvenience.
>
> It is an annoying problem that is making me waste a lot of time.
>
> I have a package with its new object, etc... and constants in it like:
>
> #-----
> use constant False => 0;
> use constant True => 1;
>
> our %CLRFG = (
>                PLASMIDO      => RED,
>                POLY_A        => GREEN,
>                RESTR_SITES   => BLUE,
>                CONECTORS     => MAGENTA,
>                CONTAMINANTS  => CYAN,
>            );
>
> our %CLRBG = (
>                PLASMIDO      => "",
>                POLY_A        => "",
>                RESTR_SITES   => "",
>                CONECTORS     => "",
>                CONTAMINANTS  => "",
>            );
> #------
>
> These constants are included with require "h.pl" from the main package
> file.
>
> I test this module from the main command-line driver by "using" it.
> In the driver I can use the constants False and True directly without
> complaint, for example "return True", with no reference to where the
> constants come from.
>
> But with the variables %CLRFG and %CLRBG (I would like them to be
> constants too, but how?), I can't find a way to refer to them in
> the module.  In the end I gave up and _copied_ the definitions to
> wherever I needed them (in the sub where I print ANSI terminal colour
> sequences).  I can't work out how to refer to those variables from
> outside the module.
>
> I have tried %modulename::CLRFG, for example, but Perl gives me errors.
>
> Any help?
>
>
>
>
> --
>      Juan Falgueras
>      Profesor del Depto. de Lenguajes y Ciencias de la Computación
>      Universidad de Málaga
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From florin at iucha.net  Mon Oct  2 22:30:31 2006
From: florin at iucha.net (Florin Iucha)
Date: Mon, 2 Oct 2006 21:30:31 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <20061003023031.GI14409@iucha.net>

On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
> upload tar.gz files when I have access to the server, then reply here 
> with links.
> 
> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for 
> instructions on getting and testing this RC.

[I won't create a wiki account just to report this.]

Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
not set.  Lots of warnings about missing packages and all, but this
looks interesting:

   Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.

Otherwise:

   Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay.

The failed test is:

   t/ESEfinder..................dubious
      Test returned status 255 (wstat 65280, 0xff00)
   DIED. FAILED test 15

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra

From cjfields at uiuc.edu  Mon Oct  2 23:50:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 22:50:47 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu>

So far all tests pass on Mac OS X.  I'll add this to the release page.

This RC will throw warnings for four tests I didn't remove in time
(BPlite.t, BPpsilite.t, BPbl2seq.t, and RestrictionEnzyme.t), which
correspond to their namesake deprecated Bio::Tools modules.  These
are no longer in CVS HEAD so should be gone by the next RC, and the
relevant modules marked for deprecation.

I can verify the t/BioDBSeqFeature.t warning on Mac OS X that
Florin reported, but ESEfinder.t works fine:

t/BioDBSeqFeature............Argument "+" isn't numeric in numeric lt  
(<) at Bio/DB/SeqFeature/Segment.pm line 423.
ok
....

I'll report WinXP tests tomorrow on the wiki.

Chris


On Oct 2, 2006, at 6:48 PM, Sendu Bala wrote:

> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll
> upload tar.gz files when I have access to the server, then reply here
> with links.
>
> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for
> instructions on getting and testing this RC.
>
> Developers:
>    Make sure you're in the AUTHORS file in all 4 packages, as
>    appropriate.
>
> Users:
>    Even though 1.5.2 is a 'developer' release, we consider it the most
>    stable and capable version of Bioperl, and recommend that you use
>    it in all but the most critical production environments. Please
>    try it out and let us know of any problems or difficulties you run
>    into.
>
>
> Thank you,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct  2 23:54:29 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 22:54:29 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <20061003023031.GI14409@iucha.net>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
Message-ID: <DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>

> [I won't create a wiki account just to report this.]
>
> Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> not set.  Lots of warnings about missing packages and all, but this
> looks interesting:
>
>    Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.

This is verified on Mac OS X.

> Otherwise:
>
>    Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,  
> 99.99% okay.
>
> The failed test is:
>
>    t/ESEfinder..................dubious
>       Test returned status 255 (wstat 65280, 0xff00)
>    DIED. FAILED test 15

What do you get when you run that set of tests using
'perl -I. -w t/ESEfinder.t'?  The bad status code is odd and could be a
remote server issue.

Chris


>
> florin
>
> -- 
> If we wish to count lines of code, we should not regard them as lines
> produced but as lines spent.                       -- Edsger Dijkstra
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From torsten.seemann at infotech.monash.edu.au  Tue Oct  3 00:30:06 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 03 Oct 2006 14:30:06 +1000
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
Message-ID: <4521E74E.1040404@infotech.monash.edu.au>

My understanding is that all Bioperl-compliant classes should inherit 
from Bio::Root::Root, not Bio::Root::RootI.

Additionally, if functions such as throw() or _rearrange() are to be 
used without a class instance reference, they are to be used as class 
methods via Bio::Root::Root, not Bio::Root::RootI.

Is this correct?

My naive audit of bioperl-live CVS brought up the following statistics:

# Root.pm
/cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l
26
/cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l
346

# RootI.pm
/cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l
9
/cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l
79

My guess would be that all RootI should be changed to plain Root ?
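
One caveat about counts of this kind: the pattern
'use base.*Bio::Root::Root' also matches lines that name
Bio::Root::RootI, so a Root figure obtained this way may include the
RootI lines.  A word boundary after "Root" separates the two.  The
sketch below uses made-up sample lines rather than the real tree.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample lines standing in for the grep input.
my @lines = (
    'use base qw(Bio::Root::Root);',
    'use base qw(Bio::Root::RootI);',
    'use base qw(Bio::Root::Root Bio::SeqI);',
    'use base qw(Bio::Root::RootI Bio::AnnotatableI);',
);

# Unanchored pattern, as in the grep commands: RootI lines match too.
my $loose = grep { /use base.*Bio::Root::Root/ } @lines;

# \b after "Root" requires a word boundary, which excludes "RootI".
my $exact = grep { /use base.*Bio::Root::Root\b/ } @lines;
my $rooti = grep { /use base.*Bio::Root::RootI\b/ } @lines;

print "loose Root: $loose, exact Root: $exact, RootI: $rooti\n";
```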

Any help appreciated,

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From jason at bioperl.org  Tue Oct  3 02:03:17 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 2 Oct 2006 23:03:17 -0700
Subject: [Bioperl-l] t/ESEFinder.t fixed on branch
Message-ID: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>

Looks like good work everyone.

All tests pass for me on OSX perl 5.8.6. w and w/o BIOPERLDEBUG=1  
with RC1 except for the t/ESEFinder problem which I've fixed.

It skipped too few tests when BIOPERLDEBUG=0.
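
For reference, the general shape of the problem is sketched below.  This
is not the actual ESEfinder.t, and the test names and counts are
placeholders.  With Test::More, the number passed to skip() has to equal
the number of tests inside the SKIP block; if it is too small, fewer
tests are reported than were planned.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Test::More tests => 5;    # the plan: total number of tests

# Local tests always run.
ok(1, 'local test 1');
ok(1, 'local test 2');

SKIP: {
    # The count given to skip() must equal the number of tests in
    # this block, or the total run will not match the plan.
    skip 'remote tests need BIOPERLDEBUG=1', 3
        unless $ENV{BIOPERLDEBUG};

    # Placeholders for tests that would contact a remote server.
    ok(1, 'remote test 1');
    ok(1, 'remote test 2');
    ok(1, 'remote test 3');
}
```

Whether BIOPERLDEBUG is set or not, five results are reported, so the
plan is always met.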

Don't forget to merge branch changes back to head for this test when
it is done.   I don't want to muddy the water, so I'm holding off
migrating the changes to the main trunk, as the file is substantially
different (I presume pre-Test::More adoption?).

-jason

From bix at sendu.me.uk  Tue Oct  3 03:28:48 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:28:48 +0100
Subject: [Bioperl-l] t/ESEFinder.t fixed on branch
In-Reply-To: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>
References: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>
Message-ID: <45221130.2060405@sendu.me.uk>

Jason Stajich wrote:
> Looks like good work everyone.
> 
> All tests pass for me on OSX perl 5.8.6. w and w/o BIOPERLDEBUG=1  
> with RC1 except for the t/ESEFinder problem which I've fixed.
> 
> It skipped too few tests when BIOPERLDEBUG=0.
> 
> Don't forget to merge branch changes back to head for this test when  
> it is done.   I don't want to muddy water so I'm holding off  
> migrating the changes to main trunk as the files is substantially  
> different (I presume pre-Test::More adoption?).

Actually, it was the same until Torsten made his own (different) fixes 
to HEAD but not to branch. It was my mistake and I've corrected it in yet a 
third way, and now branch and HEAD match.

No harm done :)

From bix at sendu.me.uk  Tue Oct  3 03:31:10 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:31:10 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu>
References: <4521A552.60301@sendu.me.uk>
	<7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu>
Message-ID: <452211BE.6080107@sendu.me.uk>

Chris Fields wrote:
> So far all tests pass on Mac OS X.  I'll add this to the release page.
> 
> This RC will throw warnings for four tests I didn't remove in time  
> (BPlite.t, BPpsilite, BPbl2seq.t, and RestrictionEnzyme.t), which  
> correspond to their namesake deprecated Bio::Tools modules.  These  
> are no longer in CVS HEAD so should be gone by the next RC, and the  
> relevant modules marked for deprecation.

Thanks Chris. Sorry I missed these.

From bix at sendu.me.uk  Tue Oct  3 03:32:08 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:32:08 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <20061003023031.GI14409@iucha.net>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
Message-ID: <452211F8.8040104@sendu.me.uk>

Florin Iucha wrote:
> On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote:
>> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
>> upload tar.gz files when I have access to the server, then reply here 
>> with links.
>>
>> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for 
>> instructions on getting and testing this RC.
> 
> [I won't create a wiki account just to report this.]
> 
> Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> not set.  Lots of warnings about missing packages and all, but this
> looks interesting:
> 
>    Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.
> 
> Otherwise:
> 
>    Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay.
> 
> The failed test is:
> 
>    t/ESEfinder..................dubious
>       Test returned status 255 (wstat 65280, 0xff00)
>    DIED. FAILED test 15

Thanks for your feedback Florin. The ESEfinder fail will be fixed in the 
next RC.


From bix at sendu.me.uk  Tue Oct  3 04:29:37 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 09:29:37 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <45221F71.40206@sendu.me.uk>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
> upload tar.gz files when I have access to the server, then reply here 
> with links.

Live/core:
http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-1.5.2-RC1.zip

Run:
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.zip

DB:
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.zip

Network:
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.zip

Md5 checksums are in:
http://bioperl.org/DIST/SIGNATURES.md5

From jason at bioperl.org  Tue Oct  3 02:11:30 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 2 Oct 2006 23:11:30 -0700
Subject: [Bioperl-l]  Use of Root.pm versus RootI.pm
Message-ID: <87F9B64E-8BDA-464B-814D-3F117AA646A1@bioperl.org>

I only briefly saw your question - but RootI is for interfaces,  
Root.pm is for instantiated objects.

From florin at iucha.net  Tue Oct  3 07:39:12 2006
From: florin at iucha.net (Florin Iucha)
Date: Tue, 3 Oct 2006 06:39:12 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
	<DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
Message-ID: <20061003113912.GJ14409@iucha.net>

On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote:
> >Otherwise:
> >
> >   Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,  
> >99.99% okay.
> >
> >The failed test is:
> >
> >   t/ESEfinder..................dubious
> >      Test returned status 255 (wstat 65280, 0xff00)
> >   DIED. FAILED test 15

$ perl -I. -w t/ESEfinder.t
1..15
ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder;
ok 2 - use Data::Dumper;
ok 3 - use Bio::PrimarySeq;
ok 4 - use Bio::Seq;
ok 5
ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
# Looks like you planned 15 tests but only ran 14.
$ grep Id t/ESEfinder.t
# $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra

From hlapp at gmx.net  Tue Oct  3 08:27:46 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 3 Oct 2006 08:27:46 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au>
References: <4521E74E.1040404@infotech.monash.edu.au>
Message-ID: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net>

The interface classes (those ending in 'I') should actually inherit  
from RootI, not Root.

In practice I think this recommendation is more theoretical than
something that makes much of a difference.  The motivation is that
interface classes should not determine the actual implementation of a
class (hash ref, array ref, whatever), and since Root.pm contains a lot
of implementation that uses a hash ref, inheriting from it basically
makes that decision.

On the other hand, RootI contains implementation too, although I'm not
sure it prescribes the object implementation as opposed to merely
implementing static methods (like throw(), warn(), etc).  That would
need to be checked.
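
To make the distinction concrete, here is a toy sketch.  The My::*
packages are hypothetical stand-ins, not the real Bio::Root classes, and
only illustrate the idea: the interface root carries no constructor and
assumes nothing about object layout, while the implementation root is
where the hash ref gets chosen.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins; these are NOT the real Bio::Root classes.
{
    package My::RootI;       # interface root: no constructor and
                             # no assumption about object layout
    sub throw { my ($self, $msg) = @_; die "EXCEPTION: $msg\n" }
}
{
    package My::Root;        # implementation root
    our @ISA = ('My::RootI');
    # The hash-ref layout is decided here, not in the interface.
    sub new { my $class = shift; return bless {@_}, $class }
}
{
    package My::ThingI;      # interface: inherits only from RootI
    our @ISA = ('My::RootI');
    sub name { $_[0]->throw('name() not implemented') }
}
{
    package My::Thing;       # concrete class: Root plus the interface
    our @ISA = ('My::Root', 'My::ThingI');
    sub name { $_[0]{name} }
}

my $thing = My::Thing->new(name => 'example');
print $thing->name, "\n";
print "is a ThingI\n" if $thing->isa('My::ThingI');

# Calling the unimplemented interface method directly throws.
eval { My::ThingI->name };
print "caught: $@" if $@;
```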

	-hilmar

On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote:

> My understanding is that all Bioperl-compliant classes should inherit
> from Bio::Root::Root, not Bio::Root::RootI.
>
> Additionally, if functions such as throw() or _rearrange() are to be
> used without a class instance reference, they are to be used as class
> methods via Bio::Root::Root, not Bio::Root::RootI.
>
> Is this correct?
>
> My naive audit of bioperl-live CVS brought up the following  
> statistics:
>
> # Root.pm
> /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l
> 26
> /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l
> 346
>
> # RootI.pm
> /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l
> 9
> /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l
> 79
>
> My guess would be that all RootI should be changed to plain Root ?
>
> Any help appreciated,
>
> -- 
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct  3 08:33:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 3 Oct 2006 07:33:37 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <20061003113912.GJ14409@iucha.net>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
	<DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
	<20061003113912.GJ14409@iucha.net>
Message-ID: <44724E16-74CD-4778-B04F-529475B47E37@uiuc.edu>

Florin,

Looks like this is fixed and should be working in the next release.

Chris

On Oct 3, 2006, at 6:39 AM, Florin Iucha wrote:

> On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote:
>>> Otherwise:
>>>
>>>   Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,
>>> 99.99% okay.
>>>
>>> The failed test is:
>>>
>>>   t/ESEfinder..................dubious
>>>      Test returned status 255 (wstat 65280, 0xff00)
>>>   DIED. FAILED test 15
>
> $ perl -I. -w t/ESEfinder.t
> 1..15
> ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder;
> ok 2 - use Data::Dumper;
> ok 3 - use Bio::PrimarySeq;
> ok 4 - use Bio::Seq;
> ok 5
> ok 6 # skip Skipping tests which require remote servers, set  
> BIOPERLDEBUG=1 to test
> ok 7 # skip Skipping tests which require remote servers, set  
> BIOPERLDEBUG=1 to test
> ok 8 # skip Skipping tests which require remote servers, set  
> BIOPERLDEBUG=1 to test
> ok 9 # skip Skipping tests which require remote servers, set  
> BIOPERLDEBUG=1 to test
> ok 10 # skip Skipping tests which require remote servers, set  
> BIOPERLDEBUG=1 to test
> ok 11 # skip Skipping tests which require remote servers, set  
> BIOPERLDEBUG=1 to test
> ok 12 # skip Skipping tests which require remote servers, set  
> BIOPERLDEBUG=1 to test
> ok 13 # skip Skipping tests which require remote servers, set  
> BIOPERLDEBUG=1 to test
> ok 14 # skip Skipping tests which require remote servers, set  
> BIOPERLDEBUG=1 to test
> # Looks like you planned 15 tests but only ran 14.
> $ grep Id t/ESEfinder.t
> # $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $
>
> florin
>
> -- 
> If we wish to count lines of code, we should not regard them as lines
> produced but as lines spent.                       -- Edsger Dijkstra
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct  3 10:29:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 3 Oct 2006 09:29:51 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net>
Message-ID: <002101c6e6f8$67b4ae10$15327e82@pyrimidine>

> The interface classes (those ending in 'I') should actually inherit
> from RootI, not Root.
> 
> In reality this recommendation is more theoretical than it makes that
> much of a difference I think. The motivation is that interface
> classes should not determine the actual implementation of a class
> (hash ref, array ref, whatever), and since Root.pm contains lots of
> implementation using a hash ref that decision will basically have
> been made.
> 
> On the contrary though, RootI contains implementation too, although
> I'm not sure it would prescribe the object implementation as opposed
> to merely implementing static methods (like throw(), warn(), etc).
> That would need to be checked.
> 
> 	-hilmar

The constructor in Bio::Root::RootI lets one know that its use is
deprecated, so you shouldn't have any cases of 'our qw(Bio::Root::RootI)';
there should be some way of inheriting Root directly or indirectly.  I would
say that any direct use of RootI is not good practice, though.  For the
current implementation we should only inherit Bio::Root::Root, which
implements RootI.

Is there any reason to shut off the warning with BIOPERLDEBUG?  

>From RootI:

sub new {
  my $class = shift;
  my @args = @_;
  unless ( $ENV{'BIOPERLDEBUG'} ) {
      carp("Use of new in Bio::Root::RootI is deprecated.  Please use "
           . "Bio::Root::Root instead");
  }
  eval "require Bio::Root::Root";
  return Bio::Root::Root->new(@args);
}


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


> 
> On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote:
> 
> > My understanding is that all Bioperl-compliant classes should inherit
> > from Bio::Root::Root, not Bio::Root::RootI.
> >
> > Additionally, if functions such as throw() or _rearrange() are to be
> > used without a class instance reference, they are to be used as class
> > methods via Bio::Root::Root, not Bio::Root::RootI.
> >
> > Is this correct?
> >
> > My naive audit of bioperl-live CVS brought up the following
> > statistics:
> >
> > # Root.pm
> > /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l
> > 26
> > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l
> > 346
> >
> > # RootI.pm
> > /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l
> > 9
> > /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l
> > 79
> >
> > My guess would be that all RootI should be changed to plain Root ?
> >
> > Any help appreciated,
> >
> > --
> > Dr Torsten Seemann               http://www.vicbioinformatics.com
> > Victorian Bioinformatics Consortium, Monash University, Australia
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> 
> --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
> 
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From slenk at emich.edu  Tue Oct  3 13:31:47 2006
From: slenk at emich.edu (Stephen Gordon Lenk)
Date: Tue, 03 Oct 2006 13:31:47 -0400
Subject: [Bioperl-l] Perl 6 has 'roles' - may be cleanly applicable to the
	Root/RootI issue
Message-ID: <5147da5514e402.514e4025147da5@emich.edu>

I looked at the Perl6 site, there is an RFC on interfaces:
http://dev.perl.org/perl6/rfc/265.html

Roles seem to be the Perl 6 answer to the Root/RootI issue in Bioperl. 
Maybe it is too early to suggest this.

http://dev.perl.org/perl6/doc/design/apo/A12.html:
The primary role of a class is to manage instances, that is, objects. 
So a class must worry about object creation and destruction, and 
everything that happens in between. Classes have a secondary role as 
units of software reuse, in that they can be inherited from or 
delegated to. However, because this is a secondary role, and because 
of weaknesses in models of inheritance, composition, and delegation, 
Perl 6 will split out the notion of software reuse into a separate 
class-like entity called a "role". Roles are an abstraction mechanism 
for use by classes that don't care about the secondary aspects of 
software reuse, or that (looking at it the other way) care so much 
about it that they want to encapsulate any decisions about 
implementation, composition, delegation, and maybe even inheritance. 
Sounds fancy, but just think of them as includes of partial classes, 
with some safety checks. Roles don't manage objects. They manage 
interfaces and other abstract behavior (like default implementations), 
and they help classes manage objects. As such, a role may only be 
composed into a class or into another role, never inherited from or 
delegated to. That's what classes are for.


From slenk at emich.edu  Tue Oct  3 12:45:15 2006
From: slenk at emich.edu (Stephen Gordon Lenk)
Date: Tue, 03 Oct 2006 12:45:15 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
Message-ID: <5120d6a511f5a7.511f5a75120d6a@emich.edu>

The separation of interface and implementation is generally
regarded as a good idea. Right now the Bioperl community is
doing this as part of the implementation of Bioperl. I suggest
that this is an example of something which you might want to
have as part of the Perl implementation. If Perl 6 (or even
Perl 5) does not have this as a core part of the language or
as a standard package (reusable by all in a common fashion),
you may want to suggest to the Perl implementers that a way
for interface/implementation distinctions be made part of the
core language. My 2 cents, as you people are the experts on 
your own code.



From cjfields at uiuc.edu  Tue Oct  3 13:49:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 3 Oct 2006 12:49:35 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <5120d6a511f5a7.511f5a75120d6a@emich.edu>
Message-ID: <000001c6e714$4c2cbb80$15327e82@pyrimidine>

Perl6 already has added flexibility for separation of
implementation/interface (I believe they are called roles).  

http://dev.perl.org/perl6/doc/design/syn/S12.html

To tell the truth, I'm not sure about Perl 5, except for the way the Bioperl
devs have set up the distinction between interface and implementation.
However, I find the way we use interfaces very simple (set up an interface
with some/all methods unimplemented, use the module as an abstract base
class, then override the unimplemented methods).  It works for me.
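
A minimal sketch of that pattern in plain Perl 5 follows. The package names
My::ShapeI and My::Square are made up for illustration and are not Bioperl
classes; Bioperl's own interfaces use throw_not_implemented() from the Root
classes where this sketch uses a bare die.

```perl
use strict;
use warnings;

{
    # Hypothetical interface: the method exists but only complains.
    package My::ShapeI;

    sub area {
        my ($self) = @_;
        my $class = ref($self) || $self;
        die "Abstract method 'area' is not implemented by $class\n";
    }

    # Shared behaviour written purely in terms of the interface.
    sub describe {
        my ($self) = @_;
        return sprintf('%s with area %.2f', ref($self), $self->area);
    }
}

{
    # Implementation: uses the interface as an abstract base class
    # and overrides the unimplemented method.
    package My::Square;
    our @ISA = ('My::ShapeI');

    sub new {
        my ($class, %args) = @_;
        return bless { side => $args{side} }, $class;
    }

    sub area {
        my ($self) = @_;
        return $self->{side} ** 2;
    }
}

my $square = My::Square->new(side => 3);
print $square->describe, "\n";

# Calling the abstract method on the interface itself dies.
my $err = eval { My::ShapeI->area; 1 } ? '' : $@;
print "Calling the interface directly fails: $err";
```

The interface carries no constructor, so it says nothing about how an
implementing class stores its data.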

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 



From cmlapid at up.edu.ph  Tue Oct  3 22:06:06 2006
From: cmlapid at up.edu.ph (Carlo Lapid)
Date: Wed, 4 Oct 2006 10:06:06 +0800
Subject: [Bioperl-l] genbank mirror
Message-ID: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>

Hi,

I'm trying to set up a local mirror of a large part of the Genbank database.
For users to access the local database, I need to create a web-based search
tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank
flat files I've downloaded based on a query entered by the user.

I'm trying to use Bioperl to create this from scratch, but I'm having a very
hard time, especially since I want the user to have reasonable flexibility
in customizing his search. The best that I've been able to accomplish is a
search function that retrieves genbank sequence objects based on their
primary IDs or accession numbers; by using the fetch method of the
Bio::Index::GenBank module. But this doesn't help users who don't know the
exact IDs for the sequences they want.

Can anybody suggest a way to use Bioperl to search for an ordinary word or
phrase, like "16S gene", which could be matched against the description
field, or the entire genbank entry? (Alternatively, is there some other
freely available tool or software that can do this?) I've been scouring the
Bioperl documentation, but I couldn't find anything. I just need to be
pointed in the right direction. What I thought was a relatively simple
problem has been driving me crazy for days; if anybody has any suggestions I
would really, really appreciate it.

From torsten.seemann at infotech.monash.edu.au  Tue Oct  3 22:58:03 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 04 Oct 2006 12:58:03 +1000
Subject: [Bioperl-l] genbank mirror
In-Reply-To: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
References: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
Message-ID: <4523233B.7030505@infotech.monash.edu.au>

> I'm trying to set up a local mirror of a large part of the Genbank database.
> For users to access the local database, I need to create a web-based search
> tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank
> flat files I've downloaded based on a query entered by the user.

Have you considered bioperl-db / BioSQL?

http://www.bioperl.org/wiki/BioPerl_db
http://lists.open-bio.org/pipermail/biosql-l/

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From osborne1 at optonline.net  Tue Oct  3 23:16:20 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Tue, 03 Oct 2006 23:16:20 -0400
Subject: [Bioperl-l] genbank mirror
In-Reply-To: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
Message-ID: <C1489FC4.AA43%osborne1@optonline.net>

Carlo,

You might want to look at the Bio::DB::Query::GenBank module:

http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database

However, this works through NCBI's own eutils API, so setting it up to
query a local mirror may be very difficult.
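
For a purely local keyword search, one rough option is to scan the
DEFINITION lines of the flat files directly. The sketch below is plain Perl
with no Bioperl calls; the two records and the search_definitions() helper
are made up for illustration. With Bioperl installed, reading the records
through Bio::SeqIO and matching against each sequence's desc() would likely
be more robust.

```perl
use strict;
use warnings;

# Two abbreviated, made-up records in GenBank flat-file layout.
my $flatfile = <<'END';
LOCUS       DEMO0001     1500 bp    DNA     linear   BCT 01-JAN-2006
DEFINITION  Example bacterium 16S ribosomal RNA gene, partial
            sequence.
ACCESSION   DEMO0001
//
LOCUS       DEMO0002      900 bp    DNA     linear   PLN 01-JAN-2006
DEFINITION  Example plant actin gene, complete cds.
ACCESSION   DEMO0002
//
END

# Hypothetical helper: return [accession, definition] pairs whose
# definition matches the given pattern.
sub search_definitions {
    my ($fh, $pattern) = @_;
    my @hits;
    local $/ = "\n//\n";    # read one GenBank record at a time
    while (my $record = <$fh>) {
        my ($acc) = $record =~ /^ACCESSION\s+(\S+)/m or next;
        # The definition may continue on indented lines, so capture up to
        # the next line that starts with a non-space character.
        my ($def) = $record =~ /^DEFINITION\s+(.*?)(?=^\S)/ms or next;
        $def =~ s/\s+/ /g;
        $def =~ s/\s+$//;
        push @hits, [$acc, $def] if $def =~ $pattern;
    }
    return @hits;
}

# Open the string as an in-memory file; a real script would open the
# downloaded flat file instead.
open my $fh, '<', \$flatfile or die "Cannot open in-memory file: $!";
my @hits = search_definitions($fh, qr/16S/i);
close $fh;
print "$_->[0]\t$_->[1]\n" for @hits;
```

A linear scan like this will be slow on a large mirror; the accessions it
finds could be handed to an existing Bio::Index::GenBank index to fetch the
full entries.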


Brian O.




From osborne1 at optonline.net  Tue Oct  3 23:28:06 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Tue, 03 Oct 2006 23:28:06 -0400
Subject: [Bioperl-l] genbank mirror
In-Reply-To: <4523233B.7030505@infotech.monash.edu.au>
Message-ID: <C148A286.AA47%osborne1@optonline.net>

Torsten and Carlo,

Right. For some simple examples of using Bio::DB::Query::BioQuery to query a
BioSQL db take a look at Bio::DB::BioSQL::OBDA.

You may also want to take a look at NCBI's eutils API; it's quite powerful
but not local. Another option is the ENSEMBL API, since people have set up
their own local ENSEMBL dbs. There's an example of this API here:

http://www.bioperl.org/wiki/Getting_Genomic_Sequences


Brian O.


On 10/3/06 10:58 PM, "Torsten Seemann"
<torsten.seemann at infotech.monash.edu.au> wrote:

>> I'm trying to set up a local mirror of a large part of the Genbank database.
>> For users to access the local database, I need to create a web-based search
>> tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank
>> flat files I've downloaded based on a query entered by the user.
> 
> Have you considered bioperl-db / BioSQL?
> 
> http://www.bioperl.org/wiki/BioPerl_db
> http://lists.open-bio.org/pipermail/biosql-l/


From torsten.seemann at infotech.monash.edu.au  Wed Oct  4 01:21:24 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 04 Oct 2006 15:21:24 +1000
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
Message-ID: <452344D4.8070908@infotech.monash.edu.au>

Hi all,

Now that we have Perl 5.6.1 as a minimum, the following modules are 
standard: File::Spec, File::Temp, File::Path

Bio::Root::IO has functions catfile(), tmpdir(), tempfile(), rmtree() 
which currently dispatch to the File:: version, or try to emulate it. We 
don't need to emulate anymore. Jason Stajich suggested in a previous 
post that they should be deprecated, and that users should call the 
File:: functions directly.

I have an uncommitted simplified version of Bio::Root::IO which does 
this, and "all tests pass". The functions currently (silently) dispatch 
directly to their native counterparts.
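
For reference, calling the core modules directly looks roughly like this.
It is a sketch using only File::Spec, File::Temp and File::Path; the
directory names are arbitrary.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use File::Path qw(mkpath rmtree);

# A scratch directory that is removed when the program exits.
my $tmp_root = tempdir(CLEANUP => 1);

# Build paths portably instead of joining with '/'.
my $results = File::Spec->catdir($tmp_root, 'results');
my $run_dir = File::Spec->catdir($results, 'run1');
my $report  = File::Spec->catfile($run_dir, 'report.txt');

# Create the nested directories and write a file into them.
mkpath($run_dir);
open my $out, '>', $report or die "Cannot write $report: $!";
print {$out} "done\n";
close $out;
my $report_created = -e $report ? 1 : 0;

# Remove the whole subtree again.
rmtree($results);
my $results_removed = -d $results ? 0 : 1;

print "report created:  $report_created\n";
print "results removed: $results_removed\n";
print "system temp dir: ", File::Spec->tmpdir, "\n";
```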

The only tricky function is tempfile() which is *mostly* like 
File::Temp::tempfile(), but does some voodoo of converting 
(TEMPLATE=>'xxx') to the non-hash first parameter of the File:: version, 
so I'm hesitant to commit. It may do other magic - Hilmar?
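
The TEMPLATE conversion could be kept in a thin wrapper along these lines.
This is only a sketch: my_tempfile() is a made-up name, and it ignores any
other argument handling the real Bio::Root::IO method may do.

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical wrapper: accept TEMPLATE as a named argument and pass it
# to File::Temp::tempfile() as the leading positional argument.
sub my_tempfile {
    my (%args) = @_;
    my $template = delete $args{TEMPLATE};
    return defined $template
        ? File::Temp::tempfile($template, %args)
        : File::Temp::tempfile(%args);
}

# The template must end in at least four X characters.
my ($fh, $path) = my_tempfile(
    TEMPLATE => 'bioXXXXXX',
    TMPDIR   => 1,    # place the file in the system temp directory
    UNLINK   => 1,    # delete the file when the program exits
);
print {$fh} "scratch data\n";
close $fh;
print "temp file: $path\n";
```

All remaining options are passed through untouched, so callers that never
used TEMPLATE see the same behaviour as File::Temp::tempfile() itself.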

Comments?

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From gianluca.debellis at itb.cnr.it  Wed Oct  4 05:25:26 2006
From: gianluca.debellis at itb.cnr.it (Gianluca De Bellis)
Date: Wed, 04 Oct 2006 11:25:26 +0200
Subject: [Bioperl-l] Bioperl under WinXP
Message-ID: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>

I'm a novice trying to use Bioperl under WinXP SP2.

Bioperl (v 1.2.3) has just been downloaded.

Even the simplest program with a single statement (use Bio::Perl;) ends in
a crash of the Perl interpreter, with these details from the Windows
error reporting system:

AppName: perl.exe AppVer: 5.8.8.819  ModName: win32.dll
ModVer: 0.0.0.0      Offset: 00003294

Where is the problem?

Thanks in advance


From epsteinj at mail.nih.gov  Wed Oct  4 07:25:57 2006
From: epsteinj at mail.nih.gov (Epstein, Jonathan A (NIH/NICHD) [E])
Date: Wed, 4 Oct 2006 07:25:57 -0400
Subject: [Bioperl-l] genbank mirror
References: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
Message-ID: <42504F69898FE546B3F0238C9BD03275532603@NIHCESMLBX7.nih.gov>

There's SeqHound:
  http://seqhound.blueprint.org/report.html

We set this up locally, and it's probably the most comprehensive free
solution out there, but it's non-trivial to set up. Also, since Blueprint
and BIND have lost most of their funding, I'm not sure how long you can
count on SeqHound to remain operational (although for now it is being
updated).

Jonathan




From bix at sendu.me.uk  Wed Oct  4 09:19:45 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 04 Oct 2006 14:19:45 +0100
Subject: [Bioperl-l] Bioperl under WinXP
In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>
References: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>
Message-ID: <4523B4F1.3010305@sendu.me.uk>

Gianluca De Bellis wrote:
> I'm trying to use Bioperl under WinXP-SP2 (novice)
> 
> Bioperl has been just downloaded  (v 1.2.3)
> 
> Even the simplest program with a single command (use Bio::Perl;) ends up in
> an error of the Perl interpreter with these details
> 
> AppName: perl.exe AppVer: 5.8.8.819  ModName: win32.dll
> 
> ModVer: 0.0.0.0      Offset: 00003294
> 
> Coming from the  Windows reporting system
> 
> Where is the problem?

Hard to say. Do non-bioperl scripts work?

Make sure to follow the Bioperl installation instructions carefully:
http://bioperl.org/wiki/Installing_Bioperl_on_Windows

And make sure to install at least version 1.4; version 1.2.3 is ancient
and effectively unsupported.

From cjfields at uiuc.edu  Wed Oct  4 10:03:34 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 4 Oct 2006 09:03:34 -0500
Subject: [Bioperl-l] Bioperl under WinXP
In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>
Message-ID: <000601c6e7bd$e22ad190$15327e82@pyrimidine>

If you're using PPM, you can install a (much) newer version of BioPerl from
here:

http://www.gmod.org/ggb/ppm/

Add that as one of your repositories in PPM4 (seeing that you are using
ActivePerl 5.8.8.819), then search for bioperl.  The version should be
1.512.

In a few weeks we'll be releasing a new developer release.  A WinXP PPM is
expected, as well as a bundled package to install all prerequisites.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 



From hlapp at gmx.net  Wed Oct  4 10:25:23 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 4 Oct 2006 10:25:23 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <002101c6e6f8$67b4ae10$15327e82@pyrimidine>
References: <002101c6e6f8$67b4ae10$15327e82@pyrimidine>
Message-ID: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net>


On Oct 3, 2006, at 10:29 AM, Chris Fields wrote:

> The constructor in Bio::Root::RootI lets one know that its use is
> deprecated, so you shouldn't have any cases of 'our qw 
> (Bio::Root::RootI)';

Don't confuse the constructor with the inheritance tree.

Interface classes should never be instantiated, hence the  
constructor, consistent with the documentation, should never get  
executed.

> there should be some way of inheriting Root directly or  
> indirectly.  I would
> say that any direct use of RootI is not good practice, though.

I don't know what you mean by 'directly' or 'indirectly' but  
inheritance from interfaces, and interfaces extending (inheriting  
from) other interfaces, is certainly standard practice. I'm not sure  
at all why it would be a bad one.

> For the current implementation we should only inherit  
> Bio::Root::Root, which
> implements RootI.

For the implementation classes, yes. For the interface classes, no.
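
The split can be sketched with stand-in packages. Nothing below is Bioperl
code: My::RootI, My::Root and the other names are hypothetical, and the
point is only that an interface which extends the root interface leaves the
storage decision (hash ref, array ref) to each implementation.

```perl
use strict;
use warnings;

{
    # Stand-in for a root interface: behaviour only, no storage.
    package My::RootI;
    sub throw { my ($self, $msg) = @_; die "EXCEPTION: $msg\n" }
}

{
    # Stand-in for the root implementation: commits to a hash ref.
    package My::Root;
    our @ISA = ('My::RootI');
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
}

{
    # An interface class extends the root interface, not the root
    # implementation.
    package My::NamedI;
    our @ISA = ('My::RootI');
    sub name { $_[0]->throw('name() is not implemented') }
}

{
    # Hash-based implementation: inherits the root implementation and
    # the interface.
    package My::HashNamed;
    our @ISA = ('My::Root', 'My::NamedI');
    sub name { $_[0]{name} }
}

{
    # Array-based implementation: possible only because My::NamedI did
    # not drag in the hash-based My::Root.
    package My::ArrayNamed;
    our @ISA = ('My::NamedI');
    sub new  { my ($class, $name) = @_; return bless [$name], $class }
    sub name { $_[0][0] }
}

for my $obj (My::HashNamed->new(name => 'hash'), My::ArrayNamed->new('array')) {
    printf "%s: name=%s NamedI=%d Root=%d\n",
        ref($obj), $obj->name,
        ($obj->isa('My::NamedI') ? 1 : 0),
        ($obj->isa('My::Root')   ? 1 : 0);
}
```

Both objects satisfy the interface and can still call throw(), but only
the hash-based one is tied to the root implementation.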

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Oct  4 10:43:54 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 4 Oct 2006 10:43:54 -0400
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <452344D4.8070908@infotech.monash.edu.au>
References: <452344D4.8070908@infotech.monash.edu.au>
Message-ID: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>


On Oct 4, 2006, at 1:21 AM, Torsten Seemann wrote:

> Bio::Root::IO has functions catfile(), tmpdir(), tmpfile(), rmtree()
> which currently dispatch to the File:: version, or try to emulate  
> it. We
> don't need to emulate anymore. Jason Stajich suggested in a previous
> post that they should be deprecated, and that users should use  
> directly
> the File:: functions themselves.

I don't think there's a need to deprecate - if the methods just plain  
delegate to whatever File:: module is appropriate their  
implementation (supposedly) will become very simple and hence won't  
pose a maintenance burden anymore.

One can still recommend for all new scripts or modules or code  
written to use the File:: modules directly, just I'm not sure there's  
a need to tell users that they should start changing their existing  
stuff.

>
> I have an uncommitted simplified version of Bio::Root::IO which does
> this, and "all tests pass". The functions currently (silently)  
> dispatch
> directly to their native counterparts.
>
> The only tricky function is tempfile() which is *mostly* like
> File::Temp::tempfile(), but does some voodoo of converting
> (TEMPLATE=>'xxx') to the non-hash first parameter of the File::  
> version,
> so I'm hesitant to commit. It may do other magic - Hilmar?

Not that I would know of. If the tests pass (without having to change  
them!) I'd give it a try.

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Oct  4 11:35:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 4 Oct 2006 10:35:16 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net>
Message-ID: <001901c6e7ca$b12fd5b0$15327e82@pyrimidine>

...
> Don't confuse the constructor with the inheritance tree.
> 
> Interface classes should never be instantiated, hence the
> constructor, consistent with the documentation, should never get
> executed.

I know that interfaces shouldn't be instantiated.  I had noticed there are
cases of 'our qw (Bio::Root::RootI)' where it is completely acceptable to
inherit the interface.  Makes sense to me now.

> > there should be some way of inheriting Root directly or
> > indirectly.  I would
> > say that any direct use of RootI is not good practice, though.
> 
> I don't know what you mean by 'directly' or 'indirectly' but
> inheritance from interfaces, and interfaces extending (inheriting
> from) other interfaces, is certainly standard practice. I'm not sure
> at all why it would be a bad one.

I was talking specifically about inheriting RootI, and not about all Bioperl
interfaces in general.  I completely understand the use of
interface/implementation in Bioperl.  However, I missed one small fact until
yesterday (of course AFTER I posted my reply), which was that interfaces may
inherit RootI directly.  My oops.

I had understood that, in general, any Bioperl implementation should not
inherit the RootI interface directly (they should inherit Root, since that
implements RootI).  The 'constructor' present in RootI is essentially to
make sure that no one inherits from the wrong class.

Probably a bad use of the terms 'direct' and 'indirect', so maybe I didn't
get that across very well.  What I meant was that all classes inherit Root
in some way, either 'directly' (as the direct parent class) or 'indirectly'
(through the inheritance tree). Probably comes from being primarily a
molecular microbiologist and not a computer scientist.

OT, but it would be nice to have an updated class diagram to sort out the
inheritance hierarchy a bit easier.  In the meantime, the Deobfuscator does
help quite a bit.

> > For the current implementation we should only inherit
> > Bio::Root::Root, which
> > implements RootI.
> 
> For the implementation classes, yes. For the interface classes, no.

I agree (see above).  That's the one small bit about interfaces I missed
along the way.  Makes sense; they use throw_not_implemented(), which is a
RootI method.
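
The arrangement discussed above can be sketched in a few lines. All
package names below (Sketch::RootI, Sketch::Root, Sketch::SeqI,
Sketch::Seq) are hypothetical stand-ins rather than the real Bioperl
classes, and the error text is made up for the example.

```perl
use strict;
use warnings;

package Sketch::RootI;    # root interface: never instantiated
sub throw_not_implemented {
    my $self   = shift;
    my $method = (caller(1))[3];    # name of the method that called us
    die "Abstract method '$method' is not implemented by "
        . (ref($self) || $self) . "\n";
}

package Sketch::Root;     # root implementation: provides the constructor
our @ISA = ('Sketch::RootI');
sub new { my $class = shift; return bless {}, $class; }

package Sketch::SeqI;     # an interface inherits the root *interface*
our @ISA = ('Sketch::RootI');
sub seq { $_[0]->throw_not_implemented }

package Sketch::Seq;      # an implementation inherits the root *implementation*
our @ISA = ('Sketch::Root', 'Sketch::SeqI');
sub new {
    my ($class, %args) = @_;
    my $self = $class->SUPER::new();
    $self->{seq} = $args{seq};
    return $self;
}
sub seq { return $_[0]->{seq} }

package main;

my $seq = Sketch::Seq->new(seq => 'ATGC');
print $seq->seq, "\n";
print "Sketch::Seq is a Sketch::RootI\n" if $seq->isa('Sketch::RootI');

# Calling the method on the bare interface raises the error.
eval { Sketch::SeqI->seq };
print "Error: $@" if $@;
```

Here the implementation reaches the root interface only through its
parents, while the interface class inherits it directly so that
throw_not_implemented() is available.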

> 	-hilmar

Chris


From pmiguel at purdue.edu  Wed Oct  4 15:38:51 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Wed, 04 Oct 2006 15:38:51 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <45240DCB.2080204@purdue.edu>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
> upload tar.gz files when I have access to the server, then reply here 
> with links.
>
> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for 
> instructions on getting and testing this RC.
>
> Developers:
>    Make sure you're in the AUTHORS file in all 4 packages, as
>    appropriate.
>
> Users:
>    Even though 1.5.2 is a 'developer' release, we consider it the most
>    stable and capable version of Bioperl, and recommend that you use
>    it in all but the most critical production environments. Please
>    try it out and let us know of any problems or difficulties you run
>    into.
>
>
> Thank you,
> Sendu.
>   
I didn't see any tests done under solaris, so I asked our sys admin to 
do the install on one of our machines.

Just another data point:

He installed this release candidate on a Sun E450 box running solaris. 
uname -a gives:

SunOS descartes 5.10 Generic_118833-18 sun4u sparc SUNW,Ultra-4

perl -v gives:

This is perl, v5.8.8 built for sun4-solaris
(etc.)


$ time make test
PERL_DL_NONLAZY=1 /usr/local/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/AAChange...................ok
t/AAReverseMutate............ok
t/abi........................Bio::SeqIO::staden::read from bioperl-ext is not installed or is installed incorrectly - skipping abi.t tests
t/abi........................ok
t/ace........................ok
t/AlignIO....................ok
t/AlignStats.................ok
t/AlignUtil..................ok
t/alignUtilities.............ok
t/Allele.....................ok
t/Alphabet...................ok
t/Annotation.................ok
t/AnnotationAdaptor..........ok
t/asciitree..................ok
t/Assembly...................ok
        1/19 skipped:
t/Biblio.....................ok
t/Biblio_biofetch............ok
t/Biblio_eutils..............ok
t/BiblioReferences...........ok
t/BioDBGFF...................ok
t/BioDBSeqFeature............ok 1/46Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.
t/BioDBSeqFeature............ok
t/BioDBSeqFeature_BDB........ok
t/BioDBSeqFeature_mysql......ok 3/46prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT sequence,offset
   FROM sequence as s,locationlist as ll
   WHERE s.id=ll.id
     AND ll.seqname= ?
     AND offset >= ?
     AND offset <= ?
   ORDER BY offset
) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT sequence,offset
   FROM sequence as s,locationlist as ll
   WHERE s.id=ll.id
     AND ll.seqname= ?
     AND offset >= ?
     AND offset <= ?
   ORDER BY offset
) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
t/BioDBSeqFeature_mysql......ok
t/BioFetch_DB................ok
t/BioGraphics................ok
t/BlastIndex.................ok 1/13
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BlastIndex.................ok
t/BPbl2seq...................
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPbl2seq...................ok 1/108
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPbl2seq...................ok
t/BPlite.....................ok 1/97
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPlite.....................ok 52/97
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPlite.....................ok 88/97
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
STACK Bio::Tools::BPlite::new /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/Tools/BPlite.pm:197
STACK toplevel t/BPlite.t:127

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPlite.....................ok
t/BPpsilite..................
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPpsilite..................ok 4/11
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPpsilite..................ok
t/bsml_sax...................ok
t/Chain......................ok
t/chaosxml...................ok
t/cigarstring................ok
t/ClusterIO..................ok
t/Coalescent.................ok
t/CodonTable.................ok
t/Compatible.................ok
t/consed.....................ok
t/CoordinateGraph............ok
t/CoordinateMapper...........ok
t/Correlate..................ok
t/ctf........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ctf.t tests
t/ctf........................ok
t/CytoMap....................ok
t/DB.........................skipped
        all skipped: Skipping all tests since they require network access, set BIOPERLDEBUG=1 to test
t/DBCUTG.....................ok
        11/34 skipped: Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
t/DBFasta....................ok
t/DNAMutation................ok
t/Domcut.....................ok
t/ECnumber...................ok
t/ELM........................ok 1/13
-------------------- WARNING ---------------------
MSG: sleeping for 1 seconds

---------------------------------------------------
t/ELM........................ok
t/embl.......................ok
t/EMBL_DB....................ok
t/EMBOSS_Tools...............ok
t/EncodedSeq.................ok
t/entrezgene.................ok 491/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok 695/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok 723/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok 824/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok
t/ePCR.......................ok
t/ESEfinder..................ok 1/15# Looks like you planned 15 tests but only ran 14.
t/ESEfinder..................dubious
        Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED test 15
        Failed 1/15 tests, 93.33% okay (less 9 skipped tests: 5 okay, 33.33%)
t/est2genome.................ok
t/EUtilities.................skipped
        all skipped: Set BIOPERLDEBUG=1 to run tests
t/Exception..................ok
t/Exonerate..................ok
t/exp........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping exp.t tests
t/exp........................ok
t/fasta......................ok
t/FeatureIO..................ok 7/33
-------------------- WARNING ---------------------
MSG: '##feature-ontology' directive handling not yet implemented
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: '##attribute-ontology' directive handling not yet implemented
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: '##source-ontology' directive handling not yet implemented
---------------------------------------------------
t/FeatureIO..................ok
t/flat.......................ok
t/FootPrinter................ok
t/game.......................ok
t/GbrowseGFF.................ok
t/gcg........................ok
t/GDB........................ok
t/Gel........................ok
t/genbank....................ok
t/GeneCoordinateMapper.......ok
t/Geneid.....................ok
t/Genewise...................ok
        2/51 skipped:
t/Genomewise.................ok
t/Genpred....................ok
t/GFF........................ok
t/GOR4.......................ok
t/GOterm.....................ok
t/GraphAdaptor...............ok
t/GuessSeqFormat.............ok
t/hmmer......................ok
t/hmmer_pull.................ok
t/HNN........................ok
t/HtSNP......................ok
t/Index......................ok
t/InstanceSite...............ok
t/interpro...................ok
t/InterProParser.............ok
t/IUPAC......................ok
t/kegg.......................ok
t/largefasta.................ok
t/LargeLocatableSeq..........ok
t/largepseq..................ok
t/lasergene..................ok
t/LinkageMap.................ok
t/LiveSeq....................ok
t/LocatableSeq...............ok
t/Location...................ok
t/LocationFactory............ok
t/LocusLink..................ok
t/lucy.......................ok
t/Map........................ok
t/MapIO......................ok
t/masta......................ok
t/Matrix.....................ok
t/Measure....................ok
t/MeSH.......................ok
t/metafasta..................ok
t/MetaSeq....................ok
t/MicrosatelliteMarker.......ok
t/MiniMIMentry...............ok
t/MitoProt...................ok
t/Molphy.....................ok
t/MultiFile..................ok
t/multiple_fasta.............ok
t/Mutation...................ok
t/Mutator....................ok
t/NetPhos....................ok
        10/14 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test
t/Node.......................ok
t/obo_parser.................ok
t/OddCodes...................ok
t/OMIMentry..................ok
t/OMIMentryAllelicVariant....ok
t/OMIMparser.................ok
t/Ontology...................ok
t/OntologyEngine.............ok
t/OntologyStore..............ok
t/PAML.......................ok
t/Perl.......................ok
t/phd........................ok
t/Phenotype..................ok
t/PhylipDist.................ok
t/PhysicalMap................ok
t/pICalculator...............ok
t/Pictogram..................ok
t/pir........................ok
t/pln........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping pln.t tests
t/pln........................ok
t/PopGen.....................ok
        2/89 skipped:
t/PopGenSims.................ok
t/primaryqual................ok
t/PrimarySeq.................ok
t/primedseq..................ok
t/Primer.....................ok
t/primer3....................ok
t/Promoterwise...............ok
t/ProtDist...................ok
t/protgraph..................ok
t/ProtMatrix.................ok
t/ProtPsm....................ok
t/Pseudowise.................ok
t/psm........................ok
t/QRNA.......................ok
t/qual.......................ok
t/RandDistFunctions..........ok
t/RandomTreeFactory..........ok
t/Range......................ok
t/RangeI.....................ok
t/raw........................ok
t/RefSeq.....................ok
t/Registry...................ok
t/Relationship...............ok
t/RelationshipType...........ok
t/RemoteBlast................ok
        11/13 skipped: to avoid timeout
t/RepeatMasker...............ok
t/RestrictionAnalysis........ok
t/RestrictionEnzyme..........ok 1/14
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::RestrictionEnzyme is deprecatedUse Bio::Restriction classes instead
---------------------------------------------------
t/RestrictionEnzyme..........ok
t/RestrictionIO..............ok
t/RNAChange..................ok
t/rnamotif...................ok
t/RootI......................ok
t/RootIO.....................ok
        2/27 skipped: various reasons
t/RootStorable...............ok
t/Scansite...................ok
t/scf........................ok
t/SearchDist.................ok
t/SearchIO...................ok
t/Seg........................ok
t/Seq........................ok
t/seq_quality................ok
t/SeqAnalysisParser..........ok
t/SeqBuilder.................ok
t/SeqDiff....................ok
t/SeqFeatCollection..........ok
t/SeqFeature.................ok
t/seqfeaturePrimer...........ok
t/SeqHound_DB................ok 4/14Writing into 'shoundlog' log file.
t/SeqHound_DB................ok
t/SeqIO......................ok
t/SeqPattern.................ok
t/seqread_fail...............ok
t/SeqStats...................ok
t/SequenceFamily.............ok
t/sequencetrace..............ok
t/SeqUtils...................ok
t/SeqVersion.................ok
t/seqwithquality.............ok
t/SeqWords...................ok
t/Sigcleave..................ok
t/Signalp....................ok
t/Sim4.......................ok
t/SimilarityPair.............ok
t/SimpleAlign................ok
t/simpleGOparser.............ok
t/singlet....................ok
t/sirna......................ok
t/SiteMatrix.................ok
t/SNP........................ok
t/Sopma......................ok
t/Species....................ok
        5/20 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test
t/Spidey.....................ok
t/splicedseq.................ok
t/StandAloneBlast............ok
t/StructIO...................ok
t/Structure..................ok
t/swiss......................ok
t/Symbol.....................ok
t/tab........................ok
t/table......................ok
t/TagHaplotype...............ok
t/Taxonomy...................ok
        44/98 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test
t/TaxonTree..................ok
t/Tempfile...................ok
t/Term.......................ok
t/tigrxml....................ok
t/tinyseq....................ok
t/Tmhmm......................ok
t/Tools......................ok
t/Tree.......................ok
t/TreeBuild..................ok
t/TreeIO.....................ok
t/trim.......................ok
t/tRNAscanSE.................ok
t/UCSCParsers................ok
t/Unflattener................ok
t/Unflattener2...............ok
t/UniGene....................ok
t/Variation_IO...............ok
t/WABA.......................ok
t/XEMBL_DB...................ok
        1/9 skipped: server may be down
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ztr.t tests
t/ztr........................ok
Failed Test   Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/ESEfinder.t  255 65280    15    2  13.33%  15
2 tests and 98 subtests skipped.
Failed 1/240 test scripts, 99.58% okay. 1/11910 subtests failed, 99.99% okay.
*** Error code 29
make: Fatal error: Command failed for target `test_dynamic'

real    13m10.064s
user    11m14.891s
sys     0m45.417s

$ TEST_VERBOSE=1 perl t/ESEfinder.t
1..15
ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder;
ok 2 - use Data::Dumper;
ok 3 - use Bio::PrimarySeq;
ok 4 - use Bio::Seq;
ok 5
ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
# Looks like you planned 15 tests but only ran 14.
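
A guess at the cause, not verified against ESEfinder.t: a mismatch like
this can occur with Test::More when the count passed to skip() is one
lower than the number of tests inside the SKIP block, so the plan is
only met when the network tests actually run. The sketch below uses a
made-up layout (one local test, three remote ones) to show the counts
lining up.

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical layout: 1 local test plus 3 tests that need a remote server.
# This number must equal the number of tests inside the SKIP block.
my $REMOTE_TESTS = 3;

ok(1, 'local test always runs');

SKIP: {
    skip('Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test',
         $REMOTE_TESTS)
        unless $ENV{BIOPERLDEBUG};

    # Placeholders standing in for real network checks.
    ok(1, 'remote test 1');
    ok(1, 'remote test 2');
    ok(1, 'remote test 3');
}
```

If $REMOTE_TESTS were 2 here, a run without BIOPERLDEBUG would report
only 3 of the 4 planned tests, the same shape of failure as above.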


From bix at sendu.me.uk  Thu Oct  5 03:19:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 08:19:39 +0100
Subject: [Bioperl-l] EUtilities term handling
Message-ID: <4524B20B.5010703@sendu.me.uk>

This is actually a general question and not limited to EUtilities. As I 
see it EUtilities lets you do queries in Bioperl that you can do on a 
website. The question is, should a Bioperl module always work with 
queries that the website it is a front-end to works with?

So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is 
essentially a frontend onto:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=

With a web-browser you can complete that url by supplying a term. For 
example, the term 'BRCA2+9606[taxid]' works and returns results:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid]

If you supply the exact same term to EUtilities::esearch like so:

my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => 
"gene", -term "BRCA2+9606[taxid]");

The search fails. From my 'user' perspective this is highly unexpected. 
Chris (the author) and I both understand /why/ it fails, but Chris 
doesn't think it is a bug, or at least something that can/should be 
changed. What do other people think? At the very least, if something 
unexpected happens, I'd suggest making a note of it in the POD 
somewhere. Eg. "Do not use + in term strings, even though they might 
work on the website".

Chris: what is the disadvantage of always submitting '+' as '+' to the 
server?

From bix at sendu.me.uk  Thu Oct  5 03:24:45 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 08:24:45 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4524B20B.5010703@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
Message-ID: <4524B33D.9070607@sendu.me.uk>

Sendu Bala wrote:
>
> With a web-browser you can complete that url by supplying a term. For 
> example, the term 'BRCA2+9606[taxid]' works and returns results:
> 
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid] 
> 
> 
> If you supply the exact same term to EUtilities::esearch like so:
> 
> my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => 
> "gene", -term "BRCA2+9606[taxid]");

*cough*

my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
"gene", -term => "BRCA2+9606[taxid]");


> The search fails. 


From m.weimer at dkfz-heidelberg.de  Thu Oct  5 08:15:53 2006
From: m.weimer at dkfz-heidelberg.de (Marc Weimer)
Date: Thu, 05 Oct 2006 14:15:53 +0200
Subject: [Bioperl-l] Bio::DB::SwissProt Error
Message-ID: <1160050554.18691.11.camel@localhost>

When running


--------------------------------------------------------------

  #! /usr/bin/perl -w

  use strict;
  use Bio::DB::SwissProt;

  my $db_obj = new Bio::DB::SwissProt(-verbose=>1);

  my $seq_obj = $db_obj->get_Seq_by_acc('P43780');


-------------------------------------------------------------

using Bioperl 1.4-1 I get the error message

---------------------------------------------------------------------------------

  request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch
  Content-Length: 45
  Content-Type: application/x-www-form-urlencoded

  format=swissprot&db=swall&style=raw&id=P43780


  ------------- EXCEPTION: Bio::Root::Exception -------------
  MSG: swissprot stream with no ID. Not swissprot in my book
  STACK: Error::throw
  STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
  STACK: Bio::SeqIO::swiss::next_seq /usr/share/perl5/Bio/SeqIO/swiss.pm:179
  STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/share/perl5/Bio/DB/WebDBSeqI.pm:187
  STACK: ./putativeGele.pl:8
  -----------------------------------------------------------

--------------------------------------------------------------------------------

Any suggestions?

Thanks,

Marc


From bix at sendu.me.uk  Thu Oct  5 09:21:23 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 14:21:23 +0100
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <1160050554.18691.11.camel@localhost>
References: <1160050554.18691.11.camel@localhost>
Message-ID: <452506D3.5050501@sendu.me.uk>

Marc Weimer wrote:
[snip]
>   my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
> 
>   my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
[snip]
> using Bioperl 1.4-1 I get the error message
[snip]
>   ------------- EXCEPTION: Bio::Root::Exception -------------
>   MSG: swissprot stream with no ID. Not swissprot in my book
[snip]
> Any suggestions?

It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most 
recent official release), but 1.5.2 does 
(http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS 
(http://bioperl.org/wiki/Getting_BioPerl#CVS).

From m.weimer at dkfz-heidelberg.de  Thu Oct  5 09:35:06 2006
From: m.weimer at dkfz-heidelberg.de (Marc Weimer)
Date: Thu, 05 Oct 2006 15:35:06 +0200
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <452506D3.5050501@sendu.me.uk>
References: <1160050554.18691.11.camel@localhost>
	<452506D3.5050501@sendu.me.uk>
Message-ID: <1160055306.18691.14.camel@localhost>

Works fine with 1.5.2

Thanks,

Marc


> Marc Weimer wrote:
> [snip]
> >   my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
> > 
> >   my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
> [snip]
> > using Bioperl 1.4-1 I get the error message
> [snip]
> >   ------------- EXCEPTION: Bio::Root::Exception -------------
> >   MSG: swissprot stream with no ID. Not swissprot in my book
> [snip]
> > Any suggestions?
> 
> It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most 
> recent official release), but 1.5.2 does 
> (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS 
> (http://bioperl.org/wiki/Getting_BioPerl#CVS).
-- 
########################################

Dr. Marc Weimer
German Cancer Research Center
Central Unit Biostatistics
Im Neuenheimer Feld 280
D-69120 Heidelberg
Phone: +49 (0) 6221/42-2387
Fax: +49 (0) 6221/42-2397

########################################


From hlapp at gmx.net  Thu Oct  5 09:55:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 09:55:58 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4524B20B.5010703@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
Message-ID: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>


On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote:

> This is actually a general question and not limited to EUtilities.
> As I see it EUtilities lets you do queries in Bioperl that you can do
> on a website. The question is, should a Bioperl module always work
> with queries that the website it is a front-end to works with?

I think yes, but stick to this definition.

Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez  
website it will actually not work. Hence, it should be no surprise  
that it doesn't work either using Bio::DB::EUtilities.

The URL you are using to make your point is much more an example for  
using a web-service (SOAP, REST, or not) than it is for using a  
website. Using the web-service URL with a space in place of the '+'  
works, but yields a different result (just searches for BRCA2), so if  
tested for correct result the test fails.

I.e., you don't expect an input form on a website to accept URL- 
encoded input. Instead, you expect it to do any URL-encoding for you  
that needs to be done. Conversely, if you are using a URL to retrieve  
stuff using e.g. wget or curl, it is clear that you will need to do  
URL encoding yourself unless there is a command line option that lets  
you instruct the querying program to do so.

I would be careful with mangling the two definitions into one,  
resulting in a module that needs to serve two masters. You could  
consider providing an option though that lets you turn off the URL  
encoding on demand.
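
To make the encoding point concrete, here is a minimal sketch of
percent-encoding a query value using core Perl only. encode_term() is a
hypothetical helper, not part of Bio::DB::EUtilities, and it assumes
plain ASCII input. In form-encoded query strings a raw '+'
conventionally stands for a space, whereas an encoded '%2B' is a
literal plus sign, so the two requests are not equivalent.

```perl
use strict;
use warnings;

# Hypothetical helper: escape everything except the RFC 3986
# "unreserved" characters (letters, digits, '-', '.', '_', '~').
sub encode_term {
    my ($term) = @_;
    $term =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $term;
}

print encode_term('BRCA2+9606[taxid]'), "\n";    # BRCA2%2B9606%5Btaxid%5D
print encode_term('BRCA2 9606[taxid]'), "\n";    # BRCA2%209606%5Btaxid%5D
```

An option to switch the encoding off would amount to skipping the call
to such a helper and sending the term as supplied.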

Aside from that, one of the advantages of having the service wrapped
in Bioperl is in fact that you can have it accept a wider variety of
parameters than the actual service would allow you to have, e.g.,
arrays, hashes, or whatever seems appropriate.

My $0.02.

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Thu Oct  5 10:08:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:08:01 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
Message-ID: <452511C1.5020709@sendu.me.uk>

Hilmar Lapp wrote:
> 
> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote:
> 
>> This is actually a general question and not limited to EUtilities. As I
>> see it EUtilities lets you do queries in Bioperl that you can do on a
>> website. The question is, should a Bioperl module always work with
>> queries that the website it is a front-end to works with?
> 
> I think yes, but stick to this definition.
> 
> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez 
> website it will actually not work. Hence, it should be no surprise that 
> it doesn't work either using Bio::DB::EUtilities.

On the contrary, I find it a surprise because EUtilities is an interface 
to NCBI's eutils, not the entrez website.

If I had previously read instructions on using eutils:
http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
I might (do) expect that I /should/ use + in my term.


> Aside from that, one of the advantages of having the service wrapped in 
> Bioperl is in fact that you can have it accept a wider variety of 
> parameters than the actual service would allow you to have, e.g., 
> arrays, hashes, or whatever seems appropriate.

I was going to suggest that terms be supplied as an array, leaving 
Bioperl code to decide how to 'AND' all the terms (elements in the 
array) together. It would also further force the user not to think of 
how eutils normally works, but to only consider the Bioperl instructions 
on how to form a query. But I'm not sure of the value of all that.


From cjfields at uiuc.edu  Thu Oct  5 10:06:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:06:50 -0500
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <452506D3.5050501@sendu.me.uk>
References: <1160050554.18691.11.camel@localhost>
	<452506D3.5050501@sendu.me.uk>
Message-ID: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>

On Oct 5, 2006, at 8:21 AM, Sendu Bala wrote:

> Marc Weimer wrote:
> [snip]
>>   my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
>>
>>   my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
> [snip]
>> using Bioperl 1.4-1 I get the error message
> [snip]
>>   ------------- EXCEPTION: Bio::Root::Exception -------------
>>   MSG: swissprot stream with no ID. Not swissprot in my book
> [snip]
>> Any suggestions?
>
> It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most
> recent official release), but 1.5.2 does
> (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS
> (http://bioperl.org/wiki/Getting_BioPerl#CVS).

Marc, you'll have to update to 1.5.2 or CVS, as Sendu suggested.
There were server changes for biofetch which were fixed about 4-6  
months ago (post rel. 1.5.1); I think several changes were made to  
Bio::SeqIO::swiss as well during this period.

I think the error here results from Bio::SeqIO::swiss trying to parse  
an empty byte stream.  Sendu, do you think that Bio::SeqIO::swiss  
(and other SeqIO parsers) should throw a more specific message for  
getting an empty byte stream?  Or is it more trouble than it's worth?
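
One way a parser could separate "no data at all" from "data that is not Swiss-Prot" is to check the first read before looking for the ID line. The sketch below is in Python for brevity and is not the Bio::SeqIO::swiss code; `EmptyStreamError`, `read_swiss_id`, and the sample record line are hypothetical names and data made up for illustration.

```python
import io


class EmptyStreamError(ValueError):
    """Raised when the input stream yields no data at all (hypothetical name)."""


def read_swiss_id(stream):
    """Return the entry name from the first line of a Swiss-Prot-style stream.

    Distinguishes an empty stream from a stream whose first line is not
    an ID line, so the two failures produce different messages.
    """
    first = stream.readline()
    if first == "":
        # readline() returns "" only at end of input: nothing was received.
        raise EmptyStreamError("empty stream: the server returned no data")
    if not first.startswith("ID "):
        raise ValueError("swissprot stream with no ID line")
    return first.split()[1]


# Made-up record line, for illustration only.
print(read_swiss_id(io.StringIO("ID   EXAMPLE_ID   Reviewed;\n")))
```

With a check like this, a dead or changed server would surface as an "empty stream" error rather than a format complaint.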

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 10:14:40 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:14:40 +0100
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>
References: <1160050554.18691.11.camel@localhost>
	<452506D3.5050501@sendu.me.uk>
	<1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>
Message-ID: <45251350.5030608@sendu.me.uk>

Chris Fields wrote:
>
>>>   ------------- EXCEPTION: Bio::Root::Exception -------------
>>>   MSG: swissprot stream with no ID. Not swissprot in my book
[snip]
> I think the error here results from Bio::SeqIO::swiss trying to parse an 
> empty byte stream.  Sendu, do you think that Bio::SeqIO::swiss (and 
> other SeqIO parsers) should throw a more specific message for getting an 
> empty byte stream?  Or is it more trouble than it's worth?

Trouble wise, I've no idea without looking into it. Generally speaking 
though I can say that the error message is pretty useless and I'm always 
in favour of better error messages.

From hlapp at gmx.net  Thu Oct  5 10:21:49 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 10:21:49 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452511C1.5020709@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
Message-ID: <F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>


On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote:

>>
>> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote:
>>
>>> This is actually a general question and not limited to EUtilities. As I
>>> see it EUtilities lets you do queries in Bioperl that you can do on a
>>> website. The question is, should a Bioperl module always work with
>>> queries that the website it is a front-end to works with?
>>
>> I think yes, but stick to this definition.
>>
>> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez
>> website it will actually not work. Hence, it should be no surprise that
>> it doesn't work either using Bio::DB::EUtilities.
>
> On the contrary, I find it a surprise because EUtilities is an interface
> to NCBI's eutils, not the entrez website.
>
> If I had previously read instructions on using eutils:
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
> I might (do) expect that I /should/ use + in my term.

This is my point - stick to your definitions. Are you wrapping a  
query form on a website or are you wrapping a web service (i.e., a URL)?

The examples you give are about wrapping a web-service. Your original  
question was about wrapping a website. Yet another question is what  
the author of Bio::DB::EUtilities intended to wrap.

The other thing to consider is user-friendliness. If you are wrapping
a web-service, should leaving the user input un-encoded really be the
default? What will 90% of the users probably want or expect to be
able to do: URL-encode all input themselves, or have the module
do this for them unless they turn it off?

As far as I'm concerned, I'll happily count myself among those who  
are lazy and ignorant, don't read NCBI's documentation, don't want to  
know how to URL encode and why this needs to be done, but just want  
it to work.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Oct  5 10:31:06 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:31:06 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4524B20B.5010703@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
Message-ID: <A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>

On Oct 5, 2006, at 2:19 AM, Sendu Bala wrote:

> This is actually a general question and not limited to EUtilities. As I
> see it EUtilities lets you do queries in Bioperl that you can do on a
> website. The question is, should a Bioperl module always work with
> queries that the website it is a front-end to works with?
>
> So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is
> essentially a frontend onto:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=
>
> With a web-browser you can complete that url by supplying a term. For
> example, the term 'BRCA2+9606[taxid]' works and returns results:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid]
>
> If you supply the exact same term to EUtilities::esearch like so:
>
> my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
> "gene", -term => "BRCA2+9606[taxid]");
>
> The search fails. From my 'user' perspective this is highly unexpected.
> Chris (the author) and I both understand /why/ it fails, but Chris
> doesn't think it is a bug, or at least something that can/should be
> changed. What do other people think? At the very least, if something
> unexpected happens, I'd suggest making a note of it in the POD
> somewhere. Eg. "Do not use + in term strings, even though they might
> work on the website".
>
> Chris: what is the disadvantage of always submitting '+' as '+' to the
> server?

A few reasons:

1)  According to NCBI, you can use '+' in queries, but not as a  
boolean.  Global changes of '+' to a space may change the meaning of  
the query in a few rare occasions.  So, if you really wanted to  
search for the string 'BRCA2+ATG', NCBI looks for that term literally.

2)  '+' is a URI reserved symbol for a space delimiter.  Therefore,  
any parameters containing '+' are URI-encoded into %2B, which is  
decoded on NCBI's end back to '+' (This is demonstrable with current
EUtilities output and the returned XML data).

3)  Why not just use a space (implicit AND)?  Or an explicit  
boolean?  Or '&' (which apparently works but is not specified in the  
NCBI Entrez docs)?
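
To make point 2 concrete, here is a minimal sketch of how a standard form decoder treats a raw '+' versus an escaped '%2B'. It uses Python's `urllib.parse` purely as a neutral illustration; that NCBI's server decodes the same way is an assumption consistent with the behaviour described above, not something this snippet checks.

```python
from urllib.parse import parse_qs

# A raw '+' in a query string is decoded as a space.
raw_plus = parse_qs("term=BRCA2+9606[taxid]")["term"][0]

# An escaped '+' (%2B) is decoded back to a literal plus sign.
escaped_plus = parse_qs("term=BRCA2%2B9606%5Btaxid%5D")["term"][0]

print(repr(raw_plus))      # 'BRCA2 9606[taxid]'
print(repr(escaped_plus))  # 'BRCA2+9606[taxid]'
```

So the hand-typed URL searches for two terms, while the encoded one searches for the single literal string containing '+'.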

The bug is in the query and not in the code, i.e. it is a
user-generated bug, not an EUtilities bug.  And it shouldn't be
unexpected, as NCBI has very specific rules for building queries for  
Entrez (just like any other database).  If I were to use nonstandard  
queries for MySQL, BioFetch, UCSC, or anything else, I would expect  
to get bad results.  As the old saying goes, garbage in, garbage out.

The following link has their updated rules:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.chapter.EntrezHelp

Here is their old one:

http://www.ncbi.nlm.nih.gov/entrez/query/static/help/helpdoc.html

We could, of course, put something in POD, but you never presented
that option to me before.  I'll grant that the EUtilities API needs
some cleaning up, which is not easy to do when the returned data varies
from utility to utility.  But it does get the URL encoding correct, at
least in this case.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 10:32:49 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:32:49 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>
Message-ID: <45251791.9040409@sendu.me.uk>

Hilmar Lapp wrote:
> 
> On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote:
>
>> On the contrary, I find it a surprise because EUtilities is an interface
>> to NCBI's eutils, not the entrez website.
>>
>> If I had previously read instructions on using eutils:
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls 
>>
>> I might (do) expect that I /should/ use + in my term.
> 
> This is my point - stick to your definitions. Are you wrapping a query 
> form on a website or are you wrapping a web service (i.e., a URL)?
> 
> The examples you give are about wrapping a web-service. Your original 
> question was about wrapping a website.

Right... I don't see that that changes the answer to my question though 
does it?

"The question is, should a Bioperl module always work with
queries that the web-service it is a front-end to works with?"

For me, the answer is still yes.


> As far as I'm concerned, I'll happily count myself among those who are 
> lazy and ignorant, don't read NCBI's documentation, don't want to know 
> how to URL encode and why this needs to be done, but just want it to work.

That's a reasonable attitude to take. Which comes back to the question I 
asked of Chris - naively, if you send + as + you can please everyone, 
can't you? Both people who have read the docs on the web-service and 
those who haven't? Or are there real queries in which a user may want to 
search for a phrase with a literal + in it (and where such a search 
works via eutils)?

From bix at sendu.me.uk  Thu Oct  5 10:44:33 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:44:33 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
Message-ID: <45251A51.6020802@sendu.me.uk>

Chris Fields wrote:
> The bug is in the query and not in the code, i.e. it is a
> user-generated bug, not an EUtilities bug.  And it shouldn't be 
> unexpected, as NCBI has very specific rules for building queries for 
> Entrez (just like any other database).

So I guess this comes down to something Hilmar mentioned and I never 
even considered before. You consider your EUtilities stuff as a frontend 
to entrez, and therefore consider valid queries as queries that are 
valid for entrez and not eutils?

If that's the case, fine. I understand why you don't think this is a 
bug. Again, something that might warrant a mention in the POD.
Currently the naming of the modules and the explicit references to 
eutils (and me knowing the implementation uses eutils) got me confused.

From cjfields at uiuc.edu  Thu Oct  5 10:51:28 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:51:28 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452511C1.5020709@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
Message-ID: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>


On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote:

>>> This is actually a general question and not limited to EUtilities. As I
>>> see it EUtilities lets you do queries in Bioperl that you can do on a
>>> website. The question is, should a Bioperl module always work with
>>> queries that the website it is a front-end to works with?
>>
>> I think yes, but stick to this definition.
>>
>> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez
>> website it will actually not work. Hence, it should be no surprise that
>> it doesn't work either using Bio::DB::EUtilities.
>
> On the contrary, I find it a surprise because EUtilities is an interface
> to NCBI's eutils, not the entrez website.

It uses NCBI's CGI interface for eutils, not the SOAP interface.   
Very different.  I have considered using the NCBI SOAP-based  
interface, but the web services are still somewhat incomplete, unlike  
the CGI interface.

> If I had previously read instructions on using eutils:
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
> I might (do) expect that I /should/ use + in my term.

You are looking at part of the naked URL on that page.  Here's what  
that page says:

"When constructing URLs for the eUtils, please use lowercase  
characters for all parameters except &WebEnv. There is no required  
order for the URL parameters in an eUtils URL, and null values or  
inappropriate parameters are ignored. Avoid placing spaces in the  
URLs, particularly in queries. If a space is required, use a plus  
sign (+) instead of a space:

     * Incorrect: &id=352, 25125, 234, ...
     * Correct: &id=352,25125,234,...
     * Incorrect: &term=biomol mrna[properties] AND mouse[organism]
     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]

Other special characters, such as the # symbol used in referring to a  
query key on the History server, should be represented by their URL  
encodings (%23 for #)."

I use URI for building the URL with the parameters.  URI specifically  
encodes all of this for you, so spaces convert to '+' and '+'  
converts to %2B.
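
The encoding step can be sketched as follows. This is Python's `urllib.parse.urlencode` standing in for Perl's URI module, so it illustrates the rule (space becomes '+', '+' becomes %2B) rather than the actual EUtilities code; `esearch_url` is a hypothetical helper, and the base URL is the esearch address quoted earlier in the thread.

```python
from urllib.parse import urlencode

BASE = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, db="gene"):
    """Build an esearch URL, form-encoding every parameter value."""
    return BASE + "?" + urlencode({"retmode": "xml", "db": db, "term": term})


with_space = esearch_url("BRCA2 9606[taxid]")
with_plus = esearch_url("BRCA2+9606[taxid]")

print(with_space)  # ...&term=BRCA2+9606%5Btaxid%5D
print(with_plus)   # ...&term=BRCA2%2B9606%5Btaxid%5D
```

Only the first form reaches the server as two space-separated terms; the second arrives as one term with a literal '+'.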

>> Aside from that, one of the advantages of having the service wrapped in
>> Bioperl is in fact that you can have it accept a wider variety of
>> parameters than the actual service would allow you to have, e.g.,
>> arrays, hashes, or whatever seems appropriate.
>
> I was going to suggest that terms be supplied as an array, leaving
> Bioperl code to decide how to 'AND' all the terms (elements in the
> array) together. It would also further force the user not to think of
> how eutils normally works, but to only consider the Bioperl instructions
> on how to form a query. But I'm not sure of the value of all that.

Why do we need to intuit what the user is thinking at any particular
time?  How would I know that someone actually wanted to search using  
the literal string 'abc+123' as opposed to 'abc 123'?

I see value in your last suggestion but I think a class or set of  
classes would be best suited for that:

MySQL Query     |  in                      out   | MySQL Query
Entrez Query    |-----> Generic Query class----->| Entrez Query
SRS Query       |                                | SRS Query
ad infinitum...

The generic query object could then be used in DB searches as an  
option besides using a raw string.  Though it would get tricky with  
SQL's complexity...
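
A rough sketch of what such a generic query object might look like, in Python for brevity. `Criterion`, `GenericQuery`, `to_entrez`, and `to_sql_where` are all hypothetical names, the field names are examples only, and only plain AND-ed equality criteria are handled, which sidesteps exactly the SQL complexity mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One search condition: a value, optionally restricted to a field."""
    value: str
    field: str = ""


class GenericQuery:
    """Backend-neutral list of AND-ed criteria (illustrative only)."""

    def __init__(self, criteria):
        self.criteria = list(criteria)

    def to_entrez(self):
        # Entrez-style syntax: value[field], joined by an explicit AND.
        parts = [f"{c.value}[{c.field}]" if c.field else c.value
                 for c in self.criteria]
        return " AND ".join(parts)

    def to_sql_where(self):
        # Parameterized WHERE clause; values never enter the SQL text.
        clauses, params = [], []
        for c in self.criteria:
            if not c.field.isidentifier():
                raise ValueError(f"not a usable column name: {c.field!r}")
            clauses.append(f"{c.field} = ?")
            params.append(c.value)
        return " AND ".join(clauses), params


query = GenericQuery([Criterion("BRCA2", "gene"), Criterion("9606", "taxid")])
print(query.to_entrez())     # BRCA2[gene] AND 9606[taxid]
print(query.to_sql_where())  # ('gene = ? AND taxid = ?', ['BRCA2', '9606'])
```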

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Thu Oct  5 10:54:04 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 10:54:04 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251791.9040409@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>
	<45251791.9040409@sendu.me.uk>
Message-ID: <9916EDEE-EA3C-4C55-A004-A46F37B559BF@gmx.net>


On Oct 5, 2006, at 10:32 AM, Sendu Bala wrote:

>> The examples you give are about wrapping a web-service. Your  
>> original question was about wrapping a website.
>
> Right... I don't see that that changes the answer to my question  
> though does it?
>
> "The question is, should a Bioperl module always work with
> queries that the web-service it is a front-end to works with?"
>
> For me, the answer is still yes.

The answer is still yes. My point was the query that works with a  
website is not necessarily the query that works with a web-service,  
even if that web-service also powers the website.

>
>> As far as I'm concerned, I'll happily count myself among those who  
>> are lazy and ignorant, don't read NCBI's documentation, don't want  
>> to know how to URL encode and why this needs to be done, but just  
>> want it to work.
>
> That's a reasonable attitude to take. Which comes back to the  
> question I asked of Chris - naively, if you send + as + you can  
> please everyone, can't you? Both people who have read the docs on  
> the web-service and those who haven't? Or are there real queries in  
> which a user may want to search for a phrase with a literal + in it  
> (and where such a search works via eutils)?

So are you suggesting to URL-encode some characters but not others?  
This would move you into muddy waters and I'm wondering what the gain  
is from that, and for whom it is a gain.

It sounds like it will mostly benefit those who have studied the NCBI  
documentation and know exactly the URL they want to send and want to  
ignore the EUtilities POD.

My humble guess is the vast majority of people will either not read
any documentation, or read the module's POD.

Maybe a better way to serve both types of people is to accept a  
parameter -querystring that is expected to include everything from  
'term=' onwards (including 'term=' itself) which gives you complete  
control and freedom if you know what you are doing, and otherwise  
implement what you suggested before:

> I was going to suggest that terms be supplied as an array, leaving
> Bioperl code to decide how to 'AND' all the terms (elements in the
> array) together. It would also further force the user not to think of
> how eutils normally works, but to only consider the Bioperl instructions
> on how to form a query.
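
A minimal sketch of that two-mode interface, written in Python as a neutral illustration. `build_query` and its `terms` and `querystring` arguments are hypothetical and only mirror the proposed -querystring parameter and the array-of-terms idea; nothing here is existing Bio::DB::EUtilities API.

```python
from urllib.parse import quote_plus


def build_query(terms=None, querystring=None):
    """Return the 'term=...' part of an esearch URL.

    querystring: passed through untouched, for users who already hold a
                 fully encoded string starting with 'term='.
    terms:       a list of plain terms that are AND-ed and URL-encoded.
    """
    if querystring is not None:
        if terms is not None:
            raise ValueError("give either terms or querystring, not both")
        return querystring
    if not terms:
        raise ValueError("no search terms given")
    return "term=" + quote_plus(" AND ".join(terms))


print(build_query(terms=["BRCA2", "9606[taxid]"]))
# term=BRCA2+AND+9606%5Btaxid%5D
print(build_query(querystring="term=BRCA2+9606[taxid]"))
# term=BRCA2+9606[taxid]
```

The first mode hides encoding entirely; the second leaves the user fully responsible for it.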


	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Thu Oct  5 11:02:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 16:02:01 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
Message-ID: <45251E69.7040507@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote:
>
>> On the contrary, I find it a surprise because EUtilities is an interface
>> to NCBI's eutils, not the entrez website.
> 
> It uses NCBI's CGI interface for eutils, not the SOAP interface.  Very 
> different.  I have considered using the NCBI SOAP-based interface, but 
> the web services are still somewhat incomplete, unlike the CGI interface.

I don't know anything about the SOAP interface. I'm talking about the 
CGI interface that you use.


>> If I had previously read instructions on using eutils:
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls 
>>
>> I might (do) expect that I /should/ use + in my term.
> 
> You are looking at part of the naked URL on that page.  Here's what that 
> page says:

I know what it says...

>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]

The correct query is the one that has +s in it.


> I use URI for building the URL with the parameters.  URI specifically 
> encodes all of this for you, so spaces convert to '+' and '+' converts 
> to %2B.

Well, yes. This causes what I thought of as a bug. It prevents me from 
submitting a /correct/ eutils term. However it isn't a bug if you 
explain to users they shouldn't be submitting valid eutils terms, but 
only valid /entrez/ terms.

From cjfields at uiuc.edu  Thu Oct  5 11:15:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:15:49 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251A51.6020802@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
	<45251A51.6020802@sendu.me.uk>
Message-ID: <B0BCFFEE-9200-4CC7-9D53-7C2266950691@uiuc.edu>


On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> The bug is in the query and not in the code, i.e. it is a
>> user-generated bug, not an EUtilities bug.  And it shouldn't be
>> unexpected, as NCBI has very specific rules for building queries
>> for Entrez (just like any other database).
>
> So I guess this comes down to something Hilmar mentioned and I  
> never even considered before. You consider your EUtilities stuff as  
> a frontend to entrez, and therefore consider valid queries as  
> queries that are valid for entrez and not eutils?

The eutils tools access the same databases as the web page, in the  
same way, using the same search terms.  From the EUtilities docs:

"The eUtils access the core search and retrieval engine of the Entrez  
system and, therefore, are only capable of retrieving data that are  
already in Entrez."

> If that's the case, fine. I understand why you don't think this is  
> a bug. Again, something that might warrant a mention in the POD.
> Currently the naming of the modules and the explicit references to  
> eutils (and me knowing the implementation uses eutils) got me  
> confused.

I'll note in the POD that there is URI encoding, but that should be a
no-brainer.  I don't think every Bio::DB* class specifies this,
mainly because it is taken for granted.  Pretty much anything that  
builds URL strings needs to encode based on the URI standard, and any  
server that accepts URLs is expected to decode using the same standard.

So, again, why does that have to be specifically outlined in POD?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 11:24:39 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:24:39 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251E69.7040507@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
Message-ID: <BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>

>> I use URI for building the URL with the parameters.  URI  
>> specifically encodes all of this for you, so spaces convert to '+'  
>> and '+' converts to %2B.
>
> Well, yes. This causes what I thought of as a bug. It prevents me  
> from submitting a /correct/ eutils term. However it isn't a bug if  
> you explain to users they shouldn't be submitting valid eutils  
> terms, but only valid /entrez/ terms.

I can specify in POD that URI encoding is in effect if that placates  
you, and maybe add a bit about how terms are to be built (based on  
the website).  I also noticed that the esearch POD doesn't have a  
demo in the SYNOPSIS yet (my fault).

However, I think this is all a bit silly.  This is something most  
people already realize and take for granted (it's standard for any  
CGI interface to use URI encoding).

Also, most Entrez users do not use a term like
'BRCA2+Human[ORGANISM]'.  They use 'BRCA2 AND Human[ORGANISM]' or
'BRCA2 Human[ORGANISM]', the latter of which is implicit.  All of this
is on the Entrez website.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From MEC at stowers-institute.org  Thu Oct  5 11:12:02 2006
From: MEC at stowers-institute.org (Cook, Malcolm)
Date: Thu, 5 Oct 2006 10:12:02 -0500
Subject: [Bioperl-l] using nfreeze instead of freeze in
	Bio::SeqFeature::Store
Message-ID: <CED81D34E37D5043A1211565277A51E5065E9879@exchkc02.stowers-institute.org>

Lincoln,

I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
freeze which should allow SeqFeature objects to survive database
freeze/thaw cycles across architectures.

I hope I was not presumptuous or in error in doing this....

Regards,

Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri
 

From bix at sendu.me.uk  Thu Oct  5 11:28:55 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 16:28:55 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <B0BCFFEE-9200-4CC7-9D53-7C2266950691@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
	<45251A51.6020802@sendu.me.uk>
	<B0BCFFEE-9200-4CC7-9D53-7C2266950691@uiuc.edu>
Message-ID: <452524B7.5080003@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote:
> 
>> Chris Fields wrote:
>>> The bug is in the query and not in the code, i.e. it is a
>>> user-generated bug, not an EUtilities bug.  And it shouldn't be 
>>> unexpected, as NCBI has very specific rules for building queries for 
>>> Entrez (just like any other database).
>>
>> So I guess this comes down to something Hilmar mentioned and I never 
>> even considered before. You consider your EUtilities stuff as a 
>> frontend to entrez, and therefore consider valid queries as queries 
>> that are valid for entrez and not eutils?
> 
> The eutils tools access the same databases as the web page, in the same 
> way, using the same search terms.

It doesn't. The eutils interface behaves differently with +s than does 
the entrez website interface. In eutils + means space, whilst in entrez, 
+ means the plus symbol.


>> If that's the case, fine. I understand why you don't think this is a 
>> bug. Again, something that might warrant a mention in the POD.
>> Currently the naming of the modules and the explicit references to 
>> eutils (and me knowing the implementation uses eutils) got me confused.
> 
> I'll note in the POD that there is URI encoding, but that should be a
> no-brainer.

Just that it is URI encoded isn't the problem. The problem is the 
difference in behaviour outlined above.


> I don't think every Bio::DB* class specifies this, mainly 
> because it is taken for granted.  Pretty much anything that builds URL 
> strings needs to encode based on the URI standard, and any server that 
> accepts URLs is expected to decode using the same standard.
> 
> So, again, why does that have to be specifically outlined in POD?

Because they're different. If I construct a valid eutils query it might 
not work. You ought to explain why.

"EUtilities takes any valid entrez query and transforms it into a valid 
eutils query for submission. Do not try and provide a valid eutils query 
of your own, or the extra transformation will result in no results"

From bix at sendu.me.uk  Thu Oct  5 11:30:44 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 16:30:44 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
Message-ID: <45252524.7030006@sendu.me.uk>

Chris Fields wrote:
>>> I use URI for building the URL with the parameters.  URI specifically 
>>> encodes all of this for you, so spaces convert to '+' and '+' 
>>> converts to %2B.
>>
>> Well, yes. This causes what I thought of as a bug. It prevents me from 
>> submitting a /correct/ eutils term. However it isn't a bug if you 
>> explain to users they shouldn't be submitting valid eutils terms, but 
>> only valid /entrez/ terms.
> 
> I can specify in POD that URI encoding is in effect if that placates 
> you, and maybe add a bit about how terms are to be built (based on the 
> website).  I also noticed that the esearch POD doesn't have a demo in 
> the SYNOPSIS yet (my fault).
> 
> However, I think this is all a bit silly.  This is something most people 
> already realize and take for granted (it's standard for any CGI 
> interface to use URI encoding).
> 
> Also, most Entrez users do not use a term like 'BRCA2+Human[ORGANISM]'.
> They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human[ORGANISM]', the
> latter of which is implicit.  All of this is on the Entrez website.

Exactly. You're assuming an entrez user and expecting an entrez query. I 
don't think it's silly given the name of the modules for the user to
assume the code needs an eutils query, which is a different thing with 
different behaviour /independent/ of URI encoding.

From cjfields at uiuc.edu  Thu Oct  5 11:50:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:50:51 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251E69.7040507@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
Message-ID: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>

> I know what it says...

Ah, that's the Sendu I know and love.

>
>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>
> The correct query is the one that has +s in it.

Yes, that's because it's a URL, not a raw search term string (it has  
been URI-encoded so spaces are converted to '+').  If you use that as  
a direct query in Entrez you will not get the same response.  You do  
get something if you use the new NCBI global query form on the main  
page, but clicking on the nucleotide or PMC hits reveals that the URL  
is malformed and no term is present.  That is exactly the same  
response in EUtilities:

<?xml version="1.0"?>
<!DOCTYPE eSearchResult PUBLIC "-//NLM//DTD eSearchResult, 11 May 2002//EN"
 "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eSearch_020511.dtd">
<eSearchResult>
         <Count>0</Count>
         <RetMax>0</RetMax>
         <RetStart>0</RetStart>
         <IdList>
         </IdList>
         <TranslationSet>
         </TranslationSet>
         <QueryTranslation></QueryTranslation>
</eSearchResult>

Note the QueryTranslation tag is empty.

The only noticeable difference is using egquery (which I just fixed
in CVS yesterday).  The returned XML gives no hits for any database,
which is true based on individual esearch queries for those databases,
and is actually more consistent than the website version.

>> I use URI for building the URL with the parameters.  URI specifically
>> encodes all of this for you, so spaces convert to '+' and '+' converts
>> to %2B.
>
> Well, yes. This causes what I thought of as a bug. It prevents me from
> submitting a /correct/ eutils term. However it isn't a bug if you
> explain to users they shouldn't be submitting valid eutils terms, but
> only valid /entrez/ terms.

If you mean that most users will actually use a URL-like search term,  
then I would say you have a point.  But that simply isn't the case.

If clarifying the docs makes it better, then so be it.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 11:59:53 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:59:53 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45252524.7030006@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
Message-ID: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>


On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>>> I use URI for building the URL with the parameters.  URI  
>>>> specifically encodes all of this for you, so spaces convert to  
>>>> '+' and '+' converts to %2B.
>>>
>>> Well, yes. This causes what I thought of as a bug. It prevents me  
>>> from submitting a /correct/ eutils term. However it isn't a bug  
>>> if you explain to users they shouldn't be submitting valid eutils  
>>> terms, but only valid /entrez/ terms.
>> I can specify in POD that URI encoding is in effect if that  
>> placates you, and maybe add a bit about how terms are to be built  
>> (based on the website).  I also noticed that the esearch POD  
>> doesn't have a demo in the SYNOPSIS yet (my fault).
>> However, I think this is all a bit silly.  This is something most  
>> people already realize and take for granted (it's standard for any  
>> CGI interface to use URI encoding).
>> Also, most Entrez users do not use a term like 'BRCA2+Human 
>> [ORGANISM]'.  They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human 
>> [ORGANISM]', the latter which is implicit.  All of this is on the  
>> Entrez website.
>
> Exactly. You're assuming an entrez user and expecting an entrez  
> query. I don't think it's silly given the name of the modules for
> the user to assume the code needs an eutils query, which is a  
> different thing with different behaviour /independent/ of URI  
> encoding.

It's a silly distinction.  The POD for Bio::DB::EUtilities states:

Bio::DB::EUtilities - interface for handling web queries and data  
retrieval from NCBI's Entrez Utilities.

My question is this : why would anyone (particularly the everyday  
bioperl user) want to use URL-encoded parameters for a query?  That  
seems to be your main argument here.  If so, wouldn't I just paste  
them together then send them off to NCBI eutils?  Would I devote ~ 10
classes to that?  I could do that in a short program using an array,  
join, and LWP::Simple.

The purpose is quite clearly stated, but if you feel that by  
badgering me to add something to POD I consider common sense, then  
you're right.  You've succeeded.  Bravo.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 12:02:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 17:02:05 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
Message-ID: <45252C7D.3050009@sendu.me.uk>

Chris Fields wrote:
>
>>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>
>> The correct query is the one that has +s in it.
> 
> Yes, that's because it's a URL, not a raw search term string (it has 
> been URI-encoded so spaces are converted to '+').  If you use that as a 
> direct query in Entrez you will not get the same response.

But we're not doing Entrez queries. We're using a module called 
EUtilities to do an eutils query, which involves forming a url in which 
spaces should be converted to +. That's the source of confusion. Is
the user supposed to do this, or is EUtilities?

All you had to do 8 emails ago is tell me that EUtilities is supposed to 
do that. You /still/ haven't told me that. I give up.


From cjfields at uiuc.edu  Thu Oct  5 12:12:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 11:12:11 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45252C7D.3050009@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
	<45252C7D.3050009@sendu.me.uk>
Message-ID: <A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>


On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>>>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>>
>>> The correct query is the one that has +s in it.
>> Yes, that's because it's a URL, not a raw search term string (it  
>> has been URI-encoded so spaces are converted to '+').  If you use  
>> that as a direct query in Entrez you will not get the same response.
>
> But we're not doing Entrez queries. We're using a module called  
> EUtilities to do an eutils query, which involves forming a url in  
> which spaces should be converted to +. That's the source of
> confusion. Is the user supposed to do this, or is EUtilities?
>
> All you had to do 8 emails ago is tell me that EUtilities is  
> supposed to do that. You /still/ haven't told me that. I give up.

It should be apparent from the documentation and the URLs posted in  
debugging output the first few times you used it.  Again, why would I  
dedicate ~ 10 classes to pasting together URI-encoded strings?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 12:22:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 17:22:36 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
Message-ID: <4525314C.7020205@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote:
>
>> Exactly. You're assuming an entrez user and expecting an entrez query. 
>> I don't think it's silly given the name of the modules for the user to
>> assume the code needs an eutils query, which is a different thing with 
>> different behaviour /independent/ of URI encoding.
> 
> It's a silly distinction.  The POD for Bio::DB::EUtilities states:
> 
> Bio::DB::EUtilities - interface for handling web queries and data 
> retrieval from NCBI's Entrez Utilities.
> 
> My question is this : why would anyone (particularly the everyday 
> bioperl user) want to use URL-encoded parameters for a query?

Well I'll tell you why I was trying to use URL-encoded parameters, if 
that helps you any.

I read the pod for EUtilities but all the examples have very simple 
-term s defined with just a single word. So I wonder how I'm supposed to 
make an 'AND' term. I also have no idea what utilities I'm supposed to 
use, or what databases etc. I need to get the answer I want.

The POD points me here:
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
Combined with the EUtilities synopsis I know I'm supposed to start with 
esearch so I look at:
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html
And figure out what my terms are supposed to be.

Then I test some example terms in my web browser using the esearch base 
url (http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?) to see 
if they work, and copy/paste the terms into my EUtilities-using perl 
script, replacing variable terms with perl variables.

Then I find that my terms don't work, ask you about it, and you fail to 
tell me I should be testing my terms at 
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=gene.

If you think I'm stupid, fine, but I'm probably not the only stupid 
person on the planet. Which is why I suggested a POD addition. You don't 
have to make any POD change if you don't want to. I simply thought it 
might help avoid anyone 'badgering' you in the future with a similar 
problem.

From bix at sendu.me.uk  Thu Oct  5 12:28:51 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 17:28:51 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
	<45252C7D.3050009@sendu.me.uk>
	<A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>
Message-ID: <452532C3.9030804@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote:
> 
>> Chris Fields wrote:
>>>
>>>>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>>>
>>>> The correct query is the one that has +s in it.
>>> Yes, that's because it's a URL, not a raw search term string (it has 
>>> been URI-encoded so spaces are converted to '+').  If you use that as 
>>> a direct query in Entrez you will not get the same response.
>>
>> But we're not doing Entrez queries. We're using a module called 
>> EUtilities to do an eutils query, which involves forming a url in 
>> which spaces should be converted to +. That's the source of
>> confusion. Is the user supposed to do this, or is EUtilities?
>>
>> All you had to do 8 emails ago is tell me that EUtilities is supposed 
>> to do that. You /still/ haven't told me that. I give up.
> 
> It should be apparent from the documentation and the URLs posted in 
> debugging output the first few times you used it.  Again, why would I 
> dedicate ~ 10 classes to pasting together URI-encoded strings?

I'm not sure how not doing URI-encoding would suddenly make your classes 
worthless. I find them to be very useful (even when I didn't know there 
was any URI-encoding, was incorrectly using +s and it happened to work 
anyway).

From bernd.web at gmail.com  Thu Oct  5 10:09:38 2006
From: bernd.web at gmail.com (Bernd Web)
Date: Thu, 5 Oct 2006 16:09:38 +0200
Subject: [Bioperl-l] Eutilities Batch
Message-ID: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>

Hi,

I am using the new EUtilities. It looks great.
I was trying to use epost followed by elink but I get an error. The
same error is actually given with the example on
http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
Can't call method "get_databases" on an undefined value at EU.pl line 25.

For completeness, the code is shown below too.

Any suggestions what is going wrong?

Regards,
Bernd

# chain EUtilities for complex queries

  use Bio::DB::EUtilities;

  my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                         -db         => 'pubmed',
                                         -term       => 'hutP',
                                         -usehistory => 'y');

  $esearch->get_response; # parse the response, fetch a cookie

  my $elink = Bio::DB::EUtilities->new(-eutil        => 'elink',
                                       -db           => 'protein,taxonomy',
                                       -dbfrom       => 'pubmed',
                                       -cookie       => $esearch->next_cookie,
                                       -cmd          => 'neighbor');

  # this retrieves the Bio::DB::EUtilities::ElinkData object

  my ($linkset) = $elink->next_linkset;
  my @ids;

  # step through IDs for each linked database in the ElinkData object

  for my $db ($linkset->get_databases) {
    @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
    # do something here
  }

From cjfields at uiuc.edu  Thu Oct  5 13:31:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:31:33 -0500
Subject: [Bioperl-l] Eutilities Batch
In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
Message-ID: <F53B83B9-E188-4715-8229-0B6D9C0C982A@uiuc.edu>

I'll look into it.  I'm busy updating the EUtilities tools now.

Chris

On Oct 5, 2006, at 9:09 AM, Bernd Web wrote:

> Hi,
>
> I am using the new EUtilities. It looks great.
> I was trying to use epost followed by elink but I get an error. The
> same error is actually given with the example on
> http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
> Can't call method "get_databases" on an undefined value at EU.pl  
> line 25.
>
> For completeness, the code is shown below too.
>
> Any suggestions what is going wrong?
>
> Regards,
> Bernd
>
> # chain EUtilities for complex queries
>
>   use Bio::DB::EUtilities;
>
>   my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                          -db         => 'pubmed',
>                                          -term       => 'hutP',
>                                          -usehistory => 'y');
>
>   $esearch->get_response; # parse the response, fetch a cookie
>
>   my $elink = Bio::DB::EUtilities->new(-eutil        => 'elink',
>                                        -db           => 'protein,taxonomy',
>                                        -dbfrom       => 'pubmed',
>                                        -cookie       => $esearch->next_cookie,
>                                        -cmd          => 'neighbor');
>
>   # this retrieves the Bio::DB::EUtilities::ElinkData object
>
>   my ($linkset) = $elink->next_linkset;
>   my @ids;
>
>   # step through IDs for each linked database in the ElinkData object
>
>   for my $db ($linkset->get_databases) {
>     @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
>     # do something here
>   }
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From daniel.lang at biologie.uni-freiburg.de  Thu Oct  5 13:12:02 2006
From: daniel.lang at biologie.uni-freiburg.de (Daniel Lang)
Date: Thu, 05 Oct 2006 19:12:02 +0200
Subject: [Bioperl-l] Bio::DB::SeqFeature
Message-ID: <45253CE2.1070208@biologie.uni-freiburg.de>

Hi,

we are storing Bio::SeqFeature::Gene::GeneStructure objects (with
multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db
(latest bioperl-live checkout).

The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch
out of a database.

The first observation is that is seems to work (fetched objects behave
like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we
get these warnings:

Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into
lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into
lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
        (in cleanup) Not a CODE reference at
/home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.
prepare_cached(SELECT f.id,f.object
  FROM feature as f
  WHERE (   f.seqid=?
   AND   f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?))
)

) statement handle DBI::st=HASH(0x1c317cf0) still Active at
/home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm
line 1422
        (in cleanup) Not a CODE reference at
/home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.

Is this something serious? Does this mean that the stored object doesn't
have everything it had before freezing? Or are we using
Bio::DB::SeqFeature inappropriately?

The other question would be, if we can visualize these stored feature
objects easily using gbrowse? I didn't find a hint mentioning
Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages...
Is it working already? Will it?

Thanks in advance,
Daniel

-- 

Daniel Lang
University of Freiburg, Plant Biotechnology
Schaenzlestr. 1, D-79104 Freiburg
fax: +49 761 203 6945
phone: +49 761 203 6974
homepage:  http://www.plant-biotech.net/
e-mail: daniel.lang at biologie.uni-freiburg.de

#################################################
My software never has bugs.
It just develops random features.
#################################################


From cjfields at uiuc.edu  Thu Oct  5 13:45:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:45:40 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452532C3.9030804@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
	<45252C7D.3050009@sendu.me.uk>
	<A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>
	<452532C3.9030804@sendu.me.uk>
Message-ID: <003DD8C4-6E59-44C2-9A1C-117E036D93BC@uiuc.edu>


On Oct 5, 2006, at 11:28 AM, Sendu Bala wrote:

> I'm not sure how not doing URI-encoding would suddenly make your  
> classes worthless. I find them to be very useful (even when I  
> didn't know there was any URI-encoding, was incorrectly using +s  
> and it happened to work anyway).

That's not my point (and sincerest apologies for the 'badgering'  
bit).  If you made the assumption that all the parameters had to be  
URI-encoded, why couldn't I do something like:

use LWP::Simple;

my %param = ();  # make up your list of parameters here
my $eutil = 'esearch';
my $url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/$eutil.fcgi";
# join the key/value pairs with '=', then join all of those with '&'
my $query = join '&', map { "$_=$param{$_}" } sort keys %param;
# add to the end of the url, then retrieve via LWP::Simple
my $response = get("$url?$query");

It's more user-friendly to set up the parameters so that you wouldn't  
have to encode everything yourself, esp. when the most reliable way  
to encode URI strings is to 'use URI'.
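
To make the encoding behaviour under discussion concrete, here is a small dependency-free sketch. The form_encode helper is hypothetical and hand-rolled for illustration only (it handles plain ASCII terms); it is not how Bio::DB::EUtilities does it, which relies on the URI module as noted above.

```perl
use strict;
use warnings;

# Hypothetical helper: form-encode one query value the way a CGI
# query string expects (reserved characters -> %XX, spaces -> '+').
sub form_encode {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord($1))/ge;
    $value =~ tr/ /+/;
    return $value;
}

# A plain Entrez-style term: spaces become '+'.
my $plain = form_encode('BRCA2 AND Human[ORGANISM]');

# A term that already contains '+': the '+' is escaped as %2B,
# so the server sees a literal plus sign rather than a space.
my $preplus = form_encode('BRCA2+Human[ORGANISM]');

print "$plain\n";
print "$preplus\n";
```

The second case is the one that causes trouble: a term copied from a URL with '+' already in it gets encoded a second time and no longer means what it did in the browser.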

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 14:11:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 13:11:25 -0500
Subject: [Bioperl-l] Eutilities Batch
In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
Message-ID: <4A340977-C6AD-4728-8947-BF5A8A782807@uiuc.edu>


On Oct 5, 2006, at 9:09 AM, Bernd Web wrote:

> Hi,
>
> I am using the new EUtilities. It looks great.
> I was trying to use epost followed by elink but I get an error. The
> same error is actually given with the example on
> http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
> Can't call method "get_databases" on an undefined value at EU.pl  
> line 25.
>
> For completeness, the code is shown below too.
>
> Any suggestions what is going wrong?
>
> Regards,
> Bernd

Grr...that's my error, sorry Bernd.  The POD wasn't updated to match  
the change I made and has a few errors.  The elink object, for  
starters, doesn't fetch the response using get_response().  Also, the  
ElinkData method has changed slightly but accomplishes the same  
thing.  Odd, since I copied and pasted that from working code...

Just a note: these are considered highly experimental at the moment,  
though they should be ready for general use and toying around.  I  
would like any suggestions on methods and so on you may have (Sendu  
has made some very helpful ones off-list which I plan on implementing).

Feel free to let me know if something doesn't work.  Note that,  
because of their experimental nature, you will want to take note of  
any methods changes in particular as I try to solidify the API and  
clean up the POD, so expect some momentary 'outages'.  I plan on  
setting up a remedial interface for all the container objects (like  
ElinkData) which will help clarify things and solidify the API in the  
next few weeks, at least to a point where the class methods have a  
consistent naming scheme.  I plan on using this as a backend web  
agent for a general Entrez interface at some point to get data into  
Bio* objects.

In the meantime, try this:

use Bio::DB::EUtilities;

my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                        -db         => 'pubmed',
                                        -term       => 'hutP',
                                        -usehistory => 'y');

$esearch->get_response; # parse the response, fetch a cookie

my $elink = Bio::DB::EUtilities->new(-eutil        => 'elink',
                                      -db           => 'protein,taxonomy',
                                      -dbfrom       => 'pubmed',
                                      -cookie       => $esearch->next_cookie,
                                      -cmd          => 'neighbor');

$elink->get_response;

# this retrieves the Bio::DB::EUtilities::ElinkData object

my $linkset = $elink->next_linkset;
my @ids;

# step through IDs for each linked database in the ElinkData object

for my $db ($linkset->get_all_linkdbs) {
   @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
   print join q(,), @ids;
   # do something here
}


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From dmessina at wustl.edu  Thu Oct  5 14:07:56 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 5 Oct 2006 13:07:56 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
Message-ID: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>

I'm pleased to announce a revised version of the BioPerl Deobfuscator  
is now available. Many thanks to Mauricio Cuadra for updating  
bioperl.org's installation:

http://bioperl.org/cgi-bin/deob_interface.cgi

I've incorporated many of the suggestions you all sent in after the  
first release, and many of the modules that had non-standard  
documentation have been updated in the meantime, too, so hopefully  
you'll find it much improved. There are still some issues with a few  
modules; please report any problems you see. Also, it's now indexing  
bioperl-live instead of 1.4, which should make it a little more  
useful, too. A complete list of changes is below.

I welcome your bug reports and suggestions for improvements, via  
email, this list, Bugzilla, or the Wiki page.


Thanks,
Dave


Changes

0.0.3  Mon Oct  2 20:01:45 CDT 2006
        FIX: change default $deob_detail_path to be a relative URL instead
             of having localhost hardcoded. Thanks to Jason Stajich for
             pointing this out.
        FIX: Bio::Ontology modules are no longer missing their prefix in
             the class list, and their methods are now shown in the lower
             pane as expected. Thanks to Hilmar Lapp for reporting this bug.
        FIX: can now handle (and ignore) VERSION POD section.
        FIX: missing SYNOPSIS section now handled properly. In fact, the
             SYNOPSIS and DESCRIPTION sections can be in reverse order now,
             although for consistency this is not recommended.
        FIX: Bug #2114: "Obfuscator doesn't show 'Bio:Matrix:Generic'" has
             been fixed. This bug turned out to afflict multiple modules,
             which weren't getting parsed correctly by deob_index.pl.
        NEW: Table cells have been padded out to get rid of that
             "scrunched" look. Thanks to Sendu Bala for this great
             suggestion.
        NEW: If the 'Returns' subsection of a method's documentation
             contains a POD L<> link, the Deobfuscator assumes this to be a
             package name, and wraps it in an href for display. This
             feature is not robust, but seems to work well enough for now.
        NEW: the list of classes is now sorted alphabetically depth-first,
             so that subclasses appear just after their parent class.
             Thanks to Amir Karger for noticing the strange sorting
             behavior.
        NEW: HTML page title now 'BioPerl Deobfuscator' to distinguish it
             from other Deobfuscators out there. Thanks to Amir Karger for
             suggesting this.
        NEW: 'No match' search string now more prominent. Yep, kudos to
             Amir Karger again -- another great idea!
        NEW: Search box caption now explicitly states that only package
             names can be searched. Big ups to Amir Karger for this
             suggestion. The ability to search method names is planned for
             a future version.
        NEW: added -x option to deob_index.pl. This allows the use of an
             'excluded modules' file. This feature was added to resolve an
             issue with four modules which rely on external modules to
             compile. Class::Inspector, used by the Deobfuscator, needs to
             load a module to traverse its inheritance tree, and modules
             must compile before they can be loaded.
     CHANGE: using short name now when traversing with File::Find to help
             identify excluded modules (deob_index.pl).


From lincoln.stein at gmail.com  Thu Oct  5 14:41:08 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:41:08 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
	<DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
Message-ID: <6dce9a0b0610051141x6b61407ar1c0a13cf7616b35f@mail.gmail.com>

The non-numeric comparison bug in Bio::DB::SeqFeature is fixed in the
latest CVS. Do I need to do anything special to get the CVS fixes into
the release candidate?

Lincoln

On 10/2/06, Chris Fields <cjfields at uiuc.edu> wrote:
> > [I won't create a wiki account just to report this.]
> >
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> > not set.  Lots of warnings about missing packages and all, but this
> > looks interesting:
> >
> >    Argument "+" isn't numeric in numeric lt (<) at
> >    Bio/DB/SeqFeature/Segment.pm line 423.
>
> This is verified on Mac OS X.
>
> > Otherwise:
> >
> >    Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,
> > 99.99% okay.
> >
> > The failed test is:
> >
> >    t/ESEfinder..................dubious
> >       Test returned status 255 (wstat 65280, 0xff00)
> >    DIED. FAILED test 15
>
> What do you get when you run that set of tests using
> 'perl -I. -w t/ESEFinder.t'?  The bad status code is odd and could be a
> remote server issue.
>
> Chris
>
>
> >
> > florin
> >
> > --
> > If we wish to count lines of code, we should not regard them as lines
> > produced but as lines spent.                       -- Edsger Dijkstra
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From MEC at stowers-institute.org  Thu Oct  5 15:18:08 2006
From: MEC at stowers-institute.org (Cook, Malcolm)
Date: Thu, 5 Oct 2006 14:18:08 -0500
Subject: [Bioperl-l] using nfreeze instead of freeze in
	Bio::SeqFeature::Store
Message-ID: <CED81D34E37D5043A1211565277A51E5065E9897@exchkc02.stowers-institute.org>


Yes, there is overhead (c.f. perldoc Storable)

    "When writing in network order, all fields are written
    out as standard lengths, which allows full interworking, but takes
    longer to read and write)"

And, I suppose there is also a risk of losing precision in using network
order:

    You can also store data in network order to allow easy sharing across
    multiple platforms, or when storing on a socket known to be remotely
    connected. The routines to call have an initial "n" prefix for
    *network*, as in "nstore" and "nstore_fd". At retrieval time, your data
    will be correctly restored so you don't have to know whether you're
    restoring from native or network ordered data. Double values are stored
    stringified to ensure portability as well, at the slight risk of loosing
    some precision in the last decimals.

So, I agree, it should be a configuration option, perhaps defaulting to
using network order.
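
As a minimal sketch of what such a switch could look like with Storable alone: the serialize helper and its flag are hypothetical names for illustration and are not part of Bio::DB::SeqFeature::Store.

```perl
use strict;
use warnings;
use Storable qw(freeze nfreeze thaw);

# Hypothetical helper, not part of Bio::DB::SeqFeature::Store:
# choose the Storable routine from a -network_order style flag.
sub serialize {
    my ($object, $network_order) = @_;
    return $network_order ? nfreeze($object) : freeze($object);
}

# Stand-in data; a real feature object would go here.
my $feature = { seqid => 'chr1', start => 100, end => 200 };

# thaw() reads either byte order, so no flag is needed on retrieval.
my $from_network = thaw(serialize($feature, 1));
my $from_native  = thaw(serialize($feature, 0));

print "$from_network->{seqid}:$from_network->{start}-$from_network->{end}\n";
```

Since thaw() handles both forms, only the writing side would need to know about the option.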

However, given the factoring of ../Bio/DB/SeqFeature/Store.pm I'm not
sure how to best make it a configuration option since the two provided
serializers don't share a common interface.  Possibly something like:

=head1 Methods for Connecting and Initializing a Database

=head2 new

 Title   : new
 Usage   : $db = Bio::DB::SeqFeature::Store->new(@options)
 Function: connect to a database
 Returns : A descendent of Bio::DB::SeqFeature::Store
 Args    : several - see below
 Status  : public

This class method creates a new database connection. The following
-name=E<gt>$value arguments are accepted:

 Name               Value
 ----               -----

 -adaptor           The name of the Adaptor class (default DBI::mysql)

 -serializer        The name of the serializer class (default Storable)

 -network_order     Strive to 'preserve network order' (if the serializer
                    implements it). Currently, only Storable.pm does, and
                    this will cause it to use nfreeze instead of freeze.
                    (default 1)

 -index_subfeatures Whether or not to make subfeatures searchable
                    (default true)

 -cache             Activate LRU caching feature -- size of cache

 -compress          Compresses features before storing them in database
                    using Compress::Zlib


Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri
  

> -----Original Message-----
> From: Lincoln Stein [mailto:lincoln.stein at gmail.com] 
> Sent: Thursday, October 05, 2006 1:43 PM
> To: Cook, Malcolm
> Cc: lstein at cshl.org; bioperl-l
> Subject: Re: using nfreeze instead of freeze in Bio::SeqFeature::Store
> 
> I think it's fine unless there is a significant performance hit, in
> which case the change should be made into a configuration option. Do
> you know if there is any overhead on doing this?
> 
> Lincoln
> 
> On 10/5/06, Cook, Malcolm <MEC at stowers-institute.org> wrote:
> > Lincoln,
> >
> > I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
> > freeze which should allow SeqFeature objects to survive database
> > freeze/thaw cycles across architectures.
> >
> > I hope I was not presumptuous or in error in doing this....
> >
> > Regards,
> >
> > Malcolm Cook
> > Database Applications Manager - Bioinformatics
> > Stowers Institute for Medical Research - Kansas City, Missouri
> >
> >
> 
> 
> -- 
> Lincoln D. Stein
> Cold Spring Harbor Laboratory
> 1 Bungtown Road
> Cold Spring Harbor, NY 11724
> (516) 367-8380 (voice)
> (516) 367-8389 (fax)
> FOR URGENT MESSAGES & SCHEDULING,
> PLEASE CONTACT MY ASSISTANT,
> SANDRA MICHELSEN, AT michelse at cshl.edu
> 


From lincoln.stein at gmail.com  Thu Oct  5 14:32:40 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:32:40 -0400
Subject: [Bioperl-l] Bio::DB::SeqFeature
In-Reply-To: <45253CE2.1070208@biologie.uni-freiburg.de>
References: <45253CE2.1070208@biologie.uni-freiburg.de>
Message-ID: <6dce9a0b0610051132p7d7fcf84g27578731f9727f3f@mail.gmail.com>

Hi Daniel,

The warnings you are seeing are occurring because
Bio::SeqFeature::Gene::GeneStructure contains a CODE reference. I
think it must be registering a cleanup method via its Bio::Root::Root
ancestor. When Storable serializes the object, it complains that it
can't serialize the CODE reference and instead converts it into the
string "CODE(0xXXXXX)". Then, after you thaw the object,
Bio::Root::Root is complaining that the CODE reference is invalid
because it is a string, not a reference.

Yuck. I think, however, that I can fix this by setting some magic
variables in Storable version 2.05 that will decompile and compile the
CODE references. I will try this and send you a note when the code is
in CVS.
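
For reference, the Storable variables in question are presumably $Storable::Deparse and $Storable::Eval, which Storable documents as available from version 2.05. The sketch below shows them on a toy structure; the helper name roundtrip_with_code and the data are made up for illustration and this is not the fix that went into CVS.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Round-trip a structure containing a CODE reference.
# Assumes Storable >= 2.05 and a sub that does not close over outer
# lexical variables (deparsed source cannot carry those along).
sub roundtrip_with_code {
    my ($data) = @_;
    local $Storable::Deparse = 1;   # serialize CODE refs via B::Deparse
    local $Storable::Eval    = 1;   # re-compile them with eval on thaw
    return thaw(nfreeze($data));
}

my $original = {
    name    => 'gene1',
    cleanup => sub { return 'cleaned ' . $_[0] },
};

my $copy = roundtrip_with_code($original);
print $copy->{cleanup}->($copy->{name}), "\n";
```

Note that $Storable::Eval runs code found in the frozen data, so it should only be enabled when thawing data from a trusted source.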

GBrowse does run off Bio::DB::SeqFeature::Store and is noticeably
faster than the original Bio::DB::GFF adaptor. Nothing really changes
except that you set the db_adaptor option to
Bio::DB::SeqFeature::Store. I haven't tried it using
Bio::SeqFeature::Gene::GeneStructure, so no guarantees, but I am
hopeful that it will work.

Lincoln


On 10/5/06, Daniel Lang <daniel.lang at biologie.uni-freiburg.de> wrote:
> Hi,
>
> we are storing Bio::SeqFeature::Gene::GeneStructure objects (with
> multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db
> (latest bioperl-live checkout).
>
> The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch
> out of a database.
>
> The first observation is that it seems to work (fetched objects behave
> like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we
> get these warnings:
>
> Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into
> lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
> Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into
> lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
>         (in cleanup) Not a CODE reference at
> /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.
> prepare_cached(SELECT f.id,f.object
>   FROM feature as f
>   WHERE (   f.seqid=?
>    AND   f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?))
> )
>
> ) statement handle DBI::st=HASH(0x1c317cf0) still Active at
> /home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm
> line 1422
>         (in cleanup) Not a CODE reference at
> /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.
>
> Is this something serious? Does this mean that the stored object doesn't
> have everything it had before freezing? Or are we using
> Bio::DB::SeqFeature inappropriately?
>
> The other question would be, if we can visualize these stored feature
> objects easily using gbrowse? I didn't find a hint mentioning
> Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages...
> Is it working already? Will it?
>
> Thanks in advance,
> Daniel
>
> --
>
> Daniel Lang
> University of Freiburg, Plant Biotechnology
> Schaenzlestr. 1, D-79104 Freiburg
> fax: +49 761 203 6945
> phone: +49 761 203 6974
> homepage:  http://www.plant-biotech.net/
> e-mail: daniel.lang at biologie.uni-freiburg.de
>
> #################################################
> My software never has bugs.
> It just develops random features.
> #################################################
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From hlapp at gmx.net  Thu Oct  5 16:34:49 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 16:34:49 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4525314C.7020205@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
Message-ID: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>


On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote:

> If you think I'm stupid, fine, but I'm probably not the only stupid
> person on the planet.

That's a great suggestion that I hope we can all agree on? I'll  
happily count myself among the stupid ones too so you're not alone,  
and stupid people and even more so those who are lucky enough not to  
be stupid have an obligation to document stuff so that even the  
stupid can understand, no matter how silly the documentation might get.

Is that agreeable without causing yet more progressive hair loss?

Actually - I'm having second thoughts. Isn't it a distinguishing  
feature of stupid people that - among other things - they are stupid  
enough to believe they don't need to read documentation? You admitted  
publicly that you read documentation - are you just faking the stupid?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Oct  5 17:11:06 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:11:06 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
Message-ID: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>


On Oct 5, 2006, at 3:34 PM, Hilmar Lapp wrote:

>
> On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote:
>
>> If you think I'm stupid, fine, but I'm probably not the only stupid
>> person on the planet.
>
> That's a great suggestion that I hope we can all agree on? I'll  
> happily count myself among the stupid ones too so you're not alone,  
> and stupid people and even more so those who are lucky enough not  
> to be stupid have an obligation to document stuff so that even the  
> stupid can understand, no matter how silly the documentation might  
> get.
>
> Is that agreeable without causing yet more progressive hair loss?
>
> Actually - I'm having second thoughts. Isn't it a distinguishing  
> feature of stupid people that - among other things - they are  
> stupid enough to believe they don't need to read documentation? You  
> admitted publicly that you read documentation - are you just faking  
> the stupid?
>
> 	-hilmar

If lack of good documentation == stupid, I know of a few other  
modules in trouble besides mine.  Based on that we're in for a whole  
lot of stupid!  And I feel stupid for my earlier remarks, Sendu, so  
apologies.

And Hilmar, you're too late on the hair loss, at least on my end.

I have corrected the EUtilities POD to reflect that all text input  
needs to be raw as URI encoding is done in the module, which should  
work (I think).  I plan on committing it tonight.  It also indicates  
that EUtilities search queries need to be made as if they are regular  
Entrez queries.  Would that be sufficient?
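
To show why the input should be raw, here is a small sketch of what happens when a term is percent-encoded once versus twice. The percent_encode routine is a hand-rolled stand-in written for this example; it is not the encoding code used inside Bio::DB::EUtilities.

```perl
use strict;
use warnings;

# Minimal percent-encoder for illustration only; a real module would
# more likely rely on a URI library.
sub percent_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

my $raw   = 'hutP AND xyz';
my $once  = percent_encode($raw);     # raw term, encoded a single time
my $twice = percent_encode($once);    # caller pre-encoded, then encoded again

print "$once\n";     # hutP%20AND%20xyz
print "$twice\n";    # hutP%2520AND%2520xyz
```

The second line is what a server would receive if a caller encoded the term before handing it to a module that encodes it again.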

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From pmiguel at purdue.edu  Thu Oct  5 16:42:00 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Thu, 05 Oct 2006 16:42:00 -0400
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
Message-ID: <45256E18.3080103@purdue.edu>

David Messina wrote:
> I'm pleased to announce a revised version of the BioPerl Deobfuscator  
> is now available. Many thanks to Mauricio Cuadra for updating  
> bioperl.org's installation:
>
> http://bioperl.org/cgi-bin/deob_interface.cgi
>
> I've incorporated many of the suggestions you all sent in after the  
> first release, and many of the modules that had non-standard  
> documentation have been updated in the meantime, too, so hopefully  
> you'll find it much improved. There are still some issues with a few  
> modules; please report any problems you see. Also, it's now indexing  
> bioperl-live instead of 1.4, which should make it a little more  
> useful, too. A complete list of changes is below.
>
> I welcome your bug reports and suggestions for improvements, via  
> email, this list, Bugzilla, or the Wiki page.
>
>
> Thanks,
> Dave
>
>   
Here are some comments:
Would be good to have the column headings for the methods table in the 
fixed part of the page, rather than the scroll box. That way you could 
always see the column headings from anywhere in the list.

Second, I've noticed that there are a fair number of methods that have 
"not documented" for "Returns" and "Usage". But in every case I've 
checked both of these were documented. For example, consider methods for 
Bio::Seq::SeqWithQuality. The method "accession_number" is listed as 
"not documented". But if you click on Bio::Seq::SeqWithQuality link to 
the documentation, usage is defined as: "$unique_biological_key = 
$obj->accession_number;" and returns is defined as "A string".

Finally, it would be good to have the version of bioperl being 
deobfuscated on the deob_interface.cgi page. Just as a quick 
sanity-checking measure. After poking around a bit I found on the wiki
that bioperl-live is being indexed. But, I can tell, it is just the
sort of thing I'm going to forget and look for every time I come back
to the page after a few months...

Overall very nice, though. Just what is needed when I'm trying to 
remember "which was the method that returns subseq string and which one 
returns an object?"


Phillip SanMiguel
Purdue University


From bix at sendu.me.uk  Thu Oct  5 17:24:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 22:24:34 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
	<2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
Message-ID: <45257812.5050008@sendu.me.uk>

Chris Fields wrote:
> 
> I have corrected the EUtilities POD to reflect that all text input needs 
> to be raw as URI encoding is done in the module, which should work (I 
> think).  I plan on committing it tonight.  It also indicates that 
> EUtilities search queries need to be made as if they are regular Entrez 
> queries.  Would that be sufficient?

You may not even need to mention anything about URI encoding, which 
might frighten some people. Something as simple as:

=head1 SYNOPSIS

use Bio::DB::EUtilities;

   my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                          -db         => 'pubmed',
                                          -term       => 'hutP AND xyz',
...

and/or some POD for the new() method:

=head2 new

  Title   : new
...
  Args    : -eutil => ...
            -db    => ...
            -term  => string, an entrez-style query

=cut

would get the point across, I think.

BTW, can the term string be supplied anywhere else other than new()? It 
doesn't matter at all if it can't, I'm just idly wondering if I missed 
anything.

From dmessina at wustl.edu  Thu Oct  5 17:42:49 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 5 Oct 2006 16:42:49 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45256E18.3080103@purdue.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
	<45256E18.3080103@purdue.edu>
Message-ID: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu>

Thanks so much, Phillip, for taking the time to check out the new  
version and send your comments. I really appreciate it! I've added  
them to the wiki page so I can track them.

Best,
Dave


From cjfields at uiuc.edu  Thu Oct  5 17:50:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:50:11 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45257812.5050008@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
	<2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
	<45257812.5050008@sendu.me.uk>
Message-ID: <A0B37F41-7C33-49F6-A039-A35AB5696947@uiuc.edu>

Sendu,

I have the parameters all set up as get/sets at this point, but I'm  
open to suggestions on that.  Note in the BEGIN block the heredoc eval 
{} block.  Yes, nasty I know, but I hate AUTOLOAD.  It works as a  
quick way of getting parameter get/sets up-and-running.  I plan on  
making those explicit get/sets as soon as I can then sorting out  
particular ones to the various eutil modules where they are primarily  
used.

Long story short, every parameter is a get/set at this time
(including term()).  The common ones needed for most EUtilities are
initialized in the parent EUtilities::_initialize(), and
eutil-specific parameters are initialized in the individual eutil
plugins.  Each eutil plugin only sets whatever parameters may be
needed for operation (though you could circumvent that, since all of
them are inherited via EUtilities).

We could always simplify it to accept simple key-value pairs, but
get/sets (at least to me) allow more flexibility as long as you remember
which parameters are set and to what.
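
For anyone following along, the heredoc-eval approach to get/sets described above might look roughly like the sketch below. The package name My::EUtilSketch, the parameter list, and the constructor are all hypothetical; this is not the actual EUtilities code.

```perl
use strict;
use warnings;

package My::EUtilSketch;    # hypothetical package, for illustration only

BEGIN {
    # Parameter names chosen for the example.
    my @params = qw(term db retmax);
    for my $method (@params) {
        # Build one get/set per parameter from a heredoc and compile it.
        eval <<"END_ACCESSOR";
sub $method {
    my \$self = shift;
    \$self->{'_$method'} = shift if \@_;
    return \$self->{'_$method'};
}
END_ACCESSOR
        die $@ if $@;
    }
}

# Constructor that routes -name => value pairs through the get/sets.
sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    for my $key (keys %args) {
        (my $method = $key) =~ s/^-//;
        $self->$method($args{$key});
    }
    return $self;
}

package main;

my $esearch = My::EUtilSketch->new(-db => 'pubmed', -term => 'hutP AND xyz');
$esearch->retmax(20);       # parameters can also be set after new()
print join(' | ', $esearch->db, $esearch->term, $esearch->retmax), "\n";
```

Since each parameter is an ordinary method, the term can be supplied either to new() or later through term(), which is the flexibility mentioned above.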

Chris

On Oct 5, 2006, at 4:24 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> I have corrected the EUtilities POD to reflect that all text input  
>> needs to be raw as URI encoding is done in the module, which  
>> should work (I think).  I plan on committing it tonight.  It also  
>> indicates that EUtilities search queries need to be made as if  
>> they are regular Entrez queries.  Would that be sufficient?
>
> You may not even need to mention anything about URI encoding, which  
> might frighten some people. Something as simple as:
>
> =head1 SYNOPSIS
>
> use Bio::DB::EUtilities;
>
>   my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                          -db         => 'pubmed',
>                                          -term       => 'hutP AND xyz',
> ...
>
> and/or some POD for the new() method:
>
> =head2 new
>
>  Title   : new
> ...
>  Args    : -eutil => ...
>            -db    => ...
>            -term  => string, an entrez-style query
>
> =cut
>
> would get the point across, I think.
>
> BTW, can the term string be supplied anywhere else other than
> new()? It doesn't matter at all if it can't, I'm just idly wondering
> if I missed anything.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 17:51:06 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:51:06 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45257812.5050008@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
	<2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
	<45257812.5050008@sendu.me.uk>
Message-ID: <5B2E844F-7B8B-4F69-9005-138826B835FB@uiuc.edu>

> You may not even need to mention anything about URI encoding, which
> might frighten some people. Something as simple as:
>
> =head1 SYNOPSIS
>
> use Bio::DB::EUtilities;
>
>    my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                           -db         => 'pubmed',
>                                           -term       => 'hutP AND xyz',
> ...
>
> and/or some POD for the new() method:
>
> =head2 new
>
>   Title   : new
> ...
>   Args    : -eutil => ...
>             -db    => ...
>             -term  => string, an entrez-style query
>
> =cut
>
> would get the point across, I think.

Oops, forgot.  I'll add this in and update new() when I can.  Thanks!

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From arareko at campus.iztacala.unam.mx  Thu Oct  5 18:12:49 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 05 Oct 2006 17:12:49 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45256E18.3080103@purdue.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
	<45256E18.3080103@purdue.edu>
Message-ID: <45258361.8080803@campus.iztacala.unam.mx>

Phillip San Miguel wrote:
> Finally, it would be good to have the version of bioperl being 
> deobfuscated on the deob_interface.cgi page. Just as a quick 
> sanity-checking measure. After poking around a bit I found that 
> bioperl-live is being indexed in the wiki. But, I can tell, it is just 
> the sort of thing I'm going to forget and look for every time I come back 
> to the page after a few months...

Dave,

I think this value can be stored in one of the index files and passed as 
an argument to the deob_index.pl script. What do you think?

Mauricio.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From lincoln.stein at gmail.com  Thu Oct  5 14:42:41 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:42:41 -0400
Subject: [Bioperl-l] using nfreeze instead of freeze in
	Bio::SeqFeature::Store
In-Reply-To: <CED81D34E37D5043A1211565277A51E5065E9879@exchkc02.stowers-institute.org>
References: <CED81D34E37D5043A1211565277A51E5065E9879@exchkc02.stowers-institute.org>
Message-ID: <6dce9a0b0610051142h56479843ofc5429d959cb6e3@mail.gmail.com>

I think it's fine unless there is a significant performance hit, in
which case the change should be made into a configuration option. Do
you know if there is any overhead on doing this?

Lincoln

On 10/5/06, Cook, Malcolm <MEC at stowers-institute.org> wrote:
> Lincoln,
>
> I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
> freeze which should allow SeqFeature objects to survive database
> freeze/thaw cycles across architectures.
>
> I hope I was not presumptuous or in error in doing this....
>
> Regards,
>
> Malcolm Cook
> Database Applications Manager - Bioinformatics
> Stowers Institute for Medical Research - Kansas City, Missouri
>
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From torsten.seemann at infotech.monash.edu.au  Fri Oct  6 01:26:10 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Fri, 06 Oct 2006 15:26:10 +1000
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
References: <452344D4.8070908@infotech.monash.edu.au>
	<22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
Message-ID: <4525E8F2.1000704@infotech.monash.edu.au>

Hilmar,

> I don't think there's a need to deprecate - if the methods just plain  
> delegate to whatever File:: module is appropriate their  
> implementation (supposedly) will become very simple and hence won't  
> pose a maintenance burden anymore.

>> I have an uncommitted simplified version of Bio::Root::IO which does
>> this, and "all tests pass". The functions currently (silently)
>> dispatch directly to their native counterparts.
>>
>> The only tricky function is tempfile() which is *mostly* like
>> File::Temp::tempfile(), but does some voodoo of converting
>> (TEMPLATE=>'xxx') to the non-hash first parameter of the File::
>> version, so I'm hesitant to commit. It may do other magic - Hilmar?
> 
> Not that I would know of. If the tests pass (without having to change  
> them!) I'd give it a try.

Tempfile.t had two tests that failed. It seems that Bio::Root::IO had 
some magic whereby it would keep a list of all tempfilenames created 
with UNLINK != 0 and when the Bio::Root::IO object was destroyed (eg. 
undef $obj) it would MANUALLY unlink each of them. This would occur 
before File::Temp got to unlink them. Not sure why it was written like 
this (as File::Temp will delete them at the end of the script anyway) 
but maybe it was legacy for when File::Temp::tempfile WASN'T available.
Anyway, I've kept backward compatibility there, although I think 
eventually it should be removed and Tempfile.t adjusted.
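
To make the TEMPLATE conversion mentioned in the quoted text concrete, a thin wrapper over File::Temp::tempfile() could look something like this sketch. The name tempfile_compat and the option handling are assumptions for the example and do not reproduce the Bio::Root::IO code.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp ();

# Hypothetical wrapper: accept TEMPLATE as a named option and pass it
# to File::Temp::tempfile() as the positional first argument.
sub tempfile_compat {
    my (%opts) = @_;
    my $template = delete $opts{TEMPLATE};
    return defined $template
        ? File::Temp::tempfile($template, %opts)
        : File::Temp::tempfile(%opts);
}

my ($fh, $filename) = tempfile_compat(
    TEMPLATE => 'bioperl_sketch_XXXXXX',   # needs at least four trailing X's
    DIR      => File::Spec->tmpdir,
    UNLINK   => 1,                         # File::Temp removes it at exit
);
print {$fh} "scratch data\n";
close $fh;
print "wrote $filename\n";
```

With UNLINK set, File::Temp handles removal at program exit, which is the behaviour the manual unlink list in Bio::Root::IO pre-empts.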

Although all tests pass with my new trim Bio/Root/IO.pm I am still 
concerned about committing as the assumption is that the BioPerl test 
suite is good enough to handle such a change to an important module, but 
the reality may be different :-)

Let me know if you think I should commit anyway,

Your advice is appreciated.

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From dmessina at wustl.edu  Fri Oct  6 01:25:56 2006
From: dmessina at wustl.edu (David Messina)
Date: Fri, 6 Oct 2006 00:25:56 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45258361.8080803@campus.iztacala.unam.mx>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
	<45256E18.3080103@purdue.edu>
	<45258361.8080803@campus.iztacala.unam.mx>
Message-ID: <CC2947DA-2F65-4318-B85A-99A8DC7835B0@wustl.edu>


On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote:
> I think this value can be stored in one of the index files and  
> passed as an argument to the deob_index.pl script. What do you think?

Yep, I think that works nicely. I added this feature and committed it  
to CVS. Here's what the new header looks like if you do deob_index.pl  
-s "bioperl-live":

[attached image: deob_header.jpg]

Thanks for the suggestions, guys.

Dave

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061006/1c5819f9/attachment-0001.html 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deob_header.jpg
Type: image/jpeg
Size: 25739 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061006/1c5819f9/attachment-0001.jpg 

From deep_ans at yahoo.com  Fri Oct  6 09:22:49 2006
From: deep_ans at yahoo.com (deepak shingan)
Date: Fri, 6 Oct 2006 06:22:49 -0700 (PDT)
Subject: [Bioperl-l] Sort blast file result according to evalues
Message-ID: <20061006132249.49450.qmail@web51711.mail.yahoo.com>

Hi,
  Is there any way to sort the hits in a BLAST report by E-value? I want
  the output sorted by hit E-value. I am using SearchIO and have already
  tried sorting the hits by bits and gaps, but I am not able to sort
  them by E-value, since E-values are mainly associated with HSPs and
  each hit may have multiple HSPs.

  Waiting for help.
   
  Thanks,
  Dun Dansi
   
   

From hlapp at gmx.net  Fri Oct  6 10:03:04 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 6 Oct 2006 10:03:04 -0400
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <4525E8F2.1000704@infotech.monash.edu.au>
References: <452344D4.8070908@infotech.monash.edu.au>
	<22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
	<4525E8F2.1000704@infotech.monash.edu.au>
Message-ID: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net>

This is a 1.5, i.e. developers release that's in the works, and also  
you'd be doing this on the main trunk. If you get the tests to pass  
there's no reason to hold back.

You may be right and in reality it has repercussions somewhere, but  
those will be the opportunities to improve our test suite.

	-hilmar

On Oct 6, 2006, at 1:26 AM, Torsten Seemann wrote:

> Although all tests pass with my new trim Bio/Root/IO.pm I am still  
> concerned about committing as the assumption is that the BioPerl  
> test suite is good enough to handle such a change to an important  
> module, but the reality may be different :-)
>
> Let me know if you think I should commit anyway,
>
> Your advice is appreciated.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Oct  6 10:58:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 6 Oct 2006 09:58:09 -0500
Subject: [Bioperl-l] Sort blast file result according to evalues
In-Reply-To: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
References: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
Message-ID: <F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>

The evalue for the hit is retrieved by the BlastHit::significance()  
method, if I remember correctly.  So if $hit is a  
Bio::Search::Hit::BlastHit object, you use $hit->significance.  If  
you want individual HSP evalues, you would use $hsp->evalue for the  
individual HSP objects.

The output is normally sorted by the order they appear in the  
alignments and table, which is typically by increasing evalue or  
decreasing bits (score).  So they are already sorted.  If you wanted
to run a sort yourself you could use a sort block such as
'sort { $a->significance() <=> $b->significance() } @hits', but as
pointed out on the wiki it may be safer to run a Schwartzian transform
instead:

http://www.bioperl.org/wiki/Bioperl_Best_Practices#Sorting
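
A minimal sketch of the Schwartzian transform for this case follows. So that it runs without a BLAST report, it uses a stand-in class, Sketch::Hit, which is entirely hypothetical; only the significance() method name comes from the text above. With real data, @hits would hold the hit objects collected from a SearchIO result.

```perl
use strict;
use warnings;

# Stand-in for a hit object; only significance() mirrors the method
# named above. Everything in Sketch::Hit is hypothetical.
package Sketch::Hit;
sub new          { my ($class, %args) = @_; return bless {%args}, $class }
sub name         { return $_[0]{name} }
sub significance { return $_[0]{significance} }

package main;

# Schwartzian transform: call significance() once per hit, sort on the
# cached value, then unwrap the hits again.
sub sort_hits_by_evalue {
    my @hits = @_;
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] }
           map  { [ $_->significance, $_ ] } @hits;
}

my @hits = (
    Sketch::Hit->new(name => 'hitA', significance => '0.002'),
    Sketch::Hit->new(name => 'hitB', significance => '3e-50'),
    Sketch::Hit->new(name => 'hitC', significance => '1e-10'),
);

print join(' ', map { $_->name } sort_hits_by_evalue(@hits)), "\n";
```

The transform calls significance() once per hit rather than on every comparison, which matters mostly for long hit lists.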

Chris

On Oct 6, 2006, at 8:22 AM, deepak shingan wrote:

> Hi ,
>   Is  there any way to parse the blast file according to evalue for  
> each hit. I want the output sorted according to hit evalue. I am  
> using SearchIO algorithm and already tried sorting the hits  
> according to bits, gaps, but I am not able to sort the hits by evalue.
>   As evalues are mainly associated with hsp and each hit may have  
> multiple hsps.
>
>   waiting for help.
>
>   Thanks,
>   Dun Dansi
>
>
>
>
>  		

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Oct  6 11:03:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 6 Oct 2006 10:03:45 -0500
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net>
References: <452344D4.8070908@infotech.monash.edu.au>
	<22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
	<4525E8F2.1000704@infotech.monash.edu.au>
	<074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net>
Message-ID: <265AD609-F74E-4545-B3DD-FF94290BE0B4@uiuc.edu>

On Oct 6, 2006, at 9:03 AM, Hilmar Lapp wrote:

> This is a 1.5, i.e. developers release that's in the works, and also
> you'd be doing this on the main trunk. If you get the tests to pass
> there's no reason to hold back.
>
> You may be right and in reality it has repercussions somewhere, but
> those will be the opportunities to improve our test suite.
>
> 	-hilmar

Agreed, though I think Sendu only wants bug fixes for 1.5.2.  You  
could always commit to CVS HEAD and it could be in 1.5.3.

Let me rethink that.  There were some subtle tempfile/tempdir issues
popping up on WinXP where some tempfiles were not being deleted
because of permissions issues; I had planned on adding that to
Bugzilla today or tomorrow.  Maybe changing to File::Temp would fix
that, so in essence it would be a bug fix!

I'll go ahead and post the bug.

Chris

>> Although all tests pass with my new trim Bio/Root/IO.pm I am still
>> concerned about committing as the assumption is that the BioPerl
>> test suite is good enough to handle such a change to an important
>> module, but the reality may be different :-)
>>
>> Let me know if you think I should commit anyway,
>>
>> Your advice is appreciated.
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From pmiguel at purdue.edu  Fri Oct  6 11:06:56 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Fri, 06 Oct 2006 11:06:56 -0400
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>	<45256E18.3080103@purdue.edu>
	<5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu>
Message-ID: <45267110.7030905@purdue.edu>

David Messina wrote:
> Thanks so much, Phillip, for taking the time to check out the new  
> version and send your comments. I really appreciate it! I've added  
> them to the wiki page so I can track them.
>
> Best,
> Dave
>   
Dave,
    No problem.
    I've just added a "keyword" to search BioPerl Deobfuscator to my 
Firefox browser. That way I can just type "deob qual" in my URL bar in 
firefox and the browser jumps directly to BioPerl Deobfuscator (like a 
bookmark) but it pre-submits the search item "qual".
    I heard about the Firefox "keywords" in a TWiT/FLOSS episode on 
mozilla. You just go to any search page and right-click in the search 
box of interest and one of the choices is "Add a Keyword for this 
Search". Then you just have to fill out "Name" and "Keyword" fields and 
drop the keyword into whatever folder you like. The "Keyword" then 
becomes the word to invoke that search with parameters that follow it 
when it is typed into the URL bar.
Phillip

From arareko at campus.iztacala.unam.mx  Fri Oct  6 11:18:02 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Fri, 06 Oct 2006 10:18:02 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <CC2947DA-2F65-4318-B85A-99A8DC7835B0@wustl.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>	<45256E18.3080103@purdue.edu>	<45258361.8080803@campus.iztacala.unam.mx>
	<CC2947DA-2F65-4318-B85A-99A8DC7835B0@wustl.edu>
Message-ID: <452673AA.7070305@campus.iztacala.unam.mx>

Looks great! I'll update it during the weekend.

Mauricio.

David Messina wrote:
> 
> On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote:
>> I think this value can be stored in one of the index files and passed 
>> as an argument to the deob_index.pl script. What do you think?
> 
> Yep, I think that works nicely. I added this feature and committed it to 
> CVS. Here's what the new header looks like if you do deob_index.pl -s 
> "bioperl-live":
> 
> 
> Thanks for the suggestions, guys.
> 
> Dave
> 
> 
> ------------------------------------------------------------------------
> 

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From bix at sendu.me.uk  Fri Oct  6 11:27:14 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 06 Oct 2006 16:27:14 +0100
Subject: [Bioperl-l] Sort blast file result according to evalues
In-Reply-To: <F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>
References: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
	<F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>
Message-ID: <452675D2.9090803@sendu.me.uk>

Chris Fields wrote:
> The evalue for the hit is retrieved by the BlastHit::significance()  
> method, if I remember correctly.  So if $hit is a  
> Bio::Search::Hit::BlastHit object, you use $hit->significance.  If  
> you want individual HSP evalues, you would use $hsp->evalue for the  
> individual HSP objects.
> 
> The output is normally sorted by the order they appear in the  
> alignments and table, which is typically by increasing evalue or  
> decreasing bits (score).  So they are already sorted.

Concur.


> If you wanted to run a sort yourself you could use a sort block using
> '{$a->significance() <=> $b->significance()} @hits'

Actually, it is best to use the sort_hits() method of the result object 
prior to asking for any hits. (As this allows for potential optimization 
in the parser.)

->significance is still the thing you need to sort on though.
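
A minimal sketch of the sort block discussed above. It uses a stand-in
class (My::Hit, a hypothetical name used only for illustration) so that it
runs without BioPerl installed; with real data the objects would be
Bio::Search::Hit::BlastHit instances, and sort_hits() on the result object
remains the preferred route.

```perl
use strict;
use warnings;

{
    # Stand-in for Bio::Search::Hit::BlastHit (hypothetical, illustration only).
    package My::Hit;
    sub new          { my ($class, %args) = @_; return bless {%args}, $class }
    sub name         { return $_[0]{name} }
    sub significance { return $_[0]{significance} }
}

my @hits = (
    My::Hit->new(name => 'hitA', significance => 1e-5),
    My::Hit->new(name => 'hitB', significance => 3e-40),
    My::Hit->new(name => 'hitC', significance => 0.002),
);

# Ascending e-value, so the most significant hit comes first.
my @sorted = sort { $a->significance <=> $b->significance } @hits;

print join(', ', map { $_->name } @sorted), "\n";
```

When passing a comparator to sort_hits() instead, check that method's
documentation for how $a and $b are scoped inside the code reference.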

From cjfields at uiuc.edu  Fri Oct  6 11:52:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 6 Oct 2006 10:52:57 -0500
Subject: [Bioperl-l] Sort blast file result according to evalues
In-Reply-To: <452675D2.9090803@sendu.me.uk>
References: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
	<F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>
	<452675D2.9090803@sendu.me.uk>
Message-ID: <31A6FC3A-8BEB-42B8-B51D-66E659EF7495@uiuc.edu>


On Oct 6, 2006, at 10:27 AM, Sendu Bala wrote:

>> If you wanted to run a sort yourself you could use a sort block using
>> '{$a->significance() <=> $b->significance()} @hits'
>
> Actually, it is best to use the sort_hits() method of the result  
> object
> prior to asking for any hits. (As this allows for potential  
> optimization
> in the parser.)

Ah, forgot about that one!

Chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Fri Oct  6 14:36:49 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 6 Oct 2006 11:36:49 -0700
Subject: [Bioperl-l] tempfile cleanup
In-Reply-To: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu>
References: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu>
Message-ID: <0FCEC6B2-E190-4800-AAB1-89559C552FA6@bioperl.org>

I think the magic trickery in there for cleanup exists because File::Temp  
only cleans up tempfiles when Perl exits, not when the Root::IO object  
goes out of scope -- so this can be a problem for people running CGI  
scripts that stay resident in memory and never have their tempfiles  
cleaned up.  Managing the list ourselves allows us to call _cleanup  
periodically (perhaps before the start of every Blast run) to ensure  
that tempfiles are removed.  Perhaps newer File::Temp versions can  
solve this better now, but I believe that was the behavior we were  
trying to deal with by having the Root::IO object manage the list of  
to-be-deleted files.

This is some hackery that also had to do with not expecting  
File::Temp to be installed I believe.
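
A rough sketch of the list-keeping idea described above, assuming only the
core File::Temp module. The names new_tempfile and cleanup_tempfiles are
hypothetical and this is not the actual Bio::Root::IO code; it only shows
how a long-running process can remove its tempfiles on demand instead of
waiting for Perl to exit.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical registry of tempfiles created by this process.
my @tempfiles;

sub new_tempfile {
    # UNLINK => 0: we take responsibility for removing the file ourselves.
    my ($fh, $filename) = tempfile('demoXXXXXX', TMPDIR => 1, UNLINK => 0);
    push @tempfiles, $filename;
    return ($fh, $filename);
}

sub cleanup_tempfiles {
    my $removed = 0;
    while (defined(my $file = shift @tempfiles)) {
        $removed++ if -e $file && unlink $file;
    }
    return $removed;
}

my ($fh, $name) = new_tempfile();
print {$fh} "scratch data\n";
close $fh;    # close before unlinking; this matters on Windows

my $removed = cleanup_tempfiles();
print "removed $removed tempfile(s)\n";
```

Calling cleanup_tempfiles() between units of work (for example before each
run) keeps a resident process from accumulating files.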

-jason


From torsten.seemann at infotech.monash.edu.au  Mon Oct  9 00:52:29 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Mon, 09 Oct 2006 14:52:29 +1000
Subject: [Bioperl-l] Multiple packages in the one .pm file
Message-ID: <4529D58D.1080004@infotech.monash.edu.au>

Hi all,

The following modules have more than one "package xxxx;" declaration in 
them. For small, internal classes I guess this is fine, but for others,
they should be split up into the filesystem - otherwise they are 
troublesome to locate and the online documentation doesn't list them!

eg.
bioperl-run/Bio/Tools/Run/Analysis/Job.pm
is in
bioperl-run/Bio/Tools/Run/Analysis.pm

Here are the culprits:

% for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | 
sed 's/:.*$//' | sort | uniq -d ; done

bioperl-live/Bio/AnalysisI.pm
bioperl-live/Bio/DB/Fasta.pm
bioperl-live/Bio/DB/GFF.pm
bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
bioperl-live/Bio/SeqIO/interpro.pm

bioperl-run/Bio/Tools/Run/Analysis.pm
bioperl-run/Bio/Tools/Run/Analysis/soap.pm
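
For anyone without grep/sed/uniq to hand, the same check can be sketched in
Perl with the core File::Find module. The sub name multi_package_files is
hypothetical, and the demo builds two throwaway .pm files in a temporary
directory rather than scanning a real checkout.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Return the .pm files under $dir that contain more than one line
# starting with "package " (same test as the grep above).
sub multi_package_files {
    my ($dir) = @_;
    my @found;
    find(sub {
        return unless -f $_ && /\.pm$/;
        open my $in, '<', $_ or return;
        my $count = grep { /^package / } <$in>;
        close $in;
        push @found, $File::Find::name if $count > 1;
    }, $dir);
    my @sorted = sort @found;
    return @sorted;
}

# Demo data: one single-package file and one file with two packages.
my $dir  = tempdir(CLEANUP => 1);
my %demo = (
    'One.pm' => "package One;\n1;\n",
    'Two.pm' => "package Two;\npackage Two::Iterator;\n1;\n",
);
for my $file (keys %demo) {
    open my $out, '>', "$dir/$file" or die "Cannot write $file: $!";
    print {$out} $demo{$file};
    close $out;
}

my @multi = multi_package_files($dir);
print "$_\n" for @multi;
```

Pointing multi_package_files() at bioperl-live/Bio or bioperl-run/Bio should
give a list comparable to the one above.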

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From pmiguel at purdue.edu  Mon Oct  9 15:57:12 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Mon, 09 Oct 2006 15:57:12 -0400
Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC?
Message-ID: <452AA998.5010104@purdue.edu>

I found a bug in Bio::SeqIO::phd and am wondering if the fix will 
propagate into the next release candidate?

The bug is here:

http://bugzilla.open-bio.org/show_bug.cgi?id=2120

I also created a patch that fixes it (on my machine, anyway).  It is a 
fairly minor change, so it seems like it would be worth propagating it 
into the next release candidate.

-- 
Phillip SanMiguel

From bix at sendu.me.uk  Mon Oct  9 16:57:28 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 09 Oct 2006 21:57:28 +0100
Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC?
In-Reply-To: <452AA998.5010104@purdue.edu>
References: <452AA998.5010104@purdue.edu>
Message-ID: <452AB7B8.4040404@sendu.me.uk>

Phillip San Miguel wrote:
> I found a bug in Bio::SeqIO::phd and am wondering if the fix will 
> propagate into the next release candidate?
> 
> The bug is here:
> 
> http://bugzilla.open-bio.org/show_bug.cgi?id=2120
> 
> I also created a patch that fixes it (on my machine, anyway).  It is a 
> fairly minor change, so it seems like it would be worth propagating it 
> into the next release candidate.

If it gets committed to HEAD before I make the next candidate, then yes.
I'll do that if no one beats me to it (and if someone does, please add a 
new test for this).

BTW Phillip, thank you for the bug report but in future use the 
attachment capabilities for files, please don't paste them into the 
comments box.

From bix at sendu.me.uk  Mon Oct  9 17:01:56 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 09 Oct 2006 22:01:56 +0100
Subject: [Bioperl-l] Analysis soap problem
Message-ID: <452AB8C4.1010704@sendu.me.uk>

I thought I'd 'advertise' this bug on the list so more people see it:
http://bugzilla.open-bio.org/show_bug.cgi?id=2117

I don't want to make the next 1.5.2 release candidate until it's fixed. 
Does anyone have any idea about it? Even if you can't fix it, just 
explaining what's supposed to be going on would help a lot.

Thank you,
Sendu.

From Kevin.M.Brown at asu.edu  Mon Oct  9 18:40:54 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 9 Oct 2006 15:40:54 -0700
Subject: [Bioperl-l] Analysis soap problem
Message-ID: <1A4207F8295607498283FE9E93B775B40219690B@EX02.asurite.ad.asu.edu>

If I had to guess from looking at the snippet provided, the variable
$seq holds no data, so when you try to set up the regex /^$seq$/ you end
up with /^$/ (which matches only a blank line) and the warning.
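
A small sketch of that guess, independent of the actual test script: the
variable names are made up for illustration, and the empty string simply
stands in for a result that came back with no data.

```perl
use strict;
use warnings;

my $seq = '';    # stands in for a result that came back empty

# With an empty $seq the interpolated pattern is effectively /^$/,
# so it matches the empty string rather than any real sequence.
my $matches_empty = ('' =~ /^$seq$/)     ? 1 : 0;
my $matches_data  = ('ACGT' =~ /^$seq$/) ? 1 : 0;

# Checking the length first makes the failure mode explicit, and
# \Q...\E protects any regex metacharacters in real data.
my $guarded = (length($seq) && 'ACGT' =~ /^\Q$seq\E$/) ? 1 : 0;

print "empty: $matches_empty, data: $matches_data, guarded: $guarded\n";
```

If $seq were undef rather than empty, Perl would also emit an
"uninitialized value" warning when compiling the pattern.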

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Monday, October 09, 2006 2:02 PM
> To: bioperl-l List
> Subject: [Bioperl-l] Analysis soap problem
> 
> I thought I'd 'advertise' this bug on the list so more people see it:
> http://bugzilla.open-bio.org/show_bug.cgi?id=2117
> 
> I don't want to make the next 1.5.2 release candidate until 
> its fixed. 
> Does anyone have any idea about it? Even if you can't fix it, just 
> explaining what's (supposed) to be going on would help a lot.
> 
> Thank you,
> Sendu.


From cjfields at uiuc.edu  Mon Oct  9 22:34:23 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 9 Oct 2006 21:34:23 -0500
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <452AB8C4.1010704@sendu.me.uk>
References: <452AB8C4.1010704@sendu.me.uk>
Message-ID: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>

I have 'fixed' this in CVS.  Note the quotes; it depends on what you  
might consider fixed.  Multiple calls to results() were returning  
empty hash refs, so no data was being returned.   For now, I stored  
the hash reference in a variable then tested each one.  All tests now  
pass, including the 'outseq' one.

Maybe it's just me, but shouldn't results() either consistently  
return the same information, or contain documentation that it doesn't  
do so?  Anyway, I have left the bugzilla report open for now.
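
To illustrate the symptom and the workaround, here is a sketch with a
stand-in class (My::Job, hypothetical). It only mimics a results() method
that hands out its data once; it is not the Bio::Tools::Run::Analysis code
and makes no claim about why the real method behaves this way.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in that mimics the reported symptom.
    package My::Job;
    sub new {
        my ($class) = @_;
        return bless { pending => { outseq => 'ACGT' } }, $class;
    }
    sub results {
        my ($self) = @_;
        my $data = delete $self->{pending};    # data is handed out only once
        return $data || {};
    }
}

my $job = My::Job->new;

# Workaround: call results() once and reuse the reference.
my $results = $job->results;
my $first   = $results->{outseq};
my $again   = $results->{outseq};

# A second call shows the symptom: an empty hash reference.
my $second_call_keys = scalar keys %{ $job->results };

print "first=$first again=$again keys on second call=$second_call_keys\n";
```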

Chris

On Oct 9, 2006, at 4:01 PM, Sendu Bala wrote:

> I thought I'd 'advertise' this bug on the list so more people see it:
> http://bugzilla.open-bio.org/show_bug.cgi?id=2117
>
> I don't want to make the next 1.5.2 release candidate until its fixed.
> Does anyone have any idea about it? Even if you can't fix it, just
> explaining what's (supposed) to be going on would help a lot.
>
> Thank you,
> Sendu.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bosborne11 at verizon.net  Mon Oct  9 22:09:45 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 09 Oct 2006 22:09:45 -0400
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au>
Message-ID: <C1507929.AB8F%bosborne11@verizon.net>

Torsten,

Fixed interpro.pm, it could have been written more simply (or more like
other SeqIO modules). Can't really address the others.

Brian O.


On 10/9/06 12:52 AM, "Torsten Seemann"
<torsten.seemann at infotech.monash.edu.au> wrote:

> Hi all,
> 
> The following modules have more than one "package xxxx;" declaration in
> them. For small, internal classes I guess this is fine, but for others,
> they should be split up into the filesystem - otherwise they are
> troublesome to locate and the online documentation doesn't list them!
> 
> eg.
> bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> is in
> bioperl-run/Bio/Tools/Run/Analysis.pm
> 
> Here's the culprits:
> 
> % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio |
> sed 's/:.*$//' | sort | uniq -d ; done
> 
> bioperl-live/Bio/AnalysisI.pm
> bioperl-live/Bio/DB/Fasta.pm
> bioperl-live/Bio/DB/GFF.pm
> bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> bioperl-live/Bio/SeqIO/interpro.pm
> 
> bioperl-run/Bio/Tools/Run/Analysis.pm
> bioperl-run/Bio/Tools/Run/Analysis/soap.pm


From bix at sendu.me.uk  Tue Oct 10 03:03:20 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 08:03:20 +0100
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
References: <452AB8C4.1010704@sendu.me.uk>
	<86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
Message-ID: <452B45B8.8010401@sendu.me.uk>

Chris Fields wrote:
> I have 'fixed' this in CVS.  Note the quotes; it depends on what you  
> might consider fixed.  Multiple calls to results() were returning  
> empty hash refs, so no data was being returned.   For now, I stored  
> the hash reference in a variable then tested each one.  All tests now  
> pass, including the 'outseq' one.
> 
> Maybe it's just me, but shouldn't results() either consistently  
> return the same information, or contain documentation that it doesn't  
> do so?  Anyway, I have left the bugzilla report open for now.

Judging by the tests there seems a clear expectation that multiple calls 
to results() should work, and certainly that makes sense and seems 
natural. So I'd say that results() should be fixed and the test script 
reverted.


From cjfields at uiuc.edu  Tue Oct 10 07:42:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 06:42:33 -0500
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <452B45B8.8010401@sendu.me.uk>
References: <452AB8C4.1010704@sendu.me.uk>
	<86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
	<452B45B8.8010401@sendu.me.uk>
Message-ID: <A161CB8B-3997-4E09-A6E3-4A44B70BE4A0@uiuc.edu>

I agree, though I think Martin Senger should be contacted, at least  
to get his thoughts.  Has anyone tried yet?

Chris

On Oct 10, 2006, at 2:03 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> I have 'fixed' this in CVS.  Note the quotes; it depends on what you
>> might consider fixed.  Multiple calls to results() were returning
>> empty hash refs, so no data was being returned.   For now, I stored
>> the hash reference in a variable then tested each one.  All tests now
>> pass, including the 'outseq' one.
>>
>> Maybe it's just me, but shouldn't results() either consistently
>> return the same information, or contain documentation that it doesn't
>> do so?  Anyway, I have left the bugzilla report open for now.
>
> Judging by the tests there seems a clear expectation that multiple  
> calls
> to results() should work, and certainly that makes sense and seems
> natural. So I'd say that results() should be fixed and the test script
> reverted.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Oct 10 08:14:31 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 13:14:31 +0100
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <A161CB8B-3997-4E09-A6E3-4A44B70BE4A0@uiuc.edu>
References: <452AB8C4.1010704@sendu.me.uk>
	<86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
	<452B45B8.8010401@sendu.me.uk>
	<A161CB8B-3997-4E09-A6E3-4A44B70BE4A0@uiuc.edu>
Message-ID: <452B8EA7.1080800@sendu.me.uk>

Chris Fields wrote:
> I agree, though I think Martin Senger should be contacted, at least to 
> get his thoughts.  Has anyone tried yet?

He's CCd on the bug report, but I haven't tried directly, no. Do you 
want to tackle this (contacting him and/or fixing the bug)?

Cheers,
Sendu.

From cjfields at uiuc.edu  Tue Oct 10 09:20:03 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 08:20:03 -0500
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <452B8EA7.1080800@sendu.me.uk>
Message-ID: <001801c6ec6e$cc016900$15327e82@pyrimidine>

I'll try giving it a closer look, just didn't have much time yesterday.
I'll also try contacting Martin.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Tuesday, October 10, 2006 7:15 AM
> To: bioperl-l
> Subject: Re: [Bioperl-l] Analysis soap problem
> 
> Chris Fields wrote:
> > I agree, though I think Martin Senger should be contacted, at least to
> > get his thoughts.  Has anyone tried yet?
> 
> He's CCd on the bug report, but I haven't tried directly, no. Do you
> want to tackle this (contacting him and/or fixing the bug)?
> 
> Cheers,
> Sendu.


From pmiguel at purdue.edu  Tue Oct 10 10:26:35 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Tue, 10 Oct 2006 10:26:35 -0400
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452AB7B8.4040404@sendu.me.uk>
References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk>
Message-ID: <452BAD9B.5010903@purdue.edu>

Sendu Bala wrote:
>
> BTW Phillip, thank you for the bug report but in future use the 
> attachment capabilities for files, please don't paste them into the 
> comments box.
>   
Sendu,
    Sounds reasonable to me. I should note, however, that when I entered the 
bug, I was looking for some method to attach files. There is none on the 
"Enter Bug: Bioperl" page:

http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl

Also, "bug writing guidelines" makes no mention of it. I vaguely 
remembered there being some method to do it--but given the "bug writing 
guidelines" exhortations to be specific and detailed, I thought I must 
put the information somewhere. So I put them in the only place offered 
(on that page)--"Description:"
    I see that, once submitted, attachments can be added to a bug 
report. Is that normally how it is done? Doesn't each attachment result 
in a separate email to the bioperl guts email list?
    Anyway,  I've just added the files to the bug report as attachments, 
in case someone needs them to construct a test.
   
-- 
Phillip

From bix at sendu.me.uk  Tue Oct 10 11:10:25 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 16:10:25 +0100
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452BAD9B.5010903@purdue.edu>
References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk>
	<452BAD9B.5010903@purdue.edu>
Message-ID: <452BB7E1.5020200@sendu.me.uk>

Phillip San Miguel wrote:
> Sendu Bala wrote:
>> BTW Phillip, thank you for the bug report but in future use the 
>> attachment capabilities for files, please don't paste them into the
>>  comments box.
>> 
> Sendu, Sounds reasonable to me. I should note, however; when I
> entered the bug, I was looking for some method to attach files. There
> is none on the "Enter Bug: Bioperl" page:
> 
> http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl
> 
> Also, "bug writing guidelines" makes no mention of it. I vaguely 
> remembered there being some method to do it--but given the "bug
> writing guidelines" exhortations to be specific and detailed, I
> thought I must put the information somewhere. So I put them in the
> only place offered (on that page)--"Description:"

I agree that things could be better here. Who looks after bugzilla, and
is this an alterable feature?


> I see that, once submitted, attachments can be added to a bug report.
> Is that normally how it is done?

Yes, AFAIK.


> Doesn't each attachment result in a separate email to the bioperl
> guts email list?

Yes, but that's not a problem. In fact, doing it this way means you
don't email everyone subscribed to guts your big files in plain text,
but instead they get a small email with a link to the download.


> Anyway,  I've just added the files to the bug report as attachments,
>  in case someone needs them to construct a test.

Thank you.

From arareko at campus.iztacala.unam.mx  Tue Oct 10 11:14:00 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Tue, 10 Oct 2006 10:14:00 -0500
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452BAD9B.5010903@purdue.edu>
References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk>
	<452BAD9B.5010903@purdue.edu>
Message-ID: <452BB8B8.40409@campus.iztacala.unam.mx>

Phillip San Miguel wrote:
> I see that, once submitted, attachments can be added to a bug report.
>  Is that normally how it is done?

Yes, it's the normal method: create the bug report, then attach files.

> Doesn't each attachment result in a separate email to the bioperl 
> guts email list?

Adding a file will generate an informative email per bug change 
(attaching the file in this case) but won't send the attachment to the list.

Regards,
Mauricio.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From cjfields at uiuc.edu  Tue Oct 10 11:20:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 10:20:55 -0500
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452BAD9B.5010903@purdue.edu>
Message-ID: <002801c6ec7f$ae8d85f0$15327e82@pyrimidine>

> Also, "bug writing guidelines" makes no mention of it. I vaguely
> remembered there being some method to do it--but given the "bug writing
> guidelines" exhortations to be specific and detailed, I thought I must
> put the information somewhere. So I put them in the only place offered
> (on that page)--"Description:"
>     I see that, once submitted, attachments can be added to a bug
> report. Is that normally how it is done? Doesn't each attachment result
> in a separate email to the bioperl guts email list?
>     Anyway,  I've just added the files to the bug report as attachments,
> in case someone needs them to construct a test.

Phillip,

Initial bug reports only require the general description, OS used, bioperl
version, etc.  That's quite normal.  Any relevant attachments are added
afterward.  We should probably make that clearer upfront on the wiki page; I
don't know if anyone can make similar changes to bugzilla.

Any bug changes, CVS commits, etc are mailed to bioperl-guts, yes.  That
isn't an issue though; it keeps the developers updated on the various
bugs/commits that are going on and is a pretty common practice.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 10 12:48:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 11:48:22 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au>
References: <4529D58D.1080004@infotech.monash.edu.au>
Message-ID: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>

There are a number of other bioperl-run examples (the  
Bio::Tools::Run::Analysis::soap issue I looked into revealed such).

I agree with both points, 1) that it depends on the size of the  
classes, and 2) from a maintainability standpoint, it can be very  
frustrating when looking for documentation.  Is there really any  
advantage to doing this?

Chris

On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote:

> Hi all,
>
> The following modules have more than one "package xxxx;"  
> declaration in
> them. For small, internal classes I guess this is fine, but for  
> others,
> they should be split up into the filesystem - otherwise they are
> troublesome to locate and the online documentation doesn't list them!
>
> eg.
> bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> is in
> bioperl-run/Bio/Tools/Run/Analysis.pm
>
> Here's the culprits:
>
> % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio |
> sed 's/:.*$//' | sort | uniq -d ; done
>
> bioperl-live/Bio/AnalysisI.pm
> bioperl-live/Bio/DB/Fasta.pm
> bioperl-live/Bio/DB/GFF.pm
> bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> bioperl-live/Bio/SeqIO/interpro.pm
>
> bioperl-run/Bio/Tools/Run/Analysis.pm
> bioperl-run/Bio/Tools/Run/Analysis/soap.pm
>
> -- 
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From lzhtom at hotmail.com  Tue Oct 10 15:42:48 2006
From: lzhtom at hotmail.com (zhihua li)
Date: Tue, 10 Oct 2006 19:42:48 +0000
Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise?
Message-ID: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>

Hi netters.

I've installed Bioperl 1.5.1, both core and run modules.  But when I tried 
to use the Pise module, an error occured saying that there's no "new" 
method in this package.

My script is:

use strict;
use warnings;
use Bio::Tools::Run::AnalysisFactory::Pise;
my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
my $program=$factory->program('mfold');
$program->seq('my_input_file');
my $job = $program->run();
print STDERR $job->contect('mfold.out');

The error message I got is:

Can't locate object method "new" via package 
"Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load 
"Bio::Tools::Run::AnalysisFactor::Pise"?)

I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm and 
it DOES contain a sub new.

So what's going on? Anyone could give me a hint?

Thanks a lot!
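
One thing visible in the script as posted: the "use" line loads
Bio::Tools::Run::AnalysisFactory::Pise, while the constructor is called on
Bio::Tools::Run::AnalysisFactor::Pise (no "y"). The sketch below reproduces
that kind of mismatch with a stand-in package (My::AnalysisFactory::Pise,
hypothetical) so it runs without BioPerl; whether this is the whole story
for the original script is an assumption.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for the module that was actually loaded.
    package My::AnalysisFactory::Pise;
    sub new { my ($class) = @_; return bless {}, $class }
}

# Correct spelling: the package exists, so new() is found.
my $ok = eval { My::AnalysisFactory::Pise->new };

# Misspelled ("Factor" instead of "Factory"): no such package is loaded,
# so Perl cannot locate the method and the eval fails.
my $bad   = eval { My::AnalysisFactor::Pise->new };
my $error = $@;

print "error: $error";
```

The message captured in $error has the same "Can't locate object method"
form as the one reported above.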


From cjfields at uiuc.edu  Tue Oct 10 16:27:27 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 15:27:27 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
	<6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
Message-ID: <E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>

Makes sense to me.  I think, as long as they're documented, it  
shouldn't be a problem.

I think the main point is that the class methods for these don't show  
up using perldoc (something I ran into with Bio::DB::Fasta's  
inclusion of Bio::PrimarySeq::Fasta), but they do show up when using  
other documentation.  So 'perldoc Bio::DB::Fasta' works, but 'perldoc  
Bio::PrimarySeq::Fasta' doesn't.  So these can be problematic when  
looking for specific methods.

However, I think pod2html handles multiple package declarations in  
one module, and the online Pdoc documentation does as well.  Does the Deobfuscator?

Chris

On Oct 10, 2006, at 3:11 PM, Lincoln Stein wrote:

> Hi,
>
> These ones are all mine:
>
> > bioperl-live/Bio/DB/Fasta.pm
> > bioperl-live/Bio/DB/GFF.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
>
> In each case, the second modules are teeny tiny ones that implement  
> iterators which are at most two methods long (typically a new() and  
> a next()). I prefer not to split them out because they will just  
> clutter up the file tree with stuff that is already well documented  
> in the "parent ship" modules.
>
> Lincoln
>
>
> On 10/10/06, Chris Fields <cjfields at uiuc.edu> wrote: There are a  
> number of other bioperl-run examples (the
> Bio::Tools::Run::Analysis::soap issue I looked into revealed such).
>
> I agree with both points, 1) that it depends on the size of the
> classes, and 2) from a maintainability standpoint, it can be very
> frustrating when looking for documentation.  Is there really any
> advantage to doing this?
>
> Chris
>
> On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote:
>
> > Hi all,
> >
> > The following modules have more than one "package xxxx;"
> > declaration in
> > them. For small, internal classes I guess this is fine, but for
> > others,
> > they should be split up into the filesystem - otherwise they are
> > troublesome to locate and the online documentation doesn't list  
> them!
> >
> > eg.
> > bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> > is in
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> >
> > Here's the culprits:
> >
> > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/ 
> Bio |
> > sed 's/:.*$//' | sort | uniq -d ; done
> >
> > bioperl-live/Bio/AnalysisI.pm
> > bioperl-live/Bio/DB/Fasta.pm
> > bioperl-live/Bio/DB/GFF.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> > bioperl-live/Bio/SeqIO/interpro.pm
> >
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> > bioperl-run/Bio/Tools/Run/Analysis/soap.pm
> >
> > --
> > Dr Torsten Seemann               http://www.vicbioinformatics.com
> > Victorian Bioinformatics Consortium, Monash University, Australia
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> -- 
> Lincoln D. Stein
> Cold Spring Harbor Laboratory
> 1 Bungtown Road
> Cold Spring Harbor, NY 11724
> (516) 367-8380 (voice)
> (516) 367-8389 (fax)
> FOR URGENT MESSAGES & SCHEDULING,
> PLEASE CONTACT MY ASSISTANT,
> SANDRA MICHELSEN, AT michelse at cshl.edu

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 10 16:30:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 15:30:16 -0500
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
In-Reply-To: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <870B7500-AA83-42D7-965B-865B91AA8E7F@uiuc.edu>


On Oct 10, 2006, at 2:42 PM, zhihua li wrote:

> Hi netters.
>
> I've installed Bioperl 1.5.1, both core and run modules.  But when  
> I tried to use the Pise module, an error occured saying that  
> there's no "new" method in this package.
>
> My script is:
>
> use strict;
> use warnings;
> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
> my $program=$factory->program('mfold');
> $program->seq('my_input_file');
> my $job = $program->run();
> print STDERR $job->contect('mfold.out');
>
> The error message I got is:
>
> Can't locate object method "new" via package  
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load  
> "Bio::Tools::Run::AnalysisFactor::Pise"?)
>
> I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/ 
> Pise.pm and it DOES contain a sub new.
>
> So what's going on? Anyone could give me a hint?
>
> Thanks a lot!

Well, according to your error output you have AnalysisFactory  
misspelled ('AnalysisFactor'), which should tell you what the problem  
is.  Look for the same thing in your script.
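
To see why a one-letter typo produces this particular message, here is a minimal sketch in plain Perl. The package My::AnalysisFactory::Pise is a hypothetical stand-in defined inline (it is not the real Pise module), so nothing from BioPerl is needed to run it:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the real module: a class that does define new().
package My::AnalysisFactory::Pise;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

package main;

# Correct spelling: the method is found and an object comes back.
my $ok = My::AnalysisFactory::Pise->new();
print ref($ok), "\n";

# Misspelled package name ("Factor" instead of "Factory"): Perl looks for
# new() in a package that was never defined, so the call dies at run time.
my $bad = eval { My::AnalysisFactor::Pise->new() };
print "Error: $@" unless defined $bad;
```

The "use" line loads one name and the method call uses another, so Perl only notices the mismatch when new() is actually called.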

Chris


>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Oct 10 16:43:06 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 21:43:06 +0100
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
In-Reply-To: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <452C05DA.5050803@sendu.me.uk>

zhihua li wrote:
> Hi netters.
> 
> I've installed Bioperl 1.5.1, both core and run modules.  But when I 
> tried to use the Pise module, an error occured saying that there's no 
> "new" method in this package.
> 
> My script is:
> 
> use strict;
> use warnings;
> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
> my $program=$factory->program('mfold');
> $program->seq('my_input_file');
> my $job = $program->run();
> print STDERR $job->contect('mfold.out');
> 
> The error message I got is:
> 
> Can't locate object method "new" via package 
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load 
> "Bio::Tools::Run::AnalysisFactor::Pise"?)
> 
> I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm 
> and it DOES contain a sub new.
> 
> So what's going on? Anyone could give me a hint?

You have a typo.

Bio::Tools::Run::AnalysisFactory::Pise, not
Bio::Tools::Run::AnalysisFactor::Pise

From lincoln.stein at gmail.com  Tue Oct 10 16:11:00 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 10 Oct 2006 16:11:00 -0400
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
Message-ID: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>

Hi,

These ones are all mine:

> bioperl-live/Bio/DB/Fasta.pm
> bioperl-live/Bio/DB/GFF.pm
> bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> bioperl-live/Bio/DB/SeqFeature/Store/memory.pm

In each case, the second modules are teeny tiny ones that implement
iterators which are at most two methods long (typically a new() and a
next()). I prefer not to split them out because they will just clutter up
the file tree with stuff that is already well documented in the "parent
ship" modules.
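
For readers who have not seen the pattern, here is a minimal sketch of a "parent" class with a tiny iterator package in the same file. All names (My::DB::Store and its Iterator) are hypothetical and are not taken from the modules listed above:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical "parent" class, standing in for a database-style module.
package My::DB::Store;

sub new {
    my ($class, @ids) = @_;
    return bless { ids => [@ids] }, $class;
}

# Hand back an iterator object from the small helper package below.
sub get_stream {
    my ($self) = @_;
    return My::DB::Store::Iterator->new($self->{ids});
}

# Second package in the same file: a tiny iterator with just new() and next().
package My::DB::Store::Iterator;

sub new {
    my ($class, $ids) = @_;
    return bless { ids => [@$ids], pos => 0 }, $class;
}

sub next {
    my ($self) = @_;
    return if $self->{pos} >= @{ $self->{ids} };
    return $self->{ids}[ $self->{pos}++ ];
}

package main;

my $store  = My::DB::Store->new(qw(seq1 seq2 seq3));
my $stream = $store->get_stream;
while (defined(my $id = $stream->next)) {
    print "$id\n";
}
```

If this lived in My/DB/Store.pm, there would be no Iterator.pm on disk, which is why perldoc cannot find the second package by name.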

Lincoln


On 10/10/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> There are a number of other bioperl-run examples (the
> Bio::Tools::Run::Analysis::soap issue I looked into revealed such).
>
> I agree with both points, 1) that it depends on the size of the
> classes, and 2) from a maintainability standpoint, it can be very
> frustrating when looking for documentation.  Is there really any
> advantage to doing this?
>
> Chris
>
> On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote:
>
> > Hi all,
> >
> > The following modules have more than one "package xxxx;"
> > declaration in
> > them. For small, internal classes I guess this is fine, but for
> > others,
> > they should be split up into the filesystem - otherwise they are
> > troublesome to locate and the online documentation doesn't list them!
> >
> > eg.
> > bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> > is in
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> >
> > Here's the culprits:
> >
> > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio |
> > sed 's/:.*$//' | sort | uniq -d ; done
> >
> > bioperl-live/Bio/AnalysisI.pm
> > bioperl-live/Bio/DB/Fasta.pm
> > bioperl-live/Bio/DB/GFF.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> > bioperl-live/Bio/SeqIO/interpro.pm
> >
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> > bioperl-run/Bio/Tools/Run/Analysis/soap.pm
> >
> > --
> > Dr Torsten Seemann               http://www.vicbioinformatics.com
> > Victorian Bioinformatics Consortium, Monash University, Australia
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From asjo at koldfront.dk  Tue Oct 10 16:04:35 2006
From: asjo at koldfront.dk (Adam Sjøgren)
Date: Tue, 10 Oct 2006 22:04:35 +0200
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <871wpglyy4.fsf@topper.koldfront.dk>

On Tue, 10 Oct 2006 19:42:48 +0000, zhihua wrote:

> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
                                               ^
                                               y
[...]

> Can't locate object method "new" via package
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load
> "Bio::Tools::Run::AnalysisFactor::Pise"?)

You missed a 'y' in "Factory".


  Best wishes,

-- 
 "We've reached a special place... Spiritually...             Adam Sjøgren
  ecumenically... grammatically."                        asjo at koldfront.dk


From dmessina at wustl.edu  Tue Oct 10 17:08:45 2006
From: dmessina at wustl.edu (David Messina)
Date: Tue, 10 Oct 2006 16:08:45 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
	<6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
	<E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>
Message-ID: <A38D8092-7DCE-4C1E-A00A-2F7E78276ED9@wustl.edu>

> However, I think pod2html handles multiple package declarations in
> one module, and the PDOC online do as well.  Does the Deobfuscator?

Nope. From my cursory examination at the time they mostly were, as  
Lincoln said, short and sweet, so I didn't consider it a big deal.

I do think the Deobfuscator should theoretically handle such cases  
anyway, though. I'll add it as a feature request on the wiki page. Or  
if you're chomping at the bit for it, I could certainly be beer- 
suaded to do it sooner rather than later... :)

Dave


From cjfields at uiuc.edu  Tue Oct 10 17:33:39 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 16:33:39 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <A38D8092-7DCE-4C1E-A00A-2F7E78276ED9@wustl.edu>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
	<6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
	<E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>
	<A38D8092-7DCE-4C1E-A00A-2F7E78276ED9@wustl.edu>
Message-ID: <7F35F565-7D28-4B06-A501-4D4083652C5C@uiuc.edu>

Me?  I'm a lowly postdoc.  Lincoln's got the cash!

Chris

On Oct 10, 2006, at 4:08 PM, David Messina wrote:

>> However, I think pod2html handles multiple package declarations in
>> one module, and the PDOC online do as well.  Does the Deobfuscator?
>
> Nope. From my cursory examination at the time they mostly were, as  
> Lincoln said, short and sweet, so I didn't consider it a big deal.
>
> I do think the Deobfuscator should theoretically handle such cases  
> anyway, though. I'll add it as a feature request on the wiki page.  
> Or if you're chomping at the bit for it, I could certainly be beer- 
> suaded to do it sooner rather than later... :)
>
> Dave
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From sdavis2 at mail.nih.gov  Wed Oct 11 05:43:35 2006
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 11 Oct 2006 05:43:35 -0400
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
In-Reply-To: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <452CBCC7.30108@mail.nih.gov>

zhihua li wrote:
> Hi netters.
>
> I've installed Bioperl 1.5.1, both core and run modules. But when I
> tried to use the Pise module, an error occured saying that there's no
> "new" method in this package.
>
> My script is:
>
> use strict;
> use warnings;
> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
> my $program=$factory->program('mfold');
> $program->seq('my_input_file');
> my $job = $program->run();
> print STDERR $job->contect('mfold.out');
>
> The error message I got is:
>
> Can't locate object method "new" via package
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load
> "Bio::Tools::Run::AnalysisFactor::Pise"?)
>
> I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm
> and it DOES contain a sub new.
>
> So what's going on? Anyone could give me a hint?
>
> Thanks a lot!

The module name is Bio::Tools::Run::AnalysisFactory::Pise. Note that it
is not "factor" but "factory". That should probably fix your problem.

Sean

From jay at jays.net  Sat Oct  7 18:34:23 2006
From: jay at jays.net (Jay Hannah)
Date: Sat, 07 Oct 2006 17:34:23 -0500
Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult
Message-ID: <45282B6F.1030308@jays.net>

I just updated my bioperl-live this morning, so I think I'm current. :)

perldoc Bio::Search::Result::GenericResult
------------
SYNOPSIS
           # typically one gets Results from a SearchIO stream
           use Bio::SearchIO;
           my $io = new Bio::SearchIO(-format => 'blast',
                                       -file   => 't/data/HUMBETGLOA.tblastx');
           while( my $result = $io->next_result) {
               # process all search results within the input stream
               while( my $hit = $result->next_hits()) {
-------------

Except that "next_hits()" does not exist. Should be "next_hit()".
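
For reference, a sketch of the same loop with the method name corrected. It assumes BioPerl is installed and that it is run from the top of a bioperl-live checkout so the test data file exists; the helper name print_hit_names is made up for this example:

```perl
use strict;
use warnings;
use Bio::SearchIO;

# Hypothetical helper: print the name of every hit in a BLAST report
# and return how many hits were seen.
sub print_hit_names {
    my ($file) = @_;
    my $io = Bio::SearchIO->new(-format => 'blast',
                                -file   => $file);
    my $count = 0;
    while (my $result = $io->next_result) {
        # next_hit() (singular) returns one hit per call, undef when done
        while (my $hit = $result->next_hit) {
            print $hit->name, "\n";
            $count++;
        }
    }
    return $count;
}

# Same report file as in the SYNOPSIS; skipped if it is not present.
my $file = 't/data/HUMBETGLOA.tblastx';
print_hit_names($file) if -e $file;
```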

(Should I have posted a patch instead?)

Thanks,

j


From bosborne11 at verizon.net  Tue Oct 10 18:42:25 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 10 Oct 2006 18:42:25 -0400
Subject: [Bioperl-l] Documentation typo:
	Bio::Search::Result::GenericResult
In-Reply-To: <45282B6F.1030308@jays.net>
Message-ID: <C1519A11.ABD1%bosborne11@verizon.net>

j,

No need, not for something so simple.

Brian O.


On 10/7/06 6:34 PM, "Jay Hannah" <jay at jays.net> wrote:

> Except that "next_hits()" does not exist. Should be "next_hit()".
> 
> (Should I have posted a patch instead?)


From zchou at cau.edu.cn  Wed Oct 11 02:34:24 2006
From: zchou at cau.edu.cn (zhuocheng Hou)
Date: Wed, 11 Oct 2006 14:34:24 +0800
Subject: [Bioperl-l] about retreive alinged sequence
Message-ID: <000a01c6ecff$4ea4b2f0$0915020a@zchou>

Hello,everyone,

I am a new user of Bioperl. I want to align multiple DNA sequences based on their translated proteins using clustalw. However, I don't know how to retrieve the aligned sequences.

The code is as follows (from the HOWTOPAML tutorial):

         #
         # This code runs, and I can see the clustalw output on the screen
         .......
         my $aa_aln = $aln_factory->align(\@prots,@params);
         # project the protein alignment back to CDS coordinates
         my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs);  
         my @each = $dna_aln->each_seq();         
         
         # The following code was written by me. However, it doesn't work. I want to get the $dna_aln contents and write them to a file. 


         my $in  = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta');
         my $aln=$dna_aln;
         my $out = Bio::AlignIO->new(-file => ">out.msf" ,
                                   -format => 'msf');
         #print $out $_ while <$in>; 
         while ($aln = $in->next_aln() ) {
               my $out->write_aln($aln);
         }
         

Best regards,

Zhuocheng
CAU


From n.haigh at sheffield.ac.uk  Wed Oct 11 10:00:33 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 11 Oct 2006 15:00:33 +0100
Subject: [Bioperl-l] about retreive alinged sequence
In-Reply-To: <000a01c6ecff$4ea4b2f0$0915020a@zchou>
References: <000a01c6ecff$4ea4b2f0$0915020a@zchou>
Message-ID: <452CF901.6020409@sheffield.ac.uk>

Dear Zhuocheng

I'm not familiar with the aa_to_dna_aln function, but it appears from 
your code that it returns an alignment object. Please find comments 
inserted below - hope they help!

Nathan

zhuocheng Hou wrote:
> Hello,everyone,
>
> I am a new user of Bioperl. I want to align mutiple DNA sequences based on translated proteins by using clustalw. However, I don't know how to retreive aligned sequences out.
>
> The codes as follows (from the tutorials of HOWTOPAML):
>
>          #
>          # These codes run  and can find the screen print out of clustalw
>          .......
>          my $aa_aln = $aln_factory->align(\@prots, at params);
>          # project the protein alignment back to CDS coordinates
>          my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs);  
>   
$dna_aln should be an alignment object (a Bio::SimpleAlign, not a 
Bio::AlignIO stream), so all you need to do is set up the output stream 
to write the alignment object, similar to what you wrote below, i.e.

my $out = Bio::AlignIO->new(-file => ">out.msf" ,
                                   -format => 'msf');

Then simply write the alignment ($dna_aln) to the output stream 
with this (note there is no "my" in front of $out here):

$out->write_aln($dna_aln);
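
Put together, a self-contained sketch might look like the following. It assumes BioPerl is installed; the two-sequence toy alignment is made up here and only stands in for the $dna_aln that aa_to_dna_aln() returns in your script:

```perl
use strict;
use warnings;
use Bio::LocatableSeq;
use Bio::SimpleAlign;
use Bio::AlignIO;

# Toy alignment standing in for the $dna_aln returned by aa_to_dna_aln().
my $dna_aln = Bio::SimpleAlign->new();
$dna_aln->add_seq(Bio::LocatableSeq->new(-id    => 'seq1',
                                         -seq   => 'ATGGCC---TAA',
                                         -start => 1,
                                         -end   => 9));
$dna_aln->add_seq(Bio::LocatableSeq->new(-id    => 'seq2',
                                         -seq   => 'ATGGCCGGGTAA',
                                         -start => 1,
                                         -end   => 12));

# Output stream only; no input stream is needed because we already
# hold the alignment object in memory.
my $out = Bio::AlignIO->new(-file => '>out.msf', -format => 'msf');
$out->write_aln($dna_aln);

# each_seq() gives the individual aligned sequences, gaps included.
for my $seq ($dna_aln->each_seq) {
    print $seq->display_id, "\t", $seq->seq, "\n";
}
```

Changing -format (for example to 'fasta' or 'clustalw') should be enough to get a different output format.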


>          my @each = $dna_aln->each_seq();         
>          
>          # The following codes were writed by me. However, it doesn't work. I want to get teh $dna_aln contents and write it to files. 
>
>
>          my $in  = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta');
>          my $aln=$dna_aln;
>          my $out = Bio::AlignIO->new(-file => ">out.msf" ,
>                                    -format => 'msf');
>          #print $out $_ while <$in>; 
>          while ($aln = $in->next_aln() ) {
>                my $out->write_aln($aln);
>          }
>          
>
> Best regards,
>
> Zhuocheng
> CAU
>
>   
> ------------------------------------------------------------------------
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

From melcher at rescomp.berkeley.edu  Wed Oct 11 17:09:17 2006
From: melcher at rescomp.berkeley.edu (Graham Melcher)
Date: Wed, 11 Oct 2006 14:09:17 -0700
Subject: [Bioperl-l] Accessing GO through MYSQL?
Message-ID: <20061011210917.GA783@rescomp.berkeley.edu>

Hey all,

Preface:: This is my first post to this list, please redirect if my
questions belong elsewhere.  

I need to lookup GO ontology information given GO:Accessors, and I have
a local mysql db that mirrors the GO db from that website.  I am not
sure if the Bio::Ontology::* libraries were designed to be used in a
dynamic, load-as-you-need sort of way, and am wondering how other people
have gone about solving this problem.  Details follow...

Right now I'm using Class::DBI to access the MySQL database, and I have made a
new set of subclasses of Bio::Ontology::TermI and
Bio::Ontology::RelationshipI which use these Class::DBI objects to
access the relevant information in the database on the fly.
Unfortunately, I got stuck implementing some of
the other Bio::Ontology::*I interfaces, especially Ontology.   Making all of these
subclasses seems infeasible, or at least enough work that it might already be
available somewhere.  Are MySQL accessors out there and I just haven't
found them, or is Bio::Ontology possibly not the way to go?  

Alternatively, if I end up having to write this sort of Bio::Ontology -
Class::DBI interface, would anyone be interested in it being made
generally usable and available?

Finally, I just found go-perl, but although I haven't had a lot of time
to look into it, it doesn't seem to use mysql either.

Thanks!

Graham

-- 
Graham Melcher

From sdavis2 at mail.nih.gov  Thu Oct 12 07:51:14 2006
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 12 Oct 2006 07:51:14 -0400
Subject: [Bioperl-l] Accessing GO through MYSQL?
In-Reply-To: <20061011210917.GA783@rescomp.berkeley.edu>
References: <20061011210917.GA783@rescomp.berkeley.edu>
Message-ID: <452E2C32.7070502@mail.nih.gov>

Graham Melcher wrote:
> Finally, I just found go-perl, but although I haven't had a lot of time
> to look into it, it doesn't seem to use mysql either.
>   
Yep.  Keep going.  Go-perl and Go-db-perl:

http://www.godatabase.org/dev/go-db-perl/doc/go-db-perl-doc.html

Sean

From hlapp at gmx.net  Thu Oct 12 00:44:49 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 12 Oct 2006 00:44:49 -0400
Subject: [Bioperl-l] NESCent Phyloinformatics Hackathon
Message-ID: <939B253E-2F87-450A-A277-78B5645D3494@gmx.net>

(apologies in advance to those who receive this multiple times)

The National Evolutionary Synthesis Center (NESCent) in collaboration  
with Arlin Stoltzfus (U. Maryland, NIST), Aaron Mackey (GSK), Rutger  
Vos (UBC), and Mark Holder (FSU) sponsors a Phyloinformatics  
Hackathon to take place Dec 11-15 in Durham, NC.

The (wiki) website with more information and a formal proposal is at

	https://www.nescent.org/wg_phyloinformatics/

In short, the goal is to leverage the Bio* toolkits to provide the  
"glue" for evolutionary analyses of various types that depend on  
automation, interoperability, and data integration.

CALL FOR INPUT:

The specific objectives are driven by "use cases", that is, specific  
target problems of interest to evolutionary biologists (click 'Use  
Cases' at the above website). We invite community input in order to  
focus efforts on the most urgent or pervasive problems. The wiki for  
the hackathon allows direct editing of the use cases after  
registration. You may also upload data files, or add comments to the  
"Forum" page. Alternatively, send email to hlapp at nescent.org. You  
may also contact any of the organizers with questions or comments.

ATTENDANCE:

The hackathon is scheduled for Dec 11-15, 2006 in Durham NC. Space is  
limited, and attendance is by invitation. If you have not been  
contacted but desire to attend, please contact Hilmar Lapp (hlapp at  
nescent.org).

ORGANIZERS:

Hilmar Lapp (NESCent; hlapp at nescent.org)
Aaron Mackey (GSK; aaron.j.mackey at gsk.com)
Mark Holder (FSU; mholder at scs.fsu.edu)
Arlin Stoltzfus (CARB, NIST; arlin.stoltzfus at nist.gov)
Todd Vision (NESCent; tjv at bio.unc.edu)
Rutger Vos (UBC; rvosa at sfu.ca)


From neetisomaiya at gmail.com  Thu Oct 12 02:03:20 2006
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 12 Oct 2006 11:33:20 +0530
Subject: [Bioperl-l] need help urgently
Message-ID: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>

We are using BioPerl to parse a BLAST output file, and we then want to load
the full alignments into a CLOB column in one of our database tables. We are
trying to use sql loader for this. Does anyone have an idea how we can go
about it?
We have tried loading sequences into CLOB columns using sql loader, and that
works fine, but the same syntax does not work when used for loading
alignments.

-- 
-Neeti
Even my blood says, B positive

From sayali_salodkar at persistent.co.in  Thu Oct 12 06:16:34 2006
From: sayali_salodkar at persistent.co.in (Sayali)
Date: Thu, 12 Oct 2006 15:46:34 +0530
Subject: [Bioperl-l] regarding polyphred output
Message-ID: <006e01c6ede7$7f10bef0$00b1580a@persistent.co.in>

Hi, 

I want to parse the output of polyphred
(http://droog.gs.washington.edu/PolyPhred.html). Is there a parser already
available in Bioperl which would help me do this?

Thanks,

Sayali

 
DISCLAIMER
==========
This e-mail may contain privileged and confidential information which is the property of Persistent Systems Pvt. Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Pvt. Ltd. does not accept any liability for virus infected mails.

From sdavis2 at mail.nih.gov  Thu Oct 12 06:40:12 2006
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 12 Oct 2006 06:40:12 -0400
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
Message-ID: <200610120640.12250.sdavis2@mail.nih.gov>

On Thursday 12 October 2006 02:03, neeti somaiya wrote:
> We are using BioPerl to parse a BLAST output file, and then we want to load
> full alignments into a CLOB column in one of our database tables. We are
> trying to use sql loader for the same. Anyone has an idea how we can go
> about it?
> We have tried loading sequences into CLOB columns using sql loader, and
> that works fine, but the same syntax when used for loading alignments, is
> not working.

Neeti,

You'll need to be a bit more specific about what you are doing.  Can you post 
the code you are using and error messages?  Also, what is "sql loader"?  And 
what database are you trying to use?

Sean

From crabtree at tigr.ORG  Thu Oct 12 07:28:06 2006
From: crabtree at tigr.ORG (Jonathan Crabtree)
Date: Thu, 12 Oct 2006 07:28:06 -0400
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
Message-ID: <452E26C6.6040800@tigr.org>


Hi Neeti-

neeti somaiya wrote:
> We are using BioPerl to parse a BLAST output file, and then we want to load
> full alignments into a CLOB column in one of our database tables. We are
> trying to use sql loader for the same. Anyone has an idea how we can go
> about it?
>   

This doesn't sound like a BioPerl issue per se, so this list might not
be the best venue for your question.  Since SQL*Loader is an Oracle
utility you may have better luck in a forum frequented by Oracle DBAs
and/or general bioinformatics people.  (Not that this isn't such a
forum, but unless your difficulty is actually being caused by BioPerl,
or there's some kind of SQL*Loader wrapper in BioPerl--which I don't
think is the case--you run the risk of having people complain that your
question doesn't have enough to do with BioPerl.)

> We have tried loading sequences into CLOB columns using sql loader, and that
> works fine, but the same syntax when used for loading alignments, is not
> working.
>   

It's been a while since I've done any work with SQL*Loader, but I'd
guess that the reason it works with sequences and not alignments is
because there are characters in the alignments (newlines, perhaps?) that
SQL*Loader is incorrectly interpreting as either column (field) or row
(record) delimiters.  You may need to change your flat file encoding to
use delimiters other than the defaults (and alter the SQL*Loader control
file accordingly.)  As Sean pointed out, however, it's difficult to be
much help without seeing an example of a failed input and the
corresponding error(s)!  One other thing I remember about SQL*Loader (as
of Oracle 8-9 or so) is that all the CLOB values had to appear *last* in
the SQL*Loader record, at least if you were using variable-length
fields.  But since you've loaded sequences successfully, I doubt this is
the issue.  One final thought is that I believe SQL*Loader has an option
whereby you can place your LOB values in files external to the main
SQL*Loader input file, which sidesteps the field/row delimiter issue
completely; you may want to look into this if you're not already loading
your Oracle database this way.

Jonathan
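
A minimal sketch of the external-file approach mentioned above, in plain
Perl.  Everything named here is an assumption for illustration:
write_lob_files() is a hypothetical helper, the "<id>.aln" file naming and
the "|" field delimiter are arbitrary choices, and the alignment text is
assumed to be already available as a string keyed by some record id.  The
SQL*Loader control file would then have to treat the second field as the
name of the file holding the CLOB value (a LOBFILE clause, if memory
serves; check the SQL*Loader documentation for the exact syntax).

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: write each alignment to its own file and return
# the lines of a main data file of the form "id|path", one per record.
sub write_lob_files {
    my ($dir, %alignment_for) = @_;
    my @records;
    for my $id (sort keys %alignment_for) {
        my $path = File::Spec->catfile($dir, "$id.aln");
        open my $fh, '>', $path or die "Cannot write $path: $!";
        print {$fh} $alignment_for{$id};    # embedded newlines are harmless here
        close $fh or die "Cannot close $path: $!";
        push @records, join('|', $id, $path);
    }
    return @records;
}

# Usage: one made-up alignment, written to a temporary directory.
my $dir = tempdir(CLEANUP => 1);
print "$_\n" for write_lob_files(
    $dir,
    hit1 => "Query: 1 ACGT 4\nSbjct: 1 ACGT 4\n",
);
```

Because the multi-line alignment never appears in the main data file, the
default field and record delimiters no longer matter for that column.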


From bix at sendu.me.uk  Fri Oct 13 04:56:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 13 Oct 2006 09:56:01 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au>
References: <4521E74E.1040404@infotech.monash.edu.au>
Message-ID: <452F54A1.7010908@sendu.me.uk>

Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's 
certainly interface-like, but doesn't follow the normal interface naming 
convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed 
WrapperBaseI? Left alone?


From cjfields at uiuc.edu  Fri Oct 13 08:20:58 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 13 Oct 2006 07:20:58 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <452F54A1.7010908@sendu.me.uk>
References: <4521E74E.1040404@infotech.monash.edu.au>
	<452F54A1.7010908@sendu.me.uk>
Message-ID: <43CC4E80-8F15-4C83-929D-DDC719360C8F@uiuc.edu>

I would say, according to BioPerl convention, it should be renamed  
WrapperBaseI.  It has a few interface-like methods and (importantly)  
lacks a constructor.  Unless someone else out there has other reasoning?

Note that this will require lots of bioperl-run changes as well, at  
least I think it will.

Chris

On Oct 13, 2006, at 3:56 AM, Sendu Bala wrote:

> Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's
> certainly interface-like, but doesn't follow the normal interface  
> naming
> convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed
> WrapperBaseI? Left alone?
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From avilella at gmail.com  Fri Oct 13 11:26:47 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 13 Oct 2006 16:26:47 +0100
Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method
Message-ID: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com>

Hi all,

While using the remove_gaps method in Bio::SimpleAlign I found out
that if the alignment is bad enough to have no gap-free columns at
all, the method will give a:

Use of uninitialized value in split at this line in add_seq:

map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq);

So my idea was to tweak this line to something like:

map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || '');

But I am unsure about any other side effects this may have.

Anyone?

    Albert.
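
A minimal sketch of the guard outside BioPerl, for comparison.  The helper
symbols_of() is hypothetical and only stands in for the line in add_seq; it
assumes the sequence string can arrive as undef.  It uses defined() rather
than ||, which silences the same warning without also replacing a
false-but-defined string such as '0'.

```perl
use strict;
use warnings;

# Hypothetical helper, not BioPerl code: collect the distinct symbols of a
# sequence string that may be undef.
sub symbols_of {
    my ($str) = @_;
    my %symbols;
    # An undef string is treated as empty, so split never sees undef.
    $symbols{$_} = 1 for split //, (defined $str ? $str : '');
    return \%symbols;
}

# Usage: an undef sequence yields no symbols and no warning.
print scalar(keys %{ symbols_of(undef) }), " symbols for undef\n";
print join(',', sort keys %{ symbols_of('AC-A') }), "\n";
```

For real sequence strings the two guards behave the same; the difference
only shows up for a string that is defined but false.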

From cjfields at uiuc.edu  Fri Oct 13 11:51:38 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 13 Oct 2006 10:51:38 -0500
Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method
In-Reply-To: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com>
References: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com>
Message-ID: <EE9FE57F-EE17-44FE-B298-CD4084675085@uiuc.edu>

You can check to see if it passes all tests.  I'm guessing  
SimpleAlign.t tests this method out in some way (though it's always  
safer to check).

Chris

On Oct 13, 2006, at 10:26 AM, Albert Vilella wrote:

> Hi all,
>
> While using the remove_gaps method in Bio::SimpleAlign I found out
> that if the alignment is (bad enough for) having no columns without
> any gap at all, the method will give a:
>
> Use of uninitialized value in split at this line in add_seq:
>
> map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq);
>
> So my idea was to tweak this line to something like:
>
> map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || '');
>
> But I am unsure about any other side effects this may have.
>
> Anyone?
>
>     Albert.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jay at jays.net  Fri Oct 13 12:09:16 2006
From: jay at jays.net (Jay Hannah)
Date: Fri, 13 Oct 2006 11:09:16 -0500
Subject: [Bioperl-l] Documentation typo:
	Bio::Search::Result::GenericResult
In-Reply-To: <C1519A11.ABD1%bosborne11@verizon.net>
References: <C1519A11.ABD1%bosborne11@verizon.net>
Message-ID: <452FBA2C.7070003@jays.net>

Thanks Brian! 

My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :)

/home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v
----------------------------
revision 1.27
date: 2006/10/10 22:41:46;  author: bosborne;  state: Exp;  lines: +4 -4
next_hit, not next_hits
----------------------------

I'm a simple man who takes great satisfaction in the simple things. :)

j


Brian Osborne wrote:
> j,
> 
> No need, not for something so simple.
> 
> Brian O.
> 
> 
> On 10/7/06 6:34 PM, "Jay Hannah" <jay at jays.net> wrote:
>> Except that "next_hits()" does not exist. Should be "next_hit()".
>>
>> (Should I have posted a patch instead?)
> 


From jay at jays.net  Fri Oct 13 12:24:48 2006
From: jay at jays.net (Jay Hannah)
Date: Fri, 13 Oct 2006 11:24:48 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
Message-ID: <452FBDD0.2070008@jays.net>

So I'm doing the following:

1) Using Bio::SeqIO to read in a genbank file and kick out fasta.
2) Reading that fasta file w/ command line formatdb.
3) Using that output for command line blastall.
4) Using Bio::SearchIO to read the blast results.

(If there's a better way, do tell. -grin-)

This sequence is working great for nucleotide BLASTing, but I'm stuck on step 1 when trying protein BLAST. 

my $seq_in  = Bio::SeqIO->new(
   -file => "<Organism1.genbank", 
   -format => "genbank", 
   -alphabet => "protein"
);
my $seq_out_protein = Bio::SeqIO->new(
   -file => ">out",
   -format => 'fasta',
   -alphabet => 'protein'
);
while (my $inseq = $seq_in->next_seq) {
   $inseq->molecule("protein");
   $seq_out_protein->write_seq($inseq);
}

This creates a nucleotide file "out". Setting -alphabet doesn't seem to do anything. Setting molecule("protein") doesn't seem to do anything either.

I was expecting that it would just pull all the CDS strings out of the genbank file and dump those into fasta format?

Am I missing something obvious?

Thanks,

j


From bosborne11 at verizon.net  Fri Oct 13 12:54:02 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 13 Oct 2006 12:54:02 -0400
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <452FBDD0.2070008@jays.net>
Message-ID: <C1553CEA.AC2E%bosborne11@verizon.net>

Jay,

You're looking for the "translation" string in the CDS section, yes? You
need to delve a bit into features, the CDS is considered to be a feature of
the main or parent nucleotide sequence and the translation is part of CDS
feature:

http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank


Brian O.


On 10/13/06 12:24 PM, "Jay Hannah" <jay at jays.net> wrote:

> Am I missing something obvious?


From bix at sendu.me.uk  Fri Oct 13 12:59:46 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 13 Oct 2006 17:59:46 +0100
Subject: [Bioperl-l] Documentation
	typo:	Bio::Search::Result::GenericResult
In-Reply-To: <452FBA2C.7070003@jays.net>
References: <C1519A11.ABD1%bosborne11@verizon.net> <452FBA2C.7070003@jays.net>
Message-ID: <452FC602.3080302@sendu.me.uk>

Jay Hannah wrote:
> Thanks Brian! 
> 
> My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :)
> 
> /home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v
> ----------------------------
> revision 1.27
> date: 2006/10/10 22:41:46;  author: bosborne;  state: Exp;  lines: +4 -4
> next_hit, not next_hits
> ----------------------------

Congratulations! :D

Next it will be two byte corrections and from there, the sky's the limit! :)


From hlapp at gmx.net  Fri Oct 13 13:28:50 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 13 Oct 2006 13:28:50 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <452F54A1.7010908@sendu.me.uk>
References: <4521E74E.1040404@infotech.monash.edu.au>
	<452F54A1.7010908@sendu.me.uk>
Message-ID: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>

What does the POD (and the code) say about instantiating it?

	-hilmar

On Oct 13, 2006, at 4:56 AM, Sendu Bala wrote:

> Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's
> certainly interface-like, but doesn't follow the normal interface  
> naming
> convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed
> WrapperBaseI? Left alone?

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jay at jays.net  Fri Oct 13 14:56:38 2006
From: jay at jays.net (Jay Hannah)
Date: Fri, 13 Oct 2006 13:56:38 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <C1553CEA.AC2E%bosborne11@verizon.net>
References: <C1553CEA.AC2E%bosborne11@verizon.net>
Message-ID: <452FE166.5080405@jays.net>

Brian Osborne wrote:
> You're looking for the "translation" string in the CDS section, yes? You
> need to delve a bit into features, the CDS is considered to be a feature of
> the main or parent nucleotide sequence and the translation is part of CDS
> feature:
> 
> http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank

Yes. Thanks. I "rolled my own" -- I'm now doing this:

while (my $inseq = $seq_in->next_seq) {
   my @features = $inseq->get_SeqFeatures();
   foreach my $feat ( @features ) {
      next unless ($feat->primary_tag eq "CDS");
      my @db_xrefs = $feat->annotation->get_Annotations("db_xref");
      @db_xrefs = grep { /^GI:/ } @db_xrefs;
      die "Panic! More than one GI: db_xref?"     if (@db_xrefs > 1);
      die "Panic! No GI: db_xref?"            unless (@db_xrefs == 1);
      my $gi = $db_xrefs[0];
      $gi =~ s/^GI://;
      my @translations = $feat->annotation->get_Annotations("translation");
      die "Panic! More than one translation?" if (@translations > 1);
      my @protein_ids = $feat->annotation->get_Annotations("protein_id");
      die "Panic! More than one protein_id?"  if (@protein_ids > 1);
      my @product = $feat->annotation->get_Annotations("product");
      die "Panic! More than one product?"  if (@product > 1);
      print ">gi|$gi|gb|$protein_ids[0]|";
      print $inseq->id . " $product[0]\n";
      print "$translations[0]\n";
   }
}

To generate a homebrew fasta file for a protein BLAST.

I just thought that -alphabet and molecule() would do that stuff for me? What else would "protein" mean in those? Does anyone use -alphabet and/or molecule()? For what? How? Again, here's what I'm talking about:

==========
my $seq_out_protein = Bio::SeqIO->new(
   -file => ">out",
   -format => 'fasta',
   -alphabet => 'protein'    # No effect?
);
while (my $inseq = $seq_in->next_seq) {
   $inseq->molecule("protein");    # No effect?
==========

Thanks,

j


From bosborne11 at verizon.net  Fri Oct 13 17:20:40 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 13 Oct 2006 17:20:40 -0400
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <452FE166.5080405@jays.net>
Message-ID: <C1557B68.AC3E%bosborne11@verizon.net>

Jay,

Yes, people use the -alphabet parameter. If you set it to something then
Bioperl will not try to determine whether the sequence is protein, rna, or
dna and this is particularly useful when the sequence contains characters
that Bioperl would object to (sequences with distasteful characters can be
created by various applications, for example, or you might introduce some
weird character for some reason). Setting the -alphabet would also speed up
Bioperl a bit, for the same reason.

Brian O.


On 10/13/06 2:56 PM, "Jay Hannah" <jay at jays.net> wrote:

> 
> I just thought that -alphabet and molecule() would do that stuff for me? What
> else would "protein" mean in those? 


From jay at jays.net  Sat Oct 14 11:25:05 2006
From: jay at jays.net (Jay Hannah)
Date: Sat, 14 Oct 2006 10:25:05 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <C1557B68.AC3E%bosborne11@verizon.net>
References: <C1557B68.AC3E%bosborne11@verizon.net>
Message-ID: <45310151.5050901@jays.net>

Brian Osborne wrote:
> Yes, people use the -alphabet parameter. If you set it to something then
> Bioperl will not try to determine whether the sequence is protein, rna, or
> dna and this is particularly useful when the sequence contains characters
> that Bioperl would object to (sequences with distasteful characters can be
> created by various applications, for example, or you might introduce some
> weird character for some reason). Setting the -alphabet would also speed up
> Bioperl a bit, for the same reason.

Huh. That's what I assumed when I stumbled into the -alphabet parameter. So I thought this would read the protein sequences out of my genbank file and write a fasta file for me:

my $seq_in  = Bio::SeqIO->new(
   -file     => "<$file",  
   -format   => "genbank",
   -alphabet => "protein"  # No effect?
);
my $seq_out = Bio::SeqIO->new(
   -file     => ">$outfile",
   -format   => "fasta",
   -alphabet => "protein"  # No effect?
);
while (my $inseq = $seq_in->next_seq) {
   $inseq->molecule("protein");    # No effect?
   $seq_out->write_seq($inseq);
}

It didn't. Would it be a Good Thing if it did what I was expecting? (Like I said I rolled my own, but I'm always looking for ways to enhance BioPerl that other people might find useful... Someday I will contribute something useful, by golly. -grin-)

(Background: I'm doing protein BLASTs from genbank files. To make formatdb happy I have to have fasta files full of the protein sequences.)

j


From bosborne11 at verizon.net  Sat Oct 14 14:40:21 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Sat, 14 Oct 2006 14:40:21 -0400
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <45310151.5050901@jays.net>
Message-ID: <C156A755.AC52%bosborne11@verizon.net>

Jay,

What you expected was that setting the -alphabet to "protein" would make
Bioperl translate the input nucleotide sequence to output protein. In
Bioperl this is accomplished by using the translate() method, no surprise
there. If you take a look at the documentation on translate() in the online
Bioperl Tutorial you'll see that this is a fairly sophisticated method, you
can do all sorts of different things with it. So using -alphabet for this
purpose won't really work, there are too many different ways to translate.

Brian O.


On 10/14/06 11:25 AM, "Jay Hannah" <jay at jays.net> wrote:

> Would it be a Good Thing if it did what I was expecting?


From cjfields at uiuc.edu  Sat Oct 14 20:44:04 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 14 Oct 2006 19:44:04 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <45310151.5050901@jays.net>
Message-ID: <000601c6eff3$084663c0$15327e82@pyrimidine>

...
> Huh. That's what I assumed when I stumbled into the -alphabet parameter.
> So I thought this would read the protein sequences out of my genbank file
> and write a fasta file for me:

You have to think about it this way: the GenBank record you are using is for
the nucleotide sequence only, and all other information in that record
describes the sequence.  Similarly, if you used a 'GenPept' sequence, the
focus would be the protein sequence.  Both normally contain annotations
which describe the sequence globally, such as references, organism info,
etc.  Both also may contain features (or SeqFeatures), which describe a
feature bound to a particular location on the sequence.  However, features
are not an absolute requirement for a sequence; they're sort of 'window
dressing', albeit almost always essential for describing the main sequence.

I would do exactly as Brian suggests.  See the Feature/Annotation HOWTO for
ideas on how to screen out the particular features you want and either grab
the 'translation' tag data or get the sequence object from the feature and
translate it directly.  You should get the same result either way though
getting the tag may be faster.

...

> It didn't. Would it be a Good Thing if it did what I was expecting? (Like
> I said I rolled my own, but I'm always looking for ways to enhance BioPerl
> that other people might find useful... Someday I will contribute something
> useful, by golly. -grin-)
> 
> (Background: I'm doing protein BLASTs from genbank files. To make formatdb
> happy I have to have fasta files full of the protein sequences.)
> 
> j

You could, theoretically, write up a method to retrieve only those features
which correspond to coding regions (CDS).  You may want to optionally screen
out pseudogenes but that's up to you.

Chris


From avilella at gmail.com  Sun Oct 15 07:08:23 2006
From: avilella at gmail.com (Albert Vilella)
Date: Sun, 15 Oct 2006 12:08:23 +0100
Subject: [Bioperl-l] no_residues test in SimpleAlign.t
Message-ID: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com>

Hi all,

Can somebody check the SimpleAlign.t test?

perl t/SimpleAlign.t

I get a few errors; I am looking at one that deals with no_residues. I
don't understand if this is supposed to work:

sub no_residues {
    my $self = shift;
    my $count = 0;

    foreach my $seq ($self->each_seq) {
        my $str = $seq->seq();

        $count += ($str =~ s/[^A-Za-z]//g);
        # is this the same as:
        # $str =~ s/[^A-Za-z]//g;
        # $count += length($str);
    }

    return $count;
}

Cheers,

    Albert.
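
For what it's worth, the two forms in the question are not equivalent in
plain Perl, and a small stand-alone sketch shows why.  In scalar context
s///g returns the number of substitutions made, so with the negated class
[^A-Za-z] the first form counts the characters that were removed (gaps and
other non-letters), while the commented-out form counts the letters left
behind.  The helper names below are hypothetical and nothing here is taken
from Bio::SimpleAlign itself.

```perl
use strict;
use warnings;

# What the quoted line adds to $count: the number of substitutions,
# i.e. the number of NON-letter characters stripped from the string.
sub count_removed {
    my ($str) = @_;
    my $n = ($str =~ s/[^A-Za-z]//g);
    return $n || 0;    # s///g returns an empty string when nothing matched
}

# What the commented-out alternative adds: the letters that remain.
sub count_residues {
    my ($str) = @_;
    $str =~ s/[^A-Za-z]//g;
    return length $str;
}

# tr/// with an empty replacement counts matches without editing the string.
sub count_residues_tr {
    my ($str) = @_;
    return ($str =~ tr/A-Za-z//);
}

my $aligned = 'AC--GT.A';    # 5 letters, 3 non-letters
printf "removed=%d residues=%d tr=%d\n",
    count_removed($aligned),
    count_residues($aligned),
    count_residues_tr($aligned);
```

If the intent is to count residues, either of the last two helpers does
that; the tr/// version has the advantage of not needing a copy of the
string to modify.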

From cjfields at uiuc.edu  Sun Oct 15 13:53:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 15 Oct 2006 12:53:50 -0500
Subject: [Bioperl-l] no_residues test in SimpleAlign.t
In-Reply-To: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com>
References: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com>
Message-ID: <FE798536-21DA-4377-96E2-0BF98C235970@uiuc.edu>

Albert,

I get all 75 tests passing.  SimpleAlign.t was recently switched over  
to Test::More, so you should be seeing more explicit test  
descriptions.  It looks like test 27 is no_residues().  Were there  
any more that failed?

I usually run 'perl -I. t/test.t' from the main bioperl directory to  
check individual tests from the local directory.  Otherwise you are  
checking your installed version which may be older (and may not match  
tests and recent bug fixes).  Could that be the problem?

Chris

On Oct 15, 2006, at 6:08 AM, Albert Vilella wrote:

> Hi all,
>
> Can somebody check the SimpleAlign.t test?
>
> perl t/SimpleAlign.t
>
> I get a few errors; I am looking at one that deals with no_residues. I
> don't understand if this is supposed to work:
>
> sub no_residues {
>     my $self = shift;
>     my $count = 0;
>
>     foreach my $seq ($self->each_seq) {
>         my $str = $seq->seq();
>
>         $count += ($str =~ s/[^A-Za-z]//g);
>         # is this the same as:
>         # $str =~ s/[^A-Za-z]//g;
>         # $count += length($str);
>     }
>
>     return $count;
> }
>
> Cheers,
>
>     Albert.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From DGroskreutz at twt.com  Mon Oct 16 02:00:39 2006
From: DGroskreutz at twt.com (DGroskreutz at twt.com)
Date: Mon, 16 Oct 2006 01:00:39 -0500
Subject: [Bioperl-l] CN=Deb Groskreutz/OU=MSN/O=TWT is out of the office.
Message-ID: <OF66FF39D7.C58855EB-ON86257209.002104F9-86257209.002104F9@twt.com>


I will be out of the office starting  10/13/2006 and will not return until
10/30/2006.

I will be out of the office until October 30, 2006.
I will reply to your message at that time.

Thanks,
Deb




From bix at sendu.me.uk  Mon Oct 16 04:08:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 09:08:34 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>
References: <4521E74E.1040404@infotech.monash.edu.au>	<452F54A1.7010908@sendu.me.uk>
	<5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>
Message-ID: <45333E02.9070808@sendu.me.uk>

Hilmar Lapp wrote:
> What does the POD (and the code) say about instantiating it?

=head1 SYNOPSIS

   # do not use this object directly, it provides the following methods
   # for its subclasses

...


=head1 DESCRIPTION

This is a basic module from which to build executable wrapper modules.
It has some basic methods to help when implementing new modules.


There is no new() method.

From bix at sendu.me.uk  Mon Oct 16 09:23:41 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 14:23:41 +0100
Subject: [Bioperl-l] Bio::WebAgent sleep warning
Message-ID: <453387DD.3040105@sendu.me.uk>

Hi,

Does anyone think it's appropriate for Bio::WebAgent to issue warnings 
every time it sleeps? I'd consider the sleeping part of its normal, 
expected and desired behaviour so I don't need to be warned about it. 
Perhaps change the $self->warn to a $self->debug?


From cjfields at uiuc.edu  Mon Oct 16 10:12:10 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 09:12:10 -0500
Subject: [Bioperl-l] Bio::WebAgent sleep warning
In-Reply-To: <453387DD.3040105@sendu.me.uk>
Message-ID: <000c01c6f12d$121b5000$15327e82@pyrimidine>

> Hi,
> 
> Does anyone think it's appropriate for Bio::WebAgent to issue warnings
> every time it sleeps? I'd consider the sleeping part of its normal,
> expected and desired behaviour so I don't need to be warned about it.
> Perhaps change the $self->warn to a $self->debug?

That sounds fine.  Using debugging output for sleep would be similar
behavior to Bio::DB::NCBIHelper and Bio::DB::GenericWebDBI.  You may want to
pass it by Heikki (I think that's his module).  

The only reason I would want to see sleep output, personally, is to make
sure it is working properly.

Almost looks like that class has the same intent that GenericWebDBI has
(even down to using LWP::UserAgent as a superclass).  I may look into it to
see if I can use this as a superclass for GenericWebDBI.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Mon Oct 16 10:26:21 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Mon, 16 Oct 2006 15:26:21 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
Message-ID: <4533968D.6040009@sheffield.ac.uk>

Did anyone reconfigure the bioperl web server (whichever server hosts
http://bioperl.org/DIST) by adding the following line to the httpd.conf
file:

RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*) http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1

This will be required as a workaround to a bug in ActivePerl 5.8.8.819
which will result in a failed install of Bioperl via PPM.

Cheers
Nath

From n.haigh at sheffield.ac.uk  Mon Oct 16 11:30:16 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Mon, 16 Oct 2006 16:30:16 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A257.2000207@campus.iztacala.unam.mx>
References: <4533968D.6040009@sheffield.ac.uk>
	<4533A257.2000207@campus.iztacala.unam.mx>
Message-ID: <4533A588.9020505@sheffield.ac.uk>

Mauricio Herrera Cuadra wrote:
> Done. Could you please check if it works as it should?
>
> Cheers,
> Mauricio.
Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
someone to pop it in http://bioperl.org/DIST

Volunteers?

BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
the PPD? I seem to remember that there was talk about having to maintain
a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
this front?

Nath

From arareko at campus.iztacala.unam.mx  Mon Oct 16 11:16:39 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 16 Oct 2006 10:16:39 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533968D.6040009@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>
Message-ID: <4533A257.2000207@campus.iztacala.unam.mx>

Done. Could you please check if it works as it should?

Cheers,
Mauricio.

Nathan Haigh wrote:
> Did anyone reconfigure the bioperl web server (which ever server hosts
> http://bioperl.org/DIST) by adding the following lines to the httpd.conf
> file:
> 
> RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*)
> http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1
> 
> This will be required as a workaround to a bug in ActivePerl 5.8.8.819
> which will result in a failed install of Bioperl via PPM.
> 
> Cheers
> Nath

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Gen?tica
Unidad de Morfofisiolog?a y Funci?n
Facultad de Estudios Superiores Iztacala, UNAM


From arareko at campus.iztacala.unam.mx  Mon Oct 16 11:33:33 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 16 Oct 2006 10:33:33 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A588.9020505@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>
	<4533A257.2000207@campus.iztacala.unam.mx>
	<4533A588.9020505@sheffield.ac.uk>
Message-ID: <4533A64D.6040203@campus.iztacala.unam.mx>

Nathan Haigh wrote:
> Mauricio Herrera Cuadra wrote:
>> Done. Could you please check if it works as it should?
>>
>> Cheers,
>> Mauricio.
> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
> someone to pop it in http://bioperl/DIST
> 
> Volunteers?

You can send it to me.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From akarger at CGR.Harvard.edu  Mon Oct 16 11:54:33 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 16 Oct 2006 11:54:33 -0400
Subject: [Bioperl-l] Bio::Location::Split
Message-ID: <B9182BFF5B004245BABC12956EA6322E016B1ED7@huls5.nucleus.harvard.edu>

I recently came across bug 2101, where Bio::Location::Split::to_FTstring
gives the incorrect order for multi-sublocation locations on the minus
strand. That is, I found it by getting incorrect results, and then found
it in Bugzilla and in the September archives.

I'm converting CDS files from one format to another. E.g., I read an
EMBL file with a chromosome and CDS features, and want to output the
location in a FASTA header. If I do something like:

# $in is a Bio::SeqIO object reading the EMBL file
while (my $seq = $in->next_seq) {
    foreach my $feat ($seq->get_SeqFeatures) {
        print $feat->location->to_FTstring(), "\n";
    }
}

I get the wrong results for multi-exon CDSs on the -1 strand, as
described in the bug report.

Is there a relatively easy way around this? I assume I can't get at the
original string of the location, which in this case is all I need. Can I
just flip the order of the exons in certain cases? Chris F, can you tell
me the preliminary solution you mentioned?

I must say I'm sort of surprised this wasn't found before. It seems like
a not-that-rare occurrence. Oh well.

Thanks,

- Amir Karger
Research Computing
Life Sciences Division
Harvard University


From bix at sendu.me.uk  Mon Oct 16 12:14:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 17:14:39 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A588.9020505@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>	<4533A257.2000207@campus.iztacala.unam.mx>
	<4533A588.9020505@sheffield.ac.uk>
Message-ID: <4533AFEF.8080103@sendu.me.uk>

Nathan Haigh wrote:
> Mauricio Herrera Cuadra wrote:
>> Done. Could you please check if it works as it should?
>>
>> Cheers,
>> Mauricio.
> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
> someone to pop it in http://bioperl/DIST
> 
> Volunteers?

I'm sure Mauricio would be happy to do it, but so am I. You may want to 
hold off a little while until I release rc2, which may be a few hours away.


> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
> the PPD? I seem to remember that there was talk about having to maintain
> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
> this front?

It depends on what is in the PPD and what kind of auto-dependency 
features the ActiveState installer has. Given Perl 5.8 and your current 
PPD, does Bioperl install with the same or fewer number of skips if you 
also install Bundle::BioPerl first? That is, does Bundle::BioPerl even 
do anything useful anymore? If not, obviously don't bother making it a 
pre-req. If it does, my opinion is that you make it a pre-req. If people 
really don't want to install the optional stuff they can download the 
.zip file and install manually without even a make.

From Kevin.M.Brown at asu.edu  Mon Oct 16 12:14:51 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 16 Oct 2006 09:14:51 -0700
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
Message-ID: <1A4207F8295607498283FE9E93B775B402196FAA@EX02.asurite.ad.asu.edu>

> > Yes, people use the -alphabet parameter. If you set it to 
> something then
> > Bioperl will not try to determine whether the sequence is 
> protein, rna, or
> > dna and this is particularly useful when the sequence 
> contains characters
> > that Bioperl would object to (sequences with distasteful 
> characters can be
> > created by various applications, for example, or you might 
> introduce some
> > weird character for some reason). Setting the -alphabet 
> would also speed up
> > Bioperl a bit, for the same reason.
> 
> Huh. That's what I assumed when I stumbled into the -alphabet 
> parameter. So I thought this would read the protein sequences 
> out of my genbank file and write a fasta file for me:
> 
> my $seq_in  = Bio::SeqIO->new(
>    -file     => "<$file",  
>    -format   => "genbank",
>    -alphabet => "protein"  # No effect?
> );
> my $seq_out = Bio::SeqIO->new(
>    -file     => ">$outfile",
>    -format   => "fasta",
>    -alphabet => "protein"  # No effect?
> );
> while (my $inseq = $seq_in->next_seq) {
>    $inseq->molecule("protein");    # No effect?
>    $seq_out->write_seq($inseq);
> }
> 
> It didn't. Would it be a Good Thing if it did what I was 
> expecting? (Like I said I rolled my own, but I'm always 
> looking for ways to enhance BioPerl that other people might 
> find useful... Someday I will contribute something useful, by 
> golly. -grin-)
> 
> (Background: I'm doing protein BLASTs from genbank files. To 
> make formatdb happy I have to have fasta files full of the 
> protein sequences.)

This might work for your needs (CDS to protein FASTA).

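use Bio::SeqIO;

my $seq_in  = Bio::SeqIO->new(
   -file     => "<$file",
   -format   => "genbank",
);

open my $seq_out, ">", $outfile or die "Cannot write $outfile: $!";

while (my $inseq = $seq_in->next_seq) {
   print $seq_out ">". $inseq->display_id(). "\n";
   # translate() returns a sequence object; seq() gives the residue string
   print $seq_out $inseq->translate()->seq() ."\n";
}
close $seq_out;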
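
If the GenBank records are nucleotide entries with annotated CDS features,
a per-feature loop may be closer to what formatdb needs. This is only a
sketch and has not been run against the original poster's files. The helper
names fasta_record and cds_to_protein_fasta are made up for illustration,
and using protein_id for the FASTA header is an assumption.

```perl
use strict;
use warnings;

# Format one FASTA record, wrapping the sequence at $width columns.
# (Hypothetical helper; not part of BioPerl.)
sub fasta_record {
    my ($id, $seq, $width) = @_;
    $width ||= 60;
    $seq =~ s/(.{1,$width})/$1\n/g;
    return ">$id\n$seq";
}

# Write one protein FASTA record per CDS feature found in a GenBank file.
sub cds_to_protein_fasta {
    my ($file, $outfile) = @_;
    require Bio::SeqIO;    # loaded here so fasta_record() works without BioPerl

    my $seq_in = Bio::SeqIO->new(-file => "<$file", -format => 'genbank');
    open my $out, '>', $outfile or die "Cannot write $outfile: $!";

    my $count = 0;
    while (my $seq = $seq_in->next_seq) {
        for my $feat (grep { $_->primary_tag eq 'CDS' } $seq->get_SeqFeatures) {
            # Prefer the annotated translation; otherwise translate the
            # spliced feature sequence ourselves.
            my $protein = $feat->has_tag('translation')
                ? ($feat->get_tag_values('translation'))[0]
                : $feat->spliced_seq->translate->seq;
            $protein =~ s/\*$//;    # drop a trailing stop symbol, if any

            # Assumption: protein_id makes a good header; fall back to a counter.
            my ($id) = $feat->has_tag('protein_id')
                ? $feat->get_tag_values('protein_id')
                : ($seq->display_id . '_cds' . ++$count);

            print {$out} fasta_record($id, $protein);
        }
    }
    close $out or die "Cannot close $outfile: $!";
    return;
}
```

Calling cds_to_protein_fasta($file, $outfile) should then leave a file that
formatdb can index as protein.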


From bix at sendu.me.uk  Mon Oct 16 11:44:19 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 16:44:19 +0100
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
Message-ID: <4533A8D3.90709@sendu.me.uk>

I think Chris recently deprecated this, but should it be? For me, its 
POD description justifies its existence, and perhaps more importantly, 
Bio::Index::Blast relies on it.

I took a quick peek at the latter and it didn't seem trivial to move it 
over to Bio::SearchIO instead.

Should it be undeprecated?

From n.haigh at sheffield.ac.uk  Mon Oct 16 12:39:02 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Mon, 16 Oct 2006 17:39:02 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533AFEF.8080103@sendu.me.uk>
References: <4533968D.6040009@sheffield.ac.uk>	<4533A257.2000207@campus.iztacala.unam.mx>
	<4533A588.9020505@sheffield.ac.uk> <4533AFEF.8080103@sendu.me.uk>
Message-ID: <4533B5A6.1070701@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> Mauricio Herrera Cuadra wrote:
>>> Done. Could you please check if it works as it should?
>>>
>>> Cheers,
>>> Mauricio.
>> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
>> someone to pop it in http://bioperl/DIST
>>
>> Volunteers?
>
> I'm sure Mauricio would be happy to do it, but so am I. You may want
> to hold off a little while until I release rc2, which may be a few
> hours away.

Just e-mailed Mauricio links to the files off list. It's not a big job
for me to remake the bioperl PPD, so Mauricio, it's up to you if you want
to wait 18hrs for me to make the PPDs for 1.5.2-rc2.
>
>
>> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
>> the PPD? I seem to remember that there was talk about having to maintain
>> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
>> this front?
>
> It depends on what is in the PPD and what kind of auto-dependency
> features the ActiveState installer has. Given Perl 5.8 and your
> current PPD, does Bioperl install with the same or fewer number of
> skips if you also install Bundle::BioPerl first? That is, does
> Bundle::BioPerl even do anything useful anymore? If not, obviously
> don't bother making it a pre-req. If it does, my opinion is that you
> make it a pre-req. If people really don't want to install the optional
> stuff they can download the .zip file and install manually without
> even a make.

As far as the PPDs are concerned, no tests are run during installation.
PPM more or less just copies files into the correct place for Perl to
find, so both approaches result in the same thing. However, I've not
tried making a CPAN distribution file for either Bioperl or
Bundle::BioPerl - I wouldn't know where to start!

Makefile.PL now only documents the prereqs in one place (%packages), and
this is used to add the prereqs to the bioperl PPD when issuing "nmake
ppd". This way, each release of BioPerl should be up-to-date with its
prereqs as long as developers add their modules' prereqs to %packages. If
we have Bundle::BioPerl, most of those prereqs need to be moved from the
Bioperl PPD to the Bundle::BioPerl PPD - a bit of a pain because there are
no guidelines as to what should/should not go in Bundle::BioPerl.
Therefore, as far as the PPDs are concerned, it is far easier to do away
with Bundle::BioPerl.
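
To illustrate the single-source idea, here is a rough sketch of deriving a
prerequisite list from one hash. The layout of %packages shown here (a
"min_version/description" string per module) and the helper name prereq_pm
are assumptions for illustration; the real Makefile.PL may differ.

```perl
use strict;
use warnings;

# Hypothetical excerpt: each entry maps a module name to
# "min_version/description". The real %packages may be laid out differently.
my %packages = (
    'IO::String'  => '0/handling strings as files',
    'XML::Parser' => '0/parsing XML',
);

# Reduce the descriptive hash to module => minimum version pairs.
sub prereq_pm {
    my (%pkgs) = @_;
    my %prereq;
    for my $module (sort keys %pkgs) {
        my ($version) = split m{/}, $pkgs{$module};
        $prereq{$module} = $version;
    }
    return %prereq;
}

my %prereq = prereq_pm(%packages);
printf "%-12s %s\n", $_, $prereq{$_} for sort keys %prereq;
```

A hash like %prereq is what ExtUtils::MakeMaker expects as PREREQ_PM in
WriteMakefile(), so the PPD dependencies would follow from the one list.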

Nath

From hlapp at gmx.net  Mon Oct 16 13:04:24 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:04:24 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <45333E02.9070808@sendu.me.uk>
References: <4521E74E.1040404@infotech.monash.edu.au>	<452F54A1.7010908@sendu.me.uk>
	<5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>
	<45333E02.9070808@sendu.me.uk>
Message-ID: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>

So it looks like an abstract base class, not an interface that  
defines a contract or API? Should use Root.pm then, would be my vote.
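
For anyone unsure of the distinction, a toy sketch follows. Every package
name in it is hypothetical (My::Root stands in for a Root.pm-style parent,
My::WrapperI for an "I" interface module); it is not the actual BioPerl code.

```perl
use strict;
use warnings;

# Interface: declares the contract only; no state, no constructor.
package My::WrapperI;               # hypothetical name
sub run {
    my ($self) = @_;
    die ref($self) . " must implement run()\n";
}

# Root-style parent: supplies the constructor.
package My::Root;                   # hypothetical stand-in for Root.pm
sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

# Abstract base class: shares working code with its subclasses, so it
# inherits from the Root-style parent rather than from an interface.
package My::WrapperBase;            # the "do not use directly" module
our @ISA = ('My::Root');
sub executable {
    my ($self, $path) = @_;
    $self->{executable} = $path if defined $path;
    return $self->{executable};
}

# Concrete subclass: gets code from the base class and fulfils the contract.
package My::Run::Example;
our @ISA = ('My::WrapperBase', 'My::WrapperI');
sub run {
    my ($self) = @_;
    return 'would run ' . $self->executable;
}

package main;

my $wrapper = My::Run::Example->new(executable => 'blastall');
print $wrapper->run, "\n";
```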

	-hilmar

On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> What does the POD (and the code) say about instantiating it?
>
> =head1 SYNOPSIS
>
>    # do not use this object directly, it provides the following  
> methods
>    # for its subclasses
>
> ...
>
>
> =head1 DESCRIPTION
>
> This is a basic module from which to build executable wrapper modules.
> It has some basic methods to help when implementing new modules.
>
>
> There is no new() method.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Mon Oct 16 13:08:28 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:08:28 -0400
Subject: [Bioperl-l] Bio::WebAgent sleep warning
In-Reply-To: <453387DD.3040105@sendu.me.uk>
References: <453387DD.3040105@sendu.me.uk>
Message-ID: <B6C9C0EE-00DD-4B16-98D9-2BBBE6CD835E@gmx.net>

It depends. What triggers the sleeping? If it's part of every request
that it processes then I'd agree. If it is triggered by a failure
preceding the next try then the failure is probably not expected
(though possible), and hence should be reported by warn().

If it is just part of the polling cycle then there should probably be  
a limit up to which the time waited is considered 'normal' and after  
which it is considered 'excessive' and hence should be reported  
through warn().

My $0.02.

	-hilmar

On Oct 16, 2006, at 9:23 AM, Sendu Bala wrote:

> Hi,
>
> Does anyone think it's appropriate for Bio::WebAgent to issue warnings
> every time it sleeps? I'd consider the sleeping part of its normal,
> expected and desired behaviour so I don't need to be warned about it.
> Perhaps change the $self->warn to a $self->debug?
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Mon Oct 16 13:13:53 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 18:13:53 +0100
Subject: [Bioperl-l] Bio::WebAgent sleep warning
In-Reply-To: <B6C9C0EE-00DD-4B16-98D9-2BBBE6CD835E@gmx.net>
References: <453387DD.3040105@sendu.me.uk>
	<B6C9C0EE-00DD-4B16-98D9-2BBBE6CD835E@gmx.net>
Message-ID: <4533BDD1.8060204@sendu.me.uk>

Hilmar Lapp wrote:
> It depends. What triggers the sleeping? If it's part of every request 
> that it processes then I'd agree. If it is triggered by failure to 
> precede the next try then the failure is probably not expected (though 
> possible), and hence should be reported by warn().
> 
> If it is just part of the polling cycle then there should probably be a 
> limit up to which the time waited is considered 'normal' and after which 
> it is considered 'excessive' and hence should be reported through warn().

=head2 sleep

  Title   : sleep
  Usage   : $self->sleep
  Function: sleep for a number of seconds indicated by the delay policy
  Returns : none
  Args    : none

NOTE: This method keeps track of the last time it was called and only
imposes a sleep if it was called more recently than the delay_policy()
allows.

=cut

It issues a warning every time it actually sleeps. I find it 
inappropriate that a method warns me that it did what I asked it to do.
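
One way to reconcile the two views might be to report routine sleeps only
at debug level and warn only when the wait exceeds some limit. The sketch
below is a toy stand-in, not Bio::WebAgent itself: the class name and the
delay, warn_after, verbose and sleeper parameters are all invented for
illustration.

```perl
use strict;
use warnings;

package My::PoliteAgent;    # hypothetical stand-in, not Bio::WebAgent

sub new {
    my ($class, %args) = @_;
    my $self = {
        delay      => defined $args{delay}      ? $args{delay}      : 3,
        warn_after => defined $args{warn_after} ? $args{warn_after} : 10,
        verbose    => $args{verbose} || 0,
        # Injectable so the policy can be exercised without really waiting.
        sleeper    => $args{sleeper} || sub { CORE::sleep($_[0]) },
        last_call  => undef,
    };
    return bless $self, $class;
}

# Sleep only if called sooner than the delay policy allows.
# Returns the number of seconds slept (0 if none).
sub sleep {
    my ($self, $now) = @_;
    $now = time unless defined $now;

    my $last = $self->{last_call};
    $self->{last_call} = $now;
    return 0 unless defined $last;

    my $wait = $self->{delay} - ($now - $last);
    return 0 if $wait <= 0;

    if ($wait > $self->{warn_after}) {
        # Unusually long wait: worth a real warning.
        warn "sleeping for $wait seconds (limit $self->{warn_after})\n";
    }
    elsif ($self->{verbose}) {
        # Normal, expected behaviour: debug output only.
        print STDERR "debug: sleeping for $wait seconds\n";
    }

    $self->{sleeper}->($wait);
    $self->{last_call} = $now + $wait;
    return $wait;
}

package main;
```

With the defaults, My::PoliteAgent->new->sleep stays silent for ordinary
waits unless verbose is switched on.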

From arareko at campus.iztacala.unam.mx  Mon Oct 16 13:14:06 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 16 Oct 2006 12:14:06 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533B5A6.1070701@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>	<4533A257.2000207@campus.iztacala.unam.mx>	<4533A588.9020505@sheffield.ac.uk>
	<4533AFEF.8080103@sendu.me.uk> <4533B5A6.1070701@sheffield.ac.uk>
Message-ID: <4533BDDE.2040801@campus.iztacala.unam.mx>

Nathan Haigh wrote:
> Sendu Bala wrote:
>> Nathan Haigh wrote:
>>> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
>>> someone to pop it in http://bioperl/DIST
>>>
>>> Volunteers?
>> I'm sure Mauricio would be happy to do it, but so am I. You may want
>> to hold off a little while until I release rc2, which may be a few
>> hours away.
> 
> Just e-mailed Mauricio links to the files off list, It's not a big job
> for me to remake the bioperl PPD, so Mauricio it's up to you if you want
> to wait 18hrs for me to make the PPDs for 1.5.2-rc2.

Too late, I've already placed 1.5.2-rc1 in DIST. hehe :)

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From bix at sendu.me.uk  Mon Oct 16 12:32:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 17:32:11 +0100
Subject: [Bioperl-l] Swissprot problems
Message-ID: <4533B40B.2030908@sendu.me.uk>

t/Biofetch.t and t/DB.t are skipping their swissprot database fetches.
Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for 
maintenance but is now back up. However I'm guessing the databases must 
have changed. I've manually looked for the test case 'YNB3_YEAST' in 
database 'UniProtKB' and it came back with no result, even though I can 
find the test case manually at the expasy website.

Is this an EBI bug or a deliberate change that makes sense to someone?

From m.weimer at dkfz-heidelberg.de  Mon Oct 16 12:43:38 2006
From: m.weimer at dkfz-heidelberg.de (Marc Weimer)
Date: Mon, 16 Oct 2006 18:43:38 +0200
Subject: [Bioperl-l] Bio::DB::SwissProt Problem
Message-ID: <1161017019.5203.6.camel@localhost>

Dear list members,

when running 

######################################################################
#! /usr/bin/perl -w

use strict;
use Bio::DB::SwissProt;

my $db_obj = new Bio::DB::SwissProt(-verbose => 1);

my $seq_obj = $db_obj->get_Seq_by_acc("O02938");
######################################################################

using Bioperl 1.5.2 I get the following message:

##########################################################################################

request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch
Content-Length: 49
Content-Type: application/x-www-form-urlencoded

format=swissprot&db=UniProtKB&style=raw&id=O02938


------------- EXCEPTION: Bio::Root::Exception -------------
MSG: acc O02938 does not exist
STACK: Error::throw
STACK:
Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350
STACK:
Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181
STACK: ./get.test.pl:8
-----------------------------------------------------------

##########################################################################################

But the accession number does exist. Surprisingly, everything worked
fine a few days ago. Any ideas of what might have happened?

Thanks and best regards,

Marc

 
From hlapp at gmx.net  Mon Oct 16 13:15:50 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:15:50 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533A8D3.90709@sendu.me.uk>
References: <4533A8D3.90709@sendu.me.uk>
Message-ID: <C53CD6F5-6E0D-49B7-B280-A51C3A2B991F@gmx.net>

The problem is it is not maintained, and there are outstanding bug
reports.

If you un-deprecate it, then we need a response to people who come
across problems when using it. Either you change the POD to say
exactly who should use it and when (or rather not), and point to the
fact that it is unsupported for all other cases.

Or what would you suggest?

	-hilmar

On Oct 16, 2006, at 11:44 AM, Sendu Bala wrote:

> I think Chris recently deprecated this, but should it be? For me, its
> POD description justifies its existence, and perhaps more importantly,
> Bio::Index::Blast relies on it.
>
> I took a quick peek at the latter and it didn't seem trivial to  
> move it
> over to Bio::SearchIO instead.
>
> Should it be undeprecated?

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Oct 16 13:21:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:21:46 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533A8D3.90709@sendu.me.uk>
Message-ID: <000001c6f147$8efdfd60$15327e82@pyrimidine>

Bio::Tools::BPlite was placed on the deprecation list a while back (~ rel
1.5); the other related Bio::Tools::BP* modules were supposed to be on
that list as well.

If we want to undeprecate (de-deprecate? reprecate?) BPlite we also would
need to do the same for the others.  They must be updated to parse current
BLAST/PSI-BLAST/bl2seq text output, something that Bio::SearchIO::blast is
currently capable of (so the functionality is redundant).  And someone needs
to take them over.

In my opinion it may be more trouble than it's worth as they haven't been
touched in a while.    Seems if we 'revive' BPlite we're not really moving
forward esp. since you have added the PullParser recently and made
substantial improvements to SearchIO.  

Maybe Bio::Index::Blast just needs to be deprecated or rewritten to use
SearchIO?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Monday, October 16, 2006 10:44 AM
> To: bioperl-l
> Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
> 
> I think Chris recently deprecated this, but should it be? For me, its
> POD description justifies its existence, and perhaps more importantly,
> Bio::Index::Blast relies on it.
> 
> I took a quick peek at the latter and it didn't seem trivial to move it
> over to Bio::SearchIO instead.
> 
> Should it be undeprecated?


From bix at sendu.me.uk  Mon Oct 16 13:21:58 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 18:21:58 +0100
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C53CD6F5-6E0D-49B7-B280-A51C3A2B991F@gmx.net>
References: <4533A8D3.90709@sendu.me.uk>
	<C53CD6F5-6E0D-49B7-B280-A51C3A2B991F@gmx.net>
Message-ID: <4533BFB6.5070504@sendu.me.uk>

Hilmar Lapp wrote:
> The problem is it is not maintained, and there are outstanding bug
> reports.
> 
> If you un-deprecate it, then we need a response to people who come 
> across problems with it when using it. Either you change the POD to say 
> exactly who and when one should use it (or rather not) and point to the 
> fact that it is unsupported for all other cases.
> 
> Or what would you suggest?

I'm not sure.

Does Bio::Index::Blast even work correctly? Does it suffer from whatever 
bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should 
that be deprecated as well?

Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO 
and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't 
seem trivial (or even appropriate).

Ultimately I just wanted to solve the warnings in the test suite. 
Thoughts, Chris?

From cjfields at uiuc.edu  Mon Oct 16 13:30:05 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:30:05 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A588.9020505@sheffield.ac.uk>
Message-ID: <000101c6f148$b8538b20$15327e82@pyrimidine>

> Mauricio Herrera Cuadra wrote:
> > Done. Could you please check if it works as it should?
> >
> > Cheers,
> > Mauricio.
> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
> someone to pop it in http://bioperl/DIST
> 
> Volunteers?
> 
> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
> the PPD? I seem to remember that there was talk about having to maintain
> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
> this front?
> 
> Nath

Nathan,

I think Chris Dagdigian still maintains Bundle::BioPerl on CPAN.  That
version should be the common basis for prereqs for any Bioperl core
installation.

It's relatively easy to add modules to or remove them from Bundle::BioPerl.
Contact Chris D. and let him know if anything needs to be changed.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 16 13:33:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:33:50 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
Message-ID: <000201c6f149$3ed63490$15327e82@pyrimidine>

> So it looks like an abstract base class, not an interface that
> defines a contract or API? Should use Root.pm then, would be my vote.
> 
> 	-hilmar

Makes sense to me.  Maybe another audit is needed to catch similar
instances, or has this been done already?

Chris

> On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote:
> 
> > Hilmar Lapp wrote:
> >> What does the POD (and the code) say about instantiating it?
> >
> > =head1 SYNOPSIS
> >
> >    # do not use this object directly, it provides the following
> > methods
> >    # for its subclasses
> >
> > ...
> >
> >
> > =head1 DESCRIPTION
> >
> > This is a basic module from which to build executable wrapper modules.
> > It has some basic methods to help when implementing new modules.
> >
> >
> > There is no new() method.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct 16 13:57:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:57:35 -0500
Subject: [Bioperl-l] Bio::Location::Split
In-Reply-To: <B9182BFF5B004245BABC12956EA6322E016B1ED7@huls5.nucleus.harvard.edu>
Message-ID: <000301c6f14c$8fb0e060$15327e82@pyrimidine>

> I recently came across bug 2101, where Bio::Location::Split::to_FTstring
> gives the incorrect order for multi-sublocation locations on the minus
> strand. That is, I found it by getting incorrect results, and then found
> it in Bugzilla and in the September archives.
>
> I'm converting CDS files from one format to another. E.g., I read an
> EMBL file with a chromosome and CDS features, and want to output the
> location in a FASTA header. If I do something like:
> 
> while (my $seq = $in->next_seq) {
>     foreach my $feat ($seq->get_SeqFeatures) {
>         print $feat->location->to_FTstring(), "\n";
>     }
> }
> 
> I get the wrong results for multi-exon CDSs on the -1 strand, as
> described in the bug report.
> 
> Is there a relatively easy way around this? I assume I can't get at the
> original string of the location, which in this case is all I need. Can I
> just flip the order of the exons in certain cases? Chris F, can you tell
> me the preliminary solution you mentioned?
> 
> I must say I'm sort of surprised this wasn't found before. It seems like
> a not-that-rare occurrence. Oh well.
> 
> Thanks,
> 
> - Amir Karger
> Research Computing
> Life Sciences Division
> Harvard University

Could you let me know specifically which EMBL file contains the odd
locations?  The bug report uses theoretical locations, not actual ones, so
it would be nice to have a real-world example to test against.  

As for the lack of catching this, the particular types of locations that
cause the issue are quite rare.  Note that there are two bugs for that bug
report.  The first (and more serious) is still unresolved.  The second
(where remote locations are treated differently in Location::Split, which
caused more problems than it was worth) had a fix committed about a month
ago.  

Any fixes I have made for the first bug invariably break several other
methods, which use the current Location::Split object logic for retrieving
sequences, building feature strings, etc.  Since a new RC is imminent and
the bug only affects a small number of locations, I have held off until
after a final release is made (the last thing I want to do is fix something
that breaks ~6-8 other methods), but I'll try looking at it again this week.


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct 16 14:29:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 13:29:02 -0500
Subject: [Bioperl-l] Swissprot problems
In-Reply-To: <4533B40B.2030908@sendu.me.uk>
Message-ID: <000401c6f150$f57dfc30$15327e82@pyrimidine>


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Monday, October 16, 2006 11:32 AM
> To: bioperl-l
> Subject: [Bioperl-l] Swissprot problems
> 
> t/Biofetch.t and t/DB.t are skipping their swissprot database fetches.
> Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for
> maintenance but is now back up. However I'm guessing the databases must
> have changed. I've manually looked for the test case 'YNB3_YEAST' in
> database 'UniProtKB' and it came back with no result, even though I can
> find the test case manually at the expasy website.
> 
> Is this an EBI bug or deliberate change that makes sense to someone?

I can confirm that.  It's not our end, though.  Entering the same data on
the DBFetch web page also gets no data.  I have emailed EBI about the
problem and will let you know if I hear anything; I think it's related to
the maintenance issue.

Notably, nothing on the web page indicates any database name changes yet.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 16 14:29:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 13:29:52 -0500
Subject: [Bioperl-l] Bio::DB::SwissProt Problem
In-Reply-To: <1161017019.5203.6.camel@localhost>
Message-ID: <000501c6f151$12918710$15327e82@pyrimidine>

We think there is a problem on the SwissProt (DBFetch) server.  I have
contacted them about the problem and will post when I hear back.
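
In the meantime, scripts can at least trap the exception instead of dying
on the first failed fetch. A minimal sketch: with_retries is a hypothetical
helper, and the attempt count and delay in the usage comment are arbitrary.
Only Bio::DB::SwissProt->new and get_Seq_by_acc come from the script quoted
below.

```perl
use strict;
use warnings;

# Run $code up to $attempts times, pausing $delay seconds between tries.
# Returns ($result, undef) on success or (undef, $last_error) on failure.
sub with_retries {
    my ($code, $attempts, $delay) = @_;
    my $error = "no attempts made\n";
    for my $try (1 .. $attempts) {
        my $result = eval { $code->() };
        return ($result, undef) if defined $result;
        $error = $@ || "no result returned\n";
        sleep $delay if $delay && $try < $attempts;
    }
    return (undef, $error);
}

# Usage with BioPerl (needs network access, so left commented out):
#
#   use Bio::DB::SwissProt;
#   my $db = Bio::DB::SwissProt->new(-verbose => 1);
#   my ($seq, $err) = with_retries(sub { $db->get_Seq_by_acc('O02938') }, 3, 10);
#   warn "fetch failed: $err" unless defined $seq;
```

This does not fix anything on the server side; it only keeps a long batch
job from aborting while the service is down.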

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Marc Weimer
> Sent: Monday, October 16, 2006 11:44 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Bio::DB::SwissProt Problem
> 
> Dear list members,
> 
> when running
> 
> ######################################################################
> #! /usr/bin/perl -w
> 
> use strict;
> use Bio::DB::SwissProt;
> 
> my $db_obj = new Bio::DB::SwissProt(-verbose => 1);
> 
> my $seq_obj = $db_obj->get_Seq_by_acc("O02938");
> ######################################################################
> 
> using Bioperl 1.5.2 I get the following message:
> 
> ##########################################################################
> ################
> 
> request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch
> Content-Length: 49
> Content-Type: application/x-www-form-urlencoded
> 
> format=swissprot&db=UniProtKB&style=raw&id=O02938
> 
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: acc O02938 does not exist
> STACK: Error::throw
> STACK:
> Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350
> STACK:
> Bio::DB::WebDBSeqI::get_Seq_by_acc
> /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181
> STACK: ./get.test.pl:8
> -----------------------------------------------------------
> 
> ##########################################################################
> ################
> 
> But the accession number does exist. Surprisingly, everything worked
> fine a few days ago. Any ideas of what might have happened?
> 
> Thanks and best regards,
> 
> Marc


From cjfields at uiuc.edu  Mon Oct 16 14:39:28 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 13:39:28 -0500
Subject: [Bioperl-l] SwissProt Down
Message-ID: <000601c6f152$6997dbd0$15327e82@pyrimidine>

Looks like the swissprot problem stems from maintenance at EBI.  From the
EBI page http://www.ebi.ac.uk/Information/ (not on the DBFetch page, BTW):

Please Note: Monday October 16th 12:00-15:00 -  Due to general maintenance,
some services from the EBI may be temporarily unavailable. We apologise for
any inconvenience.

At least we know that Test::More skips are working!
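
For reference, the skip pattern looks roughly like the sketch below. It is
an illustration rather than the actual contents of t/Biofetch.t or t/DB.t:
fetch_record and the FAKE_SERVER_DOWN variable are hypothetical stand-ins
for a real remote fetch, and only BIOPERLDEBUG is the real switch.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical stand-in for a remote fetch; a real test would query the
# remote database here.
sub fetch_record {
    my ($id) = @_;
    die "server unavailable\n" if $ENV{FAKE_SERVER_DOWN};
    return { id => $id };
}

ok(1, 'local tests always run');

SKIP: {
    # Remote tests are opt-in, so a plain install never needs the network.
    skip 'set BIOPERLDEBUG=1 to run remote database tests', 2
        unless $ENV{BIOPERLDEBUG};

    my $record = eval { fetch_record('YNB3_YEAST') };
    if (my $err = $@) {
        chomp $err;
        # Server trouble is reported as a skip, not a failure.
        skip "remote server problem: $err", 2;
    }

    ok(defined $record, 'got a record back');
    is($record->{id}, 'YNB3_YEAST', 'record has the requested id');
}
```

Only the two remote tests are skipped; the rest of the file still runs and
the plan stays at three tests either way.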

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From bix at sendu.me.uk  Mon Oct 16 14:51:31 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 19:51:31 +0100
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C15946CA.ACA9%bosborne11@verizon.net>
References: <C15946CA.ACA9%bosborne11@verizon.net>
Message-ID: <4533D4B3.2000809@sendu.me.uk>

Brian Osborne wrote:
> Sendu,
> 
> I just made a commit that makes Bio::Index::Blast use SearchIO instead of
> BPlite.

I was concerned about the whole id_parser thing. Did you determine that 
your change still allows for id_parser to be used and have the intended 
effect, or that id_parser is in some way meaningless and should be 
removed as a method?

From cjfields at uiuc.edu  Mon Oct 16 15:03:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 14:03:33 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533BFB6.5070504@sendu.me.uk>
Message-ID: <000301c6f155$c7029ff0$15327e82@pyrimidine>

> Hilmar Lapp wrote:
> > The problem is it is not maintained, and there are outstanding bug
> > reports.
> >
> > If you un-deprecate it, then we need a response to people who come
> > across problems with it when using it. Either you change the POD to say
> > exactly who and when one should use it (or rather not) and point to the
> > fact that it is unsupported for all other cases.
> >
> > Or what would you suggest?
> 
> I'm not sure.
> 
> Does Bio::Index::Blast even work correctly? Does it suffer from whatever
> bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should
> that be deprecated as well?
> 
> Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO
> and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't
> seem trivial (or even appropriate).
> 
> Ultimately I just wanted to solve the warnings in the test suite.
> Thoughts, Chris?

My opinion is we either have to completely support BPlite (and the others)
or drop it altogether.  I don't think we can state "use BPlite only with
Bio::Index::Blast, use SearchIO everywhere else."  That's too inconsistent.


It seems simpler to deprecate the various Bio::Tools::BP* classes and either
fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working
on) or deprecate Bio::Index::Blast as well.  

The warnings in the test suite belong to BlastIndex.t, correct?  I updated
using Brian's Bio::Index::Blast fix and it passes now w/o warnings.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From akarger at CGR.Harvard.edu  Mon Oct 16 15:00:28 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 16 Oct 2006 15:00:28 -0400
Subject: [Bioperl-l] Bio::Location::Split
Message-ID: <B9182BFF5B004245BABC12956EA6322E016B1F89@huls5.nucleus.harvard.edu>

 
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu] 
> >
> > I'm converting CDS files from one format to another. E.g., I read an
> > EMBL file with a chromosome and CDS features, and want to output the
> > location in a FASTA header.
> > 
> > I get the wrong results for multi-exon CDSs on the -1 strand, as
> > described in the bug report.
> > 
>
> Could you let me know specifically which EMBL file contains the odd
> locations?  The bug report uses theoretical locations, not 
> actual ones, so
> it would be nice to have a real-world example to test against. 

I downloaded candida glabrata chromosome B from EBI:
http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948

testportal>perl location.pl new_glabrata_B.embl > bio
testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/' new_glabrata_B.embl > nonbio
testportal>wc bio nonbio
 217  217 4537 bio
 217  217 4549 nonbio
 434  434 9086 total
testportal>diff bio nonbio
4c4
< complement(join(10632..11157,10347..10372))
---
> join(complement(10632..11157),complement(10347..10372))

Just one example here, but see below.
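
As a stopgap for this particular shape of location, the string can be
rewritten after the fact. The sketch below assumes every sub-location is a
plain start..end range on the minus strand of a linear sequence, with no
remote accessions or fuzzy boundaries; the function names are made up. It
relies on the feature table convention that complement(join(A..B,C..D)) and
join(complement(C..D),complement(A..B)) describe the same feature.

```perl
use strict;
use warnings;

# Parse "a..b,c..d" into [start, end] pairs; returns an empty list if any
# element is not a plain range.
sub parse_ranges {
    my ($text) = @_;
    my @ranges;
    for my $part (split /,/, $text) {
        return unless $part =~ /^(\d+)\.\.(\d+)$/;
        push @ranges, [$1, $2];
    }
    return @ranges;
}

# Put the ranges inside complement(join(...)) into ascending order.
# Anything that does not match the simple pattern is returned unchanged.
sub normalize_minus_join {
    my ($ft) = @_;
    my ($inner) = $ft =~ /^complement\(join\(([^()]+)\)\)$/ or return $ft;
    my @ranges = parse_ranges($inner) or return $ft;
    my @sorted = sort { $a->[0] <=> $b->[0] } @ranges;
    return 'complement(join('
        . join(',', map { "$_->[0]..$_->[1]" } @sorted) . '))';
}

# Expand to the per-exon form, listed in transcription order (highest first).
sub expand_minus_join {
    my ($ft) = @_;
    my ($inner) = normalize_minus_join($ft) =~ /^complement\(join\(([^()]+)\)\)$/
        or return $ft;
    my @ranges = parse_ranges($inner) or return $ft;
    return 'join('
        . join(',', map { "complement($_->[0]..$_->[1])" } reverse @ranges) . ')';
}

print normalize_minus_join('complement(join(10632..11157,10347..10372))'), "\n";
print expand_minus_join('complement(join(10632..11157,10347..10372))'), "\n";
```

Sorting by start coordinate is only safe under the assumptions above;
features that cross the origin of a circular sequence would need more care.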
 
> As for the lack of catching this, the particular types of 
> locations that
> cause the issue are quite rare.  

Really? I guess our definitions of rare depend on which sequences we're
working with. I'm doing fungal genomes, and here's a grep for a few
species' entire genomes:

testportal>foreach i ( *.embl )
foreach? echo $i
foreach? grep CDS $i | grep join | grep -c complement
foreach? end
glabrata_orf.embl
29
hansenii_orf.embl
151
lactis_orf.embl
70
lipolytica_orf.embl
337
pombe_orf.embl
1137

You might like to use pombe as a test case, as it has lots of these
complement joins, including ones with multiple introns.

Anyway, I'd question the "rare" designation. It seems to me like any
species that has introns will have situations like this in their CDSs.
Not to mention any other sequence that uses Bio::Location::Split. (Since
I'm not a Real Biologist, I can't think up more examples here, but I'm
sure they exist.)

Or are you saying it's rare to use join(complement(C..D),
complement(A..B)) instead of complement(join(A..B, C..D))? In that case,
I guess I just got really unlucky in that the five fungal genomes I was
using decided to use the "rare" syntax. 

> Note that there are two bugs 
> for that bug
> report.  The first (and more serious) is still unresolved.  The second
> (where remote locations are treated differently in 
> Location::Split, which
> caused more problems than it was worth) had a fix committed 
> about a month
> ago.  

Sadly, it's the first bug that affects me (and in my case it is the more
common one, since I have no remote locations).

> Any fixes I have made for the first bug invariably break several other
> methods, which use the current Location::Split object logic 
> for retrieving
> sequences, building feature strings, etc.  Since a new RC is 
> imminent and
> the bug only affects a small number of locations, I have held 
> off until
> after a final release is made (the last thing I want to do is 
> fix something
> that breaks ~6-8 other methods), but I'll try looking at it 
> again this week.

IMO this is a pretty serious bug (if these kinds of sequences aren't
that rare as I've shown above), because you're outputting sequence
descriptions that are just plain wrong. Anyone who uses
FTLocationFactory to read these output descriptions will have incorrect
sequence, incorrect translated proteins, etc. And it's even more serious
if other methods are depending on it.

I know I can't dictate your time, and should be volunteering to work on
fixing it. But if it affects other modules, then I will no doubt break
things even more than you have in your attempts.  

-Amir

> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 


From bosborne11 at verizon.net  Mon Oct 16 14:25:14 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 14:25:14 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533A8D3.90709@sendu.me.uk>
Message-ID: <C15946CA.ACA9%bosborne11@verizon.net>

Sendu,

I just made a commit that makes Bio::Index::Blast use SearchIO instead of
BPlite. The BlastIndex.t test is giving a few warnings so I need to take a
look at that but all tests are passing.

An awful lot of work has gone into the SearchIO system; for more on why its
approach is deemed to be superior in the context of Bioperl, see the SearchIO
HOWTO. One key feature of this upcoming release is an emphasis on removing
extraneous modules, and I think it's safe to say that BPlite has been
considered extraneous for a number of years now.

Brian O.


On 10/16/06 11:44 AM, "Sendu Bala" <bix at sendu.me.uk> wrote:

> I think Chris recently deprecated this, but should it be? For me, its
> POD description justifies its existence, and perhaps more importantly,
> Bio::Index::Blast relies on it.
> 
> I took a quick peek at the latter and it didn't seem trivial to move it
> over to Bio::SearchIO instead.
> 
> Should it be undeprecated?
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From bosborne11 at verizon.net  Mon Oct 16 14:59:38 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 14:59:38 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533D4B3.2000809@sendu.me.uk>
Message-ID: <C1594EDA.ACB9%bosborne11@verizon.net>

Sendu,

OK. I _think_ this change shouldn't affect id_parser() but I will test this
in BlastIndex.t. The id_parser() method is relevant to all these Index*
modules - don't know how much it's used but it certainly is nice to have it
available.

Brian O.


On 10/16/06 2:51 PM, "Sendu Bala" <bix at sendu.me.uk> wrote:

> Brian Osborne wrote:
>> Sendu,
>> 
>> I just made a commit that makes Bio::Index::Blast use SearchIO instead of
>> BPlite.
> 
> I was concerned about the whole id_parser thing. Did you determine that
> your change still allows for id_parser to be used and have the intended
> effect, or that id_parser is in someway meaningless and should be
> removed as a method?


From cjfields at uiuc.edu  Mon Oct 16 16:51:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 15:51:08 -0500
Subject: [Bioperl-l] Bio::Location::Split
In-Reply-To: <B9182BFF5B004245BABC12956EA6322E016B1F89@huls5.nucleus.harvard.edu>
Message-ID: <000001c6f164$d1380190$15327e82@pyrimidine>

...
> I downloaded candida glabrata chromosome B from EBI:
> http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948
> 
> testportal>perl location.pl new_glabrata_B.embl > bio
> testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/'
> new_glabrata_B.embl > nonbio
> testportal>wc bio nonbio
>  217  217 4537 bio
>  217  217 4549 nonbio
>  434  434 9086 total
> testportal>diff bio nonbio
> 4c4
> < complement(join(10632..11157,10347..10372))
> ---
> > join(complement(10632..11157),complement(10347..10372))
> 
> Just one example here, but see below.
> 
> > As for the lack of catching this, the particular types of
> > locations that
> > cause the issue are quite rare.
> 
> Really? I guess our definitions of rare depend on which sequences we're
> working with. I'm doing fungal genomes, and here's a grep for a few
> species' entire genomes:
> 
> testportal>foreach i ( *.embl )
> foreach? echo $i
> foreach? grep CDS $i | grep join | grep -c complement
> foreach? end
> glabrata_orf.embl
> 29
> hansenii_orf.embl
> 151
> lactis_orf.embl
> 70
> lipolytica_orf.embl
> 337
> pombe_orf.embl
> 1137
> 
> You might like to use pombe as a test case, as it has lots of these
> complement joins, including ones with multiple introns.

I'll use those.  I'll see if an analogous GenBank file exists as well.  

I can probably make a preliminary fix for FT_string() so that it arranges
the sublocations correctly, but I think the best way to go is to have
FTLocationFactory not modify the various sublocations to start with, which
it currently does when it sets strand() (strand() propagates the strand info
to sublocations). 

> Anyway, I'd question the "rare" designation. It seems to me like any
> species that has introns will have situations like this in their CDSs.
> Not to mention any other sequence that uses Bio::Location::Split. (Since
> I'm not a Real Biologist, I can't think up more examples here, but I'm
> sure they exist.)

I think that additional tests are definitely needed for pulling out
sequences.  

What I mean by 'rare' is that the majority of sequences do not have
problems.  Also, this seems to be a 'silent' bug since the error shows up in
to_FTstring() but the object sublocations seem to be processed correctly when
using the location object directly (such as via SeqFeatureI).  

Round-tripping the sequence should pick it up though, since
complement(join(10632..11157,10347..10372)) is not the same as
join(complement(10632..11157),complement(10347..10372)).

That is essentially what you are doing, correct? i.e. getting the sequences
using Bioperl, saving them (which passes them through SeqIO), reading them
again (back through SeqIO with the malformed location string).
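
To make the round-trip comparison concrete, here is a small Python sketch
(deliberately not Bioperl code) that expands a location string into its
segments in reading order. It assumes only plain start..end ranges nested
in join() and complement(); fuzzy positions, remote accessions, order() and
bond() are not handled, and the function names are made up.

```python
import re

_RANGE = re.compile(r"(\d+)\.\.(\d+)")


def _expect(s, i, ch):
    """Check that s[i] is ch and return the next offset."""
    if i >= len(s) or s[i] != ch:
        raise ValueError("expected %r at offset %d in %r" % (ch, i, s))
    return i + 1


def _parse(s, i):
    """Parse one location starting at offset i; return (segments, new offset)."""
    if s.startswith("complement(", i):
        inner, i = _parse(s, i + len("complement("))
        i = _expect(s, i, ")")
        # Reverse-complementing reverses the order and flips every strand.
        return [(a, b, -strand) for a, b, strand in reversed(inner)], i
    if s.startswith("join(", i):
        i += len("join(")
        out = []
        while True:
            part, i = _parse(s, i)
            out.extend(part)
            if i < len(s) and s[i] == ",":
                i += 1
            else:
                break
        i = _expect(s, i, ")")
        return out, i
    m = _RANGE.match(s, i)
    if m is None:
        raise ValueError("cannot parse %r at offset %d" % (s, i))
    return [(int(m.group(1)), int(m.group(2)), 1)], m.end()


def segments(loc):
    """Return [(start, end, strand), ...] in the order the pieces are read."""
    loc = loc.replace(" ", "")
    segs, pos = _parse(loc, 0)
    if pos != len(loc):
        raise ValueError("trailing text in location: %r" % loc[pos:])
    return segs


in_file = segments("join(complement(10632..11157),complement(10347..10372))")
emitted = segments("complement(join(10632..11157,10347..10372))")
swapped = segments("complement(join(10347..10372,10632..11157))")

print(in_file == emitted)  # False: the emitted string reads the exons in the other order
print(in_file == swapped)  # True: this is the equivalent complement(join(...)) form
```

Comparing the expanded segment lists rather than the raw strings means two
spellings of the same location compare equal, while a genuinely reordered
one does not.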

> Or are you saying it's rare to use join (complement(C..D),
> complement(A..B)) instead of complement(join(A..B, C..D)). In that case,
> I guess I just got really unlucky in that five fungal genomes I was
> using decided to use the "rare" syntax.

Location::Split is supposed to handle all variations, but apparently it
doesn't.  

> > Note that there are two bugs
> > for that bug
> > report.  The first (and more serious) is still unresolved.  The second
> > (where remote locations are treated differently in
> > Location::Split, which
> > caused more problems than it was worth) had a fix committed
> > about a month
> > ago.
> 
> Sadly, it's the first (and in my case, more common (I have no remote
> locations.)) bug for me.
> 
> > Any fixes I have made for the first bug invariably break several other
> > methods, which use the current Location::Split object logic
> > for retrieving
> > sequences, building feature strings, etc.  Since a new RC is
> > imminent and
> > the bug only affects a small number of locations, I have held
> > off until
> > after a final release is made (the last thing I want to do is
> > fix something
> > that breaks ~6-8 other methods), but I'll try looking at it
> > again this week.
> 
> IMO this is a pretty serious bug (if these kinds of sequences aren't
> that rare as I've shown above), because you're outputting sequence
> descriptions that are just plain wrong. Anyone who uses
> FTLocationFactory to read these output descriptions will have incorrect
> sequence, incorrect translated proteins, etc. And it's even more serious
> if other methods are depending on it.
> 
> I know I can't dictate your time, and should be volunteering to work on
> fixing it. But if it affects other modules, then I will no doubt break
> things even more than you have in your attempts.
> 
> -Amir

I'll give it a look over the next week.  Like I mentioned above, I may be
able to fix it in Split::to_FTstring() w/o breaking other tests (in which
case I'll commit it for the 1.5.2 release), but it would be a temporary hack
until I can work out why other tests are failing.

Chris


From jason at bioperl.org  Mon Oct 16 18:45:21 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Oct 2006 15:45:21 -0700
Subject: [Bioperl-l] split location problems
Message-ID: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>

The whole point of split locations is to represent genes with introns  
so that is not the "rare" case.

I'm confused where the problem is.  The locations that I get out with  
to_FTstring on the location object are exactly the same as those input.

I have processed the genbank fungal genomes into GFF3 and have had no  
problems so I'm confused where you are breaking down.  If I write  
them out as embl I also get the correct thing.  This is using the CVS  
version of bioperl from the HEAD.

I've added code to test this to bug 2101 including a C.glabrata  
chromosome downloaded from genbank.  Perhaps the problem is on the  
EMBL parsing side, I didn't test that.

On the technical side, I still am not sure I fully know where the  
strand information should be stored - the top level container or the  
sub-features.  I'll try and stay up on the discussion if anything has  
been decided that I should know about.

-jason


From torsten.seemann at infotech.monash.edu.au  Mon Oct 16 18:23:23 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 17 Oct 2006 08:23:23 +1000
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <000201c6f149$3ed63490$15327e82@pyrimidine>
References: <000201c6f149$3ed63490$15327e82@pyrimidine>
Message-ID: <4534065B.9020309@infotech.monash.edu.au>

Chris Fields wrote:
>> So it looks like an abstract base class, not an interface that
>> defines a contract or API? Should use Root.pm then, would be my vote.
>> 	-hilmar
> 
> Makes sense to me.  Maybe another audit is needed to catch similar
> instances, or has this been done already?

The purpose of my original (poorly phrased) question was to try and sort 
out where Root and RootI were being used the wrong way around.

I'm currently "all-audited out" so I leave this task to another volunteer.

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From cjfields at uiuc.edu  Mon Oct 16 21:07:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 20:07:55 -0500
Subject: [Bioperl-l] split location problems
In-Reply-To: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
Message-ID: <BE15BACB-6076-4F27-82FF-3DFE10FFD1C0@uiuc.edu>


On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote:

> The whole point of split locations is to represent genes with  
> introns so that is not the "rare" case.
>
> I'm confused where the problem is.  The locations that I get out  
> with to_FTstring on the location object are exactly the same as  
> those input.

The problem is with a subset of split locations described in the  
bug report.  The following works:

complement(join(2691..4571,4918..5163))

whereas this:

join(complement(4918..5163),complement(2691..4571))

gives this:

complement(join(4918..5163,2691..4571))

which is not equivalent to the input.  It should be:

complement(join(2691..4571,4918..5163))

since 'join' implies that the order of the segments to be joined is  
important ('order' and 'bond' do not, I guess).
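
The rewrite I'm describing can be sketched in a few lines of Python (this
is not the Bioperl fix, just an illustration; the function name is made up
and it only handles a flat join of complemented start..end ranges, leaving
every other location string untouched):

```python
import re

_RANGE = r"\d+\.\.\d+"
_JOIN_OF_COMPLEMENTS = re.compile(
    r"join\((complement\(%s\)(?:,complement\(%s\))*)\)$" % (_RANGE, _RANGE)
)


def to_complement_join(loc):
    """Rewrite join(complement(a),complement(b)) as complement(join(b,a))."""
    loc = loc.replace(" ", "")
    m = _JOIN_OF_COMPLEMENTS.match(loc)
    if m is None:
        return loc  # not a flat join of complements; leave it alone
    ranges = re.findall(_RANGE, m.group(1))
    # Moving complement() outside the join means the segment order must flip.
    return "complement(join(%s))" % ",".join(reversed(ranges))


print(to_complement_join("join(complement(4918..5163),complement(2691..4571))"))
# prints complement(join(2691..4571,4918..5163))
```

The important step is the reversed(): pulling complement() out of the join
without reversing the segment list is exactly what produces the wrong
string shown above.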

> I have processed the genbank fungal genomes into GFF3 and have had  
> no problems so I'm confused where you are breaking down.  If I  
> write them out as embl I also get the correct thing.  This is using  
> the CVS version of bioperl from the HEAD.
>
> I've added code to test this to bug 2101 including a C.glabrata  
> chromosome downloaded from genbank.  Perhaps the problem is on the  
> EMBL parsing side, I didn't test that.
>
> On the technical side, I still am not sure I fully know where the  
> strand information should be stored - the top level container or  
> the sub-features.  I'll try and stay up on the discussion if  
> anything has been decided that I should know about.
>
> -jason

Split::strand() sets the sublocations as well, which seems to confuse  
the situation more but it is consistent with LocationI, as Hilmar  
points out.  I'm looking into a few solutions now, including a fix in  
Split::to_FTstring().

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Mon Oct 16 22:48:14 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Oct 2006 19:48:14 -0700
Subject: [Bioperl-l] split location problems
In-Reply-To: <BE15BACB-6076-4F27-82FF-3DFE10FFD1C0@uiuc.edu>
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
	<BE15BACB-6076-4F27-82FF-3DFE10FFD1C0@uiuc.edu>
Message-ID: <8273f6c20610161948w201537a5v2fcfa189eb809283@mail.gmail.com>

This probably was exposed by the fact that the Split object used to
explicitly sort the features by start*strand always.  But with remote
locations and needing to be able to explicitly set the order (for features
that are not required to be 5' -> 3') that code must have been removed.   I
think there is just one place that must be missing a 'reverse' on the list
of sub-locations when the top-level feature is a complement.  I'll wait for
your fix before wading in - we probably want to figure out a
'consolidate' method to shrink redundant and equivalent representations to
the shortest possible form. Ugh this really starts to resemble trying to
write a boolean logic toolkit....
-jason

On 10/16/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
>
> On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote:
>
> > The whole point of split locations is to represent genes with
> > introns so that is not the "rare" case.
> >
> > I'm confused where the problem is.  The locations that I get out
> > with to_FTstring on the location object are exactly the same as
> > those input.
>
> The problem is with a subset of split locations described in the
> bug report.  The following works:
>
> complement(join(2691..4571,4918..5163))
>
> whereas this:
>
> join(complement(4918..5163),complement(2691..4571))
>
> gives this:
>
> complement(join(4918..5163,2691..4571))
>
> which is not syntactically the same.  It should be:
>
> complement(join(2691..4571,4918..5163))
>
> since 'join' implies that the order of the segments to be joined is
> important ('order' and 'bond' do not, I guess).
>
> > I have processed the genbank fungal genomes into GFF3 and have had
> > no problems so I'm confused where you are breaking down.  If I
> > write them out as embl I also get the correct thing.  This is using
> > the CVS version of bioperl from the HEAD.
> >
> > I've added code to test this to bug 2101 including a C.glabrata
> > chromosome downloaded from genbank.  Perhaps the problem is on the
> > EMBL parsing side, I didn't test that.
> >
> > On the technical side, I still am not sure I fully know where the
> > strand information should be stored - the top level container or
> > the sub-features.  I'll try and stay up on the discussion if
> > anything has been decided that I should know about.
> >
> > -jason
>
> Split::strand() sets the sublocations as well, which seems to confuse
> the situation more but it is consistent with LocationI, as Hilmar
> points out.  I'm looking into a few solutions now, including a fix in
> Split::to_FTstring().
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


-- 
Jason Stajich
jason at bioperl.org
http://www.duke.edu/~jes12/

From cjfields at uiuc.edu  Mon Oct 16 23:34:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 22:34:25 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C159C54B.ACD5%bosborne11@verizon.net>
References: <C159C54B.ACD5%bosborne11@verizon.net>
Message-ID: <AE334107-1639-468E-ABA8-2F992693809A@uiuc.edu>


On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote:

> Chris and Sendu,
>
> Sendu was correct in wondering whether id_parser() in Blast.pm  
> would work
> after the module was altered to use SearchIO but what I've found  
> out from my
> local tests is that id_parser() didn't work when BPlite was being used
> either. I can continue to work on this but it's safe to say that  
> removing
> BPlite doesn't cause a problem with id_parser, it was already there.
>
> Brian O.

....

It may be one reason (the main reason?) the method wasn't tested.   
Maybe it should be removed if it can't be easily fixed; I don't think  
it makes sense keeping it otherwise.

Chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bosborne11 at verizon.net  Mon Oct 16 23:24:59 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 23:24:59 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <000301c6f155$c7029ff0$15327e82@pyrimidine>
Message-ID: <C159C54B.ACD5%bosborne11@verizon.net>

Chris and Sendu,

Sendu was correct in wondering whether id_parser() in Blast.pm would work
after the module was altered to use SearchIO but what I've found out from my
local tests is that id_parser() didn't work when BPlite was being used
either. I can continue to work on this but it's safe to say that removing
BPlite doesn't cause a problem with id_parser; the problem was already there.

Brian O.


On 10/16/06 3:03 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:

>> Hilmar Lapp wrote:
>>> The problem is it is not maintained, and there are outstanding bug
>>> reports.
>>> 
>>> If you un-deprecate it, then we need a response to people who come
>>> across problems with it when using it. Either you change the POD to say
>>> exactly who and when one should use it (or rather not) and point to the
>>> fact that it is unsupported for all other cases.
>>> 
>>> Or what would you suggest?
>> 
>> I'm not sure.
>> 
>> Does Bio::Index::Blast even work correctly? Does it suffer from whatever
>> bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should
>> that be deprecated as well?
>> 
>> Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO
>> and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't
>> seem trivial (or even appropriate).
>> 
>> Ultimately I just wanted to solve the warnings in the test suite.
>> Thoughts, Chris?
> 
> My opinion is we either have to completely support BPlite (and the others)
> or drop it altogether.  I don't think we can state "use BPLite only with
> Bio::Index::Blast, use SearchIO everywhere else."  That's too inconsistent.
> 
> 
> It seems simpler to deprecate the various Bio::Tools::BP* classes and either
> fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working
> on) or deprecate Bio::Index::Blast as well.
> 
> The warnings in the test suite belong to BlastIndex.t, correct?  I updated
> using Brian's Bio::Index::Blast fix and it passes now w/o warnings.
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 


From bosborne11 at verizon.net  Mon Oct 16 23:48:56 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 23:48:56 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <AE334107-1639-468E-ABA8-2F992693809A@uiuc.edu>
Message-ID: <C159CAE8.ACD9%bosborne11@verizon.net>

Chris,

OK. In fact there's no written guarantee that all Bio::Index* modules have
an id_parser() method. It happens that most do, and it's useful. I'll fix
the documentation in Bio::Index::Blast and add an enhancement request to
Bugzilla. I may be able to get around to it before the 1.5.2 release, but no
promises.

Brian O.


On 10/16/06 11:34 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> 
> On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote:
> 
>> Chris and Sendu,
>> 
>> Sendu was correct in wondering whether id_parser() in Blast.pm
>> would work
>> after the module was altered to use SearchIO but what I've found
>> out from my
>> local tests is that id_parser() didn't work when BPlite was being used
>> either. I can continue to work on this but it's safe to say that
>> removing
>> BPlite doesn't cause a problem with id_parser, it was already there.
>> 
>> Brian O.
> 
> ....
> 
> It may be one reason (the main reason?) the method wasn't tested.
> Maybe it should be removed if it can't be easily fixed; I don't think
> it makes sense keeping it otherwise.
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 


From n.haigh at sheffield.ac.uk  Tue Oct 17 02:35:43 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 07:35:43 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
Message-ID: <453479BF.90408@sheffield.ac.uk>

I'm a bit unclear as to what is happening with these files.

Are these files now superseded by the wikified versions? If so, should 
these files now just simply contain a link to the wikified versions - 
otherwise things could get in a mess since I updated the wiki version of 
INSTALL.WIN and I see Sendu updated the cvs versions a couple of weeks 
ago - hopefully these differences aren't that big.

Nath

From faruque at ebi.ac.uk  Tue Oct 17 04:19:44 2006
From: faruque at ebi.ac.uk (Nadeem Faruque)
Date: Tue, 17 Oct 2006 09:19:44 +0100
Subject: [Bioperl-l] split location problems
Message-ID: <F2A2DB48-8EDF-43AA-AFCF-45B48AF43B1C@ebi.ac.uk>

EMBL currently outputs join-complements in the format
join(complement(30..40),complement(10..20))
instead of the Genbank preferred
complement(join(10..20,30..40))

EMBL's format may reflect what happens in the cell a little more than  
Genbank's, but it is less readable and less concise.
NB I've also seen a couple of people construct these incorrectly
eg join(complement(10..20),complement(30..40))

I believe we are moving to the complement-join format but I can't  
give a date for the transition.

Having said that, trans-splicing will still give us the joys of  
complex locations,
eg
join(1..5,complement(join(10..20,30..40)))
complement(join(30..40,10..20)) <- looks wrong (unless it is a very  
small circle) but mis-ordered exons are resolved by the trans- 
splicing machinery.

Nadeem


--
S.M. Nadeem N. Faruque
EMBL Nucleotide Database Curation Team
EMBL Outstation
Tel: +44 1223 494611                     Fax: +44 1223 494472
The European Bioinformatics Institute    URL: http://www.ebi.ac.uk/
Email for data submissions: datasubs at ebi.ac.uk
Email for updates: update at ebi.ac.uk
========================================================


From bix at sendu.me.uk  Tue Oct 17 04:59:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 09:59:36 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
References: <4521E74E.1040404@infotech.monash.edu.au>	<452F54A1.7010908@sendu.me.uk>	<5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>	<45333E02.9070808@sendu.me.uk>
	<1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
Message-ID: <45349B78.8090905@sendu.me.uk>

Hilmar Lapp wrote:
> So it looks like an abstract base class, not an interface that  
> defines a contract or API? Should use Root.pm then, would be my vote.

Agreed, that was actually what I did in my local copy when I made a new 
inheriting class (so discovering the problem). This change is harmless 
to other modules, but does mean they'll have redundant use of 
Bio::Root::Root which will want cleaning up at some stage.

From bix at sendu.me.uk  Tue Oct 17 06:32:54 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 11:32:54 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
Message-ID: <4534B156.4090501@sendu.me.uk>

Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
See http://www.bioperl.org/wiki/Release_1.5.2 for
instructions on getting and testing this RC.

Developers:
   This should be the last RC before release ~next Monday. Now would
   be a good time for last minute documentation updates and additions.

Users:
   Even though 1.5.2 is a 'developer' release, we consider it the most
   stable and capable version of Bioperl, and recommend that you use
   it in all but the most critical production environments. Please
   try it out and let us know of any problems or difficulties you run
   into.


Thank you,
Sendu.


From cjfields at uiuc.edu  Tue Oct 17 07:16:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 06:16:47 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <453479BF.90408@sheffield.ac.uk>
References: <453479BF.90408@sheffield.ac.uk>
Message-ID: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>

The general consensus was to keep text versions available; we could  
add URL links to the wiki pages for the most up-to-date version.  BTW,  
I have modified INSTALL already.  INSTALL.WIN is next in line (I was  
waiting for your changes).

Chris

On Oct 17, 2006, at 1:35 AM, Nathan S. Haigh wrote:

> I'm a bit unclear as to what is happening with these files.
>
> Are these files now superseded by the wikified versions? If so, should
> these files now just simply contain a link to the wikified versions -
> otherwise things could get in a mess since I updated the wiki  
> version of
> INSTALL.WIN and I see Sendu updated the cvs versions a couple of weeks
> ago - hopefully these differences aren't that big.
>
> Nath

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Tue Oct 17 07:45:45 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 12:45:45 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>
References: <453479BF.90408@sheffield.ac.uk>
	<72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>
Message-ID: <4534C269.5050704@sheffield.ac.uk>

Chris Fields wrote:
> The general consensus was to keep text versions available; we could 
> add URL links to the wiki pages for the most up-to-date version.  BTW, 
> I have modified INSTALL already.  INSTALL.WIN is next in line (I was 
> waiting for your changes).
>
Is it possible to generate these files from the wiki whenever there is a 
release? I know edits shouldn't be too severe or too often - but I can 
see things getting a little messy/annoying if edits have to be made in 2 
places.

Nath

From cjfields at uiuc.edu  Tue Oct 17 10:04:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:04:32 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534C269.5050704@sheffield.ac.uk>
Message-ID: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>

There isn't a very easy way since so many links have to be removed/modified.
I have found a few CPAN modules that could help, but for now I just dump the
text output from a text browser (elinks) using the 'printable version' page
and hand-edit, which works very quickly.  That works for the time being
until I can find another more automated solution.

Fortunately there have been very few edits to either INSTALL wiki page so
they should remain relatively stable.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
> Sent: Tuesday, October 17, 2006 6:46 AM
> To: Chris Fields
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
> 
> Chris Fields wrote:
> > The general consensus was to keep text versions available; we could
> > add URL links to the wiki pages for the most up-to-date version.  BTW,
> > I have modified INSTALL already.  INSTALL.WIN is next in line (I was
> > waiting for your changes).
> >
> Is it possible to generate these files from the wiki whenever there is a
> release? I know edits shouldn't be too severe or too often - but I can
> see things getting a little messy/annoying if edits have to be made in 2
> places.
> 
> Nath


From cjfields at uiuc.edu  Tue Oct 17 10:12:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:12:09 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C159CAE8.ACD9%bosborne11@verizon.net>
Message-ID: <000401c6f1f6$424b5580$15327e82@pyrimidine>


> Chris,
> 
> OK. In fact there's no written guarantee that all Bio::Index* modules have
> an id_parser() method. It happens that most do, and it's useful. I'll fix
> the documentation in Bio::Index::Blast and add an enhancement request to
> Bugzilla, may be able to get around to before 1.5.2 release but no
> promises.
> 
> Brian O.

Do the various Bio::Index* modules share a common interface?  

I wouldn't worry too much about it for this release, unless you really have
time.  It is still, after all, a developer's release, and you've noted it in
Bugzilla.  We could try for another dev release in winter (rel 1.5.3, I
guess) to get any bug fixes or new modules added.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


> On 10/16/06 11:34 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:
> 
> >
> > On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote:
> >
> >> Chris and Sendu,
> >>
> >> Sendu was correct in wondering whether id_parser() in Blast.pm
> >> would work
> >> after the module was altered to use SearchIO but what I've found
> >> out from my
> >> local tests is that id_parser() didn't work when BPlite was being used
> >> either. I can continue to work on this but it's safe to say that
> >> removing
> >> BPlite doesn't cause a problem with id_parser, it was already there.
> >>
> >> Brian O.
> >
> > ....
> >
> > It may be one reason (the main reason?) the method wasn't tested.
> > Maybe it should be removed if it can't be easily fixed; I don't think
> > it makes sense keeping it otherwise.
> >
> > Chris
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> 
> 


From n.haigh at sheffield.ac.uk  Tue Oct 17 10:15:17 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 15:15:17 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
Message-ID: <4534E575.5050308@sheffield.ac.uk>

Chris Fields wrote:
> There isn't a very easy way since so many links have to be removed/modified.
> I have found a few CPAN modules that could help, but for now I just dump the
> text output from a text browser (elinks) using the 'printable version' page
> and hand-edit, which works very quickly.  That works for the time being
> until I can find another more automated solution.
>
> Fortunately there have been very few edits to either INSTALL wiki page so
> they should remain relatively stable.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>   
So am I correct in saying that the best way is to make all updates to 
the wikified versions of these files, and then at regular 
intervals/major releases you (or someone else) will update the CVS 
version of the files in the way described above?

Cheers
Nath

From bix at sendu.me.uk  Tue Oct 17 10:00:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 15:00:39 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534E09C.9030707@genomics.dk>
References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk>
Message-ID: <4534E207.8030508@sendu.me.uk>

Niels Larsen wrote:
> Greetings,
> 
> I am no perl beginner, but I am a BioPerl beginner. Today I looked
> for remote similarity services that can be used from Perl. I found
> the EBI SOAP interface where their example script returns
> 
> Can't find method element in the message at 
> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

What script exactly? There was a problem with the SOAP server that was 
fixed earlier today.


> and the DDBJ service which (from Denmark) returns
> 
> undef

What returned undef? Specifics please.


> and then the NCBI server accessed through BioPerl's RemoteBlast which
> seems to spin in a loop that fills TMPDIR with many tempfiles. Will
> release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall
> is working towards that).

What version of Bioperl were you testing with? What did you do to get it 
to 'spin in a loop'? I can tell you that remote blasting certainly works 
in Bioperl 1.5.2, but you'll have to give more details on the things you 
tried and the problems you encountered.

You can also answer the questions yourself by trying the release candidate.

From B.Beckert at ibmc.u-strasbg.fr  Tue Oct 17 09:59:30 2006
From: B.Beckert at ibmc.u-strasbg.fr (Bertrand Beckert)
Date: Tue, 17 Oct 2006 15:59:30 +0200
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
Message-ID: <CE7A1962-05A0-46CD-8832-BC9C5A8C7656@ibmc.u-strasbg.fr>


hi,

I am running a large number of BLAST searches through a connection to the
NCBI BLAST page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi').
I am trying to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have
some problems. I made a simple example with only one sequence in
order to understand how this module works. This is my simple input
file, a DNA sequence in FASTA format:

> test
>
TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA

I have made some modifications to the example available in the BioPerl
documentation.
It gives me an RID which contains the results of my BLAST search, but I have
a problem with the "$result=$factory->retrieve_blast($rid)" call in my script.
The documentation says that $result=$factory->retrieve_blast($rid) returns,
when it works, a Bio::Tools::Bplite or Bio::Tools::Blast object. In my case
it returns a Bio::SearchIO::blast object... I don't understand why I don't
get the right type of object back (see PART I).

I also tried to solve the problem by replacing the foreach loop in my
script with a new one in order to explore the BLAST result page, but that
doesn't work either (see PART II).

Could you help me, please? Thank you.

Bertrand Beckert.

PART I:

Here is my script with a little annotation and also the shell window
printing:
------------------------------------------------------------------------
#!/usr/bin/perl -w
use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;
sub blast {
    my $prog='blastn';
    my $db='refseq_genomic';
    my $e_val='1e-10';
    my $Input='Seq.fasta';
    my @params = ('-prog' => $prog, '-data' => $db, '-expect' => $e_val,
                  '-readmethod' => 'SearchIO');
    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
    #changes parameters
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
    $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';
    $factory->submit_blast($Input);
    print STDERR "waiting...\n";
    while (my @rids=$factory->each_rid) {
        print "my rid: ",@rids,"\n";
        #returns the ID of the submitted blast, i.e. RID: 1161079157-766-185099855365.BLASTQ2
        #this page contains the result of my blast...
        foreach my $rid (@rids) {
            $result=$factory->retrieve_blast($rid);
            #line to understand what type of object is returned by retrieve_blast
            print "rc:", $result,"\n";
        }
    }
}

&blast;
------------------------------------------------------------------------

here you can see the shell window:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...

PART II:

I also tried to solve the problem by replacing the foreach loop in my
script with:
------------------------------------------------------------------------
foreach my $rid (@rids) {
    while(1) {
        $result=$factory->retrieve_blast($rid)->next_result();
        print "rc:", $result,"\n";
        if ($result) {
            print  $result->num_hits(),"\n";
        }
    }
}
------------------------------------------------------------------------
With this loop I could explore the BLAST result page. This is what I
obtain in the shell window:
		
bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834)


----
-- 
Bertrand BECKERT
PhD student
IBMC - UPR 9002 du CNRS - ARN
15, rue Rene Descartes
F-67084 STRASBOURG Cedex
b.beckert at ibmc.u-strasbg.fr


From niels at genomics.dk  Tue Oct 17 09:54:36 2006
From: niels at genomics.dk (Niels Larsen)
Date: Tue, 17 Oct 2006 15:54:36 +0200
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4534E09C.9030707@genomics.dk>

Greetings,

I am no perl beginner, but I am a BioPerl beginner. Today I looked
for remote similarity services that can be used from Perl. I found
the EBI SOAP interface where their example script returns

Can't find method element in the message at 
/ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

and the DDBJ service which (from Denmark) returns

undef

and then the NCBI server accessed through BioPerl's RemoteBlast which
seems to spin in a loop that fills TMPDIR with many tempfiles. Will
release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall
is working towards that).

Niels L


------------------------------------------------------------------------

Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark

Telephone: +45-8942-5268
Telefax: +45-8620-1222

------------------------------------------------------------------------

From cjfields at uiuc.edu  Tue Oct 17 10:28:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:28:40 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534E575.5050308@sheffield.ac.uk>
Message-ID: <000501c6f1f8$8b78efe0$15327e82@pyrimidine>

...
> So am I correct in saying that the best way is to make all updates to
> the wikified versions of these files, and then at regular
> intervals/major releases you (or someone else) will update the CVS
> version of the files in the way described above?
> 
> Cheers
> Nath

Yes.  I think the online docs will stay relatively stable.  A week or so ago
Mauricio and I were discussing moving the dependencies list to its own CVS
document (since they pertain to all Bioperl installations, not just UNIX'y
flavors).  I haven't done that yet since I was waiting on the INSTALL.WIN
changes before I made any more changes.  Well, that and I've been really
busy doing other things.

One way we could make sure that changes to the online docs would match the
CVS docs would be to allow only certain wiki users (such as sysadmins) to make
modifications to those pages.  That way any changes would have to go through
someone who also has CVS access and could make similar changes to the
distribution docs.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From n.haigh at sheffield.ac.uk  Tue Oct 17 10:37:38 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 15:37:38 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <000501c6f1f8$8b78efe0$15327e82@pyrimidine>
References: <000501c6f1f8$8b78efe0$15327e82@pyrimidine>
Message-ID: <4534EAB2.50609@sheffield.ac.uk>

Chris Fields wrote:
> ...
>   
>> So am I correct in saying that the best way is to make all updates to
>> the wikified versions of these files, and then at regular
>> intervals/major releases you (or someone else) will update the CVS
>> version of the files in the way described above?
>>
>> Cheers
>> Nath
>>     
>
> Yes.  I think the online docs will stay relatively stable.  A week or so ago
> Mauricio and I were discussing moving the dependencies list to its own CVS
> document (since they pertain to all Bioperl installations, not just UNIX'y
> flavors).  I haven't done that yet since I was waiting on the INSTALL.WIN
> changes before I made any more changes.  Well, that and I've been really
> busy doing other things.
>   
Sounds good.
> One way we could make sure that changes to the online docs would match the
> CVS docs would be to allow only certain wiki users (such as sysadmins) to make
> modifications to those pages.  That way any changes would have to go through
> someone who also has CVS access and could make similar changes to the
> distribution docs.
>   
Ugh, not sure I like the sound of maintaining 2 copies of any files - 
sounds like a future headache even if they are pretty stable. It also 
makes it unclear which of the two files should be considered first (i.e. 
is the most up-to-date) on pages such as:
http://www.bioperl.org/wiki/Installing_BioPerl

It suggests that INSTALL and INSTALL.WIN should be looked at first, but 
there are online copies of those files available - this should now be 
the other way around - shouldn't it? I might just be making a mountain 
out of a molehill, so I'll shut up on this topic and make any future 
edits to the wiki pages instead.
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>
>   

From bosborne11 at verizon.net  Tue Oct 17 10:48:54 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 17 Oct 2006 10:48:54 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <000401c6f1f6$424b5580$15327e82@pyrimidine>
Message-ID: <C15A6596.AD0B%bosborne11@verizon.net>

Chris,

The Bio::Index modules either 'use base qw(Bio::Index::Abstract)' or 'use
base qw(Bio::Index::AbstractSeq)'. Neither of these modules has an
id_parser() method.

Brian O.


On 10/17/06 10:12 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> Do the various Bio::Index* modules share a common interface?  


From cjfields at uiuc.edu  Tue Oct 17 10:45:53 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:45:53 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534EAB2.50609@sheffield.ac.uk>
Message-ID: <000601c6f1fa$f260b560$15327e82@pyrimidine>

...
> > One way we could make sure that changes to the online docs would match
> > the CVS docs would be to allow only certain wiki users (such as
> > sysadmins) to make modifications to those pages.  That way any changes
> > would have to go through someone who also has CVS access and could make
> > similar changes to the distribution docs.
> >
> Ugh, not sure I like the sound of maintaining 2 copies of any files -
> sounds like a future headache even if they are pretty stable. It also
> makes it unclear which of the two files should be considered first (i.e.
> is the most up-to-date) on pages such as:
> http://www.bioperl.org/wiki/Installing_BioPerl
> 
> It suggests that INSTALL and INSTALL.WIN should be looked at first, but
> there are online copies of those files available - this should now be
> the other way around - shouldn't it? I might just be making a mountain
> out of a molehill, so I'll shut up on this topic and make any future
> edits to the wiki pages instead.

Yes that should be the other way around (the wiki would be the most
up-to-date), so the CVS docs should point to the wiki, not vice-versa.

Getting the docs right is as important as getting the code to work.  So I
don't consider it a 'mountain-out-of-a-molehill' problem.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Tue Oct 17 11:07:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 10:07:49 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534E207.8030508@sendu.me.uk>
Message-ID: <001001c6f1fe$02fd4de0$15327e82@pyrimidine>

> Niels Larsen wrote:
> > Greetings,
> >
> > I am no perl beginner, but I am a BioPerl beginner. Today I looked
> > for remote similarity services that can be used from Perl. I found
> > the EBI SOAP interface where their example script returns
> >
> > Can't find method element in the message at
> > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.
> 
> What script exactly? There was a problem with the SOAP server that was
> fixed earlier today.
> 
> 
> > and the DDBJ service which (from Denmark) returns
> >
> > undef
> 
> What returned undef? Specifics please.
> 

The first problem, like Sendu mentions, was fixed on the remote server (I
get them to pass now).  Those were from bioperl-run, though, not the bioperl
core distribution.

As for DDBJ, do you mean EBI or SwissProt?  I ask b/c you mention Denmark.
EBI were having server maintenance outages yesterday, which was announced
here.

As Sendu mentions, please be more specific.

> > and then the NCBI server accessed through BioPerl's RemoteBlast which
> > seems to spin in a loop that fills TMPDIR with many tempfiles. Will
> > release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall
> > is working towards that).
> 
> What version of Bioperl were you testing with? What did you do to get it
> to 'spin in a loop'? I can tell you that remote blasting certainly works
> in Bioperl 1.5.2, but you'll have to give more details on the things you
> tried and the problems you encountered.
> 
> You can also answer the questions yourself by trying the release
> candidate.

The tempfiles showing up are from the repeated RID requests and are deleted
after the BLAST run (at least they should be); this is quite normal.  They
don't 'spin in a loop' unless the BLAST query is taking a particularly long
time, which can happen depending on how the BLAST query is set up, i.e. what
type of BLAST program is requested, if comp-based stats are requested,
length of query, database requested, etc.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Oct 17 11:14:07 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 16:14:07 +0100
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
In-Reply-To: <CE7A1962-05A0-46CD-8832-BC9C5A8C7656@ibmc.u-strasbg.fr>
References: <CE7A1962-05A0-46CD-8832-BC9C5A8C7656@ibmc.u-strasbg.fr>
Message-ID: <4534F33F.3070809@sendu.me.uk>

Bertrand Beckert wrote:
> hi,
> 
> I am running a large number of blasts via a connexion to ncbi blast
> page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi').
> I try to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have
> some problems.
[snip]
> The documentation says that $result=$factory->retrieve_blast($rid) returns,
> when it works, a Bio::Tools::Bplite or Bio::Tools::Blast object. In my case
> it returns a Bio::SearchIO::blast object... I don't understand why I don't
> get the right type of object back (see PART I).

I take it you're using some old version of Bioperl where unfortunately 
the documentation was incorrect. In fact you're supposed to get a 
Bio::SearchIO object, so it is a good thing that you are. The latest 
version of Bioperl has (as far as I can see) correct documentation and 
behaviour.

Bio::Tools::Bplite and Bio::Tools::Blast are deprecated. You want 
Bio::SearchIO::blast. All is well.


> I also tried to solve the problem by replacing the foreach loop in my
> script with a new one in order to explore the BLAST result page, but that
> doesn't work either (see PART II).

I'm not really sure what problem you might be facing there, but take a 
look at some up-to-date documentation, using the new example code:

http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html
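
For reference, the retrieval loop in that example code follows roughly the pattern sketched below. This is only a minimal sketch, not the documented code itself. It assumes that retrieve_blast() returns a non-reference status while the search is still running (negative on error) and a Bio::SearchIO object once results are ready, and that remove_rid() takes a finished RID off the queue so that each_rid() eventually returns an empty list. The helper name poll_blast and its $wait parameter are made up for illustration; the factory is passed in, so the sub itself needs nothing beyond core Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: poll a RemoteBlast-style factory until every RID
# has either produced results or failed. $factory is expected to provide
# each_rid(), retrieve_blast() and remove_rid().
sub poll_blast {
    my ($factory, $wait) = @_;
    $wait = 5 unless defined $wait;    # seconds between polls (assumed default)
    my @results;
    while ( my @rids = $factory->each_rid ) {
        foreach my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( !ref $rc ) {
                # Not a SearchIO object: still waiting (assumed 0), or
                # failed (assumed negative or undefined), so give up on it.
                $factory->remove_rid($rid) if !defined $rc || $rc < 0;
                sleep $wait;
                next;
            }
            # Results are ready: stop polling this RID, then read them.
            $factory->remove_rid($rid);
            while ( my $result = $rc->next_result ) {
                push @results, $result;
            }
        }
    }
    return @results;
}
```

With a real factory this might be called as "my @results = poll_blast($factory);" after submit_blast(), followed by $result->num_hits() on each element. One difference from the loop in the quoted script is that every RID is removed once it has been handled, so the outer while loop can terminate.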

From n.haigh at sheffield.ac.uk  Tue Oct 17 12:10:15 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 17:10:15 +0100
Subject: [Bioperl-l] [Fwd: Re: Bundle::BioPerl]
Message-ID: <45350067.6070604@sheffield.ac.uk>

FYI on Bundle::BioPerl

Nathan

-------- Original Message --------
Subject: 	Re: Bundle::BioPerl
Date: 	Tue, 17 Oct 2006 11:52:00 -0400
From: 	Chris Dagdigian <dag at sonsorol.org>
To: 	Nathan S. Haigh <n.haigh at sheffield.ac.uk>
References: 	<45348FB8.4050009 at sheffield.ac.uk>


Hi Nathan,

I've updated the Bundle and uploaded it to CPAN.

I *think* the rationale for keeping it still exists but I'm removed  
enough from Bioperl now that I'll defer to others on the decision.

The basic idea was that BioPerl has a heck of a lot of dependencies  
(other Perl modules) that it requires in order to get all the  
functionality out of it. Many of these dependencies may not be  
present in default Perl installations.  Tracking down all of the  
dependencies and installing them (along with all of the  
dependencies-of-the-dependencies) by hand is a massive pain.

The nice thing about the Bundle is that it lists the core module  
dependencies and it works great with the CPAN.pm module to automate  
the downloading and installation of everything that BioPerl requires.  
The CPAN module is smart enough that when processing *our* bundle it  
will also track down and install anything that our bundle entries  
themselves list as a dependency.

So for Unix/Linux systems the Bundle is a great one-liner way  
("perl -MCPAN -e 'install Bundle::BioPerl'") to auto-install or update  
the many Perl modules that BioPerl makes use of.

On the Windows side, I'm not sure whether it is of any help, though.

Regards,
Chris


On Oct 17, 2006, at 4:09 AM, Nathan S. Haigh wrote:

> Hi Chris
>
> I've been working on making a PPD for the upcoming Bioperl 1.5.2  
> release. During this time I also updated Bundle::BioPerl to include  
> up-to-date prereqs. I was wondering if you could update the CPAN  
> package? The updated BioPerl.pm file is attached.
>
> There is some talk about why and if we need Bundle::BioPerl  
> anymore. What was the rationale for having it in the first place,  
> and does it still hold true now?
>
> Cheers
> Nath
>

From plu5even at gmail.com  Tue Oct 17 12:26:34 2006
From: plu5even at gmail.com (Peter H. Baenziger)
Date: Tue, 17 Oct 2006 12:26:34 -0400
Subject: [Bioperl-l] LocatableSeq object vs Sequence Object
Message-ID: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com>

All,
This is my first bioperl script (but not my first Perl script) so
please forgive my naivety.  I've read through documentation and looked
through cookbooks and the like but to no avail.  Any advice is
appreciated.
So...I am working with an alignment object of several sequences.  My
intention is to loop through all the sequences of the alignment to
find what amino acid they have at a known position in the alignment
(not the position in the sequence).  I was thinking I could use:
foreach $seq ($alignment->each_seq())
to loop through the sequences and call:
$seq->location_from_column($pos)
on each of the sequences.  However, I don't think I have
"LocatableSeq" objects (the type of object that has the method
"location_from_column") being returned by $alignment->each_seq().
So, how do I bridge this gap here?  Or is there a better way?
My appreciation in advance!
Peter

code:
my $swissObj = $swissdb->get_Seq_by_acc($query);  # put several of these in @sequenceObjects
...
my $alignFactory = Bio::Tools::Run::Alignment::Clustalw->new();
    my $alignment = $alignFactory->align(\@sequenceObjects);
    #print $alignment->overall_percentage_identity(); #works

    # now we find the "alignment position" of the mutation we have on
    # the human version and get the amino acid at that "alignment
    # position" for all seq
    my $humanSequence = $prefix."HUMAN";
    # this is the "alignment position" equivalent to the mutation position
    my $pos = $alignment->column_from_residue_number($humanSequence, $aa_seqpos);

    # we'll keep track of what amino acid each species has at the
    # "alignment equivalent" location listed as being a mutation on the
    # human version
    foreach $seq ($alignment->each_seq())
    {
        # print $seq->species() . "\n"; # won't work because
        # $alignment->each_seq() actually returns a LocatableSeq object,
        # not a normal sequence object
        $speciesAA{$species} = $seq->location_from_column($pos);
    }
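
One possible way to get at the residue in a given alignment column, offered only as a sketch: each sequence object returned by each_seq() carries its gapped, aligned string (available via $seq->seq, with the name in $seq->display_id), so the character at column $pos can be read directly with substr. The example below uses plain strings in place of real alignment objects so that it is self-contained; the helper name residue_at_column and the sample sequences are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the character found at a 1-based alignment
# column of a gapped (aligned) sequence string, or undef if out of range.
sub residue_at_column {
    my ($aligned, $col) = @_;
    return undef if $col < 1 || $col > length $aligned;
    return substr($aligned, $col - 1, 1);
}

# Made-up aligned sequences standing in for $seq->display_id => $seq->seq.
my %aligned = (
    PROT_HUMAN => 'MK-TAYIAK',
    PROT_MOUSE => 'MKQTA-IAK',
    PROT_YEAST => 'M--TAYLAK',
);

my $pos = 6;    # alignment column of interest
my %residue_for;
for my $id (sort keys %aligned) {
    $residue_for{$id} = residue_at_column($aligned{$id}, $pos);
    print "$id\t$residue_for{$id}\n";
}
```

A gap character ('-') in the result means that sequence has no residue at that column. With real objects the loop would run over $alignment->each_seq() instead of the hash.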


-- 
<<->>
Peter H. Baenziger

From akarger at CGR.Harvard.edu  Tue Oct 17 12:53:19 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Tue, 17 Oct 2006 12:53:19 -0400
Subject: [Bioperl-l] split location problems
Message-ID: <B9182BFF5B004245BABC12956EA6322E018E6735@huls5.nucleus.harvard.edu>

> From: Jason Stajich [mailto:jason.stajich at gmail.com]
> 
> The whole point of split locations is to represent genes with introns,
> so that is not the "rare" case.

Absolutely.

> I have processed the GenBank fungal genomes into GFF3 and have had no
> problems, so I'm confused where you are breaking down.  If I write
> them out as EMBL I also get the correct thing.  This is using the CVS
> version of bioperl from the HEAD.
> 
> I've added code to test this to bug 2101, including a C.glabrata
> chromosome downloaded from GenBank.  Perhaps the problem is on the
> EMBL parsing side; I didn't test that.

Well, I don't know whether it's EMBL parsing, or a bit further down the
pipe, but I downloaded C.glabrata chromosome B from GenBank (NC_005968),
and it describes the complement/joins in the way that Bioperl is
handling correctly.

GenBank:
     CDS             complement(join(10347..10372,10632..11157))
                     /locus_tag="CAGL0B00242g"

EMBL:
FT   CDS             join(complement(10632..11157),complement(10347..10372))
FT                   /locus_tag="CAGL0B00242g"

Here's the diff when I run the location-printing script I posted
yesterday:

diff biogb bio
1c1,5
< complement(join(10347..10372,10632..11157))
---
> complement(1701..2651)
> complement(2635..3345)
> complement(3980..4408)
> complement(join(10632..11157,10347..10372))
> 10379..10615
209a214,217
> 498198..498890
> 499712..500062
> 499851..500702
> 500579..501364

As you can see, the complement/join CDS is written out in a different
order, which is Bad.
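
To make the expected equivalence concrete, here is a small sketch (plain string handling, not Bioperl's own location code) that rewrites the EMBL-style join(complement(...),complement(...)) form into the GenBank-style complement(join(...)) form. The sub-locations have to be reversed along the way, and that reversal is the step that appears to be missing in the output above. The function name embl_to_genbank_style is made up for illustration, and only simple start..end ranges are handled.

```perl
use strict;
use warnings;

# Hypothetical helper: convert join(complement(a..b),complement(c..d),...)
# into complement(join(...)) with the sub-locations in reverse order.
# Anything that does not match that simple shape is returned unchanged.
sub embl_to_genbank_style {
    my ($loc) = @_;
    return $loc unless $loc =~ /^join\((.+)\)$/;
    my @inner;
    for my $part ( split /,/, $1 ) {
        # Only convert when every sub-location is a complemented range.
        return $loc unless $part =~ /^complement\((\d+\.\.\d+)\)$/;
        push @inner, $1;
    }
    return 'complement(join(' . join( ',', reverse @inner ) . '))';
}

print embl_to_genbank_style(
    'join(complement(10632..11157),complement(10347..10372))'), "\n";
# prints: complement(join(10347..10372,10632..11157))
```

Running both the GenBank and the converted EMBL strings through a comparison like this is one way to check whether two files describe the same split feature.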

(I looked at at least one of the other differences: the GB file says
it's a "misc feature" and EMBL says it's a CDS. But they don't seem to
be relevant here.)

-Amir

> 
> On the technical side, I still am not sure I fully know where the
> strand information should be stored - the top level container or the
> sub-features.  I'll try and stay up on the discussion if anything has
> been decided that I should know about.
> 
> -jason
> 
> 
> 
> 


From paul.boutros at utoronto.ca  Tue Oct 17 12:57:19 2006
From: paul.boutros at utoronto.ca (Paul Boutros)
Date: Tue, 17 Oct 2006 12:57:19 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
Message-ID: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>

Hi,
Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed  
tests, the first seems to be just a result of me not having DBD::mysql  
installed.
Paul

Test Summary
============

Failed Test               Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/BioDBSeqFeature_mysql.t               46   46  1-46
t/SearchIO.t                22  5632  1337 2671  2-1337
2 tests and 106 subtests skipped.
Failed 2/236 test scripts. 1382/11688 subtests failed.
Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =  
159.61 CPU)

BioDBSeqFeature_mysql
=====================
pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
1..46
install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC  
contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t  
/db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8  
/db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi  
/db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at  
(eval 37) line 3.
Perhaps the DBD::mysql perl module hasn't been fully installed,
or perhaps the capitalisation of 'mysql' isn't right.
Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
  at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208

SearchIO
========
pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
1..1337
ok 1

-------------------- WARNING ---------------------
MSG: XML::SAX::Expat not currently supported; must have local copies  
of NCBI DTD docs!
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: error in parsing a report:

404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'  
does not exist  
file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
Handler couldn't resolve external entity at line 2, column 82, byte 104
error in processing external entity reference at line 2, column 82,  
byte 104 at  
/db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line  
187

---------------------------------------------------
not ok 2
# Failed test 2 in t/SearchIO.t at line 68
Can't call method "database_name" on an undefined value at  
t/SearchIO.t line 69.

------------------------------

Message: 10
Date: Tue, 17 Oct 2006 11:32:54 +0100
From: Sendu Bala <bix at sendu.me.uk>
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
To: bioperl-l at bioperl.org
Message-ID: <4534B156.4090501 at sendu.me.uk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
See http://www.bioperl.org/wiki/Release_1.5.2 for
instructions on getting and testing this RC.

Developers:
    This should be the last RC before release ~next monday. Now would
    be a good time for last minute documentation updates and additions.

Users:
    Even though 1.5.2 is a 'developer' release, we consider it the most
    stable and capable version of Bioperl, and recommend that you use
    it in all but the most critical production environments. Please
    try it out and let us know of any problems or difficulties you run
    into.


Thank you,
Sendu.


From barry.moore at genetics.utah.edu  Tue Oct 17 12:57:48 2006
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 17 Oct 2006 10:57:48 -0600
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
Message-ID: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>

lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix

does a reasonable job of textifying html.  You get the links as  
numbered references at the bottom or:

lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |  
perl -ane 's/\[?\[\d+\](edit\])?//g;print'

to remove the links altogether.
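
As a rough illustration of what that substitution does, here is the same regular expression applied to a few made-up lines in the style of lynx -dump output (the sample lines and the helper name strip_link_markers are invented; real output will differ):

```perl
use strict;
use warnings;

# Strip lynx-style link markers such as "[12]" and "[[13]edit]".
sub strip_link_markers {
    my ($line) = @_;
    $line =~ s/\[?\[\d+\](edit\])?//g;
    return $line;
}

# Invented sample lines resembling a dumped wiki page.
my @sample = (
    '[[13]edit]',
    'BIOPERL INSTALLATION',
    'installation instructions can be found [4]here.',
    'see [5]Getting BioPerl.',
);
print strip_link_markers($_), "\n" for @sample;
```

A section "edit" link loses its whole "[[13]edit]" marker, while ordinary links keep their text and lose only the bracketed number.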

Barry

P.S.  Looks like this:

    #Creative Commons copyright

Installing Bioperl for Unix

 From BioPerl

    Jump to: navigation, search

Contents

      * 1 BIOPERL INSTALLATION
      * 2 SYSTEM REQUIREMENTS
      * 3 OPTIONAL
      * 4 ADDITIONAL INSTALLATION INFORMATION
      * 5 THE BIOPERL BUNDLE
      * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN
      * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make'
      * 8 WHERE ARE THE MAN PAGES?
      * 9 EXTERNAL PROGRAMS
           + 9.1 Environment Variables
      * 10 INSTALLING BIOPERL SCRIPTS
      * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA
      * 12 INSTALLING BIOPERL MODULES THE HARD WAY
      * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION
      * 14 THE TEST SYSTEM
      * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE
           + 15.1 CONFIGURING for BSD and Solaris boxes
           + 15.2 INSTALLATION
         * 16 DEPENDENCIES AND Bundle::BioPerl


BIOPERL INSTALLATION

    Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP,
    and on Mac OS X (see the PLATFORMS file for more details). Following are
    instructions for installing Bioperl for Unix/Linux/Mac OS X; Windows
    installation instructions can be found here. For installing Bioperl for
    Mac OS X using Fink, see Getting BioPerl.


SYSTEM REQUIREMENTS

      * Perl 5.005 or later; version 5.6 and greater are recommended. Note
        that most modules will work with earlier versions of Perl. The only
        ones that will not are Bio::SimpleAlign and the Bio::Index::* modules.
        If you don't need these modules and you want to install Bioperl using
        an earlier version of Perl, edit the "require 5.005;" line in
        Makefile.PL as necessary.

      * External modules: Bioperl uses functionality provided in other Perl
        modules. Some of these are included in the standard perl package but
        some need to be obtained from the CPAN site. The list of external
        modules is included at the bottom of this document.

    The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of these
    external modules easy. Simply install the bundle using your CPAN shell and
    all necessary modules will be installed. See THE BIOPERL BUNDLE, below.


OPTIONAL

      * ANSI  C  or  GNU  C  compiler  (gcc)  for  XS  extensions (the
        bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext
        PACKAGE, below).


ADDITIONAL INSTALLATION INFORMATION

      * Additional information on Bioperl and MAC OS:
           + OS 9 - http://bioperl.org/Core/mac-bioperl.html
           + OS X - http://www.tc.umn.edu/~cann0010/Bioperl_OSX_install.html
           + OS X - Installing using Fink (in Getting BioPerl)


THE BIOPERL BUNDLE

    You typically need root privileges to install using CPAN. If you don't
    have these privileges please see INSTALLING BIOPERL IN A PERSONAL MODULE
    AREA for additional information.

    Install Bundle::BioPerl using CPAN. One way:
 >perl -MCPAN -e "install Bundle::BioPerl"

    Another way:
 >perl -MCPAN -e shell
cpan>install Bundle::BioPerl


On Oct 17, 2006, at 8:04 AM, Chris Fields wrote:

> There isn't a very easy way since so many links have to be removed/modified.
> I have found a few CPAN modules that could help, but for now I just dump the
> text output from a text browser (elinks) using the 'printable version' page
> and hand-edit, which works very quickly.  That works for the time being
> until I can find another more automated solution.
>
> Fortunately there have been very few edits to either INSTALL wiki page so
> they should remain relatively stable.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>> -----Original Message-----
>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
>> Sent: Tuesday, October 17, 2006 6:46 AM
>> To: Chris Fields
>> Cc: bioperl-l
>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>>
>> Chris Fields wrote:
>>> The general consensus was to keep text versions available; we could
>>> add URL links to the wiki pages for the most up-to-date version.  BTW,
>>> I have modified INSTALL already.  INSTALL.WIN is next in line (I was
>>> waiting for your changes).
>>>
>> Is it possible to generate these files from the wiki whenever there is a
>> release? I know edits shouldn't be too severe or too often - but I can
>> see things getting a little messy/annoying if edits have to be made in 2
>> places.
>>
>> Nath
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From niels at genomics.dk  Tue Oct 17 12:58:14 2006
From: niels at genomics.dk (Niels Larsen)
Date: Tue, 17 Oct 2006 18:58:14 +0200
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534E207.8030508@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk>
	<4534E207.8030508@sendu.me.uk>
Message-ID: <45350BA6.3040102@genomics.dk>

OK, here are ways to reproduce the problems; I apologize if I got the
test scripts wrong. And I suppose the EBI/DDBJ interfaces are not
really a Bioperl issue.

Niels

------------ EBI

I invoked the EBI script

http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip

like this

WSWUBlastClient.pl -p blastn -D embl test.fasta

where the content of test.fasta is below, and got

Can't find method element in the message at 
/ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

 >Planctomyces sp. 282; Genbank Taxonomy ID: 79927
AATGAACGTTGGCGGCATGGATTAGGCATGCAAGTCGAGGGAGAACCCGCAAGGGGACACCGGCG
AACGGGGTAGGAATACATAGGTAACGTACCCTCAGGACGGGGATAGCCAAGGGAAACTTTGGGTA
ATACCCGATGTGATGGCAAGATGTGAATGCTTGTCATCAAAGGTGAGATTCCACCTGAGGAGCGG
CTTATGCATCATTAGCTTGTTGGCGGGGTAACGGCCCACCAAGGCTGCGATGATTAGGGGGTGTG
AGAGCATGGCCCCCACCACTGGCACTGAGACACTGGCCAGACACCTACGGGTGGCTGCAGTCGAG

I tried with this test sequence in fasta format and with just the
sequence.

------------ DDBJ

Inspired by this page,

http://xml.nig.ac.jp/doc/Blast.txt

I made this test script

------ cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

my ( $service, $seqstr, $result );

use SOAP::Lite;
use Data::Dumper;

$service = SOAP::Lite->service('http://xml.nig.ac.jp/wsdl/Blast.wsdl');

$seqstr = "MSSRIARALALVVTLLHLTRLALSTCPAACHCPLEAPKCAPGVGLVRDGCGCCKVCAKQL";

$result = $service->searchSimple( "blastp", "SWISS", $seqstr );

print Dumper( $result );
------ cut --

which for me prints undef.

------------- NCBI/Bioperl

I installed 1.5.2-RC2, looked at the RemoteBlast example in

http://www.bioperl.org/wiki/Bptutorial.pl

and then put that into this test code, more or less cut/paste,

--- cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

use Bio::Tools::Run::RemoteBlast;
use Data::Dumper;

my ( $remote_blast, $r, $rc, $rid, @rids );

$remote_blast = Bio::Tools::Run::RemoteBlast->new (
                 -prog => 'blastn', -data => 'ecoli', -expect => '1e-10' );

$r = $remote_blast->submit_blast("ecoli.fasta");

while ( @rids = $remote_blast->each_rid )
{
#    print Dumper( \@rids );

     for $rid ( @rids ) {
         $rc = $remote_blast->retrieve_blast($rid);
#        print Dumper( $rc );
     }

     sleep 10;
}
--- cut --

which saves the same BLAST report to TMPDIR every 10 seconds.
The "ecoli.fasta" file contains this

 >test
gggggctctgttggttctcccgcaacgctactctgtttaccaggtcaggtccggaaggaa
gcagccaaggcagatgacgcgtgtgccgggatgtagctggcagggcccccaccc

Maybe I am supposed to add a check for content in $rc and then stop
the inner loop? I could figure that out maybe, but I wish there was a
function which simply takes a single sequence + arguments and only
returns a list of matches when done, and does not return until then
(or until a specified timeout).
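
What is being asked for here amounts to a polling loop with a deadline. A minimal sketch of that idea follows; wait_for_result and the fake poller are hypothetical names, not part of Bioperl. It assumes the convention described in the RemoteBlast documentation, where retrieve_blast returns 0 while a job is pending, a negative value on error, and an object once the report is ready.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Call $poll->() every $interval seconds until it returns a reference
# (taken to mean "finished") or until $timeout seconds have passed.
# Returns the reference, or undef on timeout or on a negative status.
sub wait_for_result {
    my ( $poll, $timeout, $interval ) = @_;
    my $deadline = time + $timeout;

    while (1) {
        my $rc = $poll->();
        return $rc if ref $rc;                   # finished
        return if defined $rc && $rc < 0;        # the job reported an error
        return if time >= $deadline;             # gave up waiting
        sleep $interval;
    }
}

# Stand-in for a remote job: "not ready" (0) twice, then a result.
my $calls     = 0;
my $fake_poll = sub {
    return ++$calls < 3 ? 0 : +{ hits => [ 'hit1', 'hit2' ] };
};

# Interval 0 keeps the demonstration fast; a real job would wait longer.
my $result = wait_for_result( $fake_poll, 30, 0 );
print $result ? "got @{ $result->{hits} }\n" : "no result\n";
```

With the script above, the code reference would presumably wrap $remote_blast->retrieve_blast($rid), followed by $remote_blast->remove_rid($rid) once a result comes back, so that each_rid stops returning that RID.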


------------------------------------------------------------------------

Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark

Telephone: +45-8942-5268
Telefax: +45-8620-1222

------------------------------------------------------------------------

From bertrand.beckert at gmail.com  Tue Oct 17 10:52:36 2006
From: bertrand.beckert at gmail.com (bertrand beckert)
Date: Tue, 17 Oct 2006 16:52:36 +0200
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
Message-ID: <500217090610170752q565cfc08t5208e3b64f99ef7f@mail.gmail.com>

Hi,

I am running a large number of BLAST searches through the NCBI BLAST
page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi').
I am trying to use 'Bio::Tools::Run::RemoteBlast', but unfortunately I
have some problems. I made a simple example with only one sequence in
order to understand how this module works. This is my simple input
file, a DNA sequence in FASTA format:

>test
TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA

I made some modifications to the example available in the Bioperl
documentation. It gives me a RID for the results of my BLAST search, but
I have a problem with "$result=$factory->retrieve_blast($rid)" in my
script. The documentation says that, when it works,
$result=$factory->retrieve_blast($rid) returns a Bio::Tools::BPlite or
Bio::Tools::Blast object. In my case it returns a Bio::SearchIO::blast
object, and I don't understand why I don't get the expected type of
object (see PART I).

I also tried to solve the problem by replacing the foreach loop in my
script with a new one in order to explore the BLAST result page, but
that doesn't work either (see PART II).

Could you help me, please? Thank you.

Bertrand Beckert.

PART I:

Here is my script, lightly annotated, followed by the output in the
shell window:
------------------------------------------------------------------------
#!/usr/bin/perl -w
use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;
sub blast {
my $prog='blastn';
my $db='refseq_genomic';
my $e_val='1e-10';
my $Input='Seq.fasta';
my @params = ('-prog' =>  $prog, '-data' =>  $db, '-expect' =>
$e_val, '-readmethod' => 'SearchIO');
my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
#changes parameters
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
$Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';
$factory->submit_blast($Input);
print STDERR "waiting...\n";
while (my @rids=$factory->each_rid) {
          print "my rid: ",@rids,"\n";
     # prints the ID of the submitted blast, i.e. RID:
     # 1161079157-766-185099855365.BLASTQ2
     # (the page for this RID contains the result of my blast)
             foreach my $rid (@rids) {
                         $result=$factory->retrieve_blast($rid);
        # print the return value to see what type of object
        # retrieve_blast returns
                 print "rc:", $result,"\n";

                        }
            }
        }

&blast;
------------------------------------------------------------------------

Here is the output in the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...
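
The same RID being printed over and over is consistent with the RID never being taken off the factory's list: as far as the RemoteBlast documentation describes it, each_rid keeps returning a RID until remove_rid is called for it. The sketch below shows that bookkeeping with a mock factory, so it runs without Bioperl or network access. MockFactory is a hypothetical stand-in that only mimics the three method names used in the script above. The Bio::SearchIO::blast object itself is presumably just what '-readmethod' => 'SearchIO' asks for, and its next_result method then yields the result objects.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Minimal stand-in for the factory, for illustration only. It mimics the
# three RemoteBlast method names used below; it is not the real module.
package MockFactory;

sub new {
    my ( $class, @rids ) = @_;
    return bless { rids => { map { $_ => 0 } @rids } }, $class;
}
sub each_rid   { my ($self) = @_; return sort keys %{ $self->{rids} } }
sub remove_rid { my ( $self, $rid ) = @_; delete $self->{rids}{$rid}; return }

sub retrieve_blast {
    my ( $self, $rid ) = @_;
    # Pretend the job is pending on the first poll and done on the second.
    return $self->{rids}{$rid}++ ? +{ rid => $rid } : 0;
}

package main;

my $factory = MockFactory->new( 'RID-A', 'RID-B' );
my @done;

while ( my @rids = $factory->each_rid ) {
    for my $rid (@rids) {
        my $rc = $factory->retrieve_blast($rid);
        if ( !ref $rc ) {
            # 0 = still running; a negative value = failed, so drop it
            $factory->remove_rid($rid) if $rc < 0;
            next;
        }
        push @done, $rid;              # a real script would parse $rc here
        $factory->remove_rid($rid);    # without this the loop never ends
    }
}

print "finished: @done\n";
```

A real script would also sleep for a few seconds between polls rather than querying the server in a tight loop.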

PART II:

I also tried to solve the problem by replacing the foreach loop in my
script with:
------------------------------------------------------------------------

foreach my $rid (@rids) {
                 while(1) {
                 $result=$factory->retrieve_blast($rid)->next_result();
                 print "rc:", $result,"\n";
                 if ($result) {
                 print  $result->num_hits(),"\n";
                 }
------------------------------------------------------------------------

With this loop I could explore the BLAST result page. This is what I
get in the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834)


----
-- 
Bertrand BECKERT
PhD student
IBMC - UPR 9002 du CNRS - ARN
15, rue Rene Descartes
F-67084 STRASBOURG Cedex
b.beckert at ibmc.u-strasbg.fr
bertrand.beckert at gmail.com

From cjfields at uiuc.edu  Tue Oct 17 13:50:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 12:50:49 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>
Message-ID: <001201c6f214$c8934440$15327e82@pyrimidine>

(Apologies for the top post, but I thought my response might get lost below)

I use elinks in a similar fashion.  It tends to format the tables a bit
better than lynx.

Chris

> -----Original Message-----
> From: Barry Moore [mailto:barry.moore at genetics.utah.edu]
> Sent: Tuesday, October 17, 2006 11:58 AM
> To: Chris Fields
> Cc: 'Nathan S. Haigh'; 'bioperl-l'
> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
> 
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix
> 
> does a reasonable job of textifying html.  You get the links as
> numbered references at the bottom or:
> 
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |
> perl -ane 's/\[?\[\d+\](edit\])?//g;print'
> 
> to remove the links all together.
> 
> Barry
> 


From cjfields at uiuc.edu  Tue Oct 17 13:52:36 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 12:52:36 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
Message-ID: <001301c6f215$07a9a070$15327e82@pyrimidine>

What do you get when you run the SearchIO.t test by itself using 'perl -I.
t/SearchIO.t'?  It looks like something pretty catastrophic happened.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
> Sent: Tuesday, October 17, 2006 11:57 AM
> To: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
> 
> Hi,
> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
> tests, the first seems to be just a result of me not having DBD::mysql
> installed.
> Paul
> 
> Test Summary
> ============
> 
> Failed Test               Stat Wstat Total Fail  List of Failed
> -------------------------------------------------------------------------------
> t/BioDBSeqFeature_mysql.t               46   46  1-46
> t/SearchIO.t                22  5632  1337 2671  2-1337
> 2 tests and 106 subtests skipped.
> Failed 2/236 test scripts. 1382/11688 subtests failed.
> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =
> 159.61 CPU)
> 
> BioDBSeqFeature_mysql
> =====================
> pcboutro@ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
> 1..46
> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC
> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t
> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8
> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi
> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at
> (eval 37) line 3.
> Perhaps the DBD::mysql perl module hasn't been fully installed,
> or perhaps the capitalisation of 'mysql' isn't right.
> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
>   at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
> 
> SearchIO
> ========
> pcboutro@ccb690[670] >> perl -w t/SearchIO.t | more
> 1..1337
> ok 1
> 
> -------------------- WARNING ---------------------
> MSG: XML::SAX::Expat not currently supported; must have local copies
> of NCBI DTD docs!
> ---------------------------------------------------
> 
> -------------------- WARNING ---------------------
> MSG: error in parsing a report:
> 
> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
> does not exist
> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
> Handler couldn't resolve external entity at line 2, column 82, byte 104
> error in processing external entity reference at line 2, column 82,
> byte 104 at
> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
> 187
> 
> ---------------------------------------------------
> not ok 2
> # Failed test 2 in t/SearchIO.t at line 68
> Can't call method "database_name" on an undefined value at
> t/SearchIO.t line 69.
> 
> ------------------------------
> 
> Message: 10
> Date: Tue, 17 Oct 2006 11:32:54 +0100
> From: Sendu Bala <bix at sendu.me.uk>
> Subject: [Bioperl-l] Bioperl 1.5.2 RC2
> To: bioperl-l at bioperl.org
> Message-ID: <4534B156.4090501 at sendu.me.uk>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
> See http://www.bioperl.org/wiki/Release_1.5.2 for
> instructions on getting and testing this RC.
> 
> Developers:
>     This should be the last RC before release ~next Monday. Now would
>     be a good time for last-minute documentation updates and additions.
> 
> Users:
>     Even though 1.5.2 is a 'developer' release, we consider it the most
>     stable and capable version of Bioperl, and recommend that you use
>     it in all but the most critical production environments. Please
>     try it out and let us know of any problems or difficulties you run
>     into.
> 
> 
> Thank you,
> Sendu.
> 
> 
> 


From paul.boutros at utoronto.ca  Tue Oct 17 13:59:33 2006
From: paul.boutros at utoronto.ca (Paul Boutros)
Date: Tue, 17 Oct 2006 13:59:33 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
Message-ID: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca>

Hi Chris,

Here it is:
pcboutro@ccb690[643] >> perl -I. t/SearchIO.t
1..1337
ok 1

-------------------- WARNING ---------------------
MSG: XML::SAX::Expat not currently supported; must have local copies of NCBI DTD docs!
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: error in parsing a report:

404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd' does not exist file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
Handler couldn't resolve external entity at line 2, column 82, byte 104
error in processing external entity reference at line 2, column 82, byte 104 at /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line 187

---------------------------------------------------
not ok 2
# Failed test 2 in t/SearchIO.t at line 68
Can't call method "database_name" on an undefined value at  
t/SearchIO.t line 69.
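
The run stops at the first method call on the undefined result, which is why every test from 2 onward is counted as failed. One way to contain that kind of failure is a SKIP block that guards the dependent tests. The sketch below uses Test::More, which is an assumption (the message format above suggests SearchIO.t was written with Test.pm at the time); parse_report and the expected database name are hypothetical stand-ins.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical stand-in for a parser call that returns undef when the
# report cannot be parsed (as happened above when the DTD was missing).
sub parse_report { return }

my $result = parse_report();

SKIP: {
    # Skip the two dependent tests instead of dying on an undefined value.
    skip 'parser returned no result object', 2 unless defined $result;

    is( $result->database_name, 'some_db', 'database name' );
    ok( $result->num_hits > 0, 'result has hits' );
}
```

A real test file would probably keep one ok( defined $result, ... ) check ahead of the block, so the parse failure is still reported once rather than hidden.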


Quoting Chris Fields <cjfields at uiuc.edu>:

> What do you get when you run the SearchIO.t test by itself using 'perl -I.
> t/SearchIO.t'?  It looks like something pretty catastrophic happened.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>


From barry.moore at genetics.utah.edu  Tue Oct 17 14:07:12 2006
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 17 Oct 2006 12:07:12 -0600
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <C15A8DE6.AD40%bosborne11@verizon.net>
References: <C15A8DE6.AD40%bosborne11@verizon.net>
Message-ID: <588DE26B-8F18-4540-BAEE-2B479CBDE8B3@genetics.utah.edu>

In fact, I think it was you who taught me that trick in the first place.

B

On Oct 17, 2006, at 11:40 AM, Brian Osborne wrote:

> Barry,
>
> I second that. lynx does the best job of converting HTML to text  
> I've seen.
>
> Brian O.
>
>
> On 10/17/06 12:57 PM, "Barry Moore" <barry.moore at genetics.utah.edu>  
> wrote:
>
>> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix
>>
>> does a reasonable job of textifying html.  You get the links as
>> numbered references at the bottom or:
>>
>> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |
>> perl -ane 's/\[?\[\d+\](edit\])?//g;print'
>>
>> to remove the links all together.
>>
>> Barry
>>
>> P.S.  Looks like this:
>>
>>     #Creative Commons copyright
>>
>> Installing Bioperl for Unix
>>
>>  From BioPerl
>>
>>     Jump to: navigation, search
>>
>> Contents
>>
>>       * 1 BIOPERL INSTALLATION
>>       * 2 SYSTEM REQUIREMENTS
>>       * 3 OPTIONAL
>>       * 4 ADDITIONAL INSTALLATION INFORMATION
>>       * 5 THE BIOPERL BUNDLE
>>       * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN
>>       * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make'
>>       * 8 WHERE ARE THE MAN PAGES?
>>       * 9 EXTERNAL PROGRAMS
>>            + 9.1 Environment Variables
>>       * 10 INSTALLING BIOPERL SCRIPTS
>>       * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA
>>       * 12 INSTALLING BIOPERL MODULES THE HARD WAY
>>       * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION
>>       * 14 THE TEST SYSTEM
>>       * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE
>>            + 15.1 CONFIGURING for BSD and Solaris boxes
>>            + 15.2 INSTALLATION
>>          * 16 DEPENDENCIES AND Bundle::BioPerl
>>
>>
>> BIOPERL INSTALLATION
>>
>>     Bioperl has been installed on many forms of Unix, Win9X/NT/ 
>> 2000/XP,
>>     and on Mac OS X (see the PLATFORMS file for more details).
>> Following are
>>     instructions  for  installing Bioperl for Unix/Linux/Mac OS X;
>> Windows
>>     installation instructions can be found here. For installing
>> Bioperl for
>>     Mac OS X using Fink, see Getting BioPerl.
>>
>>
>> SYSTEM REQUIREMENTS
>>
>>       * Perl 5.005 or later; version 5.6 and greater are recommended.
>> Note
>>         that most modules will work with earlier versions of Perl.
>> The only ones
>>         that will not are Bio::SimpleAlign and the Bio::Index::*
>> modules. If
>>         you don't need these modules and you want to install Bioperl
>> using an
>>         earlier version of Perl, edit the "require 5.005;" line in
>> Makefile.PL
>>         as necessary.
>>
>>       * External modules: Bioperl uses functionality provided in
>> other Perl
>>         modules. Some of these are included in the standard perl
>> package but
>>         some  need to be obtained from the CPAN site. The list of
>> external
>>         modules is included at the bottom of this document.
>>
>>     The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of
>> these
>>     external modules easy. Simply install the bundle using your CPAN
>> shell and
>>     all necessary modules will be installed. See THE BIOPERL BUNDLE,
>> below.
>>
>>
>> OPTIONAL
>>
>>       * ANSI  C  or  GNU  C  compiler  (gcc)  for  XS  extensions  
>> (the
>>         bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext
>>         PACKAGE, below).
>>
>>
>>
>> ADDITIONAL INSTALLATION INFORMATION
>>
>>       * Additional information on Bioperl and MAC OS:
>>            + OS 9 - http://bioperl.org/Core/mac-bioperl.html
>>            + OS X - http://www.tc.umn.edu/~cann0010/Bioperl_OSX_install.html
>>            + OS X - Installing using Fink (in Getting BioPerl)
>>
>>
>>
>> THE BIOPERL BUNDLE
>>
>>     You typically need root privileges to install using CPAN. If you
>> don't
>>     have these privileges please see INSTALLING BIOPERL IN A PERSONAL
>> MODULE
>>     AREA for additional information.
>>
>>     Install Bundle::BioPerl using CPAN. One way:
>>> perl -MCPAN -e "install Bundle::BioPerl"
>>
>>     Another way:
>>> perl -MCPAN -e shell
>> cpan>install Bundle::BioPerl
>>
>>
>>
>> On Oct 17, 2006, at 8:04 AM, Chris Fields wrote:
>>
>>> There isn't a very easy way since so many links have to be removed/
>>> modified.
>>> I have found a few CPAN modules that could help, but for now I just
>>> dump the
>>> text output from a text browser (elinks) using the 'printable
>>> version' page
>>> and hand-edit, which works very quickly.  That works for the time
>>> being
>>> until I can find another more automated solution.
>>>
>>> Fortunately there have been very few edits to either INSTALL wiki
>>> page so
>>> they should remain relatively stable.
>>>
>>> Christopher Fields
>>> Postdoctoral Researcher - Switzer Lab
>>> Dept. of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>
>>>> -----Original Message-----
>>>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
>>>> Sent: Tuesday, October 17, 2006 6:46 AM
>>>> To: Chris Fields
>>>> Cc: bioperl-l
>>>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>>>>
>>>> Chris Fields wrote:
>>>>> The general consensus was to keep text versions available; we could
>>>>> add URL links to the wiki pages for the most up-to-date version.  BTW,
>>>>> I have modified INSTALL already.  INSTALL.WIN is next in line (I was
>>>>> waiting for your changes).
>>>>>
>>>> Is it possible to generate these files from the wiki whenever there is a
>>>> release? I know edits shouldn't be too severe or too often - but I can
>>>> see things getting a little messy/annoying if edits have to be made in 2
>>>> places.
>>>>
>>>> Nath
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>


From bix at sendu.me.uk  Tue Oct 17 14:07:04 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 19:07:04 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
References: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
Message-ID: <45351BC8.9080507@sendu.me.uk>

Paul Boutros wrote:
> Hi,
> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed  
> tests, the first seems to be just a result of me not having DBD::mysql  
> installed.
[snip]

Thanks for those, very useful. Not something that's come up before 
afaik; I'll look into them.

From cjfields at uiuc.edu  Tue Oct 17 14:31:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 13:31:51 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca>
Message-ID: <001401c6f21a$836f9fc0$15327e82@pyrimidine>

Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
backend parser.  For some reason BLAST XML parsing doesn't work with that
parser (it tries to verify the XML first before parsing, hence the DTD
error).  I may try getting this to work again, but so far I haven't found an
easy way to prevent XML verification via XML::SAX::Expat.

There are two options: 1) install XML::SAX::ExpatXS (the better option),
which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
parser in your local ParserDetails.ini file to XML::SAX::PurePerl.

BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
hasn't officially happened yet); the latter hasn't had significant
development in about three years.
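
If you want to see which backend you would get, here is a rough sketch
(the helper name sax_parser_names is made up for illustration, and it
assumes XML::SAX is installed).  As far as I know, XML::SAX::ParserFactory
falls back to the last parser registered in ParserDetails.ini unless
$XML::SAX::ParserPackage is set, so the last name printed should be the
default:

```perl
use strict;
use warnings;

# Hypothetical helper: names of the SAX parsers registered in
# ParserDetails.ini, or an empty list if XML::SAX cannot be loaded.
sub sax_parser_names {
    return () unless eval { require XML::SAX; 1 };
    my $parsers = eval { XML::SAX->parsers() } || [];
    return map { $_->{Name} } @$parsers;
}

my @names = sax_parser_names();
if (@names) {
    print "Registered SAX parsers: @names\n";
    print "Default (last registered): $names[-1]\n";
}
else {
    print "No SAX parsers found (is XML::SAX installed?)\n";
}

# To force a backend for one script without editing ParserDetails.ini:
#   $XML::SAX::ParserPackage = 'XML::SAX::PurePerl';
```

If XML::SAX::Expat shows up last, that would match the failure above.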

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: Paul Boutros [mailto:paul.boutros at utoronto.ca]
> Sent: Tuesday, October 17, 2006 1:00 PM
> To: Chris Fields
> Cc: bioperl-l at lists.open-bio.org
> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2
> 
> Hi Chris,
> 
> Here it is:
> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
> 1..1337
> ok 1
> 
> -------------------- WARNING ---------------------
> MSG: XML::SAX::Expat not currently supported; must have local copies
> of NCBI DTD docs!
> ---------------------------------------------------
> 
> -------------------- WARNING ---------------------
> MSG: error in parsing a report:
> 
> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
> does not exist
> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
> Handler couldn't resolve external entity at line 2, column 82, byte 104
> error in processing external entity reference at line 2, column 82,
> byte 104 at
> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
> 187
> 
> ---------------------------------------------------
> not ok 2
> # Failed test 2 in t/SearchIO.t at line 68
> Can't call method "database_name" on an undefined value at
> t/SearchIO.t line 69.
> 
> 
> Quoting Chris Fields <cjfields at uiuc.edu>:
> 
> > What do you get when you run the SearchIO.t test by itself using
> > 'perl -I. t/SearchIO.t'?  It looks like something pretty catastrophic happened.
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
> >> Sent: Tuesday, October 17, 2006 11:57 AM
> >> To: bioperl-l at lists.open-bio.org
> >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
> >>
> >> Hi,
> >> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
> >> tests, the first seems to be just a result of me not having DBD::mysql
> >> installed.
> >> Paul
> >>
> >> Test Summary
> >> ============
> >>
> >> Failed Test               Stat Wstat Total Fail  List of Failed
> >> -------------------------------------------------------------------------------
> >> t/BioDBSeqFeature_mysql.t               46   46  1-46
> >> t/SearchIO.t                22  5632  1337 2671  2-1337
> >> 2 tests and 106 subtests skipped.
> >> Failed 2/236 test scripts. 1382/11688 subtests failed.
> >> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =
> >> 159.61 CPU)
> >>
> >> BioDBSeqFeature_mysql
> >> =====================
> >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
> >> 1..46
> >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC
> >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t
> >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8
> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi
> >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at
> >> (eval 37) line 3.
> >> Perhaps the DBD::mysql perl module hasn't been fully installed,
> >> or perhaps the capitalisation of 'mysql' isn't right.
> >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
> >>   at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
> >>
> >> SearchIO
> >> ========
> >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
> >> 1..1337
> >> ok 1
> >>
> >> -------------------- WARNING ---------------------
> >> MSG: XML::SAX::Expat not currently supported; must have local copies
> >> of NCBI DTD docs!
> >> ---------------------------------------------------
> >>
> >> -------------------- WARNING ---------------------
> >> MSG: error in parsing a report:
> >>
> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
> >> does not exist
> >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
> >> Handler couldn't resolve external entity at line 2, column 82, byte 104
> >> error in processing external entity reference at line 2, column 82,
> >> byte 104 at
> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
> >> 187
> >>
> >> ---------------------------------------------------
> >> not ok 2
> >> # Failed test 2 in t/SearchIO.t at line 68
> >> Can't call method "database_name" on an undefined value at
> >> t/SearchIO.t line 69.
> >>
> >> ------------------------------
> >>
> >> Message: 10
> >> Date: Tue, 17 Oct 2006 11:32:54 +0100
> >> From: Sendu Bala <bix at sendu.me.uk>
> >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2
> >> To: bioperl-l at bioperl.org
> >> Message-ID: <4534B156.4090501 at sendu.me.uk>
> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> >>
> >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
> >> See http://www.bioperl.org/wiki/Release_1.5.2 for
> >> instructions on getting and testing this RC.
> >>
> >> Developers:
> >>     This should be the last RC before release ~next Monday. Now would
> >>     be a good time for last minute documentation updates and additions.
> >>
> >> Users:
> >>     Even though 1.5.2 is a 'developer' release, we consider it the most
> >>     stable and capable version of Bioperl, and recommend that you use
> >>     it in all but the most critical production environments. Please
> >>     try it out and let us know of any problems or difficulties you run
> >>     into.
> >>
> >>
> >> Thank you,
> >> Sendu.
> >>
> >>
> >>
> >
> >
> 


From cjfields at uiuc.edu  Tue Oct 17 15:05:59 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 14:05:59 -0500
Subject: [Bioperl-l] split location problems
In-Reply-To: <B9182BFF5B004245BABC12956EA6322E018E6735@huls5.nucleus.harvard.edu>
Message-ID: <001b01c6f21f$48640420$15327e82@pyrimidine>

> > From: Jason Stajich [mailto:jason.stajich at gmail.com]
> >
> > The whole point of split locations is to represent genes with
> > introns
> > so that is not the "rare" case.
> 
> Absolutely.

Right, but that specific kind of join statement is not commonly used in
GenBank files, which seems to be the predominant format (no offense to
EBI).  This may explain why we haven't seen this pop up more often.

I believe what we're seeing is a difference in the way these locations are
described at NCBI vs EBI, which Nadeem Faruque seems to corroborate.  He
indicated that EBI may move to using similar GenBank-like location strings.
Regardless, FTLocationFactory and Bio::Location::Split should handle both
forms if they are present, but they only seem to like the GenBank version.

> > I've added code to test this to bug 2101 including a C.glabrata
> > chromosome downloaded from GenBank.  Perhaps the problem is on the
> > EMBL parsing side, I didn't test that.
> 
> Well, I don't know whether it's EMBL parsing, or a bit further down the
> pipe, but I downloaded C.glabrata chromosome B for GenBank (NC_005968),
> and it describes the complement/joins in the way that Bioperl is
> handling correctly.
> 
> GenBank:
>      CDS             complement(join(10347..10372,10632..11157))
>                      /locus_tag="CAGL0B00242g"
> 
> EMBL:
> FT   CDS             join(complement(10632..11157),complement(10347..10372))
> FT                   /locus_tag="CAGL0B00242g"

Yes, something that I found out independently (and corroborated by Nadeem).

> Here's the diff when I run the location-printing script I posted
> yesterday:
> 
> diff biogb bio
> 1c1,5
> < complement(join(10347..10372,10632..11157))
> ---
> > complement(1701..2651)
> > complement(2635..3345)
> > complement(3980..4408)
> > complement(join(10632..11157,10347..10372))
> > 10379..10615
> 209a214,217
> > 498198..498890
> > 499712..500062
> > 499851..500702
> > 500579..501364
> 
> As you can see, the complement/join CDS is written out in a different
> order, which is Bad.

I think this can be handled directly in to_FTstring().  I'll have to add a
method to get the strand info from the Split object w/o going through
strand().  
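
To make the two notations concrete, here is a minimal string-level sketch
of how the EMBL form maps onto the GenBank form.  The function name
embl_to_genbank_style is hypothetical and this is not how to_FTstring() or
Bio::Location::Split work internally; it only handles the simple case where
every sub-location is a plain complement(start..end):

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): rewrite
#   join(complement(A),complement(B))  as  complement(join(B,A)).
# Anything that does not fit that exact pattern is returned unchanged.
sub embl_to_genbank_style {
    my ($loc) = @_;
    return $loc unless $loc =~ /^join\((.+)\)$/;
    my @ranges;
    for my $part (split /,/, $1) {
        return $loc unless $part =~ /^complement\(([^()]+)\)$/;
        push @ranges, $1;
    }
    # The sub-location order is reversed when the complement moves outside.
    return 'complement(join(' . join(',', reverse @ranges) . '))';
}

print embl_to_genbank_style(
    'join(complement(10632..11157),complement(10347..10372))'), "\n";
# prints: complement(join(10347..10372,10632..11157))
```

The reversal of the sub-location order is the part that matters for the
output problem Amir reported.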

However, I'm thinking about trying a different tack which is a bit simpler
and, if it proves fruitful, may simplify Split locations somewhat.  It won't
be ready for 1.5.2 but maybe the next release.

> (I looked at at least one of the other differences: the GB file says
> it's a "misc feature" and EMBL says it's a CDS. But they don't seem to
> be relevant here.)
> -Amir

Probably not but something to keep in mind.
 
-c

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From er at xs4all.nl  Tue Oct 17 15:01:48 2006
From: er at xs4all.nl (Erikjan)
Date: Tue, 17 Oct 2006 21:01:48 +0200 (CEST)
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
Message-ID: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>

Hello,

I noticed a little problem with the Annotation "DBLink" from GenBank entries.

When I run:

perl -MBio::DB::GenBank -e 'my $gi =
56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
$db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
$ac=$seq->annotation(); my @annotations = $ac->get_Annotations("dblink");
for(@annotations) { print $_, "\n";} print $INC{
"Bio/Annotation/DBLink.pm" }, "\n"; '

This yields:

   GenBank:AL591065.17.17

and the place where the used Bio/Annotation/DBLink.pm resides.

Can others repeat this?

I have dug into the source a little and Bio::Annotation::DBLink seems to
be the place where this happens: it has a concatenation which leads to
that repeated version number.

Is this something that I should fix "client-side", so to speak, or is it
worthwhile to add some logic to that concatenation to prevent this?
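
The kind of logic I have in mind is roughly the following.  This is only a
sketch: id_with_version is a made-up name, not the actual code in
Bio::Annotation::DBLink, and it assumes the ID and version are available
as plain strings:

```perl
use strict;
use warnings;

# Hypothetical helper, not the actual Bio::Annotation::DBLink code:
# append ".version" only when the ID does not already end with it.
sub id_with_version {
    my ($id, $version) = @_;
    return $id unless defined $version && length $version;
    return $id if $id =~ /\.\Q$version\E\z/;
    return "$id.$version";
}

print id_with_version('AL591065',    17), "\n";   # AL591065.17
print id_with_version('AL591065.17', 17), "\n";   # AL591065.17
```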


Thanks,

Eric


From bosborne11 at verizon.net  Tue Oct 17 13:40:54 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 17 Oct 2006 13:40:54 -0400
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>
Message-ID: <C15A8DE6.AD40%bosborne11@verizon.net>

Barry,

I second that. lynx does the best job of converting HTML to text I've seen.
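
For anyone who wants to try the link-stripping substitution quoted below
without running lynx, here is a small sketch.  The sub name
strip_link_markers and the sample lines are made up; only the regex is
taken from Barry's one-liner:

```perl
use strict;
use warnings;

# Same substitution as the quoted one-liner, wrapped in a sub for testing.
sub strip_link_markers {
    my ($line) = @_;
    $line =~ s/\[?\[\d+\](edit\])?//g;
    return $line;
}

# Sample lines are made up to resemble 'lynx -dump' output.
print strip_link_markers("THE BIOPERL BUNDLE[[5]edit]\n");
print strip_link_markers("see the [12]PLATFORMS file\n");
```

Both the numbered link references and the wiki section "edit" markers
should be removed.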

Brian O.


On 10/17/06 12:57 PM, "Barry Moore" <barry.moore at genetics.utah.edu> wrote:

> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix
> 
> does a reasonable job of textifying html.  You get the links as
> numbered references at the bottom or:
> 
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |
> perl -ane 's/\[?\[\d+\](edit\])?//g;print'
> 
> to remove the links all together.
> 
> Barry
> 
> P.S.  Looks like this:
> 
>     #Creative Commons copyright
> 
> Installing Bioperl for Unix
> 
>  From BioPerl
> 
>     Jump to: navigation, search
> 
> Contents
> 
>       * 1 BIOPERL INSTALLATION
>       * 2 SYSTEM REQUIREMENTS
>       * 3 OPTIONAL
>       * 4 ADDITIONAL INSTALLATION INFORMATION
>       * 5 THE BIOPERL BUNDLE
>       * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN
>       * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make'
>       * 8 WHERE ARE THE MAN PAGES?
>       * 9 EXTERNAL PROGRAMS
>            + 9.1 Environment Variables
>       * 10 INSTALLING BIOPERL SCRIPTS
>       * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA
>       * 12 INSTALLING BIOPERL MODULES THE HARD WAY
>       * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION
>       * 14 THE TEST SYSTEM
>       * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE
>            + 15.1 CONFIGURING for BSD and Solaris boxes
>            + 15.2 INSTALLATION
>          * 16 DEPENDENCIES AND Bundle::BioPerl
> 
> 
> BIOPERL INSTALLATION
> 
>     Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP,
>     and on Mac OS X (see the PLATFORMS file for more details).
> Following are
>     instructions  for  installing Bioperl for Unix/Linux/Mac OS X;
> Windows
>     installation instructions can be found here. For installing
> Bioperl for
>     Mac OS X using Fink, see Getting BioPerl.
> 
> 
> SYSTEM REQUIREMENTS
> 
>       * Perl 5.005 or later; version 5.6 and greater are recommended.
> Note
>         that most modules will work with earlier versions of Perl.
> The only ones
>         that will not are Bio::SimpleAlign and the Bio::Index::*
> modules. If
>         you don't need these modules and you want to install Bioperl
> using an
>         earlier version of Perl, edit the "require 5.005;" line in
> Makefile.PL
>         as necessary.
> 
>       * External modules: Bioperl uses functionality provided in
> other Perl
>         modules. Some of these are included in the standard perl
> package but
>         some  need to be obtained from the CPAN site. The list of
> external
>         modules is included at the bottom of this document.
> 
>     The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of
> these
>     external modules easy. Simply install the bundle using your CPAN
> shell and
>     all necessary modules will be installed. See THE BIOPERL BUNDLE,
> below.
> 
> 
> OPTIONAL
> 
>       * ANSI  C  or  GNU  C  compiler  (gcc)  for  XS  extensions (the
>         bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext
>         PACKAGE, below).
> 
> 
> 
> ADDITIONAL INSTALLATION INFORMATION
> 
>       * Additional information on Bioperl and MAC OS:
>            + OS 9 - http://bioperl.org/Core/mac-bioperl.html
>            + OS X - http://www.tc.umn.edu/~cann0010/Bioperl_OSX_install.html
>            + OS X - Installing using Fink (in Getting BioPerl)
> 
> 
> 
> THE BIOPERL BUNDLE
> 
>     You typically need root privileges to install using CPAN. If you
> don't
>     have these privileges please see INSTALLING BIOPERL IN A PERSONAL
> MODULE
>     AREA for additional information.
> 
>     Install Bundle::BioPerl using CPAN. One way:
>> perl -MCPAN -e "install Bundle::BioPerl"
> 
>     Another way:
>> perl -MCPAN -e shell
> cpan>install Bundle::BioPerl
> 
> 
> 
> On Oct 17, 2006, at 8:04 AM, Chris Fields wrote:
> 
>> There isn't a very easy way since so many links have to be removed/
>> modified.
>> I have found a few CPAN modules that could help, but for now I just
>> dump the
>> text output from a text browser (elinks) using the 'printable
>> version' page
>> and hand-edit, which works very quickly.  That works for the time
>> being
>> until I can find another more automated solution.
>> 
>> Fortunately there have been very few edits to either INSTALL wiki
>> page so
>> they should remain relatively stable.
>> 
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>> 
>>> -----Original Message-----
>>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
>>> Sent: Tuesday, October 17, 2006 6:46 AM
>>> To: Chris Fields
>>> Cc: bioperl-l
>>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>>> 
>>> Chris Fields wrote:
>>>> The general consensus was to keep text versions available; we could
>>>> add URL links to the wiki pages for the most up-to-date version.  BTW,
>>>> I have modified INSTALL already.  INSTALL.WIN is next in line (I was
>>>> waiting for your changes).
>>>> 
>>> Is it possible to generate these files from the wiki whenever there is a
>>> release? I know edits shouldn't be too severe or too often - but I can
>>> see things getting a little messy/annoying if edits have to be made in 2
>>> places.
>>> 
>>> Nath
>> 


From cjfields at uiuc.edu  Tue Oct 17 16:30:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 15:30:15 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
Message-ID: <0FB91820-B2A1-4F7F-866C-8D4791DD8306@uiuc.edu>

I can confirm this using bioperl-live:

GenBank:AL591065.17.17
/Users/cjfields/src/bioperl-live/Bio/Annotation/DBLink.pm

Could you file a bug report via bugzilla?

Chris

On Oct 17, 2006, at 2:01 PM, Erikjan wrote:

> Hello,
>
> I noticed a little problem with the Annotation "DBLink" from  
> GenBank entries
>
> When I run:
>
> perl -MBio::DB::GenBank -e 'my $gi =
> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations 
> ("dblink");
> for(@annotations) { print $_, "\n";} print $INC{
> "Bio/Annotation/DBLink.pm" }, "\n"; '
>
> This yields:
>
>    GenBank:AL591065.17.17
>
> and the place where the used Bio/Annotation/DBLink.pm resides.
>
> Can others repeat this?
>
> I have dug into the source a little and Bio::Annotation::DBLink  
> seems to
> be the place where this happens: it has a concatenation which leads to
> that repeated version number.
>
> Is this something that I should fix "client-side", so to speak, or is it
> worthwhile to add some logic to that concatenation to prevent this?
>
>
> Thanks,
>
> Eric
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From paul.boutros at utoronto.ca  Tue Oct 17 19:49:52 2006
From: paul.boutros at utoronto.ca (Paul Boutros)
Date: Tue, 17 Oct 2006 19:49:52 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <001401c6f21a$836f9fc0$15327e82@pyrimidine>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine>
Message-ID: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>

Hi Chris,

Yup, that's it.  I installed XML::SAX::ExpatXS (make test output  
below).  Should there be a note somewhere in the INSTALL docs saying  
basically what you just wrote?  Or maybe it's already there somewhere  
and I missed it.

Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks  
if DBD::mysql can be loaded, and if not doesn't run the test.  Since  
the file is only one-line long, here's the modified file rather than a  
patch:
################################################################
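BEGIN {
         # DBD::mysql is required; skip the whole file if it cannot be loaded.
         eval {
                 require DBD::mysql;
                 };
         if ( $@ ) {
                 # Call plan() at run time. A 'use Test::More skip_all => ...'
                 # here would be compiled unconditionally and always skip.
                 require Test::More;
                 Test::More::plan( skip_all => "DBD::mysql is not installed "
                     . "or is installed incorrectly - skipping "
                     . "BioDBSeqFeature_mysql.t" );
                 }
         }

system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 -dsn test";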
################################################################

And when I run it I get:
t/BioDBSeqFeature_mysql......skipped
         all skipped: DBD::mysql is not installed or is installed  
incorrectly - skipping BioDBSeqFeature_mysql.t

And for the overall make test:
All tests successful, 3 tests and 106 subtests skipped.
Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys =  
164.24 CPU)

Hope this helps,
Paul


Quoting Chris Fields <cjfields at uiuc.edu>:

> Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
> backend parser.  For some reason BLAST XML parsing doesn't work with that
> parser (it tries to verify the XML first before parsing, hence the DTD
> error).  I may try getting this to work again, but so far I haven't found an
> easy way to prevent XML verification via XML::SAX::Expat.
>
> There are two options: 1) install XML::SAX::ExpatXS (the better option),
> which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
> parser in your local ParserDetails.ini file to XML::SAX::PurePerl.
>
> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
> hasn't officially happened yet); the latter hasn't had significant
> development in about three years.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>


From cjfields at uiuc.edu  Tue Oct 17 20:51:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 19:51:35 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine>
	<20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
Message-ID: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>

On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote:

> Hi Chris,
>
> Yup, that's it.  I installed XML::SAX::ExpatXS (make test output
> below).  Should there be a note somewhere in the INSTALL docs saying
> basically what you just wrote?  Or maybe it's already there somewhere
> and I missed it.

The INSTALL docs should have this, yes.  I'll double-check though.

Pretty much anything that plugs into XML::SAX except XML::SAX::Expat  
works (XML::LibXML also works, I found).

> Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
> if DBD::mysql can be loaded, and if not doesn't run the test.  Since
> the file is only one-line long, here's the modified file rather than a
> patch:
> ################################################################
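> BEGIN {
>          # DBD::mysql is required; skip the whole file if it cannot be loaded.
>          eval {
>                  require DBD::mysql;
>                  };
>          if ( $@ ) {
>                  # Call plan() at run time. A 'use Test::More skip_all => ...'
>                  # here would be compiled unconditionally and always skip.
>                  require Test::More;
>                  Test::More::plan( skip_all => "DBD::mysql is not installed "
>                      . "or is installed incorrectly - skipping "
>                      . "BioDBSeqFeature_mysql.t" );
>                  }
>          }
>
> system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 -dsn test";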
> ################################################################
>
> And when I run it I get:
> t/BioDBSeqFeature_mysql......skipped
>          all skipped: DBD::mysql is not installed or is installed
> incorrectly - skipping BioDBSeqFeature_mysql.t
>
> And for the overall make test:
> All tests successful, 3 tests and 106 subtests skipped.
> Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys =
> 164.24 CPU)

It should check this when using 'perl Makefile.PL', since the tests  
are only set up if MySQL is present (so you would assume that it  
checks for DBD::mysql).  I'll look into it.
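
A check along those lines could look roughly like the sketch below.  It is
only an illustration: module_available and mysql_tests_wanted are
hypothetical names, not what Makefile.PL currently contains, and the
"MySQL was detected" flag is simply passed in as an argument.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded.
sub module_available {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;    # DBD::mysql -> DBD/mysql.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Hypothetical decision: set up the MySQL-backed tests only when a MySQL
# installation was detected AND the DBD::mysql driver actually loads.
sub mysql_tests_wanted {
    my ($mysql_detected) = @_;
    return $mysql_detected && module_available('DBD::mysql') ? 1 : 0;
}

print "DBD::mysql loadable: ",
    ( module_available('DBD::mysql') ? "yes" : "no" ), "\n";
```

With something like this in place the generated test file would never
exist on a machine without the driver, so it could not fail there.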

Chris

> Hope this helps,
> Paul
>
>
> Quoting Chris Fields <cjfields at uiuc.edu>:
>
>> Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
>> backend parser.  For some reason BLAST XML parsing doesn't work with that
>> parser (it tries to verify the XML first before parsing, hence the DTD
>> error).  I may try getting this to work again, but so far I haven't found an
>> easy way to prevent XML verification via XML::SAX::Expat.
>>
>> There are two options: 1) install XML::SAX::ExpatXS (the better option),
>> which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
>> parser in the ParserDetails.ini file of your local XML::SAX installation
>> to use XML::SAX::PurePerl.
>>
>> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
>> hasn't officially happened yet); the latter hasn't had significant
>> development in about three years.
>>
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>> -----Original Message-----
>>> From: Paul Boutros [mailto:paul.boutros at utoronto.ca]
>>> Sent: Tuesday, October 17, 2006 1:00 PM
>>> To: Chris Fields
>>> Cc: bioperl-l at lists.open-bio.org
>>> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2
>>>
>>> Hi Chris,
>>>
>>> Here it is:
>>> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
>>> 1..1337
>>> ok 1
>>>
>>> -------------------- WARNING ---------------------
>>> MSG: XML::SAX::Expat not currently supported; must have local copies
>>> of NCBI DTD docs!
>>> ---------------------------------------------------
>>>
>>> -------------------- WARNING ---------------------
>>> MSG: error in parsing a report:
>>>
>>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
>>> does not exist
>>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>>> Handler couldn't resolve external entity at line 2, column 82, byte 104
>>> error in processing external entity reference at line 2, column 82,
>>> byte 104 at
>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
>>> 187
>>>
>>> ---------------------------------------------------
>>> not ok 2
>>> # Failed test 2 in t/SearchIO.t at line 68
>>> Can't call method "database_name" on an undefined value at
>>> t/SearchIO.t line 69.
>>>
>>>
>>> Quoting Chris Fields <cjfields at uiuc.edu>:
>>>
>>>> What do you get when you run the SearchIO.t test by itself using
>>>> 'perl -I. t/SearchIO.t'?  It looks like something pretty catastrophic
>>>> happened.
>>>>
>>>> Christopher Fields
>>>> Postdoctoral Researcher - Switzer Lab
>>>> Dept. of Biochemistry
>>>> University of Illinois Urbana-Champaign
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>>>>> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
>>>>> Sent: Tuesday, October 17, 2006 11:57 AM
>>>>> To: bioperl-l at lists.open-bio.org
>>>>> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
>>>>>
>>>>> Hi,
>>>>> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
>>>>> tests, the first seems to be just a result of me not having DBD::mysql
>>>>> installed.
>>>>> Paul
>>>>>
>>>>> Test Summary
>>>>> ============
>>>>>
>>>>> Failed Test               Stat Wstat Total Fail  List of Failed
>>>>> -------------------------------------------------------------------------------
>>>>> t/BioDBSeqFeature_mysql.t               46   46  1-46
>>>>> t/SearchIO.t                22  5632  1337 2671  2-1337
>>>>> 2 tests and 106 subtests skipped.
>>>>> Failed 2/236 test scripts. 1382/11688 subtests failed.
>>>>> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =
>>>>> 159.61 CPU)
>>>>>
>>>>> BioDBSeqFeature_mysql
>>>>> =====================
>>>>> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
>>>>> 1..46
>>>>> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC
>>>>> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t
>>>>> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8
>>>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi
>>>>> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at
>>>>> (eval 37) line 3.
>>>>> Perhaps the DBD::mysql perl module hasn't been fully installed,
>>>>> or perhaps the capitalisation of 'mysql' isn't right.
>>>>> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
>>>>>   at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
>>>>>
>>>>> SearchIO
>>>>> ========
>>>>> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
>>>>> 1..1337
>>>>> ok 1
>>>>>
>>>>> -------------------- WARNING ---------------------
>>>>> MSG: XML::SAX::Expat not currently supported; must have local copies
>>>>> of NCBI DTD docs!
>>>>> ---------------------------------------------------
>>>>>
>>>>> -------------------- WARNING ---------------------
>>>>> MSG: error in parsing a report:
>>>>>
>>>>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
>>>>> does not exist
>>>>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>>>>> Handler couldn't resolve external entity at line 2, column 82, byte 104
>>>>> error in processing external entity reference at line 2, column 82,
>>>>> byte 104 at
>>>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
>>>>> 187
>>>>>
>>>>> ---------------------------------------------------
>>>>> not ok 2
>>>>> # Failed test 2 in t/SearchIO.t at line 68
>>>>> Can't call method "database_name" on an undefined value at
>>>>> t/SearchIO.t line 69.
>>>>>
>>>>> ------------------------------
>>>>>
>>>>> Message: 10
>>>>> Date: Tue, 17 Oct 2006 11:32:54 +0100
>>>>> From: Sendu Bala <bix at sendu.me.uk>
>>>>> Subject: [Bioperl-l] Bioperl 1.5.2 RC2
>>>>> To: bioperl-l at bioperl.org
>>>>> Message-ID: <4534B156.4090501 at sendu.me.uk>
>>>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>>>>
>>>>> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
>>>>> See http://www.bioperl.org/wiki/Release_1.5.2 for
>>>>> instructions on getting and testing this RC.
>>>>>
>>>>> Developers:
>>>>>     This should be the last RC before release ~next Monday. Now would
>>>>>     be a good time for last minute documentation updates and additions.
>>>>>
>>>>> Users:
>>>>>     Even though 1.5.2 is a 'developer' release, we consider it the most
>>>>>     stable and capable version of Bioperl, and recommend that you use
>>>>>     it in all but the most critical production environments. Please
>>>>>     try it out and let us know of any problems or difficulties you run
>>>>>     into.
>>>>>
>>>>>
>>>>> Thank you,
>>>>> Sendu.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Wed Oct 18 02:52:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 07:52:05 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4535CF15.4090502@sendu.me.uk>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
> See http://www.bioperl.org/wiki/Release_1.5.2 for
> instructions on getting and testing this RC.
> 
> Developers:
>    This should be the last RC before release ~next Monday. Now would
>    be a good time for last minute documentation updates and additions.

Given the few issues that have come up, it would be prudent to have 
another RC, so expect one around the time the 'Needs investigation' 
issues on the release page have been solved.

If you think there are more things that need investigation, please add 
them, but note the bias toward things that affect the successful 
completion of the test suite as opposed to general bugs which should go 
to Bugzilla as normal.

From bix at sendu.me.uk  Wed Oct 18 04:55:21 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 09:55:21 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45350BA6.3040102@genomics.dk>
References: <4534B156.4090501@sendu.me.uk>
	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>
	<45350BA6.3040102@genomics.dk>
Message-ID: <4535EBF9.1090706@sendu.me.uk>

Niels Larsen wrote:

> ------------ EBI
> 
> I invoked the EBI script
> 
> http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip
> 
> like this
> 
> WSWUBlastClient.pl -p blastn -D embl test.fasta
> 
> where the content of test.fasta is below, and got
> 
> Can't find method element in the message at 
> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

As you admit, this is not a Bioperl issue. I would suggest you contact 
EBI support.

In the meantime, or as an alternative, I'd suggest investigating the Bioperl
interface to the SOAP server, which is part of the Bioperl-run package.

http://doc.bioperl.org/releases/bioperl-current/bioperl-run/Bio/Tools/Run/Analysis.html


> ------------ DDBJ
> 
> Inspired by this page,
> 
> http://xml.nig.ac.jp/doc/Blast.txt
> 
> I made this test script
[snip]
> which for me prints undef.

Again, not something I can really help you with. You'll need to 
triple-check your code and then seek support from the providers of that 
SOAP service.


> ------------- NCBI/Bioperl
> 
> I installed 1.5.2-RC2, looked at the RemoteBlast example in
> 
> http://www.bioperl.org/wiki/Bptutorial.pl
> 
> and then put that into this test code, more or less cut/paste,
[snip]
> Maybe I am supposed to add a check for content in $rc and then stop
> the inner loop?

Yes, the wiki page example isn't really adequate. I'll update it. For a 
better code example see the RemoteBlast documentation:

http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html


> I could figure that out maybe, but I wish there was a
> function which simply takes a single sequence + arguments and only
> returns a list of matches when done, and does not return until then
> (or until a specified timeout).

Yes, I don't find dealing with RIDs very pleasant either. You might like to
add a feature request to Bugzilla.
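
A blocking wrapper of that kind is mostly a polling loop with a deadline.
The sketch below is generic Perl and is not part of RemoteBlast:
wait_for_result is a made-up name, and the callback is assumed to follow
the convention the RemoteBlast docs describe for retrieve_blast (a
reference once the report is ready, 0 while it is pending, a negative
number on error).

```perl
use strict;
use warnings;

# Hypothetical helper: poll $check until it returns a reference (done),
# a negative number (error), or until $timeout seconds have passed.
# Returns the result reference, or nothing on error or timeout.
sub wait_for_result {
    my ( $check, $timeout, $interval ) = @_;
    $interval = 5 unless defined $interval;
    my $deadline = time + $timeout;

    while (1) {
        my $rc = $check->();
        return $rc if ref $rc;          # finished: hand back the result
        return if $rc < 0;              # remote error: give up
        return if time >= $deadline;    # waited long enough
        sleep $interval if $interval > 0;
    }
}

# Usage with a stand-in callback that is "ready" on the third poll
my $polls  = 0;
my $result = wait_for_result( sub { ++$polls < 3 ? 0 : [ 'hit1', 'hit2' ] }, 10, 0 );
print "got ", scalar(@$result), " hits after $polls polls\n";
```

With a real factory the callback would presumably be something like
sub { $factory->retrieve_blast($rid) }, called once per submitted RID.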


From n.haigh at sheffield.ac.uk  Wed Oct 18 05:58:00 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 10:58:00 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
Message-ID: <4535FAA8.2050506@sheffield.ac.uk>

I get all tests passing except for BioDBSeqFeature_mysql which fails all
tests (1-46).

During perl Makefile.PL I get:
"I see you have Berkeleydb installed. I will create the DBD tests for
Bio::DB::SeqFeature::Store..."

I notice under the "needs investigation" there is mention about tests
being generated even if DBD::mysql isn't installed. I assume this is the
problem? If this is the problem, should DBD::mysql be added to the
dependencies in Makefile.PL?

Is there an easy way to find out what tests are being skipped due to
absent modules?

Cheers
Nath


From n.haigh at sheffield.ac.uk  Wed Oct 18 07:34:21 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 12:34:21 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4535EBF9.1090706@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>
	<4535EBF9.1090706@sendu.me.uk>
Message-ID: <4536113D.1080307@sheffield.ac.uk>

I've just added test results for 1.5.2 RC2 to the wiki.

There are lots of fails for packages other than bioperl-live. I'm not
sure exactly how the test fails/skips are/should be handled since my
setups are as follows.

Clean WinXP Pro:
This is a clean install of WinXP Pro SP2 with no major software
installed, other than ActivePerl 5.8.8.819 and a few tools for archive
extracting, anti virus etc. Therefore, I'm unsure how tests in
bioperl-network and bioperl-db should return. For example, I have made
no effort to set up biosql-schema but I thought that maybe there would be
a test that would detect this, and fail, then skip over other tests
gracefully - like the bioperl-run tests when a piece of software is not
installed???

Debian Linux:
This is a Bio-Linux machine with quite a lot of bioinformatics software
installed in the Path. So most of the tests in bioperl-run should
probably have passed. The same goes for bioperl-network and bioperl-db
as with my Windows setup.

If my thoughts are totally wrong - let me know!
Nath

From bix at sendu.me.uk  Wed Oct 18 08:03:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 13:03:11 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <4535FAA8.2050506@sheffield.ac.uk>
References: <4535FAA8.2050506@sheffield.ac.uk>
Message-ID: <453617FF.9080508@sendu.me.uk>

Nathan Haigh wrote:
> I get all tests passing except for BioDBSeqFeature_mysql which fails all
> tests (1-46).
> 
> During perl Makefile.PL I get:
> "I see you have Berkeleydb installed. I will create the DBD tests for
> Bio::DB::SeqFeature::Store..."
> 
> I notice under the "needs investigation" there is mention about tests
> being generated even if DBD::mysql isn't installed. I assume this is the
> problem? 

Probably. I'm looking into it. Not sure why it wasn't causing a problem 
before now.

> If this is the problem, should DBD::mysql be added to the
> dependencies in Makefile.PL?

No. You can use the modules in question without mysql (presumably; ie. 
you have a different sql setup), so it makes no sense to warn people 
they don't have a module they absolutely do not need.


> Is there an easy way to find out what tests are being skipped due to
> absent modules?

Ideally, when the skip occurs the test script will issue a message. I 
think that happens in most, if not all cases.

From bix at sendu.me.uk  Wed Oct 18 09:02:50 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 14:02:50 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <453617FF.9080508@sendu.me.uk>
References: <4535FAA8.2050506@sheffield.ac.uk> <453617FF.9080508@sendu.me.uk>
Message-ID: <453625FA.6090907@sendu.me.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> I notice under the "needs investigation" there is mention about tests
>> being generated even if DBD::mysql isn't installed. I assume this is the
>> problem? 
> 
> Probably. I'm looking into it. Not sure why it wasn't causing a problem 
> before now.
> 
>  > If this is the problem should DBD::mysql be added to the
>  > dependencies in Makefile.PL?
> 
> No. You can use the modules in question without mysql (presumably; ie. 
> you have a different sql setup), so it makes no sense to warn people 
> they don't have a module they absolutely do not need.

Oops. It /is/ in the pre-reqs in Makefile.PL. Maybe DBD::mysql is the 
only supported driver?

From bix at sendu.me.uk  Wed Oct 18 09:16:24 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 14:16:24 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine>	<20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
	<67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>
Message-ID: <45362928.8070104@sendu.me.uk>

Chris Fields wrote:
> On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote:
> 
>> Hi Chris,
>>
>> Yup, that's it.  I installed XML::SAX::ExpatXS (make test output
>> below).  Should there be a note somewhere in the INSTALL docs saying
>> basically what you just wrote?  Or maybe it's already there somewhere
>> and I missed it.
> 
> The INSTALL docs should have this, yes.  I'll double-check though.
> 
> Pretty much anything that plugs into XML::SAX except XML::SAX::Expat  
> works (XML::LibXML also works, I found).
> 
>> Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
>> if DBD::mysql can be loaded,
[snip]
> It should check this when using 'perl Makefile.PL', since the tests  
> are only set up if MySQL is present (so you would assume that it  
> checks for DBD::mysql).  I'll look into it.

This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in 
my t directory when I packed it up for release.

I'm tweaking Makefile.PL right now in any case; there are a few errors 
and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean.

From cjfields at uiuc.edu  Wed Oct 18 09:55:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 08:55:37 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
Message-ID: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>

Ding dong the witch is dead!  As announced previously, from the latest
GenBank release (156.0):

-----------------------------------------------

1.3.8 Feature location syntax X.Y no longer supported

  The Feature Table has supported feature locations of the form 'X.Y', to
represent a base position which is greater or equal to X, and less than or
equal to Y. For example:

	misc_feature    1.10..20
	misc_feature    join(100..150,200.210..250)

  In the first example, the misc_feature starts somewhere between bases 1
and 10 (inclusive), and ends at basepair 20. In the second, the 51 bases
from 100..150 are joined together with a second basepair interval, which
could be anywhere from 200..250 to 210..250 .

  Although this syntax seems like a reasonable way to capture an uncertain
interval, it is used for features on a vanishingly small number of sequence
records, most database submission mechanisms don't support it, and the
meaning of its use in a join() context is not entirely clear.

  As of October 2006, this type of location is no longer supported.
Those records with features which utilize X.Y locations will be reviewed and
converted to a non-uncertain format.

-----------------------------------------------
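
To make the X.Y semantics concrete, here is a small stand-alone sketch that
splits a simple location such as 1.10..20 into the range each end may fall
in.  It is only an illustration with a made-up function name
(parse_fuzzy_range); it does not handle join() or any other location
operator, and it is not BioPerl's own location parser.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "START..END" where either end may be a plain
# position (20) or an X.Y fuzzy position (1.10, i.e. somewhere in 1-10).
# Returns a hash ref of min/max bounds, or nothing if the string does
# not match.
sub parse_fuzzy_range {
    my ($loc) = @_;
    my $pos = qr/(\d+)(?:\.(\d+))?/;    # X or X.Y
    my ( $s1, $s2, $e1, $e2 ) = $loc =~ /^$pos\.\.$pos$/ or return;
    return {
        min_start => $s1,
        max_start => defined $s2 ? $s2 : $s1,
        min_end   => $e1,
        max_end   => defined $e2 ? $e2 : $e1,
    };
}

for my $loc ( '1.10..20', '200.210..250', '100..150' ) {
    my $r = parse_fuzzy_range($loc);
    printf "%-14s start %d-%d, end %d-%d\n", $loc,
        @{$r}{qw(min_start max_start min_end max_end)};
}
```
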

EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
Not sure about UniProt/SwissProt.

I guess we're keeping this in for backwards compatibility only, but how do
we handle any bugs that pop up related to this?  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Wed Oct 18 10:10:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 09:10:07 -0500
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <453617FF.9080508@sendu.me.uk>
Message-ID: <001f01c6f2bf$20737270$15327e82@pyrimidine>

> Nathan Haigh wrote:
> > I get all tests passing except for BioDBSeqFeature_mysql which fails all
> > tests (1-46).
> >
> > During perl Makefile.PL I get:
> > "I see you have Berkeleydb installed. I will create the DBD tests for
> > Bio::DB::SeqFeature::Store..."
> >
> > I notice under the "needs investigation" there is mention about tests
> > being generated even if DBD::mysql isn't installed. I assume this is the
> > problem?
> 
> Probably. I'm looking into it. Not sure why it wasn't causing a problem
> before now.

Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP
because 'perl Makefile.PL' doesn't detect my MySQL installation, so the
MySQL-based tests don't run even though I have DBD::mysql installed.  I
thought this might just be a WinXP issue, but apparently not.  If I can get
to it I'll run a few checks.

>  > If this is the problem should DBD::mysql be added to the
>  > dependencies in Makefile.PL?
> 
> No. You can use the modules in question without mysql (presumably; ie.
> you have a different sql setup), so it makes no sense to warn people
> they don't have a module they absolutely do not need.

Agreed, though I don't know whether other relational DBs, such as
PostgreSQL, are supported.

> > Is there an easy way to find out what tests are being skipped due to
> > absent modules?
> 
> Ideally, when the skip occurs the test script will issue a message. I
> think that happens in most, if not all cases.

Yes, though we may run into the same issue we had with the XEMBL tests not
reporting why they were skipped.  Each test script should run an eval{} to
check for the required modules, then skip only the blocks of tests that rely
on those modules.  I think we have caught most of those, but who knows w/o
doing a complete test suite audit?

Our eventual complete switchover to Test::More should hopefully clean these
up.  I don't consider it a pressing issue for this release, though Sendu may
feel differently.
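
As a rough illustration of that pattern with Test::More (the module name
and the test counts below are placeholders, not taken from any real
BioPerl test script):

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Check the optional dependency once, up front.
# 'Some::Optional::Module' is a placeholder, not a real BioPerl requirement.
my $have_optional = eval { require Some::Optional::Module; 1 };

ok( 1, 'test that needs no optional modules' );

SKIP: {
    # Skip only the block that depends on the module, and say why.
    skip( 'Some::Optional::Module not installed', 2 ) unless $have_optional;

    ok( Some::Optional::Module->can('new'), 'constructor exists' );
    ok( 1, 'another test using the module' );
}
```

The skip reason ends up in the TAP output next to each skipped test, so the
rest of the script still runs and nothing is dropped silently.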

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Wed Oct 18 10:12:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 09:12:52 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45362928.8070104@sendu.me.uk>
Message-ID: <002001c6f2bf$807849c0$15327e82@pyrimidine>

...
> This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in
> my t directory when I packed it up for release.
> 
> I'm tweaking Makefile.PL right now in any case; there are a few errors
> and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean.

Okay, makes sense now.  No big deal, it's still an RC (a developer's RC at
that!).

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From n.haigh at sheffield.ac.uk  Wed Oct 18 10:17:35 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 15:17:35 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <001f01c6f2bf$20737270$15327e82@pyrimidine>
References: <001f01c6f2bf$20737270$15327e82@pyrimidine>
Message-ID: <4536377F.6000408@sheffield.ac.uk>

Chris Fields wrote:
>> Nathan Haigh wrote:
>>     
>>> I get all tests passing except for BioDBSeqFeature_mysql which fails all
>>> tests (1-46).
>>>
>>> During perl Makefile.PL I get:
>>> "I see you have Berkeleydb installed. I will create the DBD tests for
>>> Bio::DB::SeqFeature::Store..."
>>>
>>> I notice under the "needs investigation" there is mention about tests
>>> being generated even if DBD::mysql isn't installed. I assume this is the
>>> problem?
>>>       
>> Probably. I'm looking into it. Not sure why it wasn't causing a problem
>> before now.
>>     
>
> Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP
> because 'perl Makefile.PL' doesn't detect my MySQL installation, so the
> MySQL-based tests don't run even though I have DBD::mysql installed.  I
> thought this might just be a WinXP issue, but apparently not.  If I can get
> to it I'll run a few checks.
>
>   
This was on WinXP.
>>  > If this is the problem should DBD::mysql be added to the
>>  > dependencies in Makefile.PL?
>>
>> No. You can use the modules in question without mysql (presumably; ie.
>> you have a different sql setup), so it makes no sense to warn people
>> they don't have a module they absolutely do not need.
>>     
>
> Agreed, though I don't know if other relational DB's are supported like
> PostgreSQL.
>
>   
>>> Is there an easy way to find out what tests are being skipped due to
>>> absent modules?
>>>       
>> Ideally, when the skip occurs the test script will issue a message. I
>> think that happens in most, if not all cases.
>>     
>
> Yes, though we may run into the same issue we had with XEMBL tests not
> reporting the reasons it skipped.  Each test suite should run an eval{} to
> check the required modules, then only skip blocks of tests that rely on
> those modules.  I think we have caught most of those, but who knows w/o
> doing a complete test suite audit?
>
> Our eventual complete switchover to Test::More should hopefully clean these
> up.  I don't consider it a pressing issue for this release, though Sendu may
> feel differently.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>
>   


From hlapp at gmx.net  Wed Oct 18 10:36:31 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 18 Oct 2006 10:36:31 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>
References: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>
Message-ID: <B8036AB0-741F-427A-9EB1-7E80A28EC79F@gmx.net>


On Oct 18, 2006, at 9:55 AM, Chris Fields wrote:

> how do we handle any bugs that pop up related to this?

By an evil grin, followed by deflecting the blame to NCBI, followed  
by another evil grin.
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Oct 18 10:43:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 09:43:31 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
In-Reply-To: <B8036AB0-741F-427A-9EB1-7E80A28EC79F@gmx.net>
Message-ID: <002401c6f2c3$c83c7e30$15327e82@pyrimidine>

> On Oct 18, 2006, at 9:55 AM, Chris Fields wrote:
> 
> > how do we handle any bugs that pop up related to this?
> 
> By an evil grin, followed by deflecting the blame to NCBI, followed
> by another evil grin.
> --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Sounds good to me!  One less thing to worry about.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From n.haigh at sheffield.ac.uk  Wed Oct 18 10:45:57 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 15:45:57 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4536113D.1080307@sheffield.ac.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>	<4535EBF9.1090706@sendu.me.uk>
	<4536113D.1080307@sheffield.ac.uk>
Message-ID: <45363E25.8010806@sheffield.ac.uk>

Nathan Haigh wrote:
> I've just added test results for 1.5.2 RC2 to the wiki.
>
> There are lots of fails for packages other than bioperl-live. I'm not
> sure exactly how the test fails/skips are/should be handled since my
> setups are as follows.
>
> Clean WinXP Pro:
> This is a clean install of WinXP Pro SP2 with no major software
> installed, other than ActivePerl 5.8.8.819 and a few tools for archive
> extracting, anti virus etc. Therefore, I'm unsure how tests in
> bioperl-network and bioperl-db should return. For example, I have made
> no effort to set up biosql-schema but I thought that maybe there would be
> a test that would detect this, and fail, then skip over other tests
> gracefully - like the bioperl-run tests when a piece of software is not
> installed???
>
> Debian Linux:
> This is a Bio-Linux machine with quite a lot of bioinformatics software
> installed in the Path. So most of the tests in bioperl-run should
> probably have passed. The same goes for bioperl-network and bioperl-db
> as with my Windows setup.
>
> If my thoughts are totally wrong - let me know!
> Nath
>   
Just looking into the failed Linux tests.

Several of the tests result in errors like:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Alignment::Exonerate::AUTOLOAD
/home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:126
STACK: Bio::Tools::Run::Alignment::Exonerate::new
/home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:154
STACK: t/Exonerate.t:32
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: 'arguments' !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Hmmer::AUTOLOAD Bio/Tools/Run/Hmmer.pm:172
STACK: Bio::Tools::Run::Hmmer::_run Bio/Tools/Run/Hmmer.pm:253
STACK: Bio::Tools::Run::Hmmer::run Bio/Tools/Run/Hmmer.pm:228
STACK: t/Hmmer.t:54
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Phrap::AUTOLOAD Bio/Tools/Run/Phrap.pm:137
STACK: Bio::Tools::Run::Phrap::new Bio/Tools/Run/Phrap.pm:165
STACK: t/Phrap.t:34
-----------------------------------------------------------

Any ideas??

Nath


From hlapp at gmx.net  Wed Oct 18 10:51:36 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 18 Oct 2006 10:51:36 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4536113D.1080307@sheffield.ac.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>
	<4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk>
Message-ID: <E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>


On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote:

>  For example, I have made
> no effort to setup biosql-schema but I thought that maybe there  
> would be
> a test that would detect this

I'm afraid there isn't. Bioperl-db is meaningless without biosql-schema.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bosborne11 at verizon.net  Wed Oct 18 10:43:06 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 10:43:06 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
 GenBank/EMBL/DDBJ
In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>
Message-ID: <C15BB5BA.ADAA%bosborne11@verizon.net>

Chris,

I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all
of the more recent examples in t/LocationFactory.t come from there.

Brian O.


On 10/18/06 9:55 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
> Not sure about UniProt/SwissProt.


From cjfields at uiuc.edu  Wed Oct 18 11:00:30 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 10:00:30 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
In-Reply-To: <C15BB5BA.ADAA%bosborne11@verizon.net>
Message-ID: <002501c6f2c6$27625540$15327e82@pyrimidine>

Do they still use the X.Y notations?  Those are the most troublesome.  I
guess we still don't support the ones containing '?'.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: Brian Osborne [mailto:bosborne11 at verizon.net]
> Sent: Wednesday, October 18, 2006 9:43 AM
> To: Chris Fields; bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in
> GenBank/EMBL/DDBJ
> 
> Chris,
> 
> I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all
> of the more recent examples in t/LocationFactory.t come from there.
> 
> Brian O.
> 
> 
> On 10/18/06 9:55 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:
> 
> > EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
> > Not sure about UniProt/SwissProt.


From Kevin.M.Brown at asu.edu  Wed Oct 18 11:16:50 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 18 Oct 2006 08:16:50 -0700
Subject: [Bioperl-l] Blast information
Message-ID: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>

I just recently upgraded to 1.5.1 on WinXP to bring this version closer
to live to parse some locally created blast files.  I'm trying to find
the method that returns the values that are underneath the Identities
and Positives information as I'm trying to replicate the output of an
old blast parser we have here written in RealBasic which is showing its
age.  Once I have it replicating the old output I then intend to add
more features in terms of filtering returned hits (like not returning
self->self hits or a->b so don't show b->a).

Example:
I'm looking for the methods that will return 117 from identities and 117
from positives.  I can't just use num_identical/percent_identity as that
isn't 100% accurate.

>BurkM_2016
          Length = 241

 Score = 43.2 bits (88), Expect = 7e-005
 Identities = 26/117 (22%), Positives = 51/117 (43%)

Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL 357
           Q   F  F  + A+    ++ +         + + L +R   GL   + P   E + A+L
Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL 170

Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
              A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227

Thanks,
Kevin


From cjfields at uiuc.edu  Wed Oct 18 11:25:59 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 10:25:59 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4536113D.1080307@sheffield.ac.uk>
Message-ID: <002601c6f2c9$b6d04a90$15327e82@pyrimidine>

> I've just added test results for 1.5.2 RC2 to the wiki.
> 
> There are lots of fails for packages other than bioperl-live. I'm not
> sure exactly how the test fails/skips are/should be handled since my
> setups are as follows.
> 
> Clean WinXP Pro:
> This is a clean install of WinXP Pro SP2 with no major software
> installed, other than ActivePerl 5.8.8.819 and a few tools for archive
> extracting, anti virus etc. Therefore, I'm unsure how tests in
> bioperl-network and bioperl-db should return. For example, I have made
> no effort to setup biosql-schema but I thought that maybe there would be
> a test that would detect this, and fail, then skip over other tests
> gracefully - like the bioperl-run tests when a piece of software is not
> installed???
> 
> Debian Linux:
> This is a Bio-Linux machine with quite a lot of bioinformatics software
> installed in the Path. So most of the tests in bioperl-run should
> probably have passed. The same goes for bioperl-network and bioperl-db
> as with my Windows setup.
> 
> If my thoughts are totally wrong - let me know!
> Nath

The bioperl-db tests rely on a local BioSQL database and on having a
properly set up configuration file (these are detailed in the bioperl-db
INSTALL doc).  Furthermore, there are serious problems with bioperl-db and
WinXP (see Bug 1938 in bugzilla).  There is a workaround, but it isn't
perfect by any means.  

http://bugzilla.open-bio.org/show_bug.cgi?id=1938

Many of the bioperl-run tests rely on env. variables being set properly, so
maybe that's why they failed.  These should all be detailed in the INSTALL
file (but maybe they aren't?).

I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac OS
X yet but intended on doing this within the week.  The INSTALL file details
the requirements for the packages (Graph 0.80 is the only one for
bioperl-network, for instance, and there isn't a PPM for that version
available yet).  

It would be nice to skip the tests based on absence of the particular
modules or installed programs, and I think the final goal is to possibly
attempt to do this.  However, all of the bioperl-related distributions have
their own documentation which outlines their installation, requirements, and
use.  At least we can point to that, which works for now.  We could always
start up a wiki page for the various bioperl distributions to monitor
problems or issues with each based on OS, proposed enhancements/ideas, etc.
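
As a rough illustration of skipping on a missing prerequisite, here is a
minimal Test::More sketch.  It is not taken from any bioperl test file;
have_module() is a hypothetical helper, and the two Graph checks are
placeholders for whatever a real test would exercise.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical helper: true if $module loads and, when a minimum
# version is given, is at least that version.
sub have_module {
    my ($module, $min_version) = @_;
    return eval {
        (my $file = "$module.pm") =~ s{::}{/}g;
        require $file;
        $module->VERSION($min_version) if defined $min_version;
        1;
    } ? 1 : 0;
}

SKIP: {
    # Skip both tests instead of failing when Graph is absent or too old.
    skip 'Graph 0.80 or later is not installed', 2
        unless have_module('Graph', '0.80');

    my $g = Graph->new;
    $g->add_edge('a', 'b');
    my @vertices = $g->vertices;
    ok($g->has_edge('a', 'b'), 'edge stored');
    is(scalar @vertices, 2, 'two vertices');
}
```

With this layout a missing module shows up as skipped tests in the
Test::Harness summary rather than as a failure.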


Also, most (if not all, including core) have been primarily tested on some
*nix-related system, which means that they may not work on Win32 systems.
Though the Windows support is light-years ahead of what it used to be circa
rel 0.7, I don't think it is foolproof yet, as witnessed by the bioperl-db
bug.  Frankly, we need more WinXP users for those packages willing to test
them out and offer suggestions.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bosborne11 at verizon.net  Wed Oct 18 11:13:51 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 11:13:51 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
 GenBank/EMBL/DDBJ
In-Reply-To: <002501c6f2c6$27625540$15327e82@pyrimidine>
Message-ID: <C15BBCEF.ADB8%bosborne11@verizon.net>

Chris,

No, I don't think they use the form X.Y. See below, from
t/LocationFactory.t: we do support most of the forms using ?. Supposedly
these tests accommodate all of the possible fuzzy locations encountered in
Swissprot; I wrote these a year or so ago.

Brian O.


        # UNCERTAIN locations and positions (Swissprot)
   "?2465..2774" => [$fuzzy_impl,
       2465, 2465, "UNCERTAIN", 2774, 2774, "EXACT", "EXACT", 1, 1],
   "22..?64" => [$fuzzy_impl,
       22, 22, "EXACT", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
   "?22..?64" => [$fuzzy_impl,
       22, 22, "UNCERTAIN", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
   "?..>393" => [$fuzzy_impl,
       undef, undef, "UNCERTAIN", 393, undef, "AFTER", "UNCERTAIN", 1, 1],
   "<1..?" => [$fuzzy_impl,
       undef, 1, "BEFORE", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
   "?..536" => [$fuzzy_impl,
       undef, undef, "UNCERTAIN", 536, 536, "EXACT", "UNCERTAIN", 1, 1],
   "1..?" => [$fuzzy_impl,
       1, 1, "EXACT", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
   "?..?" => [$fuzzy_impl,
       undef, undef, "UNCERTAIN", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
   # Not working yet:
   #"12..?1" => [$fuzzy_impl,
   #    1, 1, "UNCERTAIN", 12, 12, "EXACT", "EXACT", 1, 1]
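
For readers unfamiliar with the table above, here is a hedged sketch of
how such an entry could be checked by hand.  It assumes the first eight
columns are the location class, min_start, max_start, start_pos_type,
min_end, max_end, end_pos_type and location_type, in that order; the
last two columns are not covered.  location_summary() is a hypothetical
helper name.

```perl
use strict;
use warnings;

# Collect the fields that the first eight table columns appear to
# describe.  $loc is expected to be a Bio::LocationI object.
sub location_summary {
    my ($loc) = @_;
    return [
        ref $loc,
        $loc->min_start, $loc->max_start, $loc->start_pos_type,
        $loc->min_end,   $loc->max_end,   $loc->end_pos_type,
        $loc->location_type,
    ];
}

# Only load BioPerl when location strings are given on the command line.
if (@ARGV) {
    require Bio::Factory::FTLocationFactory;
    my $factory = Bio::Factory::FTLocationFactory->new;
    for my $string (@ARGV) {
        my $summary = location_summary($factory->from_string($string));
        print join("\t", $string,
                   map { defined $_ ? $_ : 'undef' } @$summary), "\n";
    }
}
```

Run it with a quoted location string such as '?2465..2774' as the
argument and compare the printed fields with the table row.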


On 10/18/06 11:00 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> Do they still use the X.Y notations?  Those are the most troublesome.  I
> guess we still don't support the ones containing '?'.
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
>> -----Original Message-----
>> From: Brian Osborne [mailto:bosborne11 at verizon.net]
>> Sent: Wednesday, October 18, 2006 9:43 AM
>> To: Chris Fields; bioperl-l at lists.open-bio.org
>> Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in
>> GenBank/EMBL/DDBJ
>> 
>> Chris,
>> 
>> I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all
>> of the more recent examples in t/LocationFactory.t come from there.
>> 
>> Brian O.
>> 
>> 
>> On 10/18/06 9:55 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:
>> 
>>> EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
>>> Not sure about UniProt/SwissProt.
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Wed Oct 18 12:56:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 11:56:07 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <002601c6f2c9$b6d04a90$15327e82@pyrimidine>
Message-ID: <000401c6f2d6$5144e2f0$15327e82@pyrimidine>

All,

...
> I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac
> OS X yet but intended on doing this within the week.  The INSTALL file
> details the requirements for the packages (Graph 0.80 is the only one for
> bioperl-network, for instance, and there isn't a PPM for that version
> available yet).
...

As a follow-up on this, I tried bioperl-network and had similar failed tests
with Graph 0.79 (the only PPM available from ActiveState).  However, the
INSTALL docs state that Graph 0.80 is needed, and the test run gave several
warnings about not having Graph 0.80 installed. 

I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and
everything passed.  Maybe we need to have a Graph PPM available for those
who want bioperl-network?

As for bioperl-run, all tests passed from a new CVS checkout even though I
have none of the programs installed, so they seem to skip properly.  The
test run also printed warnings when a program wasn't available or installed.


Chris


From bosborne11 at verizon.net  Wed Oct 18 13:10:34 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 13:10:34 -0400
Subject: [Bioperl-l] Blast information
In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
Message-ID: <C15BD84A.ADCC%bosborne11@verizon.net>

Kevin,

Are you looking for hsp_length()? See the SearchIO HOWTO for a list of
methods:

http://www.bioperl.org/wiki/HOWTO:SearchIO


Brian O.


On 10/18/06 11:16 AM, "Kevin Brown" <Kevin.M.Brown at asu.edu> wrote:

> I just recently upgraded to 1.5.1 on WinXP to bring this version closer
> to live to parse some locally created blast files.  I'm trying to find
> the method that returns the values that are underneath the Identities
> and Positives information as I'm trying to replicate the output of an
> old blast parser we have here written in RealBasic which is showing its
> age.  Once I have it replicating the old output I then intend to add
> more features in terms of filtering returned hits (like not returning
> self->self hits or a->b so don't show b->a).
> 
> Example:
> I'm looking for the methods that will return 117 from identities and 117
> from positives.  I can't just use num_identical/percent_identity as that
> isn't 100% accurate.
> 
>> BurkM_2016
>           Length = 241
> 
>  Score = 43.2 bits (88), Expect = 7e-005
>  Identities = 26/117 (22%), Positives = 51/117 (43%)
> 
> Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL 357
>            Q   F  F  + A+    ++ +         + + L +R   GL   + P   E + A+L
> Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL 170
> 
> Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
>               A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
> Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227
> 
> Thanks,
> Kevin
> 


From Kevin.M.Brown at asu.edu  Wed Oct 18 17:25:48 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 18 Oct 2006 14:25:48 -0700
Subject: [Bioperl-l] Blast information
Message-ID: <1A4207F8295607498283FE9E93B775B4022A71C3@EX02.asurite.ad.asu.edu>

Yes, that does indeed look like what I was after. 

> -----Original Message-----
> From: Brian Osborne [mailto:bosborne11 at verizon.net] 
> Sent: Wednesday, October 18, 2006 10:11 AM
> To: Kevin Brown; bioperl-l
> Subject: Re: [Bioperl-l] Blast information
> 
> Kevin,
> 
> Are you looking for hsp_length()? See the SearchIO HOWTO for a list of
> methods:
> 
> http://www.bioperl.org/wiki/HOWTO:SearchIO
> 
> 
> Brian O.
> 
> 


From n.appleby at uq.edu.au  Wed Oct 18 17:58:06 2006
From: n.appleby at uq.edu.au (Nikki Appleby)
Date: Thu, 19 Oct 2006 07:58:06 +1000
Subject: [Bioperl-l] CONTIG dealing
Message-ID: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>


I have just entered the wonderful new world of BioPerl, so the answer to my
question may be obvious to any of the gurus reading this.

I need to collect sequence features and ontology annotations. Here goes.

I am retrieving sequences from SwissProt via Bio::DB::SwissProt and
get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into an RDBMS
format that I am happy with I can get at the xref ids. In this case, they
are 

AP003451; BAB86144.1; -; Genomic_DNA. 
AP008207; BAF07116.1; -; Genomic_DNA. 
AB103395; BAC81207.1; -; mRNA. 

I can happily go off and fetch those from Bio::DB::GenBank (first column),
and Bio::DB::GenPept (second). All good, except...

AP008207 is a contig. I don't want to get all of the features for the entire
thing, just the single contig that actually matches the original sequence.
It takes a couple of hours to get at it and then it gives me way too much.

I will come across this problem with other sequences. How do I (a) find out
if it is a contig without downloading it in its entirety and (b) extract
the list of sequences that are about to be contigged together.

I have searched the web for answers, including this list, but see nothing.
Help!
 
Nikki Appleby.


From bosborne11 at verizon.net  Wed Oct 18 20:54:04 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 20:54:04 -0400
Subject: [Bioperl-l] LocatableSeq object vs Sequence Object
In-Reply-To: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com>
Message-ID: <C15C44EC.ADF8%bosborne11@verizon.net>

Peter,

I'm not understanding your question, partly because your letter and your
code are saying different things. You say you want to call
location_from_column() but your code shows you calling species(). What
happens when you call location_from_column? Do you see errors?

Brian O.


On 10/17/06 12:26 PM, "Peter H. Baenziger" <plu5even at gmail.com> wrote:

> I was thinking I could use:
> foreach $seq ($alignment->each_seq())
> to loop through the sequences and call:
> $seq->location_from_column($pos)
> on each of the sequences.  


From cjfields at uiuc.edu  Wed Oct 18 22:46:14 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 21:46:14 -0500
Subject: [Bioperl-l] CONTIG dealing
In-Reply-To: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>
References: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>
Message-ID: <FAEAE9E1-EF95-4B79-AD75-B54D3E24E827@uiuc.edu>

On Oct 18, 2006, at 4:58 PM, Nikki Appleby wrote:

>
> I have just entered the wonderful new world of BioPerl, so the  
> answer to my
> question may be obvious to any of the gurus reading this.
>
> I need to collect sequence features and ontology annotations. Here  
> goes.
>
> I am retrieving sequences from SwissProt via Bio::DB::SwissProt and
> get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into  
> an RDBMS
> format that I am happy with I can get at the xref ids. In this  
> case, they
> are
>
> AP003451; BAB86144.1; -; Genomic_DNA.
> AP008207; BAF07116.1; -; Genomic_DNA.
> AB103395; BAC81207.1; -; mRNA.
>
> I can happily go off and fetch those from Bio::DB::GenBank (first  
> column),
> and Bio::DB::GenPept (second). All good, except...
>
> AP008207 is a contig. I don't want to get all of the features for  
> the entire
> thing, just the single contig that actually matches the original  
> sequence.
> It takes a couple of hours to get at it and then it gives me way  
> too much.
>
> I will come across this problem with other sequences. How do I (a)  
> find out
> if it is a contig without downloading it in its entirety and (b)
> extract
> the list of sequences that are about to be contigged together.
>
> I have searched the web for answers, including this list, but see  
> nothing.
> Help!
>
> Nikki Appleby.

The default setting for the retrieval format for GenBank is  
'gbwithparts' (which gets the full sequence at all times).  You can  
set this to 'gb' using request_format() to retrieve the sequence file  
with the contig information instead of the sequence, if it contains  
such (otherwise it just retrieves the sequence anyway).
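
A minimal sketch of the request_format() call described above.
fetch_gb_record() is a hypothetical helper name, and the accessions are
whatever you pass on the command line; nothing here is specific to the
record under discussion.

```perl
use strict;
use warnings;

# Fetch one record in plain 'gb' format instead of the default
# 'gbwithparts'.  $db is expected to be a Bio::DB::GenBank object.
sub fetch_gb_record {
    my ($db, $accession) = @_;
    $db->request_format('gb');
    return $db->get_Seq_by_acc($accession);
}

# Only load BioPerl (and hit the network) when accessions are given.
if (@ARGV) {
    require Bio::DB::GenBank;
    my $db = Bio::DB::GenBank->new;
    for my $accession (@ARGV) {
        my $seq      = fetch_gb_record($db, $accession);
        my @features = $seq->get_SeqFeatures;
        printf "%s: %d features\n", $seq->display_id, scalar @features;
    }
}
```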

However, I have noticed this particular file does not represent a  
true contig record but is the entire chromosome sequence.  The contig  
information is in the comments section, probably b/c the record is  
converted over.  You could just download the sequence record and run  
regexp to grab the comments section, then parse out the contigs (a  
pain) if you really want that.  Or you could try to find the  
equivalent GenBank record, such as the ones derived from the WGS  
records.

I did notice the list of dbxrefs in your swissprot record indicate  
three EMBL sequences.  If the order is consistent for the SwissProt  
entries you want, they probably represent:

The contig (what you want): AP003451; BAB86144.1; -; Genomic_DNA.
The supercontig (chromosome) : AP008207; BAF07116.1; -; Genomic_DNA.
The cDNA : AB103395; BAC81207.1; -; mRNA.

I checked the first one (AP003451), which seems to confirm this.

Since the chromosome supercontig is built from the smaller sequence  
contigs you could just grab the first EMBL dbxref instead of all of  
them.  It parses much faster than the chromosome file.
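
Picking out that first EMBL dbxref could look roughly like this.  The
sketch assumes the SwissProt parser stores each DR line as a 'dblink'
annotation whose database() is 'EMBL' and whose primary_id() is the
first accession on the line, and that the annotations come back in file
order; first_embl_accession() is a hypothetical helper name.

```perl
use strict;
use warnings;

# Return the primary id of the first EMBL cross-reference on a
# sequence object, or nothing if there is none.
sub first_embl_accession {
    my ($seq) = @_;
    for my $link ($seq->annotation->get_Annotations('dblink')) {
        return $link->primary_id if uc $link->database eq 'EMBL';
    }
    return;
}

# Only load BioPerl (and hit the network) when ids are given.
if (@ARGV) {
    require Bio::DB::SwissProt;
    my $sp = Bio::DB::SwissProt->new;
    for my $id (@ARGV) {
        my $seq = $sp->get_Seq_by_id($id);
        my $acc = first_embl_accession($seq);
        print "$id\t", (defined $acc ? $acc : 'no EMBL dbxref'), "\n";
    }
}
```

The accession returned can then be fetched on its own instead of
downloading every cross-referenced record.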

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Wed Oct 18 11:47:14 2006
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Oct 2006 08:47:14 -0700
Subject: [Bioperl-l] Blast information
In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
References: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
Message-ID: <6B7D24F3-69F1-498D-AB53-B4CEB14E4F3D@bioperl.org>

I think this will work for you.

The seq_inds method parses the middle homology sequence, classifies
each alignment column, and returns a list of the columns meeting the
criteria.  You can interrogate either query or hit in this case, since
you are requiring the columns to be identical.

my $identicalbases = scalar $hsp->seq_inds('query', 'identical');
my $conservedbases =  scalar $hsp->seq_inds('query','conserved');

'conserved' returns the columns that are identical or conserved; if you
want just those with conservative replacements, use
'conserved-not-identical'.

See http://bioperl.org/wiki/HOWTO:SearchIO#Table_of_Methods for more  
info.
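
Put together, a loop over a report might look like the sketch below.
hsp_counts() is a hypothetical helper name, the report file name comes
from the command line, and length('total') is assumed to give the
alignment length shown as the denominator (117 in the example).

```perl
use strict;
use warnings;

# Summarise one HSP the way the "Identities"/"Positives" line does.
# $hsp is expected to be a Bio::Search::HSP::HSPI object.
sub hsp_counts {
    my ($hsp) = @_;
    my @identical = $hsp->seq_inds('query', 'identical');
    my @conserved = $hsp->seq_inds('query', 'conserved');
    return {
        identical => scalar @identical,
        conserved => scalar @conserved,
        length    => $hsp->length('total'),
    };
}

# Only load BioPerl when a report file is given on the command line.
if (@ARGV) {
    require Bio::SearchIO;
    my $in = Bio::SearchIO->new(-format => 'blast', -file => $ARGV[0]);
    while (my $result = $in->next_result) {
        while (my $hit = $result->next_hit) {
            while (my $hsp = $hit->next_hsp) {
                my $c = hsp_counts($hsp);
                printf "%s\tidentities %d/%d\tpositives %d/%d\n",
                    $hit->name,
                    $c->{identical}, $c->{length},
                    $c->{conserved}, $c->{length};
            }
        }
    }
}
```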

-jason
On Oct 18, 2006, at 8:16 AM, Kevin Brown wrote:

> I just recently upgraded to 1.5.1 on WinXP to bring this version  
> closer
> to live to parse some locally created blast files.  I'm trying to find
> the method that returns the values that are underneath the Identities
> and Positives information as I'm trying to replicate the output of an
> old blast parser we have here written in RealBasic which is showing  
> its
> age.  Once I have it replicating the old output I then intend to add
> more features in terms of filtering returned hits (like not returning
> self->self hits or a->b so don't show b->a).
>
> Example:
> I'm looking for the methods that will return 117 from identities  
> and 117
> from positives.  I can't just use num_identical/percent_identity as  
> that
> isn't 100% accurate.
>
>> BurkM_2016
>           Length = 241
>
>  Score = 43.2 bits (88), Expect = 7e-005
>  Identities = 26/117 (22%), Positives = 51/117 (43%)
>
> Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL 357
>            Q   F  F  + A+    ++ +         + + L +R   GL   + P   E + A+L
> Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL 170
>
> Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
>               A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
> Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227
>
> Thanks,
> Kevin
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From jason at bioperl.org  Thu Oct 19 01:00:28 2006
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Oct 2006 22:00:28 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
Message-ID: <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>

So I'm unsure what we should do here.

We can certainly fix the problem which you report which is relying on  
the "" method -- if you were to do instead:
print $_->database, ":", $_->primary_id, "\n";

you'll get the right answer.  At a minimum, we can just fix the
auto-string-converting method to do The Right Thing.

But I am not sure if we should keep the version out of the primary_id  
field.  This will require some rejiggering in several modules when it  
comes to printing DBlinks and I don't want to do this before the  
release. I also am not sure if there was an explicit reason why  
someone did put the version information in the primary_id. (I hope it  
wasn't me because I don't think I'm going to remember why).

Does anyone else have a strong feeling?

-jason
On Oct 17, 2006, at 12:01 PM, Erikjan wrote:

> Hello,
>
> I noticed a little problem with the Annotation "DBLink" from  
> GenBank entries
>
> When I run:
>
> perl -MBio::DB::GenBank -e 'my $gi =
> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations 
> ("dblink");
> for(@annotations) { print $_, "\n";} print $INC{
> "Bio/Annotation/DBLink.pm" }, "\n"; '
>
> This yields:
>
>    GenBank:AL591065.17.17
>
> and the place where the used Bio/Annotation/DBLink.pm resides.
>
> Can others repeat this?
>
> I have dug into the source a little and Bio::Annotation::DBLink  
> seems to
> be the place where this happens: it has a concatenation which leads to
> that repeated version number.
>
> Is this something that I should fix "client-side", so to speak, or
> is it
> worthwhile to add some logic to that concatenation to prevent this?
>
>
> Thanks,
>
> Eric
>
>
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From n.haigh at sheffield.ac.uk  Thu Oct 19 02:41:02 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 19 Oct 2006 07:41:02 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <000401c6f2d6$5144e2f0$15327e82@pyrimidine>
References: <000401c6f2d6$5144e2f0$15327e82@pyrimidine>
Message-ID: <45371DFE.6050306@sheffield.ac.uk>


> As a followup in this, I tried bioperl-network and had similar failed tests
> with Graph 0.79 (the only PPM available from ActiveState).  However, the
> INSTALL docs state that Graph 0.80 is needed, and the test run gave several
> warnings about not having Graph 0.80 installed. 
>
> I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and
> everything passed.  Maybe we need to have a Graph PPM available for those
> who want bioperl-network?
>
> As for bioperl-run, all tests passed from a new CVS checkout even though I
> have none of the programs installed, so they seem to skip properly.  The
> test run also printed warnings when a program wasn't available or installed.
>
>
> Chris
>
>   
If you can pop the Graph 0.80 ppd files in DIST/RC/ I'll make 
modifications to integrate them into the package.xml file for PPM4 clients.

Nath

From n.haigh at sheffield.ac.uk  Thu Oct 19 06:40:21 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 19 Oct 2006 11:40:21 +0100
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
Message-ID: <45375615.1020603@sheffield.ac.uk>

Should line 25 read:
require Bio::Factory::EMBOSS;

instead of:
require Bio::EMBOSS::Factory;

Nath

From hlapp at gmx.net  Thu Oct 19 09:56:05 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 19 Oct 2006 09:56:05 -0400
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
Message-ID: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>

Here is the overload code:

use overload '""' => sub {
	(($_[0]->database ? $_[0]->database . ':' : '' )
	. ($_[0]->primary_id ? $_[0]->primary_id : '')
	. ($_[0]->version ? '.' . $_[0]->version : ''))
	|| '' };

Except that the last '||' is redundant and unnecessary (it either  
does nothing or replaces an empty string with an empty string), I  
don't see the potential for duplicating the version number here -  
unless primary_id() did that, which I don't see it doing.

So, to me this seems to come from a parsing error in the beginning,  
rather than an erroneous mangling of version into primary_id later.
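
To make the reasoning concrete, here is a small self-contained sketch.
ToyDBLink is a hypothetical stand-in class, not Bio::Annotation::DBLink;
it only copies the shape of the overload above and shows what happens
when primary_id already carries the version.

```perl
use strict;
use warnings;

package ToyDBLink;    # hypothetical stand-in, not the real DBLink class

# Same concatenation as the overload above: database:primary_id.version
use overload '""' => sub {
    my $self = shift;
    return ($self->{database} ? $self->{database} . ':' : '')
         . (defined $self->{primary_id} ? $self->{primary_id} : '')
         . ($self->{version} ? '.' . $self->{version} : '');
}, fallback => 1;

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

package main;

# primary_id parsed cleanly: the version is appended once.
my $clean = ToyDBLink->new(
    database => 'GenBank', primary_id => 'AL591065', version => 17);

# primary_id already includes ".17": the version shows up twice.
my $dirty = ToyDBLink->new(
    database => 'GenBank', primary_id => 'AL591065.17', version => 17);

print "$clean\n";    # GenBank:AL591065.17
print "$dirty\n";    # GenBank:AL591065.17.17
```

The second case reproduces the reported string without any fault in the
concatenation itself, which is consistent with the value being wrong
before it ever reaches the overload.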

Is someone in the position to confirm this?

	-hilmar

On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:

> So I'm unsure what we should do here.
>
> We can certainly fix the problem which you report which is relying on
> the "" method -- if you were to do instead:
> print $_->database, ":", $_->primary_id, "\n";
>
> you'll get the right answer.  We at a minimum just fix the auto-
> string converting method to do The Right Thing.
>
> But I am not sure if we should keep the version out of the primary_id
> field.  This will require some rejiggering in several modules when it
> comes to printing DBlinks and I don't want to do this before the
> release. I also am not sure if there was an explicit reason why
> someone did put the version information in the primary_id. (I hope it
> wasn't me because I don't think I'm going to remember why).
>
> Does anyone else have a strong feeling?
>
> -jason
> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>
>> Hello,
>>
>> I noticed a little problem with the Annotation "DBLink" from
>> GenBank entries
>>
>> When I run:
>>
>> perl -MBio::DB::GenBank -e 'my $gi =
>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>> ("dblink");
>> for(@annotations) { print $_, "\n";} print $INC{
>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>
>> This yields:
>>
>>    GenBank:AL591065.17.17
>>
>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>
>> Can others repeat this?
>>
>> I have dug into the source a little and Bio::Annotation::DBLink
>> seems to
>> be the place where this happens: it has a concatenation which  
>> leads to
>> that repeated version number.
>>
>> Is this something that I should fix "client-side", so to speak, or
>> is it
>> worthwhile to add some logic to that concatenation to prevent this?
>>
>>
>> Thanks,
>>
>> Eric
>>
>>
>>
>
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From dmessina at wustl.edu  Thu Oct 19 09:55:31 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 19 Oct 2006 08:55:31 -0500
Subject: [Bioperl-l] missing documentation (request for help)
Message-ID: <69453D5F-7794-4DC7-BAE1-A8B2191752E6@wustl.edu>

Hi all,

There are a few modules missing a one-line description, and by one- 
line description, I'm referring to the part that comes after the  
module name in the POD.

e.g. in

=head1 NAME

Bio::SearchIO - Driver for parsing Sequence Database Searches
(BLAST, FASTA, ...)

=head1 SYNOPSIS

[etc...]

"Driver for parsing Sequence Database Searches (BLAST, FASTA, ...)"  
is the one-line description (even though it falls onto two lines) :).

I fixed the modules that I knew something about, but there are some I  
haven't used. Perhaps the author, or someone else familiar with these  
modules, could fill in an appropriate short description?

Here is the list of affected modules:
Bio::DB::Expression
Bio::Expression::Contact
Bio::Expression::DataSet
Bio::Expression::Platform
Bio::Expression::Sample
Bio::Search::Processor
Bio::DB::EUtilities::ElinkData
Bio::DB::GFF::Adaptor::memory::feature_serializer
Bio::DB::SeqFeature::Store::DBI::Iterator
Bio::Expression::FeatureGroup::FeatureGroupMas50
Bio::Expression::FeatureSet::FeatureSetMas50
Bio::Matrix::PSM::PsmHeaderI
Bio::OntologyIO::Handlers::BaseSAXHandler

Some of these are missing other POD parts as well -- please add those  
too if you can.


Thanks,
Dave


From mckays at cshl.edu  Thu Oct 19 09:51:18 2006
From: mckays at cshl.edu (Sheldon McKay)
Date: Thu, 19 Oct 2006 09:51:18 -0400
Subject: [Bioperl-l] chromosome ideograms
Message-ID: <6b0de00426b3c04b0d0d7641bc8e14e3@cshl.edu>

Hi,

Sorry for the late reply.  I have been working on a karyotype drawing 
tool as part of the Generic Genome Browser that may be useful.  In 
addition to drawing features next to chromosome ideograms, it also 
supports making chromosome 'bands' from any kind of scored features to 
create a sort of heat map on the chromosome itself.

I have a demo running at

http://mckay.cshl.edu/cgi-bin/gbrowse_karyotype

and the source is available from the GMOD CVS HEAD 
http://www.gmod.org/cvs

Sheldon

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Sheldon McKay, PhD
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724


From n.haigh at sheffield.ac.uk  Thu Oct 19 11:37:31 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 19 Oct 2006 15:37:31 +0000
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
In-Reply-To: <45375615.1020603@sheffield.ac.uk>
References: <45375615.1020603@sheffield.ac.uk>
Message-ID: <45379BBB.1040400@sheffield.ac.uk>

Thanks for committing that change, Brian. Now that the tests proceed past
this point, I get the following error:

------------- EXCEPTION: Bio::Root::NotImplemented -------------
MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not
implemented by package Bio::Tools::Run::EMBOSSApplication.
This is not your fault - author of Bio::Tools::Run::EMBOSSApplication
should be blamed!

STACK: Error::throw
STACK: Bio::Root::Root::throw
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350
STACK: Bio::Root::RootI::throw_not_implemented
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522
STACK: Bio::Tools::Run::WrapperBase::program_dir
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346
STACK: Bio::Tools::Run::WrapperBase::program_path
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327
STACK: Bio::Tools::Run::WrapperBase::executable
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297
STACK: t/EMBOSS.t:58
----------------------------------------------------------------

From N.Haigh at sheffield.ac.uk  Thu Oct 19 11:03:00 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 16:03:00 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45379BBB.1040400@sheffield.ac.uk>
References: <45375615.1020603@sheffield.ac.uk>
	<45379BBB.1040400@sheffield.ac.uk>
Message-ID: <1161270180.453793a432e4f@webmail.shef.ac.uk>

I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be
consistent with other tests.

Failing that - Is there a good test writing style I should follow in one of the other test files?

Thanks
Nathan

From bosborne11 at verizon.net  Thu Oct 19 11:06:08 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 19 Oct 2006 11:06:08 -0400
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
In-Reply-To: <45379BBB.1040400@sheffield.ac.uk>
Message-ID: <C15D0CA0.AE2C%bosborne11@verizon.net>

Nathan,

Yes, I see. Those EMBOSS programs work a bit differently from the typical
app run by bioperl-run; it seems there's no need for WrapperBase methods
like program_dir() and executable(). Well, I can try and take a look at
this tonight, but there's probably someone better suited to this than me;
I've spent very little time with bioperl-run. Volunteer?

Brian O.


On 10/19/06 11:37 AM, "Nathan S. Haigh" <n.haigh at sheffield.ac.uk> wrote:

> Thanks for committing that change Brian. Now the tests proceed from this
> point, I get the following error:
> 
> ------------- EXCEPTION: Bio::Root::NotImplemented -------------
> MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not
> implemented by package Bio::Tools::Run::EMBOSSApplication.
> This is not your fault - author of Bio::Tools::Run::EMBOSSApplication
> should be blamed!
> 
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350
> STACK: Bio::Root::RootI::throw_not_implemented
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522
> STACK: Bio::Tools::Run::WrapperBase::program_dir
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346
> STACK: Bio::Tools::Run::WrapperBase::program_path
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327
> STACK: Bio::Tools::Run::WrapperBase::executable
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297
> STACK: t/EMBOSS.t:58
> ----------------------------------------------------------------


From niels at genomics.dk  Thu Oct 19 11:16:37 2006
From: niels at genomics.dk (Niels Larsen)
Date: Thu, 19 Oct 2006 17:16:37 +0200
Subject: [Bioperl-l] From EBI support re WU-Blast SOAP service
In-Reply-To: <4535EBF9.1090706@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>
	<4535EBF9.1090706@sendu.me.uk>
Message-ID: <453796D5.2070808@genomics.dk>

Sendu Bala wrote:
>> I invoked the EBI script
>>
>> http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip
>>
>> like this
>>
>> WSWUBlastClient.pl -p blastn -D embl test.fasta
>>
>> where the content of test.fasta is below, and got
>>
>> Can't find method element in the message at 
>> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.
> 
> As you admit, this is not a Bioperl issue. I would suggest you contact 
> EBI support.
> 

To use EBI's WU-blast SOAP interface from perl, EBI support
says one must use SOAP::Lite v 0.60 (no later version)
and include '--email you.example.com' on the command line.
This is evident from neither their web pages nor the script
usage statement, but they promised to fix that.

------------------------------------------------------------------------

Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark

Telephone: +45-8942-5268
Telefax: +45-8620-1222

------------------------------------------------------------------------

From cjfields at uiuc.edu  Thu Oct 19 11:31:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 10:31:45 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45371DFE.6050306@sheffield.ac.uk>
Message-ID: <001501c6f393$b66bd4a0$15327e82@pyrimidine>

> > As a followup in this, I tried bioperl-network and had similar failed
> tests
> > with Graph 0.79 (the only PPM available from ActiveState).  However, the
> > INSTALL docs state that Graph 0.80 is needed, and the test run gave
> several
> > warnings about not having Graph 0.80 installed.
> >
> > I made a PPM of Graph 0.80, installed, retried bioperl-network tests,
> and
> > everything passed.  Maybe we need to have a Graph PPM available for
> those
> > who want bioperl-network?
> >
> > As for bioperl-run, all tests passed from a new CVS checkout even though
> I
> > have none of the programs installed, so they seem to skip properly.  The
> > test run also printed warnings when a program wasn't available or
> installed.
> >
> >
> > Chris
> >
> >
> If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make
> modifications to integrate them into the package.xml file for PPM4
> clients.
> 
> Nath

Will do.  Should these be forwarded to Mauricio?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From N.Haigh at sheffield.ac.uk  Thu Oct 19 11:38:05 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 16:38:05 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <001501c6f393$b66bd4a0$15327e82@pyrimidine>
References: <001501c6f393$b66bd4a0$15327e82@pyrimidine>
Message-ID: <1161272285.45379bdd1aea4@webmail.shef.ac.uk>


> > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make
> > modifications to integrate them into the package.xml file for PPM4
> > clients.
> > 
> > Nath
> 
> Will do.  Should these be forwarded to Mauricio?
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 


If you don't have access to the web, you can send them to me - I now have an account on that server.

Cheers
Nath

From cjfields at uiuc.edu  Thu Oct 19 11:45:00 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 10:45:00 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk>
Message-ID: <001601c6f395$8a752ed0$15327e82@pyrimidine>

> I thought I'd have my first proper try at writing some tests. I was
> wondering if there is a template test file that I should use/study in
> order to be
> consistent with other tests.
> 
> Failing that - Is there a good test writing style I should follow in one
> of the other test files?
> 
> Thanks
> Nathan

I would start with the Test::Simple and Test::More perldoc; they're pretty
self-explanatory.  You can look at the various test suites using Test::More
as well for pointers.  By far, most tests will use is().  You can use SKIP
blocks to skip tests that have a requirement, or skip all tests if they all
require something.  Pretty flexible.
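
As a rough illustration only (the test names and the placeholder
"remote" tests below are made up, and the switch just follows the
BIOPERLDEBUG convention), a small Test::More script with is() and a
SKIP block might look like this:

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Assumption for this sketch: remote tests only run when BIOPERLDEBUG is set.
my $run_remote = $ENV{BIOPERLDEBUG} ? 1 : 0;

# is() compares "got" with "expected" and reports both on failure.
is( uc('acgt'),     'ACGT', 'uc() upper-cases the sequence' );
is( length('ACGT'), 4,      'sequence length is 4' );

SKIP: {
    # The count passed to skip() must match the number of tests in the block.
    skip( 'BIOPERLDEBUG not set; skipping remote tests', 2 ) unless $run_remote;

    # Placeholders standing in for tests that would need network access.
    ok( 1, 'remote query returned something' );
    ok( 1, 'remote record parsed' );
}
```

If every test in the file needs the same thing, "plan skip_all => $reason"
at the top does the job instead of a SKIP block.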

We should probably get a wiki page for the developers underway, maybe a
HOWTO on writing tests.  At least have these focus on BioPerl, OOP, remote
DB tests, etc.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Thu Oct 19 12:23:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 11:23:40 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
Message-ID: <001b01c6f39a$f0288ba0$15327e82@pyrimidine>

> Here is the overload code:
> 
> use overload '""' => sub {
> 	(($_[0]->database ? $_[0]->database . ':' : '' )
> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
> 	|| '' };
> 
> Except that the last '||' is redundant and unnecessary (it either
> does nothing or replaces an empty string with an empty string), I
> don't see the potential for duplicating the version number here -
> unless primary_id() did that, which I don't see it doing.
> 
> So, to me this seems to come from a parsing error in the beginning,
> rather than an erroneous mangling of version into primary_id later.
> 
> Is someone in the position to confirm this?
> 
> 	-hilmar

I have attached a script to the bug report on bugzilla, as well as the test
output sequence and the actual GenBank record.  There are a number of
problems:

1)  primary_id() is assigned both the id and version.
2)  version() is still assigned the version.

The above explains the output when printing the object directly using the
overload (it concatenates them).
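
To make the effect concrete, here is a small standalone sketch.
My::DBLink is a made-up stand-in for Bio::Annotation::DBLink that only
models the three accessors used by the overload; it is not the real
class.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Annotation::DBLink.
package My::DBLink;

# Same concatenation as the overload quoted above, minus the redundant "|| ''".
use overload '""' => sub {
    my $self = shift;
    return ( $self->database   ? $self->database . ':' : '' )
         . ( $self->primary_id ? $self->primary_id     : '' )
         . ( $self->version    ? '.' . $self->version  : '' );
};

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}
sub database   { $_[0]{database} }
sub primary_id { $_[0]{primary_id} }
sub version    { $_[0]{version} }

package main;

my ( $id, $version ) = ( 'AL591065', 17 );

# As currently parsed: the version is folded into primary_id AND set separately.
my $as_parsed = My::DBLink->new(
    database   => 'GenBank',
    primary_id => "$id.$version",
    version    => $version,
);

# With the bare accession in primary_id, the overload adds the version once.
my $separate = My::DBLink->new(
    database   => 'GenBank',
    primary_id => $id,
    version    => $version,
);

print "$as_parsed\n";    # GenBank:AL591065.17.17
print "$separate\n";     # GenBank:AL591065.17
```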

However, there are a few more issues.  The ID is printed normally
(accession.version), but the source DB is not present when SeqIO handles the
sequence.  I have attached the output and the original GenBank record to the
bug report.  

I can look into it but it won't be today; got my hands full with enzyme
assays. 

Chris


From N.Haigh at sheffield.ac.uk  Thu Oct 19 12:50:57 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 17:50:57 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine>
References: <001601c6f395$8a752ed0$15327e82@pyrimidine>
Message-ID: <1161276657.4537acf1edc80@webmail.shef.ac.uk>

Quoting Chris Fields <cjfields at uiuc.edu>:

> > I thought I'd have my first proper try at writing some tests. I was
> > wondering if there is a template test file that I should use/study in
> > order to be
> > consistent with other tests.
> > 
> > Failing that - Is there a good test writing style I should follow in one
> > of the other test files?
> > 
> > Thanks
> > Nathan
> 
> I would start with the Test::Simple and Test::More perldoc; they're pretty
> self-explanatory.  You can look at the various test suites using Test::More
> as well for pointers.  By far, most tests will use is().  You can use SKIP
> blocks to skip tests that have a requirement, or skip all tests if they all
> require something.  Pretty flexible.
> 
> We should probably get a wiki page for the developers underway, maybe a
> HOWTO on writing tests.  At least have these focus on BioPerl, OOP, remote
> DB tests, etc.
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 
> 
> 


Just working through some test things now. I thought I'd start on the bioperl-run stuff as I thought it might be a bit more straightforward; I'm
familiar with some of them and they seem to get neglected.

I'm heavily commenting my tests with the thought of starting a wiki guide to testing Bioperl modules. See how far I get!

Nath

From hlapp at gmx.net  Thu Oct 19 13:11:27 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 19 Oct 2006 13:11:27 -0400
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
	<0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
	<8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
Message-ID: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>

Actually you did that Jason: http://tinyurl.com/ye2edk

Apparently the motivation was to "parse swissprot fields in genpept  
file (dbsource)"?

It clearly looks wrong to add the version. You've probably had a  
reason why you did this at the time but if we (you :) can't recover  
that I guess it's best to just fix it to do the right thing (in both  
places obviously).

	-hilmar

On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote:

> Well there is explicit addition of the version to the primary id so  
> it isn't so much a parsing error as a deliberate decision to append  
> it.
> see Bio::SeqIO::genbank
>
> to make the dblink
>                                               $annotation- 
> >add_Annotation
>                                                     ('dblink',
>                                                       
> Bio::Annotation::DBLink->new
>                                                      (-primary_id  
> => $id . "." . $version,
>                                                       -version =>  
> $version,
>                                                       -database =>  
> $db,
>                                                       -tagname =>  
> 'dblink'));
>
> and the code to print the dblink back out in the writer already  
> assumes the version number is appended...
>
>         foreach my $ref ( $seq->annotation->get_Annotations 
> ('dblink') ) {
>             # if ($ref->comment eq 'DBSOURCE') {
>             $self->_print('DBSOURCE    accession ',
>                           $ref->primary_id, "\n");
>             # }
>         }
>
> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:
>
>> Here is the overload code:
>>
>> use overload '""' => sub {
>> 	(($_[0]->database ? $_[0]->database . ':' : '' )
>> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
>> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
>> 	|| '' };
>>
>> Except that the last '||' is redundant and unnecessary (it either  
>> does nothing or replaces an empty string with an empty string), I  
>> don't see the potential for duplicating the version number here -  
>> unless primary_id() did that, which I don't see it doing.
>>
>> So, to me this seems to come from a parsing error in the  
>> beginning, rather than an erroneous mangling of version into  
>> primary_id later.
>>
>> Is someone in the position to confirm this?
>>
>> 	-hilmar
>>
>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>>
>>> So I'm unsure what we should do here.
>>>
>>> We can certainly fix the problem which you report which is  
>>> relying on
>>> the "" method -- if you were to do instead:
>>> print $_->database, ":", $_->primary_id, "\n";
>>>
>>> you'll get the right answer.  We at a minimum just fix the auto-
>>> string converting method to do The Right Thing.
>>>
>>> But I am not sure if we should keep the version out of the  
>>> primary_id
>>> field.  This will require some rejiggering in several modules  
>>> when it
>>> comes to printing DBlinks and I don't want to do this before the
>>> release. I also am not sure if there was an explicit reason why
>>> someone did put the version information in the primary_id. (I  
>>> hope it
>>> wasn't me because I don't think I'm going to remember why).
>>>
>>> Does anyone else have a strong feeling?
>>>
>>> -jason
>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>>
>>>> Hello,
>>>>
>>>> I noticed a little problem with the Annotation "DBLink" from
>>>> GenBank entries
>>>>
>>>> When I run:
>>>>
>>>> perl -MBio::DB::GenBank -e 'my $gi =
>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my  
>>>> $seqio =
>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>>>> ("dblink");
>>>> for(@annotations) { print $_, "\n";} print $INC{
>>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>>>
>>>> This yields:
>>>>
>>>>    GenBank:AL591065.17.17
>>>>
>>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>>
>>>> Can others repeat this?
>>>>
>>>> I have dug into the source a little and Bio::Annotation::DBLink
>>>> seems to
>>>> be the place where this happens: it has a concatenation which  
>>>> leads to
>>>> that repeated version number.
>>>>
>>>> It this something that I should fix "client-side", so to speak, or
>>>> is it
>>>> worthwhile to add some logic to that concatenation to prevent this?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Eric
>>>>
>>>>
>>>>
>>>
>>> --
>>> Jason Stajich, PhD
>>> Miller Research Fellow
>>> University of California
>>> Dept of Plant and Microbial Biology
>>> 321 Koshland Hall #3102
>>> Berkeley, CA 94720-3102
>>> lab: 510.642.8441
>>> http://pmb.berkeley.edu/~taylor/people/js.html
>>>
>>>
>>>
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From N.Haigh at sheffield.ac.uk  Thu Oct 19 13:17:33 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 18:17:33 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine>
References: <001601c6f395$8a752ed0$15327e82@pyrimidine>
Message-ID: <1161278253.4537b32dd3d15@webmail.shef.ac.uk>

Quoting Chris Fields <cjfields at uiuc.edu>:

> > I thought I'd have my first proper try at writing some tests. I was
> > wondering if there is a template test file that I should use/study in
> > order to be
> > consistent with other tests.
> > 
> > Failing that - Is there a good test writing style I should follow in one
> > of the other test files?
> > 
> > Thanks
> > Nathan
> 
> I would start with the Test::Simple and Test::More perldoc; they're pretty
> self-explanatory.  You can look at the various test suites using Test::More
> as well for pointers.  By far, most tests will use is().  You can use SKIP
> blocks to skip tests that have a requirement, or skip all tests if they all
> require something.  Pretty flexible.
> 
> We should probably get a wiki page for the developers underway, maybe a
> HOWTO on writing tests.  At least have these focus on BioPerl, OOP, remote
> DB tests, etc.
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 

Just wrote a partial and small test script for t/Amap.t in bioperl-run. When I run "perl -I. t/Amap.t" I get the following output:
1..10
ok 1 - use Bio::Tools::Run::Alignment::Amap;
ok 2 - use Bio::AlignIO;
ok 3 - use Bio::SeqIO;
ok 4 - use Bio::Root::IO;
ok 5 - All the required modules are present
ok 6 - new() returned something
ok 7 -   and its the right class
not ok 8 - executable() got the correct filename
#   Failed test 'executable() got the correct filename'
#   in t/Amap.t at line 90.
#          got: undef
#     expected: 'filename'
ok 9 # skip Got incorrect filename for executable
ok 10 # skip Got incorrect filename for executable
# Looks like you failed 1 test of 10.


So far this looks good (well, in that it's failing and passing the expected tests). However, when I run "make test" the output is unexpected and I don't know
why. It seems to die and produce the results of the testing before the rest of the test suite is run:
t/Amap....................NOK 8
#   Failed test 'executable() got the correct filename'
#   in t/Amap.t at line 90.
#          got: undef
#     expected: 'filename'
# Looks like you failed 1 test of 10.
t/Amap....................dubious
        Test returned status 1 (wstat 256, 0x100)
DIED. FAILED test 8
        Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay, 70.00%)
t/Analysis_soap...........ok 7/17make: *** wait: No child processes.  Stop.


Is there something I'm missing? If it's something less obvious, let me know and I'll post the whole test file.
Nath


From cjfields at uiuc.edu  Thu Oct 19 13:26:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 12:26:45 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161278253.4537b32dd3d15@webmail.shef.ac.uk>
Message-ID: <002001c6f3a3$c00b9080$15327e82@pyrimidine>

...
> Just wrote a partial and small test script for t/Amap.t in bioperl-run.
> When I run "perl -I. t/Amap.t" I get the following output:
> 1..10
> ok 1 - use Bio::Tools::Run::Alignment::Amap;
> ok 2 - use Bio::AlignIO;
> ok 3 - use Bio::SeqIO;
> ok 4 - use Bio::Root::IO;
> ok 5 - All the required modules are present
> ok 6 - new() returned something
> ok 7 -   and its the right class
> not ok 8 - executable() got the correct filename
> #   Failed test 'executable() got the correct filename'
> #   in t/Amap.t at line 90.
> #          got: undef
> #     expected: 'filename'
> ok 9 # skip Got incorrect filename for executable
> ok 10 # skip Got incorrect filename for executable
> # Looks like you failed 1 test of 10.
> 
> 
> So far this looks good (well, that it's failing passing expected tests).
> However, when i run "make test" the output is unexpected and I don't know
> why. It seems to die and produce the results of the testing before the
> rest of the test suit is run:
> t/Amap....................NOK 8
> #   Failed test 'executable() got the correct filename'
> #   in t/Amap.t at line 90.
> #          got: undef
> #     expected: 'filename'
> # Looks like you failed 1 test of 10.
> t/Amap....................dubious
>         Test returned status 1 (wstat 256, 0x100)
> DIED. FAILED test 8
>         Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay,
> 70.00%)
> t/Analysis_soap...........ok 7/17make: *** wait: No child processes.
> Stop.
> 
> 
> 
> Is there something I'm missing?? If it's something less obvious, let me
> know and i'll post whole test file.
> Nath

Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
the problem.  The only issue I can think of is that Test::More TODO blocks
require a newer version of Test::Harness (which most users have anyway).
Are you using a TODO block?
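
For reference, this is the sort of thing I mean by a TODO block. It is
only a sketch; $got is a made-up stand-in for whatever
$factory->executable() returns.

```perl
use strict;
use warnings;
use Test::More tests => 2;

ok( 1, 'new() returned something' );

TODO: {
    # Failures inside this block are reported as "not ok ... # TODO"
    # and are not counted as real failures.
    local $TODO = 'executable() does not return a path yet';

    my $got;    # hypothetical stand-in for $factory->executable()
    is( $got, 'filename', 'executable() got the correct filename' );
}
```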

You can send me Amap.t and I'll give it a try, but I can't promise I'll get
to it immediately (busy day).

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From N.Haigh at sheffield.ac.uk  Thu Oct 19 13:38:25 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 18:38:25 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161279505.4537b811e143f@webmail.shef.ac.uk>


> Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
> the problem.  The only issue I can think of is that Test::More TODO blocks
> require a newer version of Test::Harness (which most users have anyway).
> Are you using a TODO block?
> 
> You can send me Amap.t and I'll give it a try, but I can't promise I'll get
> to it immediately (busy day).
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 
> 

No TODO blocks.

I must have done something wrong - it's the first time I've seen this - but then again, I don't look that closely at the output of "make test" unless
something shows as a fail. Anyway, below is the short bit of code.

Thanks
Nath

use strict;
use Bio::Root::IO;  # cant test for this, might be needed to get Test::More

BEGIN {
  # Things to do ASAP once the script is run
  # even before anything else in the file is parsed
  use vars qw($NUMTESTS $DEBUG $error);
  $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;

  # Use installed Test module, otherwise fall back
  # to copy of Test.pm located in the t dir
  eval { require Test::More; };
  if ( $@ ) {
    use lib Bio::Root::IO->catfile('t','lib');
  }

  # Currently no errors
  $error = 0;

  # Setup the number of tests to be run
  # what about using:
  # use Test::More 'no_plan';
  use Test::More;
  $NUMTESTS = 10;
  plan tests => $NUMTESTS;

  # Use modules that are needed in this test that are from
  # any of the Bioperl packages: Bioperl-core, Bioperl-run ... etc
  # use_ok('<module::to::use>');
  use_ok('Bio::Tools::Run::Alignment::Amap');
  use_ok('Bio::AlignIO');
  use_ok('Bio::SeqIO');
  use_ok('Bio::Root::IO');
}

# Multiple END blocks are run in reverse order of their definition
# Last In, First Out (LIFO)
END {
  # Things to do right at the very end, just
  # when the  interpreter finishes/exits
  # E.g. deleting intermediate files produced during the test

  foreach my $file ( qw(cysprot.dnd cysprot1a.dnd) ) {
    unlink $file;
    # check it was deleted

  }
  #unlink qw(cysprot.dnd cysprot1a.dnd)
}

END {
  # Not sure what this is doing?
  #for ( $Test::ntest..$NUMTESTS ) {
  #  skip("Amap program not found. Skipping.\n",1);
  #}
}

# if we got to here, thats OK!
# is this really needed?
ok( 1, 'All the required modules are present');

# setup input files etc
my $inputfilename = Bio::Root::IO->catfile("t","data","cysprot.fa");

# setup output files etc
# none in this test

# setup global objects that are to be used in more than one test
# Also test they were initialised correctly
my @params = ();
my $aln;
my $factory = Bio::Tools::Run::Alignment::Amap->new(@params);
ok( defined $factory,                                  'new() returned something' );
ok( $factory->isa('Bio::Tools::Run::Alignment::Amap'), '  and its the right class' );

# Now onto the nitty gritty tests of the modules methods
my $executable_file = $factory->executable();
#is( $factory->executable(), 'filename',                'executable() got the correct filename' );

# block of tests to skip if you know the tests will fail
# under some condition. E.g.:
#   Need network access,
#   Won't work on particular OS,
#   Can't find the executable
# Do not just skip tests that seem to fail for an unknown reason
SKIP: {
  # condition used to skip this block of tests
  #skip($why, $how_many_in_block);
  skip("Got incorrect filename for executable", 2)
    unless is($factory->executable(), 'filename',       'executable() got the correct filename');

  ok( -e $executable_file,                              'Found executable' );
  ok( $factory->version >= 2.0,                         'Code tested on Amap versions >= 2.0' );

}
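
One thing worth noting about the fallback in the BEGIN block: "use" is
handled at compile time, so a "use lib" placed inside an "if" takes
effect whether or not the condition is true. A small sketch of the
difference follows; the directory names and Hypothetical::Missing::Module
are invented for the example.

```perl
use strict;
use warnings;

if (0) {
    # Never reached at run time, but 'use lib' has already run at compile time.
    use lib 'never_reached_dir';
}
print "use lib ran anyway\n" if grep { $_ eq 'never_reached_dir' } @INC;

# Run-time alternative: only extend @INC when the module really is missing.
if ( !eval { require Hypothetical::Missing::Module; 1 } ) {
    require lib;
    lib->import('fallback_dir');
}
print "fallback added at run time\n" if grep { $_ eq 'fallback_dir' } @INC;
```

The same applies to "use Test::More;" inside the BEGIN block: it is
loaded before the eval gets a chance to run.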

From jason at bioperl.org  Thu Oct 19 13:44:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 10:44:51 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
	<0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
	<8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
	<7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID: <B9A24BF0-CD5A-40CA-B8AA-1A449CA9D7AA@bioperl.org>

Yikes - I was worried that it might have been me.....

Okay, I'll look into fixing it -- ChrisF, check in with me before
diving in, in case I've gotten it done; I expect your enzyme
assays might take up the time.

-jason
On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote:

> Actually you did that Jason: http://tinyurl.com/ye2edk
>
> Apparently the motivation was to "parse swissprot fields in genpept  
> file (dbsource)"?
>
> It clearly looks wrong to add the version. You've probably had a  
> reason why you did this at the time but if we (you :) can't recover  
> that I guess it's best to just fix it to do the right thing (in  
> both places obviously).
>
> 	-hilmar
>
> On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote:
>
>> Well there is explicit addition of the version to the primary id  
>> so it isn't so much a parsing error as a deliberate decision to  
>> append it.
>> see Bio::SeqIO::genbank
>>
>> to make the dblink
>>                                               $annotation- 
>> >add_Annotation
>>                                                     ('dblink',
>>                                                       
>> Bio::Annotation::DBLink->new
>>                                                      (-primary_id  
>> => $id . "." . $version,
>>                                                       -version =>  
>> $version,
>>                                                       -database =>  
>> $db,
>>                                                       -tagname =>  
>> 'dblink'));
>>
>> and the code to print the dblink back out in the writer already  
>> assumes the version number is appended...
>>
>>         foreach my $ref ( $seq->annotation->get_Annotations 
>> ('dblink') ) {
>>             # if ($ref->comment eq 'DBSOURCE') {
>>             $self->_print('DBSOURCE    accession ',
>>                           $ref->primary_id, "\n");
>>             # }
>>         }
>>
>> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:
>>
>>> Here is the overload code:
>>>
>>> use overload '""' => sub {
>>> 	(($_[0]->database ? $_[0]->database . ':' : '' )
>>> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
>>> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
>>> 	|| '' };
>>>
>>> Except that the last '||' is redundant and unnecessary (it either  
>>> does nothing or replaces an empty string with an empty string), I  
>>> don't see the potential for duplicating the version number here -  
>>> unless primary_id() did that, which I don't see it doing.
>>>
>>> So, to me this seems to come from a parsing error in the  
>>> beginning, rather than an erroneous mangling of version into  
>>> primary_id later.
>>>
>>> Is someone in the position to confirm this?
>>>
>>> 	-hilmar
>>>
>>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>>>
>>>> So I'm unsure what we should do here.
>>>>
>>>> We can certainly fix the problem you report, which comes from
>>>> relying on the "" method -- if you were to do this instead:
>>>> print $_->database, ":", $_->primary_id, "\n";
>>>>
>>>> you'll get the right answer.  At a minimum we can just fix the
>>>> auto-string-converting method to do The Right Thing.
>>>>
>>>> But I am not sure if we should keep the version out of the  
>>>> primary_id
>>>> field.  This will require some rejiggering in several modules  
>>>> when it
>>>> comes to printing DBlinks and I don't want to do this before the
>>>> release. I also am not sure if there was an explicit reason why
>>>> someone did put the version information in the primary_id. (I  
>>>> hope it
>>>> wasn't me because I don't think I'm going to remember why).
>>>>
>>>> Does anyone else have a strong feeling?
>>>>
>>>> -jason
>>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I noticed a little problem with the Annotation "DBLink" from
>>>>> GenBank entries
>>>>>
>>>>> When I run:
>>>>>
>>>>> perl -MBio::DB::GenBank -e 'my $gi =
>>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my  
>>>>> $seqio =
>>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>>>>> ("dblink");
>>>>> for(@annotations) { print $_, "\n";} print $INC{
>>>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>>>>
>>>>> This yields:
>>>>>
>>>>>    GenBank:AL591065.17.17
>>>>>
>>>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>>>
>>>>> Can others repeat this?
>>>>>
>>>>> I have dug into the source a little and Bio::Annotation::DBLink
>>>>> seems to
>>>>> be the place where this happens: it has a concatenation which  
>>>>> leads to
>>>>> that repeated version number.
>>>>>
>>>>> Is this something that I should fix "client-side", so to speak, or
>>>>> is it worthwhile to add some logic to that concatenation to prevent
>>>>> this?
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Eric
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>> --
>>>> Jason Stajich, PhD
>>>> Miller Research Fellow
>>>> University of California
>>>> Dept of Plant and Microbial Biology
>>>> 321 Koshland Hall #3102
>>>> Berkeley, CA 94720-3102
>>>> lab: 510.642.8441
>>>> http://pmb.berkeley.edu/~taylor/people/js.html
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> Jason Stajich, PhD
>> Miller Research Fellow
>> University of California
>> Dept of Plant and Microbial Biology
>> 321 Koshland Hall #3102
>> Berkeley, CA 94720-3102
>> lab: 510.642.8441
>> http://pmb.berkeley.edu/~taylor/people/js.html
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From cjfields at uiuc.edu  Thu Oct 19 14:03:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:03:52 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID: <000001c6f3a8$f0a46a00$15327e82@pyrimidine>

It also seems that the DBSOURCE line isn't caught correctly: the parser
stuffs it by default into a GenBank dblink (the dbsource in the test case is
EMBL, not GenBank).

http://bugzilla.open-bio.org/show_bug.cgi?id=2124

It looks like NCBI may now be using:

DBSOURCE    embl accession Z49548.1

instead of the old version:

DBSOURCE    embl locus SCYJR048W, accession Z49548.1

I don't recall NCBI mentioning changes regarding DBSOURCE in any of the
recent release notes.
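
For what it's worth, here is a rough sketch of a pattern that accepts both
DBSOURCE layouts shown above.  It is plain Perl and does not touch
Bio::SeqIO::genbank; the sub name parse_dbsource and the hash keys are made
up for illustration, and only the two sample lines from this message were
considered.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split a DBSOURCE line into
# database, optional locus, accession and version.
sub parse_dbsource {
    my ($line) = @_;
    return unless $line =~ /^DBSOURCE\s+
                            (\S+)\s+                 # database, e.g. "embl"
                            (?:locus\s+(\S+?),\s+)?  # optional "locus NAME,"
                            accession\s+
                            ([^\s.]+)\.(\d+)         # accession "." version
                           /x;
    return { database => $1, locus => $2, accession => $3, version => $4 };
}

for my $line ('DBSOURCE    embl accession Z49548.1',
              'DBSOURCE    embl locus SCYJR048W, accession Z49548.1') {
    my $d = parse_dbsource($line) or next;
    printf "%s %s version %s (locus: %s)\n",
        $d->{database}, $d->{accession}, $d->{version},
        defined $d->{locus} ? $d->{locus} : 'none';
}
```

Keeping the accession and the version in separate captures also avoids
having to decide later whether the version belongs in the primary id.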

Chris

> Actually you did that Jason: http://tinyurl.com/ye2edk
> 
> Apparently the motivation was to "parse swissprot fields in genpept
> file (dbsource)"?
> 
> It clearly looks wrong to add the version. You've probably had a
> reason why you did this at the time but if we (you :) can't recover
> that I guess it's best to just fix it to do the right thing (in both
> places obviously).
> 
> 	-hilmar
> 


From N.Haigh at sheffield.ac.uk  Thu Oct 19 14:06:11 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:06:11 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281171.4537be93b63c9@webmail.shef.ac.uk>


> 
> Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
> the problem.  The only issue I can think of is that Test::More TODO blocks
> require a newer version of Test::Harness (which most users have anyway).
> Are you using a TODO block?
> 
> You can send me Amap.t and I'll give it a try, but I can't promise I'll get
> to it immediately (busy day).
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 

Never mind about this - it's working as expected!

I got confused because a previous run threw errors that weren't included in the final table of failed tests - working now.

Nath 


From N.Haigh at sheffield.ac.uk  Thu Oct 19 14:14:54 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:14:54 +0100
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>

I have a few questions about bioperl-run modules.

1) How does a module define the name of the executable it uses?
2) Is there a way to test what this is?
3) Does $factory->executable return this name, or does it only return the name if it successfully found the executable?

Thanks
Nath

From cjfields at uiuc.edu  Thu Oct 19 14:15:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:15:08 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <B9A24BF0-CD5A-40CA-B8AA-1A449CA9D7AA@bioperl.org>
Message-ID: <000001c6f3aa$82845ba0$15327e82@pyrimidine>

Go for it.  I haven't got the time to spare at the moment, sucky protein
assays....

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Jason Stajich
> Sent: Thursday, October 19, 2006 12:45 PM
> To: Hilmar Lapp
> Cc: bioperl-l at lists.open-bio.org; Erikjan
> Subject: Re: [Bioperl-l] Annotation-DBLink- version numbers repeating
> 
> Yikes - I was worried that it might have been me.....
> 
> Okay I'll look into fixing it -- ChrisF - check in with me before
> diving in, in case I've gotten it done and I expect your enzyme
> assays might take up the time.
> 
> -jason


From cjfields at uiuc.edu  Thu Oct 19 14:35:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:35:08 -0500
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
Message-ID: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>

I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase
but I'm not sure.  I haven't used them very much myself but plan on making
wrappers at some point soon for some programs I use.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: Nathan Haigh [mailto:N.Haigh at sheffield.ac.uk]
> Sent: Thursday, October 19, 2006 1:15 PM
> To: Chris Fields
> Cc: 'bioperl-l'
> Subject: bioperl-run executable
> 
> I have a few questions about bioperl-run modules.
> 
> 1) How do modules define what the name of the executable is that it uses?
> 2) Is there a way to test what this is?
> 3) Does $factory->executable return this or does it only return the name
> if it successfully found it?
> 
> Thanks
> Nath


From N.Haigh at sheffield.ac.uk  Thu Oct 19 14:47:01 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:47:01 +0100
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
Message-ID: <1161283620.4537c82501c43@webmail.shef.ac.uk>

Quoting Chris Fields <cjfields at uiuc.edu>:

> I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase
> but I'm not sure.  I haven't used them very much myself but plan on making
> wrappers at some point soon for some programs I use.
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 

On closer inspection of a couple of other modules (Clustalw.pm and TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME and have a sub
(program_name) that simply returns this value. I'd like to see the program_name become a getter/setter so users can change the default and have the
string stored in the factory object.
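
Something along these lines is what I have in mind. This is only a minimal sketch: My::Aligner, the -program_name argument and the _program_name key
are made-up names for illustration, not code from Clustalw.pm or TCoffee.pm.

```perl
use strict;
use warnings;

# Hypothetical wrapper class; My::Aligner and its default are made up.
package My::Aligner;

our $PROGRAM_NAME = 'clustalw';    # class-wide default

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    $self->program_name($args{-program_name}) if defined $args{-program_name};
    return $self;
}

# Getter/setter: a value passed in is stored on this object only; without
# one, the stored value or the class default is returned.
sub program_name {
    my ($self, $name) = @_;
    $self->{_program_name} = $name if defined $name;
    return defined $self->{_program_name} ? $self->{_program_name}
                                          : $PROGRAM_NAME;
}

package main;

my $default = My::Aligner->new;
my $smp     = My::Aligner->new(-program_name => 'clustalw_smp');
print $default->program_name, "\n";   # clustalw
print $smp->program_name, "\n";       # clustalw_smp
```

The default stays in one place, and changing the name on one factory doesn't affect any other factory object.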

Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core, not bioperl-run? I suppose not, since bioperl-core is a prereq for bioperl-run,
but wouldn't it make sense for it to go in bioperl-run?

Nath

From cjfields at uiuc.edu  Thu Oct 19 15:07:05 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 14:07:05 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <B9A24BF0-CD5A-40CA-B8AA-1A449CA9D7AA@bioperl.org>
Message-ID: <000701c6f3b1$c5914230$15327e82@pyrimidine>

Jason, Hilmar, 

How about changing the default parsed dblink in SeqIO::genbank (line 520) to

		if( $dbsource =~ /^(\S*?)\s*accession\s+(\S+)\.(\d+)/ ) {
		    my ($db,$id,$version) = ($1,$2,$3);
		    $annotation->add_Annotation
			('dblink',
			 Bio::Annotation::DBLink->new
			 (-primary_id => $id,
			  -version => $version,
			  -database => $db || 'GenBank',
			  -tagname => 'dblink'));
		} 

It passes tests and catches the optional database ('embl' for the bugzilla
report).  The output sequence still doesn't print the DB if it isn't GenBank
via write_seq(), but that shouldn't be too hard to fix (famous last words).
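
To show the effect without a BioPerl install, here is a small stand-alone
sketch.  My::DBLink is a stand-in that keeps only the three fields and the
'""' overload quoted earlier in the thread; it is not the real
Bio::Annotation::DBLink.  The sub dblink_from_dbsource and its
$append_version switch are made up, and I'm assuming $dbsource holds the
text after the DBSOURCE tag.

```perl
use strict;
use warnings;

# Stand-in for Bio::Annotation::DBLink, reduced to the three fields and
# the '""' overload quoted earlier in this thread. Not the real class.
package My::DBLink;
use overload '""' => sub {
    my $self = shift;
    return ($self->{database}   ? $self->{database} . ':' : '')
         . ($self->{primary_id} ? $self->{primary_id}     : '')
         . ($self->{version}    ? '.' . $self->{version}  : '');
};
sub new { my ($class, %args) = @_; return bless {%args}, $class }

package main;

# Pattern from the proposed change; $dbsource is assumed to be the text
# that follows the DBSOURCE tag.
sub dblink_from_dbsource {
    my ($dbsource, $append_version) = @_;
    return unless $dbsource =~ /^(\S*?)\s*accession\s+(\S+)\.(\d+)/;
    my ($db, $id, $version) = ($1, $2, $3);
    return My::DBLink->new(
        primary_id => $append_version ? "$id.$version" : $id,
        version    => $version,
        database   => $db || 'GenBank',
    );
}

print dblink_from_dbsource('accession AL591065.17', 1), "\n";   # GenBank:AL591065.17.17
print dblink_from_dbsource('accession AL591065.17', 0), "\n";   # GenBank:AL591065.17
print dblink_from_dbsource('embl accession Z49548.1', 0), "\n"; # embl:Z49548.1
```

With the version kept out of primary_id, the string overload adds it exactly
once, so the overload itself wouldn't need to change.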

Okay, okay, back to the assays...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Jason Stajich
> Sent: Thursday, October 19, 2006 12:45 PM
> To: Hilmar Lapp
> Cc: bioperl-l at lists.open-bio.org; Erikjan
> Subject: Re: [Bioperl-l] Annotation-DBLink- version numbers repeating
> 
> Yikes - I was worried that it might have been me.....
> 
> Okay I'll look into fixing it -- ChrisF - check in with me before
> diving in, in case I've gotten it done and I expect your enzyme
> assays might take up the time.
> 
> -jason


From jason at bioperl.org  Thu Oct 19 14:48:28 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 11:48:28 -0700
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
	<1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
Message-ID: <67650240-D61B-4842-AE7C-75F15F608F6F@bioperl.org>

program_name()
  Should return the name of the program

executable()
  Is a function that you don't have to mess with that tries to find  
the executable named  program_name() based on your PATH.
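
Roughly, the split between the two looks like the sketch below.  This is a
simplified stand-in, not the actual WrapperBase code: the My::Wrapper
package is made up, and I'm assuming here that executable() returns undef
when nothing is found.  Platform details such as a ".exe" suffix are
ignored.

```perl
use strict;
use warnings;
use File::Spec;

# Simplified stand-in for a bioperl-run wrapper; the package name and the
# PATH search below are illustrative, not the WrapperBase implementation.
package My::Wrapper;

our $PROGRAM_NAME = 'clustalw';

sub new          { return bless {}, shift }
sub program_name { return $PROGRAM_NAME }

# Return the full path of program_name() if an executable file with that
# name is found in a PATH directory, otherwise undef.
sub executable {
    my $self = shift;
    for my $dir (File::Spec->path) {
        my $candidate = File::Spec->catfile($dir, $self->program_name);
        return $candidate if -f $candidate && -x _;
    }
    return;
}

package main;

my $factory = My::Wrapper->new;
my $exe     = $factory->executable;
print $factory->program_name, ': ',
      defined $exe ? $exe : 'not found in PATH', "\n";
```

So program_name() always answers, while executable() only gives you
something back if the program is actually installed.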


-jason
On Oct 19, 2006, at 11:14 AM, Nathan Haigh wrote:

> I have a few questions about bioperl-run modules.
>
> 1) How do modules define what the name of the executable is that it  
> uses?
> 2) Is there a way to test what this is?
> 3) Does $factory->executable return this or does it only return the  
> name if it successfully found it?
>
> Thanks
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From jason at bioperl.org  Thu Oct 19 17:06:43 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 14:06:43 -0700
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161283620.4537c82501c43@webmail.shef.ac.uk>
References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
	<1161283620.4537c82501c43@webmail.shef.ac.uk>
Message-ID: <AA1A41EC-C0E1-49C3-818E-64210971E331@bioperl.org>

It can be reset now but of course this is not a very nice way of doing it:

$Bio::Tools::Run::Alignment::Clustalw::PROGRAM_NAME = 'clustalw_smp';

I am not sure if there are pros and cons to making it a getter- 
setter, but if you want to run with it, please do.
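
For reference, a getter/setter along the lines Nathan suggests might look like the sketch below. The package name My::Wrapper and the -program_name constructor argument are assumptions made for the example; the real modules build on Bio::Tools::Run::WrapperBase and may end up looking different.

```perl
use strict;
use warnings;

{
    package My::Wrapper;    # hypothetical stand-in for a bioperl-run module

    our $PROGRAM_NAME = 'clustalw';    # class-wide default

    sub new {
        my ($class, %args) = @_;
        my $self = bless {}, $class;
        $self->program_name($args{-program_name})
            if defined $args{-program_name};
        return $self;
    }

    # Getter/setter: a value passed in is stored in this object only;
    # without one, the class default is returned.
    sub program_name {
        my ($self, $name) = @_;
        $self->{_program_name} = $name if defined $name;
        return defined $self->{_program_name}
            ? $self->{_program_name}
            : $PROGRAM_NAME;
    }
}

my $factory = My::Wrapper->new;
$factory->program_name('clustalw_smp');
print $factory->program_name, "\n";    # prints clustalw_smp
```

Storing the name per object means one factory can use 'clustalw_smp' without changing the default seen by other factories, which the package-variable approach cannot do.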

The whole run system has been hard to keep people adhering to a  
standard (and the standard has changed a bit) so some auditing is  
warranted.

-jason

On Oct 19, 2006, at 11:47 AM, Nathan Haigh wrote:

> Quoting Chris Fields <cjfields at uiuc.edu>:
>
>> I think a lot of the bioperl-run modules use  
>> Bio::Tools::Run::WrapperBase
>> but I'm not sure.  I haven't used them very much myself but plan  
>> on making
>> wrappers at some point soon for some programs I use.
>>
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>
> On closer inspection of a couple of other modules (Clustalw.pm and  
> TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME  
> and have a sub
> (program_name) that simply returns this value. I'd like to see the  
> program_name become a getter/setter so users can change the default  
> and have the
> string stored in the factory object.
>
> Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core  
> not bioperl-run? I suppose not since bioperl-core is a prereq for  
> bioperl-run but
> wouldn't it make sense to go in bioperl-run?
>
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From torsten.seemann at infotech.monash.edu.au  Thu Oct 19 19:24:03 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Fri, 20 Oct 2006 09:24:03 +1000
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161279505.4537b811e143f@webmail.shef.ac.uk>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
	<1161279505.4537b811e143f@webmail.shef.ac.uk>
Message-ID: <45380913.3070506@infotech.monash.edu.au>

Nathan,

> use strict;
> use Bio::Root::IO;  # cant test for this, might be needed to get Test::More

use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway,
and File::Spec is "guaranteed" to be installed with Perl 5.6+.

>     use lib Bio::Root::IO->catfile('t','lib');

Simpler as:
	use lib 't/lib';
I understand the 'lib.pm' accepts Unix style directories REGARDLESS of native 
platform.
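
A minimal sketch of a test header that uses the installed Test::More when it is there and only falls back to a copy under 't/lib' when it is not. It assumes a bundled copy of Test::More lives in 't/lib' and that the test is run from the distribution root; the two tests are placeholders.

```perl
use strict;
use warnings;

BEGIN {
    # Prefer the installed Test::More; fall back to the bundled copy.
    unless (eval { require Test::More; 1 }) {
        require lib;
        lib->import('t/lib');    # Unix-style path, as noted above
        require Test::More;
    }
    Test::More->import(tests => 2);
}

ok(1, 'Test::More is available');
is(1 + 1, 2, 'a placeholder test');
```

The require/import pair is used instead of a plain 'use lib' inside the if block because 'use' statements take effect at compile time regardless of the surrounding condition.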

-- 
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia

From prabubio at gmail.com  Thu Oct 19 20:11:36 2006
From: prabubio at gmail.com (Prabu Raja)
Date: 20 Oct 2006 00:11:36 -0000
Subject: [Bioperl-l] Prabu Raja sent you this link
Message-ID: <20061020001136.86586.qmail@x05.namesdatabase.com>

Remember your link from Prabu Raja:

http://namesdatabase.com/2m.pl?k2=40892642637&s=nsender2


1 -> Use Prabu Raja's link by clicking above.

2 -> Enter your info for a membership connected to Prabu.

3 -> Share links with other friends, family and co-workers.

4 -> Use the members-only people search tools.

Prabu selected you for this on 09-02-2004 22:52 ET.


prabubio at gmail.com (Prabu Raja) initiated this to bioperl-l at lists.open-bio.org
at 10-05-2006 02:30 on namesdatabase.com from the IP address 59.92.88.99.
If you do not know a Prabu Raja, use http://namesdatabase.com/u.pl?bb2=40892642637&s=nsender2 to halt more reminders about this.
For reference, the address of The Names Database is 1253 N. Research Way, Suite Q-2500, Orem, UT 84097.

From cjfields at uiuc.edu  Thu Oct 19 20:29:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 19:29:11 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <45380913.3070506@infotech.monash.edu.au>
Message-ID: <000f01c6f3de$c3d91170$15327e82@pyrimidine>

> Nathan,
> 
> > use strict;
> > use Bio::Root::IO;  # cant test for this, might be needed to get
> Test::More
> 
> use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway,
> and File::Spec is "guaranteed" to be installed with Perl 5.6+.
> 
> >     use lib Bio::Root::IO->catfile('t','lib');
> 
> Simpler as:
> 	use lib 't/lib';
> I understand the 'lib.pm' accepts Unix style directories REGARDLESS of
> native
> platform.
> 
> --
> Torsten Seemann
> Victorian Bioinformatics Consortium, Monash University, Australia

That is true, at least for WinXP (not sure about older Windows versions out
there).  I was using 'Root::IO->catfile' but found 'use lib 't/lib' works.
I may have a few of the 'catfile' versions floating around out there, which
may be where that originated.

Note that if you plan on using Test::More with the bioperl-run test suite,
you should add it to the bioperl-run CVS distribution directory in 't/lib'.
Most people will have it installed, but you never know.

Chris


From cjfields at uiuc.edu  Thu Oct 19 20:33:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 19:33:22 -0500
Subject: [Bioperl-l] Prabu Raja sent you this link
In-Reply-To: <20061020001136.86586.qmail@x05.namesdatabase.com>
Message-ID: <001001c6f3df$598a24c0$15327e82@pyrimidine>

That Prabu Raja sure gets around...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Prabu Raja
> Sent: Thursday, October 19, 2006 7:12 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Prabu Raja sent you this link
> 
> Remember your link from Prabu Raja:
> 
> http://namesdatabase.com/2m.pl?k2=40892642637&s=nsender2
> 
> 
> 1 -> Use Prabu Raja's link by clicking above.
> 
> 2 -> Enter your info for a membership connected to Prabu.
> 
> 3 -> Share links with other friends, family and co-workers.
> 
> 4 -> Use the members-only people search tools.
> 
> Prabu selected you for this on 09-02-2004 22:52 ET.
> 
> 
> prabubio at gmail.com (Prabu Raja) initiated this to bioperl-l at lists.open-
> bio.org
> at 10-05-2006 02:30 on namesdatabase.com from the IP address 59.92.88.99.
> If you do not know a Prabu Raja, use
> http://namesdatabase.com/u.pl?bb2=40892642637&s=nsender2 to halt more
> reminders about this.
> For reference, the address of The Names Database is 1253 N. Research Way,
> Suite Q-2500, Orem, UT 84097.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From keithplayer at hotmail.com  Thu Oct 19 22:13:52 2006
From: keithplayer at hotmail.com (Keith Player)
Date: Fri, 20 Oct 2006 02:13:52 +0000 (UTC)
Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning
Message-ID: <loom.20061020T041338-193@post.gmane.org>

I know that there may be some changes resulting from new GFF3 implementations, 
but thought I would see if the following is useful anyway.

I implemented the R-tree binning schema as used by Bio::DB::GFF::Util::Binning 
and as mentioned in this article:

I tested the following query on a normal table (no binning), but it assumes 
that you know the longest range in the table.  So for example with a table of 
human genes, where the longest gene we know of is around 2.4Mb.

 SELECT COUNT(*) as count FROM groups g WHERE g.start > max(0,[start-2.4Mb]) AND 
g.start < [end] AND g.end > [start] AND g.chromosome = '1'

so for 100Mb:101Mb

SELECT COUNT(*) as count FROM groups g WHERE g.start > 97600000 AND g.start < 
101000000 AND g.end > 100000000 AND g.chromosome = '1'


where [start] and [end] define the region of interest.  This query outperforms 
the R-Tree implementation on all tests that I have performed (for lengths of 
200bp to 10Mb across a whole chromosome).  Could this be of some practical use?
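
To make the arithmetic concrete, here is a small sketch of how the bind values for such a query could be computed. The helper name range_query_bounds is hypothetical; the table and column names are taken from the query above, and the 2.4Mb maximum feature length is the figure quoted for human genes.

```perl
use strict;
use warnings;

# Hypothetical helper: given a region and the longest feature in the table,
# return the three numeric bounds used by the query below.
sub range_query_bounds {
    my ($start, $end, $max_len) = @_;
    my $lower = $start - $max_len;
    $lower = 0 if $lower < 0;          # same effect as max(0, start - max_len)
    return ($lower, $end, $start);
}

my $sql = q{SELECT COUNT(*) AS count FROM groups g
            WHERE g.start > ? AND g.start < ? AND g.end > ?
              AND g.chromosome = ?};

my @bind = (range_query_bounds(100_000_000, 101_000_000, 2_400_000), '1');
print join(', ', @bind), "\n";    # prints 97600000, 101000000, 100000000, 1
```

The lower bound on g.start is what lets an ordinary index on start limit the scan: no feature starting more than the maximum length before the region can reach into it.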


From jason at bioperl.org  Thu Oct 19 11:50:49 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 08:50:49 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
	<0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
Message-ID: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>

Well there is explicit addition of the version to the primary id so  
it isn't so much a parsing error as a deliberate decision to append it.
see Bio::SeqIO::genbank

to make the dblink
    $annotation->add_Annotation
        ('dblink',
         Bio::Annotation::DBLink->new
         (-primary_id => $id . "." . $version,
          -version    => $version,
          -database   => $db,
          -tagname    => 'dblink'));

and the code to print the dblink back out in the writer already  
assumes the version number is appended...

         foreach my $ref ( $seq->annotation->get_Annotations('dblink') ) {
             # if ($ref->comment eq 'DBSOURCE') {
             $self->_print('DBSOURCE    accession ',
                           $ref->primary_id, "\n");
             # }
         }
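
To show how the two pieces interact, here is a plain-Perl sketch that does not use BioPerl at all. The function dblink_as_text is hypothetical; it mimics the string overload quoted below and adds the kind of guard Erikjan asked about, appending the version only when primary_id does not already end in it. This is one possible fix, not what the module currently does.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the DBLink string overload, with a guard
# against appending a version that primary_id already carries.
sub dblink_as_text {
    my (%link) = @_;
    my $db      = defined $link{database}   ? $link{database}   : '';
    my $id      = defined $link{primary_id} ? $link{primary_id} : '';
    my $version = defined $link{version}    ? $link{version}    : '';

    my $text = length $db ? "$db:" : '';
    $text .= $id;
    $text .= ".$version"
        if length $version && $id !~ /\.\Q$version\E\z/;
    return $text;
}

# The parser stores the version inside primary_id as well as in version.
print dblink_as_text(database   => 'GenBank',
                     primary_id => 'AL591065.17',
                     version    => 17), "\n";    # prints GenBank:AL591065.17
```

Without the guard the same input would give GenBank:AL591065.17.17, which is the output reported at the start of the thread.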

On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:

> Here is the overload code:
>
> use overload '""' => sub {
> 	(($_[0]->database ? $_[0]->database . ':' : '' )
> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
> 	|| '' };
>
> Except that the last '||' is redundant and unnecessary (it either  
> does nothing or replaces an empty string with an empty string), I  
> don't see the potential for duplicating the version number here -  
> unless primary_id() did that, which I don't see it doing.
>
> So, to me this seems to come from a parsing error in the beginning,  
> rather than an erroneous mangling of version into primary_id later.
>
> Is someone in the position to confirm this?
>
> 	-hilmar
>
> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>
>> So I'm unsure what we should do here.
>>
>> We can certainly fix the problem which you report which is relying on
>> the "" method -- if you were to do instead:
>> print $_->database, ":", $_->primary_id, "\n";
>>
>> you'll get the right answer.  At a minimum we can just fix the
>> auto-string converting method to do The Right Thing.
>>
>> But I am not sure if we should keep the version out of the primary_id
>> field.  This will require some rejiggering in several modules when it
>> comes to printing DBlinks and I don't want to do this before the
>> release. I also am not sure if there was an explicit reason why
>> someone did put the version information in the primary_id. (I hope it
>> wasn't me because I don't think I'm going to remember why).
>>
>> Does anyone else have a strong feeling?
>>
>> -jason
>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>
>>> Hello,
>>>
>>> I noticed a little problem with the Annotation "DBLink" from
>>> GenBank entries
>>>
>>> When I run:
>>>
>>> perl -MBio::DB::GenBank -e 'my $gi =
>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my  
>>> $seqio =
>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>>> ("dblink");
>>> for(@annotations) { print $_, "\n";} print $INC{
>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>>
>>> This yields:
>>>
>>>    GenBank:AL591065.17.17
>>>
>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>
>>> Can others repeat this?
>>>
>>> I have dug into the source a little and Bio::Annotation::DBLink
>>> seems to
>>> be the place where this happens: it has a concatenation which  
>>> leads to
>>> that repeated version number.
>>>
>>> Is this something that I should fix "client-side", so to speak, or
>>> is it
>>> worthwhile to add some logic to that concatenation to prevent this?
>>>
>>>
>>> Thanks,
>>>
>>> Eric
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> --
>> Jason Stajich, PhD
>> Miller Research Fellow
>> University of California
>> Dept of Plant and Microbial Biology
>> 321 Koshland Hall #3102
>> Berkeley, CA 94720-3102
>> lab: 510.642.8441
>> http://pmb.berkeley.edu/~taylor/people/js.html
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From n.haigh at sheffield.ac.uk  Fri Oct 20 04:35:03 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 20 Oct 2006 08:35:03 +0000
Subject: [Bioperl-l] test::more template
In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
Message-ID: <45388A37.7040505@sheffield.ac.uk>

Chris Fields wrote:
>> Nathan,
>>
>>     
>>> use strict;
>>> use Bio::Root::IO;  # cant test for this, might be needed to get
>>>       
>> Test::More
>>
>> use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway,
>> and File::Spec is "guaranteed" to be installed with Perl 5.6+.
>>
>>     
>>>     use lib Bio::Root::IO->catfile('t','lib');
>>>       
>> Simpler as:
>> 	use lib 't/lib';
>> I understand the 'lib.pm' accepts Unix style directories REGARDLESS of
>> native
>> platform.
>>
>> --
>> Torsten Seemann
>> Victorian Bioinformatics Consortium, Monash University, Australia
>>     
>
> That is true, at least for WinXP (not sure about older Windows versions out
> there).  I was using 'Root::IO->catfile' but found 'use lib 't/lib' works.
> I may have a few of the 'catfile' versions floating around out there, which
> may be where that originated.
>
> Note that if you plan on using Test::More with the bioperl-run test suite,
> you should add it to the bioperl-run CVS distribution directory in 't/lib'.
> Most people will have it installed, but you never know.
>
> Chris
>
>
>   
What is the reason for including Test::More in 't/lib' rather than
having it as a prereq?

-- 
> A: Yes.
>> Q: Are you sure?
>>     
>>> A: Because it reverses the logical flow of conversation.
>>>       
>>>> Q: Why is top posting frowned upon?
>>>>         
Get Thunderbird <http://www.mozilla.org/products/thunderbird/>

From n.haigh at sheffield.ac.uk  Fri Oct 20 05:27:19 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 20 Oct 2006 10:27:19 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
Message-ID: <45389677.1000709@sheffield.ac.uk>

Is it really necessary to specify the number of tests that are to be
conducted in advance? It seems a bit annoying to have to count the
number of tests in the script or to run the test just to see how many
tests were done, we could just use:
use Test::More 'no_plan';

And then it's up to Test::More to keep a track of how many tests it's
run. The only thing then to worry about is how many tests are in a SKIP
block if the skip criteria are met. This is unless there is a good
reason to use it that I am unaware of.

Thanks
Nath

From bix at sendu.me.uk  Fri Oct 20 06:01:09 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 11:01:09 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45389677.1000709@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45389677.1000709@sheffield.ac.uk>
Message-ID: <45389E65.6080908@sendu.me.uk>

Nathan Haigh wrote:
> Is it really necessary to specify the number of tests that are to be
> conducted in advance? It seems a bit annoying to have to count the
> number of tests in the script or to run the test just to see how many
> tests were done, we could just use:
> use Test::More 'no_plan';

It's very important to have a plan. That way you know all the tests 
actually ran and weren't skipped (either due to an actual SKIP block or 
an if block that returned false due to a bug, or a for/foreach/while 
that didn't loop enough times due to a bug, or any number of other reasons).
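
A small sketch of the looping case; the accession list is made up for the example. With 'no_plan', an accidentally empty @ids would run zero tests and still report success. With a declared plan, Test::More complains that fewer tests ran than were planned.

```perl
use strict;
use warnings;
use Test::More tests => 3;    # one test per id expected below

# Illustrative data; in a real test this might come from a parser,
# which is exactly where a bug could leave the list short or empty.
my @ids = qw(AL591065 X02415 J00522);

for my $id (@ids) {
    like($id, qr/\A[A-Z]+\d+\z/, "id $id looks like an accession");
}
```

If the loop ran only twice, the harness would report the mismatch between the 3 planned tests and the 2 that ran, rather than passing silently.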

From bix at sendu.me.uk  Fri Oct 20 06:04:48 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 11:04:48 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45388A37.7040505@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45388A37.7040505@sheffield.ac.uk>
Message-ID: <45389F40.5060601@sendu.me.uk>

Nathan S. Haigh wrote:
> Chris Fields wrote:
>
>> Note that if you plan on using Test::More with the bioperl-run test suite,
>> you should add it to the bioperl-run CVS distribution directory in 't/lib'.
>> Most people will have it installed, but you never know.
>
> What is the reason for including Test::More in 't/lib' rather than
> having it as a prereq?

Because we want to ensure that the test suite runs and tells you real 
problems (if any) about the code (Bioperl) that it is testing, not 
problems about actually running the tests (which are NOT required for 
using Bioperl, so cannot be considered 'pre-requisites').


From n.haigh at sheffield.ac.uk  Fri Oct 20 06:54:30 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 20 Oct 2006 11:54:30 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45389E65.6080908@sendu.me.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk>
Message-ID: <4538AAE6.5070600@sheffield.ac.uk>

If there are known bugs in a particular version of software, what is the
best approach for dealing with tests that would fail due to this bug?
Simply skip those tests that would be affected by the bug, or to fail if
the affected version is detected and report the reason so the user is
informed? Or simply bump the minimum version to one above the affected
versions?

For example, t/Clustalw has a test for at least version 1.8. It then has
some profile alignment tests that are only run if version > 1.82 is
installed. It states that versions 1.81 and 1.82 are affected by a
profile alignment bug - which I assume would make the tests fail.
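
For discussion, the "skip on affected versions" option could be sketched like this. The $version value is a hard-coded stand-in (a real test would ask the wrapper for it), and the ok() calls are placeholders for the actual profile alignment tests.

```perl
use strict;
use warnings;
use Test::More tests => 3;

my $version = '1.83';    # stand-in; normally reported by the program
my %profile_bug = map { $_ => 1 } qw(1.81 1.82);    # versions named above

cmp_ok($version, '>=', 1.8, 'clustalw is at least version 1.8');

SKIP: {
    skip("profile alignment bug in clustalw $version", 2)
        if $profile_bug{$version};
    ok(1, 'placeholder for first profile alignment test');
    ok(1, 'placeholder for second profile alignment test');
}
```

The skip message tells the user why the tests did not run, and the plan still counts the skipped tests, so the total stays at 3 either way.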

Cheers
Nath

From bix at sendu.me.uk  Fri Oct 20 07:06:07 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 12:06:07 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <4538AAE6.5070600@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk>
	<4538AAE6.5070600@sheffield.ac.uk>
Message-ID: <4538AD9F.8040003@sendu.me.uk>

Nathan Haigh wrote:
> If there are known bugs in a particular version of software, what is the
> best approach for dealing with tests that would fail due to this bug?
> Simply skip those tests that would be affected by the bug, or to fail if
> the affected version is detected and report the reason so the user is
> informed? Or simply bump the minimum version to one above the affected
> versions?
> 
> For example, t/Clustalw has a test for at least version 1.8. It then has
> some profile alignment tests that are only run if version > 1.82 is
> installed. It states that versions 1.81 and 1.82 are affected by a
> profile alignment bug - which I assume would make the tests fail.

Specific cases like this, I'd discuss on the list/ with the author of
the module in question. Maybe there is some great need to allow usage
with <1.81?

My view, based purely on what you've said above, bump the pre-requisite
to a version that works.


From cjfields at uiuc.edu  Fri Oct 20 08:36:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 07:36:37 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <45388A37.7040505@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45388A37.7040505@sheffield.ac.uk>
Message-ID: <80A2D210-B0DB-4CD2-9B56-A38097F4F63F@uiuc.edu>


>> ...
>>
> What is the reason for including Test::More in 't/lib' rather than
> having it as a prereq?

We could do that.  Many CPAN modules include it in 't/lib' b/c it is  
only needed for testing purposes.

Chris

>
> -- 
>> A: Yes.
>>> Q: Are you sure?
>>>
>>>> A: Because it reverses the logical flow of conversation.
>>>>
>>>>> Q: Why is top posting frowned upon?
>>>>>
> Get Thunderbird <http://www.mozilla.org/products/thunderbird/>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Fri Oct 20 10:44:29 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 15:44:29 +0100
Subject: [Bioperl-l] Updated Makefile.PL
Message-ID: <4538E0CD.1030908@sendu.me.uk>

Hi,
I've just committed an updated Makefile.PL to HEAD for bioperl-live. 
Could some people test it on multiple platforms and confirm it is ok 
(try out the different possible options as well)?

(NB. in the below, 'pre-reqs' are things the makefile considers optional 
dependencies)

Note that some pre-reqs have been removed:
# DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end 
up requiring it but only after the user makes an explicit choice by 
typing 'DBD::mysql' in their own code to supply as an option to Bioperl 
code)
# File::Temp (standard in 5.6.1)


This pre-req was wrong:
# Data::Stag::Writer
and has been replaced with:
Data::Stag::XMLWriter


Also, I note that very many Bioperl modules need IO::String, including 
Bio::SeqIO, so I'm not sure to what extent we can pretend it is an 
optional module. I didn't make any change though.
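
As an aside, this is roughly how a script can check for an optional module at run time without failing outright. The helper name have_module is hypothetical and this is not taken from the Makefile.PL itself; it is only a sketch of the general technique.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $module can be loaded, false otherwise.
sub have_module {
    my ($module) = @_;
    # Only accept plain module names before handing them to string eval.
    return 0 unless $module =~ /\A\w+(?:::\w+)*\z/;
    return eval "require $module; 1" ? 1 : 0;
}

for my $module (qw(IO::String Data::Stag::XMLWriter)) {
    printf "%-24s %s\n", $module,
        have_module($module) ? 'installed' : 'missing (optional)';
}
```

Code that needs such a module can then degrade gracefully or tell the user exactly what to install.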


I don't know if these changes affect the Windows ppm Nathan, or anything 
else (Bundle?)?

The INSTALL docs need updating with these new and improved pre-reqs 
(note that some pre-reqs had wrong/not enough Bioperl modules listed as 
needing them); does someone want to correct the wiki (based on the new 
Makefile.PL) and then Chris can re-create the text version?

From hlapp at gmx.net  Fri Oct 20 11:03:34 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 20 Oct 2006 11:03:34 -0400
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <4538E0CD.1030908@sendu.me.uk>
References: <4538E0CD.1030908@sendu.me.uk>
Message-ID: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>


On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote:

> Also, I note that very many Bioperl modules need IO::String, including
> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
> optional module. I didn't make any change though.

I agree. There's really not that many terribly useful things you can  
do with Bioperl w/o having IO::String installed, which is in stark  
contrast to many other dependencies.

I don't have a problem with making it (and a few others used all over  
the place) required, to better contrast them with the dependencies  
that are really optional (and not needed for 90% of users).

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Oct 20 11:18:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 10:18:32 -0500
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <4538E0CD.1030908@sendu.me.uk>
Message-ID: <001501c6f45b$019103c0$15327e82@pyrimidine>

> Hi,
> I've just committed an updated Makefile.PL to HEAD for bioperl-live.
> Could some people test it on multiple platforms and confirm it is ok
> (try out the different possible options as well)?
> 
> (NB. in the below, 'pre-reqs' are things the makefile considers optional
> dependencies)
> 
> Note that some pre-reqs have been removed:
> # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end
> up requiring it but only after the user makes an explicit choice by
> typing 'DBD::mysql' in their own code to supply as an option to Bioperl
> code)
> # File::Temp (standard in 5.6.1)

I'll try it out on WinXP and Mac OS X.  BTW, do any of Lincoln's Bio::DB*
use DBD::mysql?  Bio::DB::GFF comes to mind.  I don't think it should be an
absolute requirement, though.

If we plan on removing those, then we should also remove them from
Bundle::Bioperl (if they are present).

> This pre-req was wrong:
> # Data::Stag::Writer
> and has been replaced with:
> Data::Stag::XMLWriter
> 
> 
> Also, I note that very many Bioperl modules need IO::String, including
> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
> optional module. I didn't make any change though.

Do they all require IO::String or is it an option?  There are a few
instances (WebDBSeqI-implementing, for instance) where this is presented as
an option for most OS's (along with the default, pipeline, and tempfile).
However, it is currently used by default with Windows due to lack of
pipe/fork support at the time.

BTW, the latter may now work with WinXP ActivePerl.  ActiveState has been
working on WinXP fork() emulation for a while, but I think it is still
somewhat experimental.  

> I don't know if these changes affect the Windows ppm Nathan, or anything
> else (Bundle?)?
> 
> The INSTALL docs need updating with these new and improved pre-reqs
> (note that some pre-reqs had wrong/not enough Bioperl modules listed as
> needing them); does someone want to correct the wiki (based on the new
> Makefile.PL) and then Chris can re-create the text version?

Easier to just modify the text version based on what is changed in the wiki,
at least for the time being.  The text dumping from elinks/lynx isn't
foolproof re: tables and such, which is one reason I think we should move
the prereqs to a separate file as it's easier to maintain long-term (this
seems to be where most changes occur anyway).  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Fri Oct 20 11:23:38 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 16:23:38 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk>
References: <45375615.1020603@sheffield.ac.uk>	<45379BBB.1040400@sheffield.ac.uk>
	<1161270180.453793a432e4f@webmail.shef.ac.uk>
Message-ID: <4538E9FA.60701@sendu.me.uk>

Nathan Haigh wrote:
> I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be
> consistent with other tests.
> 
> Failing that - Is there a good test writing style I should follow in one of the other test files?

I originally based mine on one of Chris's EUtilities tests, but now 
refer to t/ESEfinder.t since it is small and demonstrates all the major 
tricky things you might have to do - skip remote tests if no 
BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests 
under some condition, fall-back to t/lib for Test::More if necessary.
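
For readers without the file to hand, the first two cases can be sketched as below. This is not a copy of t/ESEfinder.t; fetch_remote is a hypothetical stand-in for whatever call contacts the server, and here it simply returns a fixed string.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical stand-in for a call that talks to a remote server.
# A real one would die (setting $@ under eval) on a network problem.
sub fetch_remote { return 'ACGT' }

ok(1, 'local tests always run');

SKIP: {
    # Case 1: no BIOPERLDEBUG, so do not touch the network at all.
    skip('remote tests need BIOPERLDEBUG=1', 2) unless $ENV{BIOPERLDEBUG};

    # Case 2: the server is unreachable, so skip rather than fail.
    my $result = eval { fetch_remote() };
    skip("remote server problem: $@", 2) if $@;

    ok(defined $result, 'got a result from the server');
    is($result, 'ACGT', 'result has the expected content');
}
```

Either way the plan of 3 is satisfied, so an install is not blocked by a server that happens to be down.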

(Though I just spotted an oops in the latter...)

From cjfields at uiuc.edu  Fri Oct 20 11:38:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 10:38:02 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <4538E9FA.60701@sendu.me.uk>
Message-ID: <001601c6f45d$bb824350$15327e82@pyrimidine>

> Nathan Haigh wrote:
> > I thought I'd have my first proper try at writing some tests. I was
> wondering if there is a template test file that I should use/study in
> order to be
> > consistent with other tests.
> >
> > Failing that - Is there a good test writing style I should follow in one
> of the other test files?
> 
> I originally based mine on one of Chris's EUtilities tests, but now
> refer to t/ESEfinder.t since it is small and demonstrates all the major
> tricky things you might have to do - skip remote tests if no
> BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests
> under some condition, fall-back to t/lib for Test::More if necessary.
> 
> (Though I just spotted an oops in the latter...)

I agree.  The EUtilities tests are quite long.  I plan on eventually cutting
out some of them.  Making them somewhat less prone to changes in returned XML
data has also been a pain, as demonstrated by some of the tests from MAIN
now failing... d'oh!

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From bix at sendu.me.uk  Fri Oct 20 11:39:32 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 16:39:32 +0100
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <001501c6f45b$019103c0$15327e82@pyrimidine>
References: <001501c6f45b$019103c0$15327e82@pyrimidine>
Message-ID: <4538EDB4.3030500@sendu.me.uk>

Chris Fields wrote:
> BTW, do any of Lincoln's Bio::DB*
> use DBD::mysql?  Bio::DB::GFF comes to mind.

No, just a require on a user-passed variable as I described.


>> Also, I note that very many Bioperl modules need IO::String, including
>> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
>> optional module. I didn't make any change though.
> 
> Do they all require IO::String or is it an option?

Oops, I take that back. Bio::SeqIO doesn't use IO::String. That's what 
you get for relying on grep output...
It's still many modules that use it, but I suppose you could do useful 
things without. So actually, let's keep it optional.

From cjfields at uiuc.edu  Fri Oct 20 16:32:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 15:32:32 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
Message-ID: <000001c6f486$df508930$15327e82@pyrimidine>


Seth, 

Did you work out the problem here?  There was a recent CVS update to OBDA
tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
apparently left data from tests in the database, which caused problems with
repeated test runs.

Chris

> > -----Original Message-----
> > From: bioperl-l-bounces <at> lists.open-bio.org [mailto:bioperl-l-
> > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ...
>
> > FAILED tests 10-12
> >     Failed 3/12 tests, 75.00% okay
> > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> >
> --------------------------------------------------------------------------
> > -----
> > t\02species.t                 65    2   3.08%  63 65
> > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > t\16obda.t                    12    3  25.00%  10-12
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l <at> lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

-- 
Best Regards,

Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From olenka.m at gmail.com  Fri Oct 20 17:47:15 2006
From: olenka.m at gmail.com (Olena Morozova)
Date: Fri, 20 Oct 2006 14:47:15 -0700
Subject: [Bioperl-l] GO annotations
Message-ID: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com>

Dear all,

Does anyone know an easy way to get GO-BP annotations for ensembl genes?

Thank you very much for your help,
Olena

From sdavis2 at mail.nih.gov  Sat Oct 21 11:05:26 2006
From: sdavis2 at mail.nih.gov (Davis, Sean (NIH/NCI) [E])
Date: Sat, 21 Oct 2006 11:05:26 -0400
Subject: [Bioperl-l] GO annotations
References: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com>
Message-ID: <014DBF86B19310419F0DF8910FC56457240CE3@nihcesmlbx10.nih.gov>

You can use the ensembl perl API, or (more simply) use the Ensembl MART interface:

http://www.ensembl.org/Multi/martview

Sean


-----Original Message-----
From: Olena Morozova [mailto:olenka.m at gmail.com]
Sent: Fri 10/20/2006 5:47 PM
To: bioperl-l
Subject: [Bioperl-l] GO annotations
 
Dear all,

Does anyone know an easy way to get GO-BP annotations for ensembl genes?

Thank you very much for your help,
Olena


From n.haigh at sheffield.ac.uk  Sun Oct 22 06:34:51 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 22 Oct 2006 10:34:51 +0000
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
References: <4538E0CD.1030908@sendu.me.uk>
	<3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
Message-ID: <453B494B.7040702@sheffield.ac.uk>

Hilmar Lapp wrote:
> On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote:
>
>   
>> Also, I note that very many Bioperl modules need IO::String, including
>> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
>> optional module. I didn't make any change though.
>>     
>
> I agree. There's really not that many terribly useful things you can  
> do with Bioperl w/o having IO::String installed, which is in stark  
> contrast to many other dependencies.
>
> I don't have a problem with making it (and a few others used all over  
> the place) required, to better contrast them with the dependencies  
> that are really optional (and not needed for 90% of users).
>
> 	-hilmar
>
>   

Is it possible to  make a distinction in Makefile.PL between those
modules that are an absolute must for Bioperl-core and those which are
optional and should go into Bundle::BioPerl?

Once I'm sure what should be "optional" I'll do the Bundle::BioPerl
package and PPD's.

Cheers
Nath


From vitacolonna at appliedgenomics.org  Sun Oct 22 09:04:48 2006
From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna)
Date: Sun, 22 Oct 2006 15:04:48 +0200
Subject: [Bioperl-l] Submission proposal: ABIF module
Message-ID: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>

Hi everybody,
I would like to submit to CPAN a module for reading and parsing the  
ABIF files (with .ab1 suffix) produced by Applied Biosystems  
sequencers. The need for such a module arose in our lab because the  
existing ABI module we found on CPAN had too limited functionality.  
As an example, our module allows us to easily produce analysis  
reports similar to the ones generated by the Sequencing Analysis  
software.

May I call the module Bio::ABIF? Or should I follow other conventions?

Nicola

From cjfields at uiuc.edu  Sun Oct 22 09:54:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 08:54:51 -0500
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
Message-ID: <F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>


On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:

> Hi everybody,
> I would like to submit to CPAN a module for reading and parsing the
> ABIF files (with .ab1 suffix) produced by Applied Biosystems
> sequencers. The need for such a module arose in our lab because the
> existing ABI module we found on CPAN had too limited functionality.
> As an example, our module allows us to easily produce analysis
> reports similar to the ones generated by the Sequencing Analysis
> software.
>
> May I call the module Bio::ABIF? Or should I follow other conventions?
>
> Nicola

It depends.  Does it interact with bioperl in any way?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Oct 22 09:57:18 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 08:57:18 -0500
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <453B494B.7040702@sheffield.ac.uk>
References: <4538E0CD.1030908@sendu.me.uk>
	<3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
	<453B494B.7040702@sheffield.ac.uk>
Message-ID: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>


On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote:

> Is it possible to  make a distinction in Makefile.PL between those
> modules that are an absolute must for Bioperl-core and those which are
> optional and should go into Bundle::BioPerl?
>
> Once I'm sure what should be "optional" I'll do the Bundle::BioPerl
> package and PPD's.
>
> Cheers
> Nath

We probably should steer this way eventually.  Do you plan on placing  
prereqs required for bioperl core in the bioperl PPD and the  
'optional' ones with the bundle?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From vitacolonna at appliedgenomics.org  Sun Oct 22 10:16:26 2006
From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna)
Date: Sun, 22 Oct 2006 16:16:26 +0200
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
	<F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>
Message-ID: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>


On 22/ott/06, at 15:54, Chris Fields wrote:

>
> On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:
>
>> Hi everybody,
>> I would like to submit to CPAN a module for reading and parsing the
>> ABIF files (with .ab1 suffix) [...]
>> May I call the module Bio::ABIF? Or should I follow other  
>> conventions?
>
> It depends.  Does it interact with bioperl in any way?

No. Can you suggest a suitable pattern for the name?

Nicola

From cjfields at uiuc.edu  Sun Oct 22 10:55:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 09:55:46 -0500
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
	<F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>
	<8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>
Message-ID: <B4155C40-8E3D-4AA0-88F5-7A1FFBD3A134@uiuc.edu>

On Oct 22, 2006, at 9:16 AM, Nicola Vitacolonna wrote:

> On 22/ott/06, at 15:54, Chris Fields wrote:
>
>>
>> On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:
>>
>>> Hi everybody,
>>> I would like to submit to CPAN a module for reading and parsing the
>>> ABIF files (with .ab1 suffix) [...]
>>> May I call the module Bio::ABIF? Or should I follow other
>>> conventions?
>>
>> It depends.  Does it interact with bioperl in any way?
>
> No. Can you suggest a suitable pattern for the name?
>
> Nicola

I don't think it will be a problem to name it Bio::ABIF; there is  
already a Bio::ASN1::EntrezGene, and Rutger Vos's Bio::Phylo modules  
(the latter doesn't require BioPerl either).

Saying that, if you plan on contributing more CPAN modules with  
similar functionality (such as parsing other trace files), you might  
want to consider using a namespace that isn't limiting but doesn't  
conflict with Bioperl core (like Bio::Trace or similar, then name  
your module Bio::Trace::ABIF).  You can use search.cpan.org to check  
namespaces for conflicts.

Just as a note: we have bioperl-ext, which also parses ABI and other  
trace file formats.  It's a bit old now and needs updating, but is  
supposed to be quite fast (it uses the Staden io_lib C library via  
PerlXS).

-c

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From arareko at campus.iztacala.unam.mx  Sun Oct 22 13:26:37 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Sun, 22 Oct 2006 12:26:37 -0500
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <4538E0CD.1030908@sendu.me.uk>
References: <4538E0CD.1030908@sendu.me.uk>
Message-ID: <453BA9CD.4060107@campus.iztacala.unam.mx>

Works fine on FreeBSD.

Mauricio.

Sendu Bala wrote:
> Hi,
> I've just committed an updated Makefile.PL to HEAD for bioperl-live. 
> Could some people test it on multiple platforms and confirm it is ok 
> (try out the different possible options as well)?
> 
> (NB. in the below, 'pre-reqs' are things the makefile considers optional 
> dependencies)
> 
> Note that some pre-reqs have been removed:
> # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end 
> up requiring it but only after the user makes an explicit choice by 
> typing 'DBD::mysql' in their own code to supply as an option to Bioperl 
> code)
> # File::Temp (standard in 5.6.1)
> 
> 
> This pre-req was wrong:
> # Data::Stag::Writer
> and has been replaced with:
> Data::Stag::XMLWriter
> 
> 
> Also, I note that very many Bioperl modules need IO::String, including 
> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an 
> optional module. I didn't make any change though.
> 
> 
> I don't know if these changes affect the Windows ppm Nathan, or anything 
> else (Bundle?)?
> 
> The INSTALL docs need updating with these new and improved pre-reqs 
> (note that some pre-reqs had wrong/not enough Bioperl modules listed as 
> needing them); does someone want to correct the wiki (based on the new 
> Makefile.PL) and then Chris can re-create the text version?

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM

From n.haigh at sheffield.ac.uk  Sun Oct 22 15:37:07 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 22 Oct 2006 20:37:07 +0100
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>
References: <4538E0CD.1030908@sendu.me.uk>
	<3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
	<453B494B.7040702@sheffield.ac.uk>
	<7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>
Message-ID: <453BC863.4090803@sheffield.ac.uk>

Chris Fields wrote:
>
> On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote:
>
>> Is it possible to  make a distinction in Makefile.PL between those
>> modules that are an absolute must for Bioperl-core and those which are
>> optional and should go into Bundle::BioPerl?
>>
>> Once I'm sure what should be "optional" I'll do the Bundle::BioPerl
>> package and PPD's.
>>
>> Cheers
>> Nath
>
> We probably should steer this way eventually.  Do you plan on placing 
> prereqs required for bioperl core in the bioperl PPD and the 
> 'optional' ones with the bundle?
>
That's correct. However, PPM will always try to update packages to the 
latest available. Therefore, if at some point in the future, a 
dependency is removed, and thus removed from Bundle::BioPerl, a 
situation may arise where an older version of BioPerl is running with 
a recent version of Bundle::BioPerl and could have missing 
dependencies - not ideal but it is how things currently stand. The 
process of making the Bundle::BioPerl PPD would be simplified if these 
"optional" dependencies were separated from the "core" dependencies. If 
one of the following solutions is possible (I'm not sure if they are), 
it would be very useful:

1) Maintain 2 hashes in Makefile.PL that contain the "core" and 
"optional" dependencies. I'm unsure of the way dependencies are ordered 
during a "make ppd", but it may be possible to pass hash references of 
both to PREREQ_PM in WriteMakefile() and have the "optional" dependencies 
grouped separately from "core" dependencies in the ppd file - thus making 
it easy to strip them out into a Bundle::BioPerl ppd.

2) Again, maintain 2 hashes in Makefile.PL that contain the "core" and 
"optional" dependencies. Have some Makefile setup that allows the 
generation of a Bundle::BioPerl ppd separately from the main Bioperl ppd.

Like I said, these are just some thoughts and I'm not sure if they are 
even viable options.
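
The two-hash idea in both options above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the
module names, the split between "core" and "optional", and the helper
names prereq_pm() and bundle_contents() are hypothetical and are not
taken from the real Makefile.PL.

```perl
use strict;
use warnings;

# Hypothetical split of prerequisites; the names and the grouping are
# illustrative only, not the actual contents of Bioperl's Makefile.PL.
# A version of 0 means "any version".
my %core     = ( 'IO::String' => 0 );
my %optional = (
    'Data::Stag::XMLWriter' => 0,
    'HTTP::Request::Common' => 0,
);

# Merge both groups into the single hash reference that
# WriteMakefile() expects for its PREREQ_PM argument.
sub prereq_pm {
    my ($core, $optional) = @_;
    my %merged = ( %$optional, %$core );
    return \%merged;
}

# List the optional module names, sorted, one per line, roughly as
# they might be listed in a Bundle:: definition file.
sub bundle_contents {
    my ($optional) = @_;
    return join "\n", sort keys %$optional;
}

my $prereqs = prereq_pm( \%core, \%optional );
print "PREREQ_PM: ", join( ', ', sort keys %$prereqs ), "\n";
print "Bundle contents:\n", bundle_contents( \%optional ), "\n";
```

Keeping the two groups as separate hashes means the bundle list can be
generated from %optional alone, while PREREQ_PM still sees everything.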

Nath

From chhalling at alumni.ls.berkeley.edu  Sun Oct 22 19:45:33 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Sun, 22 Oct 2006 19:45:33 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
Message-ID: <453C029D.1070708@alumni.ls.berkeley.edu>

I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 
that prevent these modules from being installed:

Data::Stag::Writer (listed as Data::Stag::writer)
HTTP::Request::Common (listed as HTTP::Request::Common-)
Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel)

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From cjfields at uiuc.edu  Sun Oct 22 22:24:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 21:24:07 -0500
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
Message-ID: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>

Thanks for letting us know!  Did PPM4 throw errors or just silently  
pass them over?

Chris

On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote:

> I have found three misspellings in Bundle::BioPerl 2.1.6 of 17- 
> Oct-2006
> that prevent these modules from being installed:
>
> Data::Stag::Writer (listed as Data::Stag::writer)
> HTTP::Request::Common (listed as HTTP::Request::Common-)
> Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel)
>
> -- 
> Conrad Halling
> chhalling at alumni.ls.berkeley.edu
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Mon Oct 23 02:45:29 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 06:45:29 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
Message-ID: <453C6509.90005@sheffield.ac.uk>

Chris Fields wrote:
> Thanks for letting us know!  Did PPM4 throw errors or just silently  
> pass them over?
>
> Chris
>
> On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote:
>
>   
I believe he is talking about the bundle on cpan and not the ppd. I will
get this updated as soon as possible.

Sendu/Chris - can you confirm to me which Bioperl modules are essential
to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
reason for not putting *all* dependencies into the bundle?

Nath


From bix at sendu.me.uk  Mon Oct 23 02:43:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 07:43:36 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
Message-ID: <453C6498.5@sendu.me.uk>

Conrad Halling wrote:
> I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 
> that prevent these modules from being installed:
> 
> Data::Stag::Writer (listed as Data::Stag::writer)

This should be Data::Stag::XMLWriter

> HTTP::Request::Common (listed as HTTP::Request::Common-)
> Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel)


From bix at sendu.me.uk  Mon Oct 23 02:52:47 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 07:52:47 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C6509.90005@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk>
Message-ID: <453C66BF.1060008@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu/Chris - can you confirm to me which Bioperl modules are essential
> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
> reason for not putting *all* dependencies into the bundle?

AFAIK, there are no essential external dependencies. Everything in 
%packages in Makefile.PL, for example, is optional.

We had the discussion about making all the easy-to-install ones a forced 
requirement anyway (so that most things work out of the box), but 
perhaps we'll hold off on making such a change until after 1.5.2.

From jyotikshah at gmail.com  Mon Oct 23 03:10:43 2006
From: jyotikshah at gmail.com (Jyoti Shah)
Date: Mon, 23 Oct 2006 00:10:43 -0700
Subject: [Bioperl-l] short motif searches
Message-ID: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com>

Hi,

I am interested in searching motifs as small as 6 or 7 nucleotides in
genomic databases. I need exact matches. Is there any bioperl module
available which can help me do this? I tried WU BLAST with word size one,
but I am getting warning messages such as "WARNING: the maximum achievable
score of 7 in context 0 (frame +1) is less than the ungapped cutoff score S2
(=13). Exit code 0...". Any suggestions?

Thanks in advance,
Jyoti

From bix at sendu.me.uk  Mon Oct 23 03:55:40 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 08:55:40 +0100
Subject: [Bioperl-l] short motif searches
In-Reply-To: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com>
References: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com>
Message-ID: <453C757C.1010408@sendu.me.uk>

Jyoti Shah wrote:
> Hi,
> 
> I am interested in searching motifs as small as 6 or 7 nucleotides in
> genomic databases. I need exact matches. Is there any bioperl module
> available which can help me do this?

At 6 or 7bp long doing a simple exact match I should point out you're 
going to get very many hits; are you sure this is an appropriate thing 
to do for your purposes?

Assuming yes, you can use Bio::SeqIO, Bio::Index or Bio::DB::<something> 
to get your genomic sequences of interest, then simply use a normal perl 
regexp on the resulting $seq->seq strings.
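
A minimal sketch of the regexp step, assuming the sequence has already
been retrieved into a plain string (in practice it would come from
$seq->seq). The helper name find_motif() and the example sequence are
made up for illustration, and only the given strand is searched.

```perl
use strict;
use warnings;

# Return the 1-based start positions of every exact occurrence of
# $motif in $sequence, including overlapping occurrences.
sub find_motif {
    my ( $sequence, $motif ) = @_;
    my @starts;
    # The lookahead matches without consuming characters, so hits that
    # overlap each other are all reported. \Q...\E quotes the motif so
    # it is treated as a literal string; /i ignores case.
    while ( $sequence =~ /(?=\Q$motif\E)/gi ) {
        push @starts, pos($sequence) + 1;
    }
    return @starts;
}

# Made-up example sequence; real input would be a $seq->seq string.
my $sequence = 'ACGTGATATCCGATATATCGG';
my @hits     = find_motif( $sequence, 'GATATC' );
print "GATATC found at: @hits\n";
```

To cover the reverse strand as well, the same function could be run on
the reverse complement of the sequence.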

If your motifs are anything like transcription factor binding sites, and 
you have more information than just a single sequence string for the 
motif, investigate Bio::Matrix::PSM.

From bix at sendu.me.uk  Mon Oct 23 04:29:52 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 09:29:52 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C7648.8030004@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk>
	<453C7648.8030004@sheffield.ac.uk>
Message-ID: <453C7D80.80207@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> Sendu/Chris - can you confirm to me which Bioperl modules are essential
>>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
>>> reason for not putting *all* dependencies into the bundle?
>> AFAIK, there are no essential external dependencies. Everything in
>> %packages in Makefile.PL, for example, is optional.
>>
>> We had the discussion about making all the easy-to-install ones a
>> forced requirement anyway (so that most things work out of the box),
>> but perhaps we'll hold off on making such a change until after 1.5.2.
 >
> How are they forced?

They're not. Right now they're optional. I'm suggesting we might change 
that in the future.

If you're asking how we /would/ force them, probably by adding 
PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs 
successfully (or should!) without its optional dependencies given in 
PREREQ_PM because make test succeeds (because tests skip ok when the 
optional dependency isn't there).
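
As a rough sketch of what that change might look like, the arguments
for WriteMakefile() could be assembled as below. PREREQ_PM and
PREREQ_FATAL are real ExtUtils::MakeMaker options; the helper name
makefile_args(), the NAME value and the prerequisite shown are
assumptions for illustration only.

```perl
use strict;
use warnings;

# Assemble the arguments for ExtUtils::MakeMaker's WriteMakefile().
# When $force is true, PREREQ_FATAL is set so that "perl Makefile.PL"
# dies if any module listed in PREREQ_PM is missing.
sub makefile_args {
    my ( $force, %packages ) = @_;
    my %args = (
        NAME      => 'Bio',            # assumed distribution name
        PREREQ_PM => {%packages},
    );
    $args{PREREQ_FATAL} = 1 if $force;
    return %args;
}

# Illustrative prerequisite; a version of 0 means "any version".
my %args = makefile_args( 1, 'IO::String' => 0 );
print join( ', ', sort keys %args ), "\n";

# In a real Makefile.PL this would be followed by:
#   use ExtUtils::MakeMaker;
#   WriteMakefile(%args);
```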

I don't really know how CPAN discovers dependencies and auto-installs 
them before a dependent module though. Anyone care to explain?

From n.haigh at sheffield.ac.uk  Mon Oct 23 06:09:12 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 10:09:12 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C7D80.80207@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk>
	<453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk>
Message-ID: <453C94C8.5040900@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Nathan S. Haigh wrote:
>>>> Sendu/Chris - can you confirm to me which Bioperl modules are
>>>> essential
>>>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
>>>> reason for not putting *all* dependencies into the bundle?
>>> AFAIK, there are no essential external dependencies. Everything in
>>> %packages in Makefile.PL, for example, is optional.
>>>
>>> We had the discussion about making all the easy-to-install ones a
>>> forced requirement anyway (so that most things work out of the box),
>>> but perhaps we'll hold off on making such a change until after 1.5.2.
> >
>> How are they forced?
>
> They're not. Right now they're optional. I'm suggesting we might
> change that in the future.
> If you're asking how we /would/ force them, probably by adding
> PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs
> successfully (or should!) without its optional dependencies given in
> PREREQ_PM because make test succeeds (because tests skip ok when the
> optional dependency isn't there).
>
> I don't really know how CPAN discovers dependencies and auto-installs
> them before a dependent module though. Anyone care to explain?

I thought so! I misunderstood something earlier which confused me. Just
to clarify for my own sanity's sake:

1) Currently all dependencies are optional.
2) All dependencies are in %packages
3) all these are passed to PREREQ_PM

As far as CPAN discovering dependencies, here is a snip from the CPAN FAQ's:
--snip--

    I installed a Bundle and had a couple of fails. When I retried,
    everything resolved nicely. Can this be fixed to work on first try?

    The reason for this is that CPAN does not know the dependencies of
    all modules when it starts out. To decide about the additional items
    to install, it just uses data found in the META.yml file or the
    generated Makefile. An undetected missing piece breaks the process.
    But it may well be that your Bundle installs some prerequisite later
    than some depending item and thus your second try is able to resolve
    everything. Please note, CPAN.pm does not know the dependency tree
    in advance and cannot sort the queue of things to install in a
    topologically correct order. It resolves perfectly well IF all
    modules declare the prerequisites correctly with the PREREQ_PM
    attribute to MakeMaker or the |requires| stanza of Module::Build.
    For bundles which fail and you need to install often, it is
    recommended to sort the Bundle definition file manually.

--snip--

Therefore, recent modifications to Makefile.PL should result in a fully
operational Bioperl installation, if installed via CPAN. However, only
Bioperl 1.4 is currently available via CPAN. It is possible to upload a
developer release to CPAN which can only be downloaded via CPAN if
specifically asked for - this would be good for 1.5.x:
--snip--

    How do I install a "DEVELOPER RELEASE" of a module?

    By default, CPAN will install the latest non-developer release of a
    module. If you want to install a dev release, you have to specify
    the partial path starting with the author id to the tarball you wish
    to install, like so:

        cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz

    Note that you can use the |ls| command to get this path listed.

--snip--

HTH
Nath

From bix at sendu.me.uk  Mon Oct 23 05:41:52 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 10:41:52 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C94C8.5040900@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
Message-ID: <453C8E60.7000105@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>
>> I don't really know how CPAN discovers dependencies and auto-installs
>> them before a dependent module though. Anyone care to explain?
> 
> I thought so! I misunderstood something earlier which confused me. Just
> to clarify for my own sanities sake:
> 
> 1) Currently all dependencies are optional.
> 2) All dependencies are in %packages
> 3) all these are passed to PREREQ_PM

All correct.


> As far as CPAN discovering dependencies, here is a snip from the CPAN FAQ's:
> --snip--
> 
>     I installed a Bundle and had a couple of fails. When I retried,
>     everything resolved nicely. Can this be fixed to work on first try?
> 
>     The reason for this is that CPAN does not know the dependencies of
>     all modules when it starts out. To decide about the additional items
>     to install, it just uses data found in the META.yml file or the
>     generated Makefile. An undetected missing piece breaks the process.
>     But it may well be that your Bundle installs some prerequisite later
>     than some depending item and thus your second try is able to resolve
>     everything. Please note, CPAN.pm does not know the dependency tree
>     in advance and cannot sort the queue of things to install in a
>     topologically correct order. It resolves perfectly well IF all
>     modules declare the prerequisites correctly with the PREREQ_PM
>     attribute to MakeMaker or the |requires| stanza of Module::Build.
>     For bundles which fail and you need to install often, it is
>     recommended to sort the Bundle definition file manually.
> 
> --snip--
>
> Therefore, recent modifications to Makefile.PL should result in a fully
> operational Bioperl installation, if installed via CPAN.

Right, thanks for that.


> However, only Bioperl 1.4 is currently available via CPAN. It is possible to upload a
> developer release to CPAN which can only be downloaded via CPAN if
> specifically asked for - this would be good for 1.5.x:
> --snip--
> 
>     How do I install a "DEVELOPER RELEASE" of a module?
> 
>     By default, CPAN will install the latest non-developer release of a
>     module. If you want to install a dev release, you have to specify
>     the partial path starting with the author id to the tarball you wish
>     to install, like so:
> 
>         cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz
> 
>     Note that you can use the |ls| command to get this path listed.
> 
> --snip--

That's the user point of view - how does the developer actually tell 
CPAN that something is a developer release so that normal users don't 
automatically install it?

From bix at sendu.me.uk  Mon Oct 23 05:59:52 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 10:59:52 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C8E60.7000105@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
	<453C8E60.7000105@sendu.me.uk>
Message-ID: <453C9298.9000900@sendu.me.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> As far as CPAN discovering dependencies, here is a snip from the CPAN 
>> FAQ's:
>> --snip--
>>
>>     I installed a Bundle and had a couple of fails. When I retried,
>>     everything resolved nicely. Can this be fixed to work on first try?
>>
>>     The reason for this is that CPAN does not know the dependencies of
>>     all modules when it starts out. To decide about the additional items
>>     to install, it just uses data found in the META.yml file or the
>>     generated Makefile. An undetected missing piece breaks the process.
>>     But it may well be that your Bundle installs some prerequisite later
>>     than some depending item and thus your second try is able to resolve
>>     everything. Please note, CPAN.pm does not know the dependency tree
>>     in advance and cannot sort the queue of things to install in a
>>     topologically correct order. It resolves perfectly well IF all
>>     modules declare the prerequisites correctly with the PREREQ_PM
>>     attribute to MakeMaker or the |requires| stanza of Module::Build.
>>     For bundles which fail and you need to install often, it is
>>     recommended to sort the Bundle definition file manually.
>>
>> --snip--
>>
>> Therefore, recent modifications to Makefile.PL should result in a fully
>> operational Bioperl installation, if installed via CPAN.
> 
> Right, thanks for that.

Oh, so this effectively means that our 'optional' dependencies are 
installed for CPAN users, which matches up to my 'force the optional 
ones anyway' desire, leaving Bundle::BioPerl without any use.

Makefile.PL could be altered again to remove from PREREQ_PM those 
modules the user didn't already have installed, thus CPAN would only 
install Bioperl itself and nothing optional. The user could then install 
Bundle::BioPerl if they wanted a quick way of getting all the optional 
stuff to work.
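
One way the filtering described above might be done is sketched here.
It is an assumption-laden illustration, not the actual Makefile.PL
code: the helper names is_installed() and installed_only() are
hypothetical, version requirements are ignored, and the check works by
actually loading each module.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded by this perl, else 0.
sub is_installed {
    my ($module) = @_;
    # Convert Foo::Bar to the Foo/Bar.pm form that require expects
    # when given a string rather than a bareword.
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Keep only those prerequisites that are already installed, so that
# CPAN would not be asked to fetch anything the user lacks.
sub installed_only {
    my (%packages) = @_;
    my %kept;
    for my $module ( keys %packages ) {
        $kept{$module} = $packages{$module} if is_installed($module);
    }
    return %kept;
}

my %packages = (
    'File::Spec'              => 0,    # core module, should be present
    'No::Such::Module::AtAll' => 0,    # made-up name, should be absent
);
my %prereq_pm = installed_only(%packages);
print "Would pass to PREREQ_PM: ", join( ', ', sort keys %prereq_pm ), "\n";
```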

I'm happy either way; what do other people think?

From n.haigh at sheffield.ac.uk  Mon Oct 23 07:22:17 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 11:22:17 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C9298.9000900@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
	<453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk>
Message-ID: <453CA5E9.1060406@sheffield.ac.uk>

Sendu Bala wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> As far as CPAN discovering dependencies, here is a snip from the
>>> CPAN FAQ's:
>>> --snip--
>>>
>>>     I installed a Bundle and had a couple of fails. When I retried,
>>>     everything resolved nicely. Can this be fixed to work on first try?
>>>
>>>     The reason for this is that CPAN does not know the dependencies of
>>>     all modules when it starts out. To decide about the additional items
>>>     to install, it just uses data found in the META.yml file or the
>>>     generated Makefile. An undetected missing piece breaks the process.
>>>     But it may well be that your Bundle installs some prerequisite later
>>>     than some depending item and thus your second try is able to resolve
>>>     everything. Please note, CPAN.pm does not know the dependency tree
>>>     in advance and cannot sort the queue of things to install in a
>>>     topologically correct order. It resolves perfectly well IF all
>>>     modules declare the prerequisites correctly with the PREREQ_PM
>>>     attribute to MakeMaker or the |requires| stanza of Module::Build.
>>>     For bundles which fail and you need to install often, it is
>>>     recommended to sort the Bundle definition file manually.
>>>
>>> --snip--
>>>
>>> Therefore, recent modifications to Makefile.PL should result in a fully
>>> operational Bioperl installation, if installed via CPAN.
>>
>> Right, thanks for that.
>
> Oh, so this effectively means that our 'optional' dependencies are
> installed for CPAN users, which matches up to my 'force the optional
> ones anyway' desire, leaving Bundle::BioPerl without any use.
>
> Makefile.PL could be altered again to remove from PREREQ_PM those
> modules the user didn't already have installed, thus CPAN would only
> install Bioperl itself and nothing optional. The user could then
> install Bundle::BioPerl if they wanted a quick way of getting all the
> optional stuff to work.
>
> I'm happy either way; what do other people think?

From my point of view, removing them from PREREQ_PM makes building
Bundle::BioPerl a bit of a pain :o(

I prefer the way it is currently set up - most people have fast internet
connections and GBs of hard drive space. Other than the argument "why
install something I won't ever need", I don't see much point in maintaining
Bundle::BioPerl and having "optional" dependencies. If there are
any modules which are not going to be used by the majority of users,
perhaps that could be the rationale for moving them out of
bioperl-core into another package?

Nath

From n.haigh at sheffield.ac.uk  Mon Oct 23 07:38:05 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 11:38:05 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C8E60.7000105@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
	<453C8E60.7000105@sendu.me.uk>
Message-ID: <453CA99D.9060009@sheffield.ac.uk>


>> Although only Bioperl 1.4 is available via CPAN currently, it is
>> possible to upload a developer release to CPAN which can only be
>> downloaded via CPAN if specifically asked for - would be good for 1.5.x:
>> --snip--
>>
>>     How do I install a "DEVELOPER RELEASE" of a module?
>>
>>     By default, CPAN will install the latest non-developer release of a
>>     module. If you want to install a dev release, you have to specify
>>     the partial path starting with the author id to the tarball you wish
>>     to install, like so:
>>
>>         cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz
>>
>>     Note that you can use the |ls| command to get this path listed.
>>
>> --snip--
>
> That's the user point of view - how does the developer actually tell
> CPAN that something is a developer release so that normal users don't
> automatically install it?

I found this:
http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt

It says that $VERSION should simply be changed from a naked number into
a single-quoted number and this should be recognized by the CPAN indexer.
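
For illustration, here is a minimal sketch of what I think that looks like in
a module. The package name is made up, and the underscore follows the
perlmodstyle advice of an '_' plus at least 2 digits after the regular
version number; the eval line is a commonly seen idiom rather than something
taken from the slides:

```perl
package My::Hypothetical::Module;   # made-up package name
use strict;
use warnings;

# Quoted, so the underscore survives when tools read this line as text.
our $VERSION = '1.05_02';
my $as_written = $VERSION;

# Numify for run-time comparisons; eval on the string drops the underscore.
$VERSION = eval $VERSION;

print "as written: $as_written, at run time: $VERSION\n";
```

As far as I understand it, the quoted form matters because the underscore has
to reach the distribution name for CPAN to treat it as a developer release.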

Nath

From bix at sendu.me.uk  Mon Oct 23 06:47:38 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 11:47:38 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>	<4535EBF9.1090706@sendu.me.uk>
	<4536113D.1080307@sheffield.ac.uk>
	<E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>
Message-ID: <453C9DCA.4020802@sendu.me.uk>

Hilmar Lapp wrote:
> On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote:
> 
>> For example, I have made no effort to setup biosql-schema but I
>> thought that maybe there would be a test that would detect this
> 
> I'm afraid there isn't. Bioperl-db is meaningless without
> biosql-schema.

Can you suggest a way we might detect if biosql-schema has been 
installed prior to running the test suite, so we can give some 
meaningful error message?

From bix at sendu.me.uk  Mon Oct 23 08:43:30 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 13:43:30 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
Message-ID: <453CB8F2.7070703@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>
>> Makefile.PL could be altered again to remove from PREREQ_PM those
>> modules the user didn't already have installed, thus CPAN would only
>> install Bioperl itself and nothing optional. The user could then
>> install Bundle::BioPerl if they wanted a quick way of getting all the
>> optional stuff to work.
>>
>> I'm happy either way; what do other people think?
 >
> From my point of view, removing them from PREREQ_PM means building the
> Bundle::BioPerl a bit of a pain :o(

Can I ask how you're generating Bundle::BioPerl? That is, how did the 
typos get in there? Is there a reliable way to avoid typos in the future?

From n.haigh at sheffield.ac.uk  Mon Oct 23 09:46:17 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 13:46:17 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CB8F2.7070703@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
	<453CB8F2.7070703@sendu.me.uk>
Message-ID: <453CC7A9.6090609@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>
>>> Makefile.PL could be altered again to remove from PREREQ_PM those
>>> modules the user didn't already have installed, thus CPAN would only
>>> install Bioperl itself and nothing optional. The user could then
>>> install Bundle::BioPerl if they wanted a quick way of getting all the
>>> optional stuff to work.
>>>
>>> I'm happy either way; what do other people think?
> >
>> From my point of view, removing them from PREREQ_PM means building the
>> Bundle::BioPerl a bit of a pain :o(
>
> Can I ask how you're generating Bundle::BioPerl? That is, how did the
> typos get in there? Is there a way to certainly avoid typos in the
> future?

I just modified the list by hand a while back :o( - I'm sure there must
be a better way.

From bix at sendu.me.uk  Mon Oct 23 08:58:13 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 13:58:13 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CC7A9.6090609@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
	<453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk>
Message-ID: <453CBC65.2020202@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> Sendu Bala wrote:
>>>
>>>> Makefile.PL could be altered again to remove from PREREQ_PM those
>>>> modules the user didn't already have installed, thus CPAN would only
>>>> install Bioperl itself and nothing optional. The user could then
>>>> install Bundle::BioPerl if they wanted a quick way of getting all the
>>>> optional stuff to work.
>>>>
>>>> I'm happy either way; what do other people think?
 >>>
>>> From my point of view, removing them from PREREQ_PM means building the
>>> Bundle::BioPerl a bit of a pain :o(
 >>
>> Can I ask how you're generating Bundle::BioPerl? That is, how did the
>> typos get in there? Is there a way to certainly avoid typos in the
>> future?
> 
> I just modified the list by hand a while back :o( - I'm sure there must
> be a better way.

I'm not sure I understand why removing things from PREREQ_PM would be a 
problem for you then; the %packages hash would remain unchanged (ie. 
have everything) so you have something to refer to when manually editing 
the Bundle.

http://www.cpan.org/misc/cpan-faq.html#How_make_bundle
might be helpful? I didn't really pay too much attention to the advice - 
does it offer a typo-avoiding solution?

From n.haigh at sheffield.ac.uk  Mon Oct 23 10:04:12 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 14:04:12 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CBC65.2020202@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
	<453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk>
	<453CBC65.2020202@sendu.me.uk>
Message-ID: <453CCBDC.6030904@sheffield.ac.uk>


> I'm not sure I understand why removing things from PREREQ_PM would be
> a problem for you then; the %packages hash would remain unchanged (ie.
> have everything) so you have something to refer to when manually
> editing the Bundle.
>
> http://www.cpan.org/misc/cpan-faq.html#How_make_bundle
> might be helpful? I didn't really pay too much attention to the advice
> - does it offer a typo-avoiding solution?

It's helpful in producing the Bundle PPD as all the XML tags are present
in the Bioperl PPD and they simply need to be copied over to a
Bundle-BioPerl PPD file.

Looks like manual editing of the relevant file is required for making a
CPAN bundle. Unfortunately - no typo-avoiding solution. :o(
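
One possible stop-gap would be a small check run before release. This is
only a sketch under assumptions: the reference list stands in for the keys
of %packages, the Bundle text is made up, and I'm relying on the usual
Bundle layout of one module per line under "=head1 CONTENTS":

```perl
use strict;
use warnings;

# Hypothetical reference list, standing in for the keys of %packages.
my %packages = map { $_ => 1 } qw(IO::String XML::Parser);

# A made-up Bundle body; the second entry is deliberately misspelled.
my $bundle_pod = join "\n",
    '=head1 CONTENTS', '',
    'IO::String', '',
    'XML::Paser 2.0 - misspelled on purpose', '',
    '=head1 DESCRIPTION', '';

# Pull out the CONTENTS section, then the first word of each non-blank line.
my ($contents) = $bundle_pod =~ /^=head1\s+CONTENTS\s*\n(.*?)(?=^=|\z)/ms;
my @listed  = map { /^(\S+)/ ? $1 : () } split /\n/, $contents;
my @unknown = grep { !$packages{$_} } @listed;

print "Not in the reference list: @unknown\n";
```

It would not stop typos being made, but any name in the Bundle that is
missing from the reference list would be reported before upload.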

From dhoworth at mrc-lmb.cam.ac.uk  Mon Oct 23 08:46:29 2006
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Mon, 23 Oct 2006 13:46:29 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CA99D.9060009@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453CA99D.9060009@sheffield.ac.uk>
Message-ID: <453CB9A5.2020409@mrc-lmb.cam.ac.uk>

>> That's the user point of view - how does the developer actually tell
>> CPAN that something is a developer release so that normal users don't
>> automatically install it?
> 
> I found this:
> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
> 
> It says that $VERSION should simply be changed from a naked number into
> a single quoted number and this should be recognized by the CPAN indexer.

<http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>

Cheers, Dave

From hlapp at gmx.net  Mon Oct 23 09:40:29 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 23 Oct 2006 09:40:29 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <453C9DCA.4020802@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>	<4535EBF9.1090706@sendu.me.uk>
	<4536113D.1080307@sheffield.ac.uk>
	<E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>
	<453C9DCA.4020802@sendu.me.uk>
Message-ID: <5C22B9C8-CEF0-457B-8565-793D56389A86@gmx.net>

You would need a lot of information to make that determination (host,  
port, db driver, db name, user, password; i.e., the entire connection  
information, and there is no 'standard').

You might just ask a simple question in Makefile.PL as to whether  
biosql is installed or not, similar to the DB::GFF tests.
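
Something along these lines, as a sketch only: prompt() comes from
ExtUtils::MakeMaker, while the question wording and the said_yes helper are
made up here, and what to do with the answer is left open:

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker qw(prompt);

# Fall back to the default answer when nobody is at the keyboard, so an
# unattended install is never left waiting for input.
$ENV{PERL_MM_USE_DEFAULT} = 1 unless -t STDIN;

# Turn a free-form answer into a yes/no flag.
sub said_yes { my ($answer) = @_; return $answer =~ /^\s*y/i ? 1 : 0 }

my $answer = prompt('Have you already set up a biosql-schema database? y/n', 'n');

# Here the flag is only reported; a real script would record it somewhere
# the test suite can read.
my $message = said_yes($answer)
    ? "Database tests would be run.\n"
    : "Database tests would be skipped.\n";
print $message;
```

Defaulting to 'n' means an automated CPAN install simply skips the database
tests instead of failing on a missing schema.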

	-hilmar

On Oct 23, 2006, at 6:47 AM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote:
>>
>>> For example, I have made no effort to setup biosql-schema but I
>>> thought that maybe there would be a test that would detect this
>>
>> I'm afraid there isn't. Bioperl-db is meaningless without
>> biosql-schema.
>
> Can you suggest a way we might detect if biosql-schema has been
> installed prior to running the test suite, so we can give some
> meaningful error message?
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Mon Oct 23 09:59:23 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 14:59:23 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CB9A5.2020409@mrc-lmb.cam.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>
	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>
Message-ID: <453CCABB.2060308@sendu.me.uk>

Dave Howorth wrote:
>>> That's the user point of view - how does the developer actually tell
>>> CPAN that something is a developer release so that normal users don't
>>> automatically install it?
>> I found this:
>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>
>> It says that $VERSION should simply be changed from a naked number into
>> a single quoted number and this should be recognized by the CPAN indexer.
> 
> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>

Thanks for that.

I guess from that the 1.5.2 version number should be:

$VERSION = '1.05_02';

And 1.6 would be

$VERSION = 1.06;

But will this cause a problem wrt 1.4? 1.4 has:

$VERSION = 1.4;

Is 1.4 lower than 1.06? Should we keep to a single digit version, so 
1.5_02 and 1.6? Does this really not work with CPAN? Should we call them 
version fifty and version sixty? 1.50_02, 1.60?
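
For what it's worth, here is a quick sketch of how plain numeric comparison
treats these candidates. It is core Perl only and says nothing about what
the CPAN indexer itself does; newer_of is a made-up helper:

```perl
use strict;
use warnings;

# Returns the larger of two decimal version numbers, compared as plain numbers.
sub newer_of { my ($x, $y) = @_; return $x > $y ? $x : $y }

# Compared numerically, 1.4 is 1.40, so it outranks 1.06.
print "newer of 1.4 and 1.06: ", newer_of(1.4, 1.06), "\n";

# Keeping two digits after the point preserves the intended order.
print "newer of 1.40 and 1.50: ", newer_of('1.40', '1.50'), "\n";

# Unquoted, the underscore is just a digit separator and is thrown away.
my $unquoted = 1.05_02;
print "unquoted 1.05_02 is stored as $unquoted\n";
```

So numerically 1.4 is higher than 1.06, which is why the two-digit forms
(1.50_02, 1.60) look like the safer continuation from 1.4.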

From cjfields at uiuc.edu  Mon Oct 23 10:12:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 09:12:16 -0500
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C9298.9000900@sendu.me.uk>
Message-ID: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>

...
> > Right, thanks for that.
> 
> Oh, so this effectively means that our 'optional' dependencies are
> installed for CPAN users, which matches up to my 'force the optional
> ones anyway' desire, leaving Bundle::BioPerl without any use.
> 
> Makefile.PL could be altered again to remove from PREREQ_PM those
> modules the user didn't already have installed, thus CPAN would only
> install Bioperl itself and nothing optional. The user could then install
> Bundle::BioPerl if they wanted a quick way of getting all the optional
> stuff to work.
> 
> I'm happy either way; what do other people think?

I think that we should have it so Bioperl installs as-is (no additional
reqs) and have Bundle::BioPerl used as a convenient way to install all
optional modules for full functionality.  The catch is to make sure that any
optional installations do not crash tests during a CPAN bioperl
installation, otherwise they aren't considered optional by CPAN, and the
install won't work without forcing it.

Frankly, most users will find themselves wanting to install the Bundle
anyway to get full functionality, so we could always 'strongly recommend'
preceding the bioperl installation with a Bundle::BioPerl CPAN installation
to avoid problems, at least for this release.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct 23 10:23:04 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 09:23:04 -0500
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk>
Message-ID: <002101c6f6ae$c14d7860$15327e82@pyrimidine>

...
> >> Right, thanks for that.
> >
> > Oh, so this effectively means that our 'optional' dependencies are
> > installed for CPAN users, which matches up to my 'force the optional
> > ones anyway' desire, leaving Bundle::BioPerl without any use.
> >
> > Makefile.PL could be altered again to remove from PREREQ_PM those
> > modules the user didn't already have installed, thus CPAN would only
> > install Bioperl itself and nothing optional. The user could then
> > install Bundle::BioPerl if they wanted a quick way of getting all the
> > optional stuff to work.
> >
> > I'm happy either way; what do other people think?
> > From my point of view, removing them from PREREQ_PM means building the
> > Bundle::BioPerl a bit of a pain :o(
> 
> I prefer the way it is currently set up - most people have fast internet
> connections and GB of harddrive space. Other than the reason "why
> install something I won't ever need" I don't see much point maintaining
> Bundle::BioPerl and having "optional" dependencies. I think if there are
> any modules which are not going to be used by the majority of users,
> then this could be used as the rationale for removing them from
> bioperl-core into another package?
> 
> Nath

I think you'll likely find it much easier to maintain a Bundle package
long-term and indicate that it should be installed along with bioperl, than
to have users complain about a particular Bioperl module failing b/c a
particular dependency wasn't installed.  

If we have the Bundle around in CPAN and in PPM for Win32 users, and
indicate in the INSTALL docs and the wiki our preference that it be
installed prior to or along with a Bioperl installation for beginners, we
can mitigate most of those problems.  Nip it in the bud, to quote a Mr.
Barney Fife.

My 2c

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 23 10:29:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 09:29:33 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CCABB.2060308@sendu.me.uk>
Message-ID: <002201c6f6af$a91e4200$15327e82@pyrimidine>

> Dave Howorth wrote:
> >>> That's the user point of view - how does the developer actually tell
> >>> CPAN that something is a developer release so that normal users don't
> >>> automatically install it?
> >> I found this:
> >> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
> >>
> >> It says that $VERSION should simply be changed from a naked number into
> >> a single quoted number and this should be recognized by the CPAN indexer.
> >
> > <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
> 
> Thanks for that.
> 
> I guess from that the 1.5.2 version number should be:
> 
> $VERSION = 1.05_02
> 
> And 1.6 would be
> 
> $VERSION = 1.06
> 
> But will this cause a problem wrt 1.4? 1.4 has:
> 
> $VERSION = 1.4;
> 
> Is 1.4 lower than 1.06? Should we keep to a single digit version, so
> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them
> version fifty and version sixty? 1.50_02, 1.60?

Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax?  It would be
much simpler to use that. 

Simon Cozens wrote about this a while back:

http://www.perl.com/pub/a/2000/04/whatsnew.html

...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From bix at sendu.me.uk  Mon Oct 23 10:41:24 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 15:41:24 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <002201c6f6af$a91e4200$15327e82@pyrimidine>
References: <002201c6f6af$a91e4200$15327e82@pyrimidine>
Message-ID: <453CD494.8070905@sendu.me.uk>

Chris Fields wrote:
>> Dave Howorth wrote:
>>>>> That's the user point of view - how does the developer actually tell
>>>>> CPAN that something is a developer release so that normal users don't
>>>>> automatically install it?
>>>> I found this:
>>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>>>
>>>> It says that $VERSION should simply be changed from a naked number into
>>>> a single quoted number and this should be recognized by the CPAN indexer.
>>> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
>>
>> Thanks for that.
>>
>> I guess from that the 1.5.2 version number should be:
>>
>> $VERSION = 1.05_02
>>
>> And 1.6 would be
>>
>> $VERSION = 1.06
>>
>> But will this cause a problem wrt 1.4? 1.4 has:
>>
>> $VERSION = 1.4;
>>
>> Is 1.4 lower than 1.06? Should we keep to a single digit version, so
>> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them
>> version fifty and version sixty? 1.50_02, 1.60?
> 
> Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax?  It would be
> much simpler to use that. 

That does not present us with a way to have 1.5.2 marked as a developer 
release in CPAN.

Also, see the discussion here: 
http://perldoc.perl.org/functions/require.html

Since we require 5.6.1 the backwards-compatibility issues maybe don't apply 
to us, but do these ideas work with modules, or just Perl itself? Is 
CPAN et al. happy with this form of versioning?

/Something/ needs to be done about Bioperl versioning, because the 
current 1.4 or 1.5 is completely inadequate.

From bix at sendu.me.uk  Mon Oct 23 10:51:25 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 15:51:25 +0100
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>
References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>
Message-ID: <453CD6ED.5050507@sendu.me.uk>

Chris Fields wrote:

[option 1]
>> Oh, so this effectively means that our 'optional' dependencies are 
>> installed for CPAN users, which matches up to my 'force the
>> optional ones anyway' desire, leaving Bundle::BioPerl without any
>> use.

[option 2]
>> Makefile.PL could be altered again to remove from PREREQ_PM those 
>> modules the user didn't already have installed, thus CPAN would
>> only install Bioperl itself and nothing optional. The user could
>> then install Bundle::BioPerl if they wanted a quick way of getting
>> all the optional stuff to work.
>> 
>> I'm happy either way; what do other people think?
> 
> I think that we should have it so Bioperl installs as-is (no
> additional reqs) and have Bundle::BioPerl used as a convenient way to
> install all optional modules for full functionality.

Note we're specifically considering a CPAN install here. If you download
the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::BioPerl is
still needed as a convenience if you want to install the optional
external dependencies.


> The catch is to make sure that any optional installations do not
> crash tests during a CPAN bioperl installation, otherwise they aren't
> considered optional by CPAN, and the install won't work without
> forcing it.

I'm pretty sure this isn't a problem, though it would be nice if someone 
could test it on a clean system: does 'make test' pass all ok with none 
of the optional modules installed?


Anyway, to reiterate the question: Do we care if CPAN users get all the 
optional external dependencies installed for them automatically, or do 
we want to force them to install Bundle?

The current situation is: CPAN users will get all optional external 
dependencies without using Bundle::BioPerl. Manual installers of bioperl 
(from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to 
get full functionality.

From n.haigh at sheffield.ac.uk  Mon Oct 23 12:30:34 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 16:30:34 +0000
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CCABB.2060308@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>
	<453CCABB.2060308@sendu.me.uk>
Message-ID: <453CEE2A.8000002@sheffield.ac.uk>

Sendu Bala wrote:
> Dave Howorth wrote:
>   
>>>> That's the user point of view - how does the developer actually tell
>>>> CPAN that something is a developer release so that normal users don't
>>>> automatically install it?
>>>>         
>>> I found this:
>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>>
>>> It says that $VERSION should simply be changed from a naked number into
>>> a single quoted number and this should be recognized by the CPAN indexer.
>>>       
>> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
>>     
>
> Thanks for that.
>
> I guess from that the 1.5.2 version number should be:
>
> $VERSION = 1.05_02
>
> And 1.6 would be
>
> $VERSION = 1.06
>
> But will this cause a problem wrt 1.4? 1.4 has:
>
> $VERSION = 1.4;
>
> Is 1.4 lower than 1.06? Should we keep to a single digit version, so 
> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them 
> version fifty and version sixty? 1.50_02, 1.60?
>   
I believe the link to the documentation above describes a common CPAN
versioning scheme as follows:

1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32

Therefore version 1.5 of Bioperl would be 1.50, and version 1.5.2 would
be better as 1.52. Then, to indicate that the 1.5 series is a developer
release, you append an underscore and at least 2 digits. That results
in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be
1.52_01. The only thing I'm unsure about is when the _01 gets
incremented. I suspect we would probably not increment this number, since
each release would be an increment of the minor release number, e.g.
1.52_01, 1.53_01, 1.54_01 etc.

Although I'm still not sure how this versioning would affect bioperl 1.4
since 1.4 uses a non-standard versioning scheme :o(

As I understand it, the versioning of the Perl releases uses the x.y.z
scheme. But apparently CPAN modules should use the above versioning scheme.
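
To see why mixing the two schemes worries me, here is a small sketch. It
assumes the version module from CPAN is installed, which is not something
Bioperl currently requires:

```perl
use strict;
use warnings;
use version;   # assumption: the version module is available

my $old    = version->new('1.4');     # one dot: read as a decimal version
my $dotted = version->new('1.5.2');   # two dots: read as dotted x.y.z

# Decimal 1.4 is treated as 1.400, which is larger than the 5 in 1.5.2.
print "1.4 sorts ", ($old > $dotted ? "after" : "before"), " 1.5.2\n";
print "1.5.2 as a decimal: ", $dotted->numify, "\n";
```

If that is right, a release labelled 1.5.2 in x.y.z form would compare as
older than the existing 1.4, which is another argument for staying with the
decimal scheme.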

Nath

From cjfields at uiuc.edu  Mon Oct 23 11:36:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 10:36:37 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CD6ED.5050507@sendu.me.uk>
Message-ID: <000c01c6f6b9$0781af40$15327e82@pyrimidine>

...
> 
> Note we're specifically considering a CPAN install here. If you download
> the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is
> still needed as a convenience if you want to install the optional
> external dependencies.
> 

Agreed.  I don't think the Bundle is dispensable.  For instance, it's very
easy for us to just tell beginners to install Bundle::BioPerl before
installing bioperl itself, as opposed to having them inundate the mailing list
with questions about why some x.pl script didn't work, when the cause may
simply be a missing required module.

> I'm pretty sure this isn't a problem, though it would be nice if someone
> could test it on a clean system: does 'make test' pass all ok with none
> of the optional modules installed?

So far on WinXP everything passes; I ran a clean perl installation a while
ago using nmake and tests passed.

> Anyway, to reiterate the question: Do we care if CPAN users get all the
> optional external dependencies installed for them automatically, or do
> we want to force them to install Bundle?
> 
> The current situation is: CPAN users will get all optional external
> dependencies without using Bundle::BioPerl. Manual installers of bioperl
> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
> get full functionality.

I don't think forcing is necessary, so a CPAN installation shouldn't force
someone to install optional modules.  Graph.pm, for instance has a few
optional modules, and the tests which use those get skipped and pass so the
installation proceeds w/o problems.  We could do the same (any tests using
those optional modules display the reason why they are skipped).  
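
A minimal sketch of that skip-and-explain pattern, assuming the test script
uses Test::More (the module name below is made up, so the skip branch is the
one that runs):

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Made-up module name, standing in for any optional dependency.
my $optional  = 'Bio::Hypothetical::Optional';
my $available = eval "require $optional; 1";

ok(1, 'test that needs nothing optional');

SKIP: {
    # Skipped tests count as passes, and the reason shows up in the output.
    skip "$optional is not installed", 1 unless $available;
    ok($optional->can('new'), "$optional is usable");
}
```

The plan still adds up to 2 either way, so the harness reports success and a
CPAN install carries on without needing to be forced.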

I would strongly state in the INSTALL and INSTALL.WIN docs that (new) users
should install Bundle::BioPerl before installing Bioperl core for full
functionality.  If you are an advanced user and know your way around
CPAN/Perl, then you can install the individual modules you need,
depending on your particular requirements.

Chris


From n.haigh at sheffield.ac.uk  Mon Oct 23 12:38:00 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 16:38:00 +0000
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CD6ED.5050507@sendu.me.uk>
References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>
	<453CD6ED.5050507@sendu.me.uk>
Message-ID: <453CEFE8.4000704@sheffield.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>
> [option 1]
>   
>>> Oh, so this effectively means that our 'optional' dependencies are 
>>> installed for CPAN users, which matches up to my 'force the
>>> optional ones anyway' desire, leaving Bundle::BioPerl without any
>>> use.
>>>       
>
> [option 2]
>   
>>> Makefile.PL could be altered again to remove from PREREQ_PM those 
>>> modules the user didn't already have installed, thus CPAN would
>>> only install Bioperl itself and nothing optional. The user could
>>> then install Bundle::BioPerl if they wanted a quick way of getting
>>> all the optional stuff to work.
>>>
>>> I'm happy either way; what do other people think?
>>>       
>> I think that we should have it so Bioperl installs as-is (no
>> additional reqs) and have Bundle::BioPerl used as a convenient way to
>> install all optional modules for full functionality.
>>     
>
> Note we're specifically considering a CPAN install here. If you download
> the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is
> still needed as a convenience if you want to install the optional
> external dependencies.
>
>
>   
>> The catch is to make sure that any optional installations do not
>> crash tests during a CPAN bioperl installation, otherwise they aren't
>> considered optional by CPAN, and the install won't work without
>> forcing it.
>>     
>
> I'm pretty sure this isn't a problem, though it would be nice if someone 
> could test it on a clean system: does 'make test' pass all ok with none 
> of the optional modules installed?
>
>   

I could definitely do this on WinXP and *possibly* on a Linux system.

> Anyway, to reiterate the question: Do we care if CPAN users get all the 
> optional external dependencies installed for them automatically, or do 
> we want to force them to install Bundle?
>
>   

I'd prefer that any dependencies, whether or not they are seen as vital to
the main functionality of Bioperl, are actually specified in PREREQ_PM (as
they currently are). A dependency is a dependency - is it not? If a
distinction is to be made based on whether the requiring module is
simply adding additional functionality to Bioperl-core, then shouldn't
it be moved out of core and into another package, as with the run modules,
if we are to have "optional" dependencies?

my 2p
Nath

> The current situation is: CPAN users will get all optional external 
> dependencies without using Bundle::BioPerl. Manual installers of bioperl 
> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to 
> get full functionality.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>   


From cjfields at uiuc.edu  Mon Oct 23 11:39:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 10:39:09 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CD494.8070905@sendu.me.uk>
Message-ID: <000d01c6f6b9$62033d80$15327e82@pyrimidine>

...
> That does not present us with a way to have 1.5.2 marked as a developer
> release in CPAN.
> 
> Also, see the discussion here:
> http://perldoc.perl.org/functions/require.html
> 
> Since we require 5.6.1 the backwards-compatible issues maybe don't apply
> to us, but do these ideas work with modules, or just Perl itself? Is
> CPAN et al. happy with this form of versioning?
> 
> /Something/ needs to be done about Bioperl versioning, because the
> current 1.4 or 1.5 is completely inadequate.

I think using 'require Foo x.y.z' is applicable to modules as well.  There
is something in Programming Perl about this, just don't have it on hand...
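
A minimal sketch of a module version check, under the assumption that the
relevant mechanism is the documented `use Module VERSION` form, which calls
`Module->VERSION(VERSION)` behind the scenes.  `My::Demo` is a hypothetical
module defined inline so the example is self-contained:

```perl
use strict;
use warnings;

# Hypothetical module, defined inline for illustration only.
package My::Demo;
our $VERSION = '1.52';

package main;

# Returns true if the module's $VERSION is at least $wanted.
# UNIVERSAL::VERSION dies when the installed version is too low.
sub meets_version {
    my ($module, $wanted) = @_;
    return eval { $module->VERSION($wanted); 1 } ? 1 : 0;
}

print "1.4 requirement satisfied\n"      if meets_version('My::Demo', 1.4);
print "1.60 requirement NOT satisfied\n" unless meets_version('My::Demo', 1.60);
```

In a script, the equivalent compile-time check would be written as
`use My::Demo 1.4;`.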

Not sure about CPAN, so we need to look into it.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Mon Oct 23 11:42:15 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 16:42:15 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CEE2A.8000002@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>
	<453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk>
Message-ID: <453CE2D7.5080608@sendu.me.uk>

Nathan S. Haigh wrote:
> I believe the link to the documentation above describes a common CPAN
> versioning scheme as follows:
> 
> 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32
> 
> Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would
> be better as 1.52. Then to indicate that the 1.5 series is a developer
> release, you append the underscore and at least 2 digits. Thus resulting
> in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be
> 1.52_01. The only thing i'm unsure about would be when does the _01 get
> incremented? I suspect we would probably not increment this number since
> each release would be an increment of the minor release number e.g.
> 1.52_01, 1.53_01, 1.54_01 etc.
> 
> Although I'm still not sure how this versioning would affect bioperl 1.4
> since 1.4 uses a non-standard versioning scheme :o(

Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be 
treated higher than 1.4? Anyway, we can cross that bridge when we get 
there, but this seems appropriate now.


Cheers,
Sendu.

From bix at sendu.me.uk  Mon Oct 23 11:59:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 16:59:01 +0100
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
References: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
Message-ID: <453CE6C5.6000108@sendu.me.uk>

Chris Fields wrote:
> ...
>> The current situation is: CPAN users will get all optional external
>> dependencies without using Bundle::BioPerl. Manual installers of bioperl
>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
>> get full functionality.
> 
> I don't think forcing is necessary, so a CPAN installation shouldn't force
> someone to install optional modules.  Graph.pm, for instance has a few
> optional modules, and the tests which use those get skipped and pass so the
> installation proceeds w/o problems.  We could do the same (any tests using
> those optional modules display the reason why they are skipped).  

I should clarify and say that that's what happens in Bioperl as well. 
The 'forcing' that I talk about is simply what I assume will happen if 
the user has CPAN set to automatically install dependencies. The user 
could say 'no' to every question regarding the installation of 
dependencies that CPAN discovers and Bioperl would still install fine.

So really the difference between the current situation and, say, the 
situation when 1.5.1 was released, is that the CPAN user doesn't have to 
use Bundle::BioPerl for full functionality anymore, but can still choose 
not to install all the optional external modules.

The difference is the possible default behaviour. Those users that 
auto-install dependencies get all the optional ones, whereas in the past 
they would not have. I have to point out the benefit of this behaviour: 
those people that don't care and just want it to work are more likely to 
get an installation that does just work. People who know what they're 
doing can still do what they want.


Before we decide what to do I guess we need hard confirmation of how 
CPAN will actually behave with the current Makefile.PL. Any ideas how we 
can find out?

It would also be good to have more opinions to break the current tie 
(Nathan is for keeping PREREQ_PM populated, Chris is for having it 
empty, I can go either way)...
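
For reference, a rough sketch of what [option 2] could look like: keep only
the optional modules that are already installed before handing the list to
PREREQ_PM.  This is not the actual Makefile.PL; the module list and the
helper name `installed_only` are assumptions made for illustration:

```perl
use strict;
use warnings;

# Hypothetical list of optional modules and minimum versions (0 = any).
my %optional = (
    'File::Spec'                 => 0,   # core module, normally present
    'No::Such::Optional::Module' => 0,   # placeholder that should be absent
);

# Return only those entries whose module can already be loaded.
sub installed_only {
    my (%wanted) = @_;
    my %found;
    for my $module (keys %wanted) {
        $found{$module} = $wanted{$module}
            if eval "require $module; 1";
    }
    return %found;
}

my %prereq_pm = installed_only(%optional);
print "PREREQ_PM would contain: @{[ sort keys %prereq_pm ]}\n";
```

The resulting hash would then be passed as `PREREQ_PM => \%prereq_pm` to
WriteMakefile(), so CPAN is never asked to fetch a module the user does
not already have.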

From dhoworth at mrc-lmb.cam.ac.uk  Mon Oct 23 11:55:42 2006
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Mon, 23 Oct 2006 16:55:42 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CD494.8070905@sendu.me.uk>
References: <002201c6f6af$a91e4200$15327e82@pyrimidine>
	<453CD494.8070905@sendu.me.uk>
Message-ID: <453CE5FE.9070001@mrc-lmb.cam.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>>> Dave Howorth wrote:
>>>>>> That's the user point of view - how does the developer actually tell
>>>>>> CPAN that something is a developer release so that normal users don't
>>>>>> automatically install it?
>>>>> I found this:
>>>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>>>>
>>>>> It says that $VERSION should simply be changed from a naked number into
>>>>> a single quoted number and this should be recognized by the CPAN
>>>>> indexer.
>>>> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
>>>
>>> Thanks for that.
>>>
>>> I guess from that the 1.5.2 version number should be:
>>>
>>> $VERSION = 1.05_02

I believe so - the underscore is key. Look at your favourite CPAN
modules and see what they do.

>>> And 1.6 would be
>>>
>>> $VERSION = 1.06
>>>
>>> But will this cause a problem wrt 1.4? 1.4 has:

I think it will cause a problem, yes: 1.4 > 1.06. As a workaround, you
could remove 1.4 from CPAN and require everybody who installs from CPAN
to uninstall it before installing 1.06.
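
A quick sketch of the comparison, assuming versions are compared as plain
decimal numbers (how the CPAN indexer actually compares them is not
verified here):

```perl
use strict;
use warnings;

# Compare two version numbers the way a plain numeric comparison would.
sub newer {
    my ($candidate, $installed) = @_;
    return $candidate > $installed ? 1 : 0;
}

# Candidate numbers discussed in this thread, compared against 1.4.
for my $candidate (qw(1.06 1.5 1.52 1.60)) {
    printf "%-5s %s 1.4\n", $candidate,
        newer($candidate, '1.4') ? 'is newer than' : 'is NOT newer than';
}
```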

>>> $VERSION = 1.4;
>>>
>>> Is 1.4 lower than 1.06? Should we keep to a single digit version, so
>>> 1.5_02 and 1.6? Does this really not work with CPAN?

I think that would work, but see my comment at the end.

>>> Should we call them
>>> version fifty and version sixty? 1.50_02, 1.60?

Then you can count 1.50_02, 1.50_03, 1.52, 1.53_01 ... if you wish.

>> Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax?  It would be
>> much simpler to use that. 
> 
> That does not present us with a way to have 1.5.2 marked as a developer 
> release in CPAN.
> 
> Also, see the discussion here: 
> http://perldoc.perl.org/functions/require.html
> 
> Since we require 5.6.1 the backwards-compatible issues maybe don't apply 
> to us, but do these ideas work with modules, or just Perl itself? Is 
> CPAN et al. happy with this form of versioning?

I'm not an expert :( It's my understanding that there is an awful lot of
flexibility in Perl module version numbering (as you might expect :)
However, I believe there are some gotchas. So I would recommend (a)
finding an expert and (b) trying an experiment!

> /Something/ needs to be done about Bioperl versioning, because the 
> current 1.4 or 1.5 is completely inadequate.


From n.haigh at sheffield.ac.uk  Mon Oct 23 13:37:13 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 17:37:13 +0000
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CE6C5.6000108@sendu.me.uk>
References: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
	<453CE6C5.6000108@sendu.me.uk>
Message-ID: <453CFDC9.8030107@sheffield.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>   
>> ...
>>     
>>> The current situation is: CPAN users will get all optional external
>>> dependencies without using Bundle::BioPerl. Manual installers of bioperl
>>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
>>> get full functionality.
>>>       
>> I don't think forcing is necessary, so a CPAN installation shouldn't force
>> someone to install optional modules.  Graph.pm, for instance has a few
>> optional modules, and the tests which use those get skipped and pass so the
>> installation proceeds w/o problems.  We could do the same (any tests using
>> those optional modules display the reason why they are skipped).  
>>     
>
> I should clarify and say that that's what happens in Bioperl as well. 
> The 'forcing' that I talk about is simply what I assume will happen if 
> the user has CPAN set to automatically install dependencies. The user 
> could say 'no' to every question regarding the installation of 
> dependencies that CPAN discovers and Bioperl would still install fine.
>
> So really the difference between the current situation and, say, the 
> situation when 1.5.1 was released, is that the CPAN user doesn't have to 
> use Bundle::BioPerl for full functionality anymore, but can still chose 
> not to install all the optional external modules.
>
>   
--snip--

Obviously, we could maintain a Bundle::BioPerl which includes all
dependencies required for a fully functional Bioperl. I think the whole
idea for a Bundle is to provide a common environment for a particular
package. If for example, someone chooses not to install the dependencies
through CPAN (in the current setup), they can easily go back and install
Bundle::BioPerl and it would retrieve any missing dependencies for a
fully functional Bioperl-core.

Nath


From n.haigh at sheffield.ac.uk  Mon Oct 23 14:06:16 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 18:06:16 +0000
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CE2D7.5080608@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>
	<453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk>
Message-ID: <453D0498.8050206@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>   
>> I believe the link to the documentation above describes a common CPAN
>> versioning scheme as follows:
>>
>> 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32
>>
>> Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would
>> be better as 1.52. Then to indicate that the 1.5 series is a developer
>> release, you append the underscore and at least 2 digits. Thus resulting
>> in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be
>> 1.52_01. The only thing i'm unsure about would be when does the _01 get
>> incremented? I suspect we would probably not increment this number since
>> each release would be an increment of the minor release number e.g.
>> 1.52_01, 1.53_01, 1.54_01 etc.
>>
>> Although I'm still not sure how this versioning would affect bioperl 1.4
>> since 1.4 uses a non-standard versioning scheme :o(
>>     
>
> Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be 
> treated higher than 1.4? Anyway, we can cross that bridge when we get 
> there, but this seems appropriate now.
>
>
> Cheers,
> Sendu.
Just tried the suggested command:

  perl -MExtUtils::MakeMaker -le 'print MM->parse_version(shift)' bioperl-1-5-2/Bio/Root/Version.pm

to see how it parses the various version schemes. Here are the results
($VERSION as written -> parsed value):
1.5       -> 1.5
1.4       -> 1.4
1.60      -> 1.60
1.05_01   -> 1.0501
1.5_01    -> 1.501
1.50_01   -> 1.5001
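
A small sketch of why the quoting of $VERSION matters, assuming the usual
convention that an underscore marks a developer release.  Perl ignores
underscores inside numeric literals, so the marker only survives in a
quoted string:

```perl
use strict;
use warnings;

# A naked numeric literal: Perl ignores the underscore when parsing it.
my $naked = 1.50_01;

# A quoted version keeps the underscore, so anything that reads the
# string can still see the developer-release marker.
my $quoted = '1.50_01';

# Common idiom: evaluate the string so code sees a plain number.
my $numeric = eval $quoted;

print "naked:   $naked\n";
print "quoted:  $quoted\n";
print "numeric: $numeric\n";
```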

Nath

From cjfields at uiuc.edu  Mon Oct 23 13:15:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:15:44 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CE6C5.6000108@sendu.me.uk>
Message-ID: <002701c6f6c6$e2622c40$15327e82@pyrimidine>

...
> I should clarify and say that that's what happens in Bioperl as well.
> The 'forcing' that I talk about is simply what I assume will happen if
> the user has CPAN set to automatically install dependencies. The user
> could say 'no' to every question regarding the installation of
> dependencies that CPAN discovers and Bioperl would still install fine.
> 
> So really the difference between the current situation and, say, the
> situation when 1.5.1 was released, is that the CPAN user doesn't have to
> use Bundle::BioPerl for full functionality anymore, but can still chose
> not to install all the optional external modules.
> 
> The difference is the possible default behaviour. Those users that
> auto-install dependencies get all the optional ones, whereas in the past
> they would not have. I have to point out the benefit of this behaviour:
> those people that don't care and just want it to work are more likely to
> get an installation that does just work. People who know what they're
> doing can still do what they want.

OK with me.  Any way we go about it, we have to assume that anyone who set
CPAN to automatically install dependencies would want this behavior.

> Before we decide what to do I guess we need hard confirmation of how
> CPAN will actually behave with the current Makefile.PL. Any ideas how we
> can find out?
> 
> It would also be good to have more options to break the current tie
> (Nathan is for keeping PREREQ_PM populated, Chris is for having it
> empty, I can go either way)...

Frankly I'm for whatever is easiest for the end-user.  I think we should
continue maintaining Bundle::BioPerl b/c of its convenience (easier for us
to say 'install Bundle::BioPerl' as opposed to 'install modules a b c d e f
g...').  I should note that Chris D. maintains Bundle::BioPerl via CPAN
and can easily add/remove modules as needed, so all that would be necessary
prior to a release is to make sure the various modules present in the Bundle
are up-to-date.

The only difficulty would be updating the bundle PPM version for Win32; I agree
with Nathan that it would be nice if it were easier to maintain.  The PPD
file generated using 'nmake ppd' needs modifications, likely b/c it is
still generated as PPM3-compatible rather than PPM4-compatible.

I also think the idea of having the developer releases available via CPAN is
a good one, as long as they are marked as such (which you are taking care of
with versioning changes).  It makes them a little more official, even if
they are interim developer releases.

Chris


From cjfields at uiuc.edu  Mon Oct 23 13:19:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:19:08 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CFDC9.8030107@sheffield.ac.uk>
Message-ID: <002801c6f6c7$5a58ed60$15327e82@pyrimidine>

...
> > So really the difference between the current situation and, say, the
> > situation when 1.5.1 was released, is that the CPAN user doesn't have to
> > use Bundle::BioPerl for full functionality anymore, but can still chose
> > not to install all the optional external modules.
> >
> >
> --snip--
> 
> Obviously, we could maintain a Bundle::BioPerl which includes all
> dependencies required for a fully functional Bioperl. I think the whole
> idea for a Bundle is to provide a common environment for a particular
> package. If for example, someone chooses not to install the dependencies
> through CPAN (in the current setup), that can easily go back and install
> Bundle::BioPerl and it would retrieve any missing dependencies for a
> fully functional Bioperl-core.
> 
> Nath

Succinctly put; I would've spent five paragraphs describing that!  Too much
coffee (from lab meetings...)

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 23 13:26:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:26:57 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880610230936t3024173s5d6f21526dac87a4@mail.gmail.com>
Message-ID: <002c01c6f6c8$7163dd20$15327e82@pyrimidine>

Seth, 

Did you try this with a clean, taxonomy-installed database?  There may be
some junk left over from the previous test runs.

I'm looking into it this week; it may not make the developer release but
we'll try to get it in.  BTW, the 03simpleseq.t test failures have to do
with a call to gzip.  I'll look into a workaround for that.  

Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but
introduces others.  One alternative which I found works is cygwin, but
there's a catch: DBD-mysql is hard to install.  If it isn't one thing it's
another...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 


From johnson.biotech at gmail.com  Mon Oct 23 12:36:36 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 23 Oct 2006 12:36:36 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <000001c6f486$df508930$15327e82@pyrimidine>
References: <000001c6f486$df508930$15327e82@pyrimidine>
Message-ID: <b99962880610230936t3024173s5d6f21526dac87a4@mail.gmail.com>

Chris,

There's definite improvement:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed Test     Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/02species.t                 65    2   3.08%  63 65
t/03simpleseq.t    1   256    59  106 179.66%  7-59
t/04swiss.t                   52   14  26.92%  25 27-34 38-42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There's some weirdness going on during the 'swiss.t' test.  It almost seems
to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 &
41, 31 & 27, 32 & 28, 33 & 29, 39 & 31):
================================
not ok 25
# Test 25 got: '10097078' (t/04swiss.t at line 79)
#    Expected: '91309150'
ok 26
not ok 27
# Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t at line 85)
#    Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
not ok 28
# Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein' (t/04swiss.t at line 86)
#    Expected: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators'
not ok 29
# Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)' (t/04swiss.t at line 87)
#    Expected: 'Cell 66 (2), 383-394 (1991)'
not ok 30
# Test 30 got: <UNDEF> (t/04swiss.t at line 88)
#    Expected: '91309150'
not ok 31
# Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t at line 85 fail #2)
#    Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.'
not ok 32
# Test 32 got: 'Functional expression of cloned human splicing factor SF2: homology to RNA-binding proteins, U1 70K, and Drosophila splicing regulators' (t/04swiss.t at line 86 fail #2)
#    Expected: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
not ok 33
# Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail #2)
#    Expected: 'Gene 134 (2), 283-287 (1993)'
not ok 34
# Test 34 got: <UNDEF> (t/04swiss.t at line 88 fail #2)
#    Expected: '94085792'
ok 35
ok 36
ok 37
not ok 38
# Test 38 got: <UNDEF> (t/04swiss.t at line 88 fail #3)
#    Expected: '94253723'
not ok 39
# Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4)
#    Expected: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.'
not ok 40
# Test 40 got: 'Cloning and expression of a cDNA covering the complete coding region of the P32 subunit of human pre-mRNA splicing factor SF2' (t/04swiss.t at line 86 fail #4)
#    Expected: 'Crystal structure of human p32, a doughnut-shaped acidic mitochondrial matrix protein'
not ok 41
# Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail #4)
#    Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
not ok 42
# Test 42 got: <UNDEF> (t/04swiss.t at line 88 fail #4)
#    Expected: '99199225'
==============================
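
If the references really are just coming back in a different order (an
assumption; the output above is also consistent with stale data in the
database), an order-independent comparison would tell the two cases
apart.  A minimal sketch, with a hypothetical helper `same_set` and the
reference locations copied from the output above:

```perl
use strict;
use warnings;

# Compare two lists of strings while ignoring the order of the elements.
sub same_set {
    my ($got, $expected) = @_;
    return 0 unless @$got == @$expected;
    my @g = sort @$got;
    my @e = sort @$expected;
    for my $i (0 .. $#g) {
        return 0 unless $g[$i] eq $e[$i];
    }
    return 1;
}

# Expected reference locations, in the order the test script expects them.
my @expected = (
    'Cell 66 (2), 383-394 (1991)',
    'Gene 134 (2), 283-287 (1993)',
    'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)',
);

# The same values in the order they were actually returned.
my @got = @expected[2, 0, 1];

print same_set(\@got, \@expected) ? "same references\n" : "different\n";
```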


On 10/20/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
>
>
> Seth,
>
> Did you work out the problem here?  There was a recent CVS update to OBDA
> tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
> apparently left data from tests in the database, which caused problems
> with repeated test runs.
>
> Chris
>
> > > -----Original Message-----
> > > From: bioperl-l-bounces <at> lists.open-bio.org [mailto: bioperl-l-
> > > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > > Sent: Saturday, September 30, 2006 6:35 PM
> > > To: Hilmar Lapp
> > > Cc: Chris Fields; Bioperl List
> > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> > >
> > > Here're complete test details:
> > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > ...
> >
> > > FAILED tests 10-12
> > >     Failed 3/12 tests, 75.00% okay
> > > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> > > -------------------------------------------------------------------------------
> > > t\02species.t                 65    2   3.08%  63 65
> > > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > > t\16obda.t                    12    3  25.00%  10-12

From n.haigh at sheffield.ac.uk  Mon Oct 23 16:08:00 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 20:08:00 +0000
Subject: [Bioperl-l] CPAN testing Service
Message-ID: <453D2120.9010301@sheffield.ac.uk>

We should also check the CPAN testing service (CPANTS) to see how "good"
our package is for CPAN and try to increase the Kwalitee score. There
appear to be details only for bioperl-1.2.3 for some reason:
http://cpants.perl.org/dist/bioperl

Nath

From pabloivan at gmail.com  Sun Oct 22 15:54:35 2006
From: pabloivan at gmail.com (Pablo Ivan)
Date: Sun, 22 Oct 2006 16:54:35 -0300
Subject: [Bioperl-l] Bioperl installation under Windows
Message-ID: <a8acf25f0610221254v18c8d172kc3ce6fa4ea34f676@mail.gmail.com>

Hello,

I have been trying to install Bioperl 1.4 on a Windows XP system, but I
didn't get too far; my perl installation is ActiveState 5.8.8 build 816.
I then tried the ppm method of searching for bioperl in the repositories
and installing the core package 1.4. It says that the installation was
made successfully, but the /Bio folder doesn't show up in /lib, and it's
like nothing new was installed at all. I was wondering if using that
version of ActiveState could be causing it, but the uninstall option for
it isn't showing in Add/Remove, and I'm afraid that just deleting the
folders and installing version 5.6 of AS could somehow make things worse.
Or should I just forget about it and try using Cygwin?

Thank you,

Pablo.

From cjfields at uiuc.edu  Mon Oct 23 17:34:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 16:34:47 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880610231422o24029a0cu229fccc2b5809b85@mail.gmail.com>
Message-ID: <000401c6f6eb$111df040$15327e82@pyrimidine>

Don't know what that particular error is, but it looks ActivePerl-related
(PPM generates HTML from the blib directory).  You may need to run 'nmake
clean' in between test cycles to get rid of old blib and other files.

 
The carryover issue from old test runs was a definite problem.  Brian fixed
that in the bioperl-db CVS recently.  Also,  I tried Sendu's fixes from CVS
head to Bio::Root::Root and they seem to fix the problems with
Bio::Root::Root.  The issue came down to a use of indirect syntax (a bad
perl practice).  There are other errors popping up related to Bio::Species,
but these seem fixable at least.

 
I committed a few changes to bioperl-db CVS to fix 03simpleseq.t test
failures due to a lack of gzip on WinXP (I didn't see them b/c I had a copy
of GNU gzip in my path).  These should pass w/o problems now on WinXP.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
  _____  

From: Seth Johnson [mailto:johnson.biotech at gmail.com] 
Sent: Monday, October 23, 2006 4:22 PM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

 
Chris,

I have not cleaned my test database yet.  I'll purge it and redo the tests. 

This error keeps popping up in unexpected places while running nmake during
installation: 
 "Undefined subroutine &main::UpdateHTML_blib called at -e line 1. 
NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code
'0xff'"

Is there a way around it??

Seth

On 10/23/06, Chris Fields <cjfields at uiuc.edu> wrote:

Seth, 

Did you try this with a clean, taxonomy-installed database?  There may be
some junk left over from the previous test runs.

I'm looking into it this week; it may not make the developer release but
we'll try to get it in.  BTW, the 03simpleseq.t test failures have to do
with a call to gzip.  I'll look into a workaround for that.

Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but
introduces others.  One alternative which I found works is cygwin, but
there's a catch: DBD-mysql is hard to install.  If it isn't one thing it's
another...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
  _____  

From: Seth Johnson [mailto:johnson.biotech at gmail.com] 
Sent: Monday, October 23, 2006 11:37 AM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

 
Chris,

There's definite improvement:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed Test     Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/02species.t                 65    2   3.08%  63 65
t/03simpleseq.t    1   256    59  106 179.66%  7-59
t/04swiss.t                   52   14  26.92%  25 27-34 38-42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There's some weirdness going on during the 'swiss.t' test.  It almost seems
to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 &
41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): 
================================
not ok 25
# Test 25 got: '10097078' (t/04swiss.t at line 79)
#    Expected: '91309150'
ok 26
not ok 27
# Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t
at line 85) 
#    Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
not ok 28
# Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' (t/04swiss.t at line 86)
#    Expected: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' 
not ok 29
# Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
(t/04swiss.t at line 87)
#    Expected: 'Cell 66 (2), 383-394 (1991)'
not ok 30
# Test 30 got: <UNDEF> (t/04swiss.t at line 88) 
#    Expected: '91309150'
not ok 31
# Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t
at line 85 fail #2)
#    Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis, J.E. and Leffers,H.'
not ok 32
# Test 32 got: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' (t/04swiss.t at line 86 fail #2) 
#    Expected: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
not ok 33
# Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail
#2) 
#    Expected: 'Gene 134 (2), 283-287 (1993)'
not ok 34
# Test 34 got: <UNDEF> (t/04swiss.t at line 88 fail #2)
#    Expected: '94085792'
ok 35
ok 36
ok 37
not ok 38
# Test 38 got: <UNDEF> (t/04swiss.t at line 88 fail #3) 
#    Expected: '94253723'
not ok 39
# Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4)
#    Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.'
not ok 40
# Test 40 got: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
(t/04swiss.t at line 86 fail #4)
#    Expected: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' 
not ok 41
# Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail
#4)
#    Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
not ok 42
# Test 42 got: <UNDEF> (t/04swiss.t at line 88 fail #4) 
#    Expected: '99199225'
==============================
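
The swap pattern can be checked mechanically. Below is a minimal sketch in Python (an illustration only, not part of the bioperl-db test suite); the helper names `squash` and `crossed_expectations` are made up for the example, and the strings are the author lists for tests 27, 31 and 39 copied from the output above. Whitespace is ignored because the output shows both 'Celis, J.E.' and 'Celis,J.E.'. It only confirms which test got which other test's expected value; it says nothing about why the order differs.

```python
# Authors reported for the three reference tests at t/04swiss.t line 85,
# copied from the output above as (got, expected) pairs.
results = {
    27: ("Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.",
         "Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G."),
    31: ("Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.",
         "Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., "
         "Celis, J.E. and Leffers,H."),
    39: ("Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J., "
         "Celis,J.E. and Leffers,H.",
         "Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M."),
}


def squash(text):
    """Drop all whitespace so 'Celis, J.E.' and 'Celis,J.E.' compare equal."""
    return "".join(text.split())


def crossed_expectations(results):
    """Return (a, b) pairs where test a got exactly what test b expected."""
    pairs = []
    for a, (got, _) in results.items():
        for b, (_, expected) in results.items():
            if a != b and squash(got) == squash(expected):
                pairs.append((a, b))
    return sorted(pairs)


print(crossed_expectations(results))
```

For these three tests the result is the cycle 27 -> 39, 31 -> 27, 39 -> 31, which matches the pairs listed before the output.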

On 10/20/06, Chris Fields <cjfields at uiuc.edu> wrote:


Seth,

Did you work out the problem here?  There was a recent CVS update to OBDA
tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
apparently left data from tests in the database, which caused problems with 
repeated test runs.

Chris

> > -----Original Message-----
> > From: bioperl-l-bounces <at> lists.open-bio.org
> > [mailto:bioperl-l-bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM 
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ...
>
> > FAILED tests 10-12
> >     Failed 3/12 tests, 75.00% okay
> > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> >
> > -------------------------------------------------------------------------------
> > t\02species.t                 65    2   3.08%  63 65
> > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > t\04swiss.t                   52   14  26.92%  25 27-34 38-42 
> > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > t\16obda.t                    12    3  25.00%  10-12
> > _______________________________________________
> > Bioperl-l mailing list 
> > Bioperl-l <at> lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


-- 
Best Regards,


Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358 


From cjfields at uiuc.edu  Mon Oct 23 17:53:27 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 16:53:27 -0500
Subject: [Bioperl-l] Bioperl installation under Windows
In-Reply-To: <a8acf25f0610221254v18c8d172kc3ce6fa4ea34f676@mail.gmail.com>
References: <a8acf25f0610221254v18c8d172kc3ce6fa4ea34f676@mail.gmail.com>
Message-ID: <9994CFF6-FCA1-4C7F-9A33-31765C6AE255@uiuc.edu>

It won't install in Perl\lib, but in Perl\site\lib.  Check there.

We are working intently on the next developer release for BioPerl and
plan on having several PPMs available, but we are only supporting
ActivePerl 5.8.8.819.  I would suggest that you upgrade your
ActivePerl installation to that if possible since PPM has undergone
major changes (they use PPM4 now, which has a GUI by default).  Most
repositories are now moving over to using PPM4 so you'll likely be
seeing fewer PPM3-compatible packages being made.

Chris

On Oct 22, 2006, at 2:54 PM, Pablo Ivan wrote:

> Hello,
>
> I have been trying to install Bioperl 1.4 on a Windows XP system,  
> but I
> didn't get too far; my perl installation was made using ActiveState
> 5.8.8build 816. I then tried the ppm method of searching for bioperl
> in the
> repositories and installing the core package 1.4. It says that the
> installation was made successfully, but the /Bio folder doesn't  
> show up in
> /lib, and it's like nothing new was installed at all. I was  
> wondering if
> using that version of ActiveState could be causing it, but the  
> uninstall
> option for it isn't showing in Add/Remove, and I'm afraid just  
> deleting the
> folders and installing version 5.6 of AS could somehow damage and make
> things worse. Or should I just forget about it and try using Cygwin?
>
> Thank you,
>
> Pablo.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnson.biotech at gmail.com  Mon Oct 23 17:22:13 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 23 Oct 2006 17:22:13 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <002c01c6f6c8$7163dd20$15327e82@pyrimidine>
References: <b99962880610230936t3024173s5d6f21526dac87a4@mail.gmail.com>
	<002c01c6f6c8$7163dd20$15327e82@pyrimidine>
Message-ID: <b99962880610231422o24029a0cu229fccc2b5809b85@mail.gmail.com>

Chris,

I have not cleaned my test database yet.  I'll purge it and redo the tests.

This error keeps popping up in unexpected places while running nmake during
installation:
 "Undefined subroutine &main::UpdateHTML_blib called at -e line 1.
NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code
'0xff'"

Is there a way around it??

Seth

On 10/23/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
>  Seth,
>
> Did you try this with a clean, taxonomy-installed database?  There may be
> some junk left over from the previous test runs.
>
> I'm looking into it this week; it may not make the developer release but
> we'll try to get it in.  BTW, the 03simpleseq.t test failures have to do
> with a call to gzip.  I'll look into a workaround for that.
>
> Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but
> introduces others.  One alternative which I found works is cygwin, but
> there's a catch: DBD-mysql is hard to install.  If it isn't one thing it's
> another...
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>   ------------------------------
>
> *From:* Seth Johnson [mailto:johnson.biotech at gmail.com]
> *Sent:* Monday, October 23, 2006 11:37 AM
> *To:* Chris Fields
> *Cc:* bioperl-l
> *Subject:* Re: Error retrieving sequence from BioSQL
>
>
>
> Chris,
>
> There's definite improvement:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> -------------------------------------------------------------------------------
>
> t/02species.t                 65    2   3.08%  63 65
> t/03simpleseq.t    1   256    59  106 179.66%  7-59
> t/04swiss.t                   52   14  26.92%  25 27-34 38-42
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> There's some weirdness going on during the 'swiss.t' test.  It almost
> seems to me that expectations of some tests are swapped (27 & 39, 28 & 40,
> 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31):
> ================================
> not ok 25
> # Test 25 got: '10097078' (t/04swiss.t at line 79)
> #    Expected: '91309150'
> ok 26
> not ok 27
> # Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t
> at line 85)
> #    Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
> not ok 28
> # Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic
> mitochondrial matrix protein' (t/04swiss.t at line 86)
> #    Expected: 'Functional expression of cloned human splicing factor SF2:
> homology to RNA-binding proteins, U1 70K, and Drosophila splicing
> regulators'
> not ok 29
> # Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
> (t/04swiss.t at line 87)
> #    Expected: 'Cell 66 (2), 383-394 (1991)'
> not ok 30
> # Test 30 got: <UNDEF> (t/04swiss.t at line 88)
> #    Expected: '91309150'
> not ok 31
> # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
> (t/04swiss.t at line 85 fail #2)
> #    Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
> Celis, J.E. and Leffers,H.'
> not ok 32
> # Test 32 got: 'Functional expression of cloned human splicing factor SF2:
> homology to RNA-binding proteins, U1 70K, and Drosophila splicing
> regulators' (t/04swiss.t at line 86 fail #2)
> #    Expected: 'Cloning and expression of a cDNA covering the complete
> coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
> not ok 33
> # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail
> #2)
> #    Expected: 'Gene 134 (2), 283-287 (1993)'
> not ok 34
> # Test 34 got: <UNDEF> (t/04swiss.t at line 88 fail #2)
> #    Expected: '94085792'
> ok 35
> ok 36
> ok 37
> not ok 38
> # Test 38 got: <UNDEF> (t/04swiss.t at line 88 fail #3)
> #    Expected: '94253723'
> not ok 39
> # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
> Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4)
> #    Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.'
> not ok 40
> # Test 40 got: 'Cloning and expression of a cDNA covering the complete
> coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
> (t/04swiss.t at line 86 fail #4)
> #    Expected: 'Crystal structure of human p32, a doughnut-shaped acidic
> mitochondrial matrix protein'
> not ok 41
> # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail
> #4)
> #    Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
> not ok 42
> # Test 42 got: <UNDEF> (t/04swiss.t at line 88 fail #4)
> #    Expected: '99199225'
> ==============================
>
>  On 10/20/06, *Chris Fields* < cjfields at uiuc.edu> wrote:
>
>
>
> Seth,
>
> Did you work out the problem here?  There was a recent CVS update to OBDA
> tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
> apparently left data from tests in the database, which caused problems
> with
> repeated test runs.
>
> Chris
>
> > > -----Original Message-----
> > > From: bioperl-l-bounces <at> lists.open-bio.org [mailto: bioperl-l-
> > > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > > Sent: Saturday, September 30, 2006 6:35 PM
> > > To: Hilmar Lapp
> > > Cc: Chris Fields; Bioperl List
> > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> > >
> > > Here're complete test details:
> > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > ...
> >
> > > FAILED tests 10-12
> > >     Failed 3/12 tests, 75.00% okay
> > > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> > >
> >
> --------------------------------------------------------------------------
> > > -----
> > > t\02species.t                 65    2   3.08%  63 65
> > > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > > t\16obda.t                    12    3  25.00%  10-12
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l <at> lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
>


-- 
Best Regards,


Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358

From chhalling at alumni.ls.berkeley.edu  Mon Oct 23 21:02:24 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Mon, 23 Oct 2006 21:02:24 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C6509.90005@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk>
Message-ID: <453D6620.5020401@alumni.ls.berkeley.edu>

Sorry, I should know better about giving all the details.

This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a 
fresh compile) with Mac OS X 10.4.8.

-- Conrad

Nathan S. Haigh wrote:
> Chris Fields wrote:
>   
>> Thanks for letting us know!  Did PPM4 throw errors or just silently  
>> pass them over?
>>
>> Chris
>>
>> On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote:
>>
>>   
>>     
> I believe he is talking about the bundle on cpan and not the ppd. I will
> get this updated as soon as possible.
>
> Sendu/Chris - can you confirm to me which Bioperl modules are essential
> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
> reason for not putting *all* dependencies into the bundle?
>
> Nath
>
>
>
>
>
>   


-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From n.haigh at sheffield.ac.uk  Tue Oct 24 03:05:53 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 24 Oct 2006 08:05:53 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453D6620.5020401@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453D6620.5020401@alumni.ls.berkeley.edu>
Message-ID: <453DBB51.6010505@sheffield.ac.uk>

Conrad Halling wrote:
> Sorry, I should know better about giving all the details.
>
> This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a 
> fresh compile) with Mac OS X 10.4.8.
>
> -- Conrad
>
>   
My apologies Conrad, this was my bad! Are you in need of the corrections 
being made swiftly or can you wait until the Bioperl 1.5.2 release when 
I'll ensure the Bundle is updated correctly for that release?

Cheers
Nath

From n.haigh at sheffield.ac.uk  Tue Oct 24 05:57:25 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 10:57:25 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CE2D7.5080608@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>
	<453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk>
Message-ID: <453DE385.8010700@sheffield.ac.uk>

--snip--
> Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be 
> treated higher than 1.4? Anyway, we can cross that bridge when we get 
> there, but this seems appropriate now.
>
>
> Cheers,
> Sendu.
>   
Just been having a think about this versioning. Does this work well and
is it intuitive with versioning the official 1.5.2 developer release and
also the 1.6 stable release? I'd like to put forward the following
versioning scheme for consideration (most is the same as what it is now,
but with some clarification - hopefully):
major-version . minor-version sub-version _ developer-release-version
RC-version

The sub-version represents bug-fixes and possibly some minor feature
enhancements with no API changes.
The minor-version represents some significant feature enhancements/API
changes/bug fixes.
The major-version represents significant rewrites of Bioperl.

For an RC of a developer release the version would have _0x (where x=the
RC number)
For a non RC of a developer release the version would have _10
For an RC of a stable release the version would have _0x (where x=RC number)
For a non RC of a stable release the version would not have the
underscore suffix

Therefore I would see the following $VERSION being applied:
1.5.2 RC1            = 1.52_01
1.5.2 RC2            = 1.52_02
1.5.2 RC3            = 1.52_03
1.5.2                = 1.52_10
1.6 RC1              = 1.60_01
1.6 RC2              = 1.60_02
1.6                  = 1.60
1.6.1 RC1            = 1.61_01
1.6.1                = 1.61

This should satisfy the requirement of CPAN for having underscores in
versions to indicate a developer release, which here is a Bioperl
release with an odd minor version number or any RC whether it be of a
developer release or a stable release. This should mean that we could
have the RC's on CPAN, but by default, CPAN would only install the
latest "non developer release" (i.e. the last package without an
underscore in the version).

If we are going ahead with the new $VERSION scheme (as it currently is
in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
1.52 instead of Bioperl 1.5.2 and make an effort to sync the
documentation with regards to this.
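
To make the rules concrete, here is a minimal sketch in Python (an illustration only, not BioPerl code). The function name `proposed_version` and its arguments are made up for the example. It assumes single-digit minor and sub versions and RC numbers below 10, and it treats an odd minor version as a developer release, as described above.

```python
def proposed_version(major, minor, sub=0, rc=None):
    """Return the $VERSION string for a release under the proposed scheme.

    Hypothetical helper: 'rc' is the release candidate number, or None
    for the final release.
    """
    base = f"{major}.{minor}{sub}"
    if rc is not None:
        # Any release candidate gets an _0x suffix.
        return f"{base}_{rc:02d}"
    if minor % 2 == 1:
        # Final developer release (odd minor version) gets _10.
        return f"{base}_10"
    # Final stable release has no underscore suffix.
    return base


releases = [
    ("1.5.2 RC1", proposed_version(1, 5, 2, rc=1)),
    ("1.5.2", proposed_version(1, 5, 2)),
    ("1.6 RC2", proposed_version(1, 6, rc=2)),
    ("1.6", proposed_version(1, 6)),
    ("1.6.1 RC1", proposed_version(1, 6, 1, rc=1)),
    ("1.6.1", proposed_version(1, 6, 1)),
]
for label, version in releases:
    print(f"{label:<12} = {version}")
```

The printed values agree with the corresponding rows of the table above (1.52_01, 1.52_10, 1.60_02, 1.60, 1.61_01, 1.61).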

Nath


From bix at sendu.me.uk  Tue Oct 24 06:19:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 11:19:05 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DE385.8010700@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>
	<453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk>
	<453DE385.8010700@sheffield.ac.uk>
Message-ID: <453DE899.4030603@sendu.me.uk>

Nathan Haigh wrote:
>
> Therefore I would see the following $VERSION being applied:
> 1.5.2 RC1            = 1.52_01
> 1.5.2 RC2            = 1.52_02
> 1.5.2 RC3            = 1.52_03
> 1.5.2                = 1.52_10
> 1.6 RC1              = 1.60_01
> 1.6 RC2              = 1.60_02
> 1.6                  = 1.60
> 1.6.1 RC1            = 1.61_01
> 1.6.1                = 1.61
> 
> This should satisfy the requirement of CPAN for having underscores in
> versions to indicate a developer release, which here is a Bioperl
> release with an odd minor version number or any RC whether it be of a
> developer release or a stable release. This should mean that we could
> have the RC's on CPAN, but by default, CPAN would only install the
> latest "non developer release" (i.e. the last package without an
> underscore in the version).

That all sounds good to me, except I worry about potential confusion if 
people look manually at the things available in CPAN, see 1.60_02 and 
think it is more recent than 1.60 and try to install it manually.

Since
$VERSION = 1.52_10;
is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, 
final release version should be
$VERSION = 1.6010.


> If we are going ahead with the new $VERSION scheme (as it currently is
> in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
> documentation with regards to this.

I might disagree with this though. I think perl people, and perhaps unix
people in general, should be used to seeing version numbers like '1.5.2'
but getting '1.52' from the code, since the latter allows simple
numerical comparisons while the former does not. The former is easier to
read and understand. This is just how Perl itself behaves.

Most users who wouldn't expect such a behaviour aren't going to be
checking the version number programmatically anyway.


BTW. do we have someone with a CPAN account, or should I get one?

From n.haigh at sheffield.ac.uk  Tue Oct 24 07:37:12 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 12:37:12 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DE899.4030603@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk>
Message-ID: <453DFAE8.5050602@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>   
>> Therefore I would see the following $VERSION being applied:
>> 1.5.2 RC1            = 1.52_01
>> 1.5.2 RC2            = 1.52_02
>> 1.5.2 RC3            = 1.52_03
>> 1.5.2                = 1.52_10
>> 1.6 RC1              = 1.60_01
>> 1.6 RC2              = 1.60_02
>> 1.6                  = 1.60
>> 1.6.1 RC1            = 1.61_01
>> 1.6.1                = 1.61
>>
>> This should satisfy the requirement of CPAN for having underscores in
>> versions to indicate a developer release, which here is a Bioperl
>> release with an odd minor version number or any RC whether it be of a
>> developer release or a stable release. This should mean that we could
>> have the RC's on CPAN, but by default, CPAN would only install the
>> latest "non developer release" (i.e. the last package without an
>> underscore in the version).
>>     
>
> That all sounds good to me, except I worry about potential confusion if 
> people look manually at the things available in CPAN, see 1.60_02 and 
> think it is more recent than 1.60 and try to install it manually.
>
>   

I'm not sure if this would be a problem. As far as I understand, CPAN
treats these packages with underscores in $VERSION as something
distinctly different from the other releases (i.e. as developer releases).
If you look at such a page, it is clearly evident that it is a
developer release. For example, if you search on CPAN for the latest
version of the CPAN module, it shows 1.8802. If you go to that page:
http://search.cpan.org/~andk/CPAN-1.8802/
there is also a link for the latest developer release, released 1 day
after 1.8802 with a version of 1.88_57 (which would convert to 1.8857).
This too appears to be later than 1.8802, but since it is dealt with as
a developer release it doesn't seem to matter - CPAN will only deal with
the stable (non-developer) releases, while the underscore versions
remain a convenient way to get at developer releases. Although I'm
thinking CPAN uses some hocus pocus with release dates too.
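
A small sketch of the behaviour as I understand it, written in Python purely for illustration. This is not how CPAN is implemented; the function names are hypothetical and the only rule encoded is "a version with an underscore is a developer release and is skipped by default". The two version strings are the ones from the CPAN module example above.

```python
def as_number(version):
    """Read a version such as '1.88_57' as the number 1.8857."""
    return float(version.replace("_", ""))


def is_developer_release(version):
    """Assumed rule: an underscore in the version marks a developer release."""
    return "_" in version


def latest_stable(versions):
    """Highest version among those without an underscore, or None."""
    stable = [v for v in versions if not is_developer_release(v)]
    return max(stable, key=as_number) if stable else None


def latest_any(versions):
    """Highest version by plain numeric comparison, developer or not."""
    return max(versions, key=as_number)


available = ["1.8802", "1.88_57"]
print(latest_stable(available))  # 1.8802
print(latest_any(available))     # 1.88_57
```

So a default install would pick 1.8802 even though 1.88_57 is numerically higher, as long as the underscore is what the tool looks at.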

> Since
> $VERSION = 1.52_10;
> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, 
> final release version should be
> $VERSION = 1.6010.
>
>
>   

Because they are dealt with separately, I don't think this is an issue
(see above).

>> If we are going ahead with the new $VERSION scheme (as it currently is
>> in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
>> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
>> documentation with regards to this.
>>     
>
> I might disagree with this though. I think perl people, and perhaps unix 
> people in general, should be used to version numbers like '1.5.2', but 
> then getting '1.52' from the code since such a number allows simple 
> numerical comparisons while the former does not. The former is easier to 
> read and understand. This is just how Perl itself behaves.
>
> Most users who wouldn't expect such a behaviour aren't going to be
> checking the version number programmatically anyway.
>
>
> BTW. do we have someone with a CPAN account, or should I get one?
>   

It says Ewan Birney is the author of Bioperl - I assume it must be
possible to have multiple people have the permissions to update a single
package.

Nath

From chhalling at alumni.ls.berkeley.edu  Tue Oct 24 07:15:12 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Tue, 24 Oct 2006 07:15:12 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453DBB51.6010505@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk>
	<453D6620.5020401@alumni.ls.berkeley.edu>
	<453DBB51.6010505@sheffield.ac.uk>
Message-ID: <453DF5C0.3040104@alumni.ls.berkeley.edu>

Nathan S. Haigh wrote:
> Conrad Halling wrote:
>> Sorry, I should know better about giving all the details.
>>
>> This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 
>> (a fresh compile) with Mac OS X 10.4.8.
>>
>> -- Conrad  
> My apologies Conrad, this was my bad! Are you in need of the 
> corrections being made swiftly or can you wait until the Bioperl 1.5.2 
> release when I'll ensure the Bundle is updated correctly for that 
> release?
>
> Cheers
> Nath

No, I'm fine. I used the cpan utility to load the three modules manually.

-- Conrad

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From bix at sendu.me.uk  Tue Oct 24 08:16:54 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 13:16:54 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DFAE8.5050602@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
Message-ID: <453E0436.3050903@sendu.me.uk>

Nathan Haigh wrote:
> Sendu Bala wrote:
>
>> That all sounds good to me, except I worry about potential confusion if 
>> people look manually at the things available in CPAN, see 1.60_02 and 
>> think it is more recent than 1.60 and try to install it manually.
> 
> I'm not sure if this would be a problem. As far as I understand, CPAN
> treats these packages with underscores in $VERSION as something
> distinctly different from the other releases (i.e. as developer releases).
> If you look at such a page, it is clearly evident that it is a
> developer release. For example, if you search on CPAN for the latest
> version of the CPAN module, it shows 1.8802. If you go to that page:
> http://search.cpan.org/~andk/CPAN-1.8802/
> There is also a link for the latest developer release, released 1 day
> after 1.8802 with a version of 1.88_57 (which would convert to 1.8857).

[snip]

>> Since
>> $VERSION = 1.52_10;
>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, 
>> final release version should be
>> $VERSION = 1.6010.
>
> Because they are dealt with separately, I don't think this is an issue
> (see above).

If you don't notice the dates, or are doing numerical version number
comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
not be automatic, but you can still choose to download the developer
releases. Which means if we say to someone 'use Bioperl 1.6 or better'
they may choose to get the latest version and think it is 1.6002 when
in fact 1.60 was the more recent version. 1.6010 solves the problem, is
consistent with your 1.52_10 suggestion, and doesn't cause any problems
as far as I can see.
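
The ordering problem is easy to see with plain numbers. The sketch below is Python, used only as an illustration; `as_number` is a made-up helper that drops the underscore the same way the unquoted literal 1.52_10 is read as 1.5210.

```python
def as_number(version):
    """Drop the underscore and read the rest as a number (1.60_02 -> 1.6002)."""
    return float(version.replace("_", ""))


rc2 = as_number("1.60_02")        # release candidate 2
plain_final = as_number("1.60")   # final release without a suffix
final_10 = as_number("1.60_10")   # final release written as 1.6010

print(rc2 > plain_final)  # True: the RC sorts above the plain final release
print(final_10 > rc2)     # True: a final release of 1.6010 sorts above every RC
```

With the plain 1.60 the final release compares lower than its own release candidates; with 1.6010 the order is RC1 < RC2 < final.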


>>> If we are going ahead with the new $VERSION scheme (as it currently is
>>> in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
>>> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
>>> documentation with regards to this.
>>>     
>> I might disagree with this though. I think perl people, and perhaps unix 
>> people in general, should be used to version numbers like '1.5.2', but 
>> then getting '1.52' from the code since such a number allows simple 
>> numerical comparisons while the former does not. The former is easier to 
>> read and understand. This is just how Perl itself behaves.
>>
>> Most users who wouldn't expect such a behaviour aren't going to be
>> checking the version number programmatically anyway.
>>
>>
>> BTW. do we have someone with a CPAN account, or should I get one?
>>   
> 
> It says Ewan Birney is the author of Bioperl - I assume it must be
> possible to have multiple people have the permissions to update a single
> package.

How did you get Bundle::BioPerl updated? Did you just ask Chris 
Dagdigian to do it for you? Or do you have access to his account? I'll 
ask Ewan about it.

From n.haigh at sheffield.ac.uk  Tue Oct 24 08:21:56 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 13:21:56 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0436.3050903@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
	<453E0436.3050903@sendu.me.uk>
Message-ID: <453E0564.9030302@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> Sendu Bala wrote:
>>
>>> That all sounds good to me, except I worry about potential confusion
>>> if people look manually at the things available in CPAN, see 1.60_02
>>> and think it is more recent than 1.60 and try to install it manually.
>>
>> I not sure if this would be a problem. As far as I understand, CPAN
>> treats these packages with underscores in $VERSION as something
>> distinctly different to the others releases (i.e. developer releases).
>> If you look at such a page, it is clearly evident that it is a
>> developers release. For example, if you search on CPAN for the latest
>> version of the CPAN module is shows 1.8802. if you go to that page:
>> http://search.cpan.org/~andk/CPAN-1.8802/
>> There is also a link for the latest developer release, released 1 day
>> after 1.8802 with a version of 1.88_57 (which would convert to 1.8857).
>
> [snip]
>
>>> Since
>>> $VERSION = 1.52_10;
>>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before
>>> release, final release version should be
>>> $VERSION = 1.6010.
>>
>> Because they are dealt with separately, I don't think this is an issue
>> (see above).
>
> If you don't notice the dates, or are doing numerical version number
> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
> not be automatic, but you can still chose to download the developer
> releases. Which means if we say to someone 'use Bioperl 1.6 or better'
> they may choose to get the latest version and think it is 1.6002 when
> infact 1.60 was the more recent version. 1.6010 solves the problem, is
> consistent with your 1.50_10 suggestion, and doesn't cause any
> problems as far as I can see.
>
>

I see - you mean for a non-RC release append 10 to the version number
and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
the version.

--snip--
>
> How did you get Bundle::BioPerl updated? Did you just ask Chris
> Dagdigian to do it for you? Or do you have access to his account? I'll
> ask Ewan about it.
I just asked Chris D. to do it for me :o)

Nath

From bix at sendu.me.uk  Tue Oct 24 09:01:22 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 14:01:22 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0564.9030302@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
	<453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk>
Message-ID: <453E0EA2.6050306@sendu.me.uk>

Nathan Haigh wrote:
> I see - you mean for a non-RC release append 10 to the version number
> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
> the version.

Precisely.

1.5.2 RC3 will have in Bio::Root::Version :

$VERSION = 1.52_03;
$VERSION = eval $VERSION; # $VERSION is 1.5203

1.5.2 final release would have:

$VERSION = 1.52_10;
$VERSION = eval $VERSION; # $VERSION is 1.5210

1.6.0 RC1 would have:

$VERSION = 1.60_01;
$VERSION = eval $VERSION; # $VERSION is 1.6001

1.6.0 final release would have:

$VERSION = 1.6010;


Nice thing about putting RCs up on CPAN is that I suppose we'd see the 
test results from cpantesters. The more test results the better :)
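
A minimal sketch of how the numbers in the scheme above compare
numerically. It assumes the quoted-string form of the eval idiom (so the
underscore is still present in the source line until eval strips it);
the variable names are illustrative only.

```perl
use strict;
use warnings;

# Quoted form of the idiom: the string keeps the underscore,
# and eval turns it into a plain number.
my $rc1   = eval '1.60_01';   # 1.6001
my $plain = 1.60;             # a final release numbered without a suffix
my $final = 1.6010;           # a final release numbered with the "10" suffix

# A plain 1.60 sorts below its own release candidate ...
printf "RC1 > 1.60?   %s\n", ($rc1 > $plain ? 'yes' : 'no');   # yes
# ... while 1.6010 sorts above it.
printf "1.6010 > RC1? %s\n", ($final > $rc1 ? 'yes' : 'no');   # yes
```

This is only the numeric side of the argument; how CPAN itself indexes
underscore versions is a separate question.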

From n.haigh at sheffield.ac.uk  Tue Oct 24 09:05:54 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 14:05:54 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0EA2.6050306@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
	<453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk>
	<453E0EA2.6050306@sendu.me.uk>
Message-ID: <453E0FB2.4080002@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> I see - you mean for a non-RC release append 10 to the version number
>> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
>> the version.
>
> Precisely.
>
> 1.5.2 RC3 will have in Bio::Root::Version :
>
> $VERSION = 1.52_03;
> $VERSION = eval $VERSION; # $VERSION is 1.5203
>
> 1.5.2 final release would have:
>
> $VERSION = 1.52_10;
> $VERSION = eval $VERSION; # $VERSION is 1.5210
>
> 1.6.0 RC1 would have:
>
> $VERSION = 1.60_01;
> $VERSION = eval $VERSION; # $VERSION is 1.6001
>
> 1.6.0 final release would have:
>
> $VERSION = 1.6010;
>
>
> Nice thing about putting RCs up on CPAN is that I suppose we'd see the
> test results from cpantesters. The more test results the better :)
Did you see the cpants site I sent earlier:
http://cpants.perl.org/dist/bioperl

But I'm not sure why 1.4 didn't make it in there instead of 1.2.3

From bix at sendu.me.uk  Tue Oct 24 09:14:08 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 14:14:08 +0100
Subject: [Bioperl-l] CPAN testing Service
In-Reply-To: <453D2120.9010301@sheffield.ac.uk>
References: <453D2120.9010301@sheffield.ac.uk>
Message-ID: <453E11A0.20304@sendu.me.uk>

Nathan S. Haigh wrote:
> We should also check the CPAN testing service (CPANTS) to see how "good"
> our package is for CPAN and try to increase the Kwalitee score. There
> only appears to be details for bioperl-1.2.3 for some reason:
> http://cpants.perl.org/dist/bioperl

Yes, but I think it will be a pretty similar score this time round. We'll 
resolve the remaining issues for 1.6.

From cjfields at uiuc.edu  Tue Oct 24 10:24:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Oct 2006 09:24:44 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0436.3050903@sendu.me.uk>
Message-ID: <000501c6f778$279cee10$15327e82@pyrimidine>

...
> >> Since
> >> $VERSION = 1.52_10;
> >> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release,
> >> final release version should be
> >> $VERSION = 1.6010.
> >
> > Because they are dealt with separately, I don't think this is an issue
> > (see above).
> 
> If you don't notice the dates, or are doing numerical version number
> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
> not be automatic, but you can still chose to download the developer
> releases. Which means if we say to someone 'use Bioperl 1.6 or better'
> they may choose to get the latest version and think it is 1.6002 when
> infact 1.60 was the more recent version. 1.6010 solves the problem, is
> consistent with your 1.50_10 suggestion, and doesn't cause any problems
> as far as I can see.

CPAN looks like it can handle 'x.y.z', at least for Pugs:

http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/

From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':

our $VERSION = 6.002013;

That's also a very Perlish way to do it.  And there are no developer
versions of Pugs, since it is always under active development.  We could try
something like:

our $VERSION = 1.005002_01;

just to tag it as a developer release or release candidate, if that's what
you want; I'm neutral to that point.  I don't think it's necessary to post
every RC to CPAN, though, unless you feel very strongly about it.  It just
seems like more hassle than it's worth, esp. since you've been releasing
about one per week leading up to a final 1.5.2 (due soon).  
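
As a rough illustration of the three-digits-per-component convention
described above, a dotted release name can be mapped to a decimal
$VERSION with sprintf. The helper name dotted_to_decimal is hypothetical
and not part of Bioperl or Pugs.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a dotted 'major.minor.patch' string into a
# decimal number with three digits per component after the point.
sub dotted_to_decimal {
    my ($dotted) = @_;
    my ($major, $minor, $patch) = split /\./, $dotted;
    return sprintf '%d.%03d%03d', $major, $minor || 0, $patch || 0;
}

print dotted_to_decimal('6.2.13'), "\n";   # 6.002013
print dotted_to_decimal('1.5.2'),  "\n";   # 1.005002
```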

> >> I might disagree with this though. I think perl people, and perhaps
> unix
> >> people in general, should be used to version numbers like '1.5.2', but
> >> then getting '1.52' from the code since such a number allows simple
> >> numerical comparisons while the former does not. The former is easier
> to
> >> read and understand. This is just how Perl itself behaves.
> >>
> >> Most users who wouldn't expect such a behaviour aren't going to be
> >> checking the version number programatically anyway.
> >>
> >>
> >> BTW. do we have someone with a CPAN account, or should I get one?
> >>
> >
> > It says Ewan Birney is the author of Bioperl - I assume it must be
> > possible to have multiple people have the permissions to update a single
> > package.

As a quick response to the above, I would read 'rel. 1.5.2' as the second
patched release of the second revision (here in a developer cycle) of the
first major release.  I would read 'rel 1.52' as the 52nd release of the
major release (just can't quite make it to version 2, I guess).  I don't
think we can use the latter as it is just too confusing, especially since
we've adopted the 'major.minor.patch' versioning quite early on.  

As for CPAN, I believe there is usually a person or group responsible for
maintaining each distribution.  As Ewan seems to be the point man, you'll
have to ask him.  I suppose it is possible to add more if needed.

> How did you get Bundle::BioPerl updated? Did you just ask Chris
> Dagdigian to do it for you? Or do you have access to his account? I'll
> ask Ewan about it.

When I inquired about XML::Simple, I emailed Chris D. via his contact
information from CPAN.  He let me know that adding it would be pretty easy,
so all you need to do is let him know about any errors/additions/deletions.
I think his wiki page also has some contact info.  

Which reminds me, if anyone contacts him, could you make sure that
XML::Simple is added?  I can't remember if it has been.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 24 10:29:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Oct 2006 09:29:11 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0FB2.4080002@sheffield.ac.uk>
Message-ID: <000601c6f778$c639f0e0$15327e82@pyrimidine>

> Sendu Bala wrote:
> > Nathan Haigh wrote:
> >> I see - you mean for a non-RC release append 10 to the version number
> >> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
> >> the version.
> >
> > Precisely.
> >
> > 1.5.2 RC3 will have in Bio::Root::Version :
> >
> > $VERSION = 1.52_03;
> > $VERSION = eval $VERSION; # $VERSION is 1.5203
> >
> > 1.5.2 final release would have:
> >
> > $VERSION = 1.52_10;
> > $VERSION = eval $VERSION; # $VERSION is 1.5210
> >
> > 1.6.0 RC1 would have:
> >
> > $VERSION = 1.60_01;
> > $VERSION = eval $VERSION; # $VERSION is 1.6001
> >
> > 1.6.0 final release would have:
> >
> > $VERSION = 1.6010;
> >
> >
> > Nice thing about putting RCs up on CPAN is that I suppose we'd see the
> > test results from cpantesters. The more test results the better :)
> Did you see the cpants site I sent earlier:
> http://cpants.perl.org/dist/bioperl
> 
> But I'm not sure why 1.4 didn't make it in there instead of 1.2.3

Yes, odd.  Another thing to note is that CPAN also lists two bugs related to
bioperl 1.4.  We may need to have some way of either redirecting users from
there to bugzilla, or routinely checking the CPAN site.  Otherwise we'll
miss those. 

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From JK at novozymes.com  Tue Oct 24 10:45:26 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 16:45:26 +0200
Subject: [Bioperl-l] Keeping references around in the objects?
Message-ID: <934F95E71B6C9347A873C42AE3C196191299E011@NZT0004E.dknz.nzcorp.net>

Hi All. 

When getting a Bio::Seq object back from a feature, it would be really
nice to have access to the old objects through the new object, as in:

$featseq->feature()->parent_seq();

Would it be possible to keep the references around so that, for example,
the global information can be accessed through the particular feature?

Most of the annotation in the general header of an EMBL/GenBank record
also applies to the specific features.

Jesper


From JK at novozymes.com  Tue Oct 24 10:28:22 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 16:28:22 +0200
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
Message-ID: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>

Hi. 

We're trying to "extend" Bioperl in our own setup. We have some functions
that we'd like to always have available on a Bio::Seq object. As an
example, I'd like to have the sequence digest available as ->digest, which
just returns a hex-encoded message digest of the sequence in the object.
This is really convenient when trying to figure out whether we've got some
computations stored in the cache for this particular sequence.

Another example is that we have some fields we want to be mandatory in
the objects, so adding additional checks in the constructor is necessary.

Our approach has been to subclass Bio::Seq in a new class (Nz::Seq) and
add the functionality there. This generally works fine: ->translate()
calls ->can_call_new() and instantiates the correct subclassed object.

But the logic fails when the ->seq of a feature just instantiates a
Bio::PrimarySeq without trying to get the subclassed object.

So the question basically is:
What is the preferred way of extending/subclassing Bioperl objects with
our own methods?

Jesper


From bix at sendu.me.uk  Tue Oct 24 11:26:19 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 16:26:19 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <000501c6f778$279cee10$15327e82@pyrimidine>
References: <000501c6f778$279cee10$15327e82@pyrimidine>
Message-ID: <453E309B.9090007@sendu.me.uk>

Chris Fields wrote:
> ...
>>>> Since
>>>> $VERSION = 1.52_10;
>>>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release,
>>>> final release version should be
>>>> $VERSION = 1.6010.
>>> Because they are dealt with separately, I don't think this is an issue
>>> (see above).
>> If you don't notice the dates, or are doing numerical version number
>> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
>> not be automatic, but you can still chose to download the developer
>> releases. Which means if we say to someone 'use Bioperl 1.6 or better'
>> they may choose to get the latest version and think it is 1.6002 when
>> infact 1.60 was the more recent version. 1.6010 solves the problem, is
>> consistent with your 1.50_10 suggestion, and doesn't cause any problems
>> as far as I can see.
> 
> CPAN looks like it can handle 'x.y.z', at least for Pugs:
> 
> http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/

'handle'? I think it shows up as '6.2.13' simply because it was uploaded 
with the filename Perl6-Pugs-6.2.13.tar.gz


As you point out, the code has the kind of $VERSION number we've been 
suggesting in this thread:

> From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':
> 
> our $VERSION = 6.002013;
> 
> That's also a very perlish-way to do it.  And there are no developer
> versions of Pugs, since it is always under active development.  We could try
> something like:
> 
> our $VERSION = 1.005002_01;

Yes, this was already like one of my suggestions (1.0502_01), but I 
brought up the concern that 1.05 might be < 1.4.

So then we have a question: do we try and fumble a 1.4 compatible number 
by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if 
it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no 
room for RC numbering, or 1.006000010 (1.6.0.10) - the first final 
release following some 1.006000_001 (1.6.0.01 == rc1) RCs?
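
To make the 1.4 concern concrete, here is a small sketch of the plain
numeric comparison involved. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $old_release = 1.4;        # version number of the existing release
my $three_digit = 1.005002;   # 1.5.2 with three digits per component
my $two_digit   = 1.52;       # 1.5.2 in the two-digit style

# The three-digit form sorts below the older 1.4 release ...
print "1.005002 < 1.4\n" if $three_digit < $old_release;
# ... while the two-digit form stays above it.
print "1.52 > 1.4\n"     if $two_digit   > $old_release;
```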


> just to tag it as a developer release or release candidate, if that's what
> you want; I'm neutral to that point.  I don't think it's necessary to post
> every RC to CPAN, though, unless you feel very strongly about it.  It just
> seems like more hassle than it's worth, esp. since you've been releasing
> about one per week leading up to a final 1.5.2 (due soon).  

I don't think it would be a hassle; on the contrary it would be very 
useful to know the CPAN distribution actually works. I'm very happy with 
the idea that a release candidate gets fully tested...

From bix at sendu.me.uk  Tue Oct 24 11:39:16 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 16:39:16 +0100
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
Message-ID: <453E33A4.5060004@sendu.me.uk>

JK (Jesper Agerbo Krogh) wrote:
> Hi. 
> 
> We're trying to "extend" bioperl in our own setup. We have some funtions
> that we'd like to "allways" have available on a Bio::Seq-object.
[snip]
> So the question basically is: 
> What is the preferred way of extending/subclassing Bio-perl -objects
> with our own methods? 

http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit


From hlapp at gmx.net  Tue Oct 24 12:24:09 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 24 Oct 2006 12:24:09 -0400
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
Message-ID: <F563FB85-D711-481A-A038-EDCA4B941C3C@gmx.net>

I think you've generally taken the right path, but see below.

First off, object factories are used extensively already but not yet  
in each and every place where Bioperl creates an object internally.  
Achieving your goal may entail fixes to Bioperl to use a factory  
instead of a hard-coded module name. Also be on the lookout for  
factory() or seq_factory() methods for classes whose work entails  
creating sequence objects and that already give you control over the  
type to be created.

The problem that hits you here though isn't one of determining the  
type of the object to be created, because the respective method  
doesn't create a sequence object. It only returns the sequence object  
that the feature has a reference to.

The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your
extension of the latter is that the Perl garbage collector can't deal
with circular references. The way we've circumvented the problem with
sequence objects (which hold references to their feature objects) and
feature objects (which need to hold a reference to their sequence
object) is to make Bio::Seq a wrapper around Bio::PrimarySeq (i.e.,
Bio::Seq implements Bio::PrimarySeqI by delegating all the
Bio::PrimarySeqI methods to an instance of Bio::PrimarySeq, and then
adds implementations of the Bio::SeqI methods), and then make feature
objects only hold a reference to the 'base' Bio::PrimarySeq instance.
This works because Bio::PrimarySeq doesn't hold features, only
Bio::SeqI objects do.
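
The wrapper arrangement described above can be sketched with toy
classes. These are stand-ins written for illustration, not the real
Bioperl modules, and the method names attach_seq and add_feature are
made up for the sketch.

```perl
use strict;
use warnings;

# Toy stand-in for the 'base' sequence: holds the string, no features.
package Toy::PrimarySeq;
sub new { my ($class, %args) = @_; return bless { seq => $args{seq} }, $class }
sub seq { $_[0]{seq} }

# Toy stand-in for a feature: keeps a reference to a sequence object.
package Toy::Feature;
sub new        { my ($class) = @_; return bless {}, $class }
sub attach_seq { my ($self, $seq) = @_; $self->{seq} = $seq; return }
sub entire_seq { $_[0]{seq} }

# Toy wrapper: owns a primary sequence plus a list of features.
package Toy::Seq;
sub new {
    my ($class, %args) = @_;
    my $primary = Toy::PrimarySeq->new(seq => $args{seq});
    return bless { primary => $primary, features => [] }, $class;
}
# Sequence access is delegated to the wrapped primary sequence.
sub seq { $_[0]{primary}->seq }
sub add_feature {
    my ($self, $feat) = @_;
    # The feature is handed the inner primary sequence, never $self,
    # so no wrapper -> feature -> wrapper reference loop is created.
    $feat->attach_seq($self->{primary});
    push @{ $self->{features} }, $feat;
    return;
}

package main;

my $seq  = Toy::Seq->new(seq => 'ATGC');
my $feat = Toy::Feature->new;
$seq->add_feature($feat);
print ref($feat->entire_seq), "\n";   # Toy::PrimarySeq
```

In this toy version, asking a feature for its sequence gives back the
inner object rather than the wrapper or any subclass of it.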

Having said all that, note that if all you want to do is define
computations on Bio::Seq objects, as opposed to storing values for
additional attributes, the best design approach is not to extend the
class but to create a class with those computations as static methods
(which would accept the seq object on which to compute as an argument;
e.g., print $seqComputations->message_digest($seq)).
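
A minimal sketch of such a helper class, assuming an MD5 hex digest is
an acceptable message digest. The class name My::SeqComputations and the
stub sequence class are hypothetical; any object with a seq() method
returning the sequence string should work the same way.

```perl
use strict;
use warnings;
use Digest::MD5 ();

# Hypothetical helper class holding computations as class methods.
package My::SeqComputations;
sub message_digest {
    my ($class, $seq) = @_;
    # Only relies on the object offering seq(), not on its exact class.
    return Digest::MD5::md5_hex($seq->seq);
}

# Minimal stand-in for a sequence object so the sketch runs without Bioperl.
package Toy::SeqStub;
sub new { my ($class, $str) = @_; return bless { seq => $str }, $class }
sub seq { $_[0]{seq} }

package main;

my $seq = Toy::SeqStub->new('ATGC');
print My::SeqComputations->message_digest($seq), "\n";
```

Because the helper only calls seq(), it does not matter whether it is
handed a full sequence object or the primary sequence a feature returns.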

	-hilmar



-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct 24 12:45:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Oct 2006 11:45:25 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E309B.9090007@sendu.me.uk>
Message-ID: <000001c6f78b$d1c65a30$15327e82@pyrimidine>

...
> 
> 'handle'? I think it shows up as '6.2.13' simply because it was uploaded
> with the filename Perl6-Pugs-6.2.13.tar.gz

Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is
'6.002013'.  So maybe we should follow a similar convention.  Seems easier
and less confusing to me, at least.
 
> As you point out, the code has the kind of $VERSION number we've been
> suggesting in this thread:
> 
> > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':
> >
> > our $VERSION = 6.002013;
> >
> > That's also a very perlish-way to do it.  And there are no developer
> > versions of Pugs, since it is always under active development.  We could
> try
> > something like:
> >
> > our $VERSION = 1.005002_01;
> 
> Yes, this was already like one of my suggestions (1.0502_01), but I
> brought up the concern that 1.05 might be < 1.4.
> 
> So then we have a question: do we try and fumble a 1.4 compatible number
> by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if
> it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no
> room for RC numbering, or 1.006000010 (1.6.0.10) - the first final
> release following some 1.006000_001 (1.6.0.01 == rc1) RCs?

I would go for the clean break if it follows perl/CPAN convention.
'1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing.

If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6
RC1, 1.6 RC2 etc then that would be consistent and perl-compatible. 

BTW, the reason I looked at Pugs was to see what some of the Perl6
developers were using.  Who knows; they'll probably change it!

...

> I don't think it would be a hassle; on the contrary it would be very
> useful to know the CPAN distribution actually works. I'm very happy with
> the idea that a release candidate gets fully tested...

So you obviously feel strongly about it!  ;> 

I don't have a problem as long as we stick with doing this from now on (i.e.
have a consistent versioning scheme, release policy, CPAN release policy,
etc).  Would be nice for Jason/Brian/Hilmar to chime in as to the reasoning
behind the older versioning scheme.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From JK at novozymes.com  Tue Oct 24 13:59:10 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 19:59:10 +0200
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
	<F563FB85-D711-481A-A038-EDCA4B941C3C@gmx.net>
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net>

>  
> I think you've generally taken the right path, but see below.
> 
> First off, object factories are used extensively already but not yet  
> in each and every place where Bioperl creates an object internally.  
> Achieving your goal may entail fixes to Bioperl to use a factory  
> instead of a hard-coded module name. Also be on the lookout for  
> factory() or seq_factory() methods for classes whose work entails  
> creating sequence objects and that already give you control over the  
> type to be created.

Can you elaborate/describe this a bit more? 

> The problem that hits you here though isn't one of determining the  
> type of the object to be created, because the respective method  
> doesn't create a sequence object. It only returns the sequence object  
> that the feature has a reference to.

That is what Data::Dumper told me. Something else I would like to
change is to get a RichSeq object returned every time from Bio::Seq,
adding in the stuff that always seems appropriate.

> The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your  
> extension of the latter is that the Perl garbage collector can't deal  
> with circular references. 

Doesn't Scalar::Util::weaken solve that? 

> Having said all that, note that if all what you want to do is  
> defining computations on Bio::Seq objects, as opposed to storing  
> values for additional attributes, the best design approach is not to  
> extend the class but to create a class with those computations as  
> static methods (which would accept the seq object on which to compute  
> as an argument; e.g., print $seqComputations->message_digest($seq)).

I could, but there is some functionality that, by design, I'd like to
have available on every sequence in the system. This way I would end up
coding the functionality for getting the message_digest in every place
that I needed the value (which would be quite often in this application),
whereas by design it belongs in the Bio::Seq stuff.

Jesper


From JK at novozymes.com  Tue Oct 24 13:59:19 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 19:59:19 +0200
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
	<453E33A4.5060004@sendu.me.uk>
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FD@NZT0004E.dknz.nzcorp.net>


> JK (Jesper Agerbo Krogh) wrote:
> > Hi. 
> > 
> > We're trying to "extend" bioperl in our own setup. We have some funtions
> > that we'd like to "allways" have available on a Bio::Seq-object.
> [snip]
> > So the question basically is: 
> > What is the preferred way of extending/subclassing Bio-perl -objects
> > with our own methods? 
> 
> http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit

That is definitely a way of extending Bioperl, thanks. 

Jesper


From hlapp at gmx.net  Tue Oct 24 14:57:02 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 24 Oct 2006 14:57:02 -0400
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
	<F563FB85-D711-481A-A038-EDCA4B941C3C@gmx.net>
	<934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net>
Message-ID: <C8DB5DCD-E5BB-4AA0-9CDA-3C2EC7B88621@gmx.net>


On Oct 24, 2006, at 1:59 PM, JK ((Jesper Agerbo Krogh)) wrote:

>>
>> I think you've generally taken the right path, but see below.
>>
>> First off, object factories are used extensively already but not yet
>> in each and every place where Bioperl creates an object internally.
>> Achieving your goal may entail fixes to Bioperl to use a factory
>> instead of a hard-coded module name. Also be on the lookout for
>> factory() or seq_factory() methods for classes whose work entails
>> creating sequence objects and that already give you control over the
>> type to be created.
>
> Can you elaborate/describe this a bit more?

See for example the POD of Bio::SeqIO (sorry, the method is called  
sequence_factory()).

>
>> The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your
>> extension of the latter is that the Perl garbage collector can't deal
>> with circular references.
>
> Doesn't Scalar::Util::weaken solve that?

You're welcome to test and try. It should be a simple change in  
Bio::Seq::add_SeqFeature(). You will see that it is this method and  
not the feature object that makes sure the wrapped primarySeq gets  
passed as sequence reference. Just change that to creating a new  
reference to the sequence object and make it a weak reference before  
passing it to the feature object.

(The feature object has no requirement (or knowledge) that the  
referenced sequence object is a PrimarySeq.)
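
For anyone trying this, a minimal sketch of the weak-reference idea
using plain hashes rather than Bioperl objects. It only shows what
Scalar::Util::weaken does to a back-reference; it is not the actual
add_SeqFeature() code.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $feature = { name => 'gene' };

{
    # The sequence holds a normal (strong) reference to its feature.
    my $seq = { id => 'seq1', features => [$feature] };

    # The feature points back at the sequence, but only weakly.
    $feature->{seq} = $seq;
    weaken($feature->{seq});

    print "weak: ", (isweak($feature->{seq}) ? 'yes' : 'no'), "\n";   # yes
}

# $seq has gone out of scope; the weak back-reference did not keep it
# alive, so the feature's slot is now undef.
print "sequence freed: ", (defined $feature->{seq} ? 'no' : 'yes'), "\n";   # yes
```

The flip side, visible in the last line, is that a feature kept around
longer than its sequence ends up with an undefined sequence reference.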

>
>> Having said all that, note that if all what you want to do is
>> defining computations on Bio::Seq objects, as opposed to storing
>> values for additional attributes, the best design approach is not to
>> extend the class but to create a class with those computations as
>> static methods (which would accept the seq object on which to compute
>> as an argument; e.g., print $seqComputations->message_digest($seq)).
>
> I could but there are some functionality that I'd by design would  
> like to
> have available on every sequence in the system. This way I would  
> end up
> coding the functionality for getting the message_digest every place  
> that
> I needed to get the value (which would be quite often in this  
> application),
> whereas it by design belongs into the Bio::Seq-stuff.

I don't follow why this would make any difference (it would be
$seq->message_digest() compared to $seqCompute->message_digest($seq)),
unless what you are saying is that you would like to cache the result of
the computation.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Wed Oct 25 06:36:27 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 25 Oct 2006 11:36:27 +0100
Subject: [Bioperl-l] Lagan environment variable
Message-ID: <453F3E2B.2040309@sendu.me.uk>

Notification to say I'm changing the environmental variable that 
Bio::Tools::Run::Alignment::Lagan expects to define the location of the 
lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the 
default variable that the lagan installation and scripts themselves look 
for.

I hope this isn't too much of a burden, but it seems like the sensible 
approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.


Thank you,
Sendu.

From n.haigh at sheffield.ac.uk  Wed Oct 25 09:07:47 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 25 Oct 2006 13:07:47 +0000
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F3E2B.2040309@sendu.me.uk>
References: <453F3E2B.2040309@sendu.me.uk>
Message-ID: <453F61A3.4090904@sheffield.ac.uk>

Sendu Bala wrote:
> Notification to say I'm changing the environmental variable that 
> Bio::Tools::Run::Alignment::Lagan expects to define the location of the 
> lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the 
> default variable that the lagan installation and scripts themselves look 
> for.
>
> I hope this isn't too much of a burden, but it seems like the sensible 
> approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.
>
>
> Thank you,
> Sendu.
Wouldn't it make more sense to change the test? That is what I've just
done for t/Genscan.t

It seemed to fit in with the ENV variable syntax that other modules in
Bioperl-run used.

Nath

-- 
> A: Yes.
>> Q: Are you sure?
>>     
>>> A: Because it reverses the logical flow of conversation.
>>>       
>>>> Q: Why is top posting frowned upon?
>>>>         
Get Thunderbird <http://www.mozilla.org/products/thunderbird/>

From bix at sendu.me.uk  Wed Oct 25 08:12:00 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 25 Oct 2006 13:12:00 +0100
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F61A3.4090904@sheffield.ac.uk>
References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk>
Message-ID: <453F5490.7060808@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Notification to say I'm changing the environmental variable that 
>> Bio::Tools::Run::Alignment::Lagan expects to define the location of the 
>> lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the 
>> default variable that the lagan installation and scripts themselves look 
>> for.
>>
>> I hope this isn't too much of a burden, but it seems like the sensible 
>> approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.
>
> Wouldn't it make more sense to change the test? That is what I've just
> done for t/Genscan.t

For Genscan.t, the test script looked at the wrong environment variable.

Here I'm talking about lagan itself (the thing you get from 
http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with 
Bioperl) needing the environment variable LAGAN_DIR to be set in order 
to work.

Since you need to set LAGAN_DIR to make lagan work, it makes sense that 
the Bioperl front-end to lagan also use the same variable.
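
A minimal sketch of how a wrapper could look the directory up, preferring LAGAN_DIR and falling back to the old LAGANDIR. The fallback and the helper name find_lagan_dir are assumptions for illustration only, not what Bio::Tools::Run::Alignment::Lagan actually does.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lagan directory from an environment
# hash, trying the new variable name first and the old one second.
sub find_lagan_dir {
    my ($env) = @_;
    for my $name (qw(LAGAN_DIR LAGANDIR)) {
        my $dir = $env->{$name};
        return $dir if defined $dir && length $dir;
    }
    return;    # neither variable is set
}

my $dir = find_lagan_dir(\%ENV);
print defined $dir ? "lagan directory: $dir\n" : "LAGAN_DIR is not set\n";
```

Passing the environment in as a hash reference keeps the lookup easy to exercise in a test script without touching the real %ENV.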


From n.haigh at sheffield.ac.uk  Wed Oct 25 09:16:16 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 25 Oct 2006 13:16:16 +0000
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F5490.7060808@sendu.me.uk>
References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk>
	<453F5490.7060808@sendu.me.uk>
Message-ID: <453F63A0.7040609@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Notification to say I'm changing the environmental variable that
>>> Bio::Tools::Run::Alignment::Lagan expects to define the location of
>>> the lagan executables from LAGANDIR to LAGAN_DIR, since the latter
>>> is the default variable that the lagan installation and scripts
>>> themselves look for.
>>>
>>> I hope this isn't too much of a burden, but it seems like the
>>> sensible approach to getting Bio::Tools::Run::Alignment::Lagan to
>>> actually work.
>>
>> Wouldn't it make more sense to change the test? That is what I've just
>> done for t/Genscan.t
>
> For Genscan.t, the test script looked at the wrong environment variable.
>
> Here I'm talking about lagan itself (the thing you get from
> http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with
> Bioperl) needing the environment variable LAGAN_DIR to be set in order
> to work.
>
> Since you need to set LAGAN_DIR to make lagan work, it makes sense
> that the Bioperl front-end to lagan also use the same variable.
>
Ah, OK! :-[  That'll teach me to speak up about something I know nothing
about! :-)

FYI, I've been busy this morning installing as much Bioperl-run external
software as I could (those that have tests). Will be posting results shortly.

Nath


From massimo.ubaldi at gmail.com  Wed Oct 25 10:28:52 2006
From: massimo.ubaldi at gmail.com (Massimo Ubaldi)
Date: Wed, 25 Oct 2006 16:28:52 +0200
Subject: [Bioperl-l] blastxml format
Message-ID: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>

Hi,
I'm using the script below to parse blastn output for multiple query
sequences. I got the output from the BLAST web interface, asking for
XML-formatted output.
Everything works fine except that I cannot print the name of each input
sequence (see below).
That is, with the line $result->query_description (see below) I get just
the name of the first sequence. In fact this is defined by the
<BlastOutput_query-def> tag.
What I really want is to extract the name that is defined by the
<Iteration_query-def> tag.
I have searched the bioperl mailing list and other sources but did not
find anything to solve this.
Can somebody help me?
Thanks a lot,
Massimo


 This is an example of the output I got:

MRDNA_probe
46.1    PREDICTED: Danio rerio similar to mineralocorticoid receptor form B (LOC562171), mRNA    68354945    XM_685568
81.8    Danio rerio VDR-B mRNA, partial cds    68132043    DQ017633
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA    68420187    XM_684078

This is what I'd like to get:
MRDNA_probe
46.1    PREDICTED: Danio rerio similar to mineralocorticoid receptor form B (LOC562171), mRNA    68354945    XM_685568
VDRacterm_probe
81.8    Danio rerio VDR-B mRNA, partial cds    68132043    DQ017633
ARalpcterm_probe
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA    68420187    XM_684078

This is the script
#!/usr/bin/perl
use strict;
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast',
                            -file   => 'Blastn_danio.bls');
open OUTFILE, ">parsed_blastn_danio.txt"
    or die "Could not open file, stopped";
my $result = $in->next_result;
print OUTFILE $result->algorithm, "\n";
print OUTFILE $result->database_name, "\n";

print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers",
"\t", "GenBank Accession", "\n";

while($result = $in->next_result ) {
    print OUTFILE $result->query_description, "\n";
      while( my $hit = $result->next_hit ) {
           while( my $hsp = $hit->next_hsp ) {

                my $acc=$hit->name;
                my $description= $hit->description;

                $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/;

                print OUTFILE

                  $hit->raw_score, "\t", # Score
                  $hit->description, "\t", # Description

                $1, "\t", $2, "\n";
         }
      }
}

From cjfields at uiuc.edu  Wed Oct 25 11:04:14 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Oct 2006 10:04:14 -0500
Subject: [Bioperl-l] blastxml format
In-Reply-To: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>
Message-ID: <000301c6f846$d6227760$15327e82@pyrimidine>

Iterations (which are related to PSIBLAST) aren't currently handled in
blastxml, which is why the tag isn't being parsed.  I'll give it a look but
I don't think it will be properly fixed anytime soon, since we're gearing up
for a developer release and are sorting out various bugs in relation to
that.

In the meantime, you could always try changing the relevant tag in the
%MAPPING hash in your local copy of Bio::SearchIO::blastxml from
'BlastOutput_query-def' to 'Iteration_query-def', which may do the trick for
you.  I'm a bit reluctant to change this in CVS as it would be better to add
this in when iterations are handled properly by blastxml, and I'm not sure
all BLAST XML varieties have the <Iteration_query-def> tag.
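
If editing the module doesn't help, another stopgap would be to pull the <Iteration_query-def> values straight out of the XML, in order, and pair them up with the results yourself. A rough sketch in core Perl follows. The regex is no substitute for a real XML parser, the helper name is made up, and the sample document is invented for illustration (only the probe names come from your mail).

```perl
use strict;
use warnings;

# Hypothetical helper: return the text of every <Iteration_query-def>
# element in document order. Call it in list context.
sub iteration_query_defs {
    my ($xml) = @_;
    return $xml =~ m{<Iteration_query-def>(.*?)</Iteration_query-def>}gs;
}

# Invented sample; a real report has many more elements per iteration.
my $xml = <<'XML';
<BlastOutput>
  <BlastOutput_query-def>MRDNA_probe</BlastOutput_query-def>
  <Iteration><Iteration_query-def>MRDNA_probe</Iteration_query-def></Iteration>
  <Iteration><Iteration_query-def>VDRacterm_probe</Iteration_query-def></Iteration>
  <Iteration><Iteration_query-def>ARalpcterm_probe</Iteration_query-def></Iteration>
</BlastOutput>
XML

print "$_\n" for iteration_query_defs($xml);
```

This only helps if the report really does contain one <Iteration_query-def> per query, which I have not checked for every BLAST XML variety.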

If you want you can add this to the bioperl bugzilla as an enhancement
request to remind us:

http://bugzilla.open-bio.org/

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 



From massimo.ubaldi at gmail.com  Wed Oct 25 11:20:49 2006
From: massimo.ubaldi at gmail.com (Massimo Ubaldi)
Date: Wed, 25 Oct 2006 17:20:49 +0200
Subject: [Bioperl-l] blastxml format
In-Reply-To: <000301c6f846$d6227760$15327e82@pyrimidine>
References: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>
	<000301c6f846$d6227760$15327e82@pyrimidine>
Message-ID: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com>

Thanks for the reply. I've already tried this but I got exactly the same
results as before.
What else can I try?
Massimo


From cjfields at uiuc.edu  Wed Oct 25 12:56:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Oct 2006 11:56:46 -0500
Subject: [Bioperl-l] blastxml format
In-Reply-To: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com>
Message-ID: <000001c6f856$8ee44bc0$15327e82@pyrimidine>


> Thanks for the reply. I've already tried this but I got exactly the same
> results as before.
> What else can I try?
> Massimo

If you don't mind me asking, what version of perl and Bioperl are you using,
and what version of BLAST is used?  

I want to point out that there are a number of problems with your script,
now that I have had a chance to look at it.

1) You have the SearchIO format set to 'blast'.  It should be 'blastxml' if
you are parsing XML format.  

2) Every call to next_result() moves on to the next BLAST report, so the
report fetched before the loop is never seen inside it. In effect, you're
doing something like this:

  my $result = $in->next_result();
  # ... do something here (in first BLAST report)

  while ($result = $in->next_result()) { # change to second BLAST report
      # more stuff here (in second BLAST report, if there is one)
  }

I don't know if it's intentional, but it's something to point out.

3) You also use raw_score(), which doesn't return a value for me (this may
be related to the bioperl version, which is why I asked above).  If you use
$hit->bits() or $hit->significance() you can get the bit score or the hit
E-value, respectively.

4) Also, I didn't see a difference with the two XML tags
<BlastOutput_query-def> and <Iteration_query-def> using BLAST 2.2.15 output
(WebBLAST at NCBI), which makes sense since they should originate from the
same query sequence anyway.  This could be related to the BLAST version.

Here's my version of your script, using WinXP and bioperl-live (CVS):

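use strict;
use Bio::SearchIO;
my $file = shift @ARGV;

my $in = Bio::SearchIO->new(-format => 'blastxml',
                            -file   => $file);

# 'or' rather than '||': '||' would bind to the filename string, so the
# die could never fire
open OUTFILE, ">parsed_blastn_danio.txt"
    or die "Could not open file: $!";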

while(my $result = $in->next_result ) {
    print OUTFILE $result->algorithm, "\n";
    print OUTFILE $result->database_name, "\n";
    print OUTFILE "Score", "\t",
                  "Description", "\t",
                  "NCBI gi identifiers", "\t",
                  "GenBank Accession", "\n";
    print OUTFILE $result->query_description, "\n";
    while( my $hit = $result->next_hit ) {
        while( my $hsp = $hit->next_hsp ) {
            my $acc=$hit->name;
            my $description= $hit->description;
            if ($acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/) {
                print OUTFILE $hit->bits, "\t", # Score
                  $hit->description, "\t", # Description
                  $1, "\t", $2, "\n";
            }
        }
    }
}
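
The if around the regex matters: when a hit name doesn't match, $1 and $2 keep whatever the last successful match left in them, so a stale gi and accession would be printed. A small standalone sketch of the same pattern follows. The helper name and the 'ref' and '.1' parts of the sample name are assumptions; the gi number and accession are taken from your output.

```perl
use strict;
use warnings;

# Hypothetical helper: split an NCBI-style hit name into (gi, accession).
# Returns an empty list when the name does not match the pattern.
sub gi_and_accession {
    my ($name) = @_;
    return $name =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/ ? ($1, $2) : ();
}

for my $name ('gi|68354945|ref|XM_685568.1|', 'not-an-ncbi-name') {
    my ($gi, $acc) = gi_and_accession($name);
    print defined $gi ? "$gi\t$acc\n" : "no match for $name\n";
}
```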

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

...


From n.haigh at sheffield.ac.uk  Thu Oct 26 04:47:27 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 09:47:27 +0100
Subject: [Bioperl-l] More extensive Bioperl-run 1.5.2RC2 tests
Message-ID: <4540761F.6010904@sheffield.ac.uk>

Oops, I posted this to the Biojava list the other day by mistake!

I have recently installed some more software for which there are
bioperl-run tests and run the test suite with several versions of the
software I could find. I've added info to
http://www.bioperl.org/wiki/Release_1.5.2#bioperl-run. If there were any
fails in any of the versions I tested I've noted them together with
versions that were ok (if any).

There may be another 6 or so programs I'm trying to get hold of to run
further tests - I'll update when I get them.
Nath


From n.haigh at sheffield.ac.uk  Thu Oct 26 05:14:07 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 10:14:07 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
Message-ID: <45407C5F.40104@sheffield.ac.uk>

I'm thinking that it's not wise to test for things like
overall_percentage_identity etc in alignments that are generated by
external software like T-Coffee, Clustalw etc. Changes to software
algorithms/efficiency, bug fixes etc may well alter the quality of the
alignment produced in different versions and thus affect the value
returned by such methods. Therefore, I think these methods should only
be tested from alignments loaded directly from t/data.

Nath

From bix at sendu.me.uk  Thu Oct 26 05:48:37 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 26 Oct 2006 10:48:37 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45407C5F.40104@sheffield.ac.uk>
References: <45407C5F.40104@sheffield.ac.uk>
Message-ID: <45408475.30903@sendu.me.uk>

Nathan Haigh wrote:
> I'm thinking that it's not wise to test for things like
> overall_percentage_identity etc in alignments that are generated by
> external software like T-Coffee, Clustalw etc. Changes to software
> algorithms/efficiency, bug fixes etc may well alter the quality of the
> alignment produced in different versions and thus affect the value
> returned by such methods. Therefore, I think these methods should only
> be tested from alignments loaded directly from t/data.

Did you discover some specific problem cases?

From n.haigh at sheffield.ac.uk  Thu Oct 26 06:04:54 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 11:04:54 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408475.30903@sendu.me.uk>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
Message-ID: <45408846.1050001@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> I'm thinking that it's not wise to test for things like
>> overall_percentage_identity etc in alignments that are generated by
>> external software like T-Coffee, Clustalw etc. Changes to software
>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>> alignment produced in different versions and thus affect the value
>> returned by such methods. Therefore, I think these methods should only
>> be tested from alignments loaded directly from t/data.
>
> Did you discover some specific problem cases?
My messages seem to be taking a while to come through, but yes. It may
be due to the software changing default parameters, but it makes testing
the output for specific details pretty difficult and inconsistent. For
example, running T-Coffee, the following command from t/TCoffee.t
results in a slightly different alignment:
$aln = $factory->run('-type'    => 'profile',
                     '-profile' => $aln1,
                     '-seq'     => Bio::Root::IO->catfile("t","data","cysprot1b.fa"));

Of particular note are the gaps on the last line of the sequences. In
4.45, there are two gaps in CATH_RAT/1-333 ('gk-nm---cg') whereas in
<v4.45 this is ('gkn----mcg').
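
One way to make such a test robust to gap placement would be to compare the ungapped sequences rather than the aligned strings. A minimal sketch in core Perl, using the two fragments above (the helper name ungapped is made up, and treating '.' as a gap character is an assumption):

```perl
use strict;
use warnings;

# Hypothetical helper: strip gap characters ('-' and '.') from an
# aligned sequence string.
sub ungapped {
    my ($aligned) = @_;
    (my $seq = $aligned) =~ tr/-.//d;
    return $seq;
}

my $new = 'gk-nm---cg';    # fragment from T-Coffee 4.45
my $old = 'gkn----mcg';    # fragment from earlier versions

print "aligned strings equal:  ", ($new eq $old ? "yes" : "no"), "\n";
print "ungapped strings equal: ", (ungapped($new) eq ungapped($old) ? "yes" : "no"), "\n";
```

A check like this confirms that the wrapper ran and returned the right sequences, without pinning the test to one version's gap placement.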

T-Coffee v4.45 returns the following alignment:

>CATH_RAT/1-333
------mwtalpllcagawllsagat----------aeltvnaiek------------fh
ftswmkqhqktyss-reyshrlqvfannwrkiqahn----qrnhtfkmglnqfsdmsfae
ikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqgacgscwtfs
ttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqafeyilynk
gimgedsypyigkngqckfnpekavafvknvv-nitlndeaamveavalynpvsfafevt
-edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivknswgsnwgnn
gyfliergk-nm---cglaacasypipqv
>CATL_HUMAN/1-333
--------------------------------mnptlilaafclgiasatltfdhsleaq
wtkwkamhnrlygmnee-gwrravweknmkmielhnqeyregkhsftmamnafgdmtsee
frqvmngfqnrkpr----kgkvfqeplfyeaprsvdwrekg-yvtpvknqgqcgscwafs
atgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdyafqyvqdng
gldseesypyeateesckynpkysvandtgfv-dip-kqekalmkavatvgpisvaidag
hesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvknswgeewgmg
gyvkmakdrrnh---cgiasaasyptv--
>CATL_RAT/1-334
--------------------------------mtpllllavlclgtalatpkfdqtfnaq
whqwksthrrlygtnee-ewrravweknmrmiqlhngeysngkhgftmemnafgdmtnee
frqivngyrhqkhk----kgrlfqeplmlqipktvdwrekg-cvtpvknqgqcgscwafs
asgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfafqyikeng
gldseesypyeakdgsckyraeyavandtgfv-dip-qqekalmkavatvgpisvamdas
hpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvknswgkewgmd
gyikiakdrnnh---cglataasypivn-
>PAPA_CARPA/1-345
mamipsiskllfvaiclfvymglsfg-------------dfsivgysqndltsterliql
feswmlkhnkiyknidekiyrfeifkdnlkyidetn----kknnsywlglnvfadmsnde
fkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgscgscwafs
avvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsalqlvaqy-
gihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysian-qpvsvvleaa
gkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yiliknswgtgwgen
gyirikrgtgnsygvcglytssfypvkn-
>ALEU_HORVU/1-362
maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtrhalr
farfavrygksyesaaevrrrfrifsesleevrstn----rkglpyrlginrfsdmswee
fqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqahcgscwtfs
ttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqafeyikyng
gidteesypykgvngvchykaenaavqvldsv-nitlnaedelknavglvrpvsvafqvi
-dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywliknswgadwgdn
gyfkmemgk-nm---caiatcasypvvaa
>CATH_HUMAN/1-335
------mwatlpllcagawllg--------vpvcgaaelsvnslek------------fh
fkswmskhrktys-teeyhhrlqtfasnwrkinahn----ngnhtfkmalnqfsdmsfae
ikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqgacgscwtfs
ttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqafeyilynk
gimgedtypyqgkdgyckfqpgkaigfvkdva-nitiydeeamveavalynpvsfafevt
-qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivknswgpqwgmn
gyfliergk-nm---cglaacasypiplv
>CYS1_DICDI/1-343
-----mkvillfvlavftvfvs---------------srgippeeq------------sq
flefqdkfnkkys-heeylerfeifksnlgkieelnliainhkadtkfgvnkfadlssde
fknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgqcgscwsfs
ttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpnaynyiikng
giqtessypytaetgtqcnfnsanigakisnf-tmipknetvmagyivstgplaiaadav
-e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivknswgadwgeq
gyiylrrgk-nt---cgvsnfvstsii--

While T-Coffee <4.45 returned:
>CATH_RAT/1-333
----------mwtalpllcagawllsagat----------aeltvnaiek----------
--fhftswmkqhqktyss-reyshrlqvfannwrkiqahn----q----rnhtfkmglnq
fsdmsfaeikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqga
cgscwtfsttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqa
feyilynkgimgedsypyigkngqckfnpekavafvknvvn-itlndeaamveavalynp
vsfafevt-edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivkns
wgsnwgnngyfliergkn----mcglaacasypipqv
>PAPA_CARPA/1-345
mamipsiskllfvaiclfvymglsfgdfsivgysqndltsterliqlfeswml-------
-------------khnkiyknidekiyrf-----eifkdnlkyidetnkknnsywlglnv
fadmsndefkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgs
cgscwafsavvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsa
lq-lvaqygihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysia-nqp
vsvvleaagkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yilikns
wgtgwgengyirikrgtgnsygvcglytssfypvkn-
>CATL_HUMAN/1-333
-----------------------------------------mnptlilaafclgiasatl
tfdhsleaqwtkwkamhnrlygmneegwrravweknmkmielhnqeyregkhsftmamna
fgdmtseefrqvmngfqnrkprkgkvfqeplf----yeaprsvdwrekg-yvtpvknqgq
cgscwafsatgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdya
fqyvqdnggldseesypyeateesckynpkysvandtgfvd--ipkqekalmkavatvgp
isvaidaghesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvkns
wgeewgmggyvkmakdrrnh---cgiasaasyptv--
>CATL_RAT/1-334
-----------------------------------------mtpllllavlclgtalatp
kfdqtfnaqwhqwksthrrlygtneeewrravweknmrmiqlhngeysngkhgftmemna
fgdmtneefrqivngyrhqkhkkgrlfqeplm----lqipktvdwrekg-cvtpvknqgq
cgscwafsasgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfa
fqyikenggldseesypyeakdgsckyraeyavandtgfvd--ipqqekalmkavatvgp
isvamdashpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvkns
wgkewgmdgyikiakdrnnh---cglataasypivn-
>ALEU_HORVU/1-362
----maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtr
halrfarfavrygksyesaaevrrrfrifsesleevrstn----r----kglpyrlginr
fsdmsweefqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqah
cgscwtfsttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqa
feyikynggidteesypykgvngvchykaenaavqvldsvn-itlnaedelknavglvrp
vsvafqvi-dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywlikns
wgadwgdngyfkmemgkn----mcaiatcasypvvaa
>CATH_HUMAN/1-335
----------mwatlpllcagawllg--------vpvcgaaelsvnslek----------
--fhfkswmskhrktys-teeyhhrlqtfasnwrkinahn----n----gnhtfkmalnq
fsdmsfaeikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqga
cgscwtfsttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqa
feyilynkgimgedtypyqgkdgyckfqpgkaigfvkdvan-itiydeeamveavalynp
vsfafevt-qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivkns
wgpqwgmngyfliergkn----mcglaacasypiplv
>CYS1_DICDI/1-343
---------mkvillfvlavftvfvs---------------srgippeeq----------
--sqflefqdkfnkkys-heeylerfeifksnlgkieelnliain----hkadtkfgvnk
fadlssdefknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgq
cgscwsfsttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpna
ynyiiknggiqtessypytaetgtqcnfnsanigakisnft-mipknetvmagyivstgp
laiaadav-e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivkns
wgadwgeqgyiylrrgkn----tcgvsnfvstsii--

From sanges at biogem.it  Thu Oct 26 06:26:36 2006
From: sanges at biogem.it (Remo Sanges)
Date: Thu, 26 Oct 2006 11:26:36 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408846.1050001@sheffield.ac.uk>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
	<45408846.1050001@sheffield.ac.uk>
Message-ID: <45408D5C.1000305@biogem.it>

Nathan Haigh wrote:
> Sendu Bala wrote:
>   
>> Nathan Haigh wrote:
>>     
>>> I'm thinking that it's not wise to test for things like
>>> overall_percentage_identity etc in alignments that are generated by
>>> external software like T-Coffee, Clustalw etc. Changes to software
>>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>>> alignment produced in different versions and thus affect the value
>>> returned by such methods. Therefore, I think these methods should only
>>> be tested from alignments loaded directly from t/data.
>>>       
>> Did you discover some specific problem cases?
>>     
> My messages seem to be taking a while to come through, but, yes. It may
> be due to the software changing default parameters, but it makes testing
> the output for specific details pretty difficult and inconsistent. For
> example, running T-Coffee, the following command from t/TCoffee.t
> results in slightly different alignment:
> $aln = $factory->run('-type' => 'profile',
>                      '-profile' => $aln1,
>                      '-seq'  =>
> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>
> Of particular note, is the gaps on the last line of the sequences. In
> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
> <v4.45 this is ('gkn----mcg').
>   
I'm not a T-Coffee user, but you usually come across
these problems when different scoring parameters are
used to align the sequences.

Could it be that they have simply changed the
default parameters for gap penalties and that kind of
stuff? Is it possible to set them?

If so, you could just run the test with the scores
defined in the param hash rather than relying on the defaults.

HTH

Remo

From n.haigh at sheffield.ac.uk  Thu Oct 26 06:33:55 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 11:33:55 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408D5C.1000305@biogem.it>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
	<45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it>
Message-ID: <45408F13.9020209@sheffield.ac.uk>

Remo Sanges wrote:
> Nathan Haigh wrote:
>> Sendu Bala wrote:
>>  
>>> Nathan Haigh wrote:
>>>    
>>>> I'm thinking that it's not wise to test for things like
>>>> overall_percentage_identity etc in alignments that are generated by
>>>> external software like T-Coffee, Clustalw etc. Changes to software
>>>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>>>> alignment produced in different versions and thus affect the value
>>>> returned by such methods. Therefore, I think these methods should only
>>>> be tested from alignments loaded directly from t/data.
>>>>       
>>> Did you discover some specific problem cases?
>>>     
>> My messages seem to be taking a while to come through, but, yes. It may
>> be due to the software changing default parameters, but it makes testing
>> the output for specific details pretty difficult and inconsistent. For
>> example, running T-Coffee, the following command from t/TCoffee.t
>> results in slightly different alignment:
>> $aln = $factory->run('-type' => 'profile',
>>                      '-profile' => $aln1,
>>                      '-seq'  =>
>> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>>
>> Of particular note, is the gaps on the last line of the sequences. In
>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
>> <v4.45 this is ('gkn----mcg').
>>   
> I'm not a T-coffee user but usually you can come across
> these problems when you use different scoring parameters
> when align sequences.
>
> Could it be possible that they have simply changed the
> default parameters for gap penalties and that kind of
> stuff? It is possible to set them?
>
> If so you can just run the test by defining
> the scores in the param hash without using the default.
>
> HTH
>
> Remo
That is true, but it depends on whether the wrapper is complete
enough to be able to set all the parameters provided by the software.

Nath

From n.haigh at sheffield.ac.uk  Thu Oct 26 12:13:03 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 17:13:03 +0100
Subject: [Bioperl-l] Bio::Restriction::Enzyme
Message-ID: <4540DE8F.7070501@sheffield.ac.uk>

I'm in the middle of writing some code that uses
Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
Bioperl from HEAD.

I seem to find that $enzyme->is_palindromic always seems to return true.
Can anyone verify this? If needs be, I can send some code.

Thanks
Nathan

From bosborne11 at verizon.net  Thu Oct 26 12:37:06 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 26 Oct 2006 12:37:06 -0400
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk>
Message-ID: <C1665C72.B068%bosborne11@verizon.net>

Nathan,

Perhaps because most restriction sites are palindromes. Anyway, I added
tests for palindromic() and is_palindromic() where the site is not a
palindrome, and these tests pass (t/RestrictionAnalysis.t).

Brian O.


On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:

> I'm in the middle of writing some code that uses
> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
> Bioperl from HEAD.
> 
> I seem to find that $enzyme->is_palindromic always seems to return true.
> Can anyone verify this? If needs be, I can send some code.
> 
> Thanks
> Nathan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From n.haigh at sheffield.ac.uk  Thu Oct 26 12:49:48 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 17:49:48 +0100
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <C1665C72.B068%bosborne11@verizon.net>
References: <C1665C72.B068%bosborne11@verizon.net>
Message-ID: <4540E72C.5020800@sheffield.ac.uk>

Brian Osborne wrote:
> Nathan,
>
> Perhaps because most restriction sites are palindromes. Anyway, I added
> tests for palindromic() and is_palindromic() where the site is not a
> palindrome, these tests pass (t/RestrictionAnalyis.t).
>
> Brian O.
>
>
> On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:
>
>   
>> I'm in the middle of writing some code that uses
>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>> Bioperl from HEAD.
>>
>> I seem to find that $enzyme->is_palindromic always seems to return true.
>> Can anyone verify this? If needs be, I can send some code.
>>
>> Thanks
>> Nathan
Ok, thanks - nice to know :-)

From cjfields at uiuc.edu  Thu Oct 26 12:58:34 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 26 Oct 2006 11:58:34 -0500
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk>
Message-ID: <001301c6f91f$f9611770$15327e82@pyrimidine>

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Nathan Haigh
> Sent: Thursday, October 26, 2006 11:13 AM
> To: Bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Bio::Restriction::Enzyme
> 
> I'm in the middle of writing some code that uses
> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
> Bioperl from HEAD.
> 
> I seem to find that $enzyme->is_palindromic always seems to return true.
> Can anyone verify this? If needs be, I can send some code.
> 
> Thanks
> Nathan

You should file a bug report if you have found a test case where this method
isn't working as it should, especially if Brian's tests pass and you're
still getting the wrong results.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From jason at bioperl.org  Thu Oct 26 12:57:32 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 26 Oct 2006 09:57:32 -0700
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408F13.9020209@sheffield.ac.uk>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
	<45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it>
	<45408F13.9020209@sheffield.ac.uk>
Message-ID: <C2AC4DE8-7E99-4744-9FA9-B11C51788BDE@bioperl.org>

Nathan -

I agree - the values tend to change with different versions of the  
applications unfortunately.  It would make sense to just test that  
you get out sequences that are in valid alignment format and perhaps  
have as many ending sequences as you started with.   The more  
restrictive tests probably aren't reliable with mixing and matching  
versions.
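
A minimal sketch of that kind of version-independent check, using only
Test::More. The count_fasta_records() helper and the two FASTA strings are
hypothetical stand-ins; a real test would take them from the input file and
from whatever the wrapper returns.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical helper: count records in FASTA text by their header lines.
sub count_fasta_records {
    my ($text) = @_;
    my $count = () = $text =~ /^>/mg;
    return $count;
}

# Stand-ins for the input sequences and the aligner's output.
my $input   = ">seq1\nACGT\n>seq2\nACGA\n";
my $aligned = ">seq1\nACGT-\n>seq2\nACG-A\n";

# As many sequences come out as went in.
is(count_fasta_records($aligned), count_fasta_records($input),
   'alignment has as many sequences as the input');

# The output is well-formed: header line, then residues and gaps only.
like($aligned, qr/\A(?:>\S+\n[A-Za-z\-]+\n)+\z/,
     'output looks like aligned FASTA');
```

Neither assertion depends on where the aligner happens to place its gaps.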

One thing we do for PAML is condition tests on the version used - but  
of course when a new version comes out we have to add more stuff to  
the tests (or just have some code that skips those tests).
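
Conditioning on the version could look roughly like the following, using a
Test::More SKIP block. The $version variable is an assumption (hard-coded
here; a real test would ask the wrapper or the program itself), and the row
is the T-Coffee 4.45 example from earlier in this thread.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Assumption: the installed program version is known; hard-coded here.
my $version = '4.45';

# Row of the alignment as reported for T-Coffee 4.45 in this thread.
my $aligned_row = 'gk-nm---cg';

# Version-independent check: always run.
like($aligned_row, qr/^[a-z\-]+$/, 'row contains only residues and gaps');

SKIP: {
    # Exact expectations only hold for the version they were recorded with.
    skip("exact alignment only recorded for 4.45, found $version", 2)
        unless $version eq '4.45';

    is($aligned_row, 'gk-nm---cg', 'row matches the recorded output');
    is(($aligned_row =~ tr/-//), 4, 'expected number of gap characters');
}
```

With an unknown version the two strict tests are reported as skipped rather
than failed, so the suite stays green until someone records new values.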

-jason
On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote:

> Remo Sanges wrote:
>> Nathan Haigh wrote:
>>> Sendu Bala wrote:
>>>
>>>> Nathan Haigh wrote:
>>>>
>>>>> I'm thinking that it's not wise to test for things like
>>>>> overall_percentage_identity etc in alignments that are  
>>>>> generated by
>>>>> external software like T-Coffee, Clustalw etc. Changes to software
>>>>> algorithms/efficiency, bug fixes etc may well alter the quality  
>>>>> of the
>>>>> alignment produced in different versions and thus affect the value
>>>>> returned by such methods. Therefore, I think these methods  
>>>>> should only
>>>>> be tested from alignments loaded directly from t/data.
>>>>>
>>>> Did you discover some specific problem cases?
>>>>
>>> My messages seem to be taking a while to come through, but, yes.  
>>> It may
>>> be due to the software changing default parameters, but it makes  
>>> testing
>>> the output for specific details pretty difficult and  
>>> inconsistent. For
>>> example, running T-Coffee, the following command from t/TCoffee.t
>>> results in slightly different alignment:
>>> $aln = $factory->run('-type' => 'profile',
>>>                      '-profile' => $aln1,
>>>                      '-seq'  =>
>>> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>>>
>>> Of particular note, is the gaps on the last line of the  
>>> sequences. In
>>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
>>> <v4.45 this is ('gkn----mcg').
>>>
>> I'm not a T-coffee user but usually you can come across
>> these problems when you use different scoring parameters
>> when align sequences.
>>
>> Could it be possible that they have simply changed the
>> default parameters for gap penalties and that kind of
>> stuff? It is possible to set them?
>>
>> If so you can just run the test by defining
>> the scores in the param hash without using the default.
>>
>> HTH
>>
>> Remo
> That is true, but it depends on the whether the wrapper is complete
> enough to be able to set all the parameters provided by the software.
>
> Nath

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From cjfields at uiuc.edu  Thu Oct 26 18:01:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 26 Oct 2006 17:01:08 -0500
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <C2AC4DE8-7E99-4744-9FA9-B11C51788BDE@bioperl.org>
Message-ID: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>

I have been running into similar issues with EUtilities tests.  Since the
data on the server is constantly updated, I have to try to future-proof the
tests so they don't constantly fail.

I have been using Test::More and like/unlike or cmp_ok to get around some of
those 'fuzzy data' issues.  If some methods consistently return a particular
type of value, such as an integer, you could use:

like($foo->get_value, qr{^\d+$}, 'value test'); #integer

or similar.
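
A slightly fuller sketch along the same lines, with made-up stand-in values
in place of data fetched from the server (the variable names and values are
assumptions for illustration only):

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Stand-ins for values a remote query might return.
my $count   = 1532;
my $id      = 'ABC12345';
my $percent = 87.4;

# Check the type and plausible range instead of an exact value.
like($count, qr{^\d+$}, 'count is a non-negative integer');
cmp_ok($count, '>=', 1, 'at least one record returned');
cmp_ok($percent, '<=', 100, 'percentage is within range');
unlike($id, qr{\s}, 'identifier contains no whitespace');
```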

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> Nathan -
> 
> I agree - the values tend to change with different versions of the
> applications unfortunately.  It would make sense to just test that
> you get out sequences that are in valid alignment format and perhaps
> have as many ending sequences as you started with.   The more
> restrictive tests probably aren't reliable with mixing and matching
> versions.
> 
> One thing we do for PAML is condition tests on the version used - but
> of course when a new version comes out we have to add more stuff to
> the tests (or just have some code that skips those tests).
> 
> -jason


From gbazykin at Princeton.EDU  Thu Oct 26 18:49:56 2006
From: gbazykin at Princeton.EDU (Georgii A Bazykin)
Date: Thu, 26 Oct 2006 18:49:56 -0400
Subject: [Bioperl-l] about PAML running within bioperl
In-Reply-To: <001901c6dbcf$9af4de50$0915020a@zchou>
References: <001901c6dbcf$9af4de50$0915020a@zchou>
Message-ID: <185431468.20061026184956@princeton.edu>

I just had the exact same problem, which (as in Caleb Davis's
case) was also solved by switching from PAML 3.15 to 3.14.


------------------------------
Tuesday, September 19, 2006, 5:40:07 AM, you wrote:

> Hello, every one,

> I use the code in the PAML HOWTO (running PAML from within Bioperl) on
> my Linux OS, and I set ENV as described in the instructions. At the
> beginning, ClustalW seems to run smoothly. However, when the
> program gets to the call to the method "get_MLmatrix", something goes wrong. The
> following output was produced (what is the reason, and how can I solve this problem?):
> ........
> Sequences (2:3) Aligned. Score:  87
> Sequences (2:4) Aligned. Score:  88
> Sequences (2:5) Aligned. Score:  87
> Sequences (2:6) Aligned. Score:  87
> Sequences (2:7) Aligned. Score:  87
> Sequences (2:8) Aligned. Score:  87
> Sequences (3:4) Aligned. Score:  93
> Sequences (3:5) Aligned. Score:  93
> Sequences (3:6) Aligned. Score:  93
> Sequences (3:7) Aligned. Score:  92
> Sequences (3:8) Aligned. Score:  92
> Sequences (4:5) Aligned. Score:  99
> Sequences (4:6) Aligned. Score:  99
> Sequences (4:7) Aligned. Score:  98
> Sequences (4:8) Aligned. Score:  98
> Sequences (5:6) Aligned. Score:  100
> Sequences (5:7) Aligned. Score:  99
> Sequences (5:8) Aligned. Score:  99
> Sequences (6:7) Aligned. Score:  99
> Sequences (6:8) Aligned. Score:  99
> Sequences (7:8) Aligned. Score:  100
> Guide tree        file created:  
> [/home/zchou/TMPDIR/8QEqLivAKY/JU833u8OTP.dnd]
> Start of Multiple Alignment
> There are 7 groups
> Aligning...
> Group 1: Sequences:   2      Score:5875
> Group 2: Sequences:   2      Score:5877
> Group 3: Sequences:   4      Score:5864
> Group 4: Sequences:   5      Score:5537
> Group 5: Sequences:   6      Score:5727
> Group 6: Sequences:   7      Score:5608
> Group 7: Sequences:   8      Score:5607
> Alignment Score 43650
> GCG-Alignment file created     
> [/home/zchou/TMPDIR/8QEqLivAKY/CussPD56rZ]
> aligned aa sequences were: Bio::SimpleAlign=HASH(0x87b93f4)
> Can't call method "get_MLmatrix" on an undefined value at
> originalpaml.pl line 57, <GEN2> line 332.


> Zhuocheng Hou
> Department of Animal Genetics and Breeding
> China Agricultural University


From himanshu.ardawatia at bccs.uib.no  Thu Oct 26 21:54:36 2006
From: himanshu.ardawatia at bccs.uib.no (Himanshu Ardawatia)
Date: Fri, 27 Oct 2006 03:54:36 +0200
Subject: [Bioperl-l] Query on tree bootstrap values
Message-ID: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>

Hi,

2 questions:

1. I have a phylogenetic tree and I wish to set (or modify or query)
bootstrap values for all internal nodes. How do I do that using BioPerl?

2. I tried the general-purpose example script attached below on the
example Newick tree with bootstrap values (also attached below), and it
gives strange results even for branch length. It shows the parent ID as
0.71, which is actually the bootstrap value for the last ancestral node of
human and chimp, and it shows the child node ID as 'Human'! Am I missing
something in the tree formatting? Results are also attached below. Also,
how do I extract, modify, or add bootstrap values in this tree?

Thanks
Himanshu

EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
#################################
(
  ('Chimp'  : 0.052,
   'Human'  : 0.042) 0.71 : 0.007,
  'Gorilla'  : 0.060,
  ('Gibbon'  : 0.124,
   'Orangutan'  : 0.0971) 1 : 0.038
);
#################################

EXAMPLE SCRIPT:

#################################
#!/usr/bin/perl -w

use Bio::Seq;
# use Bio::TreeIO;
use Bio::Tree::TreeI;

# get a Tree::NodeI somehow
    # like from a TreeIO
    use Bio::TreeIO;
    # read in a clustalw NJ in phylip/newick format
    my $treeio = Bio::TreeIO->new(-format => 'newick',
                                  -file   => 'example_newick_tree.newick');

    # we'll assume it worked for demo purposes;
    # you might want to test that it was defined
    my $tree = $treeio->next_tree;

    my $rootnode = $tree->get_root_node;

    # process just the next generation
    foreach my $node ( $rootnode->each_Descendent() ) {
        print "branch len is ", $node->branch_length, "\n";
    }

    # process all the children
    my $example_leaf_node;
    foreach my $node ( $rootnode->get_Descendents() ) {
        if( $node->is_Leaf ) {
            print "node is a leaf ... ";
            # for example use below
            $example_leaf_node = $node unless defined $example_leaf_node;
        }
        print "branch len is ", $node->branch_length, "\n";
    }

    # The ancestor() method points to the parent of a node
    # A node can only have one parent

    my $parent = $example_leaf_node->ancestor;

    # parent won't likely have a description because it is an internal node
    # but child will because it is a leaf

    print "Parent id: ", $parent->id," child id: ",
          $example_leaf_node->id, "\n";

##########################################

RESULTS:
branch len is  0.007
branch len is  0.060
branch len is  0.038
node is a leaf ... branch len is  0.042
node is a leaf ... branch len is  0.052
branch len is  0.007
node is a leaf ... branch len is  0.060
node is a leaf ... branch len is  0.0971
node is a leaf ... branch len is  0.124
branch len is  0.038
Parent id: _0.71_ child id: ___'Human'__

From n.haigh at sheffield.ac.uk  Fri Oct 27 04:42:23 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 08:42:23 +0000
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <C1665C72.B068%bosborne11@verizon.net>
References: <C1665C72.B068%bosborne11@verizon.net>
Message-ID: <4541C66F.1020404@sheffield.ac.uk>

Hi Brian,

I wonder if I'm using is_prototype() correctly, as I don't seem to get
any enzymes returning true:

use Bio::Restriction::EnzymeCollection;

my $enz_coll = Bio::Restriction::EnzymeCollection->new();
my $prototype = 0;
foreach my $enz ($enz_coll->each_enzyme) {
    $prototype++ if $enz->is_prototype;
}
print "$prototype have unique recognition sites\n";

prints:
0 have unique recognition sites

Thanks
Nath

Brian Osborne wrote:
> Nathan,
>
> Perhaps because most restriction sites are palindromes. Anyway, I added
> tests for palindromic() and is_palindromic() where the site is not a
> palindrome, these tests pass (t/RestrictionAnalyis.t).
>
> Brian O.
>
>
> On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:
>
>   
>> I'm in the middle of writing some code that uses
>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>> Bioperl from HEAD.
>>
>> I seem to find that $enzyme->is_palindromic always seems to return true.
>> Can anyone verify this? If needs be, I can send some code.
>>
>> Thanks
>> Nathan


-- 
> A: Yes.
>> Q: Are you sure?
>>     
>>> A: Because it reverses the logical flow of conversation.
>>>       
>>>> Q: Why is top posting frowned upon?
>>>>         
Get Thunderbird <http://www.mozilla.org/products/thunderbird/>

From n.haigh at sheffield.ac.uk  Fri Oct 27 04:47:21 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 08:47:21 +0000
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <001301c6f91f$f9611770$15327e82@pyrimidine>
References: <001301c6f91f$f9611770$15327e82@pyrimidine>
Message-ID: <4541C799.4090507@sheffield.ac.uk>

Chris Fields wrote:
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of Nathan Haigh
>> Sent: Thursday, October 26, 2006 11:13 AM
>> To: Bioperl-l at lists.open-bio.org
>> Subject: [Bioperl-l] Bio::Restriction::Enzyme
>>
>> I'm in the middle of writing some code that uses
>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>> Bioperl from HEAD.
>>
>> I seem to find that $enzyme->is_palindromic always seems to return true.
>> Can anyone verify this? If needs be, I can send some code.
>>
>> Thanks
>> Nathan
>>     
>
> You should file a bug report if you have found a test case where this method
> isn't working as it should, especially if Brian's tests pass and you're
> still getting the wrong results.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>
>   

I was filtering the default set of enzymes and happened to
remove the 2 that are not palindromic before I called is_palindromic().
Thus, I didn't see any that were not palindromic - if that makes sense!
Since I know very little about restriction enzymes, I'll trust that
these are correct :-)  and that I'm getting the correct results.

Thanks
Nath
<http://www.mozilla.org/products/thunderbird/>

From n.haigh at sheffield.ac.uk  Fri Oct 27 05:04:40 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 09:04:40 +0000
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>
References: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>
Message-ID: <4541CBA8.10006@sheffield.ac.uk>

Chris Fields wrote:
> I have been running into similar issues with EUtilities tests.  Since the
> data on the server is constantly updated I have to try an future-proof the
> tests so they don't constantly fail.  
>
> I have been using Test::More and like/unlike or cmp_ok to get around some of
> those 'fuzzy data' issues.  If some methods consistently return a particular
> type of value, such as an integer, you could use:
>
> like($foo->get_value, qr{^\d+$}, 'value test'); #integer
>
> or similar.
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>   
>> Nathan -
>>
>> I agree - the values tend to change with different versions of the
>> applications unfortunately.  It would make sense to just test that
>> you get out sequences that are in valid alignment format and perhaps
>> have as many ending sequences as you started with.   The more
>> restrictive tests probably aren't reliable with mixing and matching
>> versions.
>>
>> One thing we do for PAML is condition tests on the version used - but
>> of course when a new version comes out we have to add more stuff to
>> the tests (or just have some code that skips those tests).
>>
>> -jason
>> On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote:
>>
>>     
I think it makes sense to test that data of the expected type was
returned by the external resource, but not to test the specifics of what
was returned. If specifics are tested, we are then in the realm of testing
whether we believe the data returned by the external resource or not. We
should assume that the domain experts for these resources know what they
are doing - in some cases this might not be true :-)  but I think we
should stick to testing that the objects created hold the expected type
of data.

I like what Chris had to say (above), but I wonder whether such checks
should also be made in the module itself - i.e. checking that a
stored value is an integer and warning/throwing if not?
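
As a rough illustration of what I mean, here is a sketch of a setter that
validates its argument. My::Result and its count() method are hypothetical
and not part of BioPerl; a real BioPerl module would presumably use
$self->throw or $self->warn rather than Carp.

```perl
use strict;
use warnings;
use Carp qw(croak);

package My::Result;   # hypothetical class, for illustration only

sub new { return bless {}, shift }

# Getter/setter that rejects anything other than a non-negative integer.
sub count {
    my ($self, $value) = @_;
    if (defined $value) {
        Carp::croak("count must be an integer, got '$value'")
            unless $value =~ /^\d+$/;
        $self->{count} = $value;
    }
    return $self->{count};
}

package main;

my $result = My::Result->new;
$result->count(42);                       # accepted and stored
eval { $result->count('forty-two') };     # rejected with an exception
print "rejected: $@" if $@;
```

The test suite then only needs to confirm that bad input is rejected, and
the type check itself lives in one place.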

Nath

From bix at sendu.me.uk  Fri Oct 27 05:08:18 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 10:08:18 +0100
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>
Message-ID: <4541CC82.2040705@sendu.me.uk>

Himanshu Ardawatia wrote:
> Hi,
> 
> 2 questions :
> 
> 1. I have a phylogenetic tree and I wish to set (or modify or query)
> bootstrap values for all internal nodes. How do I do that using BioPerl ?

Does bootstrap() not do what you need?


> 2. I tried the example script attached below for general purpose for the
> example newick tree with bootstrap values (also attached below) and It gives
> strange results even for branch length. It shows Parent ID as 0.71 which
> actually is the bootstrap value for the last ancestral node for human and
> chimp and It shows the Child node ID as 'Human' ! Am I missing something in
> the tree formatting ? Results also attached below. Also how to extract /
> modify/ add bootstrap values in this tree ?
[snip]
> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
> #################################
> (
>   ('Chimp'  : 0.052,
>    'Human'  : 0.042) 0.71 : 0.007,
>   'Gorilla'  : 0.060,
>   ('Gibbon'  : 0.124,
>    'Orangutan'  : 0.0971) 1 : 0.038
> );
> #################################

Are you sure this is in the correct format?

For example, with the tree:
( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, 
'Gorilla':0.060, 
('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038);

and your script (with a print "--\n" between the two printing loops for 
clarity) I get...

> ##########################################
> 
> RESULTS:
> branch len is  0.007
> branch len is  0.060
> branch len is  0.038
> node is a leaf ... branch len is  0.042
> node is a leaf ... branch len is  0.052
> branch len is  0.007
> node is a leaf ... branch len is  0.060
> node is a leaf ... branch len is  0.0971
> node is a leaf ... branch len is  0.124
> branch len is  0.038
> Parent id: _0.71_ child id: ___'Human'__

...

branch len is 0.007
branch len is 0.060
branch len is 0.038
--
branch len is 0.007
node is a leaf ... branch len is 0.052
node is a leaf ... branch len is 0.042
node is a leaf ... branch len is 0.060
branch len is 0.038
node is a leaf ... branch len is 0.124
node is a leaf ... branch len is 0.0971
Parent id: 'Human_Chimp_Ancestor' child id: 'Chimp'

This seems reasonable to me. What were you expecting?

From n.haigh at sheffield.ac.uk  Fri Oct 27 07:36:10 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 11:36:10 +0000
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541CC82.2040705@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>
	<4541CC82.2040705@sendu.me.uk>
Message-ID: <4541EF2A.4050600@sheffield.ac.uk>

Sendu Bala wrote:
> Himanshu Ardawatia wrote:
>   
>> Hi,
>>
>> 2 questions :
>>
>> 1. I have a phylogenetic tree and I wish to set (or modify or query)
>> bootstrap values for all internal nodes. How do I do that using BioPerl ?
>>     
>
> Does bootstrap() not do what you need?
>
>
>   
>> 2. I tried the example script attached below for general purpose for the
>> example newick tree with bootstrap values (also attached below) and It gives
>> strange results even for branch length. It shows Parent ID as 0.71 which
>> actually is the bootstrap value for the last ancestral node for human and
>> chimp and It shows the Child node ID as 'Human' ! Am I missing something in
>> the tree formatting ? Results also attached below. Also how to extract /
>> modify/ add bootstrap values in this tree ?
>>     
> [snip]
>   
>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>> #################################
>> (
>>   ('Chimp'  : 0.052,
>>    'Human'  : 0.042) 0.71 : 0.007,
>>   'Gorilla'  : 0.060,
>>   ('Gibbon'  : 0.124,
>>    'Orangutan'  : 0.0971) 1 : 0.038
>> );
>> #################################
>>     
>
> Are you sure this is in the correct format?
>   

He/she may have a tree that already contains bootstrap values output
from another program. If this is so, which program did you use? I haven't
reminded myself of the formats, but you should look up the newick format and
whether it is possible to store bootstraps in it. In addition, you should
also look up the nhx format.

> For example, with the tree:
> ( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, 
> 'Gorilla':0.060, 
> ('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038);
>
>   

This tree does not contain any bootstrap values - only branch lengths.

Sorry I can't be much more help at the moment - if I get a spare 10 mins
I'll have a closer look.
Nath

From bix at sendu.me.uk  Fri Oct 27 07:16:08 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 12:16:08 +0100
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EF2A.4050600@sheffield.ac.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>	<4541CC82.2040705@sendu.me.uk>
	<4541EF2A.4050600@sheffield.ac.uk>
Message-ID: <4541EA78.3050404@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Himanshu Ardawatia wrote:
>>>
>>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>>> #################################
>>> (
>>>   ('Chimp'  : 0.052,
>>>    'Human'  : 0.042) 0.71 : 0.007,
>>>   'Gorilla'  : 0.060,
>>>   ('Gibbon'  : 0.124,
>>>    'Orangutan'  : 0.0971) 1 : 0.038
>>> );
>>> #################################
>>>     
>> Are you sure this is in the correct format?
>>   
> 
> He/she may have a tree that already contains bootstrap values output
> from another program. If this is so, which program did you use? Without
> reminding myself of the formats, you should lookup newick format and
> whther it is possible to store bootstraps in it. In addition you should
> also look up the nhx format.

Ah, well, from a brief google it seems that some software does store
bootstrap values for internal nodes as the node ids when outputting in
Newick format. I don't think Bioperl can be expected to tell the
difference between a normal id and a bootstrap value, so you'll have to
detect that yourself and manually use bootstrap() when you get an id
that looks like a number.
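
A minimal sketch of that manual step. ids_to_bootstrap() is a hypothetical
helper, not a Bioperl method; it only assumes node objects that provide
is_Leaf(), id() and bootstrap(), as Bio::Tree::Node does. Trimming
underscores is an assumption based on the "_0.71_" id in the output posted
earlier.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Copy numeric internal-node ids into bootstrap().
# Returns the number of nodes that were updated.
sub ids_to_bootstrap {
    my (@nodes) = @_;
    my $changed = 0;
    for my $node (@nodes) {
        next if $node->is_Leaf;          # leaves keep their names
        my $id = $node->id;
        next unless defined $id;
        # Ids may arrive padded, e.g. "_0.71_", so trim before testing.
        (my $clean = $id) =~ s/^[\s_]+|[\s_]+$//g;
        next unless looks_like_number($clean);
        $node->bootstrap($clean);
        $changed++;
    }
    return $changed;
}
```

With a tree read through Bio::TreeIO this would be called as
ids_to_bootstrap($tree->get_nodes). The ids themselves are left untouched,
and a genuinely numeric node name would be misread as a bootstrap value.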

Or should Bioperl be making this assumption for you? Is that a safe 
thing to do? Maybe as an option only?

From n.haigh at sheffield.ac.uk  Fri Oct 27 08:24:49 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 12:24:49 +0000
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EA78.3050404@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>	<4541CC82.2040705@sendu.me.uk>
	<4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk>
Message-ID: <4541FA91.3040505@sheffield.ac.uk>

--snip--
>
> Ah, well from a brief google it seemed like some software do store
> boostrap values for internal nodes as the node ids when outputting in
> Newick format. I don't think Bioperl should be able to tell the
> difference between a normal id and a bootstrap value, so you'll have
> to detect that yourself and manually use bootstrap() when you get an
> id that looks like a number.

If I remember rightly, in programs like Clustal you can specify where
bootstrap values are stored - node or branch. I can't remember which is
the default way, but TreeView can only see bootstraps if they are stored
using the "non-default" setting. This "could" be the same issue here.

>
> Or should Bioperl be making this assumption for you? Is that a safe
> thing to do? Maybe as an option only?
I don't know without a closer look - I'd also need to look at the newick
format definition to see whether this is an "extension" to the format or
if something is just flouting the newick rules.

Nath


From n.haigh at sheffield.ac.uk  Fri Oct 27 08:59:51 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 12:59:51 +0000
Subject: [Bioperl-l] Caching sequences
Message-ID: <454202C7.1040701@sheffield.ac.uk>

I have a script that is capable of downloading sequences from GenBank
based on GI numbers. I retrieve them in FASTA format in order to save
bandwidth, but I'd like to take this one step further and cache the
sequences in case the user wants to rerun the script using some of the
GIs they used previously.

Does anyone have any guidance on how best to do this?

Cheers
Nath

From bix at sendu.me.uk  Fri Oct 27 08:35:13 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 13:35:13 +0100
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <454202C7.1040701@sheffield.ac.uk>
References: <454202C7.1040701@sheffield.ac.uk>
Message-ID: <4541FD01.6090803@sendu.me.uk>

Nathan S. Haigh wrote:
> I have a script that is capable of downloading sequences from GenBank
> based on GI numbers. I retrieve them if fasta format in order to save
> bandwidth, but I'd like to take this one step further and cache the
> sequences in case the user want to rerun the script using some of the
> GI's they used previously.
> 
> Does anyone have any guidance on how best to do this?

You'd probably write the sequences out in some suitable format and 
access them via Bio::Index

Or, I'm sure bioperl-db excels at this kind of thing, but is a little 
more involved if this is only a simple situation.

From bosborne11 at verizon.net  Fri Oct 27 09:09:30 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 27 Oct 2006 09:09:30 -0400
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <4541C66F.1020404@sheffield.ac.uk>
Message-ID: <C1677D4A.B0AF%bosborne11@verizon.net>

Nathan,

I don't know how this is supposed to work, there would be different ways to
make is_prototype true. One way would be to make the enzyme with the first
occurrence of a given restriction site the prototype (and the next enzymes
with the same site are isoschizomers). Or, one could wait until one site had
appeared twice, with 2 different enzymes, then make the first the prototype,
etc. I would have done it the first way myself but I took a quick look at
IO/withrefm.pm and it looks like it's doing it the second way. That means
one can read an enzyme file and end up with no duplicated restriction sites,
or prototypes and isoschizomers.
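
To make the two approaches concrete, here is a small sketch on a toy
list. The enzyme names are made up, and neither sub is the code in
IO/withrefm.pm; it only illustrates why the second way reports no
prototypes when no site is duplicated:

```perl
use strict;
use warnings;

# Toy input: [enzyme name, recognition site] in file order (made-up names).
my @enzymes = (
    [ EnzA => 'GAATTC' ],
    [ EnzB => 'GGATCC' ],
    [ EnzC => 'GGATCC' ],
);

# First way: the first enzyme seen for a site is its prototype.
sub prototypes_first_seen {
    my @list = @_;
    my (%seen, @prototypes);
    for my $enz (@list) {
        my ($name, $site) = @$enz;
        push @prototypes, $name unless $seen{$site}++;
    }
    return @prototypes;
}

# Second way: a prototype is only recorded once a second enzyme
# with the same site turns up.
sub prototypes_on_second_hit {
    my @list = @_;
    my (%first, %count, @prototypes);
    for my $enz (@list) {
        my ($name, $site) = @$enz;
        $first{$site} = $name unless exists $first{$site};
        push @prototypes, $first{$site} if ++$count{$site} == 2;
    }
    return @prototypes;
}

print "first way:  @{[ prototypes_first_seen(@enzymes) ]}\n";
print "second way: @{[ prototypes_on_second_hit(@enzymes) ]}\n";
```

With the second way, EnzA never becomes a prototype because nothing
else shares its site.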

Brian O.


On 10/27/06 4:42 AM, "Nathan S. Haigh" <n.haigh at sheffield.ac.uk> wrote:

> Hi Brian,
> 
> I wonder if i'm using is_prototype() correctly as I don't seem to get
> any returning true:
> 
> my $enz_coll = Bio::Restriction::EnzymeCollection->new();
> my $prototype = 0;
> foreach my $enz ($enz_coll->each_enzyme) {
>     $prototype++ if $enz->is_prototype;
> }
> print "$prototype have unique recognition sites\n";
> 
> prints:
> 0 have unique recognition sites
> 
> Thanks
> Nath
> 
> Brian Osborne wrote:
>> Nathan,
>> 
>> Perhaps because most restriction sites are palindromes. Anyway, I added
>> tests for palindromic() and is_palindromic() where the site is not a
>> palindrome, these tests pass (t/RestrictionAnalyis.t).
>> 
>> Brian O.
>> 
>> 
>> On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:
>> 
>>   
>>> I'm in the middle of writing some code that uses
>>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>>> Bioperl from HEAD.
>>> 
>>> I seem to find that $enzyme->is_palindromic always seems to return true.
>>> Can anyone verify this? If needs be, I can send some code.
>>> 
>>> Thanks
>>> Nathan
>>>     
>> 
>> 
>>   
> 


From n.haigh at sheffield.ac.uk  Fri Oct 27 10:19:02 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 14:19:02 +0000
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <C1677D4A.B0AF%bosborne11@verizon.net>
References: <C1677D4A.B0AF%bosborne11@verizon.net>
Message-ID: <45421556.9060300@sheffield.ac.uk>

Brian Osborne wrote:
> Nathan,
>
> I don't know how this is supposed to work, there would be different ways to
> make is_prototype true. One way would be to make the enzyme with the first
> occurrence of a given restriction site the prototype (and the next enzymes
> with the same site are isoschizomers). Or, one could wait until one site had
> appeared twice, with 2 different enzymes, then make the first the prototype,
> etc. I would have done it the first way myself but I took a quick look at
> IO/withrefm.pm and it looks like it's doing it the second way. That means
> one can read an enzyme file and end up with no duplicated restriction sites,
> or prototypes and isoschizomers.
>
> Brian O.
>
>   
Hmm, I'd have done it the first way also. Doing it the second way would
mean you only ended up with something as a prototype if there were
multiple enzymes with the same restriction site - is that correct
biologically?

Nath

From n.haigh at sheffield.ac.uk  Fri Oct 27 10:23:20 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 14:23:20 +0000
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
Message-ID: <45421658.5000103@sheffield.ac.uk>

As you may be aware by now, I'm working with Bio::Restriction::Analysis
and friends.

I'm doing restriction analysis on large sequences - chromosomes. I need
to identify an appropriate enzyme based on the total length of fragments
that are of a certain size (e.g. 100 - 500 bp). However, the amount of
memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I
have the following code (bottom) which downloads 2 thaliana chromosomes
(mito and chloro - so pretty small), runs an analysis and then loops
through the fragments for all enzymes in the default collection.

My memory usage just keeps on climbing and none seems to get freed up
even when a $ra goes out of scope (when I start dealing with the next
sequence). Is this a memory leak of some sort, or is there a way to free
up memory as I go? I'd appreciate any help/advice on how to reduce the
amount of memory being consumed as I'd like to use all the thaliana
chromosomes (not just mito and chloro), which at the moment probably
won't work.

Cheers
Nath

use strict;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012,  26556996 );

my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
  print "Getting GI: $gi\n";
  push @seq_objs, $db->get_Seq_by_id($gi)
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
  print "Processing ", $seq->primary_id,"\n";
  my $ra = Bio::Restriction::Analysis->new(
                                         -seq=>$seq,
                                         -enzymes=>$enz_Coll,
  );
 
  my @all_enzymes = $ra->cutters->each_enzyme;
  print "  Calc total length of fragments in range: ",
        "$min_fragment_size - $max_fragment_size\n";
  foreach my $enzyme ( @all_enzymes ) {
    # reset the running total so each enzyme gets its own figure
    my $tot_size = 0;
    # fragments() is a real memory hog
    foreach my $frag ($ra->fragments($enzyme)) {
      my $frag_len = length $frag;
      next if $min_fragment_size && ($frag_len < $min_fragment_size);
      next if $max_fragment_size && ($frag_len > $max_fragment_size);
      $tot_size += $frag_len;
    }
    # do something based on value of $tot_size
    #print "    ", $enzyme->name, " total = $tot_size\n";
  }
  print "DONE\n";
}
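
One possible way around the memory use, sketched here in plain Perl, is
to work from cut positions instead of fragment strings, since only the
lengths are needed. This assumes a linear sequence and that you can get
the cut positions as a list of numbers from somewhere;
total_in_range() is a made-up helper, not a Bioperl method:

```perl
use strict;
use warnings;

# Sum the lengths of fragments whose size lies within [$min, $max] for a
# linear sequence, given the sequence length and the cut positions
# (position = last base before the cut, 1-based).
# No fragment strings are built, only integers are kept.
sub total_in_range {
    my ($seq_len, $min, $max, @cuts) = @_;
    my ($total, $prev) = (0, 0);
    for my $cut ((sort { $a <=> $b } @cuts), $seq_len) {
        my $len = $cut - $prev;
        $prev = $cut;
        next if $len < $min || $len > $max;
        $total += $len;
    }
    return $total;
}

# A 1000 bp sequence cut after bases 150, 400 and 950 gives
# fragments of 150, 250, 550 and 50 bp; only 150 and 250 are in range.
print total_in_range(1000, 100, 500, 150, 400, 950), "\n";   # prints 400
```

Circular molecules (such as the mito and chloro genomes) would need the
first and last fragments joined, which this sketch does not do.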


From avilella at gmail.com  Fri Oct 27 09:39:41 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:39:41 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
In-Reply-To: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
References: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
Message-ID: <358f4d650610270639q14870a6erae2e3c4e9063105d@mail.gmail.com>

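Replying to myself: I think I found a way:

my $tree = $treeio->next_tree;

# sum all defined branch lengths (the root usually has none)
my $total_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless (defined($branch_length));
    $total_branch_length += $branch_length;
}
die "tree has no branch lengths\n" unless $total_branch_length;

# rescale each branch so that the lengths sum to 1
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless (defined($branch_length));
    $node->branch_length($branch_length/$total_branch_length);
}

# sanity check: the new total should be (very close to) 1
my $new_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    $new_branch_length += $branch_length if defined($branch_length);
}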

On 10/27/06, Albert Vilella <avilella at gmail.com> wrote:
> Hi all,
>
> I am in need of a method that would scale the different branch lengths
> of a tree so that after the scaling they all sum up to exactly 1.
>
> Any pointers? Has anyone done that before?
>
> Thanks in advance,
>
>     Albert.
>

From cjfields at uiuc.edu  Fri Oct 27 10:35:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 09:35:35 -0500
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <4541CBA8.10006@sheffield.ac.uk>
Message-ID: <001501c6f9d5$2e33e120$15327e82@pyrimidine>

...
> I think it makes sense to test that data of the expected type was
> returned by the xternal resource but not to test the specifics of what
> was retured. If specifics are tested we are then in the realm of testing
> whether we believe the data returned by the external resource or not. We
> should assume that the domain experts for these resources know what they
> are doing - in some cases this might not be true :-)  but I think we
> should stick to testing that the objects created hold the expected type
> of data.
> 
> I like what Chris had to say (above) but wonder whether tests
> would/should be tested for in the module itself - i.e. testing that a
> stored value is an integer and warn/throw if not?
> 
> Nath

Yeah, sorry about the top post (stupid Outlook always sticks the sig at the
top of the page!).  

Testing in the module would be best but can be tricky for the very same
reasons that writing tests entail, even more so.  For instance, for NCBI
esummary data, I parse the data in a very generic way in order to have
access to as much data as possible.  

For tests, I have to assume that NCBI will always return a particular type
of value (string, integer, date).  I can test for each of those with a regex
in the module fairly simply and throw/warn, as you indicate.  However, if
they decide to add new data with a data tag other than the ones I test for
in the module (i.e. String, Integer, Date), I suddenly have warns/throws
showing up and cluttering/clobbering the code for perfectly valid data.  

However, if these are caught in tests and the tests fail, no big loss.  The
actual module still works, even if the tests are failing based on a new
unknown value being returned.  

For me, failed tests are sort of a warning light to let me know that
something has changed, but it doesn't necessarily mean a module doesn't
work.  I generally use throw/warn for something truly catastrophic, like no
response from the server or an error in the XML, which affects downstream
methods.  
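
A minimal sketch of what I mean by keeping the type checks in the test
script rather than in the module. The regexes, the record layout and
value_matches_type() are all assumptions made up for illustration, not
the real esummary parser or its data:

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical type patterns; the tag names mirror the ones mentioned
# above (String, Integer, Date) but the regexes are guesses.
my %type_check = (
    String  => qr/\S/,
    Integer => qr/^-?\d+$/,
    Date    => qr{^\d{4}/\d{2}/\d{2}},
);

# Returns 1 or 0 for a known type, undef for a type we have no rule for.
sub value_matches_type {
    my ($type, $value) = @_;
    my $re = $type_check{$type} or return undef;
    return $value =~ $re ? 1 : 0;
}

# Made-up records standing in for parsed esummary items.
my @items = (
    { name => 'Length', type => 'Integer', value => '1542' },
    { name => 'Title',  type => 'String',  value => 'example record' },
);

for my $item (@items) {
    ok( value_matches_type( @{$item}{qw(type value)} ),
        "$item->{name} looks like a $item->{type}" );
}
done_testing();
```

An unknown tag makes the check return undef, so the test fails and
flags the change, but nothing in the module itself throws.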

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Oct 27 11:09:36 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 10:09:36 -0500
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <454202C7.1040701@sheffield.ac.uk>
Message-ID: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>

> I have a script that is capable of downloading sequences from GenBank
> based on GI numbers. I retrieve them if fasta format in order to save
> bandwidth, but I'd like to take this one step further and cache the
> sequences in case the user want to rerun the script using some of the
> GI's they used previously.
> 
> Does anyone have any guidance on how best to do this?
> 
> Cheers
> Nath

There is Bio::DB::InMemoryCache, which is really an interface but appears to
have several methods defined; you could look for modules which implement it.
Sendu's suggestion of the Bio::Index modules and bioperl-db are also good
starting points.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Fri Oct 27 11:21:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 10:21:49 -0500
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <45421556.9060300@sheffield.ac.uk>
Message-ID: <001701c6f9db$9f90d160$15327e82@pyrimidine>

> Brian Osborne wrote:
> > Nathan,
> >
> > I don't know how this is supposed to work, there would be different ways to
> > make is_prototype true. One way would be to make the enzyme with the first
> > occurrence of a given restriction site the prototype (and the next enzymes
> > with the same site are isoschizomers). Or, one could wait until one site had
> > appeared twice, with 2 different enzymes, then make the first the prototype,
> > etc. I would have done it the first way myself but I took a quick look at
> > IO/withrefm.pm and it looks like it's doing it the second way. That means
> > one can read an enzyme file and end up with no duplicated restriction sites,
> > or prototypes and isoschizomers.
> >
> > Brian O.
> >
> >
> Hmm, I'd have done it the first way also. Doing it the second way would
> mean you only ended up with something as a prototype if there were
> multiple enzymes with the same restriction site - is that correct
> biologically?
> 
> Nath

I had a look at all the Restriction::IO modules a while back; most need
serious updating!  It just hasn't been a top priority unfortunately.

I think the prototype issue may depend on the IO format and whether or not
one is defined explicitly in the file being parsed or is just chosen based
on what Brian said (order in the file, similar cutting site).

By the strictest definition (and cheating by looking at the Fermentas web
site), the prototype is supposed to be the first enzyme discovered which
cleaves a unique sequence, so it may not be the first enzyme found in the
file.  Isoschizomers are those discovered to cleave the same sequence
subsequent to the prototype.  Neoschizomers cleave the same sequence as a
prototype but at a different site.
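
As a toy illustration of those definitions, assuming each enzyme is
described only by its recognition sequence and a cut offset within it.
The enzyme names and the data layout are invented for this sketch, and
classify_against_prototype() is not a Bioperl method:

```perl
use strict;
use warnings;

# Each enzyme is a hash with a recognition site and a cut offset
# (number of bases from the 5' end of the site to the cut).
sub classify_against_prototype {
    my ($prototype, $enzyme) = @_;
    return 'unrelated' if $enzyme->{site} ne $prototype->{site};
    return $enzyme->{cut} == $prototype->{cut} ? 'isoschizomer'
                                               : 'neoschizomer';
}

# Made-up enzymes: ProtoI is taken to be the first one discovered.
my $proto = { name => 'ProtoI', site => 'CCCGGG', cut => 3 };
my @later = (
    { name => 'IsoI',   site => 'CCCGGG', cut => 3 },
    { name => 'NeoI',   site => 'CCCGGG', cut => 1 },
    { name => 'OtherI', site => 'GAATTC', cut => 1 },
);

printf "%-7s %s\n", $_->{name}, classify_against_prototype($proto, $_)
    for @later;
```

Which enzyme counts as the prototype still has to come from the data
(discovery order), which is the point about not guessing it from file
order.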

So this calls into question whether the prototype should be defined at all
unless it is specifically indicated in the file.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Fri Oct 27 12:47:53 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 16:47:53 +0000
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com>
References: <454202C7.1040701@sheffield.ac.uk>	
	<001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>
	<8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com>
Message-ID: <45423839.9040503@sheffield.ac.uk>

Jason Stajich wrote:
> Bio::DB::FileCache does one better and lets you cache the data in a
> persistent file.  Not sure this index is shareable among users though
> - bioperl-db is a better soln when that is desired.
Thanks I'll have a look into it. No need for being sharable among users
- not unless the script becomes heavily used.

Thanks
Nath

From cjfields at uiuc.edu  Fri Oct 27 12:15:00 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 11:15:00 -0500
Subject: [Bioperl-l] StandAloneFasta.t bioperl-run tests
Message-ID: <000101c6f9e3$0e5e95d0$15327e82@pyrimidine>

Nathan,

The test failures you posted on the wiki seem to indicate that using the
wrapper works but the order of the returned hits is off.  Does the order of
the returned hits match the actual FASTA report order?  If it does then the
tests need to be fixed in a way to make it more flexible, to account for
some data 'fuzziness' due to variations in output based on different
versions.  
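
Something along these lines is what I have in mind for a more flexible
test, i.e. checking which hits come back without caring about their
order. The hit names are placeholders and same_members() is a made-up
helper, not what StandAloneFasta.t currently does:

```perl
use strict;
use warnings;
use Test::More;

# True if both array refs hold the same strings, ignoring order.
sub same_members {
    my ($got, $expected) = @_;
    return join("\0", sort @$got) eq join("\0", sort @$expected);
}

# Placeholder hit names standing in for what a FASTA report returns.
my @expected = qw(hitA hitB hitC);
my @got      = qw(hitB hitA hitC);

ok( same_members(\@got, \@expected), 'same hits, any order' );
done_testing();
```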

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From jason at bioperl.org  Fri Oct 27 12:50:54 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 27 Oct 2006 09:50:54 -0700
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EA78.3050404@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>	<4541CC82.2040705@sendu.me.uk>
	<4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk>
Message-ID: <1230E110-01AB-4D4E-842F-20B939555299@bioperl.org>

I've answered to this effect multiple times in the past on the
mailing list.  Newick format does not distinguish between internal
ids and bootstrap values (or whatever else you want to attach
there).  Different programs have different conventions.  When both
values are present and encoded so that we can parse out the
bootstrap, like this: [BOOTSTRAP], the parser grabs it out.  If you
know all the internal ids are bootstraps you can just copy the values
over manually very simply:

# get all the internal nodes
for my $node ( grep { ! $_->is_Leaf } $tree->get_nodes ) {
  # copy id to bootstrap
  $node->bootstrap($node->id) if defined $node->id && length($node->id);
  $node->id(''); # set internal id to empty
}

If someone can make this clearer on a wiki page that would be great.

On Oct 27, 2006, at 4:16 AM, Sendu Bala wrote:

> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Himanshu Ardawatia wrote:
>>>>
>>>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>>>> #################################
>>>> (
>>>>   ('Chimp'  : 0.052,
>>>>    'Human'  : 0.042) 0.71 : 0.007,
>>>>   'Gorilla'  : 0.060,
>>>>   ('Gibbon'  : 0.124,
>>>>    'Orangutan'  : 0.0971) 1 : 0.038
>>>> );
>>>> #################################
>>>>
>>> Are you sure this is in the correct format?
>>>
>>
>> He/she may have a tree that already contains bootstrap values output
>> from another program. If this is so, which program did you use? Without
>> reminding myself of the formats, you should lookup newick format and
>> whther it is possible to store bootstraps in it. In addition you should
>> also look up the nhx format.
>
> Ah, well from a brief google it seemed like some software do store
> boostrap values for internal nodes as the node ids when outputting in
> Newick format. I don't think Bioperl should be able to tell the
> difference between a normal id and a bootstrap value, so you'll have to
> detect that yourself and manually use bootstrap() when you get an id
> that looks like a number.
>
> Or should Bioperl be making this assumption for you? Is that a safe
> thing to do? Maybe as an option only?

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From avilella at gmail.com  Fri Oct 27 09:23:07 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:23:07 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
Message-ID: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>

Hi all,

I am in need of a method that would scale the different branch lengths
of a tree so that after the scaling they all sum up to exactly 1.

Any pointers? Has anyone done that before?

Thanks in advance,

    Albert.

From cjfields at uiuc.edu  Fri Oct 27 14:34:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 13:34:57 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
Message-ID: <000001c6f9f6$9ab12710$15327e82@pyrimidine>

I am working on refactoring the AlignIO::stockholm parser to get it reading
and writing Pfam/Rfam alignments, and noticed that many alignments have
EMBL-like annotations attached, which pertain to the entire alignment:

# STOCKHOLM 1.0
#=GF ID    ykkC-yxkD
#=GF AC    RF00442
#=GF DE    ykkC-yxkD element
#=GF AU    Moxon SJ
#=GF GA    20.0
#=GF NC    0.1
#=GF TC    59.4
#=GF SE    Barrick JE, Breaker RR
#=GF SS    Predicted; Barrick JE, Breaker RR
#=GF TP    Cis-reg; riboswitch;
#=GF BM    cmbuild CM SEED
#=GF BM    cmsearch -W 175 CM SEQDB
#=GF RN    [1]
#=GF RM    15096624
#=GF RT    New RNA motifs suggest an expanded scope for riboswitches in
#=GF RT    bacterial genetic control.
#=GF RA    Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee
#=GF RA    M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR;
#=GF RL    Proc Natl Acad Sci U S A 2004;101:6421-6426.
#=GF CC    This family represents the bacterial ykkC/yxkD element. The function of
#=GF CC    this family is unclear although it has been suggested that it may function
#=GF CC    to switch on efflux pumps and detoxification systems in response to harmful
#=GF CC    environmental molecules [1]. The Thermoanaerobacter tengcongensis sequence
#=GF CC    EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that the two
#=GF CC    riboswitches may work in conjunction to regulate the the upstream gene
#=GF CC    which codes for Swiss:Q8RC62, a member of Pfam:PF00860 (Personal obs. Moxon
#=GF CC    SJ).
#=GF SQ    16

SimpleAlign, as implemented, seemingly doesn't have a way to store this
information.

I'll work on getting the core alignment IO working, but would there be any
interest in having a way to store annotations in Bio::SimpleAlign?  I'm
guessing the methods would be similar to the various Bio::Seq Annotation
methods.
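
To give an idea of the data that would need a home, here is a rough
plain-Perl sketch that collects the #=GF lines into a hash of lists.
It is not the AlignIO::stockholm code, and parse_gf_lines() is a
made-up name; whatever the parser does, the resulting tag/value pairs
are what an annotation store on the alignment would have to hold:

```perl
use strict;
use warnings;

# Collect '#=GF <tag> <text>' lines into tag => [values...], keeping
# repeated tags (BM, RT, CC, ...) in file order.
sub parse_gf_lines {
    my @lines = @_;
    my %annotation;
    for my $line (@lines) {
        next unless $line =~ /^#=GF\s+(\S+)\s+(.*?)\s*$/;
        push @{ $annotation{$1} }, $2;
    }
    return \%annotation;
}

my $ann = parse_gf_lines(
    '# STOCKHOLM 1.0',
    '#=GF ID    ykkC-yxkD',
    '#=GF AC    RF00442',
    '#=GF BM    cmbuild CM SEED',
    '#=GF BM    cmsearch -W 175 CM SEQDB',
);

print "$_: ", join(' | ', @{ $ann->{$_} }), "\n" for sort keys %$ann;
```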

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From hlapp at gmx.net  Fri Oct 27 16:23:46 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 27 Oct 2006 16:23:46 -0400
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
Message-ID: <C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>

You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose  
this is what you meant by the 'various Bio::Seq Annotation methods'  
too.)

Just to make sure I'm not misunderstanding, I suppose the annotation  
pertains to the entire alignment?

	-hilmar

On Oct 27, 2006, at 2:34 PM, Chris Fields wrote:

> I am working an refactoring the AlignIO::stockholm parser to get it reading
> and writing Pfam/Rfam alignments, and noticed that many alignments have
> EMBL-like annotations attached, which pertain to the entire alignment:
>
> # STOCKHOLM 1.0
> #=GF ID    ykkC-yxkD
> #=GF AC    RF00442
> #=GF DE    ykkC-yxkD element
> #=GF AU    Moxon SJ
> #=GF GA    20.0
> #=GF NC    0.1
> #=GF TC    59.4
> #=GF SE    Barrick JE, Breaker RR
> #=GF SS    Predicted; Barrick JE, Breaker RR
> #=GF TP    Cis-reg; riboswitch;
> #=GF BM    cmbuild CM SEED
> #=GF BM    cmsearch -W 175 CM SEQDB
> #=GF RN    [1]
> #=GF RM    15096624
> #=GF RT    New RNA motifs suggest an expanded scope for riboswitches in
> #=GF RT    bacterial genetic control.
> #=GF RA    Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee
> #=GF RA    M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR;
> #=GF RL    Proc Natl Acad Sci U S A 2004;101:6421-6426.
> #=GF CC    This family represents the bacterial ykkC/yxkD element. The function of
> #=GF CC    this family is unclear although it has been suggested that it may function
> #=GF CC    to switch on efflux pumps and detoxification systems in response to harmful
> #=GF CC    environmental molecules [1]. The Thermoanaerobacter tengcongensis sequence
> #=GF CC    EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that the two
> #=GF CC    riboswitches may work in conjunction to regulate the the upstream gene
> #=GF CC    which codes for Swiss:Q8RC62, a member of Pfam:PF00860 (Personal obs. Moxon
> #=GF CC    SJ).
> #=GF SQ    16
>
> SimpleAlign, as implemented, seemingly doesn't have a way to store this
> information.
>
> I'll work on getting the core alignment IO working, but would there be any
> interest in having a way to store annotations in Bio::SimpleAlign?  I'm
> guessing the methods would be similar to the various Bio::Seq Annotation
> methods.
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Oct 27 16:38:17 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 15:38:17 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
Message-ID: <000001c6fa07$d8659990$15327e82@pyrimidine>

Hilmar Lapp wrote:
> You could make SimpleAlign be a Bio::AnnotationHolderI. (I
> suppose this is what you meant by the 'various Bio::Seq Annotation
> methods' too.)
> 
> Just to make sure I'm not misunderstanding, I suppose the
> annotation pertains to the entire alignment?
> 
> 	-hilmar
...

Yes, that's correct.  I would probably use Bio::Seq::Meta for the
sequence-specific markup lines.  I would have to add another new method to
deal with non-sequence-based consensus data (like sec. structure) for now.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Fri Oct 27 11:38:05 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 27 Oct 2006 08:38:05 -0700
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>
References: <454202C7.1040701@sheffield.ac.uk>
	<001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>
Message-ID: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com>

Bio::DB::FileCache does one better and lets you cache the data in a
persistent file.  Not sure this index is shareable among users though -
bioperl-db is a better soln when that is desired.
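
For the simple single-user case, the general idea of a persistent
fetch-through cache can be sketched in plain Perl as below. This is
not how Bio::DB::FileCache is implemented; cached_fasta() and the
fetcher callback are made-up names, and the fake fetcher stands in for
the real GenBank download:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Return the FASTA text for $gi, calling $fetch->($gi) only on a cache
# miss and storing the result as <cache_dir>/<gi>.fa.
sub cached_fasta {
    my ($cache_dir, $gi, $fetch) = @_;
    die "GI must be numeric: $gi\n" unless $gi =~ /^\d+$/;
    my $path = File::Spec->catfile($cache_dir, "$gi.fa");
    if (-s $path) {
        open my $in, '<', $path or die "Cannot read $path: $!";
        local $/;                      # slurp the whole file
        return scalar <$in>;
    }
    my $fasta = $fetch->($gi);
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} $fasta;
    close $out or die "Cannot close $path: $!";
    return $fasta;
}

# Demo with a fake fetcher that counts how often it is called.
my $calls      = 0;
my $fake_fetch = sub { $calls++; return ">gi|$_[0]\nACGT\n" };
my $dir        = tempdir( CLEANUP => 1 );

cached_fasta($dir, 7525012, $fake_fetch) for 1 .. 3;
print "fetcher called $calls time(s)\n";   # prints: fetcher called 1 time(s)
```

In a real script the cache directory would be a fixed location rather
than a temporary one, so that it survives between runs.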

-jason

On 10/27/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> > I have a script that is capable of downloading sequences from GenBank
> > based on GI numbers. I retrieve them if fasta format in order to save
> > bandwidth, but I'd like to take this one step further and cache the
> > sequences in case the user want to rerun the script using some of the
> > GI's they used previously.
> >
> > Does anyone have any guidance on how best to do this?
> >
> > Cheers
> > Nath
>
> There is Bio::DB::InMemoryCache, which is really an interface but appears to
> have several methods defined; you could look for modules which implement it.
> Sendu's suggestion of the Bio::Index modules and bioperl-db are also good
> starting points.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>


-- 
Jason Stajich
jason at bioperl.org
http://www.duke.edu/~jes12/

From cjfields at uiuc.edu  Fri Oct 27 21:57:58 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 20:57:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
Message-ID: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>


On Oct 27, 2006, at 3:23 PM, Hilmar Lapp wrote:

> You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose
> this is what you meant by the 'various Bio::Seq Annotation methods'
> too.)
>
> Just to make sure I'm not misunderstanding, I suppose the annotation
> pertains to the entire alignment?
>
> 	-hilmar

BTW, was that supposed to be Bio::AnnotatableI, or  
Bio::AnnotationHolderI?  The latter isn't present in CVS HEAD.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From eric.ross at neuro.utah.edu  Sat Oct 28 17:24:30 2006
From: eric.ross at neuro.utah.edu (Eric Ross)
Date: Sat, 28 Oct 2006 15:24:30 -0600
Subject: [Bioperl-l] PAML
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>

I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object.

I am able to extract other data from the report, but there seems to be a conflict in the documentation.  One doc implies that there should be a get_posteriors method. (It's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object. 


I have been trying various methods, in the event I'm just "confused", but I've had no luck, thus far.  Anyone have suggestions?


code:

----begin code-------
#!/usr/bin/perl -w

use strict;


use Bio::Tools::Phylo::PAML;
my $parser = new Bio::Tools::Phylo::PAML
             (-file => "mlc");
my $result = $parser->next_result;
my @posteriors = $result->get_posteriors();

print "@posteriors";

exit(0);

---------end code-------------


---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab


From avilella at gmail.com  Sun Oct 29 05:52:04 2006
From: avilella at gmail.com (Albert Vilella)
Date: Sun, 29 Oct 2006 10:52:04 +0000
Subject: [Bioperl-l] PAML
In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>

I don't know if this method is implemented. I can't grep-find it.
Maybe it's simply not there yet, but was planned when the
documentation was written.

On 10/28/06, Eric Ross <eric.ross at neuro.utah.edu> wrote:
> I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>
> I am able to extract other data from the report, but there seems to be a conflict in the documentation.  One doc implies that there should be a get_posteriors method. (It's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object.
>
>
> I have been trying various methods, in the event I'm just "confused", but I've had no luck, thus far.  Anyone have suggestions?
>
>
> code:
>
> ----begin code-------
> #!/usr/bin/perl -w
>
> use strict;
>
>
> use Bio::Tools::Phylo::PAML;
> my $parser = new Bio::Tools::Phylo::PAML
>              (-file => "mlc");
> my $result = $parser->next_result;
> my @posteriors = $result->get_posteriors();
>
> print "@posteriors";
>
> exit(0);
>
> ---------end code-------------
>
>
>
> ---------------
> Eric Ross
> Computer Analyst II
> ejr at neuro.utah.edu
> Howard Hughes Medical Institute
> University of Utah
> Sánchez Lab
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From cjfields at uiuc.edu  Sun Oct 29 09:23:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Oct 2006 08:23:45 -0600
Subject: [Bioperl-l] PAML
In-Reply-To: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
	<358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
Message-ID: <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>

Does the data show up in the object using Data::Dumper?

This should be filed as a bug since the docs imply the method  
exists.  This could be written up fairly quickly if one had test data  
and a script to work with (hint hint...)
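
A minimal sketch of the Data::Dumper check suggested above. It uses a stand-in object, because the real internals of Bio::Tools::Phylo::PAML::Result are not shown in this thread; the class name Hypothetical::Result and its hash keys are invented purely for illustration. With a real report, $result would come from $parser->next_result instead.

```perl
use strict;
use warnings;
use Data::Dumper;

# Stand-in for the object returned by $parser->next_result.
# ASSUMPTION: the class name and hash keys are hypothetical, not the
# real layout of Bio::Tools::Phylo::PAML::Result.
my $result = bless {
    _model => 'hypothetical-model',
    _sites => [ { position => 12, probability => 0.97 } ],
}, 'Hypothetical::Result';

# Ask the object whether the documented method exists before calling it,
# so a missing method does not kill the script.
my $has_method = $result->can('get_posteriors') ? 1 : 0;
print "get_posteriors available: ", ($has_method ? "yes" : "no"), "\n";

# Dump the internals with sorted keys to see whether the parsed values
# are stored anywhere, even if no accessor exposes them.
$Data::Dumper::Sortkeys = 1;
$Data::Dumper::Indent   = 1;
my $dump = Dumper($result);
print $dump;
```

If the posterior values show up in the dump, a bug report can point at exactly where they are stored, which should make writing the missing accessor easier.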

Chris

On Oct 29, 2006, at 4:52 AM, Albert Vilella wrote:

> I don't know if this method is implemented. I can't grep-find it.
> Maybe it's simply not there yet, but was planned when the
> documentation was written.
>
> On 10/28/06, Eric Ross <eric.ross at neuro.utah.edu> wrote:
>> I am trying to extract the "Naive Empirical Bayes (NEB)  
>> probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>>
>> I am able to extract other data from the report, but there seems  
>> to be a conflict in the documentation.  One doc implies that there  
>> should be a get_posteriors method. (It's used as an example in the  
>> Bio::Tools::Phylo::PAML doc), but the method does not appear to  
>> exist in the Bio::Tools::Phylo::PAML::Result object.
>>
>>
>> I have been trying various methods, in the event I'm just  
>> "confused", but I've had no luck, thus far.  Anyone have suggestions?
>>
>>
>> code:
>>
>> ----begin code-------
>> #!/usr/bin/perl -w
>>
>> use strict;
>>
>>
>> use Bio::Tools::Phylo::PAML;
>> my $parser = new Bio::Tools::Phylo::PAML
>>              (-file => "mlc");
>> my $result = $parser->next_result;
>> my @posteriors = $result->get_posteriors();
>>
>> print "@posteriors";
>>
>> exit(0);
>>
>> ---------end code-------------
>>
>>
>>
>> ---------------
>> Eric Ross
>> Computer Analyst II
>> ejr at neuro.utah.edu
>> Howard Hughes Medical Institute
>> University of Utah
>> Sánchez Lab
>>
>>
>>
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From eric.ross at neuro.utah.edu  Sun Oct 29 12:06:54 2006
From: eric.ross at neuro.utah.edu (Eric Ross)
Date: Sun, 29 Oct 2006 10:06:54 -0700
Subject: [Bioperl-l] PAML
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
	<358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
	<9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>
Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu>

Thanks for all the help.

I've been looking at the code for the PAML rst parser.  It's a bit tricky. 

We have written a parser specific for our needs, but it looks to be a pretty complicated matter to make it generic.  

The output of PAML can vary a lot depending upon your options and this section can be repeated multiple times.  I'm sure someone with a good grasp of the potential output of PAML could come up with something, but I'll admit to being at a loss. 


---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab


-----Original Message-----
From: Chris Fields [mailto:cjfields at uiuc.edu]
Sent: Sun 2006-10-29 7:23 AM
To: Albert Vilella
Cc: Eric Ross; Bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] PAML
 
Does the data show up in the object using Data::Dumper?

This should be filed as a bug since the docs imply the method  
exists.  This could be written up fairly quickly if one had test data  
and a script to work with (hint hint...)

Chris

On Oct 29, 2006, at 4:52 AM, Albert Vilella wrote:

> I don't know if this method is implemented. I can't grep-find it.
> Maybe it's simply not there yet, but was planned when the
> documentation was written.
>
> On 10/28/06, Eric Ross <eric.ross at neuro.utah.edu> wrote:
>> I am trying to extract the "Naive Empirical Bayes (NEB)  
>> probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>>
>> I am able to extract other data from the report, but there seems  
>> to be a conflict in the documentation.  One doc implies that there  
>> should be a get_posteriors method. (It's used as an example in the  
>> Bio::Tools::Phylo::PAML doc), but the method does not appear to  
>> exist in the Bio::Tools::Phylo::PAML::Result object.
>>
>>
>> I have been trying various methods, in the event I'm just  
>> "confused", but I've had no luck, thus far.  Anyone have suggestions?
>>
>>
>> code:
>>
>> ----begin code-------
>> #!/usr/bin/perl -w
>>
>> use strict;
>>
>>
>> use Bio::Tools::Phylo::PAML;
>> my $parser = new Bio::Tools::Phylo::PAML
>>              (-file => "mlc");
>> my $result = $parser->next_result;
>> my @posteriors = $result->get_posteriors();
>>
>> print "@posteriors";
>>
>> exit(0);
>>
>> ---------end code-------------
>>
>>
>>
>> ---------------
>> Eric Ross
>> Computer Analyst II
>> ejr at neuro.utah.edu
>> Howard Hughes Medical Institute
>> University of Utah
>> Sánchez Lab
>>
>>
>>
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Sun Oct 29 12:43:20 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 29 Oct 2006 17:43:20 +0000
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
In-Reply-To: <45421658.5000103@sheffield.ac.uk>
References: <45421658.5000103@sheffield.ac.uk>
Message-ID: <4544E838.7090400@sheffield.ac.uk>

Sorry for the repeat post but I haven't had a response. Just wondered if 
anyone had any idea about this?

Thanks
Nath

Nathan S. Haigh wrote:
> As you may be aware by now, I'm working with Bio::Restriction::Analysis
> and friends.
>
> I'm doing restriction analysis on large sequences - chromosomes. I need
> to identify an appropriate enzyme based on the total length of fragments
> that are of a certain size (e.g. 100 - 500 bp). However, the amount of
> memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I
> have the following code (bottom) which downloads 2 thaliana chromosomes
> (mito and chloro - so pretty small) and runs an analysis and then loops
> through the fragments for all enzymes in the default collection.
>
> My memory usage just keeps on climbing and none seems to get freed up
> even when a $ra goes out of scope (start dealing with the next
> sequence). Is this a memory leak of some sort, is there a way to free up
> memory as I go? I'd appreciate any help/advice on how to reduce the
> amount of memory being consumed as I'd like to use all the thaliana
> chromosomes (not just mito and chloro), which at the moment probably
> won't work.
>
> Cheers
> Nath
>
> use strict;
> use Bio::DB::GenBank;
> use Bio::Restriction::Analysis;
> use Bio::Restriction::EnzymeCollection;
>
> my @seq_objs;
> my @gis = ( 7525012,  26556996 );
>
> my $db = Bio::DB::GenBank->new(-format => "fasta");
> foreach my $gi (@gis) {
>   print "Getting GI: $gi\n";
>   push @seq_objs, $db->get_Seq_by_id($gi)
> }
>
> my $min_fragment_size = 100;
> my $max_fragment_size = 500;
> my $enz_Coll = Bio::Restriction::EnzymeCollection->new();
>
> foreach my $seq (@seq_objs) {
>   my $tot_size = 0;
>   print "Processing ", $seq->primary_id,"\n";
>   my $ra = Bio::Restriction::Analysis->new(
>                                          -seq=>$seq,
>                                          -enzymes=>$enz_Coll,
>   );
>  
>   my @all_enzymes = $ra->cutters->each_enzyme;
>   print "  Calc total length of fragments in range: ",
>     "$min_fragment_size - $max_fragment_size\n";
>   foreach my $enzyme ( @all_enzymes ) {
>     # fragments() is a real memory hog
>     foreach my $frag ($ra->fragments($enzyme)) {
>       next if $min_fragment_size && (length $frag < $min_fragment_size);
>       next if $max_fragment_size && (length $frag > $max_fragment_size);
>       $tot_size += length $frag;
>     }
>     # do something based on value of $tot_size
>     #print "    ", $enzyme->name, " total = $tot_size\n";
>   }
>   print "DONE\n";
> }
>

From cjfields at uiuc.edu  Sun Oct 29 13:09:54 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Oct 2006 12:09:54 -0600
Subject: [Bioperl-l] PAML
In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
	<358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
	<9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <C775A898-5D18-48F6-874F-3B359C1A10C5@uiuc.edu>

On Oct 29, 2006, at 11:06 AM, Eric Ross wrote:

> Thanks for all the help.
>
> I've been looking at the code for the PAML rst parser.  It's a bit  
> tricky.
>
> We have written a parser specific for our needs, but it looks to be  
> a pretty complicated matter to make it generic.
>
> The output of PAML can vary a lot depending upon your options and  
> this section can be repeated multiple times.  I'm sure someone with  
> a good grasp of the potential output of PAML could come up with  
> something, but I'll admit to being at a loss.

Eric,

I planned on looking at ways to integrate the protein-based PAML  
programs but I'm working on a different area at the moment.  I agree  
it may be hard to adequately genericize parsing/methods to accomplish  
this, but if you have any ideas feel free to post them.  Again, I  
would suggest adding any proposed enhancements or bugs to Bugzilla:

http://bugzilla.open-bio.org/

Suggestions or bug reports on the list sometimes get lost in the  
shuffle, esp. since we're planning on a new developer release soon.

Chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Oct 29 13:16:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Oct 2006 12:16:37 -0600
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
In-Reply-To: <4544E838.7090400@sheffield.ac.uk>
References: <45421658.5000103@sheffield.ac.uk>
	<4544E838.7090400@sheffield.ac.uk>
Message-ID: <6D9EAA04-199C-4BDD-AA60-4833BC1CE250@uiuc.edu>


On Oct 29, 2006, at 11:43 AM, Nathan S. Haigh wrote:

> Sorry for the repeat post but I haven't had a response. Just  
> wondered if
> anyone had any idea about this?
>
> Thanks
> Nath

...

I think Warnock's dilemma applies here.  Likely no one is really sure, hence
they aren't answering.  It probably bears investigating by submitting  
and tracking as a bug.  My guess is something isn't garbage-collected  
properly (i.e. there are circular references present), leading to a  
memory leak.
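
For readers unfamiliar with the failure mode guessed at above, here is a small core-Perl sketch of how a circular reference keeps objects alive and how Scalar::Util::weaken breaks the cycle. The Node class is hypothetical and exists only to count destructor calls; nothing here is taken from Bio::Restriction::Analysis, and whether that module actually contains such a cycle is not established in this message.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical class, used only to count how many objects get destroyed.
package Node;
my $destroyed = 0;
sub new       { my ($class, %args) = @_; return bless {%args}, $class }
sub DESTROY   { $destroyed++ }
sub destroyed { return $destroyed }

package main;

# Case 1: parent and child hold strong references to each other.
{
    my $parent = Node->new(name => 'parent');
    my $child  = Node->new(name => 'child');
    $parent->{child} = $child;
    $child->{parent} = $parent;    # strong back-reference closes the cycle
}
# Both lexicals are gone, but each object still has a reference count of 1,
# so neither destructor has run yet.
my $after_cycle = Node->destroyed;

# Case 2: same structure, but the back-reference is weakened.
{
    my $parent = Node->new(name => 'parent');
    my $child  = Node->new(name => 'child');
    $parent->{child} = $child;
    $child->{parent} = $parent;
    weaken($child->{parent});      # no longer counts toward the refcount
}
my $after_weak = Node->destroyed - $after_cycle;

print "destroyed with strong cycle: $after_cycle\n";
print "destroyed with weakened back-reference: $after_weak\n";
```

Counting DESTROY calls like this is one cheap way to test a leak hypothesis before filing a bug.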

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From chhalling at alumni.ls.berkeley.edu  Sun Oct 29 14:16:36 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Sun, 29 Oct 2006 14:16:36 -0500
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
In-Reply-To: <4544E838.7090400@sheffield.ac.uk>
References: <45421658.5000103@sheffield.ac.uk>
	<4544E838.7090400@sheffield.ac.uk>
Message-ID: <4544FE14.7030701@alumni.ls.berkeley.edu>

Nathan S. Haigh wrote:
> Sorry for the repeat post but I haven't had a response. Just wondered if 
> anyone had any idea about this?
>
> Thanks
> Nath
>
> Nathan S. Haigh wrote:
>   
>> As you may be aware by now, I'm working with Bio::Restriction::Analysis
>> and friends.
>>
>> I'm doing restriction analysis on large sequences - chromosomes. I need
>> to identify an appropriate enzyme based on the total length of fragments
>> that are of a certain size (e.g. 100 - 500 bp). However, the amount of
>> memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I
>> have the following code (bottom) which downloads 2 thaliana chromosomes
>> (mito and chloro - so pretty small) and runs an analysis and then loops
>> through the fragments for all enzymes in the default collection.
>>
>> My memory usage just keeps on climbing and none seems to get freed up
>> even when a $ra goes out of scope (start dealing with the next
>> sequence). Is this a memory leak of some sort, is there a way to free up
>> memory as I go? I'd appreciate any help/advice on how to reduce the
>> amount of memory being consumed as I'd like to use all the thaliana
>> chromosomes (not just mito and chloro), which at the moment probably
>> won't work.
>>
>> Cheers
>> Nath
>>
>> use strict;
>> use Bio::DB::GenBank;
>> use Bio::Restriction::Analysis;
>> use Bio::Restriction::EnzymeCollection;
>>
>> my @seq_objs;
>> my @gis = ( 7525012,  26556996 );
>>
>> my $db = Bio::DB::GenBank->new(-format => "fasta");
>> foreach my $gi (@gis) {
>>   print "Getting GI: $gi\n";
>>   push @seq_objs, $db->get_Seq_by_id($gi)
>> }
>>
>> my $min_fragment_size = 100;
>> my $max_fragment_size = 500;
>> my $enz_Coll = Bio::Restriction::EnzymeCollection->new();
>>
>> foreach my $seq (@seq_objs) {
>>   my $tot_size = 0;
>>   print "Processing ", $seq->primary_id,"\n";
>>   my $ra = Bio::Restriction::Analysis->new(
>>                                          -seq=>$seq,
>>                                          -enzymes=>$enz_Coll,
>>   );
>>  
>>   my @all_enzymes = $ra->cutters->each_enzyme;
>>   print "  Calc total length of fragments in range: ",
>>     "$min_fragment_size - $max_fragment_size\n";
>>   foreach my $enzyme ( @all_enzymes ) {
>>     # fragments() is a real memory hog
>>     foreach my $frag ($ra->fragments($enzyme)) {
>>       next if $min_fragment_size && (length $frag < $min_fragment_size);
>>       next if $max_fragment_size && (length $frag > $max_fragment_size);
>>       $tot_size += length $frag;
>>     }
>>     # do something based on value of $tot_size
>>     #print "    ", $enzyme->name, " total = $tot_size\n";
>>   }
>>   print "DONE\n";
>> }
>>
>>     
Try this code, which creates a new Bio::Restriction::Analysis object for 
each digest. On my PowerBook, this doesn't use more than 13 Mb of memory.

Reading the code for Bio::Restriction::Analysis reveals that the 
fragments() method calls the cut() method. The documentation for the cut 
method states:

Note: cut doesn't now re-initialize everything before figuring out
cuts. This is so that you can do multiple digests, or add more data or
whatever. You'll have to use new to reset everything.

This means there is no memory leak; it's just that the 
Bio::Restriction::Analysis object is retaining cut information for each 
enzyme, which takes a lot of memory.

use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012,  26556996 );

my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
  print "Getting GI: $gi\n";
  push @seq_objs, $db->get_Seq_by_id($gi)
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
  print "Processing ", $seq->primary_id, "\n";
  foreach my $enzyme ( $enz_Coll->each_enzyme() ) {
    my $ra = Bio::Restriction::Analysis->new(
      -seq => $seq,
      -enzymes => $enzyme );
    my $tot_size = 0;
 
    print "  Calc total length of fragments in range: ",
      "$min_fragment_size - $max_fragment_size\n";

    foreach my $frag ($ra->fragments($enzyme)) {
      next if $min_fragment_size && (length $frag < $min_fragment_size);
      next if $max_fragment_size && (length $frag > $max_fragment_size);
      $tot_size += length $frag;
    }
    # do something based on value of $tot_size
    print "    ", $enzyme->name, " total = $tot_size\n";
  }
  print "DONE\n";
}

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From n.haigh at sheffield.ac.uk  Mon Oct 30 03:51:49 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 30 Oct 2006 08:51:49 +0000
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
Message-ID: <4545BD25.3030107@sheffield.ac.uk>

In my script I retrieve sequences from GenBank in FASTA format by GI
numbers and optionally store the sequence in a cache using
Bio::DB::Fasta. On subsequent runs of the script, the cache is first
checked for the GI and returns the sequence if it is found or the
sequence is obtained from GenBank as above.

I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have
returned a Bio::Seq object but rather it returns a Bio::PrimarySeq
object which is defined within the Bio::DB::Fasta file. This is
annoying, since $seq_obj in my script would be either a Bio::Seq if it
was obtained from GenBank or a Bio::PrimarySeq if obtained from the
cache and calling primary_id() on it doesn't do the expected thing with
Bio::PrimarySeq:
ID:          Bio::PrimarySeq::Fasta=HASH(0x89b4508)

Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object?

Nath

From yuhki at ncifcrf.gov  Mon Oct 30 08:57:35 2006
From: yuhki at ncifcrf.gov (Naoya Yuhki)
Date: Mon, 30 Oct 2006 08:57:35 -0500
Subject: [Bioperl-l] bptutorial.pl 0
Message-ID: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov>

Hello,
I ran

perl bptutorial.pl 0

and I got the following error.

-------------------- WARNING ---------------------
MSG: id (ROA1_HUMAN) does not exist
---------------------------------------------------
Can't call method "display_id" on an undefined value at bptutorial.pl  
line 3945.

other tests all worked.

I would appreciate any suggestions.

NAOYA YUHKI.


From cjfields at uiuc.edu  Mon Oct 30 12:42:21 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 30 Oct 2006 11:42:21 -0600
Subject: [Bioperl-l] bptutorial.pl 0
In-Reply-To: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov>
Message-ID: <000601c6fc4a$c3e43450$15327e82@pyrimidine>

> Hello,
> I ran
> 
> perl bptutorial.pl 0
> 
> and I got the following error.
> 
> -------------------- WARNING ---------------------
> MSG: id (ROA1_HUMAN) does not exist
> ---------------------------------------------------
> Can't call method "display_id" on an undefined value at bptutorial.pl
> line 3945. 
> 
> other tests all worked.
> 
> I would appreciate any suggestions.
> 
> NAOYA YUHKI.

What version of Bioperl are you running?  

As a warning, the bptutorial.pl script has been removed from CVS and will
not be included in future versions of Bioperl.  It can be found on the
bioperl wiki instead:

http://www.bioperl.org/wiki/Bptutorial

chris


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Mon Oct 30 13:08:15 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 30 Oct 2006 10:08:15 -0800
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <29F47393-D134-4093-8751-E948BF521843@bioperl.org>

Bio::PrimarySeq makes sense because Fasta databases only provide  
sequences without features.  But you are actually getting a  
Bio::PrimarySeq::Fasta object which is a proxy object since the  
module won't pull a whole sequence into memory unless seq() is  
requested.

The problem is really why you are getting something useless set for  
primary_id.

What do you want it to be - the GI number?  You'll need to set it
explicitly because DB::Fasta has no concept of GI numbers encoded in
the header line.
AFAIK you also cannot set the primary_id to a value of your liking
because this is a proxy object.  The best bet is to create a Bio::Seq
object out of one of these and set the primary_id and display_id to
values that you can compute from the original display_id.

At least that has been my strategy when using this - maybe someone
wants to code something new into the object itself.
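
A rough sketch of that strategy, assuming the FASTA IDs look like "gi|<number>|...". Both helper names (gi_from_display_id, seq_from_proxy) are made up for this example, and the sample ID is invented. The second helper needs BioPerl installed and is only defined here, not called.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the GI number from an NCBI-style ID such as
# "gi|12345|gb|EXAMPLE1.1|". Returns undef when the ID has no GI prefix.
sub gi_from_display_id {
    my ($display_id) = @_;
    return undef unless defined $display_id;
    my ($gi) = $display_id =~ /^gi\|(\d+)/;
    return $gi;
}

# Hypothetical helper: copy a Bio::DB::Fasta proxy object into a regular
# Bio::Seq whose IDs can be set freely.
sub seq_from_proxy {
    my ($proxy) = @_;
    require Bio::Seq;    # loaded here so the helper above needs no BioPerl
    my $display_id = $proxy->display_id;
    my $gi         = gi_from_display_id($display_id);
    return Bio::Seq->new(
        -seq        => $proxy->seq,    # pulls the sequence into memory
        -display_id => $display_id,
        -primary_id => defined $gi ? $gi : $display_id,
    );
}

# The ID below is an invented example, not a real GenBank record.
print gi_from_display_id('gi|12345|gb|EXAMPLE1.1|'), "\n";
```

Note that copying into a Bio::Seq reads the whole sequence, so the lazy-loading benefit of the proxy is lost for that record.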

-jason
On Oct 30, 2006, at 12:51 AM, Nathan S. Haigh wrote:

> In my script I retrieve sequences from GenBank in FASTA format by GI
> numbers and optionally store the sequence in a cache using
> Bio::DB::Fasta. On subsequent runs of the script, the cache is first
> checked for the GI and returns the sequence if it is found or the
> sequence is obtained from GenBank as above.
>
> I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have
> returned a Bio::Seq object but rather it returns a Bio::PrimarySeq
> object which is defined within the Bio::DB::Fasta file. This is
> annoying, since $seq_obj in my script would be either a Bio::Seq if it
> was obtained from GenBank or a Bio::PrimarySeq if obtained from the
> cache and calling primary_id() on it doesn't do the expected thing  
> with
> Bio::PrimarySeq:
> ID:          Bio::PrimarySeq::Fasta=HASH(0x89b4508)
>
> Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object?
>
> Nath

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From golharam at umdnj.edu  Mon Oct 30 15:11:51 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 15:11:51 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String?
Message-ID: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>

I'm trying to parse some blast output w/o actually creating the output
file.  Instead, I'm capturing the output in a variable and would like to
use IO::String to represent the file:

	$_ = `megablast -d somedatabase -i somesequence -D 2`;
	my $blast_file = new IO::String($_);
	my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
$blast_file);
	my $results = $searchio->next_result;
	my $hit = $results->next_hit;
	if (! defined($hit)) {
		warn "No BLAST hit for $accession on chr $chr for "
		   . "Seq/$orth_id/$organism\n\n";
		return;
	}

Now, when Bio::SearchIO tries to read the output line by line, instead
it reads the entire output as 1 line.

If I provide the output in a file and use:

	my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
'/tmp/somefile.blast');

This works...so is it possible to use IO::String to provide
Bio::SearchIO with BLAST output?  

Ryan


From golharam at umdnj.edu  Mon Oct 30 15:54:29 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 15:54:29 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com>
Message-ID: <00e801c6fc65$9849aee0$e6028a0a@GOLHARMOBILE1>

Thanks.  How are you getting the output?  system()?  BTW- I'm using
v1.5.1...


> -----Original Message-----
> From: Bernd Web [mailto:bernd.web at gmail.com] 
> Sent: Monday, October 30, 2006 3:45 PM
> To: golharam at umdnj.edu
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] Is it possible to parse BLAST output 
> using IO:String?
> 
> 
> Hi Ryan,
> 
> I parse blastn output using IO::String w/o problems:
> 
>  my $stringfh = new IO::String($input);
>  my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh);
> 
> however, this input does not come via backticks.
> 
> 
> bernd
> 
> On 10/30/06, Ryan Golhar <golharam at umdnj.edu> wrote:
> > I'm trying to parse some blast output w/o actually creating 
> the output 
> > file.  Instead, I'm capturing the output in a variable and 
> would like 
> > to use IO::String to represent the file:
> >
> >         $_ = `megablast -d somedatabase -i somesequence -D 2`;
> >         my $blast_file = new IO::String($_);
> >         my $searchio = new Bio::SearchIO(-format => 'blast', -fh => 
> > $blast_file);
> >         my $results = $searchio->next_result;
> >         my $hit = $results->next_hit;
> >         if (! defined($hit)) {
> >                 warn "No BLAST hit for $accession on chr $chr for 
> > Seq/$orth_id/$organism\n\n";
> >                 return;
> >         }
> >
> > Now, when Bio::SearchIO tries to read the output line by 
> line, instead 
> > it reads the entire output as 1 line.
> >
> > If I provide the output in a file and use:
> >
> >         my $searchio = new Bio::SearchIO(-format => 
> 'blast', -file => 
> > '/tmp/somefile.blast');
> >
> > This works...so is it possible to use IO::String to provide 
> > Bio::SearchIO with BLAST output?
> >
> > Ryan
> >
> 


From bix at sendu.me.uk  Mon Oct 30 16:27:58 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 30 Oct 2006 21:27:58 +0000
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
Message-ID: <45466E5E.9000504@sendu.me.uk>

Ryan Golhar wrote:
> I'm trying to parse some blast output w/o actually creating the output
> file.  Instead, I'm capturing the output in a variable and would like to
> use IO::String to represent the file:
> 
> 	$_ = `megablast -d somedatabase -i somesequence -D 2`;
> 	my $blast_file = new IO::String($_);
> 	my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
> $blast_file);
> 	my $results = $searchio->next_result;
> 	my $hit = $results->next_hit;
> 	if (! defined($hit)) {
> 		warn "No BLAST hit for $accession on chr $chr for
> Seq/$orth_id/$organism\n\n";
> 		return;
> 	}
> 
> Now, when Bio::SearchIO tries to read the output line by line, instead
> it reads the entire output as 1 line.
> 
> If I provide the output in a file and use:
> 
> 	my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
> '/tmp/somefile.blast');
> 
> This works...so is it possible to use IO::String to provide
> Bio::SearchIO with BLAST output?

Why must it be IO::String? Why not just open() your megablast and 
provide $searchio the real filehandle? It would be faster that way as well.

Read the docs for `. Your usage above is inappropriate.
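
As a small illustration of what the backtick docs describe, the sketch below shows that the operator returns one string in scalar context and a list of lines in list context, and that a lexical string can back an in-memory filehandle in core Perl. The command is a stand-in (the running perl printing three lines), not megablast, and the thread itself does not pin down why the original $_-based code misbehaved.

```perl
use strict;
use warnings;

# Stand-in command: the running perl prints the numbers 1..3, one per line.
# In the thread this would be the megablast command line.
my $cmd = qq{"$^X" -le "print for 1..3"};

my $as_scalar = `$cmd`;    # scalar context: all output in a single string
my @as_list   = `$cmd`;    # list context: one element per output line

# Keep the output in a lexical (not the global $_) and read it back line by
# line through an in-memory filehandle.
open(my $fh, '<', \$as_scalar) or die "Cannot open in-memory handle: $!";
my $line_count = 0;
while (my $line = <$fh>) {
    $line_count++;
}
close($fh);

print "lines via list context: ", scalar(@as_list), "\n";
print "lines via in-memory handle: $line_count\n";
```

A handle opened this way could presumably be passed to Bio::SearchIO through -fh, as in the original code, though that combination is not tested here.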


From golharam at umdnj.edu  Mon Oct 30 16:54:45 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 16:54:45 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <C3209DC5-433B-4BAD-A184-AC9D2A2B4A90@bioperl.org>
Message-ID: <00f901c6fc6e$03916460$e6028a0a@GOLHARMOBILE1>

Hmmm.  Yes, I suppose I could.  
 
I did it with the backticks because I based my code on the "To and
From a String" section of the SeqIO HOWTO...
 

-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason
Stajich
Sent: Monday, October 30, 2006 4:44 PM
To: Sendu Bala
Cc: golharam at umdnj.edu; 'bioperl-l'
Subject: Re: [Bioperl-l] Is it possible to parse BLAST output using
IO:String?


right - can't you just do: 

my $fh;
open($fh, "megablast -d ... | ") || die $!;
my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh);

On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote:


Ryan Golhar wrote:

I'm trying to parse some blast output w/o actually creating the output
file.  Instead, I'm capturing the output in a variable and would like to
use IO::String to represent the file:

$_ = `megablast -d somedatabase -i somesequence -D 2`;
my $blast_file = new IO::String($_);
my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
$blast_file);
my $results = $searchio->next_result;
my $hit = $results->next_hit;
if (! defined($hit)) {
warn "No BLAST hit for $accession on chr $chr for
Seq/$orth_id/$organism\n\n";
return;
}

Now, when Bio::SearchIO tries to read the output line by line, instead
it reads the entire output as 1 line.

If I provide the output in a file and use:

my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
'/tmp/somefile.blast');

This works...so is it possible to use IO::String to provide
Bio::SearchIO with BLAST output?


Why must it be IO::String? Why not just open() your megablast and 
provide $searchio the real filehandle? It would be faster that way as
well.

Read the docs for `. Your usage above is inappropriate.




--
Jason Stajich, PhD 
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From bernd.web at gmail.com  Mon Oct 30 15:44:31 2006
From: bernd.web at gmail.com (Bernd Web)
Date: Mon, 30 Oct 2006 21:44:31 +0100
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
Message-ID: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com>

Hi Ryan,

I parse blastn output using IO::String w/o problems:

 my $stringfh = new IO::String($input);
 my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh);

however, this input does not come via backticks.


bernd

On 10/30/06, Ryan Golhar <golharam at umdnj.edu> wrote:
> I'm trying to parse some blast output w/o actually creating the output
> file.  Instead, I'm capturing the output in a variable and would like to
> use IO::String to represent the file:
>
>         $_ = `megablast -d somedatabase -i somesequence -D 2`;
>         my $blast_file = new IO::String($_);
>         my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
> $blast_file);
>         my $results = $searchio->next_result;
>         my $hit = $results->next_hit;
>         if (! defined($hit)) {
>                 warn "No BLAST hit for $accession on chr $chr for
> Seq/$orth_id/$organism\n\n";
>                 return;
>         }
>
> Now, when Bio::SearchIO tries to read the output line by line, instead
> it reads the entire output as 1 line.
>
> If I provide the output in a file and use:
>
>         my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
> '/tmp/somefile.blast');
>
> This works...so is it possible to use IO::String to provide
> Bio::SearchIO with BLAST output?
>
> Ryan
>

From jason at bioperl.org  Mon Oct 30 16:44:18 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 30 Oct 2006 13:44:18 -0800
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <45466E5E.9000504@sendu.me.uk>
References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
	<45466E5E.9000504@sendu.me.uk>
Message-ID: <C3209DC5-433B-4BAD-A184-AC9D2A2B4A90@bioperl.org>

right - can't you just do:

my $fh;
open($fh, "megablast -d ... | ") || die $!;
my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh);
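
A slightly more defensive variant of the same idea, as a sketch only: the
list form of a pipe open avoids the shell, and close() reports whether the
command failed. The command here is a stand-in (the running perl binary
printing two lines) so that the example runs anywhere; the real megablast
arguments are not reproduced. List-form pipe opens are assumed to be
available, which is the case on Unix-like systems.

```perl
use strict;
use warnings;

# Stand-in command so the sketch runs anywhere: the current perl binary
# printing two lines. In practice this would be the megablast command line.
my @cmd = ($^X, '-e', 'print "first\nsecond\n"');

# List-form pipe open: no shell is involved, so arguments need no quoting.
open(my $fh, '-|', @cmd) or die "Cannot start $cmd[0]: $!";

my @lines = <$fh>;

# close() on a pipe waits for the child and sets $? to its exit status.
close($fh) or die "Command failed with status $?";

print "read ", scalar(@lines), " lines\n";
```

Instead of slurping into @lines, the open handle could be handed to the
parser directly, as in the three lines above.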

On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote:

> Ryan Golhar wrote:
>> I'm trying to parse some blast output w/o actually creating the  
>> output
>> file.  Instead, I'm capturing the output in a variable and would  
>> like to
>> use IO::String to represent the file:
>>
>> 	$_ = `megablast -d somedatabase -i somesequence -D 2`;
>> 	my $blast_file = new IO::String($_);
>> 	my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
>> $blast_file);
>> 	my $results = $searchio->next_result;
>> 	my $hit = $results->next_hit;
>> 	if (! defined($hit)) {
>> 		warn "No BLAST hit for $accession on chr $chr for
>> Seq/$orth_id/$organism\n\n";
>> 		return;
>> 	}
>>
>> Now, when Bio::SearchIO tries to read the output line by line,  
>> instead
>> it reads the entire output as 1 line.
>>
>> If I provide the output in a file and use:
>>
>> 	my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
>> '/tmp/somefile.blast');
>>
>> This works...so is it possible to use IO::String to provide
>> Bio::SearchIO with BLAST output?
>
> Why must it be IO::String? Why not just open() your megablast and
> provide $searchio the real filehandle? It would be faster that way  
> as well.
>
> Read the docs for `. Your usage above is inappropriate.
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From lstein at cshl.edu  Mon Oct 30 13:59:29 2006
From: lstein at cshl.edu (Lincoln Stein)
Date: Mon, 30 Oct 2006 13:59:29 -0500
Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase
Message-ID: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>

Hi All,

I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not
to validate. I have committed a new version to live and to the release
candidate branch. I hope it isn't too late to get this into the release.

Lincoln

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From huangyi1 at hkusua.hku.hk  Tue Oct 31 00:46:20 2006
From: huangyi1 at hkusua.hku.hk (Huang Yi)
Date: Tue, 31 Oct 2006 13:46:20 +0800
Subject: [Bioperl-l] bioperl1.5 and GD2.35
Message-ID: <200610310546.k9V5kQGT010481@hkusua.hku.hk>

Hi,

 
I just installed bioperl 1.4 from CPAN on my Gentoo Linux computer, but the
installation failed and I had to force the install.

The GD module, however, could not be installed, for reasons I could not
determine.

I therefore used Gentoo's "emerge" tool to install bioperl and GD again. Both
installed fine; bioperl was upgraded to 1.5 and GD is at 2.35.

However, when I tested the setup with the program from the HOWTO wiki page
(http://www.bioperl.org/wiki/HOWTO:Graphics), it always reported:

Can't locate object method "png" via package "GD::Image" at
/usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line 799, <> line 9.

On my other computer, bioperl 1.4 and GD 2.34 work fine. I therefore want to
remove the CPAN bioperl from the system and re-install it, but that seems to
be impossible.

Could you please give me some advice on how to get GD and bioperl working?

Thanks!

Huang Yi

 
From bix at sendu.me.uk  Tue Oct 31 03:20:21 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 31 Oct 2006 08:20:21 +0000
Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase
In-Reply-To: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>
References: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>
Message-ID: <45470745.1050605@sendu.me.uk>

Lincoln Stein wrote:
> Hi All,
> 
> I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not
> to validate. I have committed a new version to live and to the release
> candidate branch. I hope it isn't too late to get this into the release.

It isn't too late, thank you.


From avilella at gmail.com  Tue Oct 31 08:54:39 2006
From: avilella at gmail.com (Albert Vilella)
Date: Tue, 31 Oct 2006 13:54:39 +0000
Subject: [Bioperl-l] catfile and catdir
Message-ID: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>

Hi,

I was testing bioperl-run/t/PAML.t and stumbled upon this
catdir/catfile error:

Can't locate object method "catdir" via package "Bio::Root::IO" at
/home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
113.
BEGIN failed--compilation aborted at
/home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
143.
Compilation failed in require at t/PAML.t line 64.
BEGIN failed--compilation aborted at t/PAML.t line 64.

Should we be using File::Spec for catdir and catfile instead of Root::IO?

Cheers,

    Albert.

From Kevin.M.Brown at asu.edu  Tue Oct 31 10:34:34 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Tue, 31 Oct 2006 08:34:34 -0700
Subject: [Bioperl-l] bioperl1.5 and GD2.35
Message-ID: <1A4207F8295607498283FE9E93B775B4023B5F3C@EX02.asurite.ad.asu.edu>

Not really a Bioperl issue per se, but it sounds like when you had Gentoo
emerge GD it was built without libpng, and so without the parts needed
to create PNG graphics.
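
One way to check this from Perl, as a rough sketch: load the module and ask
whether the class provides the method named in the error message. The helper
name class_can is made up for this example, and the code assumes that a GD
built without PNG support simply lacks a png method, which is what the error
above suggests.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $module loads and $class provides $method.
sub class_can {
    my ($module, $class, $method) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;      # Foo::Bar -> Foo/Bar.pm
    return 0 unless eval { require $file; 1 };   # module not installed
    return $class->can($method) ? 1 : 0;
}

# Report which image output methods the installed GD offers, if any.
for my $format (qw(png gif jpeg)) {
    printf "GD::Image->%s: %s\n", $format,
        class_can('GD', 'GD::Image', $format) ? 'available' : 'missing';
}
```

If png shows up as missing, rebuilding GD against libpng would be the thing
to try before touching the Bioperl installation.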

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Huang Yi
> Sent: Monday, October 30, 2006 10:46 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] bioperl1.5 and GD2.35
> 
> Hi,
> 
>  
> 
> I just installed bioperl 1.4 from CPAN to my Gentoo linux 
> computer. But the
> installation was failed. I had to install by force.
> 
>  
> 
> However, the GD module couldn't be installed for some unknown reasons.
> 
>  
> 
> I therefore use "emerge" tool of Gentoo to get bioperl and GD 
> again. They
> are fine. The version of bioperl became upgrade to1.5 and GD was 2.35.
> 
>  
> 
> However, when I tested it by using the program in HOWTO wiki page
> (http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me:
> 
>  
> 
> Can't locate object method "png" via package "GD::Image" at
> /usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line 
> 799, <> line 9.
> 
>  
> 
> In my other computer, bioperl1.4 and GD2.34 work fine. I 
> therefore want to
> remove the CPAN bioperl from the system and re-install it, 
> but it seems to
> be impossible.
> 
>  
> 
> Would you please give me some advices on how to let my GD and 
> bioperl work. 
> 
>  
> 
> Thanks!
> 
>  
> 
> Huang Yi
> 
>  
> 


From hlapp at gmx.net  Tue Oct 31 11:21:40 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 11:21:40 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
	<24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
Message-ID: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>


On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:

> BTW, was that supposed to be Bio::AnnotatableI, or  
> Bio::AnnotationHolderI?

Sorry, the former. I guess I got confused with FeatureHolders. Too  
bad Featureable isn't an English word.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Tue Oct 31 12:01:44 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:01:44 -0500
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net>

The only thing I would add to Jason's reply is that it is easy to do

	if (! $seq->isa("Bio::SeqI")) {
		my $bioseq = Bio::Seq->new();
		$bioseq->primary_seq($seq);
		$seq = $bioseq;
	}

and from that point on all your objects are Bio::SeqI compliant  
regardless of whether they were obtained that way or not.

Aside from that I wonder why there isn't a -primary_seq option in  
Bio::Seq::new - this would shorten the above into a (more perl'ish)  
single line:

	$seq = Bio::Seq->new(-primary_seq=>$seq) unless $seq->isa("Bio::SeqI");

Any takers to add that capability?

-hilmar

On Oct 30, 2006, at 3:51 AM, Nathan S. Haigh wrote:

> In my script I retrieve sequences from GenBank in FASTA format by GI
> numbers and optionally store the sequence in a cache using
> Bio::DB::Fasta. On subsequent runs of the script, the cache is first
> checked for the GI and returns the sequence if it is found or the
> sequence is obtained from GenBank as above.
>
> I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have
> returned a Bio::Seq object but rather it returns a Bio::PrimarySeq
> object which is defined within the Bio::DB::Fasta file. This is
> annoying, since $seq_obj in my script would be either a Bio::Seq if it
> was obtained from GenBank or a Bio::PrimarySeq if obtained from the
> cache and calling primary_id() on it doesn't do the expected thing  
> with
> Bio::PrimarySeq:
> ID:          Bio::PrimarySeq::Fasta=HASH(0x89b4508)
>
> Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object?
>
> Nath

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct 31 12:08:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 11:08:56 -0600
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
Message-ID: <001401c6fd0f$4239aa50$15327e82@pyrimidine>

>> BTW, was that supposed to be Bio::AnnotatableI, or
>> Bio::AnnotationHolderI?
> 
> Sorry, the former. I guess I got confused with
> FeatureHolders. Too bad Featureable isn't an English word.
> 
> 	-hilmar

Having SimpleAlign be AnnotatableI shouldn't be too much of a burden, since
the only additional implemented method is annotation().  So, I think all the
various Stockholm tags can be placed somewhere.

A bit OT: were we planning on getting rid of the various *_tag_* methods in
AnnotatableI at some point?  I'm a bit confused as to why they were added.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Tue Oct 31 12:09:26 2006
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 31 Oct 2006 09:09:26 -0800
Subject: [Bioperl-l] catfile and catdir
In-Reply-To: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>
References: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>
Message-ID: <1AD4DB38-E08D-4E47-8A59-6539068474CB@bioperl.org>

Yep.  Unless we want this to also exist in Root::IO and delegate to  
File::Spec.
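
A minimal sketch of the delegation option, for illustration only. The package
name My::Root::IO and the file name are made up, and this is not the actual
Bio::Root::IO code; it only shows class methods that forward to File::Spec.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a class that wants to offer catfile/catdir
# itself; this is NOT the actual Bio::Root::IO code.
package My::Root::IO;

use File::Spec;

# Both methods simply forward their arguments to File::Spec.
sub catfile { my ($class, @parts) = @_; return File::Spec->catfile(@parts); }
sub catdir  { my ($class, @parts) = @_; return File::Spec->catdir(@parts);  }

package main;

my $dir  = My::Root::IO->catdir('t', 'data');
my $file = My::Root::IO->catfile($dir, 'example.txt');   # made-up file name
print "$file\n";
```

Callers that already say Bio::Root::IO->catdir(...) would keep working under
this approach, while new code could call File::Spec directly.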

-jason
On Oct 31, 2006, at 5:54 AM, Albert Vilella wrote:

> Hi,
>
> I was testing bioperl-run/t/PAML.t and stumbled upon this
> catdir/catfile error:
>
> Can't locate object method "catdir" via package "Bio::Root::IO" at
> /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
> 113.
> BEGIN failed--compilation aborted at
> /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
> 143.
> Compilation failed in require at t/PAML.t line 64.
> BEGIN failed--compilation aborted at t/PAML.t line 64.
>
> Should we be using File::Spec for catdir and catfile instead of
> Root::IO?
>
> Cheers,
>
>     Albert.

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From jason at bioperl.org  Tue Oct 31 12:10:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 31 Oct 2006 09:10:51 -0800
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
	<24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
	<8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
Message-ID: <65F92B54-33FD-4D8F-90B7-49E2697CDBA2@bioperl.org>

It just needs to have an annotation collection, so it would be
Bio::AnnotatableI.

On Oct 31, 2006, at 8:21 AM, Hilmar Lapp wrote:

>
> On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:
>
>> BTW, was that supposed to be Bio::AnnotatableI, or
>> Bio::AnnotationHolderI?
>
> Sorry, the former. I guess I got confused with FeatureHolders. Too
> bad Featureable isn't an English word.
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From hlapp at gmx.net  Tue Oct 31 12:44:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:44:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <C16CF3EE.B1A9%bosborne11@verizon.net>
References: <C16CF3EE.B1A9%bosborne11@verizon.net>
Message-ID: <ACF19E78-7FC3-42BE-8F41-86C45C710F4B@gmx.net>

Well isn't this a result of conflating some of the SeqFeatureI  
methods into the annotation collection?

If I'm not mistaken on this then those methods were introduced in  
1.5.0 and hence can go away without deprecation.

	-hilmar

On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote:

> Chris,
>
> I don't think the intent was to remove the methods, rather we'd just
> call deprecated(). Example from AnnotatableI:
>
> sub remove_tag {
>   my ($self, @args) = @_;
>
>   #uncomment in 1.6
>   #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');
>
>   return $self->annotation->remove_Annotations(@args);
> }
>
> With regards to "why", I can't reconstruct the entire rationale myself,
> but I can say that the newer names make more sense. Take the example
> above: its function is to remove entire Annotations, not just to remove
> tags, so remove_Annotations is a better name.
>
> Brian O.
>
>
> On 10/31/06 1:08 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:
>
>> A bit OT: were we planning on getting rid of the various *_tag_*  
>> methods in
>> AnnotatableI at some point?  I'm a bit confused as to why they  
>> were added.
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bosborne11 at verizon.net  Tue Oct 31 11:37:01 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 31 Oct 2006 12:37:01 -0400
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <001401c6fd0f$4239aa50$15327e82@pyrimidine>
Message-ID: <C16CF3EE.B1A9%bosborne11@verizon.net>

Chris,

I don't think the intent was to remove the methods, rather we'd just call
deprecated(). Example from AnnotatableI:

sub remove_tag {
  my ($self, @args) = @_;

  #uncomment in 1.6
  #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');

  return $self->annotation->remove_Annotations(@args);
}

With regards to "why", I can't reconstruct the entire rationale myself, but I
can say that the newer names make more sense. Take the example above: its
function is to remove entire Annotations, not just to remove tags, so
remove_Annotations is a better name.
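
To make the pattern concrete, here is a self-contained sketch of an old
method name kept as a thin wrapper that warns and then delegates to the new
name. My::Annotatable and its internal hash layout are invented for this
example and are not the real Bio::AnnotatableI or annotation collection, and
a plain warn stands in for the deprecated() call.

```perl
use strict;
use warnings;

# Toy class for illustration only; not the real Bio::AnnotatableI.
package My::Annotatable;

sub new {
    my $class = shift;
    return bless { annotations => {} }, $class;
}

sub add_Annotation {
    my ($self, $key, $value) = @_;
    push @{ $self->{annotations}{$key} }, $value;
    return;
}

# New-style name: removes and returns every annotation stored under the keys.
sub remove_Annotations {
    my ($self, @keys) = @_;
    return map { @{ delete($self->{annotations}{$_}) || [] } } @keys;
}

# Old-style name kept as a thin wrapper that warns, then delegates.
sub remove_tag {
    my ($self, @args) = @_;
    warn "remove_tag() is deprecated, use remove_Annotations()\n";
    return $self->remove_Annotations(@args);
}

package main;

my $obj = My::Annotatable->new;
$obj->add_Annotation(comment => 'first');
$obj->add_Annotation(comment => 'second');

my @removed = $obj->remove_tag('comment');   # warns on STDERR
print "removed: @removed\n";
```

Existing callers keep working and get a warning that names the replacement,
which is the transition the commented-out line appears to be aiming for.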

Brian O.


On 10/31/06 1:08 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> A bit OT: were we planning on getting rid of the various *_tag_* methods in
> AnnotatableI at some point?  I'm a bit confused as to why they were added.


From cjfields at uiuc.edu  Tue Oct 31 13:44:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 12:44:02 -0600
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <ACF19E78-7FC3-42BE-8F41-86C45C710F4B@gmx.net>
Message-ID: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>

Hilmar Lapp wrote:
> Well isn't this a result of conflating some of the
> SeqFeatureI methods into the annotation collection?
> 
> If I'm not mistaken on this then those methods were
> introduced in 1.5.0 and hence can go away without deprecation.
> 
> 	-hilmar
> 
> On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote:
> 
>> Chris,
>> 
>> I don't think the intent was to remove the methods, rather we'd just
>> call deprecated(). Example from AnnotatableI:
>> 
>> sub remove_tag {
>>   my ($self, @args) = @_;
>> 
>>   #uncomment in 1.6
>>   #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');
>> 
>>   return $self->annotation->remove_Annotations(@args);
>> }
>> 
>> With regards to "why", I can't reconstruct the entire rationale
>> myself but I can say that the newer names make more sense. Take the
>> example above: its function is to remove entire Annotations, not
>> just to remove tags, so remove_Annotations is a better name.
>> 
>> Brian O.
>> 
>> 
>> On 10/31/06 1:08 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:
>> 
>>> A bit OT: were we planning on getting rid of the various *_tag_*
>>> methods in AnnotatableI at some point?  I'm a bit confused as to why
>>> they were added.

Sorry Brian, what I meant was, based on CVS history, the various *tag*
methods in AnnotatableI were added all at once, with deprecations already
present in the commit.  So the methods weren't there to begin with, then
added only to be deprecated later?  Hence the confusion...

I think Hilmar's right; the CVS history indicates these were added just
prior to rel. 1.5 by Allen and seem to be related to SeqFeatureI.  I'm sure
the intent was good, but they contradict methods in the Feature/Annotation
HOWTO on retrieving Annotation objects via the Annotation::Collection
object.  I think that agrees with your point about the various Annotation*
method names being the more appropriate ones.  

Does everybody agree we should just remove them?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 31 13:53:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 12:53:16 -0600
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net>
Message-ID: <000001c6fd1d$d4359c80$15327e82@pyrimidine>


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp
> Sent: Tuesday, October 31, 2006 11:02 AM
> To: n.haigh at sheffield.ac.uk
> Cc: Bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
> 
> The only thing I would add to Jason's reply is that it is easy to do
> 
> 	if (! $seq->isa("Bio::SeqI")) {
> 		my $bioseq = Bio::Seq->new();
> 		$bioseq->primary_seq($seq);
> 		$seq = $bioseq;
> 	}
> 
> and from that point on all your objects are Bio::SeqI 
> compliant regardless of whether they were obtained that way or not.
> 
> Aside from that I wonder why there isn't a -primary_seq 
> option in Bio::Seq::new - this would shorten the above into a 
> (more perl'ish) single line:
> 
> 	$seq = Bio::Seq->new(-primary_seq=>$seq) unless 
> $seq->isa("Bio::SeqI");
> 
> Any takers to add that capability?
> 
> -hilmar

Sounds good to me!

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
From nhansen at nhgri.nih.gov  Tue Oct 31 14:51:23 2006
From: nhansen at nhgri.nih.gov (Nancy Hansen)
Date: Tue, 31 Oct 2006 14:51:23 -0500 (EST)
Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling
Message-ID: <Pine.GSO.4.58.0610311438470.17750@stout.nhgri.nih.gov>


Hello,

	As sequencing centers begin to deposit trace data from "Medical
Sequencing" projects into the public archives, there is now the need to
"anonymize" sequence trace files by removing embedded information which
might be used to identify the individual who was the original source of
the DNA being sequenced.

	I was hoping I might be able to use Bio::SeqIO to manipulate the
comments contained in an SCF-formatted trace file, but I'm finding that
Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information.
Since SCF is a widely-accepted standard for trace files, would it be
reasonable to include fields like "scf_comments" and "scf_header" in a
Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them?
Likewise, it would be great if write_seq could pull these values right
from a SequenceTrace object rather than requiring them as arguments.

	I'd be happy to help in this effort if necessary.

	Thanks,
	--Nancy

*************************************
Nancy F. Hansen, PhD	nhansen at nhgri.nih.gov
Bioinformatics Group
NIH Intramural Sequencing Center (NISC)
5625 Fishers Lane
Rockville, MD 20852
Phone: (301) 435-1560	Fax: (301) 435-6170

From lincoln.stein at gmail.com  Tue Oct 31 15:24:17 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 31 Oct 2006 15:24:17 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <000001c6f78b$d1c65a30$15327e82@pyrimidine>
References: <453E309B.9090007@sendu.me.uk>
	<000001c6f78b$d1c65a30$15327e82@pyrimidine>
Message-ID: <6dce9a0b0610311224x79256b29sf102eb5c35865caf@mail.gmail.com>

Are you going to go ahead with 1.52_XX ? If so, I will code GBrowse to look
for 1.52 or higher.
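
For anyone coding a version check, a small sketch of the ordering concern
raised in the quoted discussion below. It only uses numbers that appear in
that discussion and the core version module; how Bioperl will finally number
its releases is not decided here.

```perl
use strict;
use warnings;
use version;   # core module since Perl 5.10

# Plain numeric comparison: both candidate numbers sort below 1.4.
print "1.05 < 1.4 numerically\n"     if 1.05 < 1.4;
print "1.006000 < 1.4 numerically\n" if 1.006000 < 1.4;

# version.pm reads a decimal version in groups of three digits.
my $pugs = version->parse('6.002013')->normal;   # dotted form of 6.002013
my $next = version->parse('1.006000')->normal;   # dotted form of 1.006000
print "6.002013 is $pugs, 1.006000 is $next\n";
```

So a number such as 1.006000 reads as 1.6.0 in dotted form but still compares
as smaller than 1.4, which is why a check written against 1.4 needs care.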

Lincoln

On 10/24/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> ..
> >
> > 'handle'? I think it shows up as '6.2.13' simply because it was uploaded
> > with the filename Perl6-Pugs-6.2.13.tar.gz
>
> Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is
> '6.002013'.  So maybe we should follow a similar convention.  Seems easier
> and less confusing to me, at least.
>
> > As you point out, the code has the kind of $VERSION number we've been
> > suggesting in this thread:
> >
> > > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':
> > >
> > > our $VERSION = 6.002013;
> > >
> > > That's also a very perlish-way to do it.  And there are no developer
> > > versions of Pugs, since it is always under active development.  We
> > > could try something like:
> > >
> > > our $VERSION = 1.005002_01;
> >
> > Yes, this was already like one of my suggestions (1.0502_01), but I
> > brought up the concern that 1.05 might be < 1.4.
> >
> > So then we have a question: do we try and fumble a 1.4 compatible number
> > by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if
> > it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no
> > room for RC numbering, or 1.006000010 (1.6.0.10) - the first final
> > release following some 1.006000_001 (1.6.0.01 == rc1) RCs?
>
> I would go for the clean break if it follows perl/CPAN convention.
> '1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing.
>
> If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6
> RC1, 1.6 RC2 etc then that would be consistent and perl-compatible.
>
> BTW, the reason I looked at Pugs was to see what some of the Perl6
> developers were using.  Who knows; they'll probably change it!
>
> ..
>
> > I don't think it would be a hassle; on the contrary it would be very
> > useful to know the CPAN distribution actually works. I'm very happy with
> > the idea that a release candidate gets fully tested...
>
> So you obviously feel strongly about it!  ;>
>
> I don't have a problem as long as we stick with doing this from now on
> (i.e. have a consistent versioning scheme, release policy, CPAN release
> policy, etc).  Would be nice for Jason/Brian/Hilmar to chime in as to the
> reasoning behind the older versioning scheme.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From hlapp at gmx.net  Tue Oct 31 16:53:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 16:53:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>
References: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>
Message-ID: <F244DEC6-0ADE-437E-9AED-1F864A54F7AD@gmx.net>


On Oct 31, 2006, at 1:44 PM, Chris Fields wrote:

> Does everybody agree we should just remove them?

I wish you could but I'm afraid that would break stuff? Otherwise why  
were they added in the first place? I thought  
Bio::SeqFeature::Annotated needs them maybe?

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct 31 17:41:17 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 16:41:17 -0600
Subject: [Bioperl-l] AnnotatableI tag methods,
	was  Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <F244DEC6-0ADE-437E-9AED-1F864A54F7AD@gmx.net>
Message-ID: <000001c6fd3d$ae37c240$15327e82@pyrimidine>


> On Oct 31, 2006, at 1:44 PM, Chris Fields wrote:
> 
> > Does everybody agree we should just remove them?
> 
> I wish you could but I'm afraid that would break stuff? 
> Otherwise why were they added in the first place? I thought 
> Bio::SeqFeature::Annotated needs them maybe?
> 
> 	-hilmar
> 
> --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Yep, removing them clobbers a ton of tests, including anything that requires
SeqIO::FTHelper.  Looks like SeqFeature::Generic and a few others use them.


I could understand if these were meant to be permanent methods, but why add
these in if they were to be deprecated in 1.6?  Something that was meant to
be a transition but wasn't finished?  That seems to be indicated in the
commented out lines for all the *tag* methods:

  #uncomment in 1.6
  #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
From lincoln.stein at gmail.com  Tue Oct 31 18:18:07 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 31 Oct 2006 18:18:07 -0500
Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning
In-Reply-To: <loom.20061020T041338-193@post.gmane.org>
References: <loom.20061020T041338-193@post.gmane.org>
Message-ID: <6dce9a0b0610311518l3bec852q5d04a9b488621377@mail.gmail.com>

Hi Keith,

The current Bio/DB/GFF/Util/Binning.pm file just contains the hierarchical
binning system that I implemented some time ago. Where is the R-tree system
that you describe? How much of an improvement did the R-tree scheme give
over the hierarchical scheme?

FYI, the GFF3 implementation uses a different binning scheme in which there
is a fixed-size bin. Every time a feature overlaps a bin, it creates a new
row in a table. So big features will have multiple rows and little features
that fit inside a bin will have only one row. The query for this is simpler
and seems to give the same relative speedup as the hierarchical binning
system. I'd really like to get these queries to go as fast as possible and
would love to work with you on this if you're interested.
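
To illustrate the fixed-size scheme, a rough sketch of how the bins for a
feature could be computed. The bin width and the function name are
assumptions made up for this example; the actual values and table layout used
by the GFF3 code are not given in this message.

```perl
use strict;
use warnings;

# Hypothetical bin width; the actual value used by the GFF3 code is not
# stated in this thread.
my $BIN_SIZE = 1000;

# Return the index of every fixed-size bin that [start, end] overlaps.
# One table row would be written per returned index.
sub bins_for_range {
    my ($start, $end) = @_;
    my $first = int($start / $BIN_SIZE);
    my $last  = int($end   / $BIN_SIZE);
    return ($first .. $last);
}

my @small = bins_for_range(100, 900);     # fits inside a single bin
my @large = bins_for_range(900, 3100);    # spans several bins
print "small feature: bins @small\n";
print "large feature: bins @large\n";
```

A range query would presumably look up only the bins its own interval
overlaps and then apply an exact start/end test to the rows found.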

Lincoln

On 10/19/06, Keith Player <keithplayer at hotmail.com> wrote:
>
> I know that there may be some changes resulting from new GFF3
> implementations, but thought I would see if the following is useful
> anyway.
>
> I implemented the R-tree binning schema as used by
> Bio::DB::GFF::Util::Binning and as mentioned in this article:
>
> I tested the following query on a normal table (no binning), but it
> assumes that you know the longest range in the table.  So for example
> with a table of human genes, where the longest gene we know of is around
> 2.4Mb.
>
> SELECT COUNT(*) as count FROM groups g
> WHERE g.start > max(0,[start-2.4Mb]) AND g.start < [end]
> AND g.end > [start] AND g.chromosome = '1'
>
> so for 100Mb:101Mb
>
> SELECT COUNT(*) as count FROM groups g
> WHERE g.start > 97600000 AND g.start < 101000000
> AND g.end > 100000000 AND g.chromosome = '1'
>
>
> where [start] and [end] define the region of interest.  This query
> outperforms the R-Tree implementation on all tests that I have performed
> (for lengths of 200bp to 10Mb across a whole chromosome).  Could this be
> of some practical use?
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From bosborne11 at verizon.net  Tue Oct 31 21:31:49 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 31 Oct 2006 22:31:49 -0400
Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling
In-Reply-To: <Pine.GSO.4.58.0610311438470.17750@stout.nhgri.nih.gov>
Message-ID: <C16D7F55.B1D9%bosborne11@verizon.net>

Nancy,

It looks like a good place to start would be the get_header() and
_get_header methods in Bio::SeqIO::scf. If you read t/scf.t you can see that
the author, at some point, wanted get_header to return meaningful
information but stepping through the test shows it returning a lot of UNDEF.
Now I don't know if this is due to the method or the source SCF file, but
you might be able to get these methods to work yourself.

But to answer your questions, yes, it certainly sounds reasonable that these
values would be extracted by Bio::SeqIO::scf.

Brian O.


On 10/31/06 3:51 PM, "Nancy Hansen" <nhansen at nhgri.nih.gov> wrote:

> 
> Hello,
> 
> As sequencing centers begin to deposit trace data from "Medical
> Sequencing" projects into the public archives, there is now the need to
> "anonymize" sequence trace files by removing embedded information which
> might be used to identify the individual who was the original source of
> the DNA being sequenced.
> 
> I was hoping I might be able to use Bio::SeqIO to manipulate the
> comments contained in an SCF-formatted trace file, but I'm finding that
> Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information.
> Since SCF is a widely-accepted standard for trace files, would it be
> reasonable to include fields like "scf_comments" and "scf_header" in a
> Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them?
> Likewise, it would be great if write_seq could pull these values right
> from a SequenceTrace object rather than requiring them as arguments.
> 
> I'd be happy to help in this effort if necessary.
> 
> Thanks,
> --Nancy
> 
> *************************************
> Nancy F. Hansen, PhD nhansen at nhgri.nih.gov
> Bioinformatics Group
> NIH Intramural Sequencing Center (NISC)
> 5625 Fishers Lane
> Rockville, MD 20852
> Phone: (301) 435-1560 Fax: (301) 435-6170


From cjfields at uiuc.edu  Sun Oct  1 13:05:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 12:05:25 -0500
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
References: <000001c6e3e6$81630010$15327e82@pyrimidine>	<6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net>
	<79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu>
	<451E3707.4090400@sendu.me.uk>
	<0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu>
	<84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
Message-ID: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>


On Sep 30, 2006, at 4:43 PM, Hilmar Lapp wrote:

>
> On Sep 30, 2006, at 10:57 AM, Chris Fields wrote:
>
>> There should be a failed test to let us know of the problem.  As
>> currently set up, the XEMBL server failure doesn't show up in
>> Test::Harness test summaries.  Biblio_biofetch.t had similar
>> problems before Brian's fixes.
>
> Just keep in mind that you may not want somebody's CPAN installation
> to fail (or require a 'forced' install) just because some server
> happens to be down for maintenance.
>
> 	-hilmar

I don't think this would be a problem unless users specifically set  
BIOPERLDEBUG to 1, which is something most people don't bother with  
before installation (and probably not something we should promote for  
normal installation anyway).  So, for CPAN installation we would  
suggest that BIOPERLDEBUG be 0 or not set at all, and outline the  
reasons why.

The idea is to retain current behavior (remote DB access will not be  
run unless BIOPERLDEBUG is set to 1) and apply it to all tests  
requiring such access.  Otherwise, just those tests are skipped (and  
not the rest of the tests, which occurs currently).  If BIOPERLDEBUG  
is set, the next tests would check the URL, which passes/fails (based  
on the specific value of $@), and runs/skips tests based on the mere  
presence of $@, which indicates some URL issue.  You can do this with  
Test::More, but I'm not sure this can be done with Test.pm or  
Test::Simple.
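
A minimal sketch of that flow with Test::More, under stated assumptions:
fetch_remote() is a hypothetical stand-in for a real remote request, and
SKETCH_SERVER_UP and the example.org URL are made up so the script runs
without a network.  Only BIOPERLDEBUG comes from the discussion above.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical stand-in for a remote database request.  It dies on any
# problem, the way a failed fetch would leave an error in $@.
sub fetch_remote {
    my ($url) = @_;
    die "cannot reach $url\n" unless $ENV{SKETCH_SERVER_UP};
    return "record from $url";
}

ok(1, 'local test, needs no network');

SKIP: {
    # Current behavior kept: no remote access unless BIOPERLDEBUG is set.
    skip('BIOPERLDEBUG not set, remote tests not run', 2)
        unless $ENV{BIOPERLDEBUG};

    my $record = eval { fetch_remote('http://example.org/db') };
    my $error  = $@;    # copy at once, later calls may reset $@

    # The URL check is a real test, so a dead server is reported as a failure.
    ok(!$error, 'remote server answered') or diag($error);

    # Only the tests that depend on the server are skipped.
    skip("remote server problem: $error", 1) if $error;

    like($record, qr/^record from/, 'remote record retrieved');
}
```

Either way the planned number of tests is reached, so the local tests in
the same file still run and only the remote ones are skipped.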

The current behavior just skips all tests based on a single failed  
URL.  Then, Test::Harness, as currently set, shows skipped tests as  
passed.  The last run I posted previously where XEMBL_DB.t remote DB  
tests failed, I also ran all tests (make test) and get this, which  
doesn't tell us that the remote URL failed:

-----------------------------------------

...
t/WABA.......................ok
t/XEMBL_DB...................ok
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext  
is not installed or is installed incorrectly - skipping ztr.t tests
ok
All tests successful, 5 subtests skipped.

-----------------------------------------


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Oct  1 13:17:24 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 12:17:24 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880609301610w7b838543t5f8ba7313d285915@mail.gmail.com>
References: <b99962880609271039s75cc4af4nc109cd637b5b267@mail.gmail.com>
	<7A592EAB-A869-4A6C-BFA8-F73F3DFD8F5B@gmx.net>
	<09FB1EB0-2C1C-4FCF-8339-E78556EFEFF2@uiuc.edu>
	<b99962880609280842w47401efnd6d00ff2a6e7fd98@mail.gmail.com>
	<8D75FE6D-C02D-4A86-93FA-B7256050AF11@uiuc.edu>
	<b99962880609280910i68a649fw38a4a77d514eccf@mail.gmail.com>
	<40155903-555A-4662-BCCE-38E5E3784118@uiuc.edu>
	<54E79A5F-5446-4D8E-AD26-B70894048D60@gmx.net>
	<b99962880609301444h3e0a8bd2y5d3ecb2ca9e222e6@mail.gmail.com>
	<1D69005A-DF0E-4F37-93FE-7577A32CC625@gmx.net>
	<b99962880609301610w7b838543t5f8ba7313d285915@mail.gmail.com>
Message-ID: <CAD572AC-B108-4520-8335-6B2F138905C9@uiuc.edu>

The '-w' flag on the shebang line is the source of those errors.  I  
never set it anymore on Windows due to this; I just use the 'use  
warnings' pragma.

If you use 'perl -I. t/test.t' you can normally get around the '-w'  
assumed by using 'make test'.
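
To illustrate why the pragma is easier to live with than the switch, here
is a small sketch of how 'use warnings' is lexically scoped.  It does not
reproduce the ActivePerl situation; the Demo::hello sub is made up, and the
string evals are only a convenient way to redefine a sub on purpose.

```perl
use strict;
use warnings;    # lexical: applies to this file, unlike the global -w switch

# Collect warnings instead of printing them.
my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

eval 'sub Demo::hello { 1 }';    # first definition, no warning
eval 'sub Demo::hello { 2 }';    # redefinition, warns under 'use warnings'

{
    # The 'redefine' category can be switched off for one block only.
    no warnings 'redefine';
    eval 'sub Demo::hello { 3 }';    # redefinition, silent
}

print "warnings seen: ", scalar(@seen), "\n";
print $seen[0] if @seen;
```

With -w on the shebang line the warnings are switched on for every module
that gets loaded, whether or not that module asked for them.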

I will try running tests on bioperl-db and bioperl tomorrow on WinXP  
to confirm these.

Chris

On Sep 30, 2006, at 6:10 PM, Seth Johnson wrote:

> How do I get rid of all of the warnings for "redefined subroutines"  
> during
> the test??  It clutters the output and I can't see the errors.
>
> On 9/30/06, Hilmar Lapp <hlapp at gmx.net> wrote:
>>
>> It doesn't shed more light but it does raise an alert flag. All tests
>> are supposed to pass. The fact that they don't means the problems you
>> are seeing have nothing to do with your specific data or script.
>>
>> First off - can anyone else confirm those errors using the latest
>> Bioperl-db and Bioperl?
>>
>> Second - Seth could you run those tests individually, e.g., using
>>
>>         $ make test test_02species TEST_VERBOSE=1
>>
>> and similarly for the other tests that have failures and post the
>> output. Let's start with 02species and 03simpleseq.
>>
>>         -hilmar
>>
>> On Sep 30, 2006, at 5:44 PM, Seth Johnson wrote:
>>
>>> There are errors during the test. Here's their summary:
>>> ____________________________
>>> Failed Test     Stat Wstat Total Fail  Failed  List of Failed
>>> -------------------------------------------------------------
>>> t\02species.t                 65    2   3.08%  63 65
>>> t\03simpleseq.t    1   256    59  106 179.66%  7-59
>>> t\04swiss.t                   52   14  26.92%  25 27-34 38-42
>>> t\12ontology.t     2   512   738 1471 199.32%  3-738
>>> t\16obda.t                    12    3  25.00%  10-12
>>> ____________________________
>>>
>>> Maybe that can shed some light on the problem?!?!
>>>
>>> On 9/29/06, Hilmar Lapp <hlapp at gmx.net> wrote:
>>> This may in fact be a knock-on effect of the fixes? <sigh>
>>>
>>> Seth, did you run the test suite that comes with bioperl-db, and did
>>> you get any errors?
>>>
>>>         -hilmar
>>>
>>> On Sep 28, 2006, at 2:26 PM, Chris Fields wrote:
>>>
>>>> Seth,
>>>>
>>>> The organism issue is a bug and has been reported, though I thought
>>>> it was fixed.
>>>>
>>>> The lack of the date and the version is a bit odd, but there have
>>>> been a lot of changes lately to bioperl-live (core bioperl in CVS),
>>>> and a few to bioperl-db.  How old is your bioperl and bioperl-db
>>>> installation?  Hilmar, any additional thoughts?
>>>>
>>>> Chris
>>>>
>>>> On Sep 28, 2006, at 11:10 AM, Seth Johnson wrote:
>>>>
>>>>> Thank you.  That takes care of that, however, I do have another
>>>>> gripe.  When running my script, quoted before, with
>>>>> "my $out = Bio::SeqIO->newFh('-format' => 'genbank');", I have
>>>>> several key pieces of information missing.  The most important one
>>>>> is the version number.  There's also a date missing, and source
>>>>> organism name is corrupted.  Here's what I get:
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>> LOCUS       NM_014580               2145 bp    dna     linear   UNK
>>>>> DEFINITION  Homo sapiens solute carrier family 2, (facilitated glucose
>>>>>             transporter) member 8 (SLC2A8), mRNA.
>>>>> ACCESSION   NM_014580
>>>>> SOURCE      sapiens.
>>>>>   ORGANISM  sapiens
>>>>>             Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; Bilateria;
>>>>>             Coelomata; Deuterostomia; Chordata; Craniata; Vertebrata;
>>>>>             Gnathostomata; Teleostomi; Euteleostomi; Sarcopterygii; Tetrapoda;
>>>>>             Amniota; Mammalia; Theria; Eutheria; Euarchontoglires; Primates;
>>>>>             Haplorrhini; Simiiformes; Catarrhini; Hominoidea; Hominidae;
>>>>>             Homo/Pan/Gorilla group; Homo.
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>> All of the missing information is stored in BioSQL and
>>>>> theoretically should be in the output. Here's how the NCBI genbank
>>>>> file looks:
>>>>>
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>> LOCUS       NM_014580               2145 bp    mRNA    linear   PRI 17-OCT-2005
>>>>> DEFINITION  Homo sapiens solute carrier family 2, (facilitated glucose
>>>>>             transporter) member 8 (SLC2A8), mRNA.
>>>>> ACCESSION   NM_014580
>>>>> VERSION     NM_014580.3  GI:51870928
>>>>> KEYWORDS    .
>>>>> SOURCE      Homo sapiens (human)
>>>>>   ORGANISM  Homo sapiens
>>>>>             Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
>>>>>             Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
>>>>>             Catarrhini; Hominidae; Homo.
>>>>>
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>>
>>>>> On 9/28/06, Chris Fields <cjfields at uiuc.edu> wrote:
>>>>>>
>>>>>> Those are from the excessively paranoid '-w' flag on the shebang
>>>>>> line.  If you remove the flag but add the 'use warnings' pragma the
>>>>>> 'subroutine x redefined' warnings go away.  This, BTW, is one of the
>>>>>> quirks of the ActivePerl distribution; other OSs don't have the same
>>>>>> problem.
>>>>>>
>>>>>> The 'solution' described on that page is actually a workaround, not
>>>>>> a bugfix.  It causes problems with stack traces with error handling
>>>>>> but seems harmless beyond that.  I haven't been able to find a
>>>>>> satisfactory fix which works on all OS's.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>
>>>>>> On Sep 28, 2006, at 10:42 AM, Seth Johnson wrote:
>>>>>>
>>>>>>> This is under Windows, but using ActiveState Komodo 3.5 and their
>>>>>>> latest Perl for Windows and latest BioPerl & BioPerl-db from CVS.
>>>>>>>
>>>>>>> I actually just stumbled upon a solution.  It's described in the
>>>>>>> "Installing Bioperl on Windows" by adding a comma after $class: in
>>>>>>> Bio::Root::Root throw() subroutine.  Thanks for hinting me about
>>>>>>> what I run it on.
>>>>>>>
>>>>>>> The code works now, BUT it spews whole bunch of warnings about
>>>>>>> "Subroutine .... redefined":
>>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 88.
>>>>>>> Subroutine object_id redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 128.
>>>>>>> Subroutine version redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 150.
>>>>>>> Subroutine authority redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 171.
>>>>>>> Subroutine namespace redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 192.
>>>>>>> Subroutine display_name redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 217.
>>>>>>> Subroutine description redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 241.
>>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 201.
>>>>>>> Subroutine verbose redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 234.
>>>>>>> Subroutine _register_for_cleanup redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 246.
>>>>>>> Subroutine _unregister_for_cleanup redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 256.
>>>>>>> Subroutine _cleanup_methods redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 263.
>>>>>>> Subroutine throw redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 316.
>>>>>>> Subroutine debug redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 379.
>>>>>>> Subroutine _load_module redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 398.
>>>>>>> Subroutine DESTROY redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 426.
>>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\RootI.pm line 117.
>>>>>>> Subroutine _initialize redefined at c:/Perl/site/lib/Bio\Root\RootI.pm line 128.
>>>>>>> ...
>>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>>
>>>>>>>
>>>>>>> On 9/28/06, Chris Fields <cjfields at uiuc.edu> wrote:
>>>>>>> I had problems with bioperl-db on native WinXP (not cygwin), but I
>>>>>>> did manage to get it running in cygwin with some effort.  The issue
>>>>>>> on native WinXP was related to Bio::Root::Root::throw(), though.
>>>>>>>
>>>>>>> There is a bug and workaround filed on Bugzilla, but I haven't
>>>>>>> worked on it in a while (and the workaround has some problems as
>>>>>>> well).  I may try running it again to see what happens.
>>>>>>>
>>>>>>> http://bugzilla.open-bio.org/show_bug.cgi?id=1938
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On Sep 28, 2006, at 9:04 AM, Hilmar Lapp wrote:
>>>>>>>
>>>>>>>> Very odd. This is under Windows, presumably using Cygwin?
>>>>>>>>
>>>>>>>> The method Bio::Root::Root::throw() clearly exists, and
>>>>>>>> PersistentObject inherits from it. The exception it was trying to
>>>>>>>> throw has nothing to do with failure or success to find the
>>>>>>>> database row (actually it did succeed since otherwise it wouldn't
>>>>>>>> construct the object) but with dynamically loading a class,
>>>>>>>> presumably Bio::DB::Persistent::Seq.
>>>>>>>>
>>>>>>>> Are you using the 1.5.x release of bioperl?
>>>>>>>>
>>>>>>>> Does anyone on the list have any experience with these sorts of
>>>>>>>> things on Windows?
>>>>>>>>
>>>>>>>> (Seth, I've moved this thread to the bioperl list, since this is
>>>>>>>> what the problem is about.)
>>>>>>>>
>>>>>>>>       -hilmar
>>>>>>>>
>>>>>>>> On Sep 27, 2006, at 1:39 PM, Seth Johnson wrote:
>>>>>>>>
>>>>>>>>> Hello guys,
>>>>>>>>>
>>>>>>>>> I successfully populated the biosql database, thanks to you.
>>>>>>>>> Now, I'm trying to retrieve a sequence from it following the
>>>>>>>>> example from BOSC2003 slides and ran into an uninformative error
>>>>>>>>> (at least to me it doesn't mean anything).  I suspect that I'm
>>>>>>>>> missing something and hope you can point me in the right
>>>>>>>>> direction.  Here's my source code:
>>>>>>>>>
>>>>>>>>> ----------------------------------------------------------------------
>>>>>>>>> #!/usr/bin/perl -w
>>>>>>>>> use strict;
>>>>>>>>> use warnings;
>>>>>>>>>
>>>>>>>>> use Bio::Seq;
>>>>>>>>> use Bio::Seq::SeqFactory;
>>>>>>>>> use Bio::DB::SimpleDBContext;
>>>>>>>>> use Bio::DB::BioDB;
>>>>>>>>>
>>>>>>>>> my $dbc = Bio::DB::SimpleDBContext->new(
>>>>>>>>>     -driver => 'mysql',
>>>>>>>>>     -dbname => 'BioSQL_1',
>>>>>>>>>     -host => ' 192.168.1.3',
>>>>>>>>>     -user => 'xxxxx',
>>>>>>>>>     -pass => 'xxxxxx'
>>>>>>>>> );
>>>>>>>>>
>>>>>>>>> my $db = Bio::DB::BioDB->new(-database  => 'biosql',
>>>>>>>>>                             -dbcontext => $dbc);
>>>>>>>>>
>>>>>>>>> my $seq = Bio::Seq->new(-accession_number => 'NM_014580',
>>>>>>>>>                         -namespace => 'refseq_H_sapiens');
>>>>>>>>> my $seqfact = Bio::Seq::SeqFactory->new(-type => 'Bio::Seq');
>>>>>>>>> my $adp = $db->get_object_adaptor($seq);
>>>>>>>>> my $dbseq = $adp->find_by_unique_key($seq, -obj_factory => $seqfact);
>>>>>>>>>
>>>>>>>>> my $out = Bio::SeqIO->newFh('-format' => 'EMBL');
>>>>>>>>> print $out $dbseq;
>>>>>>>>>
>>>>>>>>> exit;
>>>>>>>>>
>>>>>>>>> -----------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> Just when the "find_by_unique_key" function is executed I get the
>>>>>>>>> following error:
>>>>>>>>>
>>>>>>>>> ================================
>>>>>>>>> Undefined subroutine &Bio::Root::Root::throw called at
>>>>>>>>> c:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm line 199.
>>>>>>>>> ================================
>>>>>>>>>
>>>>>>>>> The sequence does exist in the database. I checked that.  Any
>>>>>>>>> ideas???
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Seth Johnson
>>>>>>>>> Senior Bioinformatics Associate
>>>>>>>>> _______________________________________________
>>>>>>>>> BioSQL-l mailing list
>>>>>>>>> BioSQL-l at lists.open-bio.org
>>>>>>>>> http://lists.open-bio.org/mailman/listinfo/biosql-l
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ===========================================================
>>>>>>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>>>>>>> ===========================================================
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> Christopher Fields
>>>>>>> Postdoctoral Researcher
>>>>>>> Lab of Dr. Robert Switzer
>>>>>>> Dept of Biochemistry
>>>>>>> University of Illinois Urbana-Champaign
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>>
>>>>>>>
>>>>>>> Seth Johnson
>>>>>>> Senior Bioinformatics Associate
>>>>>>>
>>>>>>> Ph: (202) 470-0900
>>>>>>> Fx: (775) 251-0358
>>>>>>
>>>>>> Christopher Fields
>>>>>> Postdoctoral Researcher
>>>>>> Lab of Dr. Robert Switzer
>>>>>> Dept of Biochemistry
>>>>>> University of Illinois Urbana-Champaign
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>>
>>>>>
>>>>> Seth Johnson
>>>>> Senior Bioinformatics Associate
>>>>>
>>>>> Ph: (202) 470-0900
>>>>> Fx: (775) 251-0358
>>>>
>>>> Christopher Fields
>>>> Postdoctoral Researcher
>>>> Lab of Dr. Robert Switzer
>>>> Dept of Biochemistry
>>>> University of Illinois Urbana-Champaign
>>>>
>>>>
>>>>
>>>
>>> --
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Best Regards,
>>>
>>>
>>> Seth Johnson
>>> Senior Bioinformatics Associate
>>>
>>> Ph: (202) 470-0900
>>> Fx: (775) 251-0358
>>
>> --
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>>
>
>
> -- 
> Best Regards,
>
>
> Seth Johnson
> Senior Bioinformatics Associate
>
> Ph: (202) 470-0900
> Fx: (775) 251-0358

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From osborne1 at optonline.net  Sun Oct  1 17:49:47 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Sun, 01 Oct 2006 17:49:47 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <20061001183214.GB12075@iucha.net>
Message-ID: <C145B03B.A8A5%osborne1@optonline.net>

Florin,

This is fixed in CVS now. What had happened is that the DIP file had some
minimal protein (node) entries where the only id available was DIP's
internal identifier. Not ideal to have to use these as accessions but
there's no other choice.

Thank you for the note, and in the future write to bioperl-l since there may
be others who are interested in hearing about what you've encountered.

Brian O.


On 10/1/06 2:32 PM, "Florin Iucha" <florin at iucha.net> wrote:

> Hello,
> 
> I have downloaded a CVS snapshot [1] of your module, bioperl-network, and
> I am using it to read the 20060402 edition release of the DIP [2] dataset.
> 
> Starting with the simple program you show in the man page:
> 
>    my $io = Bio::Network::IO->new(-format => 'psi',
>                                   -file   => $ARGV[0]);
> 
>    my $network = $io->next_network;
> 
> I get 772 instances of:
> 
>    Use of uninitialized value in string eq at
>    /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 326.
> 
> I don't know if it is just an annoyance or something bad, so you might
> want to take a look at it.
> 
> Thank you for your work,
> florin
> 
> [1] http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-network/
> [2] http://dip.doe-mbi.ucla.edu/


From osborne1 at optonline.net  Sun Oct  1 17:56:39 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Sun, 01 Oct 2006 17:56:39 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <20061001211844.GC12075@iucha.net>
Message-ID: <C145B1D7.A8A8%osborne1@optonline.net>

Florin,

I'm not seeing any segmentation fault using the same file you're using as
input (dip20060402.mif). I'm assuming you don't see this error when you use
smaller files as input, like those in the t/data directory.

When I watch the script in top I see Perl using about 135Mb (RSIZE) right
before the script exits. How much memory do you use?

Thank you for the note, and in the future write to bioperl-l since there may
be others who are interested in hearing about what you've encountered.

Brian O.


On 10/1/06 5:18 PM, "Florin Iucha" <florin at iucha.net> wrote:

> On Sun, Oct 01, 2006 at 01:32:14PM -0500, Florin Iucha wrote:
>> I have downloaded a CVS snapshot [1] of your module, bioperl-network, and
>> I am using it to read the 20060402 edition release of the DIP [2] dataset.
> 
> Using the attached script, I am getting a segmentation fault at the
> end, right after printing "That's all, Folks!"  Maybe some cleanup is
> going off in a wrong direction.
> 
> florin


From florin at iucha.net  Sun Oct  1 20:24:03 2006
From: florin at iucha.net (Florin Iucha)
Date: Sun, 1 Oct 2006 19:24:03 -0500
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <C145B1D7.A8A8%osborne1@optonline.net>
References: <20061001211844.GC12075@iucha.net>
	<C145B1D7.A8A8%osborne1@optonline.net>
Message-ID: <20061002002403.GD12075@iucha.net>

On Sun, Oct 01, 2006 at 05:56:39PM -0400, Brian Osborne wrote:
> I'm not seeing any segmentation fault using the same file you're using as
> input (dip20060402.mif). I'm assuming you don't see this error when you use
> smaller files as input, like those in the t/data directory.

The t/data files are fine.

Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the
MINT [1] database does not produce the crash.  It has a new warning, however:

   Can't call method "text" on an undefined value at
   /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290.

> When I watch the script in top I see Perl using about 135Mb (RSIZE) right
> before the script exits. How much memory do you use?

"ps ux" tells me VSZ = 272788 and RSZ = 254992. This is on x86-64 with
64 bit perl.  The box has 2 GB of physical memory so these numbers
don't seem to be a concern.

> Thank you for the note, and in the future write to bioperl-l since there may
> be others who are interested in hearing about what you've encountered.

Do'h! You have the list address loud and clear in three places, but I got
your contact info from the AUTHORS.  Will use the proper channel from now
on!

Thanks,
florin

[1] ftp://mint.bio.uniroma2.it/pub/release/psi1/

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra

From cjfields at uiuc.edu  Mon Oct  2 00:35:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 23:35:22 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880609301635w421fae0er3497ba655679f0bc@mail.gmail.com>
Message-ID: <000001c6e5dc$2eceabe0$15327e82@pyrimidine>

Seth,

What version of MySQL and perl are you using?  I'm using MySQL 5.0.18 (but
am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819.

I ran into a few problems with bioperl-db tests which were unrelated to the
ones below, but I'm wondering if it is a difference in MySQL versions.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Seth Johnson
> Sent: Saturday, September 30, 2006 6:35 PM
> To: Hilmar Lapp
> Cc: Chris Fields; Bioperl List
> Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> 
> Here're complete test details:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

...

> FAILED tests 10-12
>     Failed 3/12 tests, 75.00% okay
> Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> --------------------------------------------------------------------------
> -----
> t\02species.t                 65    2   3.08%  63 65
> t\03simpleseq.t    1   256    59  106 179.66%  7-59
> t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> t\12ontology.t     2   512   738 1471 199.32%  3-738
> t\16obda.t                    12    3  25.00%  10-12


From torsten.seemann at infotech.monash.edu.au  Mon Oct  2 02:06:50 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Mon, 02 Oct 2006 16:06:50 +1000
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
References: <451C8ED8.2060003@infotech.monash.edu.au>
	<451CC40D.2030401@sendu.me.uk>
	<2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
Message-ID: <4520AC7A.1050009@infotech.monash.edu.au>


 >>> I have removed all use/@ISA Bio::Root::Object references from
 >>> bioperl-live, except for those in Bio::Root::* itself:

 >> So I'd say they're both relics that can be removed. In fact I was
 >> planning on getting rid off all references to both of these modules
 >> before you did, so thanks! :)

> I think they can go. It's probably a pre-1.0 deprecation that somehow  
> was never followed through on.

Today I did a fresh CVS checkout of bioperl-live, and deleted the 
following modules and tests, and all tests passed with BIOPERLDEBUG=0

     * Bio::Root::Err
     * Bio::Root::Global
     * Bio::Root::IOManager
     * Bio::Root::Object
     * Bio::Root::Storable
     * Bio::Root::Utilities  # may be used by third parties?
     * Bio::Root::Vector
     * Bio::Root::Xref
     * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
     * t/RootStorable.t

Should we schedule for deprecation, or deprecate immediately as Hilmar 
suggested they were meant to be deprecated long ago ?

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From bix at sendu.me.uk  Mon Oct  2 05:40:02 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 10:40:02 +0100
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
References: <000001c6e3e6$81630010$15327e82@pyrimidine>	<6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net>	<79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu>	<451E3707.4090400@sendu.me.uk>	<0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu>	<84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
	<3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
Message-ID: <4520DE72.4000603@sendu.me.uk>

Chris Fields wrote:
>
> The idea is to retain current behavior (remote DB access will not be  
> run unless BIOPERLDEBUG is set to 1) and apply it to all tests  
> requiring such access.  Otherwise, just those tests are skipped (and  
> not the rest of the tests, which occurs currently).  If BIOPERLDEBUG  
> is set, the next tests would check the URL, which passes/fails (based  
> on the specific value of $@), and runs/skips tests based on the mere  
> presence of $@, which indicates some URL issue.  You can do this with  
> Test::More, but I'm not sure this can be done with Test.pm or  
> Test::Simple.

Firstly, BIOPERLDEBUG should not be abused; it should be used only when 
you want to see extra debugging messages. There should be another 
variable that you can set to choose if network-requiring tests are run, 
and it should also be a configurable choice when you run perl Makefile.PL.

(But changing this isn't going to happen for 1.5.2)

When the server problem is ambiguous we should not fail the test. Just 
make the skip message visible and pass all ok...


> The current behavior just skips all tests based on a single failed  
> URL.  Then, Test::Harness, as currently set, shows skipped tests as  
> passed.  The last run I posted previously where XEMBL_DB.t remote DB  
> tests failed, I also ran all tests (make test) and get this, which  
> doesn't tell us that the remote URL failed:
> 
> -----------------------------------------
> 
> ...
> t/WABA.......................ok
> t/XEMBL_DB...................ok
> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext  
> is not installed or is installed incorrectly - skipping ztr.t tests
> ok
> All tests successful, 5 subtests skipped.

All you have to do to make it visible is start the skip message with the 
word 'Skip':

skip('Skip server may be down',1);

...
t/WABA.......................ok 

t/XEMBL_DB...................ok 

         1/9 skipped: server may be down
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is 
not installed or is installed incorrectly - skipping ztr.t tests
t/ztr........................ok


It's nicer when using Test::More.


From bix at sendu.me.uk  Mon Oct  2 05:55:27 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 10:55:27 +0100
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au>
References: <451C8ED8.2060003@infotech.monash.edu.au>	<451CC40D.2030401@sendu.me.uk>	<2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
	<4520AC7A.1050009@infotech.monash.edu.au>
Message-ID: <4520E20F.6040406@sendu.me.uk>

Torsten Seemann wrote:
>  >>> I have removed all use/@ISA Bio::Root::Object references from
>  >>> bioperl-live, except for those in Bio::Root::* itself:
> 
>  >> So I'd say they're both relics that can be removed. In fact I was
>  >> planning on getting rid of all references to both of these modules
>  >> before you did, so thanks! :)
> 
>> I think they can go. It's probably a pre-1.0 deprecation that somehow  
>> was never followed through on.
> 
> Today I did a fresh CVS checkout of bioperl-live, and deleted the 
> following modules and tests, and all tests passed with BIOPERLDEBUG=0
> 
>      * Bio::Root::Err
>      * Bio::Root::Global
>      * Bio::Root::IOManager
>      * Bio::Root::Object
>      * Bio::Root::Storable
>      * Bio::Root::Utilities  # may be used by third parties?
>      * Bio::Root::Vector
>      * Bio::Root::Xref
>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
>      * t/RootStorable.t
> 
> Should we schedule for deprecation, or deprecate immediately as Hilmar 
> suggested they were meant to be deprecated long ago ?

I'm happy to get rid of them all straight away. Does anyone object?


From florin at iucha.net  Sun Oct  1 21:40:07 2006
From: florin at iucha.net (Florin Iucha)
Date: Sun, 1 Oct 2006 20:40:07 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on
	AMD64
Message-ID: <20061002014007.GG12075@iucha.net>

Hello,

I am trying to install bioperl-network from CVS.  I found this to
require bioperl from CVS, which requires bioperl-ext from CVS.
I have compiled and installed io_lib 1.10.1.

After running "perl Makefile.PL; make test" in bioperl-ext I see a lot of
sources being compiled, then:

cc -c  -I./libs -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O2   -DVERSION=\"1.5.1\" -DXS_VERSION=\"1.5.1\" -fPIC "-I/usr/lib/perl/5.8/CORE"  -DPOSIX -DNOERROR Align.c
Running Mkbootstrap for Bio::Ext::Align ()
chmod 644 Align.bs
rm -f ../blib/arch/auto/Bio/Ext/Align/Align.so
cc  -shared -L/usr/local/lib Align.o  -o ../blib/arch/auto/Bio/Ext/Align/Align.so libs/libsw.a  \
           -lm          \

/usr/bin/ld: libs/libsw.a(aln.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
libs/libsw.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[1]: *** [../blib/arch/auto/Bio/Ext/Align/Align.so] Error 1
make[1]: Leaving directory `/scratch/dmbio/tools/bioperl-ext/Bio/Ext/Align'
make: *** [subdirs] Error 2

This is on a Debian AMD64 box:

florin at zeus $ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release x86_64-linux-gnu
Thread model: posix
gcc version 4.1.2 20060901 (prerelease) (Debian 4.1.1-13)
florin at zeus $ perl -V
Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.6.16-1-vserver-amd64-k8, archname=x86_64-linux-gnu-thread-multi
    uname='linux excelsior 2.6.16-1-vserver-amd64-k8 #2 smp tue apr 4 03:40:49 utc 2006 x86_64 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.8 -Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=define uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion='', gccversion='4.1.2 20060729 (prerelease) (Debian 4.1.1-10)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.3.6.so, so=so, useshrplib=true, libperl=libperl.so.5.8.8
    gnulibc_version='2.3.6'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'


Characteristics of this binary (from libperl):
  Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT
                        PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_64_BIT_ALL
                        USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES
                        USE_PERLIO USE_REENTRANT_API

The compiler command line for aln.o is lacking -fPIC:

cc -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN
-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -DPOSIX -DNOERROR   -c -o aln.o aln.c

Adding -fPIC to the CCFLAGS variable in Bio/Ext/Align/Makefile and
Makefile seems to take the build further, but it fails with a similar
error in Bio/SeqIO/staden/_Inline/build/Bio/SeqIO/staden/read. That
Makefile seems to be regenerated every time I run 'make test' in the
top level directory.
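
For reference, a minimal sketch of how the flag could be derived from
perl's own configuration instead of being edited into the generated
Makefile by hand. The helper name pic_ccflags() is hypothetical, and
whether the makefile that builds libs/libsw.a would honour a CCFLAGS
value passed this way is an assumption, not something verified here.

```perl
use strict;
use warnings;
use Config;

# Combine perl's own compiler flags with the flag it uses for
# position-independent code ($Config{cccdlflags}, '-fPIC' on this box;
# it may be empty on platforms that do not need one).
sub pic_ccflags {
    my @flags = grep { defined($_) && length($_) }
                $Config{ccflags}, $Config{cccdlflags};
    return join ' ', @flags;
}

print pic_ccflags(), "\n";

# In a Makefile.PL the string could then be handed to MakeMaker, e.g.
#   WriteMakefile( ..., CCFLAGS => pic_ccflags() );
```

This does nothing for io_lib's libread.a, which is built outside perl
and would need -fPIC in its own build.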

The error in ../staden/read is:

rm -f blib/arch/auto/Bio/SeqIO/staden/read/read.so
cc  -shared -L/usr/local/lib read.o  -o blib/arch/auto/Bio/SeqIO/staden/read/read.so    \
           -L/usr/local/lib -lread -lz          \

/usr/bin/ld: /usr/local/lib/libread.a(libread_a-Read.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
/usr/local/lib/libread.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [blib/arch/auto/Bio/SeqIO/staden/read/read.so] Error 1

So, the questions appear to be:
   - should "-fPIC" be appended to CFLAGS in the generated Makefiles?
   - is there anything wrong with io_lib flags?
   - has anybody built bioperl-ext on AMD64?

I can help with debugging or testing if given a gentle nudge in the right
direction, but I have little experience with the interactions between perl
and static libraries on 64 bit.

Thanks,
florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061001/bc134c7e/attachment-0002.bin>

From bix at sendu.me.uk  Mon Oct  2 06:52:47 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 11:52:47 +0100
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
 on	AMD64
In-Reply-To: <20061002014007.GG12075@iucha.net>
References: <20061002014007.GG12075@iucha.net>
Message-ID: <4520EF7F.40908@sendu.me.uk>

Florin Iucha wrote:
> Hello,
> 
> I am trying to install bioperl-network from CVS.  I found this to
> require bioperl from CVS, which requires bioperl-ext from CVS.

I can't help with the compile problems you encountered (other than to 
say I also have problems under AMD64), but from where did you get the 
idea that bioperl (live/core) requires bioperl-ext? It doesn't, though 
recent changes to Makefile.PL may give that impression...


From cjfields at uiuc.edu  Mon Oct  2 08:26:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 07:26:57 -0500
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <4520DE72.4000603@sendu.me.uk>
References: <000001c6e3e6$81630010$15327e82@pyrimidine>	<6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net>	<79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu>	<451E3707.4090400@sendu.me.uk>	<0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu>	<84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
	<3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
	<4520DE72.4000603@sendu.me.uk>
Message-ID: <DAAC7FDC-0C03-4345-9E09-DBF04D521628@uiuc.edu>


On Oct 2, 2006, at 4:40 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>> The idea is to retain current behavior (remote DB access will not be
>> run unless BIOPERLDEBUG is set to 1) and apply it to all tests
>> requiring such access.  Otherwise, just those tests are skipped (and
>> not the rest of the tests, which occurs currently).  If BIOPERLDEBUG
>> is set, the next tests would check the URL, which passes/fails (based
>> on the specific value of $@), and runs/skips tests based on the mere
>> presence of $@, which indicates some URL issue.  You can do this with
>> Test::More, but I'm not sure this can be done with Test.pm or
>> Test::Simple.
>
> Firstly, BIOPERLDEBUG should not be abused; it should be used only  
> when
> you want to see extra debugging messages. There should be another
> variable that you can set to choose if network-requiring tests are  
> run,
> and it should also be a configurable choice when you run perl  
> Makefile.PL.
>
> (But changing this isn't going to happen for 1.5.2)
>
> When the server problem is ambiguous we should not fail the test. Just
> make the skip message visible and pass all ok...

I agree, as well as with your assessment of BIOPERLDEBUG (which I  
alluded to in a previous post).  Torsten suggested creating a new  
env. variable for network tests.

It's obvious this won't be done before 1.5.2, but we can make plans  
towards the next release.

>> The current behavior just skips all tests based on a single failed
>> URL.  Then, Test::Harness, as currently set, shows skipped tests as
>> passed.  The last run I posted previously where XEMBL_DB.t remote DB
>> tests failed, I also ran all tests (make test) and get this, which
>> doesn't tell us that the remote URL failed:
>>
>> -----------------------------------------
>>
>> ...
>> t/WABA.......................ok
>> t/XEMBL_DB...................ok
>> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext
>> is not installed or is installed incorrectly - skipping ztr.t tests
>> ok
>> All tests successful, 5 subtests skipped.
>
> All you have to do to make it visible is start the skip message  
> with the
> word 'Skip':
>
> skip('Skip server may be down',1);
>
> ...
> t/WABA.......................ok
>
> t/XEMBL_DB...................ok
>
>          1/9 skipped: server may be down
> t/ztr........................Bio::SeqIO::staden::read of bioperl- 
> ext is
> not installed or is installed incorrectly - skipping ztr.t tests
> t/ztr........................ok
>
>
> It's nicer when using Test::More.

Okay, if Test::Harness picks that up it would be okay.  We could use  
skip blocks to skip subsets of tests that require remote access (like  
SeqFeature.t) as opposed to skipping all tests.

I think we want to avoid promoting running tests with BIOPERLDEBUG  
(or similar) upon installation for everyday installation anyway (such  
as from CPAN, which Hilmar points out).  It's not something everybody  
installing a new BioPerl should be running unless they run into  
problems.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From florin at iucha.net  Mon Oct  2 08:15:06 2006
From: florin at iucha.net (Florin Iucha)
Date: Mon, 2 Oct 2006 07:15:06 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
	on	AMD64
In-Reply-To: <4520EF7F.40908@sendu.me.uk>
References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk>
Message-ID: <20061002121506.GB14409@iucha.net>

On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
> Florin Iucha wrote:
> > I am trying to install bioperl-network from CVS.  I found this to
> > require bioperl from CVS, which requires bioperl-ext from CVS.
> 
> I can't help with the compile problems you encountered (other than to 
> say I also have problems under AMD64), but from where did you get the 
> idea that bioperl (live/core) requires bioperl-ext? It doesn't, though 
> recent changes to Makefile.PL may give that impression...

The tests for bioperl-live mention in some places that 'this
test has been skipped since $foo is not available' and I found the
'foos' in bioperl-ext.

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061002/8fc9df03/attachment-0002.bin>

From bix at sendu.me.uk  Mon Oct  2 10:05:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 15:05:11 +0100
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
 on	AMD64
In-Reply-To: <20061002121506.GB14409@iucha.net>
References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk>
	<20061002121506.GB14409@iucha.net>
Message-ID: <45211C97.2060800@sendu.me.uk>

Florin Iucha wrote:
> On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
>> Florin Iucha wrote:
>>> I am trying to install bioperl-network from CVS.  I found this to
>>> require bioperl from CVS, which requires bioperl-ext from CVS.
>> I can't help with the compile problems you encountered (other than to 
>> say I also have problems under AMD64), but from where did you get the 
>> idea that bioperl (live/core) requires bioperl-ext? It doesn't, though 
>> recent changes to Makefile.PL may give that impression...
> 
> Running the tests for bioperl-live mention in some places that 'this
> test has been skipped since $foo is not available' and I found the
> 'foos' in bioperl-ext.

Right, yes. The idea is, you'd only need to install bioperl-ext if you 
wanted to use the modules that the complaining tests test.
So if none of the things that were skipped matter to you, don't install ext.

I guess this needs to be clarified in documentation somewhere.


From cjfields at uiuc.edu  Mon Oct  2 10:13:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:13:56 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au>
Message-ID: <001801c6e62d$02c883d0$15327e82@pyrimidine>


>  >>> I have removed all use/@ISA Bio::Root::Object references from
>  >>> bioperl-live, except for those in Bio::Root::* itself:
> 
>  >> So I'd say they're both relics that can be removed. In fact I was
>  >> planning on getting rid of all references to both of these modules
>  >> before you did, so thanks! :)
> 
> > I think they can go. It's probably a pre-1.0 deprecation that somehow
> > was never followed through on.
> 
> Today I did a fresh CVS checkout of bioperl-live, and deleted the
> following modules and tests, and all tests passed with BIOPERLDEBUG=0
> 
>      * Bio::Root::Err
>      * Bio::Root::Global
>      * Bio::Root::IOManager
>      * Bio::Root::Object
>      * Bio::Root::Storable
>      * Bio::Root::Utilities  # may be used by third parties?
>      * Bio::Root::Vector
>      * Bio::Root::Xref
>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
>      * t/RootStorable.t
> 
> Should we schedule for deprecation, or deprecate immediately as Hilmar
> suggested they were meant to be deprecated long ago ?

I vote for quick deprecation; I had also noticed that these were superfluous
and added them as possible deprecations to the wiki page.  However, we need
to be careful about that 'third-party use' caveat you have for
Bio::Root::Utilities; there's another one with Bio::Root::Storable and
Ensembl:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2924/focus=2924

and it seems to have its users:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/8242/focus=8242

The others (including Bio::Root::Utilities) haven't had any major threads on
the mail lists in a very long time.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


> --
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Mon Oct  2 10:16:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:16:31 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of
	bioperl-exton	AMD64
In-Reply-To: <20061002121506.GB14409@iucha.net>
Message-ID: <001901c6e62d$5c4fac80$15327e82@pyrimidine>

They're not absolutely necessary; the tests are skipped w/o failure because
bioperl-ext is optional.  These are only necessary if you want the ability
to read sequence trace files.  

BTW, you might have a rough time trying to install bioperl-ext depending
on your platform.  Note the following bug report:

http://bugzilla.open-bio.org/show_bug.cgi?id=2074

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Florin Iucha
> Sent: Monday, October 02, 2006 7:15 AM
> To: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-
> exton AMD64
> 
> On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
> > Florin Iucha wrote:
> > > I am trying to install bioperl-network from CVS.  I found this to
> > > require bioperl from CVS, which requires bioperl-ext from CVS.
> >
> > I can't help with the compile problems you encountered (other than to
> > say I also have problems under AMD64), but from where did you get the
> > idea that bioperl (live/core) requires bioperl-ext? It doesn't, though
> > recent changes to Makefile.PL may give that impression...
> 
> Running the tests for bioperl-live mention in some places that 'this
> test has been skipped since $foo is not available' and I found the
> 'foos' in bioperl-ext.
> 
> florin
> 
> --
> If we wish to count lines of code, we should not regard them as lines
> produced but as lines spent.                       -- Edsger Dijkstra


From osborne1 at optonline.net  Mon Oct  2 10:14:13 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 10:14:13 -0400
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520E20F.6040406@sendu.me.uk>
Message-ID: <C14696F5.A903%osborne1@optonline.net>

Sendu,

No objection but someone should check the scripts in examples/root to make
sure that they are not used there.

Brian O.


On 10/2/06 5:55 AM, "Sendu Bala" <bix at sendu.me.uk> wrote:

> Torsten Seemann wrote:
>>>>> I have removed all use/@ISA Bio::Root::Object references from
>>>>> bioperl-live, except for those in Bio::Root::* itself:
>> 
>>>> So I'd say they're both relics that can be removed. In fact I was
>>>> planning on getting rid of all references to both of these modules
>>>> before you did, so thanks! :)
>> 
>>> I think they can go. It's probably a pre-1.0 deprecation that somehow
>>> was never followed through on.
>> 
>> Today I did a fresh CVS checkout of bioperl-live, and deleted the
>> following modules and tests, and all tests passed with BIOPERLDEBUG=0
>> 
>>      * Bio::Root::Err
>>      * Bio::Root::Global
>>      * Bio::Root::IOManager
>>      * Bio::Root::Object
>>      * Bio::Root::Storable
>>      * Bio::Root::Utilities  # may be used by third parties?
>>      * Bio::Root::Vector
>>      * Bio::Root::Xref
>>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
>>      * t/RootStorable.t
>> 
>> Should we schedule for deprecation, or deprecate immediately as Hilmar
>> suggested they were meant to be deprecated long ago ?
> 
> I'm happy to get rid of them all straight away. Does anyone object?


From johnson.biotech at gmail.com  Mon Oct  2 10:21:50 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 2 Oct 2006 10:21:50 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <000001c6e5dc$2eceabe0$15327e82@pyrimidine>
References: <b99962880609301635w421fae0er3497ba655679f0bc@mail.gmail.com>
	<000001c6e5dc$2eceabe0$15327e82@pyrimidine>
Message-ID: <b99962880610020721j776d3801m4f5b49cd1bdf66c6@mail.gmail.com>

I'm using MySQL 5.0.19 and Perl v5.8.7 [MSWin32-x86-multi-thread]

On 10/2/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> Seth,
>
> What version of MySQL and perl are you using?  I'm using MySQL 5.0.18 (but
> am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819.
>
> I ran into a few problems with bioperl-db tests which were unrelated to
> the ones below, but I'm wondering if it is a difference in MySQL versions.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
> > -----Original Message-----
> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > bounces at lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ...
>
> > FAILED tests 10-12
> >     Failed 3/12 tests, 75.00% okay
> > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> >
> --------------------------------------------------------------------------
> > -----
> > t\02species.t                 65    2   3.08%  63 65
> > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > t\16obda.t                    12    3  25.00%  10-12
>
>


-- 
Best Regards,


Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358


From osborne1 at optonline.net  Mon Oct  2 10:08:50 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 10:08:50 -0400
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
 on AMD64
In-Reply-To: <20061002014007.GG12075@iucha.net>
Message-ID: <C14695B2.A900%osborne1@optonline.net>

Florin,

Minor correction here, the Bioperl package does not require bioperl-ext.
However we see there is a problem compiling bioperl-ext...

Brian O.


On 10/1/06 9:40 PM, "Florin Iucha" <florin at iucha.net> wrote:

> I am trying to install bioperl-network from CVS.  I found this to
> require bioperl from CVS, which requires bioperl-ext from CVS.


From JK at novozymes.com  Mon Oct  2 10:05:34 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Mon, 2 Oct 2006 16:05:34 +0200
Subject: [Bioperl-l] Blast parser.
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net>


Hi. 

I've tried to use the blast-parser but I cannot get the original alignment
out of the parser. Is it possible to get that out of the 
Bio::Search::HSP::BlastHSP in some way? It seems quite odd to get a
clustalw alignment out when that isn't the type of alignment people are
used to getting from blast. 

Thanks 

Jesper


From cjfields at uiuc.edu  Mon Oct  2 10:36:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:36:31 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <C14696F5.A903%osborne1@optonline.net>
Message-ID: <001d01c6e630$27792fb0$15327e82@pyrimidine>

> Sendu,
> 
> No objection but someone should check the scripts in examples/root to make
> sure that they are not used there.
> 
> Brian O.

I suppose it's also possible that the other bioperl distributions (like
bioperl-run) could use them as well.  

If they do we can take care of them as they pop up.  These are really old
and haven't been revised in a long time.  

The only one I worry about is Bio::Root::Storable b/c of Ensembl.  Does
anyone know where Will Spooner is?  He's the maintainer for
Bio::Root::Storable.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct  2 11:01:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 10:01:44 -0500
Subject: [Bioperl-l] Blast parser.
In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net>
Message-ID: <000001c6e633$ad0a6ce0$15327e82@pyrimidine>

The alignment that you get should come from GenericHSP, not BlastHSP.
Either way, the HSP alignment that is retrieved using $hsp->get_aln() should
be a Bio::SimpleAlign object.  You can then output that to the proper
AlignIO format using an AlignIO stream object or use the Bio::SimpleAlign
methods for further analysis.  

use Bio::AlignIO;

# $hsp is an HSP object taken from the parsed BLAST report
my $aln = $hsp->get_aln();
my $alnout = Bio::AlignIO->new(-format => 'msf',
                               -fh     => \*STDOUT);
$alnout->write_aln($aln);

Quick note: not all AlignIO formats have write_aln() support at this time,
but most do.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of JK (Jesper Agerbo Krogh)
> Sent: Monday, October 02, 2006 9:06 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Blast parser.
> 
> 
> Hi.
> 
> I've tried to use the blast-parser but I cannot get the original alignment
> out of the parser. Is it possible to get that out of the
> Bio::Search::HSP::BlastHSP in some way? It seems quite odd to get a
> clustalw alignment out when that isn't the type of alignment people are
> used to getting from blast.
> 
> Thanks
> 
> Jesper
> 


From whs at ebi.ac.uk  Mon Oct  2 12:00:19 2006
From: whs at ebi.ac.uk (Will Spooner)
Date: Mon, 2 Oct 2006 17:00:19 +0100 (BST)
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <001d01c6e630$27792fb0$15327e82@pyrimidine>
References: <001d01c6e630$27792fb0$15327e82@pyrimidine>
Message-ID: <Pine.LNX.4.64.0610021651550.1560@parrot.ebi.ac.uk>

On Mon, 2 Oct 2006, Chris Fields wrote:

>> Sendu,
>>
>> No objection but someone should check the scripts in examples/root to make
>> sure that they are not used there.
>>
>> Brian O.
>
> I suppose it's also possible that the other bioperl distributions (like
> bioperl-run) could use them as well.
>
> If they do we can take care of them as they pop up.  These are really old
> and haven't been revised in a long time.
>
> The only one I worry about is Bio::Root::Storable b/c of Ensembl.  Does
> anyone know where Will Spooner is?  He's the maintainer for
> Bio::Root::Storable.
>

Hi Chris,

I'm still lurking...

If the tests for Bio::Root::Storable still pass (I assume that they do), 
then the module is working as advertised.

The idea behind Storable is very simple; object instances of any 
inheriting class can be serialised/retrieved from disk. BioPerl objects 
will probably not want this functionality by default, but it is trivial to 
implement if needed.
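
To illustrate the general idea only, here is a sketch that uses Perl's
core Storable module rather than Bio::Root::Storable itself, so none of
the calls below are that module's API. The class Demo::Feature and its
fields are invented for the example.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

package Demo::Feature;    # hypothetical class, for illustration only

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

sub name { $_[0]{name} }

package main;

my $obj = Demo::Feature->new(name => 'geneX', start => 10, end => 42);

# Temporary file that is removed when the program exits.
my (undef, $file) = tempfile(UNLINK => 1);

store($obj, $file);             # serialise the object to disk
my $copy = retrieve($file);     # read it back as a new, blessed object

print ref($copy), ' ', $copy->name, "\n";
```

The retrieved copy is a separate object of the same class, so its
methods can be called as usual.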

Will


From cjfields at uiuc.edu  Mon Oct  2 13:58:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 12:58:15 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <Pine.LNX.4.64.0610021651550.1560@parrot.ebi.ac.uk>
Message-ID: <000601c6e64c$5746f990$15327e82@pyrimidine>

> On Mon, 2 Oct 2006, Chris Fields wrote:
> 
> >> Sendu,
> >>
> >> No objection but someone should check the scripts in examples/root to
> make
> >> sure that they are not used there.
> >>
> >> Brian O.
> >
> > I suppose it's also possible that the other bioperl distributions (like
> > bioperl-run) could use them as well.
> >
> > If they do we can take care of them as they pop up.  These are really
> old
> > and haven't been revised in a long time.
> >
> > The only one I worry about is Bio::Root::Storable b/c of Ensembl.  Does
> > anyone know where Will Spooner is?  He's the maintainer for
> > Bio::Root::Storable.
> >
> 
> Hi Chris,
> 
> I'm still lurking...
> 
> If the tests for Bio::Root::Storable still pass (I assume that they do),
> then the module is working as advertised.
> 
> The idea behind Storable is very simple; object instances of any
> inheriting class can be serialised/retrieved from disk. BioPerl objects
> will probably not want this functionality by default, but it is trivial to
> implement if needed.
> 
> Will

Okay, nice to know you're listening in!  Based on that we should keep it in.
The rest that Torsten mentioned could probably be removed right away.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From osborne1 at optonline.net  Mon Oct  2 13:59:58 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 13:59:58 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <20061002002403.GD12075@iucha.net>
Message-ID: <C146CBDE.A938%osborne1@optonline.net>

Florin,

OK, this is fixed in CVS now. The problem is that there's some variability
in how the PSI MI "standard" is used. In this case there was a species that
was not given a value for its scientific name ("fullName"), I had to use
common name in its place. Fortunately there's an NCBI taxon id behind all
this.

Thanks again,

Brian O.


On 10/1/06 8:24 PM, "Florin Iucha" <florin at iucha.net> wrote:

> Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the
> MINT [1] database does not produce the crash.  It has a new warning, however:
> 
>    Can't call method "text" on an undefined value at
>    /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290.


From mmacho at gmail.com  Mon Oct  2 13:43:13 2006
From: mmacho at gmail.com (ende)
Date: Mon, 2 Oct 2006 19:43:13 +0200
Subject: [Bioperl-l] Variable scope
Message-ID: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>


	Hi

this may be a typical Perl topic and so off-topic for this list.  My
apologies for any inconvenience.

It is an annoying problem that is making me waste a lot of time.

I have a package with its new() method, etc., and constants in it like:

#-----
use constant False => 0;
use constant True => 1;

our %CLRFG = (
               PLASMIDO      => RED,
               POLY_A        => GREEN,
               RESTR_SITES   => BLUE,
               CONECTORS     => MAGENTA,
               CONTAMINANTS  => CYAN,
           );

our %CLRBG = (
               PLASMIDO      => "",
               POLY_A        => "",
               RESTR_SITES   => "",
               CONECTORS     => "",
               CONTAMINANTS  => "",
           );
#------

These constants are included with require "h.pl" from the main package
file.

I load this module with "use" from the main command-line driver to test
it.  In the command-line driver I can use the constants False and True
directly with no complaints, for example "return True", without any
reference to the origin of the constants.

But for the variables %CLRFG and %CLRBG (I would like them to be
constants too, but how?) I can't find a way of referring to them in the
module.  In the end I gave up and _copied_ the definitions to where I
needed them (in the sub where I print ANSI terminal colouring
sequences).  I can't find out how to refer to those variables from
outside the module.

I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Any help?


--
     Juan Falgueras
     Profesor del Depto. de Lenguajes y Ciencias de la Computación
     Universidad de Málaga


From cjfields at uiuc.edu  Mon Oct  2 16:52:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 15:52:11 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <C14696F5.A903%osborne1@optonline.net>
Message-ID: <000001c6e664$a25538d0$15327e82@pyrimidine>

I have updated the Deprecation page with the Bio::Root::* modules that we
plan on deprecating (note that I have them being removed for rel. 1.5.2).  I
have left out Bio::Root::Storable for now based on Will's response.  

http://www.bioperl.org/wiki/Deprecated_modules

I'll update the DEPRECATED doc in CVS as well.  There is a tentative
schedule for when warnings are added for modules before they are removed.  

In relation to the recent trend for house-cleaning, I noticed that the
Bio::Tools::BP* BLAST-related modules are all still present but haven't
been modified or had deprecation warnings added.  BPlite and the others
were marked for deprecation around rel. 1.5, since their functionality is
present in Bio::SearchIO.  Judging by the mail list, no one has
used these in quite a while, and everyone has been redirected to use
Bio::SearchIO instead.  Based on that I have added warnings in CVS for
deprecation to BPlite and the related modules BPpsilite and BPbl2seq.
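
As a rough illustration of what such a warning can look like (a sketch only: My::OldParser is a made-up stand-in, not the actual BPlite code, and the real modules may use BioPerl's own warning methods rather than Carp):

```perl
use strict;
use warnings;

# Hypothetical stand-in for a module being retired; not the real
# Bio::Tools::BPlite code.
package My::OldParser;
use Carp qw(carp);

sub new {
    my ($class, %args) = @_;
    # carp reports the warning from the caller's point of view,
    # so users see which line of their own script needs changing.
    carp "$class is deprecated; use Bio::SearchIO instead";
    return bless {%args}, $class;
}

package main;

# Collect warnings in an array so the example can show what was emitted.
my @seen;
local $SIG{__WARN__} = sub { push @seen, @_ };

my $parser = My::OldParser->new(-file => 'report.bls');
print "warned: $seen[0]";
```

Warning from the constructor rather than at load time means existing scripts keep running but every use is flagged.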

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Brian Osborne
> Sent: Monday, October 02, 2006 9:14 AM
> To: Sendu Bala; bioperl-l
> Subject: Re: [Bioperl-l] Do we need Bio::Root::Object anymore?
> 
> Sendu,
> 
> No objection but someone should check the scripts in examples/root to make
> sure that they are not used there.
> 
> Brian O.
> 
> 
> On 10/2/06 5:55 AM, "Sendu Bala" <bix at sendu.me.uk> wrote:
> 
> > Torsten Seemann wrote:
> >>>>> I have removed all use/@ISA Bio::Root::Object references from
> >>>>> bioperl-live, except for those in Bio::Root::* itself:
> >>
> >>>> So I'd say they're both relics that can be removed. In fact I was
> >>>> planning on getting rid of all references to both of these modules
> >>>> before you did, so thanks! :)
> >>
> >>> I think they can go. It's probably a pre-1.0 deprecation that somehow
> >>> was never followed through on.
> >>
> >> Today I did a fresh CVS checkout of bioperl-live, and deleted the
> >> following modules and tests, and all tests passed with BIOPERLDEBUG=0
> >>
> >>      * Bio::Root::Err
> >>      * Bio::Root::Global
> >>      * Bio::Root::IOManager
> >>      * Bio::Root::Object
> >>      * Bio::Root::Storable
> >>      * Bio::Root::Utilities  # may be used by third parties?
> >>      * Bio::Root::Vector
> >>      * Bio::Root::Xref
> >>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
> >>      * t/RootStorable.t
> >>
> >> Should we schedule for deprecation, or deprecate immediately as Hilmar
> >> suggested they were meant to be deprecated long ago ?
> >
> > I'm happy to get rid of them all straight away. Does anyone object?
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 


From florin at iucha.net  Mon Oct  2 16:47:01 2006
From: florin at iucha.net (Florin Iucha)
Date: Mon, 2 Oct 2006 15:47:01 -0500
Subject: [Bioperl-l] Variable scope
In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
Message-ID: <20061002204701.GG14409@iucha.net>

On Mon, Oct 02, 2006 at 07:43:13PM +0200, ende wrote:
> It is a annoying problem that is making me waste lot of time.
> 
> I have a package with its new object, etc... and constants in it like:
> 
> #-----
> use constant False => 0;
> use constant True => 1;
> 
> our %CLRFG = (
>                PLASMIDO      => RED,
>                POLY_A        => GREEN,
>                RESTR_SITES   => BLUE,
>                CONECTORS     => MAGENTA,
>                CONTAMINANTS  => CYAN,
>            );
> 
> our %CLRBG = (
>                PLASMIDO      => "",
>                POLY_A        => "",
>                RESTR_SITES   => "",
>                CONECTORS     => "",
>                CONTAMINANTS  => "",
>            );
> #------
> 
> this constants are include with require "h.pl" from the main package  
> file.
> 
> I use this module from the mail command line driver to test it  
> "using" it.  In the command line driver I can use with no gripe the  
> constants False and True directly, for example "return True", etc  
> without any reference to the origin of that constant.

It is possible you get them from somewhere else.

> But, with respect to the variables (I would like they also were  
> constants.. but how?), %CLRFG and %CLRBG I can't find the way of  
> refering those int the module.  Finally I have desisted and _copy_  
> the definitions where  I have needed it (in the sub were I print Ansi  
> terminal colouring seqs...).  I don't find how to refer those  
> variables out of the module.
> 
> I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Did you actually declare a package name in "h.pl" ?
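
A minimal sketch of why the package declaration matters (Colors is a made-up package name and the colour values are placeholders; in practice the first half would live in its own file):

```perl
use strict;
use warnings;

# Stand-in for the contents of a module file such as Colors.pm.
# "Colors" is a hypothetical name, not taken from the original post.
package Colors;

our %CLRFG = (
    PLASMIDO => 'red',
    POLY_A   => 'green',
);

# Back in the calling script.
package main;

# "our" variables are package variables, so the fully qualified name
# works from any package, even under "use strict".
my $fg = $Colors::CLRFG{POLY_A};
print "POLY_A is drawn in $fg\n";

# Without a "package" line, a file pulled in with require is compiled
# into package main, and the hash would be %main::CLRFG instead.
```

If "h.pl" has no package line, %modulename::CLRFG names a different (empty) hash, which would be consistent with the errors described.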

Is there any reason you don't call the file ".pm" and load it with
"use"?  I have attached a small example of importing that works.

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra
-------------- next part --------------
A non-text attachment was scrubbed...
Name: one.pm
Type: text/x-perl
Size: 118 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061002/cf339845/attachment-0006.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: two.pl
Type: text/x-perl
Size: 69 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061002/cf339845/attachment-0007.bin>

From Kevin.M.Brown at asu.edu  Mon Oct  2 19:44:50 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 2 Oct 2006 16:44:50 -0700
Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module
Message-ID: <1A4207F8295607498283FE9E93B775B4021960CD@EX02.asurite.ad.asu.edu>

Well, for anyone that wants to know, I found a way to capture the output
of ClustalW to get at things like the score.

Copy STDOUT to another handle
open(OUTCOPY, ">&STDOUT") or die "Couldn't dup STDOUT: $!";

Change where STDOUT goes
open(STDOUT, ">log.test") or die "Couldn't open log.test: $!";

Run the alignment and its output will be captured by the STDOUT
redirection
$aln = $factory->align(\@seq);

Restore STDOUT to its normal location for the rest of the script
close STDOUT;
open(STDOUT, ">&OUTCOPY") or die "Couldn't restore STDOUT: $!";

I guess I can understand why most of this is just dropped by the
ClustalW.pm module since there doesn't seem to be a way to hold it all
in a SimpleAlign object.
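
Put together as one runnable piece, the redirection might look like the sketch below. To keep it independent of ClustalW, a small child perl process stands in for $factory->align(\@seq); the "Alignment Score 1234" line and the regular expression are invented for the demo and are not ClustalW's real output format.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Temporary log file; the steps above used "log.test".
my ($log_fh, $log_name) = tempfile(UNLINK => 1);
close $log_fh;

# Duplicate the real STDOUT so it can be restored afterwards.
open(my $saved_stdout, '>&', \*STDOUT) or die "Couldn't dup STDOUT: $!";

# Point STDOUT (file descriptor 1) at the log file.
open(STDOUT, '>', $log_name) or die "Couldn't open $log_name: $!";

# Stand-in for $factory->align(\@seq): a child process inherits the
# redirected descriptor, so whatever it prints lands in the log file.
system($^X, '-e', 'print "Alignment Score 1234\n"') == 0
    or die "child process failed: $?";

# Restore STDOUT for the rest of the script.
close(STDOUT);
open(STDOUT, '>&', $saved_stdout) or die "Couldn't restore STDOUT: $!";

# Read the captured text back and pull out the number.
open(my $in, '<', $log_name) or die "Couldn't read $log_name: $!";
my $captured = do { local $/; <$in> };
close $in;

my ($score) = $captured =~ /Score\s+(\d+)/;
print "captured score: $score\n";
```

Lexical filehandles and the three-argument open are used here instead of the bareword OUTCOPY handle; the effect is the same.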

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Kevin Brown
> Sent: Thursday, September 28, 2006 2:48 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module
> 
> I've gotten a very simple script to run using bioperl that creates an
> alignment using clustalw of two sequences.  I see that clustal outputs
> to stdout information like the score, but I don't see any way to store
> that or retrieve that from the alignment object that is returned (unless
> I'm just blind).  What follows is my very basic script which used code
> found in the Wiki.
> 
> print $aln->score() spits out an error about using an uninitialized
> value.
> 
> 
> #!/usr/bin/perl -w
> 
> use strict;
> use Bio::SeqIO;
> use Bio::Perl;
> use Bio::AlignIO;
> use Getopt::Long qw(:config no_ignore_case bundling pass_through);
> use POSIX;
> use Bio::Tools::Run::Alignment::Clustalw;
> 
> my $fileName   = "";         # filename(s) to be parsed for information
> my $output_dir = "";
> my $format     = 'fasta';    # default format for SeqIO module
> 
> GetOptions(
>                    'file=s'   => \$fileName,
>                    'output=s' => \$output_dir,
>                   );
> 
> # Parse the input file for the needed information
> # SeqIO supports several normal formats including <tab>, <fasta> and <excel>
> 
> my @files = split(/\|/, $fileName);
> my @seq_array;
> 
> my $stream_out =
>   Bio::AlignIO->new(-file => '>test.msf', -format => 'msf', -flush => 0);
> 
> foreach my $fileName (@files)
> {
>         my $file = Bio::SeqIO->new(-format => $format, -file => $fileName);
>         my $seq;
>         while ($seq = $file->next_seq())
>         {
>                 push(@seq_array, $seq);
>         }
> }
> 
> my @params  = ('ktuple' => 2, 'matrix' => 'BLOSUM');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
> my $ktuple  = 3;
> $factory->ktuple($ktuple);    # change the parameter before executing
>     # where @seq_array is an array of {{PM|Bio::Seq}} objects
> 
> open my $out, ">seq.txt";
> 
> for (my $i = 1 ; $i <= $#seq_array ; $i++)
> {
>         my @seq = ($seq_array[0], $seq_array[$i]);
>         my $aln = $factory->align(\@seq);
>         $stream_out->write_aln($aln);
>         print $aln->score;
>         for my $seq ($aln->each_seq) {
>                 print $out $seq->display_id() ."\t". $seq->seq()."\n";
>         }
> }
> 


From bix at sendu.me.uk  Mon Oct  2 19:48:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 00:48:34 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
Message-ID: <4521A552.60301@sendu.me.uk>

Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
upload tar.gz files when I have access to the server, then reply here 
with links.

In the meantime, see http://www.bioperl.org/wiki/Release_1.5.2 for 
instructions on getting and testing this RC.

Developers:
   Make sure you're in the AUTHORS file in all 4 packages, as
   appropriate.

Users:
   Even though 1.5.2 is a 'developer' release, we consider it the most
   stable and capable version of Bioperl, and recommend that you use
   it in all but the most critical production environments. Please
   try it out and let us know of any problems or difficulties you run
   into.


Thank you,
Sendu.


From lincoln.stein at gmail.com  Mon Oct  2 17:53:38 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 2 Oct 2006 21:53:38 +0000
Subject: [Bioperl-l] Variable scope
In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
Message-ID: <6dce9a0b0610021453va2132c7u73747b9253211a66@mail.gmail.com>

Hi,

Read the documentation for Exporter. It is much better to formally export
constants, variables and functions and to import them with "use" than to use
"require". Also be sure that you understand how namespaces and modules work.

This is not a BioPerl topic and should have been directed to a general Perl
discussion list, such as Perl Monks.
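
For the archive, here is a single-file sketch of the Exporter approach. MyColors is a hypothetical module name and the colour values are placeholders; normally the first BEGIN block would be a separate MyColors.pm loaded with "use MyColors qw(%CLRFG True False);". Hash::Util's lock_hash is added as one possible way to get the "constant" hash that was asked for.

```perl
use strict;
use warnings;

# In a real project this block would be the file MyColors.pm.
BEGIN {
    package MyColors;
    use Exporter ();
    use Hash::Util qw(lock_hash);

    our @ISA       = ('Exporter');
    our @EXPORT_OK = qw(%CLRFG True False);

    use constant False => 0;
    use constant True  => 1;

    our %CLRFG = (
        PLASMIDO => 'red',
        POLY_A   => 'green',
    );

    # Make the hash read-only: changing or adding a key now dies.
    lock_hash(%CLRFG);

    # Mark the module as loaded so a later "use MyColors" would not
    # go looking for MyColors.pm on disk.
    $INC{'MyColors.pm'} = __FILE__;
}

# Equivalent to the import half of "use MyColors qw(%CLRFG True False);"
BEGIN { MyColors->import(qw(%CLRFG True False)) }

print "POLY_A => $CLRFG{POLY_A}\n";
print "True is ", True, "\n";
```

Because the names are imported at compile time, %CLRFG and True can be used unqualified under "use strict", just like symbols from any other module.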

Lincoln

On 10/2/06, ende <mmacho at gmail.com> wrote:
>
>
>         Hi
>
> this may be a typical perl topic and then out of this list center
> topic.  My apologize for any inconvenience.
>
> It is a annoying problem that is making me waste lot of time.
>
> I have a package with its new object, etc... and constants in it like:
>
> #-----
> use constant False => 0;
> use constant True => 1;
>
> our %CLRFG = (
>                PLASMIDO      => RED,
>                POLY_A        => GREEN,
>                RESTR_SITES   => BLUE,
>                CONECTORS     => MAGENTA,
>                CONTAMINANTS  => CYAN,
>            );
>
> our %CLRBG = (
>                PLASMIDO      => "",
>                POLY_A        => "",
>                RESTR_SITES   => "",
>                CONECTORS     => "",
>                CONTAMINANTS  => "",
>            );
> #------
>
> this constants are include with require "h.pl" from the main package
> file.
>
> I use this module from the mail command line driver to test it
> "using" it.  In the command line driver I can use with no gripe the
> constants False and True directly, for example "return True", etc
> without any reference to the origin of that constant.
>
> But, with respect to the variables (I would like they also were
> constants.. but how?), %CLRFG and %CLRBG I can't find the way of
> refering those int the module.  Finally I have desisted and _copy_
> the definitions where  I have needed it (in the sub were I print Ansi
> terminal colouring seqs...).  I don't find how to refer those
> variables out of the module.
>
> I have tried %modulename::CLRFG, for example, but Perl gives me errors.
>
> Any help?
>
>
>
>
> --
>      Juan Falgueras
>      Profesor del Depto. de Lenguajes y Ciencias de la Computación
>      Universidad de Málaga
>
>
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From florin at iucha.net  Mon Oct  2 22:30:31 2006
From: florin at iucha.net (Florin Iucha)
Date: Mon, 2 Oct 2006 21:30:31 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <20061003023031.GI14409@iucha.net>

On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
> upload tar.gz files when I have access to the server, then reply here 
> with links.
> 
> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for 
> instructions on getting and testing this RC.

[I won't create a wiki account just to report this.]

Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
not set.  Lots of warnings about missing packages and all, but this
looks interesting:

   Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.

Otherwise:

   Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay.

The failed test is:

   t/ESEfinder..................dubious
      Test returned status 255 (wstat 65280, 0xff00)
   DIED. FAILED test 15

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra


From cjfields at uiuc.edu  Mon Oct  2 23:50:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 22:50:47 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu>

So far all tests pass on Mac OS X.  I'll add this to the release page.

This RC will throw warnings for four tests I didn't remove in time
(BPlite.t, BPpsilite.t, BPbl2seq.t, and RestrictionEnzyme.t), which
correspond to their namesake deprecated Bio::Tools modules.  These
are no longer in CVS HEAD so should be gone by the next RC, and the
relevant modules marked for deprecation.

I can verify the BioDBSeqFeature.t warning on Mac OS X that
Florin reported, but ESEfinder.t works fine:

t/BioDBSeqFeature............Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.
ok
....
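
Line 423 of Segment.pm is not shown in this thread, so the following is only a guess at the general cause: the warning appears whenever a non-numeric string such as a "+" strand symbol is used in a numeric comparison. The variable names are invented for the illustration.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# Hypothetical value: a strand stored as "+" rather than as 1.
my $strand = '+';

# "+" has no numeric value, so Perl treats it as 0 and warns.
my $is_reverse = ($strand < 0) ? 1 : 0;

print "is_reverse = $is_reverse\n";
print "warning: $warnings[0]";
```

If that is what is happening, mapping "+" and "-" to 1 and -1 before the comparison would silence it.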

I'll report WinXP tests tomorrow on the wiki.

Chris


On Oct 2, 2006, at 6:48 PM, Sendu Bala wrote:

> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll
> upload tar.gz files when I have access to the server, then reply here
> with links.
>
> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for
> instructions on getting and testing this RC.
>
> Developers:
>    Make sure you're in the AUTHORS file in all 4 packages, as
>    appropriate.
>
> Users:
>    Even though 1.5.2 is a 'developer' release, we consider it the most
>    stable and capable version of Bioperl, and recommend that you use
>    it in all but the most critical production environments. Please
>    try it out and let us know of any problems or difficulties you run
>    into.
>
>
> Thank you,
> Sendu.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct  2 23:54:29 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 22:54:29 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <20061003023031.GI14409@iucha.net>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
Message-ID: <DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>

> [I won't create a wiki account just to report this.]
>
> Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> not set.  Lots of warnings about missing packages and all, but this
> looks interesting:
>
>    Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.

This is verified on Mac OS X.

> Otherwise:
>
>    Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay.
>
> The failed test is:
>
>    t/ESEfinder..................dubious
>       Test returned status 255 (wstat 65280, 0xff00)
>    DIED. FAILED test 15

What do you get when you run that set of tests using
'perl -I. -w t/ESEfinder.t'?  The bad status code is odd and could be a
remote server issue.

Chris


>
> florin
>
> -- 
> If we wish to count lines of code, we should not regard them as lines
> produced but as lines spent.                       -- Edsger Dijkstra

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From torsten.seemann at infotech.monash.edu.au  Tue Oct  3 00:30:06 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 03 Oct 2006 14:30:06 +1000
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
Message-ID: <4521E74E.1040404@infotech.monash.edu.au>

My understanding is that all Bioperl-compliant classes should inherit 
from Bio::Root::Root, not Bio::Root::RootI.

Additionally, if functions such as throw() or _rearrange() are to be 
used without a class instance reference, they are to be used as class 
methods via Bio::Root::Root, not Bio::Root::RootI.

Is this correct?

My naive audit of bioperl-live CVS brought up the following statistics:

# Root.pm
/cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l
26
/cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l
346

# RootI.pm
/cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l
9
/cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l
79

My guess would be that all RootI should be changed to plain Root ?

Any help appreciated,

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From jason at bioperl.org  Tue Oct  3 02:03:17 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 2 Oct 2006 23:03:17 -0700
Subject: [Bioperl-l] t/ESEFinder.t fixed on branch
Message-ID: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>

Looks like good work everyone.

All tests pass for me on OS X with perl 5.8.6, with and without
BIOPERLDEBUG=1, with RC1 except for the t/ESEfinder problem, which I've
fixed.

It skipped too few tests when BIOPERLDEBUG=0.
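
For anyone hitting the same thing in another test script, a minimal sketch of the pattern (the three tests are placeholders, not the real ESEfinder.t contents): the number passed to skip() has to equal the number of tests inside the SKIP block, otherwise the plan does not add up.

```perl
use strict;
use warnings;
use Test::More tests => 3;    # plan = tests run + tests skipped

# A local test that always runs.
ok(1, 'local test');

SKIP: {
    # The second argument is the number of tests inside this block.
    # If it were 1 instead of 2, Test::More would report
    # "planned 3 tests but only ran 2" and exit with a non-zero status.
    skip 'remote tests need BIOPERLDEBUG=1', 2 unless $ENV{BIOPERLDEBUG};

    # Placeholders for tests that would contact a remote server.
    ok(1, 'remote test 1');
    ok(1, 'remote test 2');
}
```

Whether BIOPERLDEBUG is set or not, three results are reported, so the plan is satisfied either way.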

Don't forget to merge branch changes back to head for this test when
it is done.   I don't want to muddy the water so I'm holding off
migrating the changes to main trunk as the file is substantially
different (I presume pre-Test::More adoption?).

-jason


From bix at sendu.me.uk  Tue Oct  3 03:28:48 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:28:48 +0100
Subject: [Bioperl-l] t/ESEFinder.t fixed on branch
In-Reply-To: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>
References: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>
Message-ID: <45221130.2060405@sendu.me.uk>

Jason Stajich wrote:
> Looks like good work everyone.
> 
> All tests pass for me on OSX perl 5.8.6. w and w/o BIOPERLDEBUG=1  
> with RC1 except for the t/ESEFinder problem which I've fixed.
> 
> It skipped too few tests when BIOPERLDEBUG=0.
> 
> Don't forget to merge branch changes back to head for this test when  
> it is done.   I don't want to muddy water so I'm holding off  
> migrating the changes to main trunk as the files is substantially  
> different (I presume pre-Test::More adoption?).

Actually, it was the same until Torsten made his own (different) fixes 
to HEAD but not to branch. It was my mistake and I've corrected it in yet
a third way, and now branch and HEAD match.

No harm done :)


From bix at sendu.me.uk  Tue Oct  3 03:31:10 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:31:10 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu>
References: <4521A552.60301@sendu.me.uk>
	<7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu>
Message-ID: <452211BE.6080107@sendu.me.uk>

Chris Fields wrote:
> So far all tests pass on Mac OS X.  I'll add this to the release page.
> 
> This RC will throw warnings for four tests I didn't remove in time  
> (BPlite.t, BPpsilite, BPbl2seq.t, and RestrictionEnzyme.t), which  
> correspond to their namesake deprecated Bio::Tools modules.  These  
> are no longer in CVS HEAD so should be gone by the next RC, and the  
> relevant modules marked for deprecation.

Thanks Chris. Sorry I missed these.


From bix at sendu.me.uk  Tue Oct  3 03:32:08 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:32:08 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <20061003023031.GI14409@iucha.net>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
Message-ID: <452211F8.8040104@sendu.me.uk>

Florin Iucha wrote:
> On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote:
>> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
>> upload tar.gz files when I have access to the server, then reply here 
>> with links.
>>
>> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for 
>> instructions on getting and testing this RC.
> 
> [I won't create a wiki account just to report this.]
> 
> Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> not set.  Lots of warnings about missing packages and all, but this
> looks interesting:
> 
>    Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.
> 
> Otherwise:
> 
>    Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay.
> 
> The failed test is:
> 
>    t/ESEfinder..................dubious
>       Test returned status 255 (wstat 65280, 0xff00)
>    DIED. FAILED test 15

Thanks for your feedback Florin. The ESEfinder fail will be fixed in the 
next RC.


From bix at sendu.me.uk  Tue Oct  3 04:29:37 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 09:29:37 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <45221F71.40206@sendu.me.uk>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
> upload tar.gz files when I have access to the server, then reply here 
> with links.

Live/core:
http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-1.5.2-RC1.zip

Run:
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.zip

DB:
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.zip

Network:
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.zip

Md5 checksums are in:
http://bioperl.org/DIST/SIGNATURES.md5
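
To check a download against that file, something like the following sketch can be used. The layout of SIGNATURES.md5 is not shown here, so the script only computes the digest; md5_of_file is a helper name made up for this example, and the temporary file stands in for a downloaded archive.

```perl
use strict;
use warnings;
use Digest::MD5;
use File::Temp qw(tempfile);

# Return the hex MD5 digest of a file, read in binary mode.
sub md5_of_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode $fh;
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $digest;
}

# Demo on a small temporary file; for a real check, pass the path of the
# downloaded archive and compare the result with SIGNATURES.md5.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "hello\n";
close $tmp_fh;

print md5_of_file($tmp_name), "\n";
```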


From jason at bioperl.org  Tue Oct  3 02:11:30 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 2 Oct 2006 23:11:30 -0700
Subject: [Bioperl-l]  Use of Root.pm versus RootI.pm
Message-ID: <87F9B64E-8BDA-464B-814D-3F117AA646A1@bioperl.org>

I only briefly saw your question - but RootI is for interfaces,  
Root.pm is for instantiated objects.


From florin at iucha.net  Tue Oct  3 07:39:12 2006
From: florin at iucha.net (Florin Iucha)
Date: Tue, 3 Oct 2006 06:39:12 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
	<DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
Message-ID: <20061003113912.GJ14409@iucha.net>

On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote:
> >Otherwise:
> >
> >   Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay.
> >
> >The failed test is:
> >
> >   t/ESEfinder..................dubious
> >      Test returned status 255 (wstat 65280, 0xff00)
> >   DIED. FAILED test 15

$ perl -I. -w t/ESEfinder.t
1..15
ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder;
ok 2 - use Data::Dumper;
ok 3 - use Bio::PrimarySeq;
ok 4 - use Bio::Seq;
ok 5
ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
# Looks like you planned 15 tests but only ran 14.
$ grep Id t/ESEfinder.t
# $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra


From hlapp at gmx.net  Tue Oct  3 08:27:46 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 3 Oct 2006 08:27:46 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au>
References: <4521E74E.1040404@infotech.monash.edu.au>
Message-ID: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net>

The interface classes (those ending in 'I') should actually inherit  
from RootI, not Root.

In practice I think this recommendation is more theoretical than
something that makes much of a difference. The motivation is that
interface classes should not determine the actual implementation of a
class (hash ref, array ref, whatever), and since Root.pm contains lots of
implementation using a hash ref, inheriting from it basically makes that
decision.

On the other hand, RootI contains implementation too, although
I'm not sure it would prescribe the object implementation as opposed
to merely implementing static methods (like throw(), warn(), etc).
That would need to be checked.
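
A plain-Perl sketch of the split, to make the distinction concrete. My::CounterI and My::Counter are invented names, and die() stands in for the BioPerl exception machinery; a real interface class would inherit from Bio::Root::RootI and a real implementation from Bio::Root::Root.

```perl
use strict;
use warnings;

# Interface class: declares the methods, says nothing about storage.
package My::CounterI;

sub count {
    my $self = shift;
    die ref($self) . " must implement count()\n";
}

# Implementation class: this is where the choice of a hash ref is made.
package My::Counter;
our @ISA = ('My::CounterI');

sub new {
    my ($class, %args) = @_;
    my $self = { count => $args{-count} || 0 };
    return bless $self, $class;
}

sub count {
    my $self = shift;
    return $self->{count};
}

package main;

my $c = My::Counter->new(-count => 3);
print "count = ", $c->count, "\n";
print "is a My::CounterI\n" if $c->isa('My::CounterI');
```

Another implementation could satisfy the same interface with an array ref, which is the freedom the interface classes are meant to preserve.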

	-hilmar

On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote:

> My understanding is that all Bioperl-compliant classes should inherit
> from Bio::Root::Root, not Bio::Root::RootI.
>
> Additionally, if functions such as throw() or _rearrange() are to be
> used without a class instance reference, they are to be used as class
> methods via Bio::Root::Root, not Bio::Root::RootI.
>
> Is this correct?
>
> My naive audit of bioperl-live CVS brought up the following  
> statistics:
>
> # Root.pm
> /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l
> 26
> /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l
> 346
>
> # RootI.pm
> /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l
> 9
> /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l
> 79
>
> My guess would be that all RootI should be changed to plain Root ?
>
> Any help appreciated,
>
> -- 
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct  3 08:33:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 3 Oct 2006 07:33:37 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <20061003113912.GJ14409@iucha.net>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
	<DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
	<20061003113912.GJ14409@iucha.net>
Message-ID: <44724E16-74CD-4778-B04F-529475B47E37@uiuc.edu>

Florin,

Looks like this is fixed and should be working in the next release.
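
For anyone who sees the same "planned 15 tests but only ran 14" message: judging only from the output quoted below, one plausible cause is a skip count that is one short of the number of tests inside the SKIP block. Here is a minimal Test::More sketch of keeping the two in step. It is not the actual ESEfinder.t code, and the test names are placeholders.

```perl
use strict;
use warnings;
use Test::More tests => 15;

# Tests 1-5: purely local checks (stand-ins for the use_ok()/constructor tests).
ok(1, "local check $_") for 1 .. 5;

SKIP: {
    # The count passed to skip() has to equal the number of tests in this
    # block (10 here); otherwise the plan of 15 is not met when skipping.
    skip('Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test', 10)
        unless $ENV{BIOPERLDEBUG};

    # Placeholders for the ten tests that would contact the remote server.
    ok(1, "remote check $_") for 6 .. 15;
}
```

With 5 local tests plus 10 skipped (or run) tests, the plan of 15 is met whether or not BIOPERLDEBUG is set.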

Chris

On Oct 3, 2006, at 6:39 AM, Florin Iucha wrote:

> On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote:
>>> Otherwise:
>>>
>>>   Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,
>>> 99.99% okay.
>>>
>>> The failed test is:
>>>
>>>   t/ESEfinder..................dubious
>>>      Test returned status 255 (wstat 65280, 0xff00)
>>>   DIED. FAILED test 15
>
> $ perl -I. -w t/ESEfinder.t
> 1..15
> ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder;
> ok 2 - use Data::Dumper;
> ok 3 - use Bio::PrimarySeq;
> ok 4 - use Bio::Seq;
> ok 5
> ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
> ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
> ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
> ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
> ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
> ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
> ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
> ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
> ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
> # Looks like you planned 15 tests but only ran 14.
> $ grep Id t/ESEfinder.t
> # $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $
>
> florin
>
> -- 
> If we wish to count lines of code, we should not regard them as lines
> produced but as lines spent.                       -- Edsger Dijkstra

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct  3 10:29:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 3 Oct 2006 09:29:51 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net>
Message-ID: <002101c6e6f8$67b4ae10$15327e82@pyrimidine>

> The interface classes (those ending in 'I') should actually inherit
> from RootI, not Root.
> 
> In reality this recommendation is more theoretical than it makes that
> much of a difference I think. The motivation is that interface
> classes should not determine the actual implementation of a class
> (hash ref, array ref, whatever), and since Root.pm contains lots of
> implementation using a hash ref that decision will basically have
> been made.
> 
> On the contrary though, RootI contains implementation too, although
> I'm not sure it would prescribe the object implementation as opposed
> to merely implementing static methods (like throw(), warn(), etc).
> That would need to be checked.
> 
> 	-hilmar

The constructor in Bio::Root::RootI lets one know that its use is
deprecated, so you shouldn't have any cases of 'our @ISA = qw(Bio::Root::RootI)';
there should be some way of inheriting Root directly or indirectly.  I would
say that any direct use of RootI is not good practice, though.  For the
current implementation we should only inherit Bio::Root::Root, which
implements RootI.

Is there any reason to shut off the warning with BIOPERLDEBUG?  

From RootI:

sub new {
  my $class = shift;
  my @args = @_;
  unless ( $ENV{'BIOPERLDEBUG'} ) {
      carp("Use of new in Bio::Root::RootI is deprecated.  " .
           "Please use Bio::Root::Root instead");
  }
  eval "require Bio::Root::Root";
  return Bio::Root::Root->new(@args);
}


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign




From slenk at emich.edu  Tue Oct  3 13:31:47 2006
From: slenk at emich.edu (Stephen Gordon Lenk)
Date: Tue, 03 Oct 2006 13:31:47 -0400
Subject: [Bioperl-l] Perl 6 has 'roles' - may be cleanly applicable to the
	Root/RootI issue
Message-ID: <5147da5514e402.514e4025147da5@emich.edu>

I looked at the Perl6 site, there is an RFC on interfaces:
http://dev.perl.org/perl6/rfc/265.html

Roles seem to be the Perl 6 answer to the Root/RootI issue in Bioperl. 
Maybe it is too early to suggest this.

http://dev.perl.org/perl6/doc/design/apo/A12.html:
The primary role of a class is to manage instances, that is, objects. 
So a class must worry about object creation and destruction, and 
everything that happens in between. Classes have a secondary role as 
units of software reuse, in that they can be inherited from or 
delegated to. However, because this is a secondary role, and because 
of weaknesses in models of inheritance, composition, and delegation, 
Perl 6 will split out the notion of software reuse into a separate 
class-like entity called a "role". Roles are an abstraction mechanism 
for use by classes that don't care about the secondary aspects of 
software reuse, or that (looking at it the other way) care so much 
about it that they want to encapsulate any decisions about 
implementation, composition, delegation, and maybe even inheritance. 
Sounds fancy, but just think of them as includes of partial classes, 
with some safety checks. Roles don't manage objects. They manage 
interfaces and other abstract behavior (like default implementations), 
and they help classes manage objects. As such, a role may only be 
composed into a class or into another role, never inherited from or 
delegated to. That's what classes are for.


From slenk at emich.edu  Tue Oct  3 12:45:15 2006
From: slenk at emich.edu (Stephen Gordon Lenk)
Date: Tue, 03 Oct 2006 12:45:15 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
Message-ID: <5120d6a511f5a7.511f5a75120d6a@emich.edu>

The separation of interface and implementation is generally
regarded as a good idea. Right now the Bioperl community is
doing this as part of the implementation of Bioperl. I suggest
that this is an example of something which you might want to
have as part of the Perl implementation. If Perl 6 (or even
Perl 5) does not have this as a core part of the language or
as a standard package (reusable by all in a common fashion),
you may want to suggest to the Perl implementers that a way
for interface/implementation distinctions be made part of the
core language. My 2 cents, as you people are the experts on 
your own code.




From cjfields at uiuc.edu  Tue Oct  3 13:49:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 3 Oct 2006 12:49:35 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <5120d6a511f5a7.511f5a75120d6a@emich.edu>
Message-ID: <000001c6e714$4c2cbb80$15327e82@pyrimidine>

Perl6 already has added flexibility for separation of
implementation/interface (I believe they are called roles).  

http://dev.perl.org/perl6/doc/design/syn/S12.html

To tell the truth, I'm not sure about Perl 5, except the way the Bioperl
devs have set up the distinction between interface and implementation.  However,
I find the way we use interfaces is very simple (set up interface with
some/all methods as unimplemented, use the module as an abstract base class,
then override the unimplemented methods).  It works for me.
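
A minimal sketch of that pattern, in case it helps. The package and method names (My::SeqI, My::Seq, seq_length, describe) are made up for illustration and are not Bioperl classes; the plain die() stands in for whatever "not implemented" exception the real interface classes raise.

```perl
use strict;
use warnings;

# Hypothetical interface: declares the methods but holds no object storage.
package My::SeqI;

sub seq_length {
    my $self = shift;
    die ref($self) . " must implement seq_length()\n";
}

# An interface can still carry shared behaviour built on the abstract methods.
sub describe {
    my $self = shift;
    return 'sequence of length ' . $self->seq_length;
}

# Implementation: uses the interface as an abstract base class and overrides
# the unimplemented method.
package My::Seq;
our @ISA = ('My::SeqI');

sub new {
    my ($class, %args) = @_;
    return bless { seq => $args{seq} }, $class;
}

sub seq_length {
    my $self = shift;
    return length $self->{seq};
}

package main;

my $seq         = My::Seq->new(seq => 'ACGT');
my $description = $seq->describe;
print "$description\n";
```

Only My::Seq decides how the object is stored (a hash ref here); My::SeqI never touches the storage, so another implementation could use something else.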

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: Stephen Gordon Lenk [mailto:slenk at emich.edu]
> Sent: Tuesday, October 03, 2006 11:45 AM
> To: Chris Fields
> Cc: 'Hilmar Lapp'; 'Torsten Seemann'; 'bioperl-l'
> Subject: Re: [Bioperl-l] Use of Root.pm versus RootI.pm
> 
> The separation of interface and implementation is generally
> regarded as a good idea. Right now the Bioperl community is
> doing this as part of the implementation of Bioperl. I suggest
> that this is an example of something which you might want to
> have as part of the Perl implementation. If Perl 6 (or even
> Perl 5) does not have this as a core part of the language or
> as a standard package (reusable by all in a common fashion),
> you may want to suggest to the Perl implementers that a way
> for interface/implementation distinctions be made part of the
> core language. My 2 cents, as you people are the experts on
> your own code.


From cmlapid at up.edu.ph  Tue Oct  3 22:06:06 2006
From: cmlapid at up.edu.ph (Carlo Lapid)
Date: Wed, 4 Oct 2006 10:06:06 +0800
Subject: [Bioperl-l] genbank mirror
Message-ID: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>

Hi,

I'm trying to set up a local mirror of a large part of the Genbank database.
For users to access the local database, I need to create a web-based search
tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank
flat files I've downloaded based on a query entered by the user.

I'm trying to use Bioperl to create this from scratch, but I'm having a very
hard time, especially since I want the user to have reasonable flexibility
in customizing his search. The best that I've been able to accomplish is a
search function that retrieves genbank sequence objects based on their
primary IDs or accession numbers; by using the fetch method of the
Bio::Index::GenBank module. But this doesn't help users who don't know the
exact IDs for the sequences they want.

Can anybody suggest a way to use Bioperl to search for an ordinary word or
phrase, like "16S gene", which could be matched against the description
field, or the entire genbank entry? (Alternatively, is there some other
freely available tool or software that can do this?) I've been scouring the
Bioperl documentation, but I couldn't find anything. I just need to be
pointed in the right direction. What I thought was a relatively simple
problem has been driving me crazy for days; if anybody has any suggestions I
would really, really appreciate it.


From torsten.seemann at infotech.monash.edu.au  Tue Oct  3 22:58:03 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 04 Oct 2006 12:58:03 +1000
Subject: [Bioperl-l] genbank mirror
In-Reply-To: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
References: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
Message-ID: <4523233B.7030505@infotech.monash.edu.au>

> I'm trying to set up a local mirror of a large part of the Genbank database.
> For users to access the local database, I need to create a web-based search
> tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank
> flat files I've downloaded based on a query entered by the user.

Have you considered bioperl-db / BioSQL ?

http://www.bioperl.org/wiki/BioPerl_db
http://lists.open-bio.org/pipermail/biosql-l/

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From osborne1 at optonline.net  Tue Oct  3 23:16:20 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Tue, 03 Oct 2006 23:16:20 -0400
Subject: [Bioperl-l] genbank mirror
In-Reply-To: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
Message-ID: <C1489FC4.AA43%osborne1@optonline.net>

Carlo,

You might want to look at the Bio::DB::Query::GenBank module:

http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database

However, this works through NCBI's own eutils API, so setting it up to query a
local mirror may be very difficult.
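
For a purely local approach, here is a rough sketch in plain Perl (no Bioperl calls) that scans GenBank flat-file text and reports the accessions whose DEFINITION field contains every word of the query. The function name search_definitions and the two sample records are made up for illustration; the records are not real GenBank entries. The accessions it returns could then be handed to the fetch method of Bio::Index::GenBank that you already have working.

```perl
use strict;
use warnings;

# Return [accession, definition] pairs for records whose DEFINITION field
# contains every word of the query (case-insensitive).
sub search_definitions {
    my ($fh, $query) = @_;
    my @words = split ' ', $query;
    my @hits;
    local $/ = "\n//\n";    # each GenBank record ends with a '//' line
    while (my $record = <$fh>) {
        my ($acc) = $record =~ /^ACCESSION\s+(\S+)/m or next;
        # DEFINITION can wrap onto indented continuation lines.
        my ($def) = $record =~ /^DEFINITION\s+(.*?)(?=^\S)/ms or next;
        $def =~ s/\s+/ /g;
        $def =~ s/ $//;
        next if grep { $def !~ /\Q$_\E/i } @words;
        push @hits, [$acc, $def];
    }
    return @hits;
}

# Illustrative records only, read through an in-memory filehandle.
my $flatfile = <<'END';
LOCUS       XX000001     1450 bp    DNA     linear   BCT 01-JAN-2000
DEFINITION  Example bacterium 16S ribosomal RNA gene,
            partial sequence.
ACCESSION   XX000001
//
LOCUS       XX000002      626 bp    mRNA    linear   PRI 01-JAN-2000
DEFINITION  Example primate hemoglobin beta mRNA, complete cds.
ACCESSION   XX000002
//
END

open my $fh, '<', \$flatfile or die "cannot open in-memory file: $!";
my @hits = search_definitions($fh, '16S gene');
close $fh;
print "$_->[0]\t$_->[1]\n" for @hits;
```

A linear scan like this will be slow on a large mirror, so a real tool would probably build the word-to-accession index once and store it. The sketch is only meant to show the matching step.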


Brian O.




From osborne1 at optonline.net  Tue Oct  3 23:28:06 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Tue, 03 Oct 2006 23:28:06 -0400
Subject: [Bioperl-l] genbank mirror
In-Reply-To: <4523233B.7030505@infotech.monash.edu.au>
Message-ID: <C148A286.AA47%osborne1@optonline.net>

Torsten and Carlo,

Right. For some simple examples of using Bio::DB::Query::BioQuery to query a
BioSQL db take a look at Bio::DB::BioSQL::OBDA.

You may also want to take a look at NCBI's eutils API, it's quite powerful
but not local. Or the ENSEMBL API, people have set up their own local
ENSEMBL dbs. There's an example of this API here:

http://www.bioperl.org/wiki/Getting_Genomic_Sequences


Brian O.


On 10/3/06 10:58 PM, "Torsten Seemann"
<torsten.seemann at infotech.monash.edu.au> wrote:

>> I'm trying to set up a local mirror of a large part of the Genbank database.
>> For users to access the local database, I need to create a web-based search
>> tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank
>> flat files I've downloaded based on a query entered by the user.
> 
> Have you considered bioperl-db / BioSQL ?
> 
> http://www.bioperl.org/wiki/BioPerl_db
> http://lists.open-bio.org/pipermail/biosql-l/


From torsten.seemann at infotech.monash.edu.au  Wed Oct  4 01:21:24 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 04 Oct 2006 15:21:24 +1000
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
Message-ID: <452344D4.8070908@infotech.monash.edu.au>

Hi all,

Now that we have Perl 5.6.1 as a minimum, the following modules are 
standard: File::Spec, File::Temp, File::Path

Bio::Root::IO has functions catfile(), tempdir(), tempfile(), rmtree() 
which currently dispatch to the File:: version, or try to emulate it. We 
don't need to emulate anymore. Jason Stajich suggested in a previous 
post that they should be deprecated, and that users should use the 
File:: functions directly.

I have an uncommitted simplified version of Bio::Root::IO which does 
this, and "all tests pass". The functions currently (silently) dispatch 
directly to their native counterparts.

The only tricky function is tempfile() which is *mostly* like 
File::Temp::tempfile(), but does some voodoo of converting 
(TEMPLATE=>'xxx') to the non-hash first parameter of the File:: version, 
so I'm hesitant to commit. It may do other magic - Hilmar?
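
To make the conversion concrete, here is a minimal sketch of a wrapper that accepts TEMPLATE as a named parameter and passes it on as the positional first argument of File::Temp::tempfile(). The name my_tempfile and the DIR default are assumptions for illustration; this is not the actual Bio::Root::IO code, which may handle more cases.

```perl
use strict;
use warnings;
use File::Spec ();
use File::Temp ();

# Hypothetical wrapper: TEMPLATE arrives as a named parameter and is
# handed to File::Temp::tempfile() as its leading positional argument.
sub my_tempfile {
    my %params   = @_;
    my $template = delete $params{TEMPLATE};
    # Assumption: default to the system temp dir so that a bare template
    # does not create the file in the current directory.
    $params{DIR} = File::Spec->tmpdir unless exists $params{DIR};
    return defined $template
        ? File::Temp::tempfile($template, %params)
        : File::Temp::tempfile(%params);
}

my ($fh, $filename) = my_tempfile(TEMPLATE => 'bioperlXXXXXX', UNLINK => 1);
print {$fh} "scratch data\n";
close $fh;
print "$filename\n";
```

File::Temp decides whether a template was given by whether the argument count is odd, which is why the named TEMPLATE key has to be pulled out before the remaining options are passed through.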

Comments?

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From gianluca.debellis at itb.cnr.it  Wed Oct  4 05:25:26 2006
From: gianluca.debellis at itb.cnr.it (Gianluca De Bellis)
Date: Wed, 04 Oct 2006 11:25:26 +0200
Subject: [Bioperl-l] Bioperl under WinXP
Message-ID: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>

I'm trying to use Bioperl under WinXP-SP2 (novice)

Bioperl has just been downloaded (v 1.2.3)

Even the simplest program with a single command (use Bio::Perl;) ends up in
an error of the Perl interpreter with these details

AppName: perl.exe AppVer: 5.8.8.819  ModName: win32.dll

ModVer: 0.0.0.0      Offset: 00003294

Coming from the Windows error reporting system

Where is the problem?

 
Thanks in advance


From epsteinj at mail.nih.gov  Wed Oct  4 07:25:57 2006
From: epsteinj at mail.nih.gov (Epstein, Jonathan A (NIH/NICHD) [E])
Date: Wed, 4 Oct 2006 07:25:57 -0400
Subject: [Bioperl-l] genbank mirror
References: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
Message-ID: <42504F69898FE546B3F0238C9BD03275532603@NIHCESMLBX7.nih.gov>

There's SeqHound:
  http://seqhound.blueprint.org/report.html

We set this up locally, and it's probably the most comprehensive free solution out there, but it's non-trivial to set up. Also, since Blueprint and BIND have lost most of their funding, I'm not sure how long you can count on SeqHound to remain operational (although for now it is being updated).

Jonathan




From bix at sendu.me.uk  Wed Oct  4 09:19:45 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 04 Oct 2006 14:19:45 +0100
Subject: [Bioperl-l] Bioperl under WinXP
In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>
References: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>
Message-ID: <4523B4F1.3010305@sendu.me.uk>

Gianluca De Bellis wrote:
> I'm trying to use Bioperl under WinXP-SP2 (novice)
> 
> Bioperl has been just downloaded  (v 1.2.3)
> 
> Even the simplest program with a single command (use Bio::Perl;) ends up in
> an error of the Perl interpreter with these details
> 
> AppName: perl.exe AppVer: 5.8.8.819  ModName: win32.dll
> 
> ModVer: 0.0.0.0      Offset: 00003294
> 
> Coming from the Windows error reporting system
> 
> Where is the problem?

Hard to say. Do non-bioperl scripts work?

Make sure to follow the Bioperl installation instructions carefully:
http://bioperl.org/wiki/Installing_Bioperl_on_Windows

And make sure to install at least version 1.4; version 1.2.3 is ancient 
and effectively unsupported.


From cjfields at uiuc.edu  Wed Oct  4 10:03:34 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 4 Oct 2006 09:03:34 -0500
Subject: [Bioperl-l] Bioperl under WinXP
In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>
Message-ID: <000601c6e7bd$e22ad190$15327e82@pyrimidine>

If you're using PPM, you can install a (much) newer version of BioPerl from
here:

http://www.gmod.org/ggb/ppm/

Add that as one of your repositories in PPM4 (seeing that you are using
ActivePerl 5.8.8.819), then search for bioperl.  The version should be
1.512.

In a few weeks we'll be releasing a new developer release.  A WinXP PPM is
expected, as well as a bundled package to install all prerequisites.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 



From hlapp at gmx.net  Wed Oct  4 10:25:23 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 4 Oct 2006 10:25:23 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <002101c6e6f8$67b4ae10$15327e82@pyrimidine>
References: <002101c6e6f8$67b4ae10$15327e82@pyrimidine>
Message-ID: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net>


On Oct 3, 2006, at 10:29 AM, Chris Fields wrote:

> The constructor in Bio::Root::RootI lets one know that its use is
> deprecated, so you shouldn't have any cases of
> 'our @ISA = qw(Bio::Root::RootI)';

Don't confuse the constructor with the inheritance tree.

Interface classes should never be instantiated, hence the  
constructor, consistent with the documentation, should never get  
executed.

> there should be some way of inheriting Root directly or  
> indirectly.  I would
> say that any direct use of RootI is not good practice, though.

I don't know what you mean by 'directly' or 'indirectly' but  
inheritance from interfaces, and interfaces extending (inheriting  
from) other interfaces, is certainly standard practice. I'm not sure  
at all why it would be a bad one.

> For the current implementation we should only inherit  
> Bio::Root::Root, which
> implements RootI.

For the implementation classes, yes. For the interface classes, no.
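
To make the distinction concrete, here is a minimal sketch of the pattern.
It does not use Bioperl itself: My::RootI, My::Root, My::SeqI and My::Seq
are hypothetical stand-ins for Bio::Root::RootI, Bio::Root::Root, an
interface class and an implementation class, and the method bodies are
assumptions made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Root::RootI (the root interface).
{
    package My::RootI;
    sub throw_not_implemented {
        my $self   = shift;
        my $method = (caller(1))[3];    # name of the abstract method that called us
        die "Abstract method '$method' is not implemented by "
            . (ref($self) || $self) . "\n";
    }
}

# Hypothetical stand-in for Bio::Root::Root (implements the root interface).
{
    package My::Root;
    our @ISA = ('My::RootI');
    sub new { my $class = shift; return bless {}, $class }
}

# An interface class: inherits the root *interface* directly, no constructor.
{
    package My::SeqI;
    our @ISA = ('My::RootI');
    sub seq { $_[0]->throw_not_implemented }
}

# An implementation class: inherits the root *implementation* and its interface.
{
    package My::Seq;
    our @ISA = ('My::Root', 'My::SeqI');
    sub new {
        my ($class, %args) = @_;
        my $self = My::Root::new($class);
        $self->{seq} = $args{seq};
        return $self;
    }
    sub seq { return $_[0]->{seq} }
}

package main;

my $seq = My::Seq->new(seq => 'ACGT');
print $seq->seq, "\n";                                 # ACGT
print "implements RootI\n" if $seq->isa('My::RootI');
```

In this sketch the interface never needs a constructor of its own, and an
object that only has the interface in its tree dies with a "not
implemented" message as soon as the abstract method is called.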

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Oct  4 10:43:54 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 4 Oct 2006 10:43:54 -0400
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <452344D4.8070908@infotech.monash.edu.au>
References: <452344D4.8070908@infotech.monash.edu.au>
Message-ID: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>


On Oct 4, 2006, at 1:21 AM, Torsten Seemann wrote:

> Bio::Root::IO has functions catfile(), tmpdir(), tmpfile(), rmtree()
> which currently dispatch to the File:: version, or try to emulate  
> it. We
> don't need to emulate anymore. Jason Stajich suggested in a previous
> post that they should be deprecated, and that users should use  
> directly
> the File:: functions themselves.

I don't think there's a need to deprecate - if the methods just plain  
delegate to whatever File:: module is appropriate their  
implementation (supposedly) will become very simple and hence won't  
pose a maintenance burden anymore.

One can still recommend for all new scripts or modules or code  
written to use the File:: modules directly, just I'm not sure there's  
a need to tell users that they should start changing their existing  
stuff.

>
> I have an uncommitted simplified version of Bio::Root::IO which does
> this, and "all tests pass". The functions currently (silently)  
> dispatch
> directly to their native counterparts.
>
> The only tricky function is tempfile() which is *mostly* like
> File::Temp::tempfile(), but does some voodoo of converting
> (TEMPLATE=>'xxx') to the non-hash first parameter of the File::  
> version,
> so I'm hesitant to commit. It may do other magic - Hilmar?

Not that I would know of. If the tests pass (without having to change  
them!) I'd give it a try.
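
For illustration only, plain delegation could look roughly like the sketch
below. My::IO is a hypothetical package, not the actual Bio::Root::IO code
or Torsten's uncommitted version, and the handling of TEMPLATE is an
assumption based on the description above; only core File:: modules are
used.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp ();
use File::Path ();

{
    package My::IO;    # hypothetical stand-in, not Bio::Root::IO

    # Each method simply forwards to the corresponding File:: function.
    sub catfile { my $self = shift; return File::Spec->catfile(@_) }
    sub tmpdir  { my $self = shift; return File::Spec->tmpdir }
    sub rmtree  { my $self = shift; return File::Path::rmtree(@_) }

    # File::Temp::tempfile takes the template as a leading positional
    # argument, so a TEMPLATE => 'xxx' pair has to be pulled out of the
    # hash-style arguments first (assumed calling convention).
    sub tempfile {
        my ($self, %args) = @_;
        my $template = delete $args{TEMPLATE};
        return defined $template
            ? File::Temp::tempfile($template, %args)
            : File::Temp::tempfile(%args);
    }
}

package main;

my ($fh, $name) = My::IO->tempfile(
    TEMPLATE => 'bpXXXXXX',    # File::Temp needs at least four trailing X's
    TMPDIR   => 1,
    UNLINK   => 1,
);
print "temp file: $name\n";
print My::IO->catfile('t', 'data', 'test.fa'), "\n";
```

With methods this thin there is little left to maintain, which is the
argument for keeping them rather than deprecating them.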

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Oct  4 11:35:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 4 Oct 2006 10:35:16 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net>
Message-ID: <001901c6e7ca$b12fd5b0$15327e82@pyrimidine>

...
> Don't confuse the constructor with the inheritance tree.
> 
> Interface classes should never be instantiated, hence the
> constructor, consistent with the documentation, should never get
> executed.

I know that interfaces shouldn't be instantiated.  I had noticed there are
cases of 'our qw (Bio::Root::RootI)' where it is completely acceptable to
inherit the interface.  Makes sense to me now.

> > there should be some way of inheriting Root directly or
> > indirectly.  I would
> > say that any direct use of RootI is not good practice, though.
> 
> I don't know what you mean by 'directly' or 'indirectly' but
> inheritance from interfaces, and interfaces extending (inheriting
> from) other interfaces, is certainly standard practice. I'm not sure
> at all why it would be a bad one.

I was talking specifically about inheriting RootI, and not about all Bioperl
interfaces in general.  I completely understand the use of
interface/implementation in Bioperl.  However, I missed one small fact until
yesterday (of course AFTER I posted my reply), which was that interfaces may
inherit RootI directly.  My oops.

I had understood that, in general, any Bioperl implementation should not
inherit the RootI interface directly (they should inherit Root, since that
implements RootI).  The 'constructor' present in RootI is essentially to
make sure that no one inherits from the wrong class.

Probably a bad use of the terms 'direct' and 'indirect', so maybe I didn't
get that across very well.  What I meant was that all classes inherit Root
in some way, either 'directly' (as the direct parent class) or 'indirectly'
(through the inheritance tree). Probably comes from being primarily a
molecular microbiologist and not a computer scientist.

OT, but it would be nice to have an updated class diagram to sort out the
inheritance hierarchy a bit easier.  In the meantime, the Deobfuscator does
help quite a bit.

> > For the current implementation we should only inherit
> > Bio::Root::Root, which
> > implements RootI.
> 
> For the implementation classes, yes. For the interface classes, no.

I agree (see above).  That's the one small bit about interfaces I missed
along the way.  Makes sense; they use throw_not_implemented(), which is a
RootI method.

> 	-hilmar

Chris


From pmiguel at purdue.edu  Wed Oct  4 15:38:51 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Wed, 04 Oct 2006 15:38:51 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <45240DCB.2080204@purdue.edu>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
> upload tar.gz files when I have access to the server, then reply here 
> with links.
>
> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for 
> instructions on getting and testing this RC.
>
> Developers:
>    Make sure you're in the AUTHORS file in all 4 packages, as
>    appropriate.
>
> Users:
>    Even though 1.5.2 is a 'developer' release, we consider it the most
>    stable and capable version of Bioperl, and recommend that you use
>    it in all but the most critical production environments. Please
>    try it out and let us know of any problems or difficulties you run
>    into.
>
>
> Thank you,
> Sendu.
>   
I didn't see any tests done under solaris, so I asked our sys admin to 
do the install on one of our machines.

Just another data point:

He installed this release candidate on a Sun E450 box running solaris. 
uname -a gives:

SunOS descartes 5.10 Generic_118833-18 sun4u sparc SUNW,Ultra-4

perl -v gives:

This is perl, v5.8.8 built for sun4-solaris
(etc.)


$ time make test
PERL_DL_NONLAZY=1 /usr/local/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/AAChange...................ok
t/AAReverseMutate............ok
t/abi........................Bio::SeqIO::staden::read from bioperl-ext is not installed or is installed incorrectly - skipping abi.t tests
t/abi........................ok
t/ace........................ok
t/AlignIO....................ok
t/AlignStats.................ok
t/AlignUtil..................ok
t/alignUtilities.............ok
t/Allele.....................ok
t/Alphabet...................ok
t/Annotation.................ok
t/AnnotationAdaptor..........ok
t/asciitree..................ok
t/Assembly...................ok
        1/19 skipped:
t/Biblio.....................ok
t/Biblio_biofetch............ok
t/Biblio_eutils..............ok
t/BiblioReferences...........ok
t/BioDBGFF...................ok
t/BioDBSeqFeature............ok 1/46Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.
t/BioDBSeqFeature............ok
t/BioDBSeqFeature_BDB........ok
t/BioDBSeqFeature_mysql......ok 3/46prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT sequence,offset
   FROM sequence as s,locationlist as ll
   WHERE s.id=ll.id
     AND ll.seqname= ?
     AND offset >= ?
     AND offset <= ?
   ORDER BY offset
) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT sequence,offset
   FROM sequence as s,locationlist as ll
   WHERE s.id=ll.id
     AND ll.seqname= ?
     AND offset >= ?
     AND offset <= ?
   ORDER BY offset
) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
t/BioDBSeqFeature_mysql......ok
t/BioFetch_DB................ok
t/BioGraphics................ok
t/BlastIndex.................ok 1/13
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BlastIndex.................ok
t/BPbl2seq...................
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPbl2seq...................ok 1/108
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPbl2seq...................ok
t/BPlite.....................ok 1/97
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPlite.....................ok 52/97
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPlite.....................ok 88/97
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
STACK Bio::Tools::BPlite::new /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/Tools/BPlite.pm:197
STACK toplevel t/BPlite.t:127

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPlite.....................ok
t/BPpsilite..................
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPpsilite..................ok 4/11
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPpsilite..................ok
t/bsml_sax...................ok
t/Chain......................ok
t/chaosxml...................ok
t/cigarstring................ok
t/ClusterIO..................ok
t/Coalescent.................ok
t/CodonTable.................ok
t/Compatible.................ok
t/consed.....................ok
t/CoordinateGraph............ok
t/CoordinateMapper...........ok
t/Correlate..................ok
t/ctf........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ctf.t tests
t/ctf........................ok
t/CytoMap....................ok
t/DB.........................skipped
        all skipped: Skipping all tests since they require network access, set BIOPERLDEBUG=1 to test
t/DBCUTG.....................ok
        11/34 skipped: Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
t/DBFasta....................ok
t/DNAMutation................ok
t/Domcut.....................ok
t/ECnumber...................ok
t/ELM........................ok 1/13
-------------------- WARNING ---------------------
MSG: sleeping for 1 seconds

---------------------------------------------------
t/ELM........................ok
t/embl.......................ok
t/EMBL_DB....................ok
t/EMBOSS_Tools...............ok
t/EncodedSeq.................ok
t/entrezgene.................ok 491/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok 695/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok 723/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok 824/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok
t/ePCR.......................ok
t/ESEfinder..................ok 1/15# Looks like you planned 15 tests but only ran 14.
t/ESEfinder..................dubious
        Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED test 15
        Failed 1/15 tests, 93.33% okay (less 9 skipped tests: 5 okay, 33.33%)
t/est2genome.................ok
t/EUtilities.................skipped
        all skipped: Set BIOPERLDEBUG=1 to run tests
t/Exception..................ok
t/Exonerate..................ok
t/exp........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping exp.t tests
t/exp........................ok
t/fasta......................ok
t/FeatureIO..................ok 7/33
-------------------- WARNING ---------------------
MSG: '##feature-ontology' directive handling not yet implemented
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: '##attribute-ontology' directive handling not yet implemented
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: '##source-ontology' directive handling not yet implemented
---------------------------------------------------
t/FeatureIO..................ok
t/flat.......................ok
t/FootPrinter................ok
t/game.......................ok
t/GbrowseGFF.................ok
t/gcg........................ok
t/GDB........................ok
t/Gel........................ok
t/genbank....................ok
t/GeneCoordinateMapper.......ok
t/Geneid.....................ok
t/Genewise...................ok
        2/51 skipped:
t/Genomewise.................ok
t/Genpred....................ok
t/GFF........................ok
t/GOR4.......................ok
t/GOterm.....................ok
t/GraphAdaptor...............ok
t/GuessSeqFormat.............ok
t/hmmer......................ok
t/hmmer_pull.................ok
t/HNN........................ok
t/HtSNP......................ok
t/Index......................ok
t/InstanceSite...............ok
t/interpro...................ok
t/InterProParser.............ok
t/IUPAC......................ok
t/kegg.......................ok
t/largefasta.................ok
t/LargeLocatableSeq..........ok
t/largepseq..................ok
t/lasergene..................ok
t/LinkageMap.................ok
t/LiveSeq....................ok
t/LocatableSeq...............ok
t/Location...................ok
t/LocationFactory............ok
t/LocusLink..................ok
t/lucy.......................ok
t/Map........................ok
t/MapIO......................ok
t/masta......................ok
t/Matrix.....................ok
t/Measure....................ok
t/MeSH.......................ok
t/metafasta..................ok
t/MetaSeq....................ok
t/MicrosatelliteMarker.......ok
t/MiniMIMentry...............ok
t/MitoProt...................ok
t/Molphy.....................ok
t/MultiFile..................ok
t/multiple_fasta.............ok
t/Mutation...................ok
t/Mutator....................ok
t/NetPhos....................ok
        10/14 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test
t/Node.......................ok
t/obo_parser.................ok
t/OddCodes...................ok
t/OMIMentry..................ok
t/OMIMentryAllelicVariant....ok
t/OMIMparser.................ok
t/Ontology...................ok
t/OntologyEngine.............ok
t/OntologyStore..............ok
t/PAML.......................ok
t/Perl.......................ok
t/phd........................ok
t/Phenotype..................ok
t/PhylipDist.................ok
t/PhysicalMap................ok
t/pICalculator...............ok
t/Pictogram..................ok
t/pir........................ok
t/pln........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping pln.t tests
t/pln........................ok
t/PopGen.....................ok
        2/89 skipped:
t/PopGenSims.................ok
t/primaryqual................ok
t/PrimarySeq.................ok
t/primedseq..................ok
t/Primer.....................ok
t/primer3....................ok
t/Promoterwise...............ok
t/ProtDist...................ok
t/protgraph..................ok
t/ProtMatrix.................ok
t/ProtPsm....................ok
t/Pseudowise.................ok
t/psm........................ok
t/QRNA.......................ok
t/qual.......................ok
t/RandDistFunctions..........ok
t/RandomTreeFactory..........ok
t/Range......................ok
t/RangeI.....................ok
t/raw........................ok
t/RefSeq.....................ok
t/Registry...................ok
t/Relationship...............ok
t/RelationshipType...........ok
t/RemoteBlast................ok
        11/13 skipped: to avoid timeout
t/RepeatMasker...............ok
t/RestrictionAnalysis........ok
t/RestrictionEnzyme..........ok 1/14
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::RestrictionEnzyme is deprecatedUse Bio::Restriction classes instead
---------------------------------------------------
t/RestrictionEnzyme..........ok
t/RestrictionIO..............ok
t/RNAChange..................ok
t/rnamotif...................ok
t/RootI......................ok
t/RootIO.....................ok
        2/27 skipped: various reasons
t/RootStorable...............ok
t/Scansite...................ok
t/scf........................ok
t/SearchDist.................ok
t/SearchIO...................ok
t/Seg........................ok
t/Seq........................ok
t/seq_quality................ok
t/SeqAnalysisParser..........ok
t/SeqBuilder.................ok
t/SeqDiff....................ok
t/SeqFeatCollection..........ok
t/SeqFeature.................ok
t/seqfeaturePrimer...........ok
t/SeqHound_DB................ok 4/14Writing into 'shoundlog' log file.
t/SeqHound_DB................ok
t/SeqIO......................ok
t/SeqPattern.................ok
t/seqread_fail...............ok
t/SeqStats...................ok
t/SequenceFamily.............ok
t/sequencetrace..............ok
t/SeqUtils...................ok
t/SeqVersion.................ok
t/seqwithquality.............ok
t/SeqWords...................ok
t/Sigcleave..................ok
t/Signalp....................ok
t/Sim4.......................ok
t/SimilarityPair.............ok
t/SimpleAlign................ok
t/simpleGOparser.............ok
t/singlet....................ok
t/sirna......................ok
t/SiteMatrix.................ok
t/SNP........................ok
t/Sopma......................ok
t/Species....................ok
        5/20 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test
t/Spidey.....................ok
t/splicedseq.................ok
t/StandAloneBlast............ok
t/StructIO...................ok
t/Structure..................ok
t/swiss......................ok
t/Symbol.....................ok
t/tab........................ok
t/table......................ok
t/TagHaplotype...............ok
t/Taxonomy...................ok
        44/98 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test
t/TaxonTree..................ok
t/Tempfile...................ok
t/Term.......................ok
t/tigrxml....................ok
t/tinyseq....................ok
t/Tmhmm......................ok
t/Tools......................ok
t/Tree.......................ok
t/TreeBuild..................ok
t/TreeIO.....................ok
t/trim.......................ok
t/tRNAscanSE.................ok
t/UCSCParsers................ok
t/Unflattener................ok
t/Unflattener2...............ok
t/UniGene....................ok
t/Variation_IO...............ok
t/WABA.......................ok
t/XEMBL_DB...................ok
        1/9 skipped: server may be down
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ztr.t tests
t/ztr........................ok
Failed Test   Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/ESEfinder.t  255 65280    15    2  13.33%  15
2 tests and 98 subtests skipped.
Failed 1/240 test scripts, 99.58% okay. 1/11910 subtests failed, 99.99% okay.
*** Error code 29
make: Fatal error: Command failed for target `test_dynamic'

real    13m10.064s
user    11m14.891s
sys     0m45.417s

$ TEST_VERBOSE=1 perl t/ESEfinder.t
1..15
ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder;
ok 2 - use Data::Dumper;
ok 3 - use Bio::PrimarySeq;
ok 4 - use Bio::Seq;
ok 5
ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
# Looks like you planned 15 tests but only ran 14.


From bix at sendu.me.uk  Thu Oct  5 03:19:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 08:19:39 +0100
Subject: [Bioperl-l] EUtilities term handling
Message-ID: <4524B20B.5010703@sendu.me.uk>

This is actually a general question and not limited to EUtilities. As I 
see it EUtilities lets you do queries in Bioperl that you can do on a 
website. The question is, should a Bioperl module always work with 
queries that the website it is a front-end to works with?

So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is 
essentially a frontend onto:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=

With a web-browser you can complete that url by supplying a term. For 
example, the term 'BRCA2+9606[taxid]' works and returns results:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid]

If you supply the exact same term to EUtilities::esearch like so:

my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => 
"gene", -term "BRCA2+9606[taxid]");

The search fails. From my 'user' perspective this is highly unexpected. 
Chris (the author) and I both understand /why/ it fails, but Chris 
doesn't think it is a bug, or at least something that can/should be 
changed. What do other people think? At the very least, if something 
unexpected happens, I'd suggest making a note of it in the POD 
somewhere. Eg. "Do not use + in term strings, even though they might 
work on the website".

Chris: what is the disadvantage of always submitting '+' as '+' to the 
server?


From bix at sendu.me.uk  Thu Oct  5 03:24:45 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 08:24:45 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4524B20B.5010703@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
Message-ID: <4524B33D.9070607@sendu.me.uk>

Sendu Bala wrote:
>
> With a web-browser you can complete that url by supplying a term. For 
> example, the term 'BRCA2+9606[taxid]' works and returns results:
> 
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid] 
> 
> 
> If you supply the exact same term to EUtilities::esearch like so:
> 
> my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => 
> "gene", -term "BRCA2+9606[taxid]");

*cough*

my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
"gene", -term => "BRCA2+9606[taxid]");


> The search fails. 


From m.weimer at dkfz-heidelberg.de  Thu Oct  5 08:15:53 2006
From: m.weimer at dkfz-heidelberg.de (Marc Weimer)
Date: Thu, 05 Oct 2006 14:15:53 +0200
Subject: [Bioperl-l] Bio::DB::SwissProt Error
Message-ID: <1160050554.18691.11.camel@localhost>

When running


--------------------------------------------------------------

  #! /usr/bin/perl -w

  use strict;
  use Bio::DB::SwissProt;

  my $db_obj = new Bio::DB::SwissProt(-verbose=>1);

  my $seq_obj = $db_obj->get_Seq_by_acc('P43780');


-------------------------------------------------------------

using Bioperl 1.4-1 I get the error message

---------------------------------------------------------------------------------

  request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch
  Content-Length: 45
  Content-Type: application/x-www-form-urlencoded

  format=swissprot&db=swall&style=raw&id=P43780


  ------------- EXCEPTION: Bio::Root::Exception -------------
  MSG: swissprot stream with no ID. Not swissprot in my book
  STACK: Error::throw
  STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
  STACK: Bio::SeqIO::swiss::next_seq /usr/share/perl5/Bio/SeqIO/swiss.pm:179
  STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/share/perl5/Bio/DB/WebDBSeqI.pm:187
  STACK: ./putativeGele.pl:8
  -----------------------------------------------------------

--------------------------------------------------------------------------------

Any suggestions?

Thanks,

Marc


From bix at sendu.me.uk  Thu Oct  5 09:21:23 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 14:21:23 +0100
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <1160050554.18691.11.camel@localhost>
References: <1160050554.18691.11.camel@localhost>
Message-ID: <452506D3.5050501@sendu.me.uk>

Marc Weimer wrote:
[snip]
>   my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
> 
>   my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
[snip]
> using Bioperl 1.4-1 I get the error message
[snip]
>   ------------- EXCEPTION: Bio::Root::Exception -------------
>   MSG: swissprot stream with no ID. Not swissprot in my book
[snip]
> Any suggestions?

It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most 
recent official release), but 1.5.2 does 
(http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS 
(http://bioperl.org/wiki/Getting_BioPerl#CVS).


From m.weimer at dkfz-heidelberg.de  Thu Oct  5 09:35:06 2006
From: m.weimer at dkfz-heidelberg.de (Marc Weimer)
Date: Thu, 05 Oct 2006 15:35:06 +0200
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <452506D3.5050501@sendu.me.uk>
References: <1160050554.18691.11.camel@localhost>
	<452506D3.5050501@sendu.me.uk>
Message-ID: <1160055306.18691.14.camel@localhost>

Works fine with 1.5.2

Thanks,

Marc


> Marc Weimer wrote:
> [snip]
> >   my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
> > 
> >   my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
> [snip]
> > using Bioperl 1.4-1 I get the error message
> [snip]
> >   ------------- EXCEPTION: Bio::Root::Exception -------------
> >   MSG: swissprot stream with no ID. Not swissprot in my book
> [snip]
> > Any suggestions?
> 
> It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most 
> recent official release), but 1.5.2 does 
> (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS 
> (http://bioperl.org/wiki/Getting_BioPerl#CVS).
-- 
########################################

Dr. Marc Weimer
German Cancer Research Center
Central Unit Biostatistics
Im Neuenheimer Feld 280
D-69120 Heidelberg
Phone: +49 (0) 6221/42-2387
Fax: +49 (0) 6221/42-2397

########################################


From hlapp at gmx.net  Thu Oct  5 09:55:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 09:55:58 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4524B20B.5010703@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
Message-ID: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>


On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote:

> This is actually a general question and not limited to EUtilities.  
> As I
> see it EUtilities lets you do queries in Bioperl that you can do on a
> website. The question is, should a Bioperl module always work with
> queries that the website it is a front-end to works with?

I think yes, but stick to this definition.

Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez  
website it will actually not work. Hence, it should be no surprise  
that it doesn't work either using Bio::DB::EUtilities.

The URL you are using to make your point is much more an example for  
using a web-service (SOAP, REST, or not) than it is for using a  
website. Using the web-service URL with a space in place of the '+'  
works, but yields a different result (just searches for BRCA2), so if  
tested for correct result the test fails.

I.e., you don't expect an input form on a website to accept URL- 
encoded input. Instead, you expect it to do any URL-encoding for you  
that needs to be done. Conversely, if you are using a URL to retrieve  
stuff using e.g. wget or curl, it is clear that you will need to do  
URL encoding yourself unless there is a command line option that lets  
you instruct the querying program to do so.

I would be careful with mangling the two definitions into one,  
resulting in a module that needs to serve two masters. You could  
consider providing an option though that lets you turn off the URL  
encoding on demand.

Aside from that, one of the advantages of having the service wrapped  
in Bioperl is in fact that you can have it accept a wider variety of  
parameters than the actual service would allow you to have, e.g.,  
arrays, hashes, or whatever seems appropriate.

My $0.02.

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Thu Oct  5 10:08:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:08:01 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
Message-ID: <452511C1.5020709@sendu.me.uk>

Hilmar Lapp wrote:
> 
> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote:
> 
>> This is actually a general question and not limited to EUtilities. As I
>> see it EUtilities lets you do queries in Bioperl that you can do on a
>> website. The question is, should a Bioperl module always work with
>> queries that the website it is a front-end to works with?
> 
> I think yes, but stick to this definition.
> 
> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez 
> website it will actually not work. Hence, it should be no surprise that 
> it doesn't work either using Bio::DB::EUtilities.

On the contrary, I find it a surprise because EUtilities is an interface 
to NCBI's eutils, not the entrez website.

If I had previously read instructions on using eutils:
http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
I might (do) expect that I /should/ use + in my term.


> Aside from that, one of the advantages of having the service wrapped in 
> Bioperl is in fact that you can have it accept a wider variety of 
> parameters than the actual service would allow you to have, e.g., 
> arrays, hashes, or whatever seems appropriate.

I was going to suggest that terms be supplied as an array, leaving 
Bioperl code to decide how to 'AND' all the terms (elements in the 
array) together. It would also further force the user not to think of 
how eutils normally works, but to only consider the Bioperl instructions 
on how to form a query. But I'm not sure of the value of all that.
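
For illustration, a minimal sketch of the array idea. join_terms is a
hypothetical helper, not an existing Bioperl method, and the choice of an
explicit 'AND' is an assumption:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::DB::EUtilities: joins a list of
# Entrez terms with an explicit boolean so callers never type a '+'.
sub join_terms {
    my ($op, @terms) = @_;
    # Drop undefined or empty elements so they cannot produce 'AND AND'.
    my @clean = grep { defined($_) && length($_) } @terms;
    return join " $op ", @clean;
}

print join_terms('AND', 'BRCA2', '9606[taxid]'), "\n";
# prints: BRCA2 AND 9606[taxid]
```

The joined string would then be URL-encoded by the module as usual, so
the user never has to decide between a space and a '+'.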


From cjfields at uiuc.edu  Thu Oct  5 10:06:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:06:50 -0500
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <452506D3.5050501@sendu.me.uk>
References: <1160050554.18691.11.camel@localhost>
	<452506D3.5050501@sendu.me.uk>
Message-ID: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>

On Oct 5, 2006, at 8:21 AM, Sendu Bala wrote:

> Marc Weimer wrote:
> [snip]
>>   my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
>>
>>   my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
> [snip]
>> using Bioperl 1.4-1 I get the error message
> [snip]
>>   ------------- EXCEPTION: Bio::Root::Exception -------------
>>   MSG: swissprot stream with no ID. Not swissprot in my book
> [snip]
>> Any suggestions?
>
> It works with the latest Bioperl. I'm not sure if 1.5.1 works (the  
> most
> recent official release), but 1.5.2 does
> (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS
> (http://bioperl.org/wiki/Getting_BioPerl#CVS).

Marc, you'll have to update to 1.5.2 or CVS, as Sendu suggested.   
There were server changes for biofetch which were fixed about 4-6  
months ago (post rel. 1.5.1); I think several changes were made to  
Bio::SeqIO::swiss as well during this period.

I think the error here results from Bio::SeqIO::swiss trying to parse  
an empty byte stream.  Sendu, do you think that Bio::SeqIO::swiss  
(and other SeqIO parsers) should throw a more specific message for  
getting an empty byte stream?  Or is it more trouble than it's worth?
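
A rough sketch of what such a check might look like. first_record_line is
a hypothetical guard written for this message, not the actual
Bio::SeqIO::swiss code, and the wording of the error is an assumption:

```perl
use strict;
use warnings;

# Hypothetical guard: return the first non-blank line of a stream, or
# fail with a specific message when the stream holds no data at all.
sub first_record_line {
    my ($fh) = @_;
    while (defined(my $line = <$fh>)) {
        next unless $line =~ /\S/;    # skip blank lines
        return $line;
    }
    die "Empty input stream: the server returned no data\n";
}

# Demonstrate with an in-memory handle over an empty string.
my $no_data = '';
open my $empty_fh, '<', \$no_data or die "Cannot open in-memory handle: $!";
my $first = eval { first_record_line($empty_fh) };
print "Caught: $@" unless defined $first;
# prints: Caught: Empty input stream: the server returned no data
```

A parser could call something like this before looking for the ID line,
so an empty response is reported as such rather than as "not swissprot".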

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 10:14:40 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:14:40 +0100
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>
References: <1160050554.18691.11.camel@localhost>
	<452506D3.5050501@sendu.me.uk>
	<1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>
Message-ID: <45251350.5030608@sendu.me.uk>

Chris Fields wrote:
>
>>>   ------------- EXCEPTION: Bio::Root::Exception -------------
>>>   MSG: swissprot stream with no ID. Not swissprot in my book
[snip]
> I think the error here results from Bio::SeqIO::swiss trying to parse an 
> empty byte stream.  Sendu, do you think that Bio::SeqIO::swiss (and 
> other SeqIO parsers) should throw a more specific message for getting an 
> empty byte stream?  Or is it more trouble than it's worth?

Trouble wise, I've no idea without looking into it. Generally speaking 
though I can say that the error message is pretty useless and I'm always 
in favour of better error messages.


From hlapp at gmx.net  Thu Oct  5 10:21:49 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 10:21:49 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452511C1.5020709@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
Message-ID: <F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>


On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote:

>>
>> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote:
>>
>>> This is actually a general question and not limited to  
>>> EUtilities. As I
>>> see it EUtilities lets you do queries in Bioperl that you can do  
>>> on a
>>> website. The question is, should a Bioperl module always work with
>>> queries that the website it is a front-end to works with?
>>
>> I think yes, but stick to this definition.
>>
>> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez
>> website it will actually not work. Hence, it should be no surprise  
>> that
>> it doesn't work either using Bio::DB::EUtilities.
>
> On the contrary, I find it a surprise because EUtilities is an  
> interface
> to NCBI's eutils, not the entrez website.
>
> If I had previously read instructions on using eutils:
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
> I might (do) expect that I /should/ use + in my term.

This is my point - stick to your definitions. Are you wrapping a  
query form on a website or are you wrapping a web service (i.e., a URL)?

The examples you give are about wrapping a web-service. Your original  
question was about wrapping a website. Yet another question is what  
the author of Bio::DB::EUtilities intended to wrap.

The other thing to consider is user-friendliness. If you are wrapping  
a web-service, do you still make not URL-encoding the user input the  
default? What will 90% of the users probably want or expect to be  
able to do? URL-encode all input themselves or expect the module to  
do this for them unless they turn it off?

As far as I'm concerned, I'll happily count myself among those who  
are lazy and ignorant, don't read NCBI's documentation, don't want to  
know how to URL encode and why this needs to be done, but just want  
it to work.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Oct  5 10:31:06 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:31:06 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4524B20B.5010703@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
Message-ID: <A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>

On Oct 5, 2006, at 2:19 AM, Sendu Bala wrote:

> This is actually a general question and not limited to EUtilities.  
> As I
> see it EUtilities lets you do queries in Bioperl that you can do on a
> website. The question is, should a Bioperl module always work with
> queries that the website it is a front-end to works with?
>
> So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is
> essentially a frontend onto:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=
>
> With a web-browser you can complete that url by supplying a term. For
> example, the term 'BRCA2+9606[taxid]' works and returns results:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid]
>
> If you supply the exact same term to EUtilities::esearch like so:
>
> my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
> "gene", -term => "BRCA2+9606[taxid]");
>
> The search fails. From my 'user' perspective this is highly  
> unexpected.
> Chris (the author) and I both understand /why/ it fails, but Chris
> doesn't think it is a bug, or at least something that can/should be
> changed. What do other people think? At the very least, if something
> unexpected happens, I'd suggest making a note of it in the POD
> somewhere. Eg. "Do not use + in term strings, even though they might
> work on the website".
>
> Chris: what is the disadvantage of always submitting '+' as '+' to the
> server?

A few reasons:

1)  According to NCBI, you can use '+' in queries, but not as a  
boolean.  Global changes of '+' to a space may change the meaning of  
the query in a few rare occasions.  So, if you really wanted to  
search for the string 'BRCA2+ATG', NCBI looks for that term literally.

2)  '+' is a reserved character in URIs; in a query string it stands  
for a space.  Therefore, any parameter containing a literal '+' is  
URI-encoded as %2B, which is decoded on NCBI's end back to '+' (this  
is demonstrable with current EUtilities output and the returned XML data).

3)  Why not just use a space (implicit AND)?  Or an explicit  
boolean?  Or '&' (which apparently works but is not specified in the  
NCBI Entrez docs)?
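
Point 2 above can be sketched in a few lines. form_encode is a
hand-rolled stand-in for the query encoding that the URI module performs;
it is an illustration assuming ASCII input, not URI's own code:

```perl
use strict;
use warnings;

# Hand-rolled stand-in for form-style query encoding (ASCII input only).
sub form_encode {
    my ($value) = @_;
    # Percent-encode everything except unreserved characters and the space.
    $value =~ s/([^A-Za-z0-9\-._~ ])/sprintf('%%%02X', ord $1)/ge;
    # Then turn each space into '+', as form encoding does.
    $value =~ tr/ /+/;
    return $value;
}

print form_encode('BRCA2 9606[taxid]'), "\n";   # BRCA2+9606%5Btaxid%5D
print form_encode('BRCA2+9606[taxid]'), "\n";   # BRCA2%2B9606%5Btaxid%5D
```

The first call shows a space arriving at the server as '+'; the second
shows a user-supplied '+' arriving as %2B, i.e. as a literal plus sign.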

The bug is in the query and not in the code, i.e. it is a  
user-generated bug, not an EUtilities bug.  And it shouldn't be  
unexpected, as NCBI has very specific rules for building queries for  
Entrez (just like any other database).  If I were to use nonstandard  
queries for MySQL, BioFetch, UCSC, or anything else, I would expect  
to get bad results.  As the old saying goes, garbage in, garbage out.

The following link has their updated rules:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.chapter.EntrezHelp

Here is their old one:

http://www.ncbi.nlm.nih.gov/entrez/query/static/help/helpdoc.html

We could, of course, put something in POD, but you never presented  
that option to me before.  I'll grant that the EUtilities API needs  
some cleaning up, which is not easy to do when the returned data varies  
between utilities.  But it does get the URL encoding correct, at least in  
this case.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 10:32:49 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:32:49 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>
Message-ID: <45251791.9040409@sendu.me.uk>

Hilmar Lapp wrote:
> 
> On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote:
>
>> On the contrary, I find it a surprise because EUtilities is an interface
>> to NCBI's eutils, not the entrez website.
>>
>> If I had previously read instructions on using eutils:
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls 
>>
>> I might (do) expect that I /should/ use + in my term.
> 
> This is my point - stick to your definitions. Are you wrapping a query 
> form on a website or are you wrapping a web service (i.e., a URL)?
> 
> The examples you give are about wrapping a web-service. Your original 
> question was about wrapping a website.

Right... I don't see that that changes the answer to my question though 
does it?

"The question is, should a Bioperl module always work with
queries that the web-service it is a front-end to works with?"

For me, the answer is still yes.


> As far as I'm concerned, I'll happily count myself among those who are 
> lazy and ignorant, don't read NCBI's documentation, don't want to know 
> how to URL encode and why this needs to be done, but just want it to work.

That's a reasonable attitude to take. Which comes back to the question I 
asked of Chris - naively, if you send + as + you can please everyone, 
can't you? Both people who have read the docs on the web-service and 
those who haven't? Or are there real queries in which a user may want to 
search for a phrase with a literal + in it (and where such a search 
works via eutils)?


From bix at sendu.me.uk  Thu Oct  5 10:44:33 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:44:33 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
Message-ID: <45251A51.6020802@sendu.me.uk>

Chris Fields wrote:
> The bug is in the query and not in the code, i.e. it is a  
> user-generated bug, not an EUtilities bug.  And it shouldn't be 
> unexpected, as NCBI has very specific rules for building queries for 
> Entrez (just like any other database).

So I guess this comes down to something Hilmar mentioned and I never 
even considered before. You consider your EUtilities stuff as a frontend 
to entrez, and therefore consider valid queries as queries that are 
valid for entrez and not eutils?

If that's the case, fine. I understand why you don't think this is a 
bug. Again, something that might warrant a mention in the POD.
Currently the naming of the modules and the explicit references to 
eutils (and me knowing the implementation uses eutils) got me confused.


From cjfields at uiuc.edu  Thu Oct  5 10:51:28 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:51:28 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452511C1.5020709@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
Message-ID: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>


On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote:

>>> This is actually a general question and not limited to  
>>> EUtilities. As I
>>> see it EUtilities lets you do queries in Bioperl that you can do  
>>> on a
>>> website. The question is, should a Bioperl module always work with
>>> queries that the website it is a front-end to works with?
>>
>> I think yes, but stick to this definition.
>>
>> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez
>> website it will actually not work. Hence, it should be no surprise  
>> that
>> it doesn't work either using Bio::DB::EUtilities.
>
> On the contrary, I find it a surprise because EUtilities is an  
> interface
> to NCBI's eutils, not the entrez website.

It uses NCBI's CGI interface for eutils, not the SOAP interface.   
Very different.  I have considered using the NCBI SOAP-based  
interface, but the web services are still somewhat incomplete, unlike  
the CGI interface.

> If I had previously read instructions on using eutils:
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
> I might (do) expect that I /should/ use + in my term.

You are looking at part of the naked URL on that page.  Here's what  
that page says:

"When constructing URLs for the eUtils, please use lowercase  
characters for all parameters except &WebEnv. There is no required  
order for the URL parameters in an eUtils URL, and null values or  
inappropriate parameters are ignored. Avoid placing spaces in the  
URLs, particularly in queries. If a space is required, use a plus  
sign (+) instead of a space:

     * Incorrect: &id=352, 25125, 234, ...
     * Correct: &id=352,25125,234,...
     * Incorrect: &term=biomol mrna[properties] AND mouse[organism]
     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]

Other special characters, such as the # symbol used in referring to a  
query key on the History server, should be represented by their URL  
encodings (%23 for #)."

I use URI for building the URL with the parameters.  URI specifically  
encodes all of this for you, so spaces convert to '+' and '+'  
converts to %2B.

>> Aside from that, one of the advantages of having the service  
>> wrapped in
>> Bioperl is in fact that you can have it accept a wider variety of
>> parameters than the actual service would allow you to have, e.g.,
>> arrays, hashes, or whatever seems appropriate.
>
> I was going to suggest that terms be supplied as an array, leaving
> Bioperl code to decide how to 'AND' all the terms (elements in the
> array) together. It would also further force the user not to think of
> how eutils normally works, but to only consider the Bioperl  
> instructions
> on how to form a query. But I'm not sure of the value of all that.

Why do we need to intuit what the user is thinking at any particular  
time?  How would I know that someone actually wanted to search using  
the literal string 'abc+123' as opposed to 'abc 123'?

I see value in your last suggestion but I think a class or set of  
classes would be best suited for that:

MySQL Query     |  in                      out   | MySQL Query
Entrez Query    |-----> Generic Query class----->| Entrez Query
SRS Query       |                                | SRS Query
ad infinitum...

The generic query object could then be used in DB searches as an  
option besides using a raw string.  Though it would get tricky with  
SQL's complexity...
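
A very rough sketch of the idea, to make the diagram concrete.
GenericQuery and its methods are hypothetical names invented here;
nothing like it is claimed to exist in Bioperl, and the SQL side assumes
trusted field names and simple equality tests only:

```perl
use strict;
use warnings;

# Hypothetical class: stores [field, value] pairs and renders them for
# different backends.
package GenericQuery;

sub new {
    my ($class, @pairs) = @_;
    return bless { pairs => [@pairs] }, $class;
}

# Render as an Entrez term string: value[field] joined by AND.
sub as_entrez {
    my ($self) = @_;
    return join ' AND ',
        map { $_->[1] . '[' . $_->[0] . ']' } @{ $self->{pairs} };
}

# Render as a SQL WHERE clause with placeholders, plus the bind values.
sub as_sql {
    my ($self) = @_;
    my $where = join ' AND ', map { "$_->[0] = ?" } @{ $self->{pairs} };
    return ($where, map { $_->[1] } @{ $self->{pairs} });
}

package main;

my $query = GenericQuery->new([ 'gene', 'BRCA2' ], [ 'taxid', '9606' ]);
print $query->as_entrez, "\n";    # BRCA2[gene] AND 9606[taxid]
my ($where, @bind) = $query->as_sql;
print "$where (@bind)\n";         # gene = ? AND taxid = ? (BRCA2 9606)
```

Anything beyond flat AND lists (nesting, OR, ranges) is where the
trickiness mentioned above would start.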

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Thu Oct  5 10:54:04 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 10:54:04 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251791.9040409@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>
	<45251791.9040409@sendu.me.uk>
Message-ID: <9916EDEE-EA3C-4C55-A004-A46F37B559BF@gmx.net>


On Oct 5, 2006, at 10:32 AM, Sendu Bala wrote:

>> The examples you give are about wrapping a web-service. Your  
>> original question was about wrapping a website.
>
> Right... I don't see that that changes the answer to my question  
> though does it?
>
> "The question is, should a Bioperl module always work with
> queries that the web-service it is a front-end to works with?"
>
> For me, the answer is still yes.

The answer is still yes. My point was the query that works with a  
website is not necessarily the query that works with a web-service,  
even if that web-service also powers the website.

>
>> As far as I'm concerned, I'll happily count myself among those who  
>> are lazy and ignorant, don't read NCBI's documentation, don't want  
>> to know how to URL encode and why this needs to be done, but just  
>> want it to work.
>
> That's a reasonable attitude to take. Which comes back to the  
> question I asked of Chris - naively, if you send + as + you can  
> please everyone, can't you? Both people who have read the docs on  
> the web-service and those who haven't? Or are there real queries in  
> which a user may want to search for a phrase with a literal + in it  
> (and where such a search works via eutils)?

So are you suggesting to URL-encode some characters but not others?  
This would move you into muddy waters and I'm wondering what the gain  
is from that, and for whom it is a gain.

It sounds like it will mostly benefit those who have studied the NCBI  
documentation and know exactly the URL they want to send and want to  
ignore the EUtilities POD.

My humble guess is the far majority of people will either not read  
any documentation, or read the module's POD.

Maybe a better way to serve both types of people is to accept a  
parameter -querystring that is expected to include everything from  
'term=' onwards (including 'term=' itself) which gives you complete  
control and freedom if you know what you are doing, and otherwise  
implement what you suggested before:

> I was going to suggest that terms be supplied as an array, leaving
> Bioperl code to decide how to 'AND' all the terms (elements in the
> array) together. It would also further force the user not to think of
> how eutils normally works, but to only consider the Bioperl  
> instructions
> on how to form a query.
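
To make the two modes concrete, here is a minimal sketch. The
'-querystring' and '-terms' parameter names and the esearch_url function
are hypothetical, chosen for this message; they are not existing
Bio::DB::EUtilities parameters. The base URL is the one quoted earlier in
the thread:

```perl
use strict;
use warnings;

my $BASE = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi';

# Hypothetical URL builder with a raw mode and a friendly mode.
sub esearch_url {
    my (%arg) = @_;
    my $db  = $arg{'-db'};
    my $raw = $arg{'-querystring'};
    # Raw mode: the caller supplies everything from 'term=' onwards,
    # already encoded, and it is passed through untouched.
    return $BASE . "?db=$db&$raw" if defined $raw;
    # Friendly mode: AND the terms together, then encode the result
    # (percent-encode reserved characters, spaces become '+').
    my $term = join ' AND ', @{ $arg{'-terms'} };
    $term =~ s/([^A-Za-z0-9\-._~ ])/sprintf('%%%02X', ord $1)/ge;
    $term =~ tr/ /+/;
    return $BASE . "?db=$db&term=$term";
}

print esearch_url(-db => 'gene', -terms => [ 'BRCA2', '9606[taxid]' ]), "\n";
print esearch_url(-db => 'gene', -querystring => 'term=BRCA2+9606[taxid]'), "\n";
```

The first call ends in term=BRCA2+AND+9606%5Btaxid%5D; the second passes
term=BRCA2+9606[taxid] through exactly as given.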


	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Thu Oct  5 11:02:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 16:02:01 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
Message-ID: <45251E69.7040507@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote:
>
>> On the contrary, I find it a surprise because EUtilities is an interface
>> to NCBI's eutils, not the entrez website.
> 
> It uses NCBI's CGI interface for eutils, not the SOAP interface.  Very 
> different.  I have considered using the NCBI SOAP-based interface, but 
> the web services are still somewhat incomplete, unlike the CGI interface.

I don't know anything about the SOAP interface. I'm talking about the 
CGI interface that you use.


>> If I had previously read instructions on using eutils:
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls 
>>
>> I might (do) expect that I /should/ use + in my term.
> 
> You are looking at part of the naked URL on that page.  Here's what that 
> page says:

I know what it says...

>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]

The correct query is the one that has +s in it.


> I use URI for building the URL with the parameters.  URI specifically 
> encodes all of this for you, so spaces convert to '+' and '+' converts 
> to %2B.

Well, yes. This causes what I thought of as a bug. It prevents me from 
submitting a /correct/ eutils term. However it isn't a bug if you 
explain to users they shouldn't be submitting valid eutils terms, but 
only valid /entrez/ terms.


From cjfields at uiuc.edu  Thu Oct  5 11:15:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:15:49 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251A51.6020802@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
	<45251A51.6020802@sendu.me.uk>
Message-ID: <B0BCFFEE-9200-4CC7-9D53-7C2266950691@uiuc.edu>


On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> The bug is in the query and not in the code, i.e. it is a  
>> user-generated bug, not an EUtilities bug.  And it shouldn't be  
>> unexpected, as NCBI has very specific rules for building queries  
>> for Entrez (just like any other database).
>
> So I guess this comes down to something Hilmar mentioned and I  
> never even considered before. You consider your EUtilities stuff as  
> a frontend to entrez, and therefore consider valid queries as  
> queries that are valid for entrez and not eutils?

The eutils tools access the same databases as the web page, in the  
same way, using the same search terms.  From the EUtilities docs:

"The eUtils access the core search and retrieval engine of the Entrez  
system and, therefore, are only capable of retrieving data that are  
already in Entrez."

> If that's the case, fine. I understand why you don't think this is  
> a bug. Again, something that might warrant a mention in the POD.
> Currently the naming of the modules and the explicit references to  
> eutils (and me knowing the implementation uses eutils) got me  
> confused.

I'll note in POD that there is URI encoding, but that should be a  
no-brainer.  I don't think every Bio::DB* class specifies this,  
mainly because it is taken for granted.  Pretty much anything that  
builds URL strings needs to encode based on the URI standard, and any  
server that accepts URLs is expected to decode using the same standard.

So, again, why does that have to be specifically outlined in POD?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 11:24:39 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:24:39 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251E69.7040507@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
Message-ID: <BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>

>> I use URI for building the URL with the parameters.  URI  
>> specifically encodes all of this for you, so spaces convert to '+'  
>> and '+' converts to %2B.
>
> Well, yes. This causes what I thought of as a bug. It prevents me  
> from submitting a /correct/ eutils term. However it isn't a bug if  
> you explain to users they shouldn't be submitting valid eutils  
> terms, but only valid /entrez/ terms.

I can specify in POD that URI encoding is in effect if that placates  
you, and maybe add a bit about how terms are to be built (based on  
the website).  I also noticed that the esearch POD doesn't have a  
demo in the SYNOPSIS yet (my fault).

However, I think this is all a bit silly.  This is something most  
people already realize and take for granted (it's standard for any  
CGI interface to use URI encoding).

Also, most Entrez users do not use a term like  
'BRCA2+Human[ORGANISM]'.  They use 'BRCA2 AND Human[ORGANISM]' or  
'BRCA2 Human[ORGANISM]', the latter of which is an implicit AND.  All  
of this is on the Entrez website.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From MEC at stowers-institute.org  Thu Oct  5 11:12:02 2006
From: MEC at stowers-institute.org (Cook, Malcolm)
Date: Thu, 5 Oct 2006 10:12:02 -0500
Subject: [Bioperl-l] using nfreeze instead of freeze in
	Bio::SeqFeature::Store
Message-ID: <CED81D34E37D5043A1211565277A51E5065E9879@exchkc02.stowers-institute.org>

Lincoln,

I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
freeze which should allow SeqFeature objects to survive database
freeze/thaw cycles across architectures.
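
For anyone unfamiliar with the difference, a minimal standalone sketch
using Storable directly. The hash below is a made-up stand-in for a
feature object, not the actual Bio::SeqFeature::Store code:

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Made-up stand-in for a feature object.
my %feature = (start => 100, end => 250, strand => -1);

# nfreeze serialises in network byte order, so the frozen string can be
# thawed on a machine with different endianness; freeze uses native order.
my $portable = nfreeze(\%feature);
my $restored = thaw($portable);

print "$restored->{start}..$restored->{end}\n";    # prints: 100..250
```

thaw reads data written by either function, so existing code that thaws
should not need to change.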

I hope I was not presumptuous or in error in doing this....

Regards,

Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri
 

From bix at sendu.me.uk  Thu Oct  5 11:28:55 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 16:28:55 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <B0BCFFEE-9200-4CC7-9D53-7C2266950691@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
	<45251A51.6020802@sendu.me.uk>
	<B0BCFFEE-9200-4CC7-9D53-7C2266950691@uiuc.edu>
Message-ID: <452524B7.5080003@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote:
> 
>> Chris Fields wrote:
>>> The bug is in the query and not in the code, i.e. it is a  
>>> user-generated bug, not an EUtilities bug.  And it shouldn't be 
>>> unexpected, as NCBI has very specific rules for building queries for 
>>> Entrez (just like any other database).
>>
>> So I guess this comes down to something Hilmar mentioned and I never 
>> even considered before. You consider your EUtilities stuff as a 
>> frontend to entrez, and therefore consider valid queries as queries 
>> that are valid for entrez and not eutils?
> 
> The eutils tools access the same databases as the web page, in the same 
> way, using the same search terms.

It doesn't. The eutils interface behaves differently with +s than does 
the entrez website interface. In eutils + means space, whilst in entrez, 
+ means the plus symbol.
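
A minimal sketch of the receiving side, assuming the eutils server
applies standard form decoding to the term (an assumption about NCBI's
behaviour that I have not verified). form_decode is a hypothetical
helper:

```perl
use strict;
use warnings;

# Hypothetical helper: standard form decoding of a query value.
sub form_decode {
    my ($value) = @_;
    $value =~ tr/+/ /;                              # '+' stands for a space
    $value =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;    # %2B becomes a literal '+'
    return $value;
}

print form_decode('BRCA2+9606[taxid]'), "\n";        # BRCA2 9606[taxid]
print form_decode('BRCA2%2B9606%5Btaxid%5D'), "\n";  # BRCA2+9606[taxid]
```

The first line is what a hand-typed eutils URL delivers; the second is
what arrives when the same string is given to the module and encoded.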


>> If that's the case, fine. I understand why you don't think this is a 
>> bug. Again, something that might warrant a mention in the POD.
>> Currently the naming of the modules and the explicit references to 
>> eutils (and me knowing the implementation uses eutils) got me confused.
> 
> I'll note in POD that there is URI encoding, but that should be a 
> no-brainer.

Just that it is URI encoded isn't the problem. The problem is the 
difference in behaviour outlined above.


> I don't think every Bio::DB* class specifies this, mainly 
> because it is taken for granted.  Pretty much anything that builds URL 
> strings needs to encode based on the URI standard, and any server that 
> accepts URLs is expected to decode using the same standard.
> 
> So, again, why does that have to be specifically outlined in POD?

Because they're different. If I construct a valid eutils query it might 
not work. You ought to explain why.

"EUtilities takes any valid entrez query and transforms it into a valid 
eutils query for submission. Do not try and provide a valid eutils query 
of your own, or the extra transformation will result in no results"


From bix at sendu.me.uk  Thu Oct  5 11:30:44 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 16:30:44 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
Message-ID: <45252524.7030006@sendu.me.uk>

Chris Fields wrote:
>>> I use URI for building the URL with the parameters.  URI specifically 
>>> encodes all of this for you, so spaces convert to '+' and '+' 
>>> converts to %2B.
>>
>> Well, yes. This causes what I thought of as a bug. It prevents me from 
>> submitting a /correct/ eutils term. However it isn't a bug if you 
>> explain to users they shouldn't be submitting valid eutils terms, but 
>> only valid /entrez/ terms.
> 
> I can specify in POD that URI encoding is in effect if that placates 
> you, and maybe add a bit about how terms are to be built (based on the 
> website).  I also noticed that the esearch POD doesn't have a demo in 
> the SYNOPSIS yet (my fault).
> 
> However, I think this is all a bit silly.  This is something most people 
> already realize and take for granted (it's standard for any CGI 
> interface to use URI encoding).
> 
> Also, most Entrez users do not use a term like 'BRCA2+Human[ORGANISM]'.  
> They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human[ORGANISM]', the 
> latter which is implicit.  All of this is on the Entrez website.

Exactly. You're assuming an entrez user and expecting an entrez query. I 
don't think it's silly given the name of the modules for the user to 
assume the code needs an eutils query, which is a different thing with 
different behaviour /independent/ of URI encoding.


From cjfields at uiuc.edu  Thu Oct  5 11:50:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:50:51 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251E69.7040507@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
Message-ID: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>

> I know what it says...

Ah, that's the Sendu I know and love.

>
>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>
> The correct query is the one that has +s in it.

Yes, that's because it's a URL, not a raw search term string (it has  
been URI-encoded so spaces are converted to '+').  If you use that as  
a direct query in Entrez you will not get the same response.  You do  
get something if you use the new NCBI global query form on the main  
page, but clicking on the nucleotide or PMC hits reveals that the URL  
is malformed and no term is present.  That is exactly the same  
response in EUtilities:

<?xml version="1.0"?>
<!DOCTYPE eSearchResult PUBLIC "-//NLM//DTD eSearchResult, 11 May 2002//EN"
 "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eSearch_020511.dtd">
<eSearchResult>
         <Count>0</Count>
         <RetMax>0</RetMax>
         <RetStart>0</RetStart>
         <IdList>
         </IdList>
         <TranslationSet>
         </TranslationSet>
         <QueryTranslation></QueryTranslation>
</eSearchResult>

Note the QueryTranslation tag is empty.

The only noticeable difference is using egquery (which I just fixed  
in CVS yesterday).  The returned XML gives no hits for any database,  
which is true based on individual esearch queries for those databases,  
and is actually more consistent than the website version.

>> I use URI for building the URL with the parameters.  URI specifically
>> encodes all of this for you, so spaces convert to '+' and '+'  
>> converts
>> to %2B.
>
> Well, yes. This causes what I thought of as a bug. It prevents me from
> submitting a /correct/ eutils term. However it isn't a bug if you
> explain to users they shouldn't be submitting valid eutils terms, but
> only valid /entrez/ terms.

If you mean that most users will actually use a URL-like search term,  
then I would say you have a point.  But that simply isn't the case.

If clarifying the docs makes it better, then so be it.
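
To make the encoding concrete, here is a minimal sketch using URI's
query_form.  The db value and the two terms are only illustrative, and this
is not the EUtilities code itself, just the same kind of encoding applied to
a raw term:

```perl
use strict;
use warnings;
use URI;

# Build an esearch URL from a raw (unencoded) Entrez term.
# The 'gene' database is an arbitrary choice for illustration.
sub esearch_url {
    my ($term) = @_;
    my $uri = URI->new('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi');
    $uri->query_form(db => 'gene', term => $term);
    return $uri->as_string;
}

# Spaces in the raw term are sent as '+'.
print esearch_url('BRCA2 AND Human[ORGANISM]'), "\n";

# A literal '+' in the raw term is escaped as %2B.
print esearch_url('BRCA2+Human[ORGANISM]'), "\n";
```

In the first URL the spaces come out as '+'; in the second the literal '+'
is escaped as %2B, so the server no longer reads it as a space.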

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 11:59:53 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:59:53 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45252524.7030006@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
Message-ID: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>


On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>>> I use URI for building the URL with the parameters.  URI  
>>>> specifically encodes all of this for you, so spaces convert to  
>>>> '+' and '+' converts to %2B.
>>>
>>> Well, yes. This causes what I thought of as a bug. It prevents me  
>>> from submitting a /correct/ eutils term. However it isn't a bug  
>>> if you explain to users they shouldn't be submitting valid eutils  
>>> terms, but only valid /entrez/ terms.
>> I can specify in POD that URI encoding is in effect if that  
>> placates you, and maybe add a bit about how terms are to be built  
>> (based on the website).  I also noticed that the esearch POD  
>> doesn't have a demo in the SYNOPSIS yet (my fault).
>> However, I think this is all a bit silly.  This is something most  
>> people already realize and take for granted (it's standard for any  
>> CGI interface to use URI encoding).
>> Also, most Entrez users do not use a term like 'BRCA2+Human 
>> [ORGANISM]'.  They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human 
>> [ORGANISM]', the latter which is implicit.  All of this is on the  
>> Entrez website.
>
> Exactly. You're assuming an entrez user and expecting an entrez  
> query. I don't think it's silly given the name of the modules for  
> the user to assume the code needs an eutils query, which is a  
> different thing with different behaviour /independent/ of URI  
> encoding.

It's a silly distinction.  The POD for Bio::DB::EUtilities states:

Bio::DB::EUtilities - interface for handling web queries and data  
retrieval from NCBI's Entrez Utilities.

My question is this: why would anyone (particularly the everyday  
bioperl user) want to use URL-encoded parameters for a query?  That  
seems to be your main argument here.  If so, wouldn't I just paste  
them together then send them off to NCBI eutils?  Would I devote ~ 10  
classes to that?  I could do that in a short program using an array,  
join, and LWP::Simple.

The purpose is quite clearly stated, but if you feel that by  
badgering me to add something to POD I consider common sense, then  
you're right.  You've succeeded.  Bravo.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 12:02:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 17:02:05 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
Message-ID: <45252C7D.3050009@sendu.me.uk>

Chris Fields wrote:
>
>>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>
>> The correct query is the one that has +s in it.
> 
> Yes, that's because it's a URL, not a raw search term string (it has 
> been URI-encoded so spaces are converted to '+').  If you use that as a 
> direct query in Entrez you will not get the same response.

But we're not doing Entrez queries. We're using a module called 
EUtilities to do an eutils query, which involves forming a url in which 
spaces should be converted to +. That's the source of confusion. Is 
the user supposed to do this, or is EUtilities?

All you had to do 8 emails ago is tell me that EUtilities is supposed to 
do that. You /still/ haven't told me that. I give up.


From cjfields at uiuc.edu  Thu Oct  5 12:12:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 11:12:11 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45252C7D.3050009@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
	<45252C7D.3050009@sendu.me.uk>
Message-ID: <A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>


On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>>>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>>
>>> The correct query is the one that has +s in it.
>> Yes, that's because it's a URL, not a raw search term string (it  
>> has been URI-encoded so spaces are converted to '+').  If you use  
>> that as a direct query in Entrez you will not get the same response.
>
> But we're not doing Entrez queries. We're using a module called  
> EUtilities to do an eutils query, which involves forming a url in  
> which spaces should be converted to +. That's the source of  
> confusion. Is the user supposed to do this, or is EUtilities?
>
> All you had to do 8 emails ago is tell me that EUtilities is  
> supposed to do that. You /still/ haven't told me that. I give up.

It should be apparent from the documentation and the URLs posted in  
debugging output the first few times you used it.  Again, why would I  
dedicate ~ 10 classes to pasting together URI-encoded strings?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 12:22:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 17:22:36 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
Message-ID: <4525314C.7020205@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote:
>
>> Exactly. You're assuming an entrez user and expecting an entrez query. 
>> I don't think it's silly given the name of the modules for the user to 
>> assume the code needs an eutils query, which is a different thing with 
>> different behaviour /independent/ of URI encoding.
> 
> It's a silly distinction.  The POD for Bio::DB::EUtilities states:
> 
> Bio::DB::EUtilities - interface for handling web queries and data 
> retrieval from NCBI's Entrez Utilities.
> 
> My question is this : why would anyone (particularly the everyday 
> bioperl user) want to use URL-encoded parameters for a query?

Well I'll tell you why I was trying to use URL-encoded parameters, if 
that helps you any.

I read the pod for EUtilities but all the examples have very simple 
-term s defined with just a single word. So I wonder how I'm supposed to 
make an 'AND' term. I also have no idea what utilities I'm supposed to 
use, or what databases etc. I need to get the answer I want.

The POD points me here:
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
Combined with the EUtilities synopsis I know I'm supposed to start with 
esearch so I look at:
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html
And figure out what my terms are supposed to be.

Then I test some example terms in my web browser using the esearch base 
url (http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?) to see 
if they work, and copy/paste the terms into my EUtilities-using perl 
script, replacing variable terms with perl variables.

Then I find that my terms don't work, ask you about it, and you fail to 
tell me I should be testing my terms at 
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=gene.

If you think I'm stupid, fine, but I'm probably not the only stupid 
person on the planet. Which is why I suggested a POD addition. You don't 
have to make any POD change if you don't want to. I simply thought it 
might help avoid anyone 'badgering' you in the future with a similar 
problem.


From bix at sendu.me.uk  Thu Oct  5 12:28:51 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 17:28:51 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
	<45252C7D.3050009@sendu.me.uk>
	<A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>
Message-ID: <452532C3.9030804@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote:
> 
>> Chris Fields wrote:
>>>
>>>>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>>>
>>>> The correct query is the one that has +s in it.
>>> Yes, that's because it's a URL, not a raw search term string (it has 
>>> been URI-encoded so spaces are converted to '+').  If you use that as 
>>> a direct query in Entrez you will not get the same response.
>>
>> But we're not doing Entrez queries. We're using a module called 
>> EUtilities to do an eutils query, which involves forming a url in 
>> which spaces should be converted to +. That's the source of 
>> confusion. Is the user supposed to do this, or is EUtilities?
>>
>> All you had to do 8 emails ago is tell me that EUtilities is supposed 
>> to do that. You /still/ haven't told me that. I give up.
> 
> It should be apparent from the documentation and the URLs posted in 
> debugging output the first few times you used it.  Again, why would I 
> dedicate ~ 10 classes to pasting together URI-encoded strings?

I'm not sure how not doing URI-encoding would suddenly make your classes 
worthless. I find them to be very useful (even when I didn't know there 
was any URI-encoding, was incorrectly using +s and it happened to work 
anyway).


From bernd.web at gmail.com  Thu Oct  5 10:09:38 2006
From: bernd.web at gmail.com (Bernd Web)
Date: Thu, 5 Oct 2006 16:09:38 +0200
Subject: [Bioperl-l] Eutilities Batch
Message-ID: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>

Hi,

I am using the new EUtilities. It looks great.
I was trying to use epost followed by elink but I get an error. The
same error is actually given with the example on
http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
Can't call method "get_databases" on an undefined value at EU.pl line 25.

For completeness, the code is shown below too.

Any suggestions what is going wrong?

Regards,
Bernd

# chain EUtilities for complex queries

  use Bio::DB::EUtilities;

  my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                         -db         => 'pubmed',
                                         -term       => 'hutP',
                                         -usehistory => 'y');

  $esearch->get_response; # parse the response, fetch a cookie

  my $elink = Bio::DB::EUtilities->new(-eutil        => 'elink',
                                       -db           => 'protein,taxonomy',
                                       -dbfrom       => 'pubmed',
                                       -cookie       => $esearch->next_cookie,
                                       -cmd          => 'neighbor');

  # this retrieves the Bio::DB::EUtilities::ElinkData object

  my ($linkset) = $elink->next_linkset;
  my @ids;

  # step through IDs for each linked database in the ElinkData object

  for my $db ($linkset->get_databases) {
    @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
    # do something here
  }


From cjfields at uiuc.edu  Thu Oct  5 13:31:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:31:33 -0500
Subject: [Bioperl-l] Eutilities Batch
In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
Message-ID: <F53B83B9-E188-4715-8229-0B6D9C0C982A@uiuc.edu>

I'll look into it.  I'm busy updating the EUtilities tools now.

Chris

On Oct 5, 2006, at 9:09 AM, Bernd Web wrote:

> Hi,
>
> I am using the new EUtilities. It looks great.
> I was trying to use epost followed by elink but i get an error. The
> same error is actually given with the example on
> http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
> Can't call method "get_databases" on an undefined value at EU.pl line 25.
>
> For completeness, the code is shown below too.
>
> Any suggestions what is going wrong?
>
> Regards,
> Bernd
>
> # chain EUtilities for complex queries
>
>   use Bio::DB::EUtilities;
>
>   my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                          -db         => 'pubmed',
>                                          -term       => 'hutP',
>                                          -usehistory => 'y');
>
>   $esearch->get_response; # parse the response, fetch a cookie
>
>   my $elink = Bio::DB::EUtilities->new(-eutil        => 'elink',
>                                        -db           => 'protein,taxonomy',
>                                        -dbfrom       => 'pubmed',
>                                        -cookie       => $esearch->next_cookie,
>                                        -cmd          => 'neighbor');
>
>   # this retrieves the Bio::DB::EUtilities::ElinkData object
>
>   my ($linkset) = $elink->next_linkset;
>   my @ids;
>
>   # step through IDs for each linked database in the ElinkData object
>
>   for my $db ($linkset->get_databases) {
>     @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
>     # do something here
>   }
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From daniel.lang at biologie.uni-freiburg.de  Thu Oct  5 13:12:02 2006
From: daniel.lang at biologie.uni-freiburg.de (Daniel Lang)
Date: Thu, 05 Oct 2006 19:12:02 +0200
Subject: [Bioperl-l] Bio::DB::SeqFeature
Message-ID: <45253CE2.1070208@biologie.uni-freiburg.de>

Hi,

we are storing Bio::SeqFeature::Gene::GeneStructure objects (with
multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db
(latest bioperl-live checkout).

The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch
out of a database.

The first observation is that it seems to work (fetched objects behave
like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we
get these warnings:

Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into
lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into
lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
        (in cleanup) Not a CODE reference at
/home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.
prepare_cached(SELECT f.id,f.object
  FROM feature as f
  WHERE (   f.seqid=?
   AND   f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?))
)

) statement handle DBI::st=HASH(0x1c317cf0) still Active at
/home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm
line 1422
        (in cleanup) Not a CODE reference at
/home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.

Is this something serious? Does this mean that the stored object doesn't
have everything it had before freezing? Or are we using
Bio::DB::SeqFeature inappropriately?

The other question would be, if we can visualize these stored feature
objects easily using gbrowse? I didn't find a hint mentioning
Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages...
Is it working already? Will it?

Thanks in advance,
Daniel

-- 

Daniel Lang
University of Freiburg, Plant Biotechnology
Schaenzlestr. 1, D-79104 Freiburg
fax: +49 761 203 6945
phone: +49 761 203 6974
homepage:  http://www.plant-biotech.net/
e-mail: daniel.lang at biologie.uni-freiburg.de

#################################################
My software never has bugs.
It just develops random features.
#################################################


From cjfields at uiuc.edu  Thu Oct  5 13:45:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:45:40 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452532C3.9030804@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
	<45252C7D.3050009@sendu.me.uk>
	<A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>
	<452532C3.9030804@sendu.me.uk>
Message-ID: <003DD8C4-6E59-44C2-9A1C-117E036D93BC@uiuc.edu>


On Oct 5, 2006, at 11:28 AM, Sendu Bala wrote:

> I'm not sure how not doing URI-encoding would suddenly make your  
> classes worthless. I find them to be very useful (even when I  
> didn't know there was any URI-encoding, was incorrectly using +s  
> and it happened to work anyway).

That's not my point (and sincerest apologies for the 'badgering'  
bit).  If you made the assumption that all the parameters had to be  
URI-encoded, why couldn't I do something like:

my %param = (#make up your list of parameters here#);
my $eutil = 'esearch';
my $url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/$eutil.fcgi";
# join the key value pairs with '=', then join all those with &
# add to end of url
# post and retrieve via LWP::Simple

It's more user-friendly to set up the parameters so that you wouldn't  
have to encode everything yourself, esp. when the most reliable way  
to encode URI strings is to 'use URI'.
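
For comparison, the do-it-yourself approach outlined above might look like
the following sketch.  The parameter values are made up for illustration,
and they have to be URI-encoded by hand already, because nothing here
encodes them.  No request is sent; fetching $url with LWP::Simple's get()
would be the last step.

```perl
use strict;
use warnings;

# Hypothetical parameters; the values must already be URI-encoded by hand.
my %param = (
    db   => 'nucleotide',
    term => 'biomol+mrna[properties]+AND+mouse[organism]',
);
my $eutil = 'esearch';
my $url   = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/$eutil.fcgi";

# Join each key/value pair with '=', then join the pairs with '&'.
my $query = join '&', map { "$_=$param{$_}" } sort keys %param;

# Add the query string to the end of the URL.
$url .= "?$query";

print "$url\n";
```

Everything that can go wrong with the encoding is left to the caller here,
which is the burden the classes are meant to take away.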

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 14:11:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 13:11:25 -0500
Subject: [Bioperl-l] Eutilities Batch
In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
Message-ID: <4A340977-C6AD-4728-8947-BF5A8A782807@uiuc.edu>


On Oct 5, 2006, at 9:09 AM, Bernd Web wrote:

> Hi,
>
> I am using the new EUtilities. It looks great.
> I was trying to use epost followed by elink but i get an error. The
> same error is actually given with the example on
> http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
> Can't call method "get_databases" on an undefined value at EU.pl line 25.
>
> For completeness, the code is shown below too.
>
> Any suggestions what is going wrong?
>
> Regards,
> Bernd

Grr...that's my error, sorry Bernd.  The POD wasn't updated to match  
the change I made and has a few errors.  The elink object, for  
starters, doesn't fetch the response using get_response().  Also, the  
ElinkData method has changed slightly but accomplishes the same  
thing.  Odd, since I copied and pasted that from working code...

Just a note: these are considered highly experimental at the moment,  
though they should be ready for general use and toying around.  I  
would like any suggestions on methods and so on you may have (Sendu  
has made some very helpful ones off-list which I plan on implementing).

Feel free to let me know if something doesn't work.  Note that,  
because of their experimental nature, you will want to take note of  
any method changes in particular as I try to solidify the API and  
clean up the POD, so expect some momentary 'outages'.  I plan on  
setting up a remedial interface for all the container objects (like  
ElinkData) which will help clarify things and solidify the API in the  
next few weeks, at least to a point where the class methods have a  
consistent naming scheme.  I plan on using this as a backend web  
agent for a general Entrez interface at some point to get data into  
Bio* objects.

In the meantime, try this:

use Bio::DB::EUtilities;

my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                        -db         => 'pubmed',
                                        -term       => 'hutP',
                                        -usehistory => 'y');

$esearch->get_response; # parse the response, fetch a cookie

my $elink = Bio::DB::EUtilities->new(-eutil        => 'elink',
                                     -db           => 'protein,taxonomy',
                                     -dbfrom       => 'pubmed',
                                     -cookie       => $esearch->next_cookie,
                                     -cmd          => 'neighbor');

$elink->get_response;

# this retrieves the Bio::DB::EUtilities::ElinkData object

my $linkset = $elink->next_linkset;
my @ids;

# step through IDs for each linked database in the ElinkData object

for my $db ($linkset->get_all_linkdbs) {
   @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
   print join q(,), @ids;
   # do something here
}


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From dmessina at wustl.edu  Thu Oct  5 14:07:56 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 5 Oct 2006 13:07:56 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
Message-ID: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>

I'm pleased to announce a revised version of the BioPerl Deobfuscator  
is now available. Many thanks to Mauricio Cuadra for updating  
bioperl.org's installation:

http://bioperl.org/cgi-bin/deob_interface.cgi

I've incorporated many of the suggestions you all sent in after the  
first release, and many of the modules that had non-standard  
documentation have been updated in the meantime, too, so hopefully  
you'll find it much improved. There are still some issues with a few  
modules; please report any problems you see. Also, it's now indexing  
bioperl-live instead of 1.4, which should make it a little more  
useful, too. A complete list of changes is below.

I welcome your bug reports and suggestions for improvements, via  
email, this list, Bugzilla, or the Wiki page.


Thanks,
Dave


Changes

0.0.3  Mon Oct  2 20:01:45 CDT 2006
        FIX: change default $deob_detail_path to be a relative URL instead of
             having localhost hardcoded. Thanks to Jason Stajich for pointing
             this out.
        FIX: Bio::Ontology modules are no longer missing their prefix in the
             class list, and their methods are now shown in the lower pane
             as expected. Thanks to Hilmar Lapp for reporting this bug.
        FIX: can now handle (and ignore) VERSION POD section.
        FIX: missing SYNOPSIS section now handled properly. In fact, the
             SYNOPSIS and DESCRIPTION sections can be in reverse order now,
             although for consistency this is not recommended.
        FIX: Bug #2114: "Obfuscator doesn't show Bio:Matrix:Generic" has been
             fixed. This bug turned out to afflict multiple modules, which
             weren't getting parsed correctly by deob_index.pl.
        NEW: Table cells have been padded out to get rid of that "scrunched"
             look. Thanks to Sendu Bala for this great suggestion.
        NEW: If the 'Returns' subsection of a method's documentation contains
             a POD L<> link, the Deobfuscator assumes this to be a package
             name, and wraps it in an href for display. This feature is
             not robust, but seems to work well enough for now.
        NEW: the list of classes is now sorted alphabetically depth-first, so
             that subclasses appear just after their parent class. Thanks to
             Amir Karger for noticing the strange sorting behavior.
        NEW: HTML page title now 'BioPerl Deobfuscator' to distinguish it from
             other Deobfuscators out there. Thanks to Amir Karger for
             suggesting this.
        NEW: 'No match' search string now more prominent. Yep, kudos to Amir
             Karger again -- another great idea!
        NEW: Search box caption now explicitly states that only package names
             can be searched. Big ups to Amir Karger for this suggestion.
             The ability to search method names is planned for a future
             version.
        NEW: added -x option to deob_index.pl. This allows the use of an
             'excluded modules' file. This feature was added to resolve an
             issue with four modules which rely on external modules to
             compile. Class::Inspector, used by the Deobfuscator needs to
             load a module to traverse its inheritance tree, and modules
             must compile before they can be loaded.
     CHANGE: using short name now when traversing with File::Find to help
             identify excluded modules (deob_index.pl).


From lincoln.stein at gmail.com  Thu Oct  5 14:41:08 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:41:08 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
	<DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
Message-ID: <6dce9a0b0610051141x6b61407ar1c0a13cf7616b35f@mail.gmail.com>

The non-numeric comparison bug in Bio::DB::SeqFeature is fixed in the
latest CVS. Do I need to do anything special to get the CVS fixes into
the release candidate?

Lincoln

On 10/2/06, Chris Fields <cjfields at uiuc.edu> wrote:
> > [I won't create a wiki account just to report this.]
> >
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> > not set.  Lots of warnings about missing packages and all, but this
> > looks interesting:
> >
> >    Argument "+" isn't numeric in numeric lt (<) at
> >    Bio/DB/SeqFeature/Segment.pm line 423.
>
> This is verified on Mac OS X.
>
> > Otherwise:
> >
> >    Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,
> > 99.99% okay.
> >
> > The failed test is:
> >
> >    t/ESEfinder..................dubious
> >       Test returned status 255 (wstat 65280, 0xff00)
> >    DIED. FAILED test 15
>
> What do you get when you run that set of tests using
> 'perl -I. -w t/ESEFinder.t'?  The bad status code is odd and could be a
> remote server issue.
>
> Chris
>
>
> >
> > florin
> >
> > --
> > If we wish to count lines of code, we should not regard them as lines
> > produced but as lines spent.                       -- Edsger Dijkstra
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From MEC at stowers-institute.org  Thu Oct  5 15:18:08 2006
From: MEC at stowers-institute.org (Cook, Malcolm)
Date: Thu, 5 Oct 2006 14:18:08 -0500
Subject: [Bioperl-l] using nfreeze instead of freeze in
	Bio::SeqFeature::Store
Message-ID: <CED81D34E37D5043A1211565277A51E5065E9897@exchkc02.stowers-institute.org>


Yes, there is overhead (c.f. perldoc Storable)

    "When writing in network order, all fields are written
    out as standard lengths, which allows full interworking, but takes
    longer to read and write)"

And, I suppose there is also a risk of losing precision when using network
order:

    You can also store data in network order to allow easy sharing across
    multiple platforms, or when storing on a socket known to be remotely
    connected. The routines to call have an initial "n" prefix for
    *network*, as in "nstore" and "nstore_fd". At retrieval time, your data
    will be correctly restored so you don't have to know whether you're
    restoring from native or network ordered data. Double values are stored
    stringified to ensure portability as well, at the slight risk of loosing
    some precision in the last decimals.

So, I agree, it should be a configuration option, perhaps defaulting to
using network order.
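
For illustration, a minimal sketch of the two Storable calls being compared. This is not the Bio::DB::SeqFeature::Store code; the hash below is a made-up stand-in for a feature. thaw() accepts either format, so only the writing side has to choose:

```perl
use strict;
use warnings;
use Storable qw(freeze nfreeze thaw);

# Made-up stand-in for a serialized feature; not a real SeqFeature object.
my $feature = { seqid => 'chr1', start => 100, end => 250, score => 0.125 };

my $native  = freeze($feature);     # native byte order: platform-specific
my $network = nfreeze($feature);    # network byte order: portable

# thaw() detects which format it was handed, so readers need no flag.
my $from_native  = thaw($native);
my $from_network = thaw($network);

printf "native: %d bytes, network: %d bytes\n",
    length $native, length $network;
print "start survives both ways\n"
    if $from_native->{start} == 100 && $from_network->{start} == 100;
```

Timing the two calls on real feature objects would be the way to judge whether the overhead matters in practice.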

However, given the factoring of ../Bio/DB/SeqFeature/Store.pm I'm not
sure how to best make it a configuration option since the two provided
serializers don't share a common interface.  Possibly something like:

=head1 Methods for Connecting and Initializing a Database

=head2 new

 Title   : new
 Usage   : $db = Bio::DB::SeqFeature::Store->new(@options)
 Function: connect to a database
 Returns : A descendant of Bio::DB::SeqFeature::Store
 Args    : several - see below
 Status  : public

This class method creates a new database connection. The following
-name=E<gt>$value arguments are accepted:

 Name               Value
 ----               -----

 -adaptor           The name of the Adaptor class (default DBI::mysql)

 -serializer        The name of the serializer class (default Storable)

 -network_order     Strive to 'preserve network order' (if the serializer
                    implements it).  Currently, only Storable.pm does, and
                    this will cause it to use nfreeze instead of freeze.
                    (default 1)

 -index_subfeatures Whether or not to make subfeatures searchable
                    (default true)

 -cache             Activate LRU caching feature -- size of cache

 -compress          Compresses features before storing them in database
                    using Compress::Zlib
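
A rough sketch of how such an option might be wired in, assuming the constructor simply picks the Storable function once and stores it. My::Store and its freeze_obj/thaw_obj methods are hypothetical names for illustration, not the existing Store.pm interface:

```perl
use strict;
use warnings;
use Storable ();

# Hypothetical stand-in for the store class; only option handling is shown.
package My::Store;

sub new {
    my ($class, %args) = @_;
    # Default to network order, as proposed above.
    my $network_order =
        exists $args{-network_order} ? $args{-network_order} : 1;
    return bless {
        freezer => $network_order ? \&Storable::nfreeze : \&Storable::freeze,
    }, $class;
}

sub freeze_obj { my ($self, $obj) = @_; return $self->{freezer}->($obj) }
sub thaw_obj   { my ($self, $str) = @_; return Storable::thaw($str) }

package main;

my $store  = My::Store->new(-network_order => 1);
my $frozen = $store->freeze_obj({ name => 'gene1', start => 10 });
print $store->thaw_obj($frozen)->{name}, "\n";    # prints "gene1"
```

Because thaw() reads both formats, changing the flag later would not invalidate objects that are already stored.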


Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri
  

> -----Original Message-----
> From: Lincoln Stein [mailto:lincoln.stein at gmail.com] 
> Sent: Thursday, October 05, 2006 1:43 PM
> To: Cook, Malcolm
> Cc: lstein at cshl.org; bioperl-l
> Subject: Re: using nfreeze instead of freeze in Bio::SeqFeature::Store
> 
> I think it's fine unless there is a significant performance hit, in
> which case the change should be made into a configuration option. Do
> you know if there is any overhead on doing this?
> 
> Lincoln
> 
> On 10/5/06, Cook, Malcolm <MEC at stowers-institute.org> wrote:
> > Lincoln,
> >
> > I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
> > freeze which should allow SeqFeature objects to survive database
> > freeze/thaw cycles across architectures.
> >
> > I hope I was not presumptuous or in error in doing this....
> >
> > Regards,
> >
> > Malcolm Cook
> > Database Applications Manager - Bioinformatics
> > Stowers Institute for Medical Research - Kansas City, Missouri
> >
> >


From lincoln.stein at gmail.com  Thu Oct  5 14:32:40 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:32:40 -0400
Subject: [Bioperl-l] Bio::DB::SeqFeature
In-Reply-To: <45253CE2.1070208@biologie.uni-freiburg.de>
References: <45253CE2.1070208@biologie.uni-freiburg.de>
Message-ID: <6dce9a0b0610051132p7d7fcf84g27578731f9727f3f@mail.gmail.com>

Hi Daniel,

The warnings you are seeing are occurring because
Bio::SeqFeature::Gene::GeneStructure contains a CODE reference. I
think it must be registering a cleanup method via its Bio::Root::Root
ancestor. When Storable serializes the object, it complains that it
can't serialize the CODE reference and instead converts it into the
string "CODE(0xXXXXX)". Then, after you thaw the object,
Bio::Root::Root is complaining that the CODE reference is invalid
because it is a string, not a reference.

Yuck. I think, however, that I can fix this by setting some magic
variables in Storable version 2.05 that will decompile and compile the
CODE references. I will try this and send you a note when the code is
in CVS.
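
In case it helps in the meantime, a minimal sketch of the Storable settings I have in mind, $Storable::Deparse and $Storable::Eval. The object below is made up and is not a GeneStructure; whether this is enough for the Bio::Root::Root cleanup methods is an assumption I still need to test:

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Let Storable serialize CODE refs as source text (via B::Deparse) and
# eval that text back into code on thaw. Both are off by default.
local $Storable::Deparse = 1;
local $Storable::Eval    = 1;

# Made-up object holding a cleanup callback.
my $obj = {
    name    => 'gene_structure',
    cleanup => sub { return 'cleaned up ' . $_[0] },
};

my $copy = thaw(freeze($obj));
print ref($copy->{cleanup}), "\n";          # CODE
print $copy->{cleanup}->('gene1'), "\n";    # cleaned up gene1
```

Two caveats: variables a closure had captured are not restored, and evaluating stored code is only sensible when the database contents are trusted.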

GBrowse does run off Bio::DB::SeqFeature::Store and is noticeably
faster than the original Bio::DB::GFF adaptor. Nothing really changes
except that you set the db_adaptor option to
Bio::DB::SeqFeature::Store. I haven't tried it using
Bio::SeqFeature::Gene::GeneStructure, so no guarantees, but I am
hopeful that it will work.

Lincoln


On 10/5/06, Daniel Lang <daniel.lang at biologie.uni-freiburg.de> wrote:
> Hi,
>
> we are storing Bio::SeqFeature::Gene::GeneStructure objects (with
> multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db
> (latest bioperl-live checkout).
>
> The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch
> out of a database.
>
> The first observation is that it seems to work (fetched objects behave
> like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we
> get these warnings:
>
> Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into
> lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
> Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into
> lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
>         (in cleanup) Not a CODE reference at
> /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.
> prepare_cached(SELECT f.id,f.object
>   FROM feature as f
>   WHERE (   f.seqid=?
>    AND   f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?))
> )
>
> ) statement handle DBI::st=HASH(0x1c317cf0) still Active at
> /home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm
> line 1422
>         (in cleanup) Not a CODE reference at
> /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.
>
> Is this something serious? Does this mean that the stored object doesn't
> have everything it had before freezing? Or are we using
> Bio::DB::SeqFeature inappropriately?
>
> The other question would be, if we can visualize these stored feature
> objects easily using gbrowse? I didn't find a hint mentioning
> Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages...
> Is it working already? Will it?
>
> Thanks in advance,
> Daniel
>
> --
>
> Daniel Lang
> University of Freiburg, Plant Biotechnology
> Schaenzlestr. 1, D-79104 Freiburg
> fax: +49 761 203 6945
> phone: +49 761 203 6974
> homepage:  http://www.plant-biotech.net/
> e-mail: daniel.lang at biologie.uni-freiburg.de
>
> #################################################
> My software never has bugs.
> It just develops random features.
> #################################################
>




From hlapp at gmx.net  Thu Oct  5 16:34:49 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 16:34:49 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4525314C.7020205@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
Message-ID: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>


On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote:

> If you think I'm stupid, fine, but I'm probably not the only stupid
> person on the planet.

That's a great suggestion that I hope we can all agree on? I'll  
happily count myself among the stupid ones too so you're not alone,  
and stupid people and even more so those who are lucky enough not to  
be stupid have an obligation to document stuff so that even the  
stupid can understand, no matter how silly the documentation might get.

Is that agreeable without causing yet more progressive hair loss?

Actually - I'm having second thoughts. Isn't it a distinguishing  
feature of stupid people that - among other things - they are stupid  
enough to believe they don't need to read documentation? You admitted  
publicly that you read documentation - are you just faking the stupid?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Oct  5 17:11:06 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:11:06 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
Message-ID: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>


On Oct 5, 2006, at 3:34 PM, Hilmar Lapp wrote:

>
> On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote:
>
>> If you think I'm stupid, fine, but I'm probably not the only stupid
>> person on the planet.
>
> That's a great suggestion that I hope we can all agree on? I'll  
> happily count myself among the stupid ones too so you're not alone,  
> and stupid people and even more so those who are lucky enough not  
> to be stupid have an obligation to document stuff so that even the  
> stupid can understand, no matter how silly the documentation might  
> get.
>
> Is that agreeable without causing yet more progressive hair loss?
>
> Actually - I'm having second thoughts. Isn't it a distinguishing  
> feature of stupid people that - among other things - they are  
> stupid enough to believe they don't need to read documentation? You  
> admitted publicly that you read documentation - are you just faking  
> the stupid?
>
> 	-hilmar

If lack of good documentation == stupid, I know of a few other  
modules in trouble besides mine.  Based on that we're in for a whole  
lot of stupid!  And I feel stupid for my earlier remarks, Sendu, so  
apologies.

And Hilmar, you're too late on the hair loss, at least on my end.

I have corrected the EUtilities POD to reflect that all text input  
needs to be raw as URI encoding is done in the module, which should  
work (I think).  I plan on committing it tonight.  It also indicates  
that EUtilities search queries need to be made as if they are regular  
Entrez queries.  Would that be sufficient?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From pmiguel at purdue.edu  Thu Oct  5 16:42:00 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Thu, 05 Oct 2006 16:42:00 -0400
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
Message-ID: <45256E18.3080103@purdue.edu>

David Messina wrote:
> I'm pleased to announce a revised version of the BioPerl Deobfuscator  
> is now available. Many thanks to Mauricio Cuadra for updating  
> bioperl.org's installation:
>
> http://bioperl.org/cgi-bin/deob_interface.cgi
>
> I've incorporated many of the suggestions you all sent in after the  
> first release, and many of the modules that had non-standard  
> documentation have been updated in the meantime, too, so hopefully  
> you'll find it much improved. There are still some issues with a few  
> modules; please report any problems you see. Also, it's now indexing  
> bioperl-live instead of 1.4, which should make it a little more  
> useful, too. A complete list of changes is below.
>
> I welcome your bug reports and suggestions for improvements, via  
> email, this list, Bugzilla, or the Wiki page.
>
>
> Thanks,
> Dave
>
>   
Here are some comments:
Would be good to have the column headings for the methods table in the 
fixed part of the page, rather than the scroll box. That way you could 
always see the column headings from anywhere in the list.

Second, I've noticed that there are a fair number of methods that have 
"not documented" for "Returns" and "Usage". But in every case I've 
checked both of these were documented. For example, consider methods for 
Bio::Seq::SeqWithQuality. The method "accession_number" is listed as 
"not documented". But if you click on Bio::Seq::SeqWithQuality link to 
the documentation, usage is defined as: "$unique_biological_key = 
$obj->accession_number;" and returns is defined as "A string".

Finally, it would be good to have the version of bioperl being 
deobfuscated on the deob_interface.cgi page. Just as a quick 
sanity-checking measure. After poking around a bit I found that 
bioperl-live is being indexed in the wiki. But, I can tell, it is just 
the sort of thing I'm going to forget and look for every time I come back 
to the page after a few months...

Overall very nice, though. Just what is needed when I'm trying to 
remember "which was the method that returns subseq string and which one 
returns an object?"


Phillip SanMiguel
Purdue University


From bix at sendu.me.uk  Thu Oct  5 17:24:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 22:24:34 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
	<2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
Message-ID: <45257812.5050008@sendu.me.uk>

Chris Fields wrote:
> 
> I have corrected the EUtilities POD to reflect that all text input needs 
> to be raw as URI encoding is done in the module, which should work (I 
> think).  I plan on committing it tonight.  It also indicates that 
> EUtilities search queries need to be made as if they are regular Entrez 
> queries.  Would that be sufficient?

You may not even need to mention anything about URI encoding, which 
might frighten some people. Something as simple as:

=head1 SYNOPSIS

use Bio::DB::EUtilities;

   my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                          -db         => 'pubmed',
                                          -term       => 'hutP AND xyz',
...

and/or some POD for the new() method:

=head2 new

  Title   : new
...
  Args    : -eutil => ...
            -db    => ...
            -term  => string, an entrez-style query

=cut

would get the point across, I think.

BTW, can the term string be supplied anywhere else other than new()? It 
doesn't matter at all if it can't, I'm just idly wondering if I missed 
anything.


From dmessina at wustl.edu  Thu Oct  5 17:42:49 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 5 Oct 2006 16:42:49 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45256E18.3080103@purdue.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
	<45256E18.3080103@purdue.edu>
Message-ID: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu>

Thanks so much, Phillip, for taking the time to check out the new  
version and send your comments. I really appreciate it! I've added  
them to the wiki page so I can track them.

Best,
Dave


From cjfields at uiuc.edu  Thu Oct  5 17:50:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:50:11 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45257812.5050008@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
	<2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
	<45257812.5050008@sendu.me.uk>
Message-ID: <A0B37F41-7C33-49F6-A039-A35AB5696947@uiuc.edu>

Sendu,

I have the parameters all set up as get/sets at this point, but I'm
open to suggestions on that.  Note in the BEGIN block the heredoc
eval {} block.  Yes, nasty I know, but I hate AUTOLOAD.  It works as a
quick way of getting parameter get/sets up-and-running.  I plan on
making those explicit get/sets as soon as I can then sorting out
particular ones to the various eutil modules where they are primarily
used.

Long story short, every parameter is a get/set at this time
(including term()).  The common ones needed for most EUtilities are
initialized in the parent EUtilities::_initialize(), and
eutil-specific parameters are initialized in the individual eutil
plugins.  Each eutil plugin only sets whatever parameters may be
needed for operation (though you could circumvent that, since all of
them are inherited via EUtilities).

We could always simplify it to accept simple key-value pairs, but
get/sets (at least to me) allow more flexibility as long as you
remember which parameters are set and to what.
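
To make the get/set idea concrete, here is a minimal sketch that builds the accessors from closures in a BEGIN block, which is one alternative to both the heredoc eval and AUTOLOAD. My::EUtil and the three parameter names are hypothetical, not the actual module code:

```perl
use strict;
use warnings;

# Hypothetical class; the parameter list is an example, not the full set.
package My::EUtil;

BEGIN {
    for my $param (qw(db term retmax)) {
        no strict 'refs';    # needed to install subs by name
        *{ __PACKAGE__ . "::$param" } = sub {
            my $self = shift;
            $self->{"_$param"} = shift if @_;    # setter when given a value
            return $self->{"_$param"};
        };
    }
}

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    # Route -name => value pairs through the generated get/sets.
    for my $key (keys %args) {
        (my $method = $key) =~ s/^-//;
        $self->$method($args{$key}) if $self->can($method);
    }
    return $self;
}

package main;

my $esearch = My::EUtil->new(-db => 'pubmed', -term => 'hutP AND xyz');
$esearch->retmax(20);    # parameters can also be set after construction
print join(', ', $esearch->db, $esearch->term, $esearch->retmax), "\n";
```

This keeps both styles available: key-value pairs at construction time and get/sets afterwards.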

Chris

On Oct 5, 2006, at 4:24 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> I have corrected the EUtilities POD to reflect that all text input  
>> needs to be raw as URI encoding is done in the module, which  
>> should work (I think).  I plan on committing it tonight.  It also  
>> indicates that EUtilities search queries need to be made as if  
>> they are regular Entrez queries.  Would that be sufficient?
>
> You may not even need to mention anything about URI encoding, which  
> might frighten some people. Something as simple as:
>
> =head1 SYNOPSIS
>
> use Bio::DB::EUtilities;
>
>   my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                          -db         => 'pubmed',
>                                          -term       => 'hutP AND xyz',
> ...
>
> and/or some POD for the new() method:
>
> =head2 new
>
>  Title   : new
> ...
>  Args    : -eutil => ...
>            -db    => ...
>            -term  => string, an entrez-style query
>
> =cut
>
> would get the point across, I think.
>
> BTW, can the term string be supplied anywhere else other than
> new()? It doesn't matter at all if it can't, I'm just idly wondering
> if I missed anything.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 17:51:06 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:51:06 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45257812.5050008@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
	<2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
	<45257812.5050008@sendu.me.uk>
Message-ID: <5B2E844F-7B8B-4F69-9005-138826B835FB@uiuc.edu>

> You may not even need to mention anything about URI encoding, which
> might frighten some people. Something as simple as:
>
> =head1 SYNOPSIS
>
> use Bio::DB::EUtilities;
>
>    my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                           -db         => 'pubmed',
>                                           -term       => 'hutP AND xyz',
> ...
>
> and/or some POD for the new() method:
>
> =head2 new
>
>   Title   : new
> ...
>   Args    : -eutil => ...
>             -db    => ...
>             -term  => string, an entrez-style query
>
> =cut
>
> would get the point across, I think.

Oops, forgot.  I'll add this in and update new() when I can.  Thanks!

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From arareko at campus.iztacala.unam.mx  Thu Oct  5 18:12:49 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 05 Oct 2006 17:12:49 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45256E18.3080103@purdue.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
	<45256E18.3080103@purdue.edu>
Message-ID: <45258361.8080803@campus.iztacala.unam.mx>

Phillip San Miguel wrote:
> Finally, it would be good to have the version of bioperl being 
> deobfuscated on the deob_interface.cgi page. Just as a quick 
> sanity-checking measure. After poking around a bit I found that 
> bioperl-live is being indexed in the wiki. But, I can tell, it is just 
> the sort of thing I'm going to forget and look for every time I come back 
> to the page after a few months...

Dave,

I think this value can be stored in one of the index files and passed as 
an argument to the deob_index.pl script. What do you think?

Mauricio.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From lincoln.stein at gmail.com  Thu Oct  5 14:42:41 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:42:41 -0400
Subject: [Bioperl-l] using nfreeze instead of freeze in
	Bio::SeqFeature::Store
In-Reply-To: <CED81D34E37D5043A1211565277A51E5065E9879@exchkc02.stowers-institute.org>
References: <CED81D34E37D5043A1211565277A51E5065E9879@exchkc02.stowers-institute.org>
Message-ID: <6dce9a0b0610051142h56479843ofc5429d959cb6e3@mail.gmail.com>

I think it's fine unless there is a significant performance hit, in
which case the change should be made into a configuration option. Do
you know if there is any overhead on doing this?

Lincoln

On 10/5/06, Cook, Malcolm <MEC at stowers-institute.org> wrote:
> Lincoln,
>
> I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
> freeze which should allow SeqFeature objects to survive database
> freeze/thaw cycles across architectures.
>
> I hope I was not presumptuous or in error in doing this....
>
> Regards,
>
> Malcolm Cook
> Database Applications Manager - Bioinformatics
> Stowers Institute for Medical Research - Kansas City, Missouri
>
>




From torsten.seemann at infotech.monash.edu.au  Fri Oct  6 01:26:10 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Fri, 06 Oct 2006 15:26:10 +1000
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
References: <452344D4.8070908@infotech.monash.edu.au>
	<22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
Message-ID: <4525E8F2.1000704@infotech.monash.edu.au>

Hilmar,

> I don't think there's a need to deprecate - if the methods just plain  
> delegate to whatever File:: module is appropriate their  
> implementation (supposedly) will become very simple and hence won't  
> pose a maintenance burden anymore.

>> I have an uncommitted simplified version of Bio::Root::IO which does
>> this, and "all tests pass". The functions currently (silently)
>> dispatch directly to their native counterparts.
>>
>> The only tricky function is tempfile() which is *mostly* like
>> File::Temp::tempfile(), but does some voodoo of converting
>> (TEMPLATE=>'xxx') to the non-hash first parameter of the File::
>> version, so I'm hesitant to commit. It may do other magic - Hilmar?
> 
> Not that I would know of. If the tests pass (without having to change  
> them!) I'd give it a try.

Tempfile.t had two tests that failed. It seems that Bio::Root::IO had 
some magic whereby it would keep a list of all tempfilenames created 
with UNLINK != 0 and when the Bio::Root::IO object was destroyed (e.g. 
undef $obj) it would MANUALLY unlink each of them. This would occur 
before File::Temp got to unlink them. Not sure why it was written like 
this (as File::Temp will delete them at the end of the script anyway) 
but maybe it was legacy for when File::Temp::tempfile WASN'T available.
Anyway, I've kept backward compatibility there, although I think 
eventually it should be removed and Tempfile.t adjusted.
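
For reference, a minimal sketch of the TEMPLATE conversion on top of File::Temp. my_tempfile is a hypothetical name and this is not the actual Bio::Root::IO code; it leaves out the list of files that the object unlinks on destruction:

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp ();

# Hypothetical wrapper: accepts TEMPLATE => 'nameXXXX' among the named
# options and moves it to the positional slot that
# File::Temp::tempfile() expects.
sub my_tempfile {
    my %opts     = @_;
    my $template = delete $opts{TEMPLATE};
    my @args     = defined $template ? ($template, %opts) : (%opts);
    return File::Temp::tempfile(@args);    # (filehandle, filename)
}

my ($fh, $filename) = my_tempfile(
    TEMPLATE => 'bioperlXXXXXX',
    DIR      => File::Spec->tmpdir,
    UNLINK   => 1,    # File::Temp removes the file at program exit
);
print {$fh} "scratch data\n";
close $fh;
print "wrote $filename\n";
```

With UNLINK => 1 the file is cleaned up by File::Temp at exit, which is why the manual unlink in the old code looks redundant to me.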

Although all tests pass with my new trim Bio/Root/IO.pm I am still 
concerned about committing as the assumption is that the BioPerl test 
suite is good enough to handle such a change to an important module, but 
the reality may be different :-)

Let me know if you think I should commit anyway,

Your advice is appreciated.

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From dmessina at wustl.edu  Fri Oct  6 01:25:56 2006
From: dmessina at wustl.edu (David Messina)
Date: Fri, 6 Oct 2006 00:25:56 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45258361.8080803@campus.iztacala.unam.mx>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
	<45256E18.3080103@purdue.edu>
	<45258361.8080803@campus.iztacala.unam.mx>
Message-ID: <CC2947DA-2F65-4318-B85A-99A8DC7835B0@wustl.edu>


On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote:
> I think this value can be stored in one of the index files and  
> passed as an argument to the deob_index.pl script. What do you think?

Yep, I think that works nicely. I added this feature and committed it  
to CVS. Here's what the new header looks like if you do deob_index.pl  
-s "bioperl-live":

[attached image: deob_header.jpg]

Thanks for the suggestions, guys.

Dave

-------------- next part --------------
A non-text attachment was scrubbed...
Name: deob_header.jpg
Type: image/jpeg
Size: 25739 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061006/1c5819f9/attachment-0002.jpg>

From deep_ans at yahoo.com  Fri Oct  6 09:22:49 2006
From: deep_ans at yahoo.com (deepak shingan)
Date: Fri, 6 Oct 2006 06:22:49 -0700 (PDT)
Subject: [Bioperl-l] Sort blast file result according to evalues
Message-ID: <20061006132249.49450.qmail@web51711.mail.yahoo.com>

Hi ,
  Is there any way to sort a BLAST report by the evalue of each hit? I want the output sorted by hit evalue. I am using SearchIO and have already tried sorting the hits by bits and gaps, but I am not able to sort the hits by evalue, as evalues are mainly associated with HSPs and each hit may have multiple HSPs.
   
  Waiting for help.
   
  Thanks,
  Dun Dansi
   
   


From hlapp at gmx.net  Fri Oct  6 10:03:04 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 6 Oct 2006 10:03:04 -0400
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <4525E8F2.1000704@infotech.monash.edu.au>
References: <452344D4.8070908@infotech.monash.edu.au>
	<22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
	<4525E8F2.1000704@infotech.monash.edu.au>
Message-ID: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net>

This is a 1.5, i.e. developers release that's in the works, and also  
you'd be doing this on the main trunk. If you get the tests to pass  
there's no reason to hold back.

You may be right and in reality it has repercussions somewhere, but  
those will be the opportunities to improve our test suite.

	-hilmar

On Oct 6, 2006, at 1:26 AM, Torsten Seemann wrote:

> Although all tests pass with my new trim Bio/Root/IO.pm I am still  
> concerned about committing as the assumption is that the BioPerl  
> test suite is good enough to handle such a change to an important  
> module, but the reality may be different :-)
>
> Let me know if you think I should commit anyway,
>
> Your advice is appreciated.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Oct  6 10:58:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 6 Oct 2006 09:58:09 -0500
Subject: [Bioperl-l] Sort blast file result according to evalues
In-Reply-To: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
References: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
Message-ID: <F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>

The evalue for the hit is retrieved by the BlastHit::significance()
method, if I remember correctly.  So if $hit is a
Bio::Search::Hit::BlastHit object, you use $hit->significance.  If
you want individual HSP evalues, you would use $hsp->evalue for the
individual HSP objects.

The output is normally sorted by the order they appear in the
alignments and table, which is typically by increasing evalue or
decreasing bits (score).  So they are already sorted.  If you wanted
to run a sort yourself you could use a sort block such as
'{$a->significance() <=> $b->significance()} @hits', but as pointed
out on the wiki it may be safer to run a Schwartzian transform instead:

http://www.bioperl.org/wiki/Bioperl_Best_Practices#Sorting
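
A minimal sketch of the transform, so it runs without a BLAST report at hand. My::Hit is a made-up stand-in with a significance() method; with SearchIO you would collect the real hit objects into @hits instead:

```perl
use strict;
use warnings;

# Minimal stand-in for a hit object (hypothetical class for illustration).
package My::Hit;
sub new          { my ($class, %args) = @_; return bless {%args}, $class }
sub name         { return $_[0]{name} }
sub significance { return $_[0]{significance} }

package main;

my @hits = (
    My::Hit->new(name => 'hitA', significance => '3e-10'),
    My::Hit->new(name => 'hitB', significance => '1e-50'),
    My::Hit->new(name => 'hitC', significance => '0.002'),
);

# Schwartzian transform: call significance() once per hit, sort on the
# cached value, then unwrap the hit objects again.
my @sorted =
    map  { $_->[1] }
    sort { $a->[0] <=> $b->[0] }
    map  { [ $_->significance, $_ ] } @hits;

print join(' ', map { $_->name } @sorted), "\n";    # hitB hitA hitC
```

The method is called once per hit rather than on every comparison, which matters more as the hit list grows.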

Chris

On Oct 6, 2006, at 8:22 AM, deepak shingan wrote:

> Hi ,
>   Is  there any way to parse the blast file according to evalue for  
> each hit. I want the output sorted according to hit evalue. I am  
> using SearchIO algorithm and already tried sorting the hits  
> according to bits, gaps, but I am not able to sort the hits by evalue.
>   As evalues are mainly associated with hsp and each hit may have  
> multiple hsps.
>
>   waiting for help.
>
>   Thanks,
>   Dun Dansi
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Oct  6 11:03:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 6 Oct 2006 10:03:45 -0500
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net>
References: <452344D4.8070908@infotech.monash.edu.au>
	<22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
	<4525E8F2.1000704@infotech.monash.edu.au>
	<074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net>
Message-ID: <265AD609-F74E-4545-B3DD-FF94290BE0B4@uiuc.edu>

On Oct 6, 2006, at 9:03 AM, Hilmar Lapp wrote:

> This is a 1.5, i.e. developers release that's in the works, and also
> you'd be doing this on the main trunk. If you get the tests to pass
> there's no reason to hold back.
>
> You may be right and in reality it has repercussions somewhere, but
> those will be the opportunities to improve our test suite.
>
> 	-hilmar

Agreed, though I think Sendu only wants bug fixes for 1.5.2.  You  
could always commit to CVS HEAD and it could be in 1.5.3.

Let me rethink that.  There were some subtle tempfile/tempdir issues  
that were popping up on WinXP where some tempfiles were not being  
deleted b/c of permissions issues; I had planned on adding that to  
Bugzilla today or tomorrow.  Maybe changing to File::Temp would fix  
that, so in essence it would be a bug fix!

I'll go ahead and post the bug.

Chris

>> Although all tests pass with my new trim Bio/Root/IO.pm I am still
>> concerned about committing as the assumption is that the BioPerl
>> test suite is good enough to handle such a change to an important
>> module, but the reality may be different :-)
>>
>> Let me know if you think I should commit anyway,
>>
>> Your advice is appreciated.
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From pmiguel at purdue.edu  Fri Oct  6 11:06:56 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Fri, 06 Oct 2006 11:06:56 -0400
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>	<45256E18.3080103@purdue.edu>
	<5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu>
Message-ID: <45267110.7030905@purdue.edu>

David Messina wrote:
> Thanks so much, Phillip, for taking the time to check out the new  
> version and send your comments. I really appreciate it! I've added  
> them to the wiki page so I can track them.
>
> Best,
> Dave
>   
Dave,
    No problem.
    I've just added a "keyword" to search BioPerl Deobfuscator to my 
Firefox browser. That way I can just type "deob qual" in my URL bar in 
firefox and the browser jumps directly to BioPerl Deobfuscator (like a 
bookmark) but it pre-submits the search item "qual".
    I heard about the Firefox "keywords" in a TWiT/FLOSS episode on 
mozilla. You just go to any search page and right-click in the search 
box of interest and one of the choices is "Add a Keyword for this 
Search". Then you just have to fill out "Name" and "Keyword" fields and 
drop the keyword into whatever folder you like. The "Keyword" then 
becomes the word to invoke that search with parameters that follow it 
when it is typed into the URL bar.
Phillip


From arareko at campus.iztacala.unam.mx  Fri Oct  6 11:18:02 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Fri, 06 Oct 2006 10:18:02 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <CC2947DA-2F65-4318-B85A-99A8DC7835B0@wustl.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>	<45256E18.3080103@purdue.edu>	<45258361.8080803@campus.iztacala.unam.mx>
	<CC2947DA-2F65-4318-B85A-99A8DC7835B0@wustl.edu>
Message-ID: <452673AA.7070305@campus.iztacala.unam.mx>

Looks great! I'll update it during the weekend.

Mauricio.

David Messina wrote:
> 
> On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote:
>> I think this value can be stored in one of the index files and passed 
>> as an argument to the deob_index.pl script. What do you think?
> 
> Yep, I think that works nicely. I added this feature and committed it to 
> CVS. Here's what the new header looks like if you do deob_index.pl -s 
> "bioperl-live":
> 
> 
> Thanks for the suggestions, guys.
> 
> Dave
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From bix at sendu.me.uk  Fri Oct  6 11:27:14 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 06 Oct 2006 16:27:14 +0100
Subject: [Bioperl-l] Sort blast file result according to evalues
In-Reply-To: <F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>
References: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
	<F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>
Message-ID: <452675D2.9090803@sendu.me.uk>

Chris Fields wrote:
> The evalue for the hit is retrieved by the BlastHit::significance()  
> method, if I remember correctly.  So if $hit is a  
> Bio::Search::Hit::BlastHit object, you use $hit->significance.  If  
> you want individual HSP evalues, you would use $hsp->evalue for the  
> individual HSP objects.
> 
> The output is normally sorted by the order they appear in the  
> alignments and table, which is typically by increasing evalue or  
> decreasing bits (score).  So they are already sorted.

Concur.


> If you wanted to run a sort yourself you could use a sort block using
> '{$a->significance() <=> $b->significance()} @hits'

Actually, it is best to use the sort_hits() method of the result object 
prior to asking for any hits. (As this allows for potential optimization 
in the parser.)

->significance is still the thing you need to sort on though.


From cjfields at uiuc.edu  Fri Oct  6 11:52:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 6 Oct 2006 10:52:57 -0500
Subject: [Bioperl-l] Sort blast file result according to evalues
In-Reply-To: <452675D2.9090803@sendu.me.uk>
References: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
	<F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>
	<452675D2.9090803@sendu.me.uk>
Message-ID: <31A6FC3A-8BEB-42B8-B51D-66E659EF7495@uiuc.edu>


On Oct 6, 2006, at 10:27 AM, Sendu Bala wrote:

>> If you wanted to run a sort yourself you could use a sort block using
>> '{$a->significance() <=> $b->significance()} @hits'
>
> Actually, it is best to use the sort_hits() method of the result  
> object
> prior to asking for any hits. (As this allows for potential  
> optimization
> in the parser.)

Ah, forgot about that one!

Chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Fri Oct  6 14:36:49 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 6 Oct 2006 11:36:49 -0700
Subject: [Bioperl-l] tempfile cleanup
In-Reply-To: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu>
References: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu>
Message-ID: <0FCEC6B2-E190-4800-AAB1-89559C552FA6@bioperl.org>

I think the reason for the cleanup trickery in there is that File::Temp
only cleans up tempfiles when Perl exits, not when the Root::IO object
goes out of scope -- so this can be a problem for people running CGI
scripts that stay resident in memory and never have their tempfiles
cleaned up.  Managing the list ourselves allows us to call _cleanup
periodically (perhaps before the start of every Blast run) to ensure
that tempfiles are removed.  Perhaps newer File::Temp versions can
solve this better now, but I believe that was the behavior we were
trying to deal with by having the Root::IO object manage the list of
to-be-deleted files.

This is some hackery that also had to do with not expecting
File::Temp to be installed, I believe.
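
A minimal sketch of the list-managing idea described above, assuming
only the core File::Temp module.  The TempTracker class and its
cleanup() method are hypothetical names chosen for illustration; this is
not the Bio::Root::IO implementation:

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical helper, not Bio::Root::IO: remembers every tempfile it
# creates so they can be removed on demand rather than at interpreter exit.
package TempTracker;

sub new { my ($class) = @_; return bless { files => [] }, $class }

sub tempfile {
    my ($self) = @_;
    # UNLINK => 0: we take responsibility for deleting the file ourselves.
    my ($fh, $path) = File::Temp::tempfile(UNLINK => 0);
    push @{ $self->{files} }, $path;
    return ($fh, $path);
}

sub cleanup {
    my ($self) = @_;
    my $removed = 0;
    for my $path (@{ $self->{files} }) {
        $removed += unlink $path if -e $path;
    }
    $self->{files} = [];
    return $removed;
}

package main;

my $tracker = TempTracker->new;
my ($fh, $path) = $tracker->tempfile;
print {$fh} "scratch data\n";
close $fh;    # close before unlinking; some platforms refuse otherwise

my $removed = $tracker->cleanup;    # e.g. called before each new run
print "removed $removed file(s)\n";
```

In a long-running process cleanup() can be called between jobs.  The
object interface of more recent File::Temp releases (File::Temp->new)
removes the file when the object is destroyed, which may cover the same
need.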

-jason


From torsten.seemann at infotech.monash.edu.au  Mon Oct  9 00:52:29 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Mon, 09 Oct 2006 14:52:29 +1000
Subject: [Bioperl-l] Multiple packages in the one .pm file
Message-ID: <4529D58D.1080004@infotech.monash.edu.au>

Hi all,

The following modules have more than one "package xxxx;" declaration in 
them. For small, internal classes I guess this is fine, but for others,
they should be split up into the filesystem - otherwise they are 
troublesome to locate and the online documentation doesn't list them!

eg.
bioperl-run/Bio/Tools/Run/Analysis/Job.pm
is in
bioperl-run/Bio/Tools/Run/Analysis.pm

Here's the culprits:

% for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | 
sed 's/:.*$//' | sort | uniq -d ; done

bioperl-live/Bio/AnalysisI.pm
bioperl-live/Bio/DB/Fasta.pm
bioperl-live/Bio/DB/GFF.pm
bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
bioperl-live/Bio/SeqIO/interpro.pm

bioperl-run/Bio/Tools/Run/Analysis.pm
bioperl-run/Bio/Tools/Run/Analysis/soap.pm

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From pmiguel at purdue.edu  Mon Oct  9 15:57:12 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Mon, 09 Oct 2006 15:57:12 -0400
Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC?
Message-ID: <452AA998.5010104@purdue.edu>

I found a bug in Bio::SeqIO::phd and am wondering if the fix will 
propagate into the next release candidate?

The bug is here:

http://bugzilla.open-bio.org/show_bug.cgi?id=2120

I also created a patch that fixes it (on my machine, anyway).  It is a 
fairly minor change, so it seems like it would be worth propagating it 
into the next release candidate.

-- 
Phillip SanMiguel


From bix at sendu.me.uk  Mon Oct  9 16:57:28 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 09 Oct 2006 21:57:28 +0100
Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC?
In-Reply-To: <452AA998.5010104@purdue.edu>
References: <452AA998.5010104@purdue.edu>
Message-ID: <452AB7B8.4040404@sendu.me.uk>

Phillip San Miguel wrote:
> I found a bug in Bio::SeqIO::phd and am wondering if the fix will 
> propagate into the next release candidate?
> 
> The bug is here:
> 
> http://bugzilla.open-bio.org/show_bug.cgi?id=2120
> 
> I also created a patch that fixes it (on my machine, anyway).  It is a 
> fairly minor change, so it seems like it would be worth propagating it 
> into the next release candidate.

If it gets committed to HEAD before I make the next candidate, then yes.
I'll do that if no one beats me to it (and if someone does, please add a 
new test for this).

BTW Phillip, thank you for the bug report but in future use the 
attachment capabilities for files, please don't paste them into the 
comments box.


From bix at sendu.me.uk  Mon Oct  9 17:01:56 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 09 Oct 2006 22:01:56 +0100
Subject: [Bioperl-l] Analysis soap problem
Message-ID: <452AB8C4.1010704@sendu.me.uk>

I thought I'd 'advertise' this bug on the list so more people see it:
http://bugzilla.open-bio.org/show_bug.cgi?id=2117

I don't want to make the next 1.5.2 release candidate until it's fixed. 
Does anyone have any idea about it? Even if you can't fix it, just 
explaining what's (supposed) to be going on would help a lot.

Thank you,
Sendu.


From Kevin.M.Brown at asu.edu  Mon Oct  9 18:40:54 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 9 Oct 2006 15:40:54 -0700
Subject: [Bioperl-l] Analysis soap problem
Message-ID: <1A4207F8295607498283FE9E93B775B40219690B@EX02.asurite.ad.asu.edu>

If I had to guess from looking at the snippet provided, the variable
$seq holds no data, so when you try to set up the regex /^$seq$/ you end
up with /^$/ (blank line) and the warning.
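
A small sketch of that guess, assuming the test really does build its
pattern by interpolating an undefined $seq (the actual test code is not
shown here, so this only illustrates the mechanism):

```perl
use strict;
use warnings;

my $seq;    # undefined, as guessed for the failing test
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# With $seq undefined the pattern degenerates to /^$/, which matches an
# empty string but not real sequence data, and perl warns about the
# uninitialized value while compiling the regex.
my $matches_empty = ('' =~ /^$seq$/)     ? 1 : 0;
my $matches_data  = ('ACGT' =~ /^$seq$/) ? 1 : 0;

print "matches empty: $matches_empty, matches data: $matches_data\n";
print "warning seen: ", (@warnings ? 'yes' : 'no'), "\n";
```

Checking defined($seq) before matching, and quoting the value with
\Q...\E, would avoid both the warning and accidental metacharacters.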

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Monday, October 09, 2006 2:02 PM
> To: bioperl-l List
> Subject: [Bioperl-l] Analysis soap problem
> 
> I thought I'd 'advertise' this bug on the list so more people see it:
> http://bugzilla.open-bio.org/show_bug.cgi?id=2117
> 
> I don't want to make the next 1.5.2 release candidate until 
> it's fixed. 
> Does anyone have any idea about it? Even if you can't fix it, just 
> explaining what's (supposed) to be going on would help a lot.
> 
> Thank you,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From cjfields at uiuc.edu  Mon Oct  9 22:34:23 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 9 Oct 2006 21:34:23 -0500
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <452AB8C4.1010704@sendu.me.uk>
References: <452AB8C4.1010704@sendu.me.uk>
Message-ID: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>

I have 'fixed' this in CVS.  Note the quotes; it depends on what you  
might consider fixed.  Multiple calls to results() were returning  
empty hash refs, so no data was being returned.   For now, I stored  
the hash reference in a variable then tested each one.  All tests now  
pass, including the 'outseq' one.

Maybe it's just me, but shouldn't results() either consistently  
return the same information, or contain documentation that it doesn't  
do so?  Anyway, I have left the bugzilla report open for now.
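
The workaround described above can be sketched as follows.  MockJob is a
hypothetical object that only mimics the reported symptom (data on the
first results() call, an empty hash ref afterwards); the real job class
may behave differently:

```perl
use strict;
use warnings;

# Hypothetical job object that mimics the reported symptom: the first call
# to results() returns data, later calls return an empty hash ref.
package MockJob;

sub new {
    my ($class) = @_;
    my $self = { pending => { outseq => 'ACGT', report => 'ok' } };
    return bless $self, $class;
}

sub results {
    my ($self) = @_;
    my $data = $self->{pending};
    $self->{pending} = {};
    return $data;
}

package main;

my $job = MockJob->new;

# Workaround: fetch once, keep the reference, test each key from the copy.
my $results = $job->results;
for my $name (qw(outseq report)) {
    print "$name: ", (exists $results->{$name} ? 'present' : 'missing'), "\n";
}

# A second call comes back empty with this mock.
my $again = $job->results;
print "second call keys: ", scalar(keys %$again), "\n";
```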

Chris

On Oct 9, 2006, at 4:01 PM, Sendu Bala wrote:

> I thought I'd 'advertise' this bug on the list so more people see it:
> http://bugzilla.open-bio.org/show_bug.cgi?id=2117
>
> I don't want to make the next 1.5.2 release candidate until it's fixed.
> Does anyone have any idea about it? Even if you can't fix it, just
> explaining what's (supposed) to be going on would help a lot.
>
> Thank you,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bosborne11 at verizon.net  Mon Oct  9 22:09:45 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 09 Oct 2006 22:09:45 -0400
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au>
Message-ID: <C1507929.AB8F%bosborne11@verizon.net>

Torsten,

Fixed interpro.pm, it could have been written more simply (or more like
other SeqIO modules). Can't really address the others.

Brian O.


On 10/9/06 12:52 AM, "Torsten Seemann"
<torsten.seemann at infotech.monash.edu.au> wrote:

> Hi all,
> 
> The following modules have more than one "package xxxx;" declaration in
> them. For small, internal classes I guess this is fine, but for others,
> they should be split up into the filesystem - otherwise they are
> troublesome to locate and the online documentation doesn't list them!
> 
> eg.
> bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> is in
> bioperl-run/Bio/Tools/Run/Analysis.pm
> 
> Here's the culprits:
> 
> % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio |
> sed 's/:.*$//' | sort | uniq -d ; done
> 
> bioperl-live/Bio/AnalysisI.pm
> bioperl-live/Bio/DB/Fasta.pm
> bioperl-live/Bio/DB/GFF.pm
> bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> bioperl-live/Bio/SeqIO/interpro.pm
> 
> bioperl-run/Bio/Tools/Run/Analysis.pm
> bioperl-run/Bio/Tools/Run/Analysis/soap.pm


From bix at sendu.me.uk  Tue Oct 10 03:03:20 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 08:03:20 +0100
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
References: <452AB8C4.1010704@sendu.me.uk>
	<86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
Message-ID: <452B45B8.8010401@sendu.me.uk>

Chris Fields wrote:
> I have 'fixed' this in CVS.  Note the quotes; it depends on what you  
> might consider fixed.  Multiple calls to results() were returning  
> empty hash refs, so no data was being returned.   For now, I stored  
> the hash reference in a variable then tested each one.  All tests now  
> pass, including the 'outseq' one.
> 
> Maybe it's just me, but shouldn't results() either consistently  
> return the same information, or contain documentation that it doesn't  
> do so?  Anyway, I have left the bugzilla report open for now.

Judging by the tests there seems a clear expectation that multiple calls 
to results() should work, and certainly that makes sense and seems 
natural. So I'd say that results() should be fixed and the test script 
reverted.


From cjfields at uiuc.edu  Tue Oct 10 07:42:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 06:42:33 -0500
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <452B45B8.8010401@sendu.me.uk>
References: <452AB8C4.1010704@sendu.me.uk>
	<86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
	<452B45B8.8010401@sendu.me.uk>
Message-ID: <A161CB8B-3997-4E09-A6E3-4A44B70BE4A0@uiuc.edu>

I agree, though I think Martin Senger should be contacted, at least  
to get his thoughts.  Has anyone tried yet?

Chris

On Oct 10, 2006, at 2:03 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> I have 'fixed' this in CVS.  Note the quotes; it depends on what you
>> might consider fixed.  Multiple calls to results() were returning
>> empty hash refs, so no data was being returned.   For now, I stored
>> the hash reference in a variable then tested each one.  All tests now
>> pass, including the 'outseq' one.
>>
>> Maybe it's just me, but shouldn't results() either consistently
>> return the same information, or contain documentation that it doesn't
>> do so?  Anyway, I have left the bugzilla report open for now.
>
> Judging by the tests there seems a clear expectation that multiple  
> calls
> to results() should work, and certainly that makes sense and seems
> natural. So I'd say that results() should be fixed and the test script
> reverted.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Oct 10 08:14:31 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 13:14:31 +0100
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <A161CB8B-3997-4E09-A6E3-4A44B70BE4A0@uiuc.edu>
References: <452AB8C4.1010704@sendu.me.uk>
	<86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
	<452B45B8.8010401@sendu.me.uk>
	<A161CB8B-3997-4E09-A6E3-4A44B70BE4A0@uiuc.edu>
Message-ID: <452B8EA7.1080800@sendu.me.uk>

Chris Fields wrote:
> I agree, though I think Martin Senger should be contacted, at least to 
> get his thoughts.  Has anyone tried yet?

He's CCd on the bug report, but I haven't tried directly, no. Do you 
want to tackle this (contacting him and/or fixing the bug)?

Cheers,
Sendu.


From cjfields at uiuc.edu  Tue Oct 10 09:20:03 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 08:20:03 -0500
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <452B8EA7.1080800@sendu.me.uk>
Message-ID: <001801c6ec6e$cc016900$15327e82@pyrimidine>

I'll try giving it a closer look, just didn't have much time yesterday.
I'll also try contacting Martin.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Tuesday, October 10, 2006 7:15 AM
> To: bioperl-l
> Subject: Re: [Bioperl-l] Analysis soap problem
> 
> Chris Fields wrote:
> > I agree, though I think Martin Senger should be contacted, at least to
> > get his thoughts.  Has anyone tried yet?
> 
> He's CCd on the bug report, but I haven't tried directly, no. Do you
> want to tackle this (contacting him and/or fixing the bug)?
> 
> Cheers,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From pmiguel at purdue.edu  Tue Oct 10 10:26:35 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Tue, 10 Oct 2006 10:26:35 -0400
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452AB7B8.4040404@sendu.me.uk>
References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk>
Message-ID: <452BAD9B.5010903@purdue.edu>

Sendu Bala wrote:
>
> BTW Phillip, thank you for the bug report but in future use the 
> attachment capabilities for files, please don't paste them into the 
> comments box.
>   
Sendu,
    Sounds reasonable to me. I should note, however, that when I entered the 
bug, I was looking for some method to attach files. There is none on the 
"Enter Bug: Bioperl" page:

http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl

Also, "bug writing guidelines" makes no mention of it. I vaguely 
remembered there being some method to do it--but given the "bug writing 
guidelines" exhortations to be specific and detailed, I thought I must 
put the information somewhere. So I put them in the only place offered 
(on that page)--"Description:"
    I see that, once submitted, attachments can be added to a bug 
report. Is that normally how it is done? Doesn't each attachment result 
in a separate email to the bioperl guts email list?
    Anyway,  I've just added the files to the bug report as attachments, 
in case someone needs them to construct a test.
   
-- 
Phillip


From bix at sendu.me.uk  Tue Oct 10 11:10:25 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 16:10:25 +0100
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452BAD9B.5010903@purdue.edu>
References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk>
	<452BAD9B.5010903@purdue.edu>
Message-ID: <452BB7E1.5020200@sendu.me.uk>

Phillip San Miguel wrote:
> Sendu Bala wrote:
>> BTW Phillip, thank you for the bug report but in future use the 
>> attachment capabilities for files, please don't paste them into the
>>  comments box.
>> 
> Sendu, Sounds reasonable to me. I should note, however; when I
> entered the bug, I was looking for some method to attach files. There
> is none on the "Enter Bug: Bioperl" page:
> 
> http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl
> 
> Also, "bug writing guidelines" makes no mention of it. I vaguely 
> remembered there being some method to do it--but given the "bug
> writing guidelines" exhortations to be specific and detailed, I
> thought I must put the information somewhere. So I put them in the
> only place offered (on that page)--"Description:"

I agree that things could be better here. Who looks after bugzilla, and
is this an alterable feature?


> I see that, once submitted, attachments can be added to a bug report.
> Is that normally how it is done?

Yes, AFAIK.


> Doesn't each attachment result in a separate email to the bioperl
> guts email list?

Yes, but that's not a problem. In fact, doing it this way means you
don't email everyone subscribed to guts your big files in plain text,
but instead they get a small email with a link to the download.


> Anyway,  I've just added the files to the bug report as attachments,
>  in case someone needs them to construct a test.

Thank you.


From arareko at campus.iztacala.unam.mx  Tue Oct 10 11:14:00 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Tue, 10 Oct 2006 10:14:00 -0500
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452BAD9B.5010903@purdue.edu>
References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk>
	<452BAD9B.5010903@purdue.edu>
Message-ID: <452BB8B8.40409@campus.iztacala.unam.mx>

Phillip San Miguel wrote:
> I see that, once submitted, attachments can be added to a bug report.
>  Is that normally how it is done?

Yes, it's the normal method: create the bug report, then attach files.

> Doesn't each attachment result in a separate email to the bioperl 
> guts email list?

Adding a file will generate an informative email per bug change 
(attaching the file in this case) but won't send the attachment to the list.

Regards,
Mauricio.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From cjfields at uiuc.edu  Tue Oct 10 11:20:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 10:20:55 -0500
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452BAD9B.5010903@purdue.edu>
Message-ID: <002801c6ec7f$ae8d85f0$15327e82@pyrimidine>

> Also, "bug writing guidelines" makes no mention of it. I vaguely
> remembered there being some method to do it--but given the "bug writing
> guidelines" exhortations to be specific and detailed, I thought I must
> put the information somewhere. So I put them in the only place offered
> (on that page)--"Description:"
>     I see that, once submitted, attachments can be added to a bug
> report. Is that normally how it is done? Doesn't each attachment result
> in a separate email to the bioperl guts email list?
>     Anyway,  I've just added the files to the bug report as attachments,
> in case someone needs them to construct a test.

Phillip,

Initial bug reports only require the general description, OS used, bioperl
version, etc.  That's quite normal.  Any relevant attachments are added
afterward.  We should probably make that clearer upfront on the wiki page; I
don't know if anyone can make similar changes to bugzilla.

Any bug changes, CVS commits, etc are mailed to bioperl-guts, yes.  That
isn't an issue though; it keeps the developers updated on the various
bugs/commits that are going on and is a pretty common practice.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 10 12:48:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 11:48:22 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au>
References: <4529D58D.1080004@infotech.monash.edu.au>
Message-ID: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>

There are a number of other bioperl-run examples (the  
Bio::Tools::Run::Analysis::soap issue I looked into revealed such).

I agree with both points, 1) that it depends on the size of the  
classes, and 2) from a maintainability standpoint, it can be very  
frustrating when looking for documentation.  Is there really any  
advantage to doing this?

Chris

On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote:

> Hi all,
>
> The following modules have more than one "package xxxx;"  
> declaration in
> them. For small, internal classes I guess this is fine, but for  
> others,
> they should be split up into the filesystem - otherwise they are
> troublesome to locate and the online documentation doesn't list them!
>
> eg.
> bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> is in
> bioperl-run/Bio/Tools/Run/Analysis.pm
>
> Here's the culprits:
>
> % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio |
> sed 's/:.*$//' | sort | uniq -d ; done
>
> bioperl-live/Bio/AnalysisI.pm
> bioperl-live/Bio/DB/Fasta.pm
> bioperl-live/Bio/DB/GFF.pm
> bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> bioperl-live/Bio/SeqIO/interpro.pm
>
> bioperl-run/Bio/Tools/Run/Analysis.pm
> bioperl-run/Bio/Tools/Run/Analysis/soap.pm
>
> -- 
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From lzhtom at hotmail.com  Tue Oct 10 15:42:48 2006
From: lzhtom at hotmail.com (zhihua li)
Date: Tue, 10 Oct 2006 19:42:48 +0000
Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise?
Message-ID: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>

Hi netters.

I've installed Bioperl 1.5.1, both core and run modules.  But when I tried 
to use the Pise module, an error occured saying that there's no "new" 
method in this package.

My script is:

use strict;
use warnings;
use Bio::Tools::Run::AnalysisFactory::Pise;
my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
my $program=$factory->program('mfold');
$program->seq('my_input_file');
my $job = $program->run();
print STDERR $job->contect('mfold.out');

The error message I got is:

Can't locate object method "new" via package 
"Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load 
"Bio::Tools::Run::AnalysisFactor::Pise"?)

I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm and 
it DOES contain a sub new.

So what's going on? Anyone could give me a hint?

Thanks a lot!


From cjfields at uiuc.edu  Tue Oct 10 16:27:27 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 15:27:27 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
	<6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
Message-ID: <E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>

Makes sense to me.  I think, as long as they're documented, it  
shouldn't be a problem.

I think the main point is that the class methods for these don't show  
up using perldoc (something I ran into with Bio::DB::Fasta's  
inclusion of Bio::PrimarySeq::Fasta), but they do show up when using  
other documentation.  So 'perldoc Bio::DB::Fasta' works, but 'perldoc  
Bio::PrimarySeq::Fasta' doesn't.  So these can be problematic when  
looking for specific methods.

However, I think pod2html handles multiple package declarations in  
one module, and the online PDOC documentation does as well.  Does the Deobfuscator?
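
For anyone who wants to check a single file without the grep/uniq one-liner Torsten posted, here is a minimal Perl sketch that lists the package declarations found in a chunk of source.  The function name and the sample module text are made up for illustration; like the grep, it only looks at lines that start with "package".

```perl
use strict;
use warnings;

# Return the package names declared in a chunk of Perl source.
# Only lines that begin with 'package' are considered.
sub declared_packages {
    my ($source) = @_;
    my @names = $source =~ /^package\s+([\w:]+)/mg;
    return @names;
}

# Hypothetical module text: one main class plus a small iterator class.
my $source = <<'END';
package My::DB::Store;
sub new { bless {}, shift }

package My::DB::Store::Iterator;
sub new  { bless { items => [] }, shift }
sub next { shift @{ $_[0]{items} } }
1;
END

my @packages = declared_packages($source);
print scalar(@packages), " packages: @packages\n" if @packages > 1;
```

To scan a whole checkout, one could read each .pm file and pass its contents to this function, for example with the core File::Find module.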

Chris

On Oct 10, 2006, at 3:11 PM, Lincoln Stein wrote:

> Hi,
>
> These ones are all mine:
>
> > bioperl-live/Bio/DB/Fasta.pm
> > bioperl-live/Bio/DB/GFF.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
>
> In each case, the second modules are teeny tiny ones that implement  
> iterators which are at most two methods long (typically a new() and  
> a next()). I prefer not to split them out because they will just  
> clutter up the file tree with stuff that is already well documented  
> in the "parent ship" modules.
>
> Lincoln
>
>
> On 10/10/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> There are a number of other bioperl-run examples (the
> Bio::Tools::Run::Analysis::soap issue I looked into revealed such).
>
> I agree with both points, 1) that it depends on the size of the
> classes, and 2) from a maintainability standpoint, it can be very
> frustrating when looking for documentation.  Is there really any
> advantage to doing this?
>
> Chris
>
> On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote:
>
> > Hi all,
> >
> > The following modules have more than one "package xxxx;"
> > declaration in
> > them. For small, internal classes I guess this is fine, but for
> > others,
> > they should be split up into the filesystem - otherwise they are
> > troublesome to locate and the online documentation doesn't list them!
> >
> > eg.
> > bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> > is in
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> >
> > Here's the culprits:
> >
> > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio |
> > sed 's/:.*$//' | sort | uniq -d ; done
> >
> > bioperl-live/Bio/AnalysisI.pm
> > bioperl-live/Bio/DB/Fasta.pm
> > bioperl-live/Bio/DB/GFF.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> > bioperl-live/Bio/SeqIO/interpro.pm
> >
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> > bioperl-run/Bio/Tools/Run/Analysis/soap.pm
> >
> > --
> > Dr Torsten Seemann               http://www.vicbioinformatics.com
> > Victorian Bioinformatics Consortium, Monash University, Australia
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> -- 
> Lincoln D. Stein
> Cold Spring Harbor Laboratory
> 1 Bungtown Road
> Cold Spring Harbor, NY 11724
> (516) 367-8380 (voice)
> (516) 367-8389 (fax)
> FOR URGENT MESSAGES & SCHEDULING,
> PLEASE CONTACT MY ASSISTANT,
> SANDRA MICHELSEN, AT michelse at cshl.edu

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 10 16:30:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 15:30:16 -0500
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
In-Reply-To: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <870B7500-AA83-42D7-965B-865B91AA8E7F@uiuc.edu>


On Oct 10, 2006, at 2:42 PM, zhihua li wrote:

> Hi netters.
>
> I've installed Bioperl 1.5.1, both core and run modules.  But when  
> I tried to use the Pise module, an error occurred saying that  
> there's no "new" method in this package.
>
> My script is:
>
> use strict;
> use warnings;
> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
> my $program=$factory->program('mfold');
> $program->seq('my_input_file');
> my $job = $program->run();
> print STDERR $job->contect('mfold.out');
>
> The error message I got is:
>
> Can't locate object method "new" via package  
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load  
> "Bio::Tools::Run::AnalysisFactor::Pise"?)
>
> I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm
> and it DOES contain a sub new.
>
> So what's going on? Anyone could give me a hint?
>
> Thanks a lot!

Well, according to your error output you have AnalysisFactory  
misspelled as 'AnalysisFactor' in the call to new(), so Perl is looking  
for a package that was never loaded.  Your 'use' line is spelled  
correctly; fix the spelling on the line that calls new().
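
To see why the error mentions "new" rather than the misspelling, here is a small sketch that reproduces the message without needing bioperl-run.  The My::AnalysisFactory::Pise package is a made-up stand-in for the real factory class, and try_new() is just a helper for this example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real factory class, so the sketch runs
# without bioperl-run installed.
package My::AnalysisFactory::Pise;
sub new { my ($class, %args) = @_; return bless {%args}, $class }

package main;

# Try to construct an object of the named class; return the object
# (or undef) together with whatever error message Perl raised.
sub try_new {
    my ($class) = @_;
    my $obj = eval { $class->new() };
    return ($obj, $@);
}

my ($ok,  $err_ok)  = try_new('My::AnalysisFactory::Pise');
my ($bad, $err_bad) = try_new('My::AnalysisFactor::Pise');   # missing 'y'

print "correct name gives an object of class ", ref($ok), "\n";
print "misspelled name gives: $err_bad";
```

Perl resolves the class name only when the method is called, so a misspelled class is reported as a missing method on a package that does not exist, even though the correctly spelled module was loaded.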

Chris


>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Oct 10 16:43:06 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 21:43:06 +0100
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
In-Reply-To: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <452C05DA.5050803@sendu.me.uk>

zhihua li wrote:
> Hi netters.
> 
> I've installed Bioperl 1.5.1, both core and run modules.  But when I 
> tried to use the Pise module, an error occurred saying that there's no 
> "new" method in this package.
> 
> My script is:
> 
> use strict;
> use warnings;
> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
> my $program=$factory->program('mfold');
> $program->seq('my_input_file');
> my $job = $program->run();
> print STDERR $job->contect('mfold.out');
> 
> The error message I got is:
> 
> Can't locate object method "new" via package 
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load 
> "Bio::Tools::Run::AnalysisFactor::Pise"?)
> 
> I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm 
> and it DOES contain a sub new.
> 
> So what's going on? Anyone could give me a hint?

You have a typo.

Bio::Tools::Run::AnalysisFactory::Pise, not
Bio::Tools::Run::AnalysisFactor::Pise


From lincoln.stein at gmail.com  Tue Oct 10 16:11:00 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 10 Oct 2006 16:11:00 -0400
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
Message-ID: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>

Hi,

These ones are all mine:

> bioperl-live/Bio/DB/Fasta.pm
> bioperl-live/Bio/DB/GFF.pm
> bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> bioperl-live/Bio/DB/SeqFeature/Store/memory.pm

In each case, the second modules are teeny tiny ones that implement
iterators which are at most two methods long (typically a new() and a
next()). I prefer not to split them out because they will just clutter up
the file tree with stuff that is already well documented in the "parent
ship" modules.
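
For readers who have not seen the pattern, here is a rough sketch of what such a file looks like: a main class and a tiny iterator class with only new() and next() in the same .pm file.  All of the names are hypothetical and are not taken from the modules listed above.

```perl
use strict;
use warnings;

# Hypothetical "parent" class; the names are made up for illustration.
package My::FeatureStore;

sub new {
    my ($class, @features) = @_;
    return bless { features => [@features] }, $class;
}

# Hand back an iterator over the stored features.
sub get_iterator {
    my ($self) = @_;
    return My::FeatureStore::Iterator->new($self->{features});
}

# Tiny helper class living in the same file: just new() and next().
package My::FeatureStore::Iterator;

sub new {
    my ($class, $features) = @_;
    return bless { queue => [@$features] }, $class;
}

sub next {
    my ($self) = @_;
    return shift @{ $self->{queue} };
}

package main;

my $store = My::FeatureStore->new(qw(gene exon intron));
my $iter  = $store->get_iterator;
while (defined(my $feature = $iter->next)) {
    print "$feature\n";
}
```

Because the iterator has no file of its own, perldoc on the iterator's package name finds nothing; its documentation has to live in the parent module's POD.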

Lincoln


On 10/10/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> There are a number of other bioperl-run examples (the
> Bio::Tools::Run::Analysis::soap issue I looked into revealed such).
>
> I agree with both points, 1) that it depends on the size of the
> classes, and 2) from a maintainability standpoint, it can be very
> frustrating when looking for documentation.  Is there really any
> advantage to doing this?
>
> Chris
>
> On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote:
>
> > Hi all,
> >
> > The following modules have more than one "package xxxx;"
> > declaration in
> > them. For small, internal classes I guess this is fine, but for
> > others,
> > they should be split up into the filesystem - otherwise they are
> > troublesome to locate and the online documentation doesn't list them!
> >
> > eg.
> > bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> > is in
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> >
> > Here's the culprits:
> >
> > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio |
> > sed 's/:.*$//' | sort | uniq -d ; done
> >
> > bioperl-live/Bio/AnalysisI.pm
> > bioperl-live/Bio/DB/Fasta.pm
> > bioperl-live/Bio/DB/GFF.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> > bioperl-live/Bio/SeqIO/interpro.pm
> >
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> > bioperl-run/Bio/Tools/Run/Analysis/soap.pm
> >
> > --
> > Dr Torsten Seemann               http://www.vicbioinformatics.com
> > Victorian Bioinformatics Consortium, Monash University, Australia
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From asjo at koldfront.dk  Tue Oct 10 16:04:35 2006
From: asjo at koldfront.dk (Adam Sjøgren)
Date: Tue, 10 Oct 2006 22:04:35 +0200
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <871wpglyy4.fsf@topper.koldfront.dk>

On Tue, 10 Oct 2006 19:42:48 +0000, zhihua wrote:

> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
                                               ^
                                               y
[...]

> Can't locate object method "new" via package
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load
> "Bio::Tools::Run::AnalysisFactor::Pise"?)

You missed a 'y' in "Factory".


  Best wishes,

-- 
 "We've reached a special place... Spiritually...             Adam Sjøgren
  ecumenically... grammatically."                        asjo at koldfront.dk


From dmessina at wustl.edu  Tue Oct 10 17:08:45 2006
From: dmessina at wustl.edu (David Messina)
Date: Tue, 10 Oct 2006 16:08:45 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
	<6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
	<E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>
Message-ID: <A38D8092-7DCE-4C1E-A00A-2F7E78276ED9@wustl.edu>

> However, I think pod2html handles multiple package declarations in
> one module, and the PDOC online do as well.  Does the Deobfuscator?

Nope. From my cursory examination at the time they mostly were, as  
Lincoln said, short and sweet, so I didn't consider it a big deal.

I do think the Deobfuscator should theoretically handle such cases  
anyway, though. I'll add it as a feature request on the wiki page. Or  
if you're chomping at the bit for it, I could certainly be beer- 
suaded to do it sooner rather than later... :)

Dave


From cjfields at uiuc.edu  Tue Oct 10 17:33:39 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 16:33:39 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <A38D8092-7DCE-4C1E-A00A-2F7E78276ED9@wustl.edu>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
	<6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
	<E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>
	<A38D8092-7DCE-4C1E-A00A-2F7E78276ED9@wustl.edu>
Message-ID: <7F35F565-7D28-4B06-A501-4D4083652C5C@uiuc.edu>

Me?  I'm a lowly postdoc.  Lincoln's got the cash!

Chris

On Oct 10, 2006, at 4:08 PM, David Messina wrote:

>> However, I think pod2html handles multiple package declarations in
>> one module, and the PDOC online do as well.  Does the Deobfuscator?
>
> Nope. From my cursory examination at the time they mostly were, as  
> Lincoln said, short and sweet, so I didn't consider it a big deal.
>
> I do think the Deobfuscator should theoretically handle such cases  
> anyway, though. I'll add it as a feature request on the wiki page.  
> Or if you're chomping at the bit for it, I could certainly be beer- 
> suaded to do it sooner rather than later... :)
>
> Dave
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From sdavis2 at mail.nih.gov  Wed Oct 11 05:43:35 2006
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 11 Oct 2006 05:43:35 -0400
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
In-Reply-To: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <452CBCC7.30108@mail.nih.gov>

zhihua li wrote:
> Hi netters.
>
> I've installed Bioperl 1.5.1, both core and run modules. But when I
> tried to use the Pise module, an error occurred saying that there's no
> "new" method in this package.
>
> My script is:
>
> use strict;
> use warnings;
> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
> my $program=$factory->program('mfold');
> $program->seq('my_input_file');
> my $job = $program->run();
> print STDERR $job->contect('mfold.out');
>
> The error message I got is:
>
> Can't locate object method "new" via package
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load
> "Bio::Tools::Run::AnalysisFactor::Pise"?)
>
> I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm
> and it DOES contain a sub new.
>
> So what's going on? Anyone could give me a hint?
>
> Thanks a lot!

The module name is Bio::Tools::Run::AnalysisFactory::Pise. Note that it
is not "factor" but "factory". That should probably fix your problem.

Sean


From jay at jays.net  Sat Oct  7 18:34:23 2006
From: jay at jays.net (Jay Hannah)
Date: Sat, 07 Oct 2006 17:34:23 -0500
Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult
Message-ID: <45282B6F.1030308@jays.net>

I just updated my bioperl-live this morning, so I think I'm current. :)

perldoc Bio::Search::Result::GenericResult
------------
SYNOPSIS
           # typically one gets Results from a SearchIO stream
           use Bio::SearchIO;
           my $io = new Bio::SearchIO(-format => 'blast',
                                       -file   => 't/data/HUMBETGLOA.tblastx');
           while( my $result = $io->next_result) {
               # process all search results within the input stream
               while( my $hit = $result->next_hits()) {
-------------

Except that "next_hits()" does not exist. Should be "next_hit()".
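
A quick way to confirm this kind of thing is to ask the class itself.  The sketch below assumes BioPerl may or may not be installed, so it loads the module inside an eval; methods_available() is a helper written for this example, not part of BioPerl.

```perl
use strict;
use warnings;

# Report which of the given method names a class provides.
# Returns nothing if the class cannot be loaded (e.g. no BioPerl).
sub methods_available {
    my ($class, @methods) = @_;
    (my $file = "$class.pm") =~ s{::}{/}g;
    return unless eval { require $file; 1 };
    my %has = map { $_ => ($class->can($_) ? 1 : 0) } @methods;
    return \%has;
}

my $status = methods_available('Bio::Search::Result::GenericResult',
                               qw(next_hit next_hits));
if ($status) {
    printf "%-10s %s\n", $_, $status->{$_} ? 'exists' : 'missing'
        for sort keys %$status;
}
else {
    print "Bio::Search::Result::GenericResult could not be loaded\n";
}
```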

(Should I have posted a patch instead?)

Thanks,

j


From bosborne11 at verizon.net  Tue Oct 10 18:42:25 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 10 Oct 2006 18:42:25 -0400
Subject: [Bioperl-l] Documentation typo:
	Bio::Search::Result::GenericResult
In-Reply-To: <45282B6F.1030308@jays.net>
Message-ID: <C1519A11.ABD1%bosborne11@verizon.net>

j,

No need, not for something so simple.

Brian O.


On 10/7/06 6:34 PM, "Jay Hannah" <jay at jays.net> wrote:

> Except that "next_hits()" does not exist. Should be "next_hit()".
> 
> (Should I have posted a patch instead?)


From zchou at cau.edu.cn  Wed Oct 11 02:34:24 2006
From: zchou at cau.edu.cn (zhuocheng Hou)
Date: Wed, 11 Oct 2006 14:34:24 +0800
Subject: [Bioperl-l] about retreive alinged sequence
Message-ID: <000a01c6ecff$4ea4b2f0$0915020a@zchou>

Hello,everyone,

I am a new user of Bioperl. I want to align multiple DNA sequences based on their translated proteins using clustalw. However, I don't know how to retrieve the aligned sequences.

The code is as follows (from the HOWTOPAML tutorial):

         #
         # This code runs, and I can see the clustalw output printed on the screen
         .......
         my $aa_aln = $aln_factory->align(\@prots,@params);
         # project the protein alignment back to CDS coordinates
         my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs);  
         my @each = $dna_aln->each_seq();         
         
         # The following code was written by me. However, it doesn't work. I want to get the $dna_aln contents and write them to a file. 


         my $in  = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta');
         my $aln=$dna_aln;
         my $out = Bio::AlignIO->new(-file => ">out.msf" ,
                                   -format => 'msf');
         #print $out $_ while <$in>; 
         while ($aln = $in->next_aln() ) {
               my $out->write_aln($aln);
         }
         

Best regards,

Zhuocheng
CAU


From n.haigh at sheffield.ac.uk  Wed Oct 11 10:00:33 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 11 Oct 2006 15:00:33 +0100
Subject: [Bioperl-l] about retreive alinged sequence
In-Reply-To: <000a01c6ecff$4ea4b2f0$0915020a@zchou>
References: <000a01c6ecff$4ea4b2f0$0915020a@zchou>
Message-ID: <452CF901.6020409@sheffield.ac.uk>

Dear Zhuocheng

I'm not familiar with the aa_to_dna_aln method, but it appears from 
your code that it returns an alignment object. Please find comments 
inserted below - hope they help!

Nathan

zhuocheng Hou wrote:
> Hello,everyone,
>
> I am a new user of Bioperl. I want to align multiple DNA sequences based on their translated proteins using clustalw. However, I don't know how to retrieve the aligned sequences.
>
> The code is as follows (from the HOWTOPAML tutorial):
>
>          #
>          # This code runs, and I can see the clustalw output printed on the screen
>          .......
>          my $aa_aln = $aln_factory->align(\@prots,@params);
>          # project the protein alignment back to CDS coordinates
>          my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs);  
>   
$dna_aln should be an alignment object (a Bio::SimpleAlign), so all you need 
to do is set up an output stream to write it, similar to what you 
wrote below, i.e.

my $out = Bio::AlignIO->new(-file => ">out.msf" ,
                                   -format => 'msf');

Then simply write the alignment ($dna_aln) to the output stream 
with this (note that there is no 'my' in front of $out here):

$out->write_aln($dna_aln);
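
Putting that together, here is a self-contained sketch.  It assumes BioPerl is installed (it does nothing otherwise), and it builds a toy two-sequence alignment in place of the $dna_aln from your script; write_alignment() and the sequence data are made up for the example.

```perl
use strict;
use warnings;

# Write an alignment object to a file in the given format.
# Returns 1 on success, 0 if Bio::AlignIO is not installed.
sub write_alignment {
    my ($aln, $file, $format) = @_;
    return 0 unless eval { require Bio::AlignIO; 1 };
    my $out = Bio::AlignIO->new(-file => ">$file", -format => $format);
    $out->write_aln($aln);    # no 'my' in front of $out here
    $out->close;
    return 1;
}

# Build a toy alignment standing in for $dna_aln, then write it as MSF.
if (eval { require Bio::SimpleAlign; require Bio::LocatableSeq; 1 }) {
    my $aln = Bio::SimpleAlign->new();
    $aln->add_seq(Bio::LocatableSeq->new(-id => 'seq1', -seq => 'ATGCCA',
                                         -start => 1, -end => 6));
    $aln->add_seq(Bio::LocatableSeq->new(-id => 'seq2', -seq => 'ATG--A',
                                         -start => 1, -end => 4));
    write_alignment($aln, 'out.msf', 'msf');
}
```

There is no need for an input Bio::AlignIO stream at all, because the alignment is already in memory.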


>          my @each = $dna_aln->each_seq();         
>          
>          # The following code was written by me. However, it doesn't work. I want to get the $dna_aln contents and write them to a file. 
>
>
>          my $in  = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta');
>          my $aln=$dna_aln;
>          my $out = Bio::AlignIO->new(-file => ">out.msf" ,
>                                    -format => 'msf');
>          #print $out $_ while <$in>; 
>          while ($aln = $in->next_aln() ) {
>                my $out->write_aln($aln);
>          }
>          
>
> Best regards,
>
> Zhuocheng
> CAU
>
>   
> ------------------------------------------------------------------------
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From melcher at rescomp.berkeley.edu  Wed Oct 11 17:09:17 2006
From: melcher at rescomp.berkeley.edu (Graham Melcher)
Date: Wed, 11 Oct 2006 14:09:17 -0700
Subject: [Bioperl-l] Accessing GO through MYSQL?
Message-ID: <20061011210917.GA783@rescomp.berkeley.edu>

Hey all,

Preface:: This is my first post to this list, please redirect if my
questions belong elsewhere.  

I need to lookup GO ontology information given GO:Accessors, and I have
a local mysql db that mirrors the GO db from that website.  I am not
sure if the Bio::Ontology::* libraries were designed to be used in a
dynamic, load-as-you-need sort of way, and am wondering how other people
have gone about solving this problem.  Details follow...

Right now I'm using Class::DBI to access the MySQL database, and I have made a
new set of subclasses of Bio::Ontology::TermI and
Bio::Ontology::RelationshipI which use these Class::DBI objects to
access the relevant information in the database on the fly.
Unfortunately, I got stuck implementing some of
the other Bio::Ontology::*I, especially Ontology.   Making all of these
subclasses seems infeasible, or at least enough work that it might already be
available somewhere.  Are MySQL accessors out there, and I just haven't
found them, or is Bio::Ontology possibly not the way to go?  

Alternatively, if I end up having to write this sort of Bio::Ontology -
Class::DBI interface, would anyone be interested in it being made
generally usable and available?

Finally, I just found go-perl, but although I haven't had a lot of time
to look into it, it doesn't seem to use mysql either.

Thanks!

Graham

-- 
Graham Melcher


From sdavis2 at mail.nih.gov  Thu Oct 12 07:51:14 2006
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 12 Oct 2006 07:51:14 -0400
Subject: [Bioperl-l] Accessing GO through MYSQL?
In-Reply-To: <20061011210917.GA783@rescomp.berkeley.edu>
References: <20061011210917.GA783@rescomp.berkeley.edu>
Message-ID: <452E2C32.7070502@mail.nih.gov>

Graham Melcher wrote:
> Finally, I just found go-perl, but although I haven't had a lot of time
> to look into it, it doesn't seem to use mysql either.
>   
Yep.  Keep going.  Go-perl and Go-db-perl:

http://www.godatabase.org/dev/go-db-perl/doc/go-db-perl-doc.html

Sean


From hlapp at gmx.net  Thu Oct 12 00:44:49 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 12 Oct 2006 00:44:49 -0400
Subject: [Bioperl-l] NESCent Phyloinformatics Hackathon
Message-ID: <939B253E-2F87-450A-A277-78B5645D3494@gmx.net>

(apologies in advance to those who receive this multiple times)

The National Evolutionary Synthesis Center (NESCent) in collaboration  
with Arlin Stoltzfus (U. Maryland, NIST), Aaron Mackey (GSK), Rutger  
Vos (UBC), and Mark Holder (FSU) sponsors a Phyloinformatics  
Hackathon to take place Dec 11-15 in Durham, NC.

The (wiki) website with more information and a formal proposal is at

	https://www.nescent.org/wg_phyloinformatics/

In short, the goal is to leverage the Bio* toolkits to provide the  
"glue" for evolutionary analyses of various types that depend on  
automation, interoperability, and data integration.

CALL FOR INPUT:

The specific objectives are driven by "use cases", that is, specific  
target problems of interest to evolutionary biologists (click 'Use  
Cases' at the above website). We invite community input in order to  
focus efforts on the most urgent or pervasive problems. The wiki for  
the hackathon allows direct editing of the use cases after  
registration. You may also upload data files, or add comments to the  
"Forum" page. Alternatively, send email to hlapp at nescent.org. You  
may also contact any of the organizers with questions or comments.

ATTENDANCE:

The hackathon is scheduled for Dec 11-15, 2006 in Durham NC. Space is  
limited, and attendance is by invitation. If you have not been  
contacted but desire to attend, please contact Hilmar Lapp
(hlapp at nescent.org).

ORGANIZERS:

Hilmar Lapp (NESCent; hlapp at nescent.org)
Aaron Mackey (GSK; aaron.j.mackey at gsk.com)
Mark Holder (FSU; mholder at scs.fsu.edu)
Arlin Stoltzfus (CARB, NIST; arlin.stoltzfus at nist.gov)
Todd Vision (NESCent; tjv at bio.unc.edu)
Rutger Vos (UBC; rvosa at sfu.ca)


From neetisomaiya at gmail.com  Thu Oct 12 02:03:20 2006
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 12 Oct 2006 11:33:20 +0530
Subject: [Bioperl-l] need help urgently
Message-ID: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>

We are using BioPerl to parse a BLAST output file, and then we want to load
full alignments into a CLOB column in one of our database tables. We are
trying to use sql loader for the same. Anyone has an idea how we can go
about it?
We have tried loading sequences into CLOB columns using sql loader, and that
works fine, but the same syntax when used for loading alignments, is not
working.

-- 
-Neeti
Even my blood says, B positive


From sayali_salodkar at persistent.co.in  Thu Oct 12 06:16:34 2006
From: sayali_salodkar at persistent.co.in (Sayali)
Date: Thu, 12 Oct 2006 15:46:34 +0530
Subject: [Bioperl-l] regarding polyphred output
Message-ID: <006e01c6ede7$7f10bef0$00b1580a@persistent.co.in>

Hi, 

I want to parse the output of PolyPhred
(http://droog.gs.washington.edu/PolyPhred.html). Is there a parser already
available in Bioperl that would help me do this?

Thanks,

Sayali

 


From sdavis2 at mail.nih.gov  Thu Oct 12 06:40:12 2006
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 12 Oct 2006 06:40:12 -0400
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
Message-ID: <200610120640.12250.sdavis2@mail.nih.gov>

On Thursday 12 October 2006 02:03, neeti somaiya wrote:
> We are using BioPerl to parse a BLAST output file, and then we want to load
> full alignments into a CLOB column in one of our database tables. We are
> trying to use sql loader for the same. Anyone has an idea how we can go
> about it?
> We have tried loading sequences into CLOB columns using sql loader, and
> that works fine, but the same syntax when used for loading alignments, is
> not working.

Neeti,

You'll need to be a bit more specific about what you are doing.  Can you post 
the code you are using and error messages?  Also, what is "sql loader"?  And 
what database are you trying to use?

Sean


From crabtree at tigr.ORG  Thu Oct 12 07:28:06 2006
From: crabtree at tigr.ORG (Jonathan Crabtree)
Date: Thu, 12 Oct 2006 07:28:06 -0400
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
Message-ID: <452E26C6.6040800@tigr.org>


Hi Neeti-

neeti somaiya wrote:
> We are using BioPerl to parse a BLAST output file, and then we want to load
> full alignments into a CLOB column in one of our database tables. We are
> trying to use sql loader for the same. Anyone has an idea how we can go
> about it?
>   

This doesn't sound like a BioPerl issue per se, so this list might not
be the best venue for your question.  Since SQL*Loader is an Oracle
utility you may have better luck in a forum frequented by Oracle DBAs
and/or general bioinformatics people.  (Not that this isn't such a
forum, but unless your difficulty is actually being caused by BioPerl,
or there's some kind of SQL*Loader wrapper in BioPerl--which I don't
think is the case--you run the risk of having people complain that your
question doesn't have enough to do with BioPerl.)

> We have tried loading sequences into CLOB columns using sql loader, and that
> works fine, but the same syntax when used for loading alignments, is not
> working.
>   

It's been a while since I've done any work with SQL*Loader, but I'd
guess that the reason it works with sequences and not alignments is
because there are characters in the alignments (newlines, perhaps?) that
SQL*Loader is incorrectly interpreting as either column (field) or row
(record) delimiters.  You may need to change your flat file encoding to
use delimiters other than the defaults (and alter the SQL*Loader control
file accordingly.)  As Sean pointed out, however, it's difficult to be
much help without seeing an example of a failed input and the
corresponding error(s)!  One other thing I remember about SQL*Loader (as
of Oracle 8-9 or so) is that all the CLOB values had to appear *last* in
the SQL*Loader record, at least if you were using variable-length
fields.  But since you've loaded sequences successfully, I doubt this is
the issue.  One final thought is that I believe SQL*Loader has an option
whereby you can place your LOB values in files external to the main
SQL*Loader input file, which sidesteps the field/row delimiter issue
completely; you may want to look into this if you're not already loading
your Oracle database this way.
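
If you go the external-file route, the Perl side could look roughly like
the sketch below. It is only an illustration under assumptions: the
%alignment_for hash, the hit names and the .aln/.dat file names are made
up, and in practice the alignment text would come from your own BLAST
parsing. Each alignment goes into its own file, and the main data file
holds one short, newline-free record per alignment:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical data: hit name => multi-line alignment text.
my %alignment_for = (
    hit1 => "Query: 1 ACGT 4\n         ||||\nSbjct: 9 ACGT 12\n",
    hit2 => "Query: 5 TTGA 8\n         || |\nSbjct: 2 TTCA 5\n",
);

my $dir   = tempdir(CLEANUP => 1);
my $index = File::Spec->catfile($dir, 'alignments.dat');

open my $idx, '>', $index or die "Cannot write $index: $!";
foreach my $hit (sort keys %alignment_for) {
    # The LOB value lives in its own file, newlines and all.
    my $lob_file = File::Spec->catfile($dir, "$hit.aln");
    open my $out, '>', $lob_file or die "Cannot write $lob_file: $!";
    print {$out} $alignment_for{$hit};
    close $out or die "Cannot close $lob_file: $!";

    # The main record only carries the id and the external file name,
    # so the default field and record delimiters are safe here.
    print {$idx} "$hit,$lob_file\n";
}
close $idx or die "Cannot close $index: $!";
```

The control file would then declare the file-name field as a FILLER and
load the CLOB column through a LOBFILE clause; please check the
SQL*Loader documentation for the exact syntax in your Oracle version.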

Jonathan


From bix at sendu.me.uk  Fri Oct 13 04:56:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 13 Oct 2006 09:56:01 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au>
References: <4521E74E.1040404@infotech.monash.edu.au>
Message-ID: <452F54A1.7010908@sendu.me.uk>

Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's 
certainly interface-like, but doesn't follow the normal interface naming 
convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed 
WrapperBaseI? Left alone?


From cjfields at uiuc.edu  Fri Oct 13 08:20:58 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 13 Oct 2006 07:20:58 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <452F54A1.7010908@sendu.me.uk>
References: <4521E74E.1040404@infotech.monash.edu.au>
	<452F54A1.7010908@sendu.me.uk>
Message-ID: <43CC4E80-8F15-4C83-929D-DDC719360C8F@uiuc.edu>

I would say, according to BioPerl convention, it should be renamed  
WrapperBaseI.  It has a few interface-like methods and (importantly)  
lacks a constructor.  Unless someone else out there has other reasoning?

Note that this will require lots of bioperl-run changes as well, at  
least I think it will.

Chris

On Oct 13, 2006, at 3:56 AM, Sendu Bala wrote:

> Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's
> certainly interface-like, but doesn't follow the normal interface  
> naming
> convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed
> WrapperBaseI? Left alone?
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From avilella at gmail.com  Fri Oct 13 11:26:47 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 13 Oct 2006 16:26:47 +0100
Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method
Message-ID: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com>

Hi all,

While using the remove_gaps method in Bio::SimpleAlign I found that
if the alignment is bad enough to have no gap-free columns at all,
the method gives a

Use of uninitialized value in split

warning at this line in add_seq:

map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq);

So my idea was to tweak this line to something like:

map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || '');

But I am unsure about any other side effects this may have.
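
For what it's worth, here is a minimal standalone sketch of the guard,
outside of BioPerl. The collect_symbols function and its hash are
hypothetical stand-ins for the '_symbols' bookkeeping in add_seq, not
the real SimpleAlign code. It tests definedness explicitly instead of
using || '', because || would also replace a defined but false string
such as "0":

```perl
use strict;
use warnings;

# Hypothetical stand-in for the '_symbols' update in add_seq.
sub collect_symbols {
    my ($seq_string) = @_;
    my %symbols;

    # Guard against undef so split() never sees an uninitialized value.
    $seq_string = '' unless defined $seq_string;

    $symbols{$_} = 1 for split //, $seq_string;
    return \%symbols;
}

my $normal = collect_symbols('AC-GT');
my $empty  = collect_symbols(undef);    # no warning is emitted

print join(',', sort keys %$normal), "\n";    # -,A,C,G,T
print scalar(keys %$empty), "\n";             # 0
```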

Anyone?

    Albert.


From cjfields at uiuc.edu  Fri Oct 13 11:51:38 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 13 Oct 2006 10:51:38 -0500
Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method
In-Reply-To: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com>
References: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com>
Message-ID: <EE9FE57F-EE17-44FE-B298-CD4084675085@uiuc.edu>

You can check to see if it passes all tests.  I'm guessing  
SimpleAlign.t tests this method out in some way (though it's always  
safer to check).

Chris

On Oct 13, 2006, at 10:26 AM, Albert Vilella wrote:

> Hi all,
>
> While using the remove_gaps method in Bio::SimpleAlign I found out
> that if the alignment is (bad enough for) having no columns without
> any gap at all, the method will give a:
>
> Use of uninitialized value in split at this line in add_seq:
>
> map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq);
>
> So my idea was to tweak this line to something like:
>
> map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || '');
>
> But I am unsure about any other side effects this may have.
>
> Anyone?
>
>     Albert.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jay at jays.net  Fri Oct 13 12:09:16 2006
From: jay at jays.net (Jay Hannah)
Date: Fri, 13 Oct 2006 11:09:16 -0500
Subject: [Bioperl-l] Documentation typo:
	Bio::Search::Result::GenericResult
In-Reply-To: <C1519A11.ABD1%bosborne11@verizon.net>
References: <C1519A11.ABD1%bosborne11@verizon.net>
Message-ID: <452FBA2C.7070003@jays.net>

Thanks Brian! 

My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :)

/home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v
----------------------------
revision 1.27
date: 2006/10/10 22:41:46;  author: bosborne;  state: Exp;  lines: +4 -4
next_hit, not next_hits
----------------------------

I'm a simple man who takes great satisfaction in the simple things. :)

j


Brian Osborne wrote:
> j,
> 
> No need, not for something so simple.
> 
> Brian O.
> 
> 
> On 10/7/06 6:34 PM, "Jay Hannah" <jay at jays.net> wrote:
>> Except that "next_hits()" does not exist. Should be "next_hit()".
>>
>> (Should I have posted a patch instead?)
> 


From jay at jays.net  Fri Oct 13 12:24:48 2006
From: jay at jays.net (Jay Hannah)
Date: Fri, 13 Oct 2006 11:24:48 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
Message-ID: <452FBDD0.2070008@jays.net>

So I'm doing the following:

1) Using Bio::SeqIO to read in a genbank file and kick out fasta.
2) Reading that fasta file w/ command line formatdb.
3) Using that output for command line blastall.
4) Using Bio::SearchIO to read the blast results.

(If there's a better way, do tell. -grin-)

This sequence is working great for nucleotide BLASTing, but I'm stuck on step 1 when trying protein BLAST. 

my $seq_in  = Bio::SeqIO->new(
   -file => "<Organism1.genbank", 
   -format => "genbank", 
   -alphabet => "protein"
);
my $seq_out_protein = Bio::SeqIO->new(
   -file => ">out",
   -format => 'fasta',
   -alphabet => 'protein'
);
while (my $inseq = $seq_in->next_seq) {
   $inseq->molecule("protein");
   $seq_out_protein->write_seq($inseq);
}

This creates a nucleotide file "out". Setting -alphabet doesn't seem to do anything. Setting molecule("protein") doesn't seem to do anything either.

I was expecting that it would just pull all the CDS strings out of the genbank file and dump those into fasta format?

Am I missing something obvious?

Thanks,

j


From bosborne11 at verizon.net  Fri Oct 13 12:54:02 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 13 Oct 2006 12:54:02 -0400
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <452FBDD0.2070008@jays.net>
Message-ID: <C1553CEA.AC2E%bosborne11@verizon.net>

Jay,

You're looking for the "translation" string in the CDS section, yes? You
need to delve a bit into features. The CDS is considered to be a feature of
the main or parent nucleotide sequence, and the translation is part of the
CDS feature:

http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank


Brian O.


On 10/13/06 12:24 PM, "Jay Hannah" <jay at jays.net> wrote:

> Am I missing something 


From bix at sendu.me.uk  Fri Oct 13 12:59:46 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 13 Oct 2006 17:59:46 +0100
Subject: [Bioperl-l] Documentation
	typo:	Bio::Search::Result::GenericResult
In-Reply-To: <452FBA2C.7070003@jays.net>
References: <C1519A11.ABD1%bosborne11@verizon.net> <452FBA2C.7070003@jays.net>
Message-ID: <452FC602.3080302@sendu.me.uk>

Jay Hannah wrote:
> Thanks Brian! 
> 
> My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :)
> 
> /home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v
> ----------------------------
> revision 1.27
> date: 2006/10/10 22:41:46;  author: bosborne;  state: Exp;  lines: +4 -4
> next_hit, not next_hits
> ----------------------------

Congratulations! :D

Next it will be two byte corrections and from there, the sky's the limit! :)


From hlapp at gmx.net  Fri Oct 13 13:28:50 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 13 Oct 2006 13:28:50 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <452F54A1.7010908@sendu.me.uk>
References: <4521E74E.1040404@infotech.monash.edu.au>
	<452F54A1.7010908@sendu.me.uk>
Message-ID: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>

What does the POD (and the code) say about instantiating it?

	-hilmar

On Oct 13, 2006, at 4:56 AM, Sendu Bala wrote:

> Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's
> certainly interface-like, but doesn't follow the normal interface  
> naming
> convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed
> WrapperBaseI? Left alone?

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jay at jays.net  Fri Oct 13 14:56:38 2006
From: jay at jays.net (Jay Hannah)
Date: Fri, 13 Oct 2006 13:56:38 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <C1553CEA.AC2E%bosborne11@verizon.net>
References: <C1553CEA.AC2E%bosborne11@verizon.net>
Message-ID: <452FE166.5080405@jays.net>

Brian Osborne wrote:
> You're looking for the "translation" string in the CDS section, yes? You
> need to delve a bit into features, the CDS is considered to be a feature of
> the main or parent nucleotide sequence and the translation is part of CDS
> feature:
> 
> http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank

Yes. Thanks. I "rolled my own" -- I'm now doing this:

while (my $inseq = $seq_in->next_seq) {
   my @features = $inseq->get_SeqFeatures();
   foreach my $feat ( @features ) {
      next unless ($feat->primary_tag eq "CDS");
      my @db_xrefs = $feat->annotation->get_Annotations("db_xref");
      @db_xrefs = grep { /^GI:/ } @db_xrefs;
      die "Panic! More than one GI: db_xref?"     if (@db_xrefs > 1);
      die "Panic! No GI: db_xref?"            unless (@db_xrefs == 1);
      my $gi = $db_xrefs[0];
      $gi =~ s/^GI://;
      my @translations = $feat->annotation->get_Annotations("translation");
      die "Panic! More than one translation?" if (@translations > 1);
      my @protein_ids = $feat->annotation->get_Annotations("protein_id");
      die "Panic! More than one protein_id?"  if (@protein_ids > 1);
      my @product = $feat->annotation->get_Annotations("product");
      die "Panic! More than one product?"  if (@product > 1);
      print ">gi|$gi|gb|$protein_ids[0]|";
      print $inseq->id . " $product[0]\n";
      print "$translations[0]\n";
   }
}

To generate a homebrew fasta file for a protein BLAST.

I just thought that -alphabet and molecule() would do that stuff for me? What else would "protein" mean in those? Does anyone use -alphabet and/or molecule()? For what? How? Again, here's what I'm talking about:

==========
my $seq_out_protein = Bio::SeqIO->new(
   -file => ">out",
   -format => 'fasta',
   -alphabet => 'protein'    # No effect?
);
while (my $inseq = $seq_in->next_seq) {
   $inseq->molecule("protein");    # No effect?
==========

Thanks,

j


From bosborne11 at verizon.net  Fri Oct 13 17:20:40 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 13 Oct 2006 17:20:40 -0400
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <452FE166.5080405@jays.net>
Message-ID: <C1557B68.AC3E%bosborne11@verizon.net>

Jay,

Yes, people use the -alphabet parameter. If you set it to something then
Bioperl will not try to determine whether the sequence is protein, rna, or
dna and this is particularly useful when the sequence contains characters
that Bioperl would object to (sequences with distasteful characters can be
created by various applications, for example, or you might introduce some
weird character for some reason). Setting the -alphabet would also speed up
Bioperl a bit, for the same reason.
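
A small sketch of what that looks like in practice, assuming a working
Bioperl install; the ids and the sequence string are made up. Note that
-alphabet only labels the sequence, it never changes the residues:

```perl
use strict;
use warnings;
use Bio::Seq;

# No -alphabet: Bioperl guesses the alphabet from the residues.
my $guessed = Bio::Seq->new(-id => 'toy1', -seq => 'ATGAAAGTTTAA');

# Explicit -alphabet: the guess is skipped and the string is untouched.
my $forced = Bio::Seq->new(
    -id       => 'toy2',
    -seq      => 'ATGAAAGTTTAA',
    -alphabet => 'protein',
);

print $guessed->alphabet, "\n";    # dna
print $forced->alphabet,  "\n";    # protein
print $forced->seq,       "\n";    # ATGAAAGTTTAA (not translated)
```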

Brian O.


On 10/13/06 2:56 PM, "Jay Hannah" <jay at jays.net> wrote:

> 
> I just thought that -alphabet and molecule() would do that stuff for me? What
> else would "protein" mean in those? 


From jay at jays.net  Sat Oct 14 11:25:05 2006
From: jay at jays.net (Jay Hannah)
Date: Sat, 14 Oct 2006 10:25:05 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <C1557B68.AC3E%bosborne11@verizon.net>
References: <C1557B68.AC3E%bosborne11@verizon.net>
Message-ID: <45310151.5050901@jays.net>

Brian Osborne wrote:
> Yes, people use the -alphabet parameter. If you set it to something then
> Bioperl will not try to determine whether the sequence is protein, rna, or
> dna and this is particularly useful when the sequence contains characters
> that Bioperl would object to (sequences with distasteful characters can be
> created by various applications, for example, or you might introduce some
> weird character for some reason). Setting the -alphabet would also speed up
> Bioperl a bit, for the same reason.

Huh. That's what I assumed when I stumbled into the -alphabet parameter. So I thought this would read the protein sequences out of my genbank file and write a fasta file for me:

my $seq_in  = Bio::SeqIO->new(
   -file     => "<$file",  
   -format   => "genbank",
   -alphabet => "protein"  # No effect?
);
my $seq_out = Bio::SeqIO->new(
   -file     => ">$outfile",
   -format   => "fasta",
   -alphabet => "protein"  # No effect?
);
while (my $inseq = $seq_in->next_seq) {
   $inseq->molecule("protein");    # No effect?
   $seq_out->write_seq($inseq);
}

It didn't. Would it be a Good Thing if it did what I was expecting? (Like I said I rolled my own, but I'm always looking for ways to enhance BioPerl that other people might find useful... Someday I will contribute something useful, by golly. -grin-)

(Background: I'm doing protein BLASTs from genbank files. To make formatdb happy I have to have fasta files full of the protein sequences.)

j


From bosborne11 at verizon.net  Sat Oct 14 14:40:21 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Sat, 14 Oct 2006 14:40:21 -0400
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <45310151.5050901@jays.net>
Message-ID: <C156A755.AC52%bosborne11@verizon.net>

Jay,

What you expected was that setting the -alphabet to "protein" would make
Bioperl translate the input nucleotide sequence to output protein. In
Bioperl this is accomplished by using the translate() method, no surprise
there. If you take a look at the documentation on translate() in the online
Bioperl Tutorial you'll see that this is a fairly sophisticated method, you
can do all sorts of different things with it. So using -alphabet for this
purpose won't really work, there are too many different ways to translate.

Brian O.


On 10/14/06 11:25 AM, "Jay Hannah" <jay at jays.net> wrote:

> Would it be a Good Thing if it did what I was expecting?


From cjfields at uiuc.edu  Sat Oct 14 20:44:04 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 14 Oct 2006 19:44:04 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <45310151.5050901@jays.net>
Message-ID: <000601c6eff3$084663c0$15327e82@pyrimidine>

...
> Huh. That's what I assumed when I stumbled into the -alphabet parameter.
> So I thought this would read the protein sequences out of my genbank file
> and write a fasta file for me:

You have to think about it this way: the GenBank record you are using is for
the nucleotide sequence only, and all other information in that record
describes the sequence.  Similarly, if you used a 'GenPept' sequence, the
focus would be the protein sequence.  Both normally contain annotations
which describe the sequence globally, such as references, organism info,
etc.  Both also may contain features (or SeqFeatures), which describe a
feature bound to a particular location on the sequence.  However, features
are not an absolute requirement for a sequence; they're sort of 'window
dressing', albeit almost always essential for describing the main sequence.

I would do exactly as Brian suggests.  See the Feature/Annotation HOWTO for
ideas on how to screen out the particular features you want and either grab
the 'translation' tag data or get the sequence object from the feature and
translate it directly.  You should get the same result either way though
getting the tag may be faster.
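
Roughly, the two routes look like this. It is a sketch with a toy record
built in memory (the ids, the sequence and the protein_id are made up);
with a real file, $seq would come from next_seq on a Bio::SeqIO stream.
It uses the tag methods on the feature (has_tag, get_tag_values), so it
assumes a Bioperl version where feature qualifiers are exposed as tags:

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;

# Toy nucleotide record with a single CDS feature.
my $seq = Bio::Seq->new(
    -id       => 'toy_record',
    -seq      => 'ATGAAAGTTTAA',
    -alphabet => 'dna',
);
my $cds = Bio::SeqFeature::Generic->new(
    -primary_tag => 'CDS',
    -start       => 1,
    -end         => 12,
    -strand      => 1,
    -tag         => { translation => 'MKV', protein_id => 'TOY00001.1' },
);
$seq->add_SeqFeature($cds);

my @fasta;
foreach my $feat ($seq->get_SeqFeatures) {
    next unless $feat->primary_tag eq 'CDS';
    next if $feat->has_tag('pseudo');    # optionally skip pseudogenes

    # Route 1: take the translation stored in the record, if present.
    my ($from_tag) = $feat->has_tag('translation')
        ? $feat->get_tag_values('translation')
        : ();

    # Route 2: translate the feature's own nucleotide sequence.
    # spliced_seq() also joins the exons of split locations.
    my $from_dna = $feat->spliced_seq->translate->seq;
    $from_dna =~ s/\*$//;                # drop the terminal stop symbol

    my ($pid)   = $feat->get_tag_values('protein_id');
    my $protein = defined $from_tag ? $from_tag : $from_dna;
    push @fasta, ">$pid\n$protein\n";
}
print @fasta;
```

Note that translate() keeps the stop codon as a trailing '*' by default,
which is why the sketch strips it before comparing or printing.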

...

> It didn't. Would it be a Good Thing if it did what I was expecting? (Like
> I said I rolled my own, but I'm always looking for ways to enhance BioPerl
> that other people might find useful... Someday I will contribute something
> useful, by golly. -grin-)
> 
> (Background: I'm doing protein BLASTs from genbank files. To make formatdb
> happy I have to have fasta files full of the protein sequences.)
> 
> j

You could, theoretically, write a method that retrieves only the features
corresponding to coding regions (CDS).  You may want to optionally screen
out pseudogenes, but that's up to you.

Chris


From avilella at gmail.com  Sun Oct 15 07:08:23 2006
From: avilella at gmail.com (Albert Vilella)
Date: Sun, 15 Oct 2006 12:08:23 +0100
Subject: [Bioperl-l] no_residues test in SimpleAlign.t
Message-ID: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com>

Hi all,

Can somebody check the SimpleAlign.t test?

perl t/SimpleAlign.t

I get a few errors; I am looking at one that deals with no_residues. I
don't understand if this is supposed to work:

sub no_residues {
    my $self = shift;
    my $count = 0;

    foreach my $seq ($self->each_seq) {
        my $str = $seq->seq();

        $count += ($str =~ s/[^A-Za-z]//g);
        #is this the same as:
        # $str =~ s/[^A-Za-z]//g;
        # $count += length($str);
    }

    return $count;
}

Cheers,

    Albert.


From cjfields at uiuc.edu  Sun Oct 15 13:53:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 15 Oct 2006 12:53:50 -0500
Subject: [Bioperl-l] no_residues test in SimpleAlign.t
In-Reply-To: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com>
References: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com>
Message-ID: <FE798536-21DA-4377-96E2-0BF98C235970@uiuc.edu>

Albert,

I get all 75 tests passing.  SimpleAlign.t was recently switched over  
to Test::More, so you should be seeing more explicit test  
descriptions.  It looks like test 27 is no_residues().  Were there  
any more that failed?

I usually run 'perl -I. t/test.t' from the main bioperl directory to  
check individual tests from the local directory.  Otherwise you are  
checking your installed version which may be older (and may not match  
tests and recent bug fixes).  Could that be the problem?
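
As for the question in the code comment: for the code as quoted, no, the
two forms are not the same. s///g returns the number of substitutions it
made, so the first form adds up the characters that were removed (the
gaps), while the second adds up what is left (the residues). A quick
standalone check, using a made-up aligned string:

```perl
use strict;
use warnings;

my $aligned = 'AC--GT-A';    # hypothetical aligned sequence string

# Form 1: the return value of s///g is the number of substitutions,
# i.e. how many non-letter characters were deleted.
my $str     = $aligned;
my $removed = ($str =~ s/[^A-Za-z]//g);

# Form 2: the length of what remains after the deletion.
my $residues = length $str;

# tr/// counts letters without modifying the string at all.
my $residues_tr = ($aligned =~ tr/A-Za-z//);

print "removed=$removed residues=$residues tr=$residues_tr\n";
# removed=3 residues=5 tr=5
```

If the installed copy of the method really uses the first form with the
negated character class, it would be counting gaps rather than residues.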

Chris

On Oct 15, 2006, at 6:08 AM, Albert Vilella wrote:

> Hi all,
>
> Can somebody check the SimpleAlign.t test?
>
> perl t/SimpleAlign.t
>
> I get a few errors; I am looking at one that deals with no_residues. I
> don't understand if this is supposed to work:
>
> sub no_residues {
>     my $self = shift;
>     my $count = 0;
>
>     foreach my $seq ($self->each_seq) {
>         my $str = $seq->seq();
>
>         $count += ($str =~ s/[^A-Za-z]//g);
>         #is this the same as:
>         # $str =~ s/[^A-Za-z]//g;
>         # $count += length($str);
>     }
>
>     return $count;
> }
>
> Cheers,
>
>     Albert.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Mon Oct 16 04:08:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 09:08:34 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>
References: <4521E74E.1040404@infotech.monash.edu.au>	<452F54A1.7010908@sendu.me.uk>
	<5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>
Message-ID: <45333E02.9070808@sendu.me.uk>

Hilmar Lapp wrote:
> What does the POD (and the code) say about instantiating it?

=head1 SYNOPSIS

   # do not use this object directly, it provides the following methods
   # for its subclasses

...


=head1 DESCRIPTION

This is a basic module from which to build executable wrapper modules.
It has some basic methods to help when implementing new modules.


There is no new() method.


From bix at sendu.me.uk  Mon Oct 16 09:23:41 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 14:23:41 +0100
Subject: [Bioperl-l] Bio::WebAgent sleep warning
Message-ID: <453387DD.3040105@sendu.me.uk>

Hi,

Does anyone think it's appropriate for Bio::WebAgent to issue warnings 
every time it sleeps? I'd consider the sleeping part of its normal, 
expected and desired behaviour so I don't need to be warned about it. 
Perhaps change the $self->warn to a $self->debug?


From cjfields at uiuc.edu  Mon Oct 16 10:12:10 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 09:12:10 -0500
Subject: [Bioperl-l] Bio::WebAgent sleep warning
In-Reply-To: <453387DD.3040105@sendu.me.uk>
Message-ID: <000c01c6f12d$121b5000$15327e82@pyrimidine>

> Hi,
> 
> Does anyone think it's appropriate for Bio::WebAgent to issue warnings
> every time it sleeps? I'd consider the sleeping part of its normal,
> expected and desired behaviour so I don't need to be warned about it.
> Perhaps change the $self->warn to a $self->debug?

That sounds fine.  Using debugging output for sleep would be similar
behavior to Bio::DB::NCBIHelper and Bio::DB::GenericWebDBI.  You may want to
pass it by Heikki (I think that's his module).  

The only reason I would want to see sleep output, personally, is to make
sure it is working properly.

Almost looks like that class has the same intent that GenericWebDBI has
(even down to using LWP::UserAgent as a superclass).  I may look into it to
see if I can use this as a superclass for GenericWebDBI.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Mon Oct 16 10:26:21 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Mon, 16 Oct 2006 15:26:21 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
Message-ID: <4533968D.6040009@sheffield.ac.uk>

Did anyone reconfigure the bioperl web server (whichever server hosts
http://bioperl.org/DIST) by adding the following lines to the httpd.conf
file:

RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*) http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1

This will be required as a workaround to a bug in ActivePerl 5.8.8.819
which will result in a failed install of Bioperl via PPM.

Cheers
Nath


From n.haigh at sheffield.ac.uk  Mon Oct 16 11:30:16 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Mon, 16 Oct 2006 16:30:16 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A257.2000207@campus.iztacala.unam.mx>
References: <4533968D.6040009@sheffield.ac.uk>
	<4533A257.2000207@campus.iztacala.unam.mx>
Message-ID: <4533A588.9020505@sheffield.ac.uk>

Mauricio Herrera Cuadra wrote:
> Done. Could you please check if it works as it should?
>
> Cheers,
> Mauricio.
Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
someone to pop it in http://bioperl.org/DIST

Volunteers?

BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
the PPD? I seem to remember that there was talk about having to maintain
a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
this front?

Nath


From arareko at campus.iztacala.unam.mx  Mon Oct 16 11:16:39 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 16 Oct 2006 10:16:39 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533968D.6040009@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>
Message-ID: <4533A257.2000207@campus.iztacala.unam.mx>

Done. Could you please check if it works as it should?

Cheers,
Mauricio.

Nathan Haigh wrote:
> Did anyone reconfigure the bioperl web server (which ever server hosts
> http://bioperl.org/DIST) by adding the following lines to the httpd.conf
> file:
> 
> RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*)
> http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1
> 
> This will be required as a workaround to a bug in ActivePerl 5.8.8.819
> which will result in a failed install of Bioperl via PPM.
> 
> Cheers
> Nath

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From arareko at campus.iztacala.unam.mx  Mon Oct 16 11:33:33 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 16 Oct 2006 10:33:33 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A588.9020505@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>
	<4533A257.2000207@campus.iztacala.unam.mx>
	<4533A588.9020505@sheffield.ac.uk>
Message-ID: <4533A64D.6040203@campus.iztacala.unam.mx>

Nathan Haigh wrote:
> Mauricio Herrera Cuadra wrote:
>> Done. Could you please check if it works as it should?
>>
>> Cheers,
>> Mauricio.
> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
> someone to pop it in http://bioperl/DIST
> 
> Volunteers?

You can send it to me.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From akarger at CGR.Harvard.edu  Mon Oct 16 11:54:33 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 16 Oct 2006 11:54:33 -0400
Subject: [Bioperl-l] Bio::Location::Split
Message-ID: <B9182BFF5B004245BABC12956EA6322E016B1ED7@huls5.nucleus.harvard.edu>

I recently came across bug 2101, where Bio::Location::Split::to_FTstring
gives the incorrect order for multi-sublocation locations on the minus
strand. That is, I found it by getting incorrect results, and then found
it in Bugzilla and in the September archives.

I'm converting CDS files from one format to another. E.g., I read an
EMBL file with a chromosome and CDS features, and want to output the
location in a FASTA header. If I do something like:

# assuming $in is a Bio::SeqIO stream opened on the EMBL file
while (my $seq = $in->next_seq) {
    foreach my $feat ($seq->get_SeqFeatures) {
        print $feat->location->to_FTstring(), "\n";
    }
}

I get the wrong results for multi-exon CDSs on the -1 strand, as
described in the bug report.

Is there a relatively easy way around this? I assume I can't get at the
original string of the location, which in this case is all I need. Can I
just flip the order of the exons in certain cases? Chris F, can you tell
me the preliminary solution you mentioned?

I must say I'm sort of surprised this wasn't found before. It seems like
a not-that-rare occurrence. Oh well.

Thanks,

- Amir Karger
Research Computing
Life Sciences Division
Harvard University


From bix at sendu.me.uk  Mon Oct 16 12:14:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 17:14:39 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A588.9020505@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>	<4533A257.2000207@campus.iztacala.unam.mx>
	<4533A588.9020505@sheffield.ac.uk>
Message-ID: <4533AFEF.8080103@sendu.me.uk>

Nathan Haigh wrote:
> Mauricio Herrera Cuadra wrote:
>> Done. Could you please check if it works as it should?
>>
>> Cheers,
>> Mauricio.
> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
> someone to pop it in http://bioperl/DIST
> 
> Volunteers?

I'm sure Mauricio would be happy to do it, but so am I. You may want to 
hold off a little while until I release rc2, which may be a few hours away.


> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
> the PPD? I seem to remember that there was talk about having to maintain
> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
> this front?

It depends on what is in the PPD and what kind of auto-dependency 
features the ActiveState installer has. Given Perl 5.8 and your current 
PPD, does Bioperl install with the same number of skips or fewer if you 
also install Bundle::BioPerl first? That is, does Bundle::BioPerl even 
do anything useful anymore? If not, obviously don't bother making it a 
pre-req. If it does, my opinion is that you make it a pre-req. If people 
really don't want to install the optional stuff they can download the 
.zip file and install manually without even a make.


From Kevin.M.Brown at asu.edu  Mon Oct 16 12:14:51 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 16 Oct 2006 09:14:51 -0700
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
Message-ID: <1A4207F8295607498283FE9E93B775B402196FAA@EX02.asurite.ad.asu.edu>

> > Yes, people use the -alphabet parameter. If you set it to 
> something then
> > Bioperl will not try to determine whether the sequence is 
> protein, rna, or
> > dna and this is particularly useful when the sequence 
> contains characters
> > that Bioperl would object to (sequences with distasteful 
> characters can be
> > created by various applications, for example, or you might 
> introduce some
> > weird character for some reason). Setting the -alphabet 
> would also speed up
> > Bioperl a bit, for the same reason.
> 
> Huh. That's what I assumed when I stumbled into the -alphabet 
> parameter. So I thought this would read the protein sequences 
> out of my genbank file and write a fasta file for me:
> 
> my $seq_in  = Bio::SeqIO->new(
>    -file     => "<$file",  
>    -format   => "genbank",
>    -alphabet => "protein"  # No effect?
> );
> my $seq_out = Bio::SeqIO->new(
>    -file     => ">$outfile",
>    -format   => "fasta",
>    -alphabet => "protein"  # No effect?
> );
> while (my $inseq = $seq_in->next_seq) {
>    $inseq->molecule("protein");    # No effect?
>    $seq_out->write_seq($inseq);
> }
> 
> It didn't. Would it be a Good Thing if it did what I was 
> expecting? (Like I said I rolled my own, but I'm always 
> looking for ways to enhance BioPerl that other people might 
> find useful... Someday I will contribute something useful, by 
> golly. -grin-)
> 
> (Background: I'm doing protein BLASTs from genbank files. To 
> make formatdb happy I have to have fasta files full of the 
> protein sequences.)

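This might work for your needs (CDS to protein FASTA), provided each
record in the file is itself a coding sequence:

use Bio::SeqIO;

my $seq_in  = Bio::SeqIO->new(
   -file     => "<$file",
   -format   => "genbank",
);

open my $seq_out, '>', $outfile or die "Cannot write $outfile: $!";

while (my $inseq = $seq_in->next_seq) {
   print $seq_out ">". $inseq->display_id(). "\n";
   # translate() returns a sequence object, so ask it for the string
   print $seq_out $inseq->translate()->seq() ."\n";
}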
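
If the GenBank records are whole genomes or contigs rather than single
coding sequences, another option is to pull the /translation qualifier off
each CDS feature instead of translating the record. The following is only a
sketch under that assumption: cds_proteins is a made-up helper name, the
choice of protein_id and then locus_tag for the header is arbitrary, and
$file and $outfile are placeholders. It relies only on the feature methods
get_SeqFeatures, primary_tag, has_tag and get_tag_values.

```perl
use strict;
use warnings;

# Collect [id, protein] pairs from the CDS features of one sequence record.
# $seq is expected to behave like a Bio::SeqI object.
sub cds_proteins {
    my ($seq) = @_;
    my @proteins;
    foreach my $feat ($seq->get_SeqFeatures) {
        next unless $feat->primary_tag eq 'CDS';
        # Pseudogenes and the like carry no /translation qualifier
        next unless $feat->has_tag('translation');
        my ($protein) = $feat->get_tag_values('translation');
        # Header id: first of protein_id, locus_tag that is present (an assumption)
        my ($id) = map  { $feat->get_tag_values($_) }
                   grep { $feat->has_tag($_) } qw(protein_id locus_tag);
        push @proteins, [ defined $id ? $id : 'unknown', $protein ];
    }
    return @proteins;
}

# Usage with BioPerl (needs "use Bio::SeqIO;"):
#   my $in = Bio::SeqIO->new(-file => $file, -format => 'genbank');
#   open my $out, '>', $outfile or die "Cannot write $outfile: $!";
#   while (my $seq = $in->next_seq) {
#       print {$out} ">$_->[0]\n$_->[1]\n" for cds_proteins($seq);
#   }
```

The resulting file should be usable as formatdb input, since every entry is
already a protein sequence.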


From bix at sendu.me.uk  Mon Oct 16 11:44:19 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 16:44:19 +0100
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
Message-ID: <4533A8D3.90709@sendu.me.uk>

I think Chris recently deprecated this, but should it be? For me, its 
POD description justifies its existence, and perhaps more importantly, 
Bio::Index::Blast relies on it.

I took a quick peek at the latter and it didn't seem trivial to move it 
over to Bio::SearchIO instead.

Should it be undeprecated?


From n.haigh at sheffield.ac.uk  Mon Oct 16 12:39:02 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Mon, 16 Oct 2006 17:39:02 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533AFEF.8080103@sendu.me.uk>
References: <4533968D.6040009@sheffield.ac.uk>	<4533A257.2000207@campus.iztacala.unam.mx>
	<4533A588.9020505@sheffield.ac.uk> <4533AFEF.8080103@sendu.me.uk>
Message-ID: <4533B5A6.1070701@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> Mauricio Herrera Cuadra wrote:
>>> Done. Could you please check if it works as it should?
>>>
>>> Cheers,
>>> Mauricio.
>> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
>> someone to pop it in http://bioperl/DIST
>>
>> Volunteers?
>
> I'm sure Mauricio would be happy to do it, but so am I. You may want
> to hold off a little while until I release rc2, which may be a few
> hours away.

Just e-mailed Mauricio links to the files off list. It's not a big job
for me to remake the bioperl PPD, so Mauricio, it's up to you if you want
to wait 18hrs for me to make the PPDs for 1.5.2-rc2.
>
>
>> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
>> the PPD? I seem to remember that there was talk about having to maintain
>> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
>> this front?
>
> It depends on what is in the PPD and what kind of auto-dependency
> features the ActiveState installer has. Given Perl 5.8 and your
> current PPD, does Bioperl install with the same or fewer number of
> skips if you also install Bundle::BioPerl first? That is, does
> Bundle::BioPerl even do anything useful anymore? If not, obviously
> don't bother making it a pre-req. If it does, my opinion is that you
> make it a pre-req. If people really don't want to install the optional
> stuff they can download the .zip file and install manually without
> even a make.
As far as the PPDs are concerned, no tests are run during installation.
PPM more or less just copies files into the correct place for Perl to
find, so both approaches result in the same thing. However, I've not
tried making a CPAN distribution file for either Bioperl or
Bundle::BioPerl - I wouldn't know where to start!

Makefile.PL now only documents the prereqs in one place (%packages), and
this is used to add the prereqs to the bioperl PPD when issuing "nmake
ppd". This way, each release of BioPerl should be up to date with its
prereqs as long as developers add their modules' prereqs to %packages. If
we have Bundle::BioPerl, most of those prereqs need to be moved from the
Bioperl PPD to the Bundle::BioPerl PPD - a bit of a pain because there are
no guidelines as to what should/should not go in Bundle::BioPerl.
Therefore, as far as the PPDs are concerned, it is far easier to do away
with Bundle::BioPerl.

Nath


From hlapp at gmx.net  Mon Oct 16 13:04:24 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:04:24 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <45333E02.9070808@sendu.me.uk>
References: <4521E74E.1040404@infotech.monash.edu.au>	<452F54A1.7010908@sendu.me.uk>
	<5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>
	<45333E02.9070808@sendu.me.uk>
Message-ID: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>

So it looks like an abstract base class, not an interface that  
defines a contract or API? Should use Root.pm then, would be my vote.
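
To make the distinction concrete, here is a minimal plain-Perl sketch. All
package names are made up for illustration and none of this is the actual
Bio::Root code: the interface only declares a contract and dies if a method
is not overridden, while the abstract base class has no new() but ships
working helper methods for its subclasses.

```perl
use strict;
use warnings;

# Interface: declares the contract only; every method must be overridden.
package My::RunnableI;       # hypothetical name
sub run {
    my $self = shift;
    die ref($self) . " does not implement run()\n";
}

# Abstract base class: no new(), but real helper methods for subclasses.
package My::WrapperBase;     # hypothetical name
sub program_name {
    my ($self, $name) = @_;
    $self->{program_name} = $name if defined $name;
    return $self->{program_name};
}

# Concrete class: supplies the constructor and fulfils the contract.
package My::Wrapper;         # hypothetical name
our @ISA = ('My::WrapperBase', 'My::RunnableI');
sub new { my ($class) = @_; return bless {}, $class }
sub run { my $self = shift; return 'running ' . $self->program_name }

package main;
my $wrapper = My::Wrapper->new;
$wrapper->program_name('blastall');
print $wrapper->run, "\n";
```

A module of the second kind carries implementation, which is why inheriting
from the implementation class rather than the interface seems the better fit.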

	-hilmar

On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> What does the POD (and the code) say about instantiating it?
>
> =head1 SYNOPSIS
>
>    # do not use this object directly, it provides the following  
> methods
>    # for its subclasses
>
> ...
>
>
> =head1 DESCRIPTION
>
> This is a basic module from which to build executable wrapper modules.
> It has some basic methods to help when implementing new modules.
>
>
> There is no new() method.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Mon Oct 16 13:08:28 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:08:28 -0400
Subject: [Bioperl-l] Bio::WebAgent sleep warning
In-Reply-To: <453387DD.3040105@sendu.me.uk>
References: <453387DD.3040105@sendu.me.uk>
Message-ID: <B6C9C0EE-00DD-4B16-98D9-2BBBE6CD835E@gmx.net>

It depends. What triggers the sleeping? If it's part of every request
that it processes then I'd agree. If it is triggered by a failure that
precedes the next try, then the failure is probably not expected
(though possible), and hence should be reported by warn().

If it is just part of the polling cycle then there should probably be  
a limit up to which the time waited is considered 'normal' and after  
which it is considered 'excessive' and hence should be reported  
through warn().
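
As a rough sketch of that threshold idea (the function name and the
10-second limit are invented for illustration and are not part of
Bio::WebAgent):

```perl
use strict;
use warnings;

# Decide how a wait should be reported: waits within the limit are routine
# (debug level); anything longer is worth a warning.
sub wait_report_level {
    my ($waited, $normal_limit) = @_;
    return $waited > $normal_limit ? 'warn' : 'debug';
}

my $normal_limit = 10;    # seconds; arbitrary value for the example
foreach my $waited (3, 45) {
    print "waited ${waited}s: report via ",
        wait_report_level($waited, $normal_limit), "()\n";
}
```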

My $0.02.

	-hilmar

On Oct 16, 2006, at 9:23 AM, Sendu Bala wrote:

> Hi,
>
> Does anyone think it's appropriate for Bio::WebAgent to issue warnings
> every time it sleeps? I'd consider the sleeping part of its normal,
> expected and desired behaviour so I don't need to be warned about it.
> Perhaps change the $self->warn to a $self->debug?
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Mon Oct 16 13:13:53 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 18:13:53 +0100
Subject: [Bioperl-l] Bio::WebAgent sleep warning
In-Reply-To: <B6C9C0EE-00DD-4B16-98D9-2BBBE6CD835E@gmx.net>
References: <453387DD.3040105@sendu.me.uk>
	<B6C9C0EE-00DD-4B16-98D9-2BBBE6CD835E@gmx.net>
Message-ID: <4533BDD1.8060204@sendu.me.uk>

Hilmar Lapp wrote:
> It depends. What triggers the sleeping? If it's part of every request 
> that it processes then I'd agree. If it is triggered by failure to 
> precede the next try then the failure is probably not expected (though 
> possible), and hence should be reported by warn().
> 
> If it is just part of the polling cycle then there should probably be a 
> limit up to which the time waited is considered 'normal' and after which 
> it is considered 'excessive' and hence should be reported through warn().

=head2 sleep

  Title   : sleep
  Usage   : $self->sleep
  Function: sleep for a number of seconds indicated by the delay policy
  Returns : none
  Args    : none

NOTE: This method keeps track of the last time it was called and only
imposes a sleep if it was called more recently than the delay_policy()
allows.

=cut

It issues a warning every time it actually sleeps. I find it 
inappropriate that a method warns me that it did what I asked it to do.
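
For reference, the behaviour that POD describes can be sketched in a few
lines. This is a stand-in class written for this mail, not the Bio::WebAgent
source; the class name, the pause() method name and the 3-second default are
assumptions. The point is that the routine message is only printed in
verbose mode rather than as a warning.

```perl
use strict;
use warnings;

package Throttle;    # hypothetical stand-in, not Bio::WebAgent itself

sub new {
    my ($class, %args) = @_;
    my $delay = defined $args{delay} ? $args{delay} : 3;
    return bless {
        delay   => $delay,
        last    => undef,
        verbose => $args{verbose} || 0,
    }, $class;
}

# Seconds still owed at time $now; zero if the previous call is old enough.
sub pending_delay {
    my ($self, $now) = @_;
    return 0 unless defined $self->{last};
    my $elapsed = $now - $self->{last};
    return $elapsed >= $self->{delay} ? 0 : $self->{delay} - $elapsed;
}

# Sleep only for the time still owed, then remember when we were called.
sub pause {
    my ($self) = @_;
    my $wait = $self->pending_delay(time);
    if ($wait > 0) {
        # Routine throttling: mention it only in verbose (debug) mode
        print STDERR "sleeping for $wait seconds\n" if $self->{verbose};
        sleep $wait;
    }
    $self->{last} = time;
    return $wait;
}

package main;
my $agent = Throttle->new(delay => 0);
$agent->pause;    # the first call never sleeps
```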


From arareko at campus.iztacala.unam.mx  Mon Oct 16 13:14:06 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 16 Oct 2006 12:14:06 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533B5A6.1070701@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>	<4533A257.2000207@campus.iztacala.unam.mx>	<4533A588.9020505@sheffield.ac.uk>
	<4533AFEF.8080103@sendu.me.uk> <4533B5A6.1070701@sheffield.ac.uk>
Message-ID: <4533BDDE.2040801@campus.iztacala.unam.mx>

Nathan Haigh wrote:
> Sendu Bala wrote:
>> Nathan Haigh wrote:
>>> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
>>> someone to pop it in http://bioperl/DIST
>>>
>>> Volunteers?
>> I'm sure Mauricio would be happy to do it, but so am I. You may want
>> to hold off a little while until I release rc2, which may be a few
>> hours away.
> 
> Just e-mailed Mauricio links to the files off list, It's not a big job
> for me to remake the bioperl PPD, so Mauricio it's up to you if you want
> to wait 18hrs for me to make the PPDs for 1.5.2-rc2.

Too late, I've already placed 1.5.2-rc1 in DIST. hehe :)

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From bix at sendu.me.uk  Mon Oct 16 12:32:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 17:32:11 +0100
Subject: [Bioperl-l] Swissprot problems
Message-ID: <4533B40B.2030908@sendu.me.uk>

t/Biofetch.t and t/DB.t are skipping their swissprot database fetches.
Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for 
maintenance but is now back up. However I'm guessing the databases must 
have changed. I've manually looked for the test case 'YNB3_YEAST' in 
database 'UniProtKB' and it came back with no result, even though I can 
find the test case manually at the expasy website.

Is this an EBI bug or a deliberate change that makes sense to someone?


From m.weimer at dkfz-heidelberg.de  Mon Oct 16 12:43:38 2006
From: m.weimer at dkfz-heidelberg.de (Marc Weimer)
Date: Mon, 16 Oct 2006 18:43:38 +0200
Subject: [Bioperl-l] Bio::DB::SwissProt Problem
Message-ID: <1161017019.5203.6.camel@localhost>

Dear list members,

when running 

######################################################################
#! /usr/bin/perl -w

use strict;
use Bio::DB::SwissProt;

my $db_obj = Bio::DB::SwissProt->new(-verbose => 1);

my $seq_obj = $db_obj->get_Seq_by_acc("O02938");
######################################################################

using Bioperl 1.5.2 I get the following message:

##########################################################################################

request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch
Content-Length: 49
Content-Type: application/x-www-form-urlencoded

format=swissprot&db=UniProtKB&style=raw&id=O02938


------------- EXCEPTION: Bio::Root::Exception -------------
MSG: acc O02938 does not exist
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350
STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181
STACK: ./get.test.pl:8
-----------------------------------------------------------

##########################################################################################

But the accession number does exist. Surprisingly, everything worked
fine a few days ago. Any ideas of what might have happened?

Thanks and best regards,

Marc

 
From hlapp at gmx.net  Mon Oct 16 13:15:50 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:15:50 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533A8D3.90709@sendu.me.uk>
References: <4533A8D3.90709@sendu.me.uk>
Message-ID: <C53CD6F5-6E0D-49B7-B280-A51C3A2B991F@gmx.net>

The problem is that it is not maintained, and there are outstanding bug
reports.

If you un-deprecate it, then we need a response for people who come
across problems when using it. One option is to change the POD to say
exactly who should use it and when (or rather when not), and to point
out that it is unsupported in all other cases.

Or what would you suggest?

	-hilmar

On Oct 16, 2006, at 11:44 AM, Sendu Bala wrote:

> I think Chris recently deprecated this, but should it be? For me, its
> POD description justifies its existence, and perhaps more importantly,
> Bio::Index::Blast relies on it.
>
> I took a quick peek at the latter and it didn't seem trivial to  
> move it
> over to Bio::SearchIO instead.
>
> Should it be undeprecated?

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Oct 16 13:21:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:21:46 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533A8D3.90709@sendu.me.uk>
Message-ID: <000001c6f147$8efdfd60$15327e82@pyrimidine>

Bio::Tools::BPlite was placed on the deprecation list a while back (~ rel
1.5); the other related Bio::Tools::BP* modules were supposed to be on
that list as well.

If we want to undeprecate (de-deprecate? reprecate?) BPlite we would also
need to do the same for the others.  They must be updated to parse current
BLAST/PSI-BLAST/bl2seq text output, something that Bio::SearchIO::blast is
currently capable of (so the functionality is redundant).  And someone needs
to take them over.

In my opinion it may be more trouble than it's worth, as they haven't been
touched in a while.  If we 'revive' BPlite we're not really moving forward,
especially since you have recently added the PullParser and made
substantial improvements to SearchIO.

Maybe Bio::Index::Blast just needs to be deprecated or rewritten to use
SearchIO?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Monday, October 16, 2006 10:44 AM
> To: bioperl-l
> Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
> 
> I think Chris recently deprecated this, but should it be? For me, its
> POD description justifies its existence, and perhaps more importantly,
> Bio::Index::Blast relies on it.
> 
> I took a quick peek at the latter and it didn't seem trivial to move it
> over to Bio::SearchIO instead.
> 
> Should it be undeprecated?


From bix at sendu.me.uk  Mon Oct 16 13:21:58 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 18:21:58 +0100
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C53CD6F5-6E0D-49B7-B280-A51C3A2B991F@gmx.net>
References: <4533A8D3.90709@sendu.me.uk>
	<C53CD6F5-6E0D-49B7-B280-A51C3A2B991F@gmx.net>
Message-ID: <4533BFB6.5070504@sendu.me.uk>

Hilmar Lapp wrote:
> The problem is it is not maintained, and there are outstanding been bug 
> reports.
> 
> If you un-deprecate it, then we need a response to people who come 
> across problems with it when using it. Either you change the POD to say 
> exactly who and when one should use it (or rather not) and point to the 
> fact that it is unsupported for all other cases.
> 
> Or what would you suggest?

I'm not sure.

Does Bio::Index::Blast even work correctly? Does it suffer from whatever 
bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should 
that be deprecated as well?

Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO 
and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't 
seem trivial (or even appropriate).

Ultimately I just wanted to solve the warnings in the test suite. 
Thoughts, Chris?


From cjfields at uiuc.edu  Mon Oct 16 13:30:05 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:30:05 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A588.9020505@sheffield.ac.uk>
Message-ID: <000101c6f148$b8538b20$15327e82@pyrimidine>

> Mauricio Herrera Cuadra wrote:
> > Done. Could you please check if it works as it should?
> >
> > Cheers,
> > Mauricio.
> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
> someone to pop it in http://bioperl/DIST
> 
> Volunteers?
> 
> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
> the PPD? I seem to remember that there was talk about having to maintain
> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
> this front?
> 
> Nath

Nathan,

I think Chris Dagdigian still maintains Bundle::BioPerl on CPAN.  That
version should be the common basis for prereqs for any Bioperl core
installation.

It's relatively easy to add modules to or remove them from Bundle::BioPerl.
Contact Chris D. and let him know if anything needs to be changed.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 16 13:33:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:33:50 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
Message-ID: <000201c6f149$3ed63490$15327e82@pyrimidine>

> So it looks like an abstract base class, not an interface that
> defines a contract or API? Should use Root.pm then, would be my vote.
> 
> 	-hilmar

Makes sense to me.  Maybe another audit is needed to catch similar
instances, or has this been done already?

Chris

> On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote:
> 
> > Hilmar Lapp wrote:
> >> What does the POD (and the code) say about instantiating it?
> >
> > =head1 SYNOPSIS
> >
> >    # do not use this object directly, it provides the following
> > methods
> >    # for its subclasses
> >
> > ...
> >
> >
> > =head1 DESCRIPTION
> >
> > This is a basic module from which to build executable wrapper modules.
> > It has some basic methods to help when implementing new modules.
> >
> >
> > There is no new() method.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct 16 13:57:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:57:35 -0500
Subject: [Bioperl-l] Bio::Location::Split
In-Reply-To: <B9182BFF5B004245BABC12956EA6322E016B1ED7@huls5.nucleus.harvard.edu>
Message-ID: <000301c6f14c$8fb0e060$15327e82@pyrimidine>

> I recently came across bug 2101, where Bio::Location::Split::to_FTstring
> gives the incorrect order for multi-sublocation locations on the minus
> strand. That is, I found it by getting incorrect results, and then found
> it in Bugzilla and in the September archives.
>
> I'm converting CDS files from one format to another. E.g., I read an
> EMBL file with a chromosome and CDS features, and want to output the
> location in a FASTA header. If I do something like:
> 
> foreach (<$in>) {
>     foreach my $feat ($seq->getSeqFeatures) {
>         print $feat->location->to_FTstring()
>     }
> }
> 
> I get the wrong results for multi-exon CDSs on the -1 strand, as
> described in the bug report.
> 
> Is there a relatively easy way around this? I assume I can't get at the
> original string of the location, which in this case is all I need. Can I
> just flip the order of the exons in certain cases? Chris F, can you tell
> me the preliminary solution you mentioned?
> 
> I must say I'm sort of surprised this wasn't found before. It seems like
> a not-that-rare occurrence. Oh well.
> 
> Thanks,
> 
> - Amir Karger
> Research Computing
> Life Sciences Division
> Harvard University

Could you let me know specifically which EMBL file contains the odd
locations?  The bug report uses theoretical locations, not actual ones, so
it would be nice to have a real-world example to test against.  

As for the lack of catching this, the particular types of locations that
cause the issue are quite rare.  Note that there are two bugs for that bug
report.  The first (and more serious) is still unresolved.  The second
(where remote locations are treated differently in Location::Split, which
caused more problems than it was worth) had a fix committed about a month
ago.  

Any fixes I have made for the first bug invariably break several other
methods, which use the current Location::Split object logic for retrieving
sequences, building feature strings, etc.  Since a new RC is imminent and
the bug only affects a small number of locations, I have held off until
after a final release is made (the last thing I want to do is fix something
that breaks ~6-8 other methods), but I'll try looking at it again this week.


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct 16 14:29:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 13:29:02 -0500
Subject: [Bioperl-l] Swissprot problems
In-Reply-To: <4533B40B.2030908@sendu.me.uk>
Message-ID: <000401c6f150$f57dfc30$15327e82@pyrimidine>


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Monday, October 16, 2006 11:32 AM
> To: bioperl-l
> Subject: [Bioperl-l] Swissprot problems
> 
> t/Biofetch.t and t/DB.t are skipping their swissprot database fetches.
> Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for
> maintenance but is now back up. However I'm guessing the databases must
> have changed. I've manually looked for the test case 'YNB3_YEAST' in
> database 'UniProtKB' and it came back with no result, even though I can
> find the test case manually at the expasy website.
> 
> Is this an EBI bug or deliberate change that makes sense to someone?

I can confirm that.  It's not our end, though.  Entering the same data on
the DBFetch web page also gets no data.  I have emailed EBI about the
problem and will let you know if I hear anything; I think it's related to
the maintenance issue.

Notably, nothing on the web page indicates any database name changes yet.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 16 14:29:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 13:29:52 -0500
Subject: [Bioperl-l] Bio::DB::SwissProt Problem
In-Reply-To: <1161017019.5203.6.camel@localhost>
Message-ID: <000501c6f151$12918710$15327e82@pyrimidine>

We think there is a problem on the SwissProt (DBFetch) server.  I have
contacted them about the problem and will post something when I hear
something back.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Marc Weimer
> Sent: Monday, October 16, 2006 11:44 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Bio::DB::SwissProt Problem
> 
> Dear list members,
> 
> when running
> 
> ######################################################################
> #! /usr/bin/perl -w
> 
> use strict;
> use Bio::DB::SwissProt;
> 
> my $db_obj = new Bio::DB::SwissProt(-verbose => 1);
> 
> my $seq_obj = $db_obj->get_Seq_by_acc("O02938");
> ######################################################################
> 
> using Bioperl 1.5.2 I get the following message:
> 
> ##########################################################################
> ################
> 
> request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch
> Content-Length: 49
> Content-Type: application/x-www-form-urlencoded
> 
> format=swissprot&db=UniProtKB&style=raw&id=O02938
> 
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: acc O02938 does not exist
> STACK: Error::throw
> STACK:
> Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350
> STACK:
> Bio::DB::WebDBSeqI::get_Seq_by_acc
> /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181
> STACK: ./get.test.pl:8
> -----------------------------------------------------------
> 
> ##########################################################################
> ################
> 
> But the accession number does exist. Surprisingly, everything worked
> fine a few days ago. Any ideas of what might have happened?
> 
> Thanks and best regards,
> 
> Marc


From cjfields at uiuc.edu  Mon Oct 16 14:39:28 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 13:39:28 -0500
Subject: [Bioperl-l] SwissProt Down
Message-ID: <000601c6f152$6997dbd0$15327e82@pyrimidine>

Looks like the swissprot problem stems from maintenance at EBI.  From the
EBI page http://www.ebi.ac.uk/Information/ (not on the DBFetch page, BTW):

Please Note: Monday October 16th 12:00-15:00 -  Due to general maintenance,
some services from the EBI may be temporarily unavailable. We apologise for
any inconvenience.

At least we know that Test::More skips are working!
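
For anyone who hasn't seen the pattern, a remote test guarded this way looks
roughly like the sketch below. fetch_entry is a stand-in written for this
example (a real test would call the database module and could die on a
network error), and the test descriptions are made up; only Test::More
itself and the BIOPERLDEBUG variable come from our test suite.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical stand-in for a real remote fetch; a real one would die
# when the server cannot be reached.
sub fetch_entry {
    my ($id) = @_;
    return { id => $id };
}

SKIP: {
    # Remote tests are opt-in, so a plain install never depends on the network
    skip('BIOPERLDEBUG is not set; skipping remote database tests', 2)
        unless $ENV{BIOPERLDEBUG};

    my $entry = eval { fetch_entry('YNB3_YEAST') };
    # A server problem skips these tests instead of failing the whole suite
    skip("remote server problem: $@", 2) if $@;

    ok(defined $entry, 'entry was returned');
    is($entry->{id}, 'YNB3_YEAST', 'entry has the requested id');
}
```

Either way the plan of two tests is satisfied, so Test::Harness reports
skips rather than failures when a server is down.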

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From bix at sendu.me.uk  Mon Oct 16 14:51:31 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 19:51:31 +0100
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C15946CA.ACA9%bosborne11@verizon.net>
References: <C15946CA.ACA9%bosborne11@verizon.net>
Message-ID: <4533D4B3.2000809@sendu.me.uk>

Brian Osborne wrote:
> Sendu,
> 
> I just made a commit that makes Bio::Index::Blast use SearchIO instead of
> BPlite.

I was concerned about the whole id_parser thing. Did you determine that 
your change still allows for id_parser to be used and have the intended 
effect, or that id_parser is in some way meaningless and should be 
removed as a method?


From cjfields at uiuc.edu  Mon Oct 16 15:03:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 14:03:33 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533BFB6.5070504@sendu.me.uk>
Message-ID: <000301c6f155$c7029ff0$15327e82@pyrimidine>

> Hilmar Lapp wrote:
> > The problem is it is not maintained, and there are outstanding been bug
> > reports.
> >
> > If you un-deprecate it, then we need a response to people who come
> > across problems with it when using it. Either you change the POD to say
> > exactly who and when one should use it (or rather not) and point to the
> > fact that it is unsupported for all other cases.
> >
> > Or what would you suggest?
> 
> I'm not sure.
> 
> Does Bio::Index::Blast even work correctly? Does it suffer from whatever
> bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should
> that be deprecated as well?
> 
> Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO
> and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't
> seem trivial (or even appropriate).
> 
> Ultimately I just wanted to solve the warnings in the test suite.
> Thoughts, Chris?

My opinion is we either have to completely support BPlite (and the others)
or drop it altogether.  I don't think we can state "use BPlite only with
Bio::Index::Blast, use SearchIO everywhere else."  That's too inconsistent.

It seems simpler to deprecate the various Bio::Tools::BP* classes and either
fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working
on) or deprecate Bio::Index::Blast as well.

The warnings in the test suite belong to BlastIndex.t, correct?  I updated
using Brian's Bio::Index::Blast fix and it passes now w/o warnings.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From akarger at CGR.Harvard.edu  Mon Oct 16 15:00:28 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 16 Oct 2006 15:00:28 -0400
Subject: [Bioperl-l] Bio::Location::Split
Message-ID: <B9182BFF5B004245BABC12956EA6322E016B1F89@huls5.nucleus.harvard.edu>

 
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu] 
> >
> > I'm converting CDS files from one format to another. E.g., I read an
> > EMBL file with a chromosome and CDS features, and want to output the
> > location in a FASTA header.
> >
> > I get the wrong results for multi-exon CDSs on the -1 strand, as
> > described in the bug report.
> > 
>
> Could you let me know specifically which EMBL file contains the odd
> locations?  The bug report uses theoretical locations, not 
> actual ones, so
> it would be nice to have a real-world example to test against. 

I downloaded candida glabrata chromosome B from EBI:
http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948

testportal>perl location.pl new_glabrata_B.embl > bio
testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/' new_glabrata_B.embl > nonbio
testportal>wc bio nonbio
 217  217 4537 bio
 217  217 4549 nonbio
 434  434 9086 total
testportal>diff bio nonbio
4c4
< complement(join(10632..11157,10347..10372))
---
> join(complement(10632..11157),complement(10347..10372))

Just one example here, but see below.
 
> As for the lack of catching this, the particular types of 
> locations that
> cause the issue are quite rare.  

Really? I guess our definitions of rare depend on which sequences we're
working with. I'm doing fungal genomes, and here's a grep for a few
species' entire genomes:

testportal>foreach i ( *.embl )
foreach? echo $i
foreach? grep CDS $i | grep join | grep -c complement
foreach? end
glabrata_orf.embl
29
hansenii_orf.embl
151
lactis_orf.embl
70
lipolytica_orf.embl
337
pombe_orf.embl
1137
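
For anyone without csh to hand, the same count can be sketched in Python.
This is only an illustration: count_complement_joins and the sample lines
are made up for the example, and like the grep pipeline it only counts
single lines that mention CDS, join and complement together, so a location
wrapped over several FT lines may be missed or miscounted.

```python
def count_complement_joins(lines):
    """Count lines containing CDS, join and complement (same test as the greps)."""
    return sum(
        1
        for line in lines
        if "CDS" in line and "join" in line and "complement" in line
    )


# Hypothetical feature-table lines, for illustration only.
sample = [
    "FT   CDS             join(complement(10632..11157),complement(10347..10372))",
    "FT   CDS             complement(join(2691..4571,4918..5163))",
    "FT   CDS             1..300",
    "FT   gene            complement(10347..11157)",
]

if __name__ == "__main__":
    print(count_complement_joins(sample))
```

To run it over a real file, pass an open file handle instead of the sample
list, e.g. count_complement_joins(open("pombe_orf.embl")).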

You might like to use pombe as a test case, as it has lots of these
complement joins, including ones with multiple introns.

Anyway, I'd question the "rare" designation. It seems to me like any
species that has introns will have situations like this in their CDSs.
Not to mention any other sequence that uses Bio::Location::Split. (Since
I'm not a Real Biologist, I can't think up more examples here, but I'm
sure they exist.)

Or are you saying it's rare to use join (complement(C..D),
complement(A..B)) instead of complement(join(A..B, C..D)). In that case,
I guess I just got really unlucky in that five fungal genomes I was
using decided to use the "rare" syntax. 

> Note that there are two bugs 
> for that bug
> report.  The first (and more serious) is still unresolved.  The second
> (where remote locations are treated differently in 
> Location::Split, which
> caused more problems than it was worth) had a fix committed 
> about a month
> ago.  

Sadly, it's the first (and in my case, more common (I have no remote
locations.)) bug for me.

> Any fixes I have made for the first bug invariably break several other
> methods, which use the current Location::Split object logic 
> for retrieving
> sequences, building feature strings, etc.  Since a new RC is 
> imminent and
> the bug only affects a small number of locations, I have held 
> off until
> after a final release is made (the last thing I want to do is 
> fix something
> that breaks ~6-8 other methods), but I'll try looking at it 
> again this week.

IMO this is a pretty serious bug (if these kinds of sequences aren't
that rare as I've shown above), because you're outputting sequence
descriptions that are just plain wrong. Anyone who uses
FTLocationFactory to read these output description will have incorrect
sequence, incorrect translated proteins, etc. And it's even more serious
if other methods are depending on it.

I know I can't dictate your time, and should be volunteering to work on
fixing it. But if it affects other modules, then I will no doubt break
things even more than you have in your attempts.  

-Amir

> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 


From bosborne11 at verizon.net  Mon Oct 16 14:25:14 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 14:25:14 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533A8D3.90709@sendu.me.uk>
Message-ID: <C15946CA.ACA9%bosborne11@verizon.net>

Sendu,

I just made a commit that makes Bio::Index::Blast use SearchIO instead of
BPlite. The BlastIndex.t test is giving a few warnings so I need to take a
look at that but all tests are passing.

An awful lot of work has gone into the SearchIO system; for more on why its
approach is deemed to be superior in the context of Bioperl see the SearchIO
HOWTO. One key feature of this upcoming release is an emphasis on removing
extraneous modules, and I think it's safe to say that BPlite has been
considered extraneous for a number of years now.

Brian O.


On 10/16/06 11:44 AM, "Sendu Bala" <bix at sendu.me.uk> wrote:

> I think Chris recently deprecated this, but should it be? For me, its
> POD description justifies its existence, and perhaps more importantly,
> Bio::Index::Blast relies on it.
> 
> I took a quick peek at the latter and it didn't seem trivial to move it
> over to Bio::SearchIO instead.
> 
> Should it be undeprecated?
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From bosborne11 at verizon.net  Mon Oct 16 14:59:38 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 14:59:38 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533D4B3.2000809@sendu.me.uk>
Message-ID: <C1594EDA.ACB9%bosborne11@verizon.net>

Sendu,

OK. I _think_ this change shouldn't affect id_parser() but I will test this
in BlastIndex.t. The id_parser() method is relevant to all these Index*
modules - don't know how much it's used but it certainly is nice to have it
available.

Brian O.


On 10/16/06 2:51 PM, "Sendu Bala" <bix at sendu.me.uk> wrote:

> Brian Osborne wrote:
>> Sendu,
>> 
>> I just made a commit that makes Bio::Index::Blast use SearchIO instead of
>> BPlite.
> 
> I was concerned about the whole id_parser thing. Did you determine that
> your change still allows for id_parser to be used and have the intended
> effect, or that id_parser is in some way meaningless and should be
> removed as a method?


From cjfields at uiuc.edu  Mon Oct 16 16:51:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 15:51:08 -0500
Subject: [Bioperl-l] Bio::Location::Split
In-Reply-To: <B9182BFF5B004245BABC12956EA6322E016B1F89@huls5.nucleus.harvard.edu>
Message-ID: <000001c6f164$d1380190$15327e82@pyrimidine>

...
> I downloaded candida glabrata chromosome B from EBI:
> http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948
> 
> testportal>perl location.pl new_glabrata_B.embl > bio
> testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/' new_glabrata_B.embl > nonbio
> testportal>wc bio nonbio
>  217  217 4537 bio
>  217  217 4549 nonbio
>  434  434 9086 total
> testportal>diff bio nonbio
> 4c4
> < complement(join(10632..11157,10347..10372))
> ---
> > join(complement(10632..11157),complement(10347..10372))
> 
> Just one example here, but see below.
> 
> > As for the lack of catching this, the particular types of
> > locations that
> > cause the issue are quite rare.
> 
> Really? I guess our definitions of rare depend on which sequences we're
> working with. I'm doing fungal genomes, and here's a grep for a few
> species' entire genomes:
> 
> testportal>foreach i ( *.embl )
> foreach? echo $i
> foreach? grep CDS $i | grep join | grep -c complement
> foreach? end
> glabrata_orf.embl
> 29
> hansenii_orf.embl
> 151
> lactis_orf.embl
> 70
> lipolytica_orf.embl
> 337
> pombe_orf.embl
> 1137
> 
> You might like to use pombe as a test case, as it has lots of these
> complement joins, including ones with multiple introns.

I'll use those.  I'll see if an analogous GenBank file exists as well.  

I can probably make a preliminary fix for FT_string() so that it arranges
the sublocations correctly, but I think the best way to go is to have
FTLocationFactory not modify the various sublocations to start with, which
it currently does when it sets strand() (strand() propagates the strand info
to sublocations). 

> Anyway, I'd question the "rare" designation. It seems to me like any
> species that has introns will have situations like this in their CDSs.
> Not to mention any other sequence that uses Bio::Location::Split. (Since
> I'm not a Real Biologist, I can't think up more examples here, but I'm
> sure they exist.)

I think that additional tests are definitely needed for pulling out
sequences.  

What I mean by 'rare' is that the majority of sequences do not have
problems.  Also, this seems to be a 'silent' bug since the error shows up in
to_FTstring() but the object sublocations seem to be processed correctly when
using the location object directly (such as via SeqFeatureI).  

Round-tripping the sequence should pick it up though.  Since
complement(join(10632..11157,10347..10372)) is not the same as
join(complement(10632..11157),complement(10347..10372)).  

That is essentially what you are doing, correct? i.e. getting the sequences
using Bioperl, saving them (which passes them through SeqIO), reading them
again (back through SeqIO with the malformed location string).

> Or are you saying it's rare to use join (complement(C..D),
> complement(A..B)) instead of complement(join(A..B, C..D)). In that case,
> I guess I just got really unlucky in that five fungal genomes I was
> using decided to use the "rare" syntax.

Location::Split is supposed to handle all variations, but apparently it
isn't.  

> > Note that there are two bugs
> > for that bug
> > report.  The first (and more serious) is still unresolved.  The second
> > (where remote locations are treated differently in
> > Location::Split, which
> > caused more problems than it was worth) had a fix committed
> > about a month
> > ago.
> 
> Sadly, it's the first (and in my case, more common (I have no remote
> locations.)) bug for me.
> 
> > Any fixes I have made for the first bug invariably break several other
> > methods, which use the current Location::Split object logic
> > for retrieving
> > sequences, building feature strings, etc.  Since a new RC is
> > imminent and
> > the bug only affects a small number of locations, I have held
> > off until
> > after a final release is made (the last thing I want to do is
> > fix something
> > that breaks ~6-8 other methods), but I'll try looking at it
> > again this week.
> 
> IMO this is a pretty serious bug (if these kinds of sequences aren't
> that rare as I've shown above), because you're outputting sequence
> descriptions that are just plain wrong. Anyone who uses
> FTLocationFactory to read these output description will have incorrect
> sequence, incorrect translated proteins, etc. And it's even more serious
> if other methods are depending on it.
> 
> I know I can't dictate your time, and should be volunteering to work on
> fixing it. But if it affects other modules, then I will no doubt break
> things even more than you have in your attempts.
> 
> -Amir

I'll give it a look over the next week.  Like I mentioned above, I may be
able to fix it in Split::to_FTstring() w/o breaking other tests (in which
case I'll commit it for the 1.5.2 release), but it would be a temporary hack
until I can work out why other tests are failing.

Chris


From jason at bioperl.org  Mon Oct 16 18:45:21 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Oct 2006 15:45:21 -0700
Subject: [Bioperl-l] split location problems
Message-ID: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>

The whole point of split locations is to represent genes with introns  
so that is not the "rare" case.

I'm confused where the problem is.  The locations that I get out with  
to_FTstring on the location object are exactly the same as those input.

I have processed the genbank fungal genomes into GFF3 and have had no  
problems so I'm confused where you are breaking down.  If I write  
them out as embl I also get the correct thing.  This is using the CVS  
version of bioperl from the HEAD.

I've added code to test this to bug 2101 including a C.glabrata  
chromosome downloaded from genbank.  Perhaps the problem is on the  
EMBL parsing side, I didn't test that.

On the technical side, I still am not sure I fully know where the  
strand information should be stored - the top level container or the  
sub-features.  I'll try and stay up on the discussion if anything has  
been decided that I should know about.

-jason


From torsten.seemann at infotech.monash.edu.au  Mon Oct 16 18:23:23 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 17 Oct 2006 08:23:23 +1000
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <000201c6f149$3ed63490$15327e82@pyrimidine>
References: <000201c6f149$3ed63490$15327e82@pyrimidine>
Message-ID: <4534065B.9020309@infotech.monash.edu.au>

Chris Fields wrote:
>> So it looks like an abstract base class, not an interface that
>> defines a contract or API? Should use Root.pm then, would be my vote.
>> 	-hilmar
> 
> Makes sense to me.  Maybe another audit is needed to catch similar
> instances, or has this been done already?

The purpose of my original (poorly phrased) question was to try and sort 
out where Root and RootI were being used the wrong way around.

I'm currently "all-audited out" so I leave this task to another volunteer.

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From cjfields at uiuc.edu  Mon Oct 16 21:07:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 20:07:55 -0500
Subject: [Bioperl-l] split location problems
In-Reply-To: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
Message-ID: <BE15BACB-6076-4F27-82FF-3DFE10FFD1C0@uiuc.edu>


On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote:

> The whole point of split locations is to represent genes with  
> introns so that is not the "rare" case.
>
> I'm confused where the problem is.  The locations that I get out  
> with to_FTstring on the location object are exactly the same as  
> those input.

The problem is with a subset of split locations described in the  
bug report.  The following works:

complement(join(2691..4571,4918..5163))

whereas this:

join(complement(4918..5163),complement(2691..4571))

gives this:

complement(join(4918..5163,2691..4571))

which is not syntactically the same.  It should be:

complement(join(2691..4571,4918..5163))

since 'join' implies that the order of the segments to be joined is  
important ('order' and 'bond' do not, I guess).
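
To make the expected output concrete, here is a small Python sketch of the
rewrite described above. It is not the BioPerl code and not the eventual
fix: to_complement_join is a made-up helper that works on the feature-table
string only, and it assumes every segment is a plain start..end range
wrapped in complement().

```python
import re

# One complement(start..end) segment.
SEGMENT = re.compile(r"complement\((\d+)\.\.(\d+)\)")

# A join() whose members are all complement(start..end) segments.
JOIN_OF_COMPLEMENTS = re.compile(
    r"join\((complement\(\d+\.\.\d+\)(?:,complement\(\d+\.\.\d+\))*)\)"
)


def to_complement_join(ft):
    """Rewrite join(complement(..),complement(..)) as complement(join(..)).

    The segment order is reversed, because complement() applied to the
    whole join reverses the order in which the segments are read.
    Strings that do not match the simple pattern are returned unchanged.
    """
    match = JOIN_OF_COMPLEMENTS.fullmatch(ft.replace(" ", ""))
    if not match:
        return ft
    segments = SEGMENT.findall(match.group(1))
    segments.reverse()
    inner = ",".join(f"{start}..{end}" for start, end in segments)
    return f"complement(join({inner}))"


if __name__ == "__main__":
    print(to_complement_join("join(complement(4918..5163),complement(2691..4571))"))
```

The point of the reversal is the one made above: with 'join' the order of
the segments matters, so moving complement() to the outside without
reordering changes what the location means.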

> I have processed the genbank fungal genomes into GFF3 and have had  
> no problems so I'm confused where you are breaking down.  If I  
> write them out as embl I also get the correct thing.  This is using  
> the CVS version of bioperl from the HEAD.
>
> I've added code to test this to bug 2101 including a C.glabrata  
> chromosome downloaded from genbank.  Perhaps the problem is on the  
> EMBL parsing side, I didn't test that.
>
> On the technical side, I still am not sure I fully know where the  
> strand information should be stored - the top level container or  
> the sub-features.  I'll try and stay up on the discussion if  
> anything has been decided that I should know about.
>
> -jason

Split::strand() sets the sublocations as well, which seems to confuse  
the situation more but it is consistent with LocationI, as Hilmar  
points out.  I'm looking into a few solutions now, including a fix in  
Split::to_FTstring().

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Mon Oct 16 22:48:14 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Oct 2006 19:48:14 -0700
Subject: [Bioperl-l] split location problems
In-Reply-To: <BE15BACB-6076-4F27-82FF-3DFE10FFD1C0@uiuc.edu>
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
	<BE15BACB-6076-4F27-82FF-3DFE10FFD1C0@uiuc.edu>
Message-ID: <8273f6c20610161948w201537a5v2fcfa189eb809283@mail.gmail.com>

This probably was exposed by the fact that the Split object used to
explicitly sort the features by start*strand always.  But with remote
locations and needing to be able to explicitly set the order (for features
that are not required to be 5' -> 3') that code must have been removed.   I
think there is just one place that must be missing a 'reverse' on the list
of sub-locations when the top-level feature is a complement.  I'll wait for
your fix before wading in - we probably might want to figure out a
'consolidate' method to shrink redundant and equivalent representations to
the shortest possible form. Ugh this really starts to resemble trying to
write a boolean logic toolkit....
-jason
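
One way to think about the 'consolidate' idea is to expand each location
string into the ordered list of segments it describes and compare those
lists. The Python below is only a sketch of that idea under simplifying
assumptions: segments() and split_top_level() are hypothetical helpers, not
BioPerl methods, and only join(), complement() and plain start..end ranges
are handled (no remote locations, fuzzy ends or single positions).

```python
def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts


def segments(ft):
    """Return [(start, end, strand), ...] in the order the segments are read."""
    ft = ft.replace(" ", "")
    if ft.startswith("complement(") and ft.endswith(")"):
        inner = segments(ft[len("complement("):-1])
        # Complementing flips the strand and reverses the reading order.
        return [(s, e, -strand) for s, e, strand in reversed(inner)]
    if ft.startswith("join(") and ft.endswith(")"):
        out = []
        for part in split_top_level(ft[len("join("):-1]):
            out.extend(segments(part))
        return out
    start, end = ft.split("..")
    return [(int(start), int(end), 1)]


if __name__ == "__main__":
    a = "join(complement(4918..5163),complement(2691..4571))"
    b = "complement(join(2691..4571,4918..5163))"
    c = "complement(join(4918..5163,2691..4571))"
    print(segments(a) == segments(b))  # equivalent representations
    print(segments(a) == segments(c))  # the string the bug produces
```

Two strings that expand to the same list are equivalent, so the shortest
of them could serve as the consolidated form; the missing 'reverse' shows
up as a list in the opposite order.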

On 10/16/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
>
> On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote:
>
> > The whole point of split locations is to represent genes with
> > introns so that is not the "rare" case.
> >
> > I'm confused where the problem is.  The locations that I get out
> > with to_FTstring on the location object are exactly the same as
> > those input.
>
> The problem is with a subset of split locations described in the
> bug report.  The following works:
>
> complement(join(2691..4571,4918..5163))
>
> whereas this:
>
> join(complement(4918..5163),complement(2691..4571))
>
> gives this:
>
> complement(join(4918..5163,2691..4571))
>
> which is not syntactically the same.  It should be:
>
> complement(join(2691..4571,4918..5163))
>
> since 'join' implies that the order of the segments to be joined is
> important ('order' and 'bond' do not, I guess).
>
> > I have processed the genbank fungal genomes into GFF3 and have had
> > no problems so I'm confused where you are breaking down.  If I
> > write them out as embl I also get the correct thing.  This is using
> > the CVS version of bioperl from the HEAD.
> >
> > I've added code to test this to bug 2101 including a C.glabrata
> > chromosome downloaded from genbank.  Perhaps the problem is on the
> > EMBL parsing side, I didn't test that.
> >
> > On the technical side, I still am not sure I fully know where the
> > strand information should be stored - the top level container or
> > the sub-features.  I'll try and stay up on the discussion if
> > anything has been decided that I should know about.
> >
> > -jason
>
> Split::strand() sets the sublocations as well, which seems to confuse
> the situation more but it is consistent with LocationI, as Hilmar
> points out.  I'm looking into a few solutions now, including a fix in
> Split::to_FTstring().
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


-- 
Jason Stajich
jason at bioperl.org
http://www.duke.edu/~jes12/


From cjfields at uiuc.edu  Mon Oct 16 23:34:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 22:34:25 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C159C54B.ACD5%bosborne11@verizon.net>
References: <C159C54B.ACD5%bosborne11@verizon.net>
Message-ID: <AE334107-1639-468E-ABA8-2F992693809A@uiuc.edu>


On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote:

> Chris and Sendu,
>
> Sendu was correct in wondering whether id_parser() in Blast.pm  
> would work
> after the module was altered to use SearchIO but what I've found  
> out from my
> local tests is that id_parser() didn't work when BPlite was being used
> either. I can continue to work on this but it's safe to say that  
> removing
> BPlite doesn't cause a problem with id_parser, it was already there.
>
> Brian O.

....

It may be one reason (the main reason?) the method wasn't tested.   
Maybe it should be removed if it can't be easily fixed; I don't think  
it makes sense keeping it otherwise.

Chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bosborne11 at verizon.net  Mon Oct 16 23:24:59 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 23:24:59 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <000301c6f155$c7029ff0$15327e82@pyrimidine>
Message-ID: <C159C54B.ACD5%bosborne11@verizon.net>

Chris and Sendu,

Sendu was correct in wondering whether id_parser() in Blast.pm would work
after the module was altered to use SearchIO but what I've found out from my
local tests is that id_parser() didn't work when BPlite was being used
either. I can continue to work on this but it's safe to say that removing
BPlite doesn't cause a problem with id_parser, it was already there.

Brian O.


On 10/16/06 3:03 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:

>> Hilmar Lapp wrote:
>>> The problem is it is not maintained, and there are outstanding bug
>>> reports.
>>> 
>>> If you un-deprecate it, then we need a response to people who come
>>> across problems with it when using it. Either you change the POD to say
>>> exactly who and when one should use it (or rather not) and point to the
>>> fact that it is unsupported for all other cases.
>>> 
>>> Or what would you suggest?
>> 
>> I'm not sure.
>> 
>> Does Bio::Index::Blast even work correctly? Does it suffer from whatever
>> bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should
>> that be deprecated as well?
>> 
>> Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO
>> and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't
>> seem trivial (or even appropriate).
>> 
>> Ultimately I just wanted to solve the warnings in the test suite.
>> Thoughts, Chris?
> 
> My opinion is we either have to completely support BPlite (and the others)
> or drop it altogether.  I don't think we can state "use BPlite only with
> Bio::Index::Blast, use SearchIO everywhere else."  That's too inconsistent.
> 
> 
> It seems simpler to deprecate the various Bio::Tools::BP* classes and either
> fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working
> on) or deprecate Bio::Index::Blast as well.
> 
> The warnings in the test suite belong to BlastIndex.t, correct?  I updated
> using Brian's Bio::Index::Blast fix and it passes now w/o warnings.
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 


From bosborne11 at verizon.net  Mon Oct 16 23:48:56 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 23:48:56 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <AE334107-1639-468E-ABA8-2F992693809A@uiuc.edu>
Message-ID: <C159CAE8.ACD9%bosborne11@verizon.net>

Chris,

OK. In fact there's no written guarantee that all Bio::Index* modules have
an id_parser() method. It happens that most do, and it's useful. I'll fix
the documentation in Bio::Index::Blast and add an enhancement request to
Bugzilla; I may be able to get around to it before the 1.5.2 release but no promises.

Brian O.


On 10/16/06 11:34 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> 
> On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote:
> 
>> Chris and Sendu,
>> 
>> Sendu was correct in wondering whether id_parser() in Blast.pm
>> would work
>> after the module was altered to use SearchIO but what I've found
>> out from my
>> local tests is that id_parser() didn't work when BPlite was being used
>> either. I can continue to work on this but it's safe to say that
>> removing
>> BPlite doesn't cause a problem with id_parser, it was already there.
>> 
>> Brian O.
> 
> ....
> 
> It may be one reason (the main reason?) the method wasn't tested.
> Maybe it should be removed if it can't be easily fixed; I don't think
> it makes sense keeping it otherwise.
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 


From n.haigh at sheffield.ac.uk  Tue Oct 17 02:35:43 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 07:35:43 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
Message-ID: <453479BF.90408@sheffield.ac.uk>

I'm a bit unclear as to what is happening with these files.

Are these files now superseded by the wikified versions? If so, should 
these files now just simply contain a link to the wikified versions - 
otherwise things could get in a mess since I updated the wiki version of 
INSTALL.WIN and I see Sendu updated the cvs versions a couple of weeks 
ago - hopefully these differences aren't that big.

Nath


From faruque at ebi.ac.uk  Tue Oct 17 04:19:44 2006
From: faruque at ebi.ac.uk (Nadeem Faruque)
Date: Tue, 17 Oct 2006 09:19:44 +0100
Subject: [Bioperl-l] split location problems
Message-ID: <F2A2DB48-8EDF-43AA-AFCF-45B48AF43B1C@ebi.ac.uk>

EMBL currently outputs join-complements in the format
join(complement(30..40),complement(10..20))
instead of the Genbank preferred
complement(join(10..20,30..40))

EMBL's format may reflect what happens in the cell a little more closely  
than Genbank's, but it is less readable and less concise.
NB I've also seen a couple of people construct these incorrectly
eg join(complement(10..20),complement(30..40))

I believe we are moving to the complement-join format but I can't  
give a date for the transition.

Having said that, trans-splicing will still give us the joys of  
complex locations,
eg
join(1..5,complement(join(10..20,30..40)))
complement(join(30..40,10..20)) <- looks wrong (unless it is a very  
small circle) but mis-ordered exons are resolved by the trans- 
splicing machinery.
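
As a rough aid for spotting the incorrectly constructed form, the check
below flags join(complement(...)) strings whose segments are not in
descending coordinate order. It is a heuristic sketch in Python, not an
EMBL or BioPerl tool: is_descending_complement_join is a made-up name, only
plain start..end ranges are recognised, and a flagged location may still be
legitimate (for example the trans-splicing cases above).

```python
import re

# Start coordinate of each complement(start..end) segment.
COMPLEMENT_START = re.compile(r"complement\((\d+)\.\.\d+\)")


def is_descending_complement_join(ft):
    """True if the complement() segments appear in descending start order."""
    starts = [int(s) for s in COMPLEMENT_START.findall(ft.replace(" ", ""))]
    return starts == sorted(starts, reverse=True)


if __name__ == "__main__":
    print(is_descending_complement_join("join(complement(30..40),complement(10..20))"))
    print(is_descending_complement_join("join(complement(10..20),complement(30..40))"))
```

The first call corresponds to the format described at the top of this
message, the second to the mis-ordered construction.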

Nadeem


--
S.M. Nadeem N. Faruque
EMBL Nucleotide Database Curation Team
EMBL Outstation
Tel: +44 1223 494611                     Fax: +44 1223 494472
The European Bioinformatics Institute    URL: http://www.ebi.ac.uk/
Email for data submissions: datasubs at ebi.ac.uk
Email for updates: update at ebi.ac.uk
========================================================


From bix at sendu.me.uk  Tue Oct 17 04:59:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 09:59:36 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
References: <4521E74E.1040404@infotech.monash.edu.au>	<452F54A1.7010908@sendu.me.uk>	<5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>	<45333E02.9070808@sendu.me.uk>
	<1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
Message-ID: <45349B78.8090905@sendu.me.uk>

Hilmar Lapp wrote:
> So it looks like an abstract base class, not an interface that  
> defines a contract or API? Should use Root.pm then, would be my vote.

Agreed, that was actually what I did in my local copy when I made a new 
inheriting class (so discovering the problem). This change is harmless 
to other modules, but does mean they'll have redundant use of 
Bio::Root::Root which will want cleaning up at some stage.


From bix at sendu.me.uk  Tue Oct 17 06:32:54 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 11:32:54 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
Message-ID: <4534B156.4090501@sendu.me.uk>

Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
See http://www.bioperl.org/wiki/Release_1.5.2 for
instructions on getting and testing this RC.

Developers:
   This should be the last RC before release ~next monday. Now would
   be a good time for last minute documentation updates and additions.

Users:
   Even though 1.5.2 is a 'developer' release, we consider it the most
   stable and capable version of Bioperl, and recommend that you use
   it in all but the most critical production environments. Please
   try it out and let us know of any problems or difficulties you run
   into.


Thank you,
Sendu.


From cjfields at uiuc.edu  Tue Oct 17 07:16:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 06:16:47 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <453479BF.90408@sheffield.ac.uk>
References: <453479BF.90408@sheffield.ac.uk>
Message-ID: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>

The general consensus was to keep text versions available; we could  
add URL links to the wiki pages for the most up-to-date version.  BTW,  
I have modified INSTALL already.  INSTALL.WIN is next in line (I was  
waiting for your changes).

Chris

On Oct 17, 2006, at 1:35 AM, Nathan S. Haigh wrote:

> I'm a bit unclear as to what is happening with these files.
>
> Are these files now superseded by the wikified versions? If so, should
> these files now just simply contain a link to the wikified versions -
> otherwise things could get in a mess since I updated the wiki  
> version of
> INSTALL.WIN and I see Sendu updated the cvs versions a couple of weeks
> ago - hopefully these differences aren't that big.
>
> Nath

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Tue Oct 17 07:45:45 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 12:45:45 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>
References: <453479BF.90408@sheffield.ac.uk>
	<72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>
Message-ID: <4534C269.5050704@sheffield.ac.uk>

Chris Fields wrote:
> The general consensus was to keep text versions available; we could 
> add URL links to the wiki pages for the most up-to-date version.  BTW, 
> I have modified INSTALL already.  INSTALL.WIN is next in line (I was 
> waiting for your changes).
>
Is it possible to generate these files from the wiki whenever there is a 
release? I know edits shouldn't be too severe or too often - but I can 
see things getting a little messy/annoying if edits have to be made in 2 
places.

Nath


From cjfields at uiuc.edu  Tue Oct 17 10:04:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:04:32 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534C269.5050704@sheffield.ac.uk>
Message-ID: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>

There isn't a very easy way since so many links have to be removed/modified.
I have found a few CPAN modules that could help, but for now I just dump the
text output from a text browser (elinks) using the 'printable version' page
and hand-edit, which works very quickly.  That works for the time being
until I can find another more automated solution.

Fortunately there have been very few edits to either INSTALL wiki page so
they should remain relatively stable.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
> Sent: Tuesday, October 17, 2006 6:46 AM
> To: Chris Fields
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
> 
> Chris Fields wrote:
> > The general consensus was to keep text versions available; we could
> > add URL links to the wiki pages for the most up-to-date version.  BTW,
> > I have modified INSTALL already.  INSTALL.WIN is next in line (I was
> > waiting for your changes).
> >
> Is it possible to generate these files from the wiki whenever there is a
> release? I know edits shouldn't be too severe or too often - but I can
> see things getting a little messy/annoying if edits have to be made in 2
> places.
> 
> Nath


From cjfields at uiuc.edu  Tue Oct 17 10:12:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:12:09 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C159CAE8.ACD9%bosborne11@verizon.net>
Message-ID: <000401c6f1f6$424b5580$15327e82@pyrimidine>


> Chris,
> 
> OK. In fact there's no written guarantee that all Bio::Index* modules have
> an id_parser() method. It happens that most do, and it's useful. I'll fix
> the documentation in Bio::Index::Blast and add an enhancement request to
> Bugzilla; I may be able to get around to it before the 1.5.2 release but no
> promises.
> 
> Brian O.

Do the various Bio::Index* modules share a common interface?  

I wouldn't worry too much about it for this release, unless you really have
time.  It is still, after all, a developer's release, and you've noted it in
Bugzilla.  We could try for another dev release in winter (rel 1.5.3, I
guess) to get any bug fixes or new modules added.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


> On 10/16/06 11:34 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:
> 
> >
> > On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote:
> >
> >> Chris and Sendu,
> >>
> >> Sendu was correct in wondering whether id_parser() in Blast.pm would
> >> work after the module was altered to use SearchIO, but what I've found
> >> out from my local tests is that id_parser() didn't work when BPlite
> >> was being used either. I can continue to work on this, but it's safe
> >> to say that removing BPlite doesn't cause a problem with id_parser();
> >> the problem was already there.
> >>
> >> Brian O.
> >
> > ....
> >
> > It may be one reason (the main reason?) the method wasn't tested.
> > Maybe it should be removed if it can't be easily fixed; I don't think
> > it makes sense keeping it otherwise.
> >
> > Chris
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l


From n.haigh at sheffield.ac.uk  Tue Oct 17 10:15:17 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 15:15:17 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
Message-ID: <4534E575.5050308@sheffield.ac.uk>

Chris Fields wrote:
> There isn't a very easy way since so many links have to be removed/modified.
> I have found a few CPAN modules that could help, but for now I just dump the
> text output from a text browser (elinks) using the 'printable version' page
> and hand-edit, which works very quickly.  That works for the time being
> until I can find another more automated solution.
>
> Fortunately there have been very few edits to either INSTALL wiki page so
> they should remain relatively stable.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>   
So am I correct in saying that the best way is to make all updates to 
the wikified versions of these files, and then at regular 
intervals/major releases you (or someone else) will update the CVS 
version of the files in the way described above?

Cheers
Nath


From bix at sendu.me.uk  Tue Oct 17 10:00:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 15:00:39 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534E09C.9030707@genomics.dk>
References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk>
Message-ID: <4534E207.8030508@sendu.me.uk>

Niels Larsen wrote:
> Greetings,
> 
> I am no perl beginner, but I am a BioPerl beginner. Today I looked
> for remote similarity services that can be used from Perl. I found
> the EBI SOAP interface where their example script returns
> 
> Can't find method element in the message at 
> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

What script exactly? There was a problem with the SOAP server that was 
fixed earlier today.


> and the DDBJ service which (from Denmark) returns
> 
> undef

What returned undef? Specifics please.


> and then the NCBI server accessed through BioPerl's RemoteBlast which
> seems to spin in a loop that fills TMPDIR with many tempfiles. Will
> release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall
> is working towards that).

What version of Bioperl were you testing with? What did you do to get it 
to 'spin in a loop'? I can tell you that remote blasting certainly works 
in Bioperl 1.5.2, but you'll have to give more details on the things you 
tried and the problems you encountered.

You can also answer the questions yourself by trying the release candidate.


From B.Beckert at ibmc.u-strasbg.fr  Tue Oct 17 09:59:30 2006
From: B.Beckert at ibmc.u-strasbg.fr (Bertrand Beckert)
Date: Tue, 17 Oct 2006 15:59:30 +0200
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
Message-ID: <CE7A1962-05A0-46CD-8832-BC9C5A8C7656@ibmc.u-strasbg.fr>


hi,

I am running a large number of BLAST searches through the NCBI BLAST
page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi').
I am trying to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I
have some problems. I made a simple example with only one sequence in
order to understand how this module works. This is my simple input
file, a DNA sequence in FASTA format:

> test
>
TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA
I have made some modifications to the example available in the BioPerl
documentation.
It gives me a RID which identifies the results of my BLAST, but I have a
problem with the "$result=$factory->retrieve_blast($rid)" call in my script.
The documentation says that $result=$factory->retrieve_blast($rid)
returns, when it works, a Bio::Tools::BPlite or Bio::Tools::Blast
object. In my case it returns a Bio::SearchIO::blast... I don't
understand why I don't get the documented type of object (see PART I).

I also tried to solve the problem by replacing the foreach loop in my
script with a new one in order to explore the BLAST result page, but it
also doesn't work (see PART II).

Could you help me, please? Thank you.

Bertrand Beckert.

PART I:

Here is my script, lightly annotated, followed by the shell output:
------------------------------------------------------------------------
#!/usr/bin/perl -w
use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;
sub blast {
    my $prog='blastn';
    my $db='refseq_genomic';
    my $e_val='1e-10';
    my $Input='Seq.fasta';
    my @params = ('-prog' =>  $prog, '-data' =>  $db, '-expect' => $e_val,
                  '-readmethod' => 'SearchIO');
    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
    #changes parameters
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
    $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';
    $factory->submit_blast($Input);
    print STDERR "waiting...\n";
    while (my @rids=$factory->each_rid) {
        print "my rid: ",@rids,"\n";
        # returns the ID of the submitted blast, i.e. RID:
        # 1161079157-766-185099855365.BLASTQ2
        # this page contains the result of my blast...
        foreach my $rid (@rids) {
            $result=$factory->retrieve_blast($rid);
            # line to show what type of object is returned by retrieve_blast
            print "rc:", $result,"\n";
        }
    }
}

&blast;
------------------------------------------------------------------------

Here is the output in the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...

PART II:

I also tried to solve the problem by replacing the foreach loop in my
script with:
------------------------------------------------------------------------
foreach my $rid (@rids) {
    while(1) {
        $result=$factory->retrieve_blast($rid)->next_result();
        print "rc:", $result,"\n";
        if ($result) {
            print  $result->num_hits(),"\n";
        }
    }
}
------------------------------------------------------------------------
With this loop I could explore the BLAST result page. This is what I
obtain in the shell window:

bbeckert@tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834)


----
-- 
Bertrand BECKERT
PhD student
IBMC - UPR 9002 du CNRS - ARN
15, rue Rene Descartes
F-67084 STRASBOURG Cedex
b.beckert at ibmc.u-strasbg.fr


From niels at genomics.dk  Tue Oct 17 09:54:36 2006
From: niels at genomics.dk (Niels Larsen)
Date: Tue, 17 Oct 2006 15:54:36 +0200
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4534E09C.9030707@genomics.dk>

Greetings,

I am no perl beginner, but I am a BioPerl beginner. Today I looked
for remote similarity services that can be used from Perl. I found
the EBI SOAP interface where their example script returns

Can't find method element in the message at 
/ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

and the DDBJ service which (from Denmark) returns

undef

and then the NCBI server accessed through BioPerl's RemoteBlast which
seems to spin in a loop that fills TMPDIR with many tempfiles. Will
release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall
is working towards that).

Niels L


------------------------------------------------------------------------

Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark

Telephone: +45-8942-5268
Telefax: +45-8620-1222

------------------------------------------------------------------------


From cjfields at uiuc.edu  Tue Oct 17 10:28:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:28:40 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534E575.5050308@sheffield.ac.uk>
Message-ID: <000501c6f1f8$8b78efe0$15327e82@pyrimidine>

...
> So am I correct in saying that the best way is to make all updates to
> the wikified versions of these files, and then at regular
> intervals/major releases you (or someone else) will update the CVS
> version of the files in the way described above?
> 
> Cheers
> Nath

Yes.  I think the online docs will stay relatively stable.  A week or so ago
Mauricio and I were discussing moving the dependencies list to its own CVS
document (since they pertain to all Bioperl installations, not just UNIX'y
flavors).  I haven't done that yet since I was waiting on the INSTALL.WIN
changes before I made any more changes.  Well, that and I've been really
busy doing other things.

One way we could make sure that changes to the online docs would match the
CVS docs would be to only allow certain wiki users (such as sysadmins) to make
modifications to those pages.  That way any changes would have to go through
someone who also has CVS access and could make similar changes to the
distribution docs.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From n.haigh at sheffield.ac.uk  Tue Oct 17 10:37:38 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 15:37:38 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <000501c6f1f8$8b78efe0$15327e82@pyrimidine>
References: <000501c6f1f8$8b78efe0$15327e82@pyrimidine>
Message-ID: <4534EAB2.50609@sheffield.ac.uk>

Chris Fields wrote:
> ...
>   
>> So am I correct in saying that the best way is to make all updates to
>> the wikified versions of these files, and then at regular
>> intervals/major releases you (or someone else) will update the CVS
>> version of the files in the way described above?
>>
>> Cheers
>> Nath
>>     
>
> Yes.  I think the online docs will stay relatively stable.  A week or so ago
> Mauricio and I were discussing moving the dependencies list to its own CVS
> document (since they pertain to all Bioperl installations, not just UNIX'y
> flavors).  I haven't done that yet since I was waiting on the INSTALL.WIN
> changes before I made any more changes.  Well, that and I've been really
> busy doing other things.
>   
Sounds good.
> One way we could make sure that changes to the online docs would match the
> CVS docs would be to only allow certain wiki users (such as sysadmins) to make
> modifications to those pages.  That way any changes would have to go through
> someone who also has CVS access and could make similar changes to the
> distribution docs.
>   
Ugh, not sure I like the sound of maintaining 2 copies of any files - 
sounds like a future headache even if they are pretty stable. It also 
makes it unclear which of the two files should be considered first (i.e. 
is the most up-to-date) on pages such as:
http://www.bioperl.org/wiki/Installing_BioPerl

It suggests that INSTALL and INSTALL.WIN should be looked at first, but 
there are online copies of those files available - this should now be 
the other way around - shouldn't it? I might just be making a mountain 
out of a molehill, so I'll shut up on this topic and make any future 
edits to the wiki pages instead.
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>
>   


From bosborne11 at verizon.net  Tue Oct 17 10:48:54 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 17 Oct 2006 10:48:54 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <000401c6f1f6$424b5580$15327e82@pyrimidine>
Message-ID: <C15A6596.AD0B%bosborne11@verizon.net>

Chris,

The Bio::Index modules either 'use base qw(Bio::Index::Abstract)' or 'use
base qw(Bio::Index::AbstractSeq)'. Neither of these modules has an
id_parser() method.
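
A quick way to check which index classes actually respond to id_parser()
is to ask each class with can(). This is only a sketch: the helper name
classes_with_method is made up for illustration, and the classes have to
be loaded before they are queried.

```perl
use strict;
use warnings;

# Return the subset of @classes that can respond to $method, either
# directly or through inheritance. Classes must already be loaded.
sub classes_with_method {
    my ($method, @classes) = @_;
    return grep { $_->can($method) } @classes;
}

# Hypothetical usage with real modules (requires BioPerl to be installed):
#   require Bio::Index::Blast;
#   require Bio::Index::Fasta;
#   print "$_\n" for classes_with_method('id_parser',
#       qw(Bio::Index::Blast Bio::Index::Fasta));
```

Since can() follows inheritance, a class that only inherits from
Bio::Index::Abstract would be reported only if one of its ancestors
defines the method.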

Brian O.


On 10/17/06 10:12 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> Do the various Bio::Index* modules share a common interface?  


From cjfields at uiuc.edu  Tue Oct 17 10:45:53 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:45:53 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534EAB2.50609@sheffield.ac.uk>
Message-ID: <000601c6f1fa$f260b560$15327e82@pyrimidine>

...
> > One way we could make sure that changes to the online docs would match
> > the CVS docs would be to only allow certain wiki users (such as
> > sysadmins) to make modifications to those pages.  That way any changes
> > would have to go through someone who also has CVS access and could make
> > similar changes to the distribution docs.
> >
> Ugh, not sure I like the sound of maintaining 2 copies of any files -
> sounds like a future headache even if they are pretty stable. It also
> makes it unclear which of the two files should be considered first (i.e.
> is the most up-to-date) on pages such as:
> http://www.bioperl.org/wiki/Installing_BioPerl
> 
> It suggests that INSTALL and INSTALL.WIN should be looked at first, but
> there are online copies of those files available - this should now be
> the other way around - shouldn't it? I might just be making a mountain
> out of a molehill, so I'll shut up on this topic and make any future
> edits to the wiki pages instead.

Yes that should be the other way around (the wiki would be the most
up-to-date), so the CVS docs should point to the wiki, not vice-versa.

Getting the docs right is as important as getting the code to work.  So I
don't consider it a 'mountain-out-of-a-molehill' problem.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Tue Oct 17 11:07:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 10:07:49 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534E207.8030508@sendu.me.uk>
Message-ID: <001001c6f1fe$02fd4de0$15327e82@pyrimidine>

> Niels Larsen wrote:
> > Greetings,
> >
> > I am no perl beginner, but I am a BioPerl beginner. Today I looked
> > for remote similarity services that can be used from Perl. I found
> > the EBI SOAP interface where their example script returns
> >
> > Can't find method element in the message at
> > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.
> 
> What script exactly? There was a problem with the SOAP server that was
> fixed earlier today.
> 
> 
> > and the DDBJ service which (from Denmark) returns
> >
> > undef
> 
> What returned undef? Specifics please.
> 

The first problem, like Sendu mentions, was fixed on the remote server (I
get them to pass now).  Those were from bioperl-run, though, not the bioperl
core distribution.

As for DDBJ, do you mean EBI or SwissProt?  I ask b/c you mention Denmark.
EBI were having server maintenance outages yesterday, which was announced
here.

As Sendu mentions, please be more specific.

> > and then the NCBI server accessed through BioPerl's RemoteBlast which
> > seems to spin in a loop that fills TMPDIR with many tempfiles. Will
> > release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall
> > is working towards that).
> 
> What version of Bioperl were you testing with? What did you do to get it
> to 'spin in a loop'? I can tell you that remote blasting certainly works
> in Bioperl 1.5.2, but you'll have to give more details on the things you
> tried and the problems you encountered.
> 
> You can also answer the questions yourself by trying the release
> candidate.

The tempfiles showing up are from the repeated RID requests and are deleted
after the BLAST run (at least they should be); this is quite normal.  They
don't 'spin in a loop' unless the BLAST query is taking a particularly long
time, which can happen depending on how the BLAST query is set up, i.e. what
type of BLAST program is requested, if comp-based stats are requested,
length of query, database requested, etc.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Oct 17 11:14:07 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 16:14:07 +0100
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
In-Reply-To: <CE7A1962-05A0-46CD-8832-BC9C5A8C7656@ibmc.u-strasbg.fr>
References: <CE7A1962-05A0-46CD-8832-BC9C5A8C7656@ibmc.u-strasbg.fr>
Message-ID: <4534F33F.3070809@sendu.me.uk>

Bertrand Beckert wrote:
> hi,
> 
> I am running a large number of BLAST searches through the NCBI BLAST
> page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi').
> I am trying to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I
> have some problems.
[snip]
> The documentation says that $result=$factory->retrieve_blast($rid)
> returns, when it works, a Bio::Tools::BPlite or Bio::Tools::Blast
> object. In my case it returns a Bio::SearchIO::blast... I don't
> understand why I don't get the documented type of object (see PART I).

I take it you're using some old version of Bioperl where unfortunately 
the documentation was incorrect. In fact you're supposed to get a 
Bio::SearchIO object, so it is a good thing that you are. The latest 
version of Bioperl has (as far as I can see) correct documentation and 
behaviour.

Bio::Tools::BPlite and Bio::Tools::Blast are deprecated. You want 
Bio::SearchIO::blast. All is well.


> I also tried to solve the problem by replacing the foreach loop in my
> script with a new one in order to explore the BLAST result page, but it
> also doesn't work (see PART II).

I'm not really sure what problem you might be facing there, but take a 
look at some up-to-date documentation, using the new example code:

http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html
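
For reference, the polling pattern in that documentation looks roughly
like the sketch below. The wrapper name collect_blast_results, the
callback and the $delay parameter are illustrative only; each_rid,
retrieve_blast, remove_rid and next_result are the calls used in the
module's synopsis. The point is that retrieve_blast() gives back a plain
number while the job is pending or has failed, and that each RID has to
be removed once handled, otherwise the outer loop keeps asking for it.

```perl
use strict;
use warnings;

# Poll a RemoteBlast-style factory until every submitted RID is resolved.
# $factory must provide each_rid, retrieve_blast and remove_rid.
# $on_result is called with each result object that is parsed.
# $delay (seconds between polls) is an illustrative parameter.
sub collect_blast_results {
    my ($factory, $on_result, $delay) = @_;
    $delay = 5 unless defined $delay;

    while (my @rids = $factory->each_rid) {
        foreach my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if (!ref $rc) {
                if ($rc < 0) {
                    # The request failed, so stop asking for it.
                    $factory->remove_rid($rid);
                }
                else {
                    # Still running; wait before polling again.
                    sleep $delay;
                }
                next;
            }
            # $rc is a SearchIO parser: drain it, then drop the RID so
            # that the outer while loop can terminate.
            while (my $result = $rc->next_result) {
                $on_result->($result);
            }
            $factory->remove_rid($rid);
        }
    }
    return;
}

# Hypothetical usage, given a factory on which submit_blast() was called:
#   collect_blast_results($factory,
#       sub { print $_[0]->query_name, ": ", $_[0]->num_hits, " hits\n" });
```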


From n.haigh at sheffield.ac.uk  Tue Oct 17 12:10:15 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 17:10:15 +0100
Subject: [Bioperl-l] [Fwd: Re: Bundle::BioPerl]
Message-ID: <45350067.6070604@sheffield.ac.uk>

FYI on Bundle::BioPerl

Nathan

-------- Original Message --------
Subject: 	Re: Bundle::BioPerl
Date: 	Tue, 17 Oct 2006 11:52:00 -0400
From: 	Chris Dagdigian <dag at sonsorol.org>
To: 	Nathan S. Haigh <n.haigh at sheffield.ac.uk>
References: 	<45348FB8.4050009 at sheffield.ac.uk>


Hi Nathan,

I've updated the Bundle and uploaded it to CPAN.

I *think* the rationale for keeping it still exists but I'm removed  
enough from Bioperl now that I'll defer to others on the decision.

The basic idea was that BioPerl requires a heck of a lot of other Perl
modules in order to get all the functionality out of it. Many of these
dependencies may not be present in default Perl installations.  Tracking
down all of the dependencies and installing them (along with all of the
dependencies-of-the-dependencies) by hand is a massive pain.

The nice thing about the Bundle is that it lists the core module  
dependencies and it works great with the CPAN.pm module to automate  
the downloading and installation of everything that BioPerl requires.  
The CPAN module is smart enough that when processing *our* bundle it  
will also track down and install anything that our bundle entries  
themselves list as a dependency.

So for Unix/Linux systems the Bundle is a great one-liner
("perl -MCPAN -e 'install Bundle::BioPerl'") way to auto-install or update
the many perl modules that BioPerl makes use of.
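
To see which prerequisites are absent before (or after) running the
Bundle, something like the following sketch can be used. The helper name
missing_modules is made up, and the module list is only illustrative
(names taken from errors reported on this list); the real prerequisites
are the ones named in the Bundle itself.

```perl
use strict;
use warnings;

# Return the module names from the argument list that cannot be loaded
# by the running perl.
sub missing_modules {
    my @modules = @_;
    my @missing;
    foreach my $module (@modules) {
        (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
        push @missing, $module unless eval { require $file; 1 };
    }
    return @missing;
}

# Illustrative list only, not the Bundle's actual contents.
print "missing: $_\n" for missing_modules(qw(DBD::mysql SOAP::Lite XML::Parser));
```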

On the windows side, not sure if it is of any help though.

Regards,
Chris


On Oct 17, 2006, at 4:09 AM, Nathan S. Haigh wrote:

> Hi Chris
>
> I've been working on making a PPD for the upcoming Bioperl 1.5.2  
> release. During this time I also updated Bundle::BioPerl to include  
> up-to-date prereqs. I was wondering if you could update the CPAN  
> package? The updated BioPerl.pm file is attached.
>
> There is some talk about why and if we need Bundle::BioPerl  
> anymore. What was the rationale for having it in the first place,  
> and does it still hold true now?
>
> Cheers
> Nath
>


From plu5even at gmail.com  Tue Oct 17 12:26:34 2006
From: plu5even at gmail.com (Peter H. Baenziger)
Date: Tue, 17 Oct 2006 12:26:34 -0400
Subject: [Bioperl-l] LocatableSeq object vs Sequence Object
Message-ID: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com>

All,
This is my first bioperl script (but not my first Perl script) so
please forgive my naivety.  I've read through documentation and looked
through cookbooks and the like but to no avail.  Any advice is
appreciated.

So... I am working with an alignment object of several sequences.  My
intention is to loop through all the sequences of the alignment to
find what amino acid they have at a known position in the alignment
(not the position in the sequence).  I was thinking I could use:

foreach $seq ($alignment->each_seq())

to loop through the sequences and call:

$seq->location_from_column($pos)

on each of the sequences.  However, I don't think I have
Bio::LocatableSeq objects (the type of object that has the method
"location_from_column") being returned by $alignment->each_seq().
So, how do I bridge this gap?  Or is there a better way?
My appreciation in advance!
Peter

code:
my $swissObj = $swissdb->get_Seq_by_acc($query);  # put several of these in @sequenceObjects
...
my $alignFactory = Bio::Tools::Run::Alignment::Clustalw->new();
    my $alignment = $alignFactory->align(\@sequenceObjects);
    #print $alignment->overall_percentage_identity(); #works

    # now we find the "alignment position" of the mutation we have on the
    # human version and get the amino acid at that "alignment position"
    # for all seq
    my $humanSequence = $prefix."HUMAN";
    # this is the "alignment position" equivalent to the mutation position
    my $pos = $alignment->column_from_residue_number($humanSequence, $aa_seqpos);

    # we'll keep track of what amino acid each species has at the
    # "alignment equivalent" location listed as being a mutation on the
    # human version
    foreach $seq ($alignment->each_seq())
    {
        # print $seq->species() . "\n";  # won't work because
        # $alignment->each_seq() actually returns a LocatableSeq object,
        # not a normal sequence object
        $speciesAA{$species} = $seq->location_from_column($pos);
    }
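
One possible way to read the residue in a given alignment column, sketched
under the assumption that seq() on each aligned sequence returns the
gapped string (so that string offsets line up with alignment columns).
The helper name residues_at_column is made up for illustration; each_seq,
display_id and seq are the only object methods it relies on.

```perl
use strict;
use warnings;

# Map each aligned sequence's display_id to the character found in
# alignment column $col (1-based). Gap characters are returned as-is.
sub residues_at_column {
    my ($aln, $col) = @_;
    my %residue_for;
    foreach my $seq ($aln->each_seq) {
        # The aligned string includes gap characters, so offset
        # $col - 1 is the requested alignment column.
        $residue_for{ $seq->display_id } = substr($seq->seq, $col - 1, 1);
    }
    return %residue_for;
}

# Hypothetical usage with the variables from the script above:
#   my %speciesAA = residues_at_column($alignment, $pos);
```

Here $pos would be the column already obtained from
column_from_residue_number(), and the hash is keyed by sequence ID rather
than by species.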


-- 
<<->>
Peter H. Baenziger


From akarger at CGR.Harvard.edu  Tue Oct 17 12:53:19 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Tue, 17 Oct 2006 12:53:19 -0400
Subject: [Bioperl-l] split location problems
Message-ID: <B9182BFF5B004245BABC12956EA6322E018E6735@huls5.nucleus.harvard.edu>

> From: Jason Stajich [mailto:jason.stajich at gmail.com]
> 
> The whole point of split locations is to represent genes with introns
> so that is not the "rare" case.

Absolutely.

> I have processed the genbank fungal genomes into GFF3 and have had no
> problems so I'm confused where you are breaking down.  If I write
> them out as embl I also get the correct thing.  This is using the CVS
> version of bioperl from the HEAD.
>
> I've added code to test this to bug 2101 including a C.glabrata
> chromosome downloaded from genbank.  Perhaps the problem is on the
> EMBL parsing side, I didn't test that.

Well, I don't know whether it's EMBL parsing, or a bit further down the
pipe, but I downloaded C.glabrata chromosome B from GenBank (NC_005968),
and it describes the complement/joins in the way that Bioperl is
handling correctly.

GenBank:
     CDS             complement(join(10347..10372,10632..11157))
                     /locus_tag="CAGL0B00242g"

EMBL:
FT   CDS             join(complement(10632..11157),complement(10347..10372))
FT                   /locus_tag="CAGL0B00242g"

Here's the diff when I run the location-printing script I posted
yesterday:

diff biogb bio
1c1,5
< complement(join(10347..10372,10632..11157))
---
> complement(1701..2651)
> complement(2635..3345)
> complement(3980..4408)
> complement(join(10632..11157,10347..10372))
> 10379..10615
209a214,217
> 498198..498890
> 499712..500062
> 499851..500702
> 500579..501364

As you can see, the complement/join CDS is written out in a different
order, which is Bad.
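
To make the ordering issue concrete, here is a small stand-alone sketch
(plain Perl, not Bioperl's location parser) that expands a location
string into its exons in reading order. It assumes the usual feature
table reading, where complement() of a join flips the strand and reverses
the order of the pieces, and it only handles N..M, join() and
complement(). The function names are made up for illustration.

```perl
use strict;
use warnings;

# Split a comma-separated list at top level only (not inside parentheses).
sub split_top_level {
    my ($s) = @_;
    my @parts;
    my $depth = 0;
    my $cur   = '';
    for my $ch (split //, $s) {
        $depth++ if $ch eq '(';
        $depth-- if $ch eq ')';
        if ($ch eq ',' && $depth == 0) {
            push @parts, $cur;
            $cur = '';
        }
        else {
            $cur .= $ch;
        }
    }
    push @parts, $cur;
    return @parts;
}

# Expand a location into [start, end, strand] triples in reading order.
sub exons_in_order {
    my ($loc) = @_;
    $loc =~ s/\s+//g;
    if ($loc =~ /^complement\((.*)\)$/) {
        # Complementing a whole region reverses the order and the strand.
        return reverse map { [ $_->[0], $_->[1], -$_->[2] ] } exons_in_order($1);
    }
    if ($loc =~ /^join\((.*)\)$/) {
        return map { exons_in_order($_) } split_top_level($1);
    }
    if ($loc =~ /^(\d+)\.\.(\d+)$/) {
        return [ $1, $2, 1 ];
    }
    die "unsupported location: $loc\n";
}

# Canonical string for comparing two locations.
sub location_key {
    my ($loc) = @_;
    return join ',', map {
        sprintf '%d..%d(%s)', $_->[0], $_->[1], $_->[2] < 0 ? '-' : '+'
    } exons_in_order($loc);
}

print location_key($_), "\n" for (
    'complement(join(10347..10372,10632..11157))',              # GenBank file
    'join(complement(10632..11157),complement(10347..10372))',  # EMBL file
    'complement(join(10632..11157,10347..10372))',              # written from EMBL
);
```

The first two strings expand to the same key, while the third (what gets
written after reading the EMBL file) has the two exons swapped.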

(I looked at at least one of the other differences: the GB file says
it's a "misc feature" and EMBL says it's a CDS. But they don't seem to
be relevant here.)

-Amir

> 
> On the technical side, I still am not sure I fully know where the
> strand information should be stored - the top level container or the
> sub-features.  I'll try and stay up on the discussion if anything has
> been decided that I should know about.
> 
> -jason
> 
> 
> 
> 


From paul.boutros at utoronto.ca  Tue Oct 17 12:57:19 2006
From: paul.boutros at utoronto.ca (Paul Boutros)
Date: Tue, 17 Oct 2006 12:57:19 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
Message-ID: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>

Hi,
Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed  
tests, the first seems to be just a result of me not having DBD::mysql  
installed.
Paul

Test Summary
============

Failed Test               Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/BioDBSeqFeature_mysql.t               46   46  1-46
t/SearchIO.t                22  5632  1337 2671  2-1337
2 tests and 106 subtests skipped.
Failed 2/236 test scripts. 1382/11688 subtests failed.
Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =  
159.61 CPU)

BioDBSeqFeature_mysql
=====================
pcboutro@ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
1..46
install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC  
contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t  
/db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8  
/db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi  
/db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at  
(eval 37) line 3.
Perhaps the DBD::mysql perl module hasn't been fully installed,
or perhaps the capitalisation of 'mysql' isn't right.
Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
  at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208

SearchIO
========
pcboutro@ccb690[670] >> perl -w t/SearchIO.t | more
1..1337
ok 1

-------------------- WARNING ---------------------
MSG: XML::SAX::Expat not currently supported; must have local copies  
of NCBI DTD docs!
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: error in parsing a report:

404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'  
does not exist  
file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
Handler couldn't resolve external entity at line 2, column 82, byte 104
error in processing external entity reference at line 2, column 82,  
byte 104 at  
/db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line  
187

---------------------------------------------------
not ok 2
# Failed test 2 in t/SearchIO.t at line 68
Can't call method "database_name" on an undefined value at  
t/SearchIO.t line 69.

------------------------------

Message: 10
Date: Tue, 17 Oct 2006 11:32:54 +0100
From: Sendu Bala <bix at sendu.me.uk>
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
To: bioperl-l at bioperl.org
Message-ID: <4534B156.4090501 at sendu.me.uk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
See http://www.bioperl.org/wiki/Release_1.5.2 for
instructions on getting and testing this RC.

Developers:
    This should be the last RC before release ~next monday. Now would
    be a good time for last minute documentation updates and additions.

Users:
    Even though 1.5.2 is a 'developer' release, we consider it the most
    stable and capable version of Bioperl, and recommend that you use
    it in all but the most critical production environments. Please
    try it out and let us know of any problems or difficulties you run
    into.


Thank you,
Sendu.


From barry.moore at genetics.utah.edu  Tue Oct 17 12:57:48 2006
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 17 Oct 2006 10:57:48 -0600
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
Message-ID: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>

lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix

does a reasonable job of textifying html.  You get the links as  
numbered references at the bottom or:

lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |  
perl -ane 's/\[?\[\d+\](edit\])?//g;print'

to remove the links altogether.
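
The same substitution can also be kept as a small script. This is only a sketch: the sub name strip_link_markers and the sample lines are made up for illustration, and the pattern is the one from the one-liner above, so it assumes lynx renders links as [N] and wiki edit links as [[N]edit].

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Remove lynx link markers such as "[12]" and wiki edit links such as
# "[[13]edit]" from one line of dumped text.
sub strip_link_markers {
    my ($line) = @_;
    $line =~ s/\[?\[\d+\](?:edit\])?//g;
    return $line;
}

# Made-up sample lines, shaped like `lynx -dump` output of a wiki page.
my @sample = (
    "[[1]edit] BIOPERL INSTALLATION\n",
    "See the [2]PLATFORMS file for more details.\n",
);

print strip_link_markers($_) for @sample;
```

As a side note, the -a switch is not needed for this substitution; perl -pe with the same s/// does the same job as -ane with an explicit print.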

Barry

P.S.  Looks like this:

    #Creative Commons copyright

Installing Bioperl for Unix

 From BioPerl

    Jump to: navigation, search

Contents

      * 1 BIOPERL INSTALLATION
      * 2 SYSTEM REQUIREMENTS
      * 3 OPTIONAL
      * 4 ADDITIONAL INSTALLATION INFORMATION
      * 5 THE BIOPERL BUNDLE
      * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN
      * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make'
      * 8 WHERE ARE THE MAN PAGES?
      * 9 EXTERNAL PROGRAMS
           + 9.1 Environment Variables
      * 10 INSTALLING BIOPERL SCRIPTS
      * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA
      * 12 INSTALLING BIOPERL MODULES THE HARD WAY
      * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION
      * 14 THE TEST SYSTEM
      * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE
           + 15.1 CONFIGURING for BSD and Solaris boxes
           + 15.2 INSTALLATION
      * 16 DEPENDENCIES AND Bundle::BioPerl


BIOPERL INSTALLATION

    Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP,
    and on Mac OS X (see the PLATFORMS file for more details). Following
    are instructions for installing Bioperl for Unix/Linux/Mac OS X;
    Windows installation instructions can be found here. For installing
    Bioperl for Mac OS X using Fink, see Getting BioPerl.


SYSTEM REQUIREMENTS

      * Perl 5.005 or later; version 5.6 and greater are recommended.
        Note that most modules will work with earlier versions of Perl.
        The only ones that will not are Bio::SimpleAlign and the
        Bio::Index::* modules. If you don't need these modules and you
        want to install Bioperl using an earlier version of Perl, edit
        the "require 5.005;" line in Makefile.PL as necessary.

      * External modules: Bioperl uses functionality provided in other
        Perl modules. Some of these are included in the standard perl
        package but some need to be obtained from the CPAN site. The
        list of external modules is included at the bottom of this
        document.

    The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of
    these external modules easy. Simply install the bundle using your
    CPAN shell and all necessary modules will be installed. See THE
    BIOPERL BUNDLE, below.


OPTIONAL

      * ANSI C or GNU C compiler (gcc) for XS extensions (the
        bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext
        PACKAGE, below).


ADDITIONAL INSTALLATION INFORMATION

      * Additional information on Bioperl and Mac OS:
           + OS 9 - http://bioperl.org/Core/mac-bioperl.html
           + OS X - http://www.tc.umn.edu/~cann0010/Bioperl_OSX_install.html
           + OS X - Installing using Fink (in Getting BioPerl)


THE BIOPERL BUNDLE

    You typically need root privileges to install using CPAN. If you
    don't have these privileges please see INSTALLING BIOPERL IN A
    PERSONAL MODULE AREA for additional information.

    Install Bundle::BioPerl using CPAN. One way:
 >perl -MCPAN -e "install Bundle::BioPerl"

    Another way:
 >perl -MCPAN -e shell
cpan>install Bundle::BioPerl


On Oct 17, 2006, at 8:04 AM, Chris Fields wrote:

> There isn't a very easy way since so many links have to be
> removed/modified. I have found a few CPAN modules that could help,
> but for now I just dump the text output from a text browser (elinks)
> using the 'printable version' page and hand-edit, which works very
> quickly.  That works for the time being until I can find another,
> more automated solution.
>
> Fortunately there have been very few edits to either INSTALL wiki
> page so they should remain relatively stable.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>> -----Original Message-----
>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
>> Sent: Tuesday, October 17, 2006 6:46 AM
>> To: Chris Fields
>> Cc: bioperl-l
>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>>
>> Chris Fields wrote:
>>> The general consensus was to keep text versions available; we could
>>> add URL links to the wiki pages for the most up-to-date version.
>>> BTW, I have modified INSTALL already.  INSTALL.WIN is next in line
>>> (I was waiting for your changes).
>>>
>> Is it possible to generate these files from the wiki whenever there
>> is a release? I know edits shouldn't be too severe or too often - but
>> I can see things getting a little messy/annoying if edits have to be
>> made in 2 places.
>>
>> Nath
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From niels at genomics.dk  Tue Oct 17 12:58:14 2006
From: niels at genomics.dk (Niels Larsen)
Date: Tue, 17 Oct 2006 18:58:14 +0200
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534E207.8030508@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk>
	<4534E207.8030508@sendu.me.uk>
Message-ID: <45350BA6.3040102@genomics.dk>

OK, here are ways to reproduce the problems; I apologize if I got the
test scripts wrong. And I suppose the EBI/DDBJ interfaces are not
really a Bioperl issue.

Niels

------------ EBI

I invoked the EBI script

http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip

like this

WSWUBlastClient.pl -p blastn -D embl test.fasta

where the content of test.fasta is below, and got

Can't find method element in the message at 
/ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

 >Planctomyces sp. 282; Genbank Taxonomy ID: 79927
AATGAACGTTGGCGGCATGGATTAGGCATGCAAGTCGAGGGAGAACCCGCAAGGGGACACCGGCG
AACGGGGTAGGAATACATAGGTAACGTACCCTCAGGACGGGGATAGCCAAGGGAAACTTTGGGTA
ATACCCGATGTGATGGCAAGATGTGAATGCTTGTCATCAAAGGTGAGATTCCACCTGAGGAGCGG
CTTATGCATCATTAGCTTGTTGGCGGGGTAACGGCCCACCAAGGCTGCGATGATTAGGGGGTGTG
AGAGCATGGCCCCCACCACTGGCACTGAGACACTGGCCAGACACCTACGGGTGGCTGCAGTCGAG

I tried with this test sequence in fasta format and with just the
sequence.

------------ DDBJ

Inspired by this page,

http://xml.nig.ac.jp/doc/Blast.txt

I made this test script

------ cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

my ( $service, $seqstr, $result );

use SOAP::Lite;
use Data::Dumper;

$service = SOAP::Lite->service('http://xml.nig.ac.jp/wsdl/Blast.wsdl');

$seqstr = "MSSRIARALALVVTLLHLTRLALSTCPAACHCPLEAPKCAPGVGLVRDGCGCCKVCAKQL";

$result = $service->searchSimple( "blastp", "SWISS", $seqstr );

print Dumper( $result );
------ cut --

which for me prints undef.

------------- NCBI/Bioperl

I installed 1.5.2-RC2, looked at the RemoteBlast example in

http://www.bioperl.org/wiki/Bptutorial.pl

and then put that into this test code, more or less cut/paste,

--- cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

use Bio::Tools::Run::RemoteBlast;
use Data::Dumper;

my ( $remote_blast, $r, $rc, $rid, @rids );

$remote_blast = Bio::Tools::Run::RemoteBlast->new (
                 -prog => 'blastn', -data => 'ecoli', -expect => '1e-10' );

$r = $remote_blast->submit_blast("ecoli.fasta");

while ( @rids = $remote_blast->each_rid )
{
#    print Dumper( \@rids );

     for $rid ( @rids ) {
         $rc = $remote_blast->retrieve_blast($rid);
#        print Dumper( $rc );
     }

     sleep 10;
}
--- cut --

which saves the same BLAST report to TMPDIR every 10 seconds.
The "ecoli.fasta" file contains this

 >test
gggggctctgttggttctcccgcaacgctactctgtttaccaggtcaggtccggaaggaa
gcagccaaggcagatgacgcgtgtgccgggatgtagctggcagggcccccaccc

Maybe I am supposed to add a check for content in $rc and then stop
the inner loop? I could figure that out maybe, but I wish there was a
function which simply takes a single sequence + arguments and only
returns a list of matches when done, and does not return until then
(or until a specified timeout).
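
A sketch of the kind of blocking helper wished for above. It assumes the documented behaviour of retrieve_blast (a report object when the search is done, 0 while it is still running, a negative value on error) and uses remove_rid to drop a finished RID so that it is not fetched again. wait_for_blast and its timeout and interval options are hypothetical names, not part of Bioperl. The helper only calls each_rid, retrieve_blast and remove_rid, so it works with any object that provides them.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): poll until every submitted
# request has finished or failed, and die if the timeout passes first.
sub wait_for_blast {
    my ($factory, %opt) = @_;
    my $timeout  = defined $opt{timeout}  ? $opt{timeout}  : 600;  # seconds
    my $interval = defined $opt{interval} ? $opt{interval} : 10;   # seconds
    my $deadline = time() + $timeout;
    my @reports;

    while ( my @rids = $factory->each_rid ) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( ref $rc ) {
                # Finished: keep the report and stop asking for this RID.
                push @reports, $rc;
                $factory->remove_rid($rid);
            }
            elsif ( $rc < 0 ) {
                # The server reported an error; give up on this RID.
                warn "retrieval failed for RID $rid\n";
                $factory->remove_rid($rid);
            }
            # Otherwise $rc is 0: still running, ask again next round.
        }

        my @left = $factory->each_rid;
        last unless @left;
        die "timed out waiting for BLAST results\n" if time() >= $deadline;
        sleep $interval;
    }
    return @reports;
}

# Intended use with the script above (not run here, needs network access):
#   $remote_blast->submit_blast("ecoli.fasta");
#   my @reports = wait_for_blast( $remote_blast, timeout => 300 );
```

Dying on timeout is just one choice; returning whatever has been collected so far would be equally reasonable.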


------------------------------------------------------------------------

Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark

Telephone: +45-8942-5268
Telefax: +45-8620-1222

------------------------------------------------------------------------


From bertrand.beckert at gmail.com  Tue Oct 17 10:52:36 2006
From: bertrand.beckert at gmail.com (bertrand beckert)
Date: Tue, 17 Oct 2006 16:52:36 +0200
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
Message-ID: <500217090610170752q565cfc08t5208e3b64f99ef7f@mail.gmail.com>

hi,

I am running a large number of BLAST searches through the NCBI BLAST
page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi').
I am trying to use 'Bio::Tools::Run::RemoteBlast', but unfortunately I
have some problems. I made a simple example with only one sequence in
order to understand how this module works. This is my simple input
file, a DNA sequence in FASTA format:

>test
TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA

I made some modifications to the example available in the Bioperl
documentation. It gives me a RID which contains the results of my
blast, but I have a problem with
"$result=$factory->retrieve_blast($rid)" in my script.
The documentation says that, when it works,
$result=$factory->retrieve_blast($rid) returns a Bio::Tools::BPlite or
Bio::Tools::Blast object. In my case it returns a
Bio::SearchIO::blast... I don't understand why I don't get the
expected type of object (see PART I).

I also tried to resolve the problem by replacing the foreach loop in
my script with a new one in order to explore the BLAST result page,
but that does not work either (see PART II).

Could you help me, please? Thank you.

Bertrand Beckert.

PART I:

Here is my script, lightly annotated, followed by the shell output:
------------------------------------------------------------------------
#!/usr/bin/perl -w
use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;
sub blast {
my $prog='blastn';
my $db='refseq_genomic';
my $e_val='1e-10';
my $Input='Seq.fasta';
my @params = ('-prog' =>  $prog, '-data' =>  $db, '-expect' =>
$e_val, '-readmethod' => 'SearchIO');
my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
#changes parameters
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
$Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';
$factory->submit_blast($Input);
print STDERR "waiting...\n";
while (my @rids=$factory->each_rid) {
    print "my rid: ",@rids,"\n";
    # prints the ID of the submitted blast, i.e. RID:
    # 1161079157-766-185099855365.BLASTQ2
    # this page contains the result of my blast...
    foreach my $rid (@rids) {
        $result=$factory->retrieve_blast($rid);
        # line added to see what type of object is returned by
        # retrieve_blast
        print "rc:", $result,"\n";
    }
}
}

&blast;
------------------------------------------------------------------------

Here is the shell output:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...

PART II:

I also tried to resolve the problem by replacing the foreach loop in my
script with:
------------------------------------------------------------------------

foreach my $rid (@rids) {
    while (1) {
        $result=$factory->retrieve_blast($rid)->next_result();
        print "rc:", $result,"\n";
        if ($result) {
            print $result->num_hits(),"\n";
        }
    }
}
------------------------------------------------------------------------

With this loop I can explore the BLAST result page. This is what I
get in the shell window:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834)
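
For what it is worth, with '-readmethod' => 'SearchIO' the object printed as Bio::SearchIO::blast above is a parser handle rather than a result; results and hits are pulled out of it with next_result and next_hit. A minimal sketch, assuming $report is such an object. summarize_report is a made-up name, and query_name, name and significance are the usual Bio::Search result and hit accessors.

```perl
use strict;
use warnings;

# Walk a SearchIO-style report and return one tab-separated summary
# line per hit: query name, hit name, hit significance.
# $report only needs to provide next_result(); each result needs
# query_name() and next_hit(); each hit needs name() and significance().
sub summarize_report {
    my ($report) = @_;
    my @lines;
    while ( my $result = $report->next_result ) {
        while ( my $hit = $result->next_hit ) {
            push @lines,
                join "\t", $result->query_name, $hit->name, $hit->significance;
        }
    }
    return @lines;
}

# Intended use inside the polling loop (not run here, needs network access):
#   my $rc = $factory->retrieve_blast($rid);
#   if ( ref $rc ) {
#       print "$_\n" for summarize_report($rc);
#       $factory->remove_rid($rid);   # otherwise the same RID is fetched again
#   }
```

Calling remove_rid once a report has been read is what lets the outer while loop over each_rid come to an end.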


----
-- 
Bertrand BECKERT
PhD student
IBMC - UPR 9002 du CNRS - ARN
15, rue Rene Descartes
F-67084 STRASBOURG Cedex
b.beckert at ibmc.u-strasbg.fr
bertrand.beckert at gmail.com


From cjfields at uiuc.edu  Tue Oct 17 13:50:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 12:50:49 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>
Message-ID: <001201c6f214$c8934440$15327e82@pyrimidine>

(Apologies for the top post, but I thought my response might get lost below)

I use elinks in a similar fashion.  It tends to format the tables a bit
better than lynx.

Chris

> -----Original Message-----
> From: Barry Moore [mailto:barry.moore at genetics.utah.edu]
> Sent: Tuesday, October 17, 2006 11:58 AM
> To: Chris Fields
> Cc: 'Nathan S. Haigh'; 'bioperl-l'
> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
> 
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix
> 
> does a reasonable job of textifying html.  You get the links as
> numbered references at the bottom or:
> 
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |
> perl -ane 's/\[?\[\d+\](edit\])?//g;print'
> 
> to remove the links all together.
> 
> Barry
> 


From cjfields at uiuc.edu  Tue Oct 17 13:52:36 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 12:52:36 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
Message-ID: <001301c6f215$07a9a070$15327e82@pyrimidine>

What do you get when you run the SearchIO.t test by itself using
'perl -I. t/SearchIO.t'?  It looks like something pretty catastrophic
happened.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
> Sent: Tuesday, October 17, 2006 11:57 AM
> To: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
> 
> Hi,
> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
> tests, the first seems to be just a result of me not having DBD::mysql
> installed.
> Paul
> 
> Test Summary
> ============
> 
> Failed Test               Stat Wstat Total Fail  List of Failed
> -------------------------------------------------------------------------------
> t/BioDBSeqFeature_mysql.t               46   46  1-46
> t/SearchIO.t                22  5632  1337 2671  2-1337
> 2 tests and 106 subtests skipped.
> Failed 2/236 test scripts. 1382/11688 subtests failed.
> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =
> 159.61 CPU)
> 
> BioDBSeqFeature_mysql
> =====================
> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
> 1..46
> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC
> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t
> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8
> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi
> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at
> (eval 37) line 3.
> Perhaps the DBD::mysql perl module hasn't been fully installed,
> or perhaps the capitalisation of 'mysql' isn't right.
> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
>   at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
> 
> SearchIO
> ========
> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
> 1..1337
> ok 1
> 
> -------------------- WARNING ---------------------
> MSG: XML::SAX::Expat not currently supported; must have local copies
> of NCBI DTD docs!
> ---------------------------------------------------
> 
> -------------------- WARNING ---------------------
> MSG: error in parsing a report:
> 
> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
> does not exist
> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
> Handler couldn't resolve external entity at line 2, column 82, byte 104
> error in processing external entity reference at line 2, column 82,
> byte 104 at
> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
> 187
> 
> ---------------------------------------------------
> not ok 2
> # Failed test 2 in t/SearchIO.t at line 68
> Can't call method "database_name" on an undefined value at
> t/SearchIO.t line 69.
> 


From paul.boutros at utoronto.ca  Tue Oct 17 13:59:33 2006
From: paul.boutros at utoronto.ca (Paul Boutros)
Date: Tue, 17 Oct 2006 13:59:33 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
Message-ID: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca>

Hi Chris,

Here it is:
pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
1..1337
ok 1

-------------------- WARNING ---------------------
MSG: XML::SAX::Expat not currently supported; must have local copies  
of NCBI DTD docs!
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: error in parsing a report:

404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'  
does not exist  
file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
Handler couldn't resolve external entity at line 2, column 82, byte 104
error in processing external entity reference at line 2, column 82,  
byte 104 at  
/db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line  
187

---------------------------------------------------
not ok 2
# Failed test 2 in t/SearchIO.t at line 68
Can't call method "database_name" on an undefined value at  
t/SearchIO.t line 69.
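
Judging from the output above, the script dies at line 69 because it calls database_name on an undefined result, which is presumably why every remaining test is counted as failed. One possible way for a test script to survive that is Test::More's SKIP block, sketched below. Fake::Result, check_result and the expected values are hypothetical and only stand in for real parser output.

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical stand-in for a parsed result; a real test would get this
# object from the parser under test.
{
    package Fake::Result;
    sub new           { return bless {}, shift }
    sub database_name { return 'ecoli.nt' }
    sub num_hits      { return 3 }
}

# Run the checks that need a result object, or skip them all if the
# parser handed back nothing, so the script keeps running either way.
sub check_result {
    my ($result, $label) = @_;
  SKIP: {
        skip "$label: no result object", 2 unless defined $result;
        is( $result->database_name, 'ecoli.nt', "$label: database name" );
        cmp_ok( $result->num_hits, '>', 0, "$label: has hits" );
    }
    return;
}

check_result( undef,             'unparsable report' );   # both checks skipped
check_result( Fake::Result->new, 'parsable report' );     # both checks run
```

A real test would probably also record the failed parse itself with a single ok(), so the problem still shows up in the summary as one failure rather than more than a thousand.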


Quoting Chris Fields <cjfields at uiuc.edu>:

> What do you get when you run the SearchIO.t test by itself using
> 'perl -I. t/SearchIO.t'?  It looks like something pretty catastrophic
> happened.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>


From barry.moore at genetics.utah.edu  Tue Oct 17 14:07:12 2006
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 17 Oct 2006 12:07:12 -0600
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <C15A8DE6.AD40%bosborne11@verizon.net>
References: <C15A8DE6.AD40%bosborne11@verizon.net>
Message-ID: <588DE26B-8F18-4540-BAEE-2B479CBDE8B3@genetics.utah.edu>

In fact, I think it was you who taught me that trick in the first place.

B

On Oct 17, 2006, at 11:40 AM, Brian Osborne wrote:

> Barry,
>
> I second that. lynx does the best job of converting HTML to text  
> I've seen.
>
> Brian O.
>
>
> On 10/17/06 12:57 PM, "Barry Moore" <barry.moore at genetics.utah.edu>  
> wrote:
>
>> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix
>>
>> does a reasonable job of textifying html.  You get the links as
>> numbered references at the bottom or:
>>
>> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |
>> perl -ane 's/\[?\[\d+\](edit\])?//g;print'
>>
>> to remove the links all together.
>>
>> Barry
>>
>> P.S.  Looks like this:
>>
>>     #Creative Commons copyright
>>
>> Installing Bioperl for Unix
>>
>>  From BioPerl
>>
>>     Jump to: navigation, search
>>
>> Contents
>>
>>       * 1 BIOPERL INSTALLATION
>>       * 2 SYSTEM REQUIREMENTS
>>       * 3 OPTIONAL
>>       * 4 ADDITIONAL INSTALLATION INFORMATION
>>       * 5 THE BIOPERL BUNDLE
>>       * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN
>>       * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make'
>>       * 8 WHERE ARE THE MAN PAGES?
>>       * 9 EXTERNAL PROGRAMS
>>            + 9.1 Environment Variables
>>       * 10 INSTALLING BIOPERL SCRIPTS
>>       * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA
>>       * 12 INSTALLING BIOPERL MODULES THE HARD WAY
>>       * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION
>>       * 14 THE TEST SYSTEM
>>       * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE
>>            + 15.1 CONFIGURING for BSD and Solaris boxes
>>            + 15.2 INSTALLATION
>>          * 16 DEPENDENCIES AND Bundle::BioPerl
>>
>>
>> BIOPERL INSTALLATION
>>
>>     Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP,
>>     and on Mac OS X (see the PLATFORMS file for more details). Following
>>     are instructions for installing Bioperl for Unix/Linux/Mac OS X;
>>     Windows installation instructions can be found here. For installing
>>     Bioperl for Mac OS X using Fink, see Getting BioPerl.
>>
>>
>> SYSTEM REQUIREMENTS
>>
>>       * Perl 5.005 or later; version 5.6 and greater are recommended.
>>         Note that most modules will work with earlier versions of Perl.
>>         The only ones that will not are Bio::SimpleAlign and the
>>         Bio::Index::* modules. If you don't need these modules and you
>>         want to install Bioperl using an earlier version of Perl, edit
>>         the "require 5.005;" line in Makefile.PL as necessary.
>>
>>       * External modules: Bioperl uses functionality provided in other
>>         Perl modules. Some of these are included in the standard perl
>>         package but some need to be obtained from the CPAN site. The
>>         list of external modules is included at the bottom of this
>>         document.
>>
>>     The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of
>>     these external modules easy. Simply install the bundle using your
>>     CPAN shell and all necessary modules will be installed. See THE
>>     BIOPERL BUNDLE, below.
>>
>>
>> OPTIONAL
>>
>>       * ANSI C or GNU C compiler (gcc) for XS extensions (the
>>         bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext
>>         PACKAGE, below).
>>
>>
>>
>> ADDITIONAL INSTALLATION INFORMATION
>>
>>       * Additional information on Bioperl and MAC OS:
>>            + OS 9 - http://bioperl.org/Core/mac-bioperl.html
>>            + OS X - http://www.tc.umn.edu/~cann0010/Bioperl_OSX_install.html
>>            + OS X - Installing using Fink (in Getting BioPerl)
>>
>>
>>
>> THE BIOPERL BUNDLE
>>
>>     You typically need root privileges to install using CPAN. If you
>>     don't have these privileges please see INSTALLING BIOPERL IN A
>>     PERSONAL MODULE AREA for additional information.
>>
>>     Install Bundle::BioPerl using CPAN. One way:
>>     >perl -MCPAN -e "install Bundle::BioPerl"
>>
>>     Another way:
>>     >perl -MCPAN -e shell
>>     cpan>install Bundle::BioPerl
>>
>>
>>
>> On Oct 17, 2006, at 8:04 AM, Chris Fields wrote:
>>
>>> There isn't a very easy way since so many links have to be removed/
>>> modified.
>>> I have found a few CPAN modules that could help, but for now I just
>>> dump the
>>> text output from a text browser (elinks) using the 'printable
>>> version' page
>>> and hand-edit, which works very quickly.  That works for the time
>>> being
>>> until I can find another more automated solution.
>>>
>>> Fortunately there have been very few edits to either INSTALL wiki
>>> page so
>>> they should remain relatively stable.
>>>
>>> Christopher Fields
>>> Postdoctoral Researcher - Switzer Lab
>>> Dept. of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>
>>>> -----Original Message-----
>>>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
>>>> Sent: Tuesday, October 17, 2006 6:46 AM
>>>> To: Chris Fields
>>>> Cc: bioperl-l
>>>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>>>>
>>>> Chris Fields wrote:
>>>>> The general consensus was to keep text versions available; we  
>>>>> could
>>>>> add URL links to the wiki pages for the most up-to-date version.
>>>>> BTW,
>>>>> I have modified INSTALL already.  INSTALL.WIN is next in line  
>>>>> (I was
>>>>> waiting for your changes).
>>>>>
>>>> Is it possible to generate these files from the wiki whenever
>>>> there is a
>>>> release? I know edits shouldn't be too severe or too often - but
>>>> I can
>>>> see things getting a little messy/annoying if edits have to be
>>>> made in 2
>>>> places.
>>>>
>>>> Nath
>>>
>


From bix at sendu.me.uk  Tue Oct 17 14:07:04 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 19:07:04 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
References: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
Message-ID: <45351BC8.9080507@sendu.me.uk>

Paul Boutros wrote:
> Hi,
> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed  
> tests, the first seems to be just a result of me not having DBD::mysql  
> installed.
[snip]

Thanks for those, very useful. Not something that's come up before 
afaik; I'll look into them.


From cjfields at uiuc.edu  Tue Oct 17 14:31:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 13:31:51 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca>
Message-ID: <001401c6f21a$836f9fc0$15327e82@pyrimidine>

Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
backend parser.  For some reason BLAST XML parsing doesn't work with that
parser (it tries to verify the XML first before parsing, hence the DTD
error).  I may try getting this to work again, but so far I haven't found an
easy way to prevent XML verification via XML::SAX::Expat.

There are two options: 1) install XML::SAX::ExpatXS (the better option),
which works AND is 4x faster than XML::SAX::Expat, or  2) set the default
parser in the ParserDetails.ini file of your local XML::SAX installation to
XML::SAX::PurePerl.

BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
hasn't officially happened yet); the latter hasn't had significant
development in about three years.
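
To see which backends XML::SAX knows about on a given machine, something
like the following may help.  It is only a sketch: it assumes XML::SAX is
installed (and falls back to an empty list if it is not), it relies on
XML::SAX->parsers as described in the XML::SAX docs, and
registered_sax_parsers is a made-up helper name.  As far as I know the
factory hands out the last parser registered in ParserDetails.ini when
nothing else is requested, so that is what is reported as the likely default.

```perl
use strict;
use warnings;

# Hypothetical helper: names of the SAX parsers registered in
# ParserDetails.ini, in file order; empty list if XML::SAX is missing.
sub registered_sax_parsers {
    return () unless eval { require XML::SAX; 1 };
    return map { $_->{Name} } @{ XML::SAX->parsers };
}

my @parsers = registered_sax_parsers();
if (@parsers) {
    print "Registered SAX parsers: @parsers\n";
    print "Likely default (last entry): $parsers[-1]\n";
}
else {
    print "XML::SAX not installed or no parsers registered\n";
}
```

If editing ParserDetails.ini is not an option, setting
$XML::SAX::ParserPackage = "XML::SAX::PurePerl" before the parser is created
should force that backend for a single script.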

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: Paul Boutros [mailto:paul.boutros at utoronto.ca]
> Sent: Tuesday, October 17, 2006 1:00 PM
> To: Chris Fields
> Cc: bioperl-l at lists.open-bio.org
> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2
> 
> Hi Chris,
> 
> Here it is:
> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
> 1..1337
> ok 1
> 
> -------------------- WARNING ---------------------
> MSG: XML::SAX::Expat not currently supported; must have local copies
> of NCBI DTD docs!
> ---------------------------------------------------
> 
> -------------------- WARNING ---------------------
> MSG: error in parsing a report:
> 
> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
> does not exist
> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
> Handler couldn't resolve external entity at line 2, column 82, byte 104
> error in processing external entity reference at line 2, column 82,
> byte 104 at
> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
> 187
> 
> ---------------------------------------------------
> not ok 2
> # Failed test 2 in t/SearchIO.t at line 68
> Can't call method "database_name" on an undefined value at
> t/SearchIO.t line 69.
> 
> 
> Quoting Chris Fields <cjfields at uiuc.edu>:
> 
> > What do you get when you run the SearchIO.t test by itself using 'perl -
> I.
> > t/SearchIO.t'?  It looks like something pretty catastrophic happened.
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
> >> Sent: Tuesday, October 17, 2006 11:57 AM
> >> To: bioperl-l at lists.open-bio.org
> >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
> >>
> >> Hi,
> >> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
> >> tests, the first seems to be just a result of me not having DBD::mysql
> >> installed.
> >> Paul
> >>
> >> Test Summary
> >> ============
> >>
> >> Failed Test               Stat Wstat Total Fail  List of Failed
> >> -----------------------------------------------------------------------
> ---
> >> -----
> >> t/BioDBSeqFeature_mysql.t               46   46  1-46
> >> t/SearchIO.t                22  5632  1337 2671  2-1337
> >> 2 tests and 106 subtests skipped.
> >> Failed 2/236 test scripts. 1382/11688 subtests failed.
> >> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =
> >> 159.61 CPU)
> >>
> >> BioDBSeqFeature_mysql
> >> =====================
> >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
> >> 1..46
> >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC
> >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t
> >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8
> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi
> >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at
> >> (eval 37) line 3.
> >> Perhaps the DBD::mysql perl module hasn't been fully installed,
> >> or perhaps the capitalisation of 'mysql' isn't right.
> >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
> >>   at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
> >>
> >> SearchIO
> >> ========
> >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
> >> 1..1337
> >> ok 1
> >>
> >> -------------------- WARNING ---------------------
> >> MSG: XML::SAX::Expat not currently supported; must have local copies
> >> of NCBI DTD docs!
> >> ---------------------------------------------------
> >>
> >> -------------------- WARNING ---------------------
> >> MSG: error in parsing a report:
> >>
> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
> >> does not exist
> >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
> >> Handler couldn't resolve external entity at line 2, column 82, byte 104
> >> error in processing external entity reference at line 2, column 82,
> >> byte 104 at
> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
> >> 187
> >>
> >> ---------------------------------------------------
> >> not ok 2
> >> # Failed test 2 in t/SearchIO.t at line 68
> >> Can't call method "database_name" on an undefined value at
> >> t/SearchIO.t line 69.
> >>
> >> ------------------------------
> >>
> >> Message: 10
> >> Date: Tue, 17 Oct 2006 11:32:54 +0100
> >> From: Sendu Bala <bix at sendu.me.uk>
> >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2
> >> To: bioperl-l at bioperl.org
> >> Message-ID: <4534B156.4090501 at sendu.me.uk>
> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> >>
> >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
> >> See http://www.bioperl.org/wiki/Release_1.5.2 for
> >> instructions on getting and testing this RC.
> >>
> >> Developers:
> >>     This should be the last RC before release ~next monday. Now would
> >>     be a good time for last minute documentation updates and additions.
> >>
> >> Users:
> >>     Even though 1.5.2 is a 'developer' release, we consider it the most
> >>     stable and capable version of Bioperl, and recommend that you use
> >>     it in all but the most critical production environments. Please
> >>     try it out and let us know of any problems or difficulties you run
> >>     into.
> >>
> >>
> >> Thank you,
> >> Sendu.
> >>
> >>
> >>
> >
> >
> 


From cjfields at uiuc.edu  Tue Oct 17 15:05:59 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 14:05:59 -0500
Subject: [Bioperl-l] split location problems
In-Reply-To: <B9182BFF5B004245BABC12956EA6322E018E6735@huls5.nucleus.harvard.edu>
Message-ID: <001b01c6f21f$48640420$15327e82@pyrimidine>

> > From: Jason Stajich [mailto:jason.stajich at gmail.com]
> >
> > The whole point of split locations is to represent genes with
> > introns
> > so that is not the "rare" case.
> 
> Absolutely.

Right, but that specific kind of join statement is not commonly used  in
GenBank files, which seems to be the format predominately used (no offense
to EBI).  This may explain why we haven't seen this pop up more often.  

I believe what we're seeing is a difference in the way these locations are
described at NCBI vs EBI, which Nadeem Faruque seems to corroborate.  He
indicated that EBI may move to using similar GenBank-like location strings.
Regardless, FTLocationFactory and Bio::Location::Split should handle both if
they are present, but they only seem to like the GenBank version.

> > I've added code to test this to bug 2101 including a C.glabrata
> > chromosome downloaded from GenBank.  Perhaps the problem is on the
> > EMBL parsing side, I didn't test that.
> 
> Well, I don't know whether it's EMBL parsing, or a bit further down the
> pipe, but I downloaded C.glabrata chromosome B for GenBank (NC_005968),
> and it describes the complement/joins in the way that Bioperl is
> handling correctly.
> 
> GenBank:
>      CDS             complement(join(10347..10372,10632..11157))
>                      /locus_tag="CAGL0B00242g"
> 
> EMBL:
> FT   CDS
> join(complement(10632..11157),complement(10347..10372))
> FT                   /locus_tag="CAGL0B00242g"

Yes, something that I found out independently (and corroborated by Nadeem).

> Here's the diff when I run the location-printing script I posted
> yesterday:
> 
> diff biogb bio
> 1c1,5
> < complement(join(10347..10372,10632..11157))
> ---
> > complement(1701..2651)
> > complement(2635..3345)
> > complement(3980..4408)
> > complement(join(10632..11157,10347..10372))
> > 10379..10615
> 209a214,217
> > 498198..498890
> > 499712..500062
> > 499851..500702
> > 500579..501364
> 
> As you can see, the complement/join CDS is written out in a different
> order, which is Bad.

I think this can be handled directly in to_FTstring().  I'll have to add a
method to get the strand info from the Split object w/o going through
strand().  
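
To make the equivalence between the two styles concrete, here is a rough
sketch in plain Perl.  It is not BioPerl code and not how to_FTstring()
works; embl_to_genbank_style is a hypothetical helper that only handles the
simple case quoted above (a join of complemented start..end ranges) and
returns anything else unchanged.  The point is that the sub-locations have
to be reversed when the complement is moved outside the join.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): rewrite
#   join(complement(A),complement(B),...)
# as
#   complement(join(...,B,A))
# Anything that does not match this simple shape is returned as-is.
sub embl_to_genbank_style {
    my ($loc) = @_;
    return $loc unless $loc =~ /^join\((.+)\)$/;
    my @ranges;
    for my $part ( split /,/, $1 ) {
        # every sub-location must be a complemented plain range
        return $loc unless $part =~ /^complement\((\d+\.\.\d+)\)$/;
        push @ranges, $1;
    }
    # moving the complement outside the join reverses the order
    return 'complement(join(' . join( ',', reverse @ranges ) . '))';
}

print embl_to_genbank_style(
    'join(complement(10632..11157),complement(10347..10372))'), "\n";
```

Run on the EMBL string from Amir's example, this should print the GenBank
form, complement(join(10347..10372,10632..11157)).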

However, I'm thinking about trying a different tack which is a bit simpler
and, if it proves fruitful, may simplify Split locations somewhat.  It won't
be ready for 1.5.2 but maybe the next release.

> (I looked at at least one of the other differences: the GB file says
> it's a "misc feature" and EMBL says it's a CDS. But they don't seem to
> be relevant here.)
> -Amir

Probably not but something to keep in mind.
 
-c

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From er at xs4all.nl  Tue Oct 17 15:01:48 2006
From: er at xs4all.nl (Erikjan)
Date: Tue, 17 Oct 2006 21:01:48 +0200 (CEST)
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
Message-ID: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>

Hello,

I noticed a little problem with the Annotation "DBLink" from GenBank entries

When I run:

perl -MBio::DB::GenBank -e 'my $gi =
56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
$db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
$ac=$seq->annotation(); my @annotations = $ac->get_Annotations("dblink");
for(@annotations) { print $_, "\n";} print $INC{
"Bio/Annotation/DBLink.pm" }, "\n"; '

This yields:

   GenBank:AL591065.17.17

and the place where the used Bio/Annotation/DBLink.pm resides.

Can others repeat this?

I have dug into the source a little and Bio::Annotation::DBLink seems to
be the place where this happens: it has a concatenation which leads to
that repeated version number.

Is this something that I should fix "client-side", so to speak, or is it
worthwhile to add some logic to that concatenation to prevent this?

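
The kind of guard I have in mind would look roughly like this.  It is a
sketch only, not the actual Bio::Annotation::DBLink code: dblink_string and
its three arguments are hypothetical, and it assumes the problem is that the
primary id already carries the ".17" suffix when the version is appended.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the concatenation: append ".version" to the
# id only when a version is given and the id does not already end in it.
sub dblink_string {
    my ( $database, $primary_id, $version ) = @_;
    my $id = $primary_id;
    if ( defined $version && length $version ) {
        $id .= ".$version" unless $id =~ /\.\Q$version\E$/;
    }
    return "$database:$id";
}

print dblink_string( 'GenBank', 'AL591065.17', 17 ), "\n";
print dblink_string( 'GenBank', 'AL591065',    17 ), "\n";
```

Both calls should give GenBank:AL591065.17 rather than the doubled form.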

Thanks,

Eric


From bosborne11 at verizon.net  Tue Oct 17 13:40:54 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 17 Oct 2006 13:40:54 -0400
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>
Message-ID: <C15A8DE6.AD40%bosborne11@verizon.net>

Barry,

I second that. lynx does the best job of converting HTML to text I've seen.
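
For anyone who wants to try the link-stripping substitution from your
one-liner (quoted below) without fetching the page, here is a small sketch.
The regex is the one from your mail; strip_lynx_links is just a made-up
wrapper, and the two sample lines are invented in the style of lynx -dump
output rather than copied from the real page.

```perl
use strict;
use warnings;

# Same substitution as in the quoted one-liner: removes "[12]" link
# markers and "[[12]edit]" section-edit links.
sub strip_lynx_links {
    my ($line) = @_;
    $line =~ s/\[?\[\d+\](edit\])?//g;
    return $line;
}

# Invented sample lines, not taken from the actual wiki dump.
print strip_lynx_links("[[3]edit] SYSTEM REQUIREMENTS\n");
print strip_lynx_links("see the [12]PLATFORMS file for more details\n");
```

Note that the space after an edit link is left behind, so headings come out
with one leading blank.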

Brian O.


On 10/17/06 12:57 PM, "Barry Moore" <barry.moore at genetics.utah.edu> wrote:

> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix
> 
> does a reasonable job of textifying html.  You get the links as
> numbered references at the bottom or:
> 
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |
> perl -ane 's/\[?\[\d+\](edit\])?//g;print'
> 
> to remove the links altogether.
> 
> Barry
> 
> P.S.  Looks like this:
> 
>     #Creative Commons copyright
> 
> Installing Bioperl for Unix
> 
>  From BioPerl
> 
>     Jump to: navigation, search
> 
> Contents
> 
>       * 1 BIOPERL INSTALLATION
>       * 2 SYSTEM REQUIREMENTS
>       * 3 OPTIONAL
>       * 4 ADDITIONAL INSTALLATION INFORMATION
>       * 5 THE BIOPERL BUNDLE
>       * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN
>       * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make'
>       * 8 WHERE ARE THE MAN PAGES?
>       * 9 EXTERNAL PROGRAMS
>            + 9.1 Environment Variables
>       * 10 INSTALLING BIOPERL SCRIPTS
>       * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA
>       * 12 INSTALLING BIOPERL MODULES THE HARD WAY
>       * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION
>       * 14 THE TEST SYSTEM
>       * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE
>            + 15.1 CONFIGURING for BSD and Solaris boxes
>            + 15.2 INSTALLATION
>          * 16 DEPENDENCIES AND Bundle::BioPerl
> 
> 
> BIOPERL INSTALLATION
> 
>     Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP,
>     and on Mac OS X (see the PLATFORMS file for more details). Following
>     are instructions for installing Bioperl for Unix/Linux/Mac OS X;
>     Windows installation instructions can be found here. For installing
>     Bioperl for Mac OS X using Fink, see Getting BioPerl.
> 
> 
> SYSTEM REQUIREMENTS
> 
>       * Perl 5.005 or later; version 5.6 and greater are recommended.
>         Note that most modules will work with earlier versions of Perl.
>         The only ones that will not are Bio::SimpleAlign and the
>         Bio::Index::* modules. If you don't need these modules and you
>         want to install Bioperl using an earlier version of Perl, edit
>         the "require 5.005;" line in Makefile.PL as necessary.
> 
>       * External modules: Bioperl uses functionality provided in other
>         Perl modules. Some of these are included in the standard perl
>         package but some need to be obtained from the CPAN site. The
>         list of external modules is included at the bottom of this
>         document.
> 
>     The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of
>     these external modules easy. Simply install the bundle using your
>     CPAN shell and all necessary modules will be installed. See THE
>     BIOPERL BUNDLE, below.
> 
> 
> OPTIONAL
> 
>       * ANSI  C  or  GNU  C  compiler  (gcc)  for  XS  extensions (the
>         bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext
>         PACKAGE, below).
> 
> 
> 
> ADDITIONAL INSTALLATION INFORMATION
> 
>       * Additional information on Bioperl and MAC OS:
>            + OS 9 - http://bioperl.org/Core/mac-bioperl.html
>            + OS X - http://www.tc.umn.edu/~cann0010/Bioperl_OSX_install.html
>            + OS X - Installing using Fink (in Getting BioPerl)
> 
> 
> 
> THE BIOPERL BUNDLE
> 
>     You typically need root privileges to install using CPAN. If you
>     don't have these privileges please see INSTALLING BIOPERL IN A
>     PERSONAL MODULE AREA for additional information.
> 
>     Install Bundle::BioPerl using CPAN. One way:
>     >perl -MCPAN -e "install Bundle::BioPerl"
> 
>     Another way:
>     >perl -MCPAN -e shell
>     cpan>install Bundle::BioPerl
> 
> 
> 
> On Oct 17, 2006, at 8:04 AM, Chris Fields wrote:
> 
>> There isn't a very easy way since so many links have to be removed/
>> modified.
>> I have found a few CPAN modules that could help, but for now I just
>> dump the
>> text output from a text browser (elinks) using the 'printable
>> version' page
>> and hand-edit, which works very quickly.  That works for the time
>> being
>> until I can find another more automated solution.
>> 
>> Fortunately there have been very few edits to either INSTALL wiki
>> page so
>> they should remain relatively stable.
>> 
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>> 
>>> -----Original Message-----
>>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
>>> Sent: Tuesday, October 17, 2006 6:46 AM
>>> To: Chris Fields
>>> Cc: bioperl-l
>>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>>> 
>>> Chris Fields wrote:
>>>> The general consensus was to keep text versions available; we could
>>>> add URL links to the wiki pages for the most up-to-date version.
>>>> BTW,
>>>> I have modified INSTALL already.  INSTALL.WIN is next in line (I was
>>>> waiting for your changes).
>>>> 
>>> Is it possible to generate these files from the wiki whenever
>>> there is a
>>> release? I know edits shouldn't be too severe or too often - but I can
>>> see things getting a little messy/annoying if edits have to be
>>> made in 2
>>> places.
>>> 
>>> Nath
>> 


From cjfields at uiuc.edu  Tue Oct 17 16:30:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 15:30:15 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
Message-ID: <0FB91820-B2A1-4F7F-866C-8D4791DD8306@uiuc.edu>

I can confirm this using bioperl-live:

GenBank:AL591065.17.17
/Users/cjfields/src/bioperl-live/Bio/Annotation/DBLink.pm

Could you file a bug report via bugzilla?

Chris

On Oct 17, 2006, at 2:01 PM, Erikjan wrote:

> Hello,
>
> I noticed a little problem with the Annotation "DBLink" from  
> GenBank entries
>
> When I run:
>
> perl -MBio::DB::GenBank -e 'my $gi =
> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations 
> ("dblink");
> for(@annotations) { print $_, "\n";} print $INC{
> "Bio/Annotation/DBLink.pm" }, "\n"; '
>
> This yields:
>
>    GenBank:AL591065.17.17
>
> and the place where the used Bio/Annotation/DBLink.pm resides.
>
> Can others repeat this?
>
> I have dug into the source a little and Bio::Annotation::DBLink  
> seems to
> be the place where this happens: it has a concatenation which leads to
> that repeated version number.
>
> Is this something that I should fix "client-side", so to speak, or is it
> worthwhile to add some logic to that concatenation to prevent this?
>
>
> Thanks,
>
> Eric
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From paul.boutros at utoronto.ca  Tue Oct 17 19:49:52 2006
From: paul.boutros at utoronto.ca (Paul Boutros)
Date: Tue, 17 Oct 2006 19:49:52 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <001401c6f21a$836f9fc0$15327e82@pyrimidine>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine>
Message-ID: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>

Hi Chris,

Yup, that's it.  I installed XML::SAX::ExpatXS (make test output  
below).  Should there be a note somewhere in the INSTALL docs saying  
basically what you just wrote?  Or maybe it's already there somewhere  
and I missed it.

Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks  
if DBD::mysql can be loaded, and if not doesn't run the test.  Since  
the file is only one-line long, here's the modified file rather than a  
patch:
################################################################
BEGIN {
         # DBD::mysql is required
         eval {
                 require DBD::mysql;
                 };
         if ( $@ ) {
                 # load Test::More at run time: a 'use' here would run at
                 # compile time and skip the tests unconditionally
                 require Test::More;
                 Test::More::plan( skip_all => "DBD::mysql is not installed "
                         . "or is installed incorrectly - skipping "
                         . "BioDBSeqFeature_mysql.t" );
                 }
         }

system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 -dsn test";
################################################################

And when I run it I get:
t/BioDBSeqFeature_mysql......skipped
         all skipped: DBD::mysql is not installed or is installed  
incorrectly - skipping BioDBSeqFeature_mysql.t

And for the overall make test:
All tests successful, 3 tests and 106 subtests skipped.
Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys =  
164.24 CPU)

Hope this helps,
Paul


Quoting Chris Fields <cjfields at uiuc.edu>:

> Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
> backend parser.  For some reason BLAST XML parsing doesn't work with that
> parser (it tries to verify the XML first before parsing, hence the DTD
> error).  I may try getting this to work again, but so far I haven't found an
> easy way to prevent XML verification via XML::SAX::Expat.
>
> There are two options: 1) install XML::SAX::ExpatXS (the better option),
> which works AND is 4x faster than XML::SAX::Expat, or  2) set the default
> parser in the ParserDetails.ini file of your local XML::SAX installation to
> XML::SAX::PurePerl.
>
> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
> hasn't officially happened yet); the latter hasn't had significant
> development in about three years.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>> -----Original Message-----
>> From: Paul Boutros [mailto:paul.boutros at utoronto.ca]
>> Sent: Tuesday, October 17, 2006 1:00 PM
>> To: Chris Fields
>> Cc: bioperl-l at lists.open-bio.org
>> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2
>>
>> Hi Chris,
>>
>> Here it is:
>> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
>> 1..1337
>> ok 1
>>
>> -------------------- WARNING ---------------------
>> MSG: XML::SAX::Expat not currently supported; must have local copies
>> of NCBI DTD docs!
>> ---------------------------------------------------
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>
>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
>> does not exist
>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>> Handler couldn't resolve external entity at line 2, column 82, byte 104
>> error in processing external entity reference at line 2, column 82,
>> byte 104 at
>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
>> 187
>>
>> ---------------------------------------------------
>> not ok 2
>> # Failed test 2 in t/SearchIO.t at line 68
>> Can't call method "database_name" on an undefined value at
>> t/SearchIO.t line 69.
>>
>>
>> Quoting Chris Fields <cjfields at uiuc.edu>:
>>
>> > What do you get when you run the SearchIO.t test by itself using 'perl -
>> I.
>> > t/SearchIO.t'?  It looks like something pretty catastrophic happened.
>> >
>> > Christopher Fields
>> > Postdoctoral Researcher - Switzer Lab
>> > Dept. of Biochemistry
>> > University of Illinois Urbana-Champaign
>> >
>> >
>> >> -----Original Message-----
>> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
>> >> Sent: Tuesday, October 17, 2006 11:57 AM
>> >> To: bioperl-l at lists.open-bio.org
>> >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
>> >>
>> >> Hi,
>> >> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
>> >> tests, the first seems to be just a result of me not having DBD::mysql
>> >> installed.
>> >> Paul
>> >>
>> >> Test Summary
>> >> ============
>> >>
>> >> Failed Test               Stat Wstat Total Fail  List of Failed
>> >> -----------------------------------------------------------------------
>> ---
>> >> -----
>> >> t/BioDBSeqFeature_mysql.t               46   46  1-46
>> >> t/SearchIO.t                22  5632  1337 2671  2-1337
>> >> 2 tests and 106 subtests skipped.
>> >> Failed 2/236 test scripts. 1382/11688 subtests failed.
>> >> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =
>> >> 159.61 CPU)
>> >>
>> >> BioDBSeqFeature_mysql
>> >> =====================
>> >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
>> >> 1..46
>> >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC
>> >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t
>> >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8
>> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi
>> >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at
>> >> (eval 37) line 3.
>> >> Perhaps the DBD::mysql perl module hasn't been fully installed,
>> >> or perhaps the capitalisation of 'mysql' isn't right.
>> >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
>> >>   at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
>> >>
>> >> SearchIO
>> >> ========
>> >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
>> >> 1..1337
>> >> ok 1
>> >>
>> >> -------------------- WARNING ---------------------
>> >> MSG: XML::SAX::Expat not currently supported; must have local copies
>> >> of NCBI DTD docs!
>> >> ---------------------------------------------------
>> >>
>> >> -------------------- WARNING ---------------------
>> >> MSG: error in parsing a report:
>> >>
>> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
>> >> does not exist
>> >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>> >> Handler couldn't resolve external entity at line 2, column 82, byte 104
>> >> error in processing external entity reference at line 2, column 82,
>> >> byte 104 at
>> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
>> >> 187
>> >>
>> >> ---------------------------------------------------
>> >> not ok 2
>> >> # Failed test 2 in t/SearchIO.t at line 68
>> >> Can't call method "database_name" on an undefined value at
>> >> t/SearchIO.t line 69.
>> >>
>> >> ------------------------------
>> >>
>> >> Message: 10
>> >> Date: Tue, 17 Oct 2006 11:32:54 +0100
>> >> From: Sendu Bala <bix at sendu.me.uk>
>> >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2
>> >> To: bioperl-l at bioperl.org
>> >> Message-ID: <4534B156.4090501 at sendu.me.uk>
>> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>> >>
>> >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
>> >> See http://www.bioperl.org/wiki/Release_1.5.2 for
>> >> instructions on getting and testing this RC.
>> >>
>> >> Developers:
>> >>     This should be the last RC before release ~next monday. Now would
>> >>     be a good time for last minute documentation updates and additions.
>> >>
>> >> Users:
>> >>     Even though 1.5.2 is a 'developer' release, we consider it the most
>> >>     stable and capable version of Bioperl, and recommend that you use
>> >>     it in all but the most critical production environments. Please
>> >>     try it out and let us know of any problems or difficulties you run
>> >>     into.
>> >>
>> >>
>> >> Thank you,
>> >> Sendu.
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> Bioperl-l mailing list
>> >> Bioperl-l at lists.open-bio.org
>> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >
>> >
>>
>
>
>


From cjfields at uiuc.edu  Tue Oct 17 20:51:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 19:51:35 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine>
	<20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
Message-ID: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>

On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote:

> Hi Chris,
>
> Yup, that's it.  I installed XML::SAX::ExpatXS (make test output
> below).  Should there be a note somewhere in the INSTALL docs saying
> basically what you just wrote?  Or maybe it's already there somewhere
> and I missed it.

The INSTALL docs should have this, yes.  I'll double-check though.

Pretty much anything that plugs into XML::SAX except XML::SAX::Expat  
works (XML::LibXML also works, I found).
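
For anyone wanting to check their own setup, here is a rough sketch of how one might see which SAX backends are registered locally and steer clear of XML::SAX::Expat. The helper name pick_sax_backend is made up for illustration, and walking the list from the end is my assumption about how XML::SAX::ParserFactory chooses its default (it appears to favour the last entry in ParserDetails.ini), so treat this as a starting point rather than how Bioperl does it.

```perl
use strict;
use warnings;

# Hypothetical helper: given parser names in order of preference, return
# the first one that is not the plain XML::SAX::Expat backend.
sub pick_sax_backend {
    my @names = @_;
    for my $name (@names) {
        return $name unless $name eq 'XML::SAX::Expat';
    }
    return;
}

# List whatever is registered locally, if XML::SAX is installed at all.
my @registered;
if ( eval { require XML::SAX; 1 } ) {
    @registered = map { $_->{Name} } @{ XML::SAX->parsers };
}

# Assumption: the last registered parser is the factory's default, so
# walk the list from the end.
my $choice = pick_sax_backend( reverse @registered );
print defined $choice
    ? "would use $choice\n"
    : "no suitable SAX backend found\n";

# To force a backend for a single script without editing
# ParserDetails.ini, XML::SAX::ParserFactory honours this variable:
#   $XML::SAX::ParserPackage = 'XML::SAX::PurePerl';
```

Setting the package variable only affects the running script, which makes it a gentler option than changing the site-wide ini file.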

> Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
> if DBD::mysql can be loaded, and if not doesn't run the test.  Since
> the file is only one-line long, here's the modified file rather than a
> patch:
> ################################################################
> BEGIN {
>          # DBD::mysql is required
>          eval {
>                  require DBD::mysql;
>                  };
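>          if ( $@ ) {
>                  # Load Test::More at run time: a 'use' here would run at
>                  # compile time and skip the tests unconditionally.
>                  require Test::More;
>                  Test::More->import( skip_all => "DBD::mysql is not "
>                      . "installed or is installed incorrectly - skipping "
>                      . "BioDBSeqFeature_mysql.t" );
>                  exit(0);
>                  }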
>          }
>
> system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 -dsn test";
> ################################################################
>
> And when I run it I get:
> t/BioDBSeqFeature_mysql......skipped
>          all skipped: DBD::mysql is not installed or is installed
> incorrectly - skipping BioDBSeqFeature_mysql.t
>
> And for the overall make test:
> All tests successful, 3 tests and 106 subtests skipped.
> Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys =
> 164.24 CPU)

It should check this when using 'perl Makefile.PL', since the tests  
are only set up if MySQL is present (so you would assume that it  
checks for DBD::mysql).  I'll look into it.
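
Something along these lines is what I have in mind for the Makefile.PL side. This is only a sketch: module_loads is a hypothetical helper, and the print statements stand in for whatever Makefile.PL actually does when it writes out the wrapper test.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module loads cleanly.
sub module_loads {
    my ($module) = @_;

    # Refuse anything that is not a plain package name before eval'ing it.
    return 0 unless $module =~ /^\w+(?:::\w+)*$/;
    return eval("require $module; 1") ? 1 : 0;
}

# Only set up the MySQL-backed tests when the driver is really usable.
if ( module_loads('DBD::mysql') ) {
    print "DBD::mysql found: would generate the MySQL test wrapper\n";
}
else {
    print "DBD::mysql missing: would not generate the MySQL test wrapper\n";
}
```

Checking at configure time keeps the test file from existing at all on machines that cannot run it, so nothing shows up as failed or skipped.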

Chris

> Hope this helps,
> Paul
>
>
> Quoting Chris Fields <cjfields at uiuc.edu>:
>
>> Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
>> backend parser.  For some reason BLAST XML parsing doesn't work with that
>> parser (it tries to verify the XML first before parsing, hence the DTD
>> error).  I may try getting this to work again, but so far I haven't found
>> an easy way to prevent XML verification via XML::SAX::Expat.
>>
>> There are two options: 1) install XML::SAX::ExpatXS (the better option),
>> which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
>> parser in your local ParserDetails.ini file to XML::SAX::PurePerl.
>>
>> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
>> hasn't officially happened yet); the latter hasn't had significant
>> development in about three years.
>>
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>> -----Original Message-----
>>> From: Paul Boutros [mailto:paul.boutros at utoronto.ca]
>>> Sent: Tuesday, October 17, 2006 1:00 PM
>>> To: Chris Fields
>>> Cc: bioperl-l at lists.open-bio.org
>>> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2
>>>
>>> Hi Chris,
>>>
>>> Here it is:
>>> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
>>> 1..1337
>>> ok 1
>>>
>>> -------------------- WARNING ---------------------
>>> MSG: XML::SAX::Expat not currently supported; must have local copies
>>> of NCBI DTD docs!
>>> ---------------------------------------------------
>>>
>>> -------------------- WARNING ---------------------
>>> MSG: error in parsing a report:
>>>
>>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
>>> does not exist
>>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>>> Handler couldn't resolve external entity at line 2, column 82, byte 104
>>> error in processing external entity reference at line 2, column 82,
>>> byte 104 at
>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
>>> 187
>>>
>>> ---------------------------------------------------
>>> not ok 2
>>> # Failed test 2 in t/SearchIO.t at line 68
>>> Can't call method "database_name" on an undefined value at
>>> t/SearchIO.t line 69.
>>>
>>>
>>> Quoting Chris Fields <cjfields at uiuc.edu>:
>>>
>>>> What do you get when you run the SearchIO.t test by itself using
>>>> 'perl -I. t/SearchIO.t'?  It looks like something pretty catastrophic
>>>> happened.
>>>>
>>>> Christopher Fields
>>>> Postdoctoral Researcher - Switzer Lab
>>>> Dept. of Biochemistry
>>>> University of Illinois Urbana-Champaign
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>>>>> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
>>>>> Sent: Tuesday, October 17, 2006 11:57 AM
>>>>> To: bioperl-l at lists.open-bio.org
>>>>> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
>>>>>
>>>>> Hi,
>>>>> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
>>>>> tests, the first seems to be just a result of me not having DBD::mysql
>>>>> installed.
>>>>> Paul
>>>>>
>>>>> Test Summary
>>>>> ============
>>>>>
>>>>> Failed Test               Stat Wstat Total Fail  List of Failed
>>>>> -------------------------------------------------------------------------------
>>>>> t/BioDBSeqFeature_mysql.t               46   46  1-46
>>>>> t/SearchIO.t                22  5632  1337 2671  2-1337
>>>>> 2 tests and 106 subtests skipped.
>>>>> Failed 2/236 test scripts. 1382/11688 subtests failed.
>>>>> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =
>>>>> 159.61 CPU)
>>>>>
>>>>> BioDBSeqFeature_mysql
>>>>> =====================
>>>>> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
>>>>> 1..46
>>>>> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC
>>>>> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t
>>>>> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8
>>>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi
>>>>> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at
>>>>> (eval 37) line 3.
>>>>> Perhaps the DBD::mysql perl module hasn't been fully installed,
>>>>> or perhaps the capitalisation of 'mysql' isn't right.
>>>>> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
>>>>>   at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
>>>>>
>>>>> SearchIO
>>>>> ========
>>>>> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
>>>>> 1..1337
>>>>> ok 1
>>>>>
>>>>> -------------------- WARNING ---------------------
>>>>> MSG: XML::SAX::Expat not currently supported; must have local copies
>>>>> of NCBI DTD docs!
>>>>> ---------------------------------------------------
>>>>>
>>>>> -------------------- WARNING ---------------------
>>>>> MSG: error in parsing a report:
>>>>>
>>>>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
>>>>> does not exist
>>>>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>>>>> Handler couldn't resolve external entity at line 2, column 82, byte 104
>>>>> error in processing external entity reference at line 2, column 82,
>>>>> byte 104 at
>>>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
>>>>> 187
>>>>>
>>>>> ---------------------------------------------------
>>>>> not ok 2
>>>>> # Failed test 2 in t/SearchIO.t at line 68
>>>>> Can't call method "database_name" on an undefined value at
>>>>> t/SearchIO.t line 69.
>>>>>
>>>>> ------------------------------
>>>>>
>>>>> Message: 10
>>>>> Date: Tue, 17 Oct 2006 11:32:54 +0100
>>>>> From: Sendu Bala <bix at sendu.me.uk>
>>>>> Subject: [Bioperl-l] Bioperl 1.5.2 RC2
>>>>> To: bioperl-l at bioperl.org
>>>>> Message-ID: <4534B156.4090501 at sendu.me.uk>
>>>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>>>>
>>>>> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
>>>>> See http://www.bioperl.org/wiki/Release_1.5.2 for
>>>>> instructions on getting and testing this RC.
>>>>>
>>>>> Developers:
>>>>>     This should be the last RC before release ~next monday. Now would
>>>>>     be a good time for last minute documentation updates and additions.
>>>>>
>>>>> Users:
>>>>>     Even though 1.5.2 is a 'developer' release, we consider it the most
>>>>>     stable and capable version of Bioperl, and recommend that you use
>>>>>     it in all but the most critical production environments. Please
>>>>>     try it out and let us know of any problems or difficulties you run
>>>>>     into.
>>>>>
>>>>>
>>>>> Thank you,
>>>>> Sendu.
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>>
>>>
>>
>>
>>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Wed Oct 18 02:52:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 07:52:05 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4535CF15.4090502@sendu.me.uk>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
> See http://www.bioperl.org/wiki/Release_1.5.2 for
> instructions on getting and testing this RC.
> 
> Developers:
>    This should be the last RC before release ~next monday. Now would
>    be a good time for last minute documentation updates and additions.

Given the few issues that have come up, it would be prudent to have 
another RC, so expect one around the time the 'Needs investigation' 
issues on the release page have been solved.

If you think there are more things that need investigation, please add 
them, but note the bias toward things that affect the successful 
completion of the test suite as opposed to general bugs which should go 
to Bugzilla as normal.


From bix at sendu.me.uk  Wed Oct 18 04:55:21 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 09:55:21 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45350BA6.3040102@genomics.dk>
References: <4534B156.4090501@sendu.me.uk>
	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>
	<45350BA6.3040102@genomics.dk>
Message-ID: <4535EBF9.1090706@sendu.me.uk>

Niels Larsen wrote:

> ------------ EBI
> 
> I invoked the EBI script
> 
> http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip
> 
> like this
> 
> WSWUBlastClient.pl -p blastn -D embl test.fasta
> 
> where the content of test.fasta is below, and got
> 
> Can't find method element in the message at 
> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

As you admit, this is not a Bioperl issue. I would suggest you contact 
EBI support.

In the mean time/alternatively I'd suggest investigating the Bioperl 
interface to the SOAP server, which is part of the Bioperl-run package.

http://doc.bioperl.org/releases/bioperl-current/bioperl-run/Bio/Tools/Run/Analysis.html


> ------------ DDBJ
> 
> Inspired by this page,
> 
> http://xml.nig.ac.jp/doc/Blast.txt
> 
> I made this test script
[snip]
> which for me prints undef.

Again, not something I can really help you with. You'll need to 
triple-check your code and then seek support from the providers of that 
SOAP service.


> ------------- NCBI/Bioperl
> 
> I installed 1.5.2-RC2, looked at the RemoteBlast example in
> 
> http://www.bioperl.org/wiki/Bptutorial.pl
> 
> and then put that into this test code, more or less cut/paste,
[snip]
> Maybe I am supposed to add a check for content in $rc and then stop
> the inner loop?

Yes, the wiki page example isn't really adequate. I'll update it. For a 
better code example see the RemoteBlast documentation:

http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html


> I could figure that out maybe, but I wish there was a
> function which simply takes a single sequence + arguments and only
> returns a list of matches when done, and does not return until then
> (or until a specified timeout).

Yes, I hardly find dealing with RIDs that pleasant. You might like to 
add a feature request to Bugzilla.
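
In the meantime, the "block until done or timed out" behaviour can be wrapped up in a few lines. The sketch below is an illustration, not part of Bioperl: wait_for_result is a hypothetical helper, and the stub callback stands in for a remote job. It assumes the convention in the RemoteBlast documentation, where the retrieval call returns 0 while the job is pending, a negative number on error, and an object once the result is ready.

```perl
use strict;
use warnings;

# Hypothetical helper: call $fetch repeatedly until it returns a reference
# (the finished result) or a negative number (error), or until the
# timeout expires.
sub wait_for_result {
    my ( $fetch, %opt ) = @_;
    my $timeout  = defined $opt{timeout}  ? $opt{timeout}  : 300;
    my $interval = defined $opt{interval} ? $opt{interval} : 5;
    my $deadline = time + $timeout;

    while (1) {
        my $rc = $fetch->();
        return $rc if ref $rc;                 # finished
        die "job failed\n" if $rc < 0;         # server reported an error
        die "timed out after $timeout seconds\n" if time >= $deadline;
        sleep $interval if $interval > 0;      # still pending, try again
    }
}

# Stand-in for a remote job that becomes ready on the third poll.
my $polls  = 0;
my $result = wait_for_result(
    sub { ++$polls < 3 ? 0 : { hits => [] } },
    timeout  => 10,
    interval => 0,
);
print "ready after $polls polls\n";
```

With RemoteBlast the callback would presumably be something like sub { $factory->retrieve_blast($rid) }, with the RID removed from the factory once the call returns.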


From n.haigh at sheffield.ac.uk  Wed Oct 18 05:58:00 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 10:58:00 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
Message-ID: <4535FAA8.2050506@sheffield.ac.uk>

I get all tests passing except for BioDBSeqFeature_mysql which fails all
tests (1-46).

During perl Makefile.PL I get:
"I see you have Berkeleydb installed. I will create the DBD tests for
Bio::DB::SeqFeature::Store..."

I notice under the "needs investigation" there is mention of tests
being generated even if DBD::mysql isn't installed. I assume this is the
problem? If this is the problem should DBD::mysql be added to the
dependencies in Makefile.PL?

Is there an easy way to find out what tests are being skipped due to
absent modules?

Cheers
Nath


From n.haigh at sheffield.ac.uk  Wed Oct 18 07:34:21 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 12:34:21 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4535EBF9.1090706@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>
	<4535EBF9.1090706@sendu.me.uk>
Message-ID: <4536113D.1080307@sheffield.ac.uk>

I've just added test results for 1.5.2 RC2 to the wiki.

There are lots of failures for packages other than bioperl-live. I'm not
sure exactly how the test failures/skips are or should be handled, given
that my setups are as follows.

Clean WinXP Pro:
This is a clean install of WinXP Pro SP2 with no major software
installed, other than ActivePerl 5.8.8.819 and a few tools for archive
extracting, anti virus etc. Therefore, I'm unsure how tests in
bioperl-network and bioperl-db should return. For example, I have made
no effort to setup biosql-schema but I thought that maybe there would be
a test that would detect this, and fail, then skip over other tests
gracefully - like the bioperl-run tests when a piece of software is not
installed???

Debian Linux:
This is a Bio-Linux machine with quite a lot of bioinformatics software
installed in the Path. So most of the tests in bioperl-run should
probably have passed. The same goes for bioperl-network and bioperl-db
as with my Windows setup.

If my thoughts are totally wrong - let me know!
Nath


From bix at sendu.me.uk  Wed Oct 18 08:03:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 13:03:11 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <4535FAA8.2050506@sheffield.ac.uk>
References: <4535FAA8.2050506@sheffield.ac.uk>
Message-ID: <453617FF.9080508@sendu.me.uk>

Nathan Haigh wrote:
> I get all tests passing except for BioDBSeqFeature_mysql which fails all
> tests (1-46).
> 
> During perl Makefile.PL I get:
> "I see you have Berkeleydb installed. I will create the DBD tests for
> Bio::DB::SeqFeature::Store..."
> 
> I notice under the "needs investigation" there is mention about tests
> been generated even if DBD::mysql isn't installed. I assume this is the
> problem? 

Probably. I'm looking into it. Not sure why it wasn't causing a problem 
before now.

> If this is the problem should DBD::mysql be added to the
> dependencies in Makefile.PL?

No. You can use the modules in question without mysql (presumably; ie. 
you have a different sql setup), so it makes no sense to warn people 
they don't have a module they absolutely do not need.


> Is there an easy way to find out what tests are being skipped due to
> absent modules?

Ideally, when the skip occurs the test script will issue a message. I 
think that happens in most, if not all cases.


From bix at sendu.me.uk  Wed Oct 18 09:02:50 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 14:02:50 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <453617FF.9080508@sendu.me.uk>
References: <4535FAA8.2050506@sheffield.ac.uk> <453617FF.9080508@sendu.me.uk>
Message-ID: <453625FA.6090907@sendu.me.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
...
>> I notice under the "needs investigation" there is mention about tests
>> been generated even if DBD::mysql isn't installed. I assume this is the
>> problem? 
> 
> Probably. I'm looking into it. Not sure why it wasn't causing a problem 
> before now.
> 
>  > If this is the problem should DBD::mysql be added to the
>  > dependencies in Makefile.PL?
> 
> No. You can use the modules in question without mysql (presumably; ie. 
> you have a different sql setup), so it makes no sense to warn people 
> they don't have a module they absolutely do not need.

Oops. It /is/ in the pre-reqs in Makefile.PL. Maybe DBD::mysql is the 
only supported driver?


From bix at sendu.me.uk  Wed Oct 18 09:16:24 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 14:16:24 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine>	<20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
	<67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>
Message-ID: <45362928.8070104@sendu.me.uk>

Chris Fields wrote:
> On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote:
> 
>> Hi Chris,
>>
>> Yup, that's it.  I installed XML::SAX::ExpatXS (make test output
>> below).  Should there be a note somewhere in the INSTALL docs saying
>> basically what you just wrote?  Or maybe it's already there somewhere
>> and I missed it.
> 
> The INSTALL docs should have this, yes.  I'll double-check though.
> 
> Pretty much anything that plugs into XML::SAX except XML::SAX::Expat  
> works (XML::LibXML also works, I found).
> 
>> Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
>> if DBD::mysql can be loaded,
[snip]
> It should check this when using 'perl Makefile.PL', since the tests  
> are only set up if MySQL is present (so you would assume that it  
> checks for DBD::mysql).  I'll look into it.

This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in 
my t directory when I packed it up for release.

I'm tweaking Makefile.PL right now in any case; there are a few errors 
and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean.


From cjfields at uiuc.edu  Wed Oct 18 09:55:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 08:55:37 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
Message-ID: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>

Ding dong the witch is dead!  As announced previously, from the latest
GenBank release (156.0):

-----------------------------------------------

1.3.8 Feature location syntax X.Y no longer supported

  The Feature Table has supported feature locations of the form 'X.Y', to
represent a base position which is greater or equal to X, and less than or
equal to Y. For example:

	misc_feature    1.10..20
	misc_feature    join(100..150,200.210..250)

  In the first example, the misc_feature starts somewhere between bases 1
and 10 (inclusive), and ends at basepair 20. In the second, the 51 bases
from 100..150 are joined together with a second basepair interval, which
could be anywhere from 200..250 to 210..250 .

  Although this syntax seems like a reasonable way to capture an uncertain
interval, it is used for features on a vanishingly small number of sequence
records, most database submission mechanisms don't support it, and the
meaning of its use in a join() context is not entirely clear.

  As of October 2006, this type of location is no longer supported.
Those records with features which utilize X.Y locations will be reviewed and
converted to a non-uncertain format.

-----------------------------------------------

EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
Not sure about UniProt/SwissProt.

I guess we're keeping this in for backwards compatibility only, but how do
we handle any bugs that pop up related to this?  
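
For anyone who hasn't run into the syntax, here is a small sketch of what an X.Y endpoint means in practice. It is plain Perl and deliberately not the Bioperl location parser: parse_xy_location is a hypothetical helper that only handles a single 'start..end' pair (no join(), no other operators), just to make the min/max reading of the examples above concrete.

```perl
use strict;
use warnings;

# Hypothetical helper, not a BioPerl function: parse a simple 'start..end'
# location in which either endpoint may use the X.Y syntax.
sub parse_xy_location {
    my ($string) = @_;
    my $endpoint = qr/(\d+)(?:\.(\d+))?/;

    my ( $s_min, $s_max, $e_min, $e_max ) =
        $string =~ /^$endpoint\.\.$endpoint$/
        or return;

    # A plain position is treated as a range of width one.
    return {
        min_start => $s_min,
        max_start => defined $s_max ? $s_max : $s_min,
        min_end   => $e_min,
        max_end   => defined $e_max ? $e_max : $e_min,
    };
}

for my $string ( '1.10..20', '200.210..250', '100..150' ) {
    my $loc = parse_xy_location($string) or next;
    printf "%-13s start in [%d, %d], end in [%d, %d]\n",
        $string, @{$loc}{qw(min_start max_start min_end max_end)};
}
```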

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Wed Oct 18 10:10:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 09:10:07 -0500
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <453617FF.9080508@sendu.me.uk>
Message-ID: <001f01c6f2bf$20737270$15327e82@pyrimidine>

> Nathan Haigh wrote:
> > I get all tests passing except for BioDBSeqFeature_mysql which fails all
> > tests (1-46).
> >
> > During perl Makefile.PL I get:
> > "I see you have Berkeleydb installed. I will create the DBD tests for
> > Bio::DB::SeqFeature::Store..."
> >
> > I notice under the "needs investigation" there is mention about tests
> > been generated even if DBD::mysql isn't installed. I assume this is the
> > problem?
> 
> Probably. I'm looking into it. Not sure why it wasn't causing a problem
> before now.

Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP
because 'perl Makefile.PL' doesn't detect my MySQL installation, so the
MySQL-based tests don't run even though I have DBD::mysql installed.  I
thought this might just be a WinXP issue, but apparently not.  If I can get
to it I'll run a few checks.

>  > If this is the problem should DBD::mysql be added to the
>  > dependencies in Makefile.PL?
> 
> No. You can use the modules in question without mysql (presumably; ie.
> you have a different sql setup), so it makes no sense to warn people
> they don't have a module they absolutely do not need.

Agreed, though I don't know if other relational DB's are supported like
PostgreSQL.

> > Is there an easy way to find out what tests are being skipped due to
> > absent modules?
> 
> Ideally, when the skip occurs the test script will issue a message. I
> think that happens in most, if not all cases.

Yes, though we may run into the same issue we had with XEMBL tests not
reporting the reasons it skipped.  Each test suite should run an eval{} to
check the required modules, then only skip blocks of tests that rely on
those modules.  I think we have caught most of those, but who knows w/o
doing a complete test suite audit?

Our eventual complete switchover to Test::More should hopefully clean these
up.  I don't consider it a pressing issue for this release, though Sendu may
feel differently.
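
To make the eval{}-then-skip idea concrete, here is a minimal Test::More sketch. The test names and the count of two skipped tests are placeholders of mine, not taken from any existing Bioperl test file; the point is only that the tests needing the optional module sit in a SKIP block, so the rest of the file still runs and the skip reason gets reported.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Probe for the optional module once, up front.
my $have_mysql = eval { require DBD::mysql; 1 } ? 1 : 0;

# Tests with no optional dependencies always run.
ok( 1, 'test with no optional dependencies' );

SKIP: {
    # Skip only the block that needs the module, and say why.
    skip 'DBD::mysql not installed', 2 unless $have_mysql;

    ok( $INC{'DBD/mysql.pm'}, 'DBD::mysql is loaded' );
    ok( 1, 'placeholder for a MySQL-dependent test' );
}
```

The reason string passed to skip ends up in the TAP output, which is what lets the harness summary say why something was skipped.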

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Wed Oct 18 10:12:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 09:12:52 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45362928.8070104@sendu.me.uk>
Message-ID: <002001c6f2bf$807849c0$15327e82@pyrimidine>

...
> This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in
> my t directory when I packed it up for release.
> 
> I'm tweaking Makefile.PL right now in any case; there are a few errors
> and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean.

Okay, makes sense now.  No big deal, it's still an RC (a developer's RC at
that!).

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From n.haigh at sheffield.ac.uk  Wed Oct 18 10:17:35 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 15:17:35 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <001f01c6f2bf$20737270$15327e82@pyrimidine>
References: <001f01c6f2bf$20737270$15327e82@pyrimidine>
Message-ID: <4536377F.6000408@sheffield.ac.uk>

Chris Fields wrote:
>> Nathan Haigh wrote:
>>     
>>> I get all tests passing except for BioDBSeqFeature_mysql which fails all
>>> tests (1-46).
>>>
>>> During perl Makefile.PL I get:
>>> "I see you have Berkeleydb installed. I will create the DBD tests for
>>> Bio::DB::SeqFeature::Store..."
>>>
>>> I notice under the "needs investigation" there is mention about tests
>>> been generated even if DBD::mysql isn't installed. I assume this is the
>>> problem?
>>>       
>> Probably. I'm looking into it. Not sure why it wasn't causing a problem
>> before now.
>>     
>
> Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP
> because 'perl Makefile.PL' doesn't detect my MySQL installation, so the
> MySQL-based tests don't run even though I have DBD::mysql installed.  I
> thought this might just be a WinXP issue, but apparently not.  If I can get
> to it I'll run a few checks.
>

This was on WinXP.

>>  > If this is the problem should DBD::mysql be added to the
>>  > dependencies in Makefile.PL?
>>
>> No. You can use the modules in question without mysql (presumably; ie.
>> you have a different sql setup), so it makes no sense to warn people
>> they don't have a module they absolutely do not need.
>>     
>
> Agreed, though I don't know if other relational DB's are supported like
> PostgreSQL.
>
>   
>>> Is there an easy way to find out what tests are being skipped due to
>>> absent modules?
>>>       
>> Ideally, when the skip occurs the test script will issue a message. I
>> think that happens in most, if not all cases.
>>     
>
> Yes, though we may run into the same issue we had with XEMBL tests not
> reporting the reasons it skipped.  Each test suite should run an eval{} to
> check the required modules, then only skip blocks of tests that rely on
> those modules.  I think we have caught most of those, but who knows w/o
> doing a complete test suite audit?
>
> Our eventual complete switchover to Test::More should hopefully clean these
> up.  I don't consider it a pressing issue for this release, though Sendu may
> feel differently.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>
>   


From hlapp at gmx.net  Wed Oct 18 10:36:31 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 18 Oct 2006 10:36:31 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>
References: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>
Message-ID: <B8036AB0-741F-427A-9EB1-7E80A28EC79F@gmx.net>


On Oct 18, 2006, at 9:55 AM, Chris Fields wrote:

> how do we handle any bugs that pop up related to this?

By an evil grin, followed by deflecting the blame to NCBI, followed  
by another evil grin.
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Oct 18 10:43:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 09:43:31 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
In-Reply-To: <B8036AB0-741F-427A-9EB1-7E80A28EC79F@gmx.net>
Message-ID: <002401c6f2c3$c83c7e30$15327e82@pyrimidine>

> On Oct 18, 2006, at 9:55 AM, Chris Fields wrote:
> 
> > how do we handle any bugs that pop up related to this?
> 
> By an evil grin, followed by deflecting the blame to NCBI, followed
> by another evil grin.
> --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Sounds good to me!  One less thing to worry about.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From n.haigh at sheffield.ac.uk  Wed Oct 18 10:45:57 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 15:45:57 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4536113D.1080307@sheffield.ac.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>	<4535EBF9.1090706@sendu.me.uk>
	<4536113D.1080307@sheffield.ac.uk>
Message-ID: <45363E25.8010806@sheffield.ac.uk>

Nathan Haigh wrote:
> I've just added test results for 1.5.2 RC2 to the wiki.
>
> There are lots of failures for packages other than bioperl-live. I'm not
> sure exactly how the test failures/skips are or should be handled, given
> that my setups are as follows.
>
> Clean WinXP Pro:
> This is a clean install of WinXP Pro SP2 with no major software
> installed, other than ActivePerl 5.8.8.819 and a few tools for archive
> extracting, anti virus etc. Therefore, I'm unsure how tests in
> bioperl-network and bioperl-db should return. For example, I have made
> no effort to setup biosql-schema but I thought that maybe there would be
> a test that would detect this, and fail, then skip over other tests
> gracefully - like the bioperl-run tests when a piece of software is not
> installed???
>
> Debian Linux:
> This is a Bio-Linux machine with quite a lot of bioinformatics software
> installed in the Path. So most of the tests in bioperl-run should
> probably have passed. The same goes for bioperl-network and bioperl-db
> as with my Windows setup.
>
> If my thoughts are totally wrong - let me know!
> Nath
Just looking into the failed Linux tests.

Several of the tests result in errors like:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Alignment::Exonerate::AUTOLOAD
/home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:126
STACK: Bio::Tools::Run::Alignment::Exonerate::new
/home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:154
STACK: t/Exonerate.t:32
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: 'arguments' !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Hmmer::AUTOLOAD Bio/Tools/Run/Hmmer.pm:172
STACK: Bio::Tools::Run::Hmmer::_run Bio/Tools/Run/Hmmer.pm:253
STACK: Bio::Tools::Run::Hmmer::run Bio/Tools/Run/Hmmer.pm:228
STACK: t/Hmmer.t:54
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Phrap::AUTOLOAD Bio/Tools/Run/Phrap.pm:137
STACK: Bio::Tools::Run::Phrap::new Bio/Tools/Run/Phrap.pm:165
STACK: t/Phrap.t:34
-----------------------------------------------------------

Any ideas??

Nath


From hlapp at gmx.net  Wed Oct 18 10:51:36 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 18 Oct 2006 10:51:36 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4536113D.1080307@sheffield.ac.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>
	<4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk>
Message-ID: <E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>


On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote:

>  For example, I have made
> no effort to setup biosql-schema but I thought that maybe there  
> would be
> a test that would detect this

I'm afraid there isn't. Bioperl-db is meaningless without biosql-schema.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bosborne11 at verizon.net  Wed Oct 18 10:43:06 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 10:43:06 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
 GenBank/EMBL/DDBJ
In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>
Message-ID: <C15BB5BA.ADAA%bosborne11@verizon.net>

Chris,

I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all
of the more recent examples in t/LocationFactory.t come from there.

Brian O.


On 10/18/06 9:55 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
> Not sure about UniProt/SwissProt.


From cjfields at uiuc.edu  Wed Oct 18 11:00:30 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 10:00:30 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
In-Reply-To: <C15BB5BA.ADAA%bosborne11@verizon.net>
Message-ID: <002501c6f2c6$27625540$15327e82@pyrimidine>

Do they still use the X.Y notations?  Those are the most troublesome.  I
guess we still don't support the ones containing '?'.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: Brian Osborne [mailto:bosborne11 at verizon.net]
> Sent: Wednesday, October 18, 2006 9:43 AM
> To: Chris Fields; bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in
> GenBank/EMBL/DDBJ
> 
> Chris,
> 
> I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all
> of the more recent examples in t/LocationFactory.t come from there.
> 
> Brian O.
> 
> 
> On 10/18/06 9:55 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:
> 
> > EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
> > Not sure about UniProt/SwissProt.


From Kevin.M.Brown at asu.edu  Wed Oct 18 11:16:50 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 18 Oct 2006 08:16:50 -0700
Subject: [Bioperl-l] Blast information
Message-ID: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>

I just recently upgraded to 1.5.1 on WinXP to bring this version closer
to live to parse some locally created blast files.  I'm trying to find
the method that returns the values that are underneath the Identities
and Positives information as I'm trying to replicate the output of an
old blast parser we have here written in RealBasic which is showing its
age.  Once I have it replicating the old output I then intend to add
more features in terms of filtering returned hits (like not returning
self->self hits or a->b so don't show b->a).

Example:
I'm looking for the methods that will return 117 from identities and 117
from positives.  I can't just use num_identical/percent_identity as that
isn't 100% accurate.

>BurkM_2016
          Length = 241

 Score = 43.2 bits (88), Expect = 7e-005
 Identities = 26/117 (22%), Positives = 51/117 (43%)

Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL 357
           Q   F  F  + A+    ++ +         + + L +R   GL   + P   E + A+L
Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL 170

Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
              A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227

Thanks,
Kevin


From cjfields at uiuc.edu  Wed Oct 18 11:25:59 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 10:25:59 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4536113D.1080307@sheffield.ac.uk>
Message-ID: <002601c6f2c9$b6d04a90$15327e82@pyrimidine>

> I've just added test results for 1.5.2 RC2 to the wiki.
> 
> There are lots of fails for packages other than bioperl-live. I'm not
> sure exactly how the test fails/skips are/should be handled since my
> setups are as follows.
> 
> Clean WinXP Pro:
> This is a clean install of WinXP Pro SP2 with no major software
> installed, other than ActivePerl 5.8.8.819 and a few tools for archive
> extracting, anti virus etc. Therefore, I'm unsure how tests in
> bioperl-network and bioperl-db should return. For example, I have made
> no effort to setup biosql-schema but I thought that maybe there would be
> a test that would detect this, and fail, then skip over other tests
> gracefully - like the bioperl-run tests when a piece of software is not
> installed???
> 
> Debian Linux:
> This is a Bio-Linux machine with quite a lot of bioinformatics software
> installed in the Path. So most of the tests in bioperl-run should
> probably have passed. The same goes for bioperl-network and bioperl-db
> as with my Windows setup.
> 
> If my thoughts are totally wrong - let me know!
> Nath

The bioperl-db tests rely on a local BioSQL database and on having a
properly set up configuration file (these are detailed in the bioperl-db
INSTALL doc).  Furthermore, there are serious problems with bioperl-db and
WinXP (see Bug 1938 in bugzilla).  There is a workaround, but it isn't
perfect by any means.  

http://bugzilla.open-bio.org/show_bug.cgi?id=1938

Many of the bioperl-run tests rely on env. variables being set properly, so
maybe that's why they failed.  These should all be detailed in the INSTALL
file (but maybe they aren't?).

I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac OS
X yet but intended on doing this within the week.  The INSTALL file details
the requirements for the packages (Graph 0.80 is the only one for
bioperl-network, for instance, and there isn't a PPM for that version
available yet).  

It would be nice to skip the tests based on absence of the particular
modules or installed programs, and I think the final goal is to possibly
attempt to do this.  However, all of the bioperl-related distributions have
their own documentation which outline their installation, requirements, and
use.  At least we can point to that, which works for now.  We could always
start up a wiki page for the various bioperl distributions to monitor
problems or issues with each based on OS, proposed enhancements/ideas, etc.
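
A minimal sketch of the skip-on-missing-prerequisite idea, using only core modules (Test::More, File::Spec). The helper names have_module() and find_program() are made up for this illustration and are not part of any BioPerl test library; Graph 0.80 and exonerate are used only because they come up in this thread.

```perl
use strict;
use warnings;
use Test::More;
use File::Spec;

# Return true if the named module loads, optionally at a minimum version.
# (Hypothetical helper, not an existing BioPerl function.)
sub have_module {
    my ($module, $min_version) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return 0 unless eval { require $file; 1 };
    return 1 unless defined $min_version;
    return eval { $module->VERSION($min_version); 1 } ? 1 : 0;
}

# Return the full path of an executable found in PATH, or undef.
# (Hypothetical helper, not an existing BioPerl function.)
sub find_program {
    my ($name) = @_;
    for my $dir (File::Spec->path) {
        my $candidate = File::Spec->catfile($dir, $name);
        return $candidate if -f $candidate && -x _;
    }
    return;
}

SKIP: {
    # Skip, rather than fail, when the required module version is absent.
    skip 'Graph 0.80 or later not installed', 1
        unless have_module('Graph', '0.80');
    ok(Graph->new, 'Graph object created');
}

SKIP: {
    # Skip, rather than fail, when the external program is absent.
    skip 'exonerate not found in PATH', 1
        unless find_program('exonerate');
    pass('exonerate is available; program-dependent tests would run here');
}

done_testing();
```

On Win32 the executable would normally carry an extension (.exe, .bat), so find_program() would also need to consult PATHEXT; that is left out here.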


Also, most (if not all, including core) have been primarily tested on some
*nix-related system, which means that they may not work on Win32 systems.
Though the Windows support is light-years ahead of what it used to be circa
rel 0.7, I don't think it is foolproof yet, as witnessed by the bioperl-db
bug.  Frankly, we need more WinXP users for those packages willing to test
them out and offer suggestions.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bosborne11 at verizon.net  Wed Oct 18 11:13:51 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 11:13:51 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
 GenBank/EMBL/DDBJ
In-Reply-To: <002501c6f2c6$27625540$15327e82@pyrimidine>
Message-ID: <C15BBCEF.ADB8%bosborne11@verizon.net>

Chris,

No, I don't think they use the form X.Y. See below, from
t/LocationFactory.t, we do support most of the forms using ?. Supposedly
these tests accommodate all of the possible fuzzy locations encountered in
Swissprot, I wrote these a year or so ago.

Brian O.


        # UNCERTAIN locations and positions (Swissprot)
   "?2465..2774" => [$fuzzy_impl,
       2465, 2465, "UNCERTAIN", 2774, 2774, "EXACT", "EXACT", 1, 1],
   "22..?64" => [$fuzzy_impl,
       22, 22, "EXACT", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
   "?22..?64" => [$fuzzy_impl,
       22, 22, "UNCERTAIN", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
   "?..>393" => [$fuzzy_impl,
       undef, undef, "UNCERTAIN", 393, undef, "AFTER", "UNCERTAIN", 1, 1],
   "<1..?" => [$fuzzy_impl,
       undef, 1, "BEFORE", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
   "?..536" => [$fuzzy_impl,
       undef, undef, "UNCERTAIN", 536, 536, "EXACT", "UNCERTAIN", 1, 1],
   "1..?" => [$fuzzy_impl,
       1, 1, "EXACT", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
   "?..?" => [$fuzzy_impl,
       undef, undef, "UNCERTAIN", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
   # Not working yet:
   #"12..?1" => [$fuzzy_impl,
   #    1, 1, "UNCERTAIN", 12, 12, "EXACT", "EXACT", 1, 1]
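
To make the table above easier to read, here is a rough sketch of how each endpoint maps to a (min, max, position type) triple. It is a plain-regex illustration written for this note, not the code in Bio::Factory::FTLocationFactory; classify_position() and classify_range() are hypothetical names, the overall location type (the eighth field in the table) is not computed, and the "12..?1" case marked as not working is not handled either.

```perl
use strict;
use warnings;

# Classify one endpoint such as "22", "?64", "<1", ">393" or "?".
# Returns (min, max, type), the same order as in the test table.
sub classify_position {
    my ($pos) = @_;
    return (undef, undef, 'UNCERTAIN') if $pos eq '?';
    return ($1,    $1,    'UNCERTAIN') if $pos =~ /^\?(\d+)$/;
    return (undef, $1,    'BEFORE')    if $pos =~ /^<(\d+)$/;
    return ($1,    undef, 'AFTER')     if $pos =~ /^>(\d+)$/;
    return ($1,    $1,    'EXACT')     if $pos =~ /^(\d+)$/;
    die "Unrecognised position '$pos'\n";
}

# Split a simple "start..end" string and classify both endpoints.
sub classify_range {
    my ($loc) = @_;
    my ($start, $end) = $loc =~ /^([^.]+)\.\.([^.]+)$/
        or die "Not a simple start..end range: '$loc'\n";
    return [ classify_position($start), classify_position($end) ];
}

for my $loc ('?2465..2774', '22..?64', '?..>393', '<1..?') {
    my @fields = map { defined $_ ? $_ : 'undef' } @{ classify_range($loc) };
    print "$loc => @fields\n";
}
```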


On 10/18/06 11:00 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> Do they still use the X.Y notations?  Those are the most troublesome.  I
> guess we still don't support the ones containing '?'.
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
>> -----Original Message-----
>> From: Brian Osborne [mailto:bosborne11 at verizon.net]
>> Sent: Wednesday, October 18, 2006 9:43 AM
>> To: Chris Fields; bioperl-l at lists.open-bio.org
>> Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in
>> GenBank/EMBL/DDBJ
>> 
>> Chris,
>> 
>> I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all
>> of the more recent examples in t/LocationFactory.t come from there.
>> 
>> Brian O.
>> 
>> 
>> On 10/18/06 9:55 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:
>> 
>>> EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
>>> Not sure about UniProt/SwissProt.
> 
> 


From cjfields at uiuc.edu  Wed Oct 18 12:56:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 11:56:07 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <002601c6f2c9$b6d04a90$15327e82@pyrimidine>
Message-ID: <000401c6f2d6$5144e2f0$15327e82@pyrimidine>

All,

...
> I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac
> OS X yet but intended on doing this within the week.  The INSTALL file
> details the requirements for the packages (Graph 0.80 is the only one for
> bioperl-network, for instance, and there isn't a PPM for that version
> available yet).
...

As a follow-up to this, I tried bioperl-network and had similar failed tests
with Graph 0.79 (the only PPM available from ActiveState).  However, the
INSTALL docs state that Graph 0.80 is needed, and the test run gave several
warnings about not having Graph 0.80 installed. 

I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and
everything passed.  Maybe we need to have a Graph PPM available for those
who want bioperl-network?

As for bioperl-run, all tests passed from a new CVS checkout even though I
have none of the programs installed, so they seem to skip properly.  The
test run also printed warnings when a program wasn't available or installed.


Chris


From bosborne11 at verizon.net  Wed Oct 18 13:10:34 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 13:10:34 -0400
Subject: [Bioperl-l] Blast information
In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
Message-ID: <C15BD84A.ADCC%bosborne11@verizon.net>

Kevin,

Are you looking for hsp_length()? See the SearchIO HOWTO for a list of
methods:

http://www.bioperl.org/wiki/HOWTO:SearchIO


Brian O.


On 10/18/06 11:16 AM, "Kevin Brown" <Kevin.M.Brown at asu.edu> wrote:

> I just recently upgraded to 1.5.1 on WinXP to bring this version closer
> to live to parse some locally created blast files.  I'm trying to find
> the method that returns the values that are underneath the Identities
> and Positives information as I'm trying to replicate the output of an
> old blast parser we have here written in RealBasic which is showing its
> age.  Once I have it replicating the old output I then intend to add
> more features in terms of filtering returned hits (like not returning
> self->self hits or a->b so don't show b->a).
> 
> Example:
> I'm looking for the methods that will return 117 from identities and 117
> from positives.  I can't just use num_identical/percent_identity as that
> isn't 100% accurate.
> 
>> BurkM_2016
>           Length = 241
> 
>  Score = 43.2 bits (88), Expect = 7e-005
>  Identities = 26/117 (22%), Positives = 51/117 (43%)
> 
> Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL
> 357
>            Q   F  F  + A+    ++ +         + + L +R   GL   + P   E + A+L
> Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL
> 170
> 
> Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
>               A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
> Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227
> 
> Thanks,
> Kevin
> 


From Kevin.M.Brown at asu.edu  Wed Oct 18 17:25:48 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 18 Oct 2006 14:25:48 -0700
Subject: [Bioperl-l] Blast information
Message-ID: <1A4207F8295607498283FE9E93B775B4022A71C3@EX02.asurite.ad.asu.edu>

Yes, that does indeed look like what I was after. 

> -----Original Message-----
> From: Brian Osborne [mailto:bosborne11 at verizon.net] 
> Sent: Wednesday, October 18, 2006 10:11 AM
> To: Kevin Brown; bioperl-l
> Subject: Re: [Bioperl-l] Blast information
> 
> Kevin,
> 
> Are you looking for hsp_length()? See the SearchIO HOWTO for a list of
> methods:
> 
> http://www.bioperl.org/wiki/HOWTO:SearchIO
> 
> 
> Brian O.
> 
> 


From n.appleby at uq.edu.au  Wed Oct 18 17:58:06 2006
From: n.appleby at uq.edu.au (Nikki Appleby)
Date: Thu, 19 Oct 2006 07:58:06 +1000
Subject: [Bioperl-l] CONTIG dealing
Message-ID: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>


I have just entered the wonderful new world of BioPerl, so the answer to my
question may be obvious to any of the gurus reading this.

I need to collect sequence features and ontology annotations. Here goes.

I am retrieving sequences from SwissProt via Bio::DB::SwissProt and
get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into an RDBMS
format that I am happy with I can get at the xref ids. In this case, they
are 

AP003451; BAB86144.1; -; Genomic_DNA. 
AP008207; BAF07116.1; -; Genomic_DNA. 
AB103395; BAC81207.1; -; mRNA. 

I can happily go off and fetch those from Bio::DB::GenBank (first column),
and Bio::DB::GenPept (second). All good, except...

AP008207 is a contig. I don't want to get all of the features for the entire
thing, just the single contig that actually matches the original sequence.
It takes a couple of hours to get at it and then it gives me way too much.

I will come across this problem with other sequences. How do I (a) find out
if it is a contig without downloading it in its entirety and (b) extract
the list of sequences that are about to be contigged together?

I have searched the web for answers, including this list, but see nothing.
Help!
 
Nikki Appleby.


From bosborne11 at verizon.net  Wed Oct 18 20:54:04 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 20:54:04 -0400
Subject: [Bioperl-l] LocatableSeq object vs Sequence Object
In-Reply-To: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com>
Message-ID: <C15C44EC.ADF8%bosborne11@verizon.net>

Peter,

I'm not understanding your question, partly because your letter and your
code are saying different things. You say you want to call
location_from_column() but your code shows you calling species(). What
happens when you call location_from_column? Do you see errors?

Brian O.


On 10/17/06 12:26 PM, "Peter H. Baenziger" <plu5even at gmail.com> wrote:

> I was thinking I could use:
> foreach $seq ($alignment->each_seq())
> to loop through the sequences and call:
> $seq->location_from_column($pos)
> on each of the sequences.  


From cjfields at uiuc.edu  Wed Oct 18 22:46:14 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 21:46:14 -0500
Subject: [Bioperl-l] CONTIG dealing
In-Reply-To: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>
References: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>
Message-ID: <FAEAE9E1-EF95-4B79-AD75-B54D3E24E827@uiuc.edu>

On Oct 18, 2006, at 4:58 PM, Nikki Appleby wrote:

>
> I have just entered the wonderful new world of BioPerl, so the  
> answer to my
> question may be obvious to any of the gurus reading this.
>
> I need to collect sequence features and ontology annotations. Here  
> goes.
>
> I am retrieving sequences from SwissProt via Bio::DB::SwissProt and
> get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into  
> an RDBMS
> format that I am happy with I can get at the xref ids. In this  
> case, they
> are
>
> AP003451; BAB86144.1; -; Genomic_DNA.
> AP008207; BAF07116.1; -; Genomic_DNA.
> AB103395; BAC81207.1; -; mRNA.
>
> I can happily go off and fetch those from Bio::DB::GenBank (first  
> column),
> and Bio::DB::GenPept (second). All good, except...
>
> AP008207 is a contig. I don't want to get all of the features for  
> the entire
> thing, just the single contig that actually matches the original  
> sequence.
> It takes a couple of hours to get at it and then it gives me way  
> too much.
>
> I will come across this problem with other sequences. How do I (a)  
> find out
> if it is a contig without downloading it in its entirety and (b)  
> extract
> the list of sequences that are about to be contigged together.
>
> I have searched the web for answers, including this list, but see  
> nothing.
> Help!
>
> Nikki Appleby.

The default setting for the retrieval format for GenBank is  
'gbwithparts' (which gets the full sequence at all times).  You can  
set this to 'gb' using request_format() to retrieve the sequence file  
with the contig information instead of the sequence, if it contains  
such (otherwise it just retrieves the sequence anyway).
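
As a rough illustration of what "contig information instead of the sequence" looks like, the sketch below pulls the component accessions out of the CONTIG block of a GenBank flat-file record that has already been fetched as text. It uses plain Perl only; contig_components() is a hypothetical helper, and the record shown is a made-up, heavily shortened placeholder rather than a real entry.

```perl
use strict;
use warnings;

# Return the component accessions named in a record's CONTIG block,
# or an empty list if the record has no CONTIG block.
# (Hypothetical helper written for this illustration.)
sub contig_components {
    my ($record) = @_;

    # The block starts at the CONTIG keyword and runs over any indented
    # continuation lines, up to the next line that starts in column one.
    my ($block) = $record =~ /^CONTIG\s+(.*?)(?=^\S|\z)/ms
        or return;
    $block =~ s/\s+//g;    # undo the line wrapping inside join(...)

    # Each component looks like ACCESSION.VERSION:start..end
    return $block =~ /([A-Z_]+\d+(?:\.\d+)?):\d+\.\.\d+/g;
}

# Made-up record text; the accessions are placeholders, not real entries.
my $record = join "\n",
    'LOCUS       XX000000  300 bp  DNA  linear  CON 01-JAN-2000',
    'CONTIG      join(XX000001.1:1..100,gap(50),',
    '            complement(XX000002.2:1..150))',
    '//',
    '';

my @parts = contig_components($record);
print @parts ? "contig built from: @parts\n" : "not a contig record\n";
```

An empty return list answers question (a), and the list itself answers (b), under the assumption that the record really does carry a CONTIG block.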

However, I have noticed this particular file does not represent a  
true contig record but is the entire chromosome sequence.  The contig  
information is in the comments section, probably b/c the record is  
converted over.  You could just download the sequence record and run  
regexp to grab the comments section, then parse out the contigs (a  
pain) if you really want that.  Or you could try to find the  
equivalent GenBank record, such as the ones derived from the WGS  
records.

I did notice the list of dbxrefs in your swissprot record indicate  
three EMBL sequences.  If the order is consistent for the SwissProt  
entries you want, they probably represent:

The contig (what you want): AP003451; BAB86144.1; -; Genomic_DNA.
The supercontig (chromosome) : AP008207; BAF07116.1; -; Genomic_DNA.
The cDNA : AB103395; BAC81207.1; -; mRNA.

I checked the first one (AP003451), which seems to confirm this.

Since the chromosome supercontig is built from the smaller sequence  
contigs you could just grab the first EMBL dbxref instead of all of  
them.  It parses much faster than the chromosome file.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Wed Oct 18 11:47:14 2006
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Oct 2006 08:47:14 -0700
Subject: [Bioperl-l] Blast information
In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
References: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
Message-ID: <6B7D24F3-69F1-498D-AB53-B4CEB14E4F3D@bioperl.org>

I think this will work for you.

The seq_inds method parses the middle homology sequence, classifies
each alignment column, and returns a list of the columns meeting the
criteria.  You can interrogate either query or hit in this case, since
you are requiring the columns to be identical.

my $identicalbases = scalar $hsp->seq_inds('query', 'identical');
my $conservedbases =  scalar $hsp->seq_inds('query','conserved');

'conserved' returns the columns that are identical or conserved; if you want
just those with conservative replacements, use 'conserved-not-identical'.

See http://bioperl.org/wiki/HOWTO:SearchIO#Table_of_Methods for more  
info.
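
To show what "classifying each alignment column" means, here is a small stand-alone sketch that counts column classes directly from a BLASTP-style midline (a letter marks an identity, '+' a conservative replacement, a space anything else). It is not how seq_inds is implemented; classify_midline() is a hypothetical name, and the input is just the first ten columns of the second alignment block quoted in this thread.

```perl
use strict;
use warnings;

# Count column classes in a BLASTP-style midline string.
# (Hypothetical helper written for this illustration.)
sub classify_midline {
    my ($midline) = @_;
    my %count = (identical => 0, conserved_not_identical => 0, other => 0);
    for my $char (split //, $midline) {
        if    ($char eq ' ') { $count{other}++ }
        elsif ($char eq '+') { $count{conserved_not_identical}++ }
        else                 { $count{identical}++ }
    }
    # "Positives" in the BLAST report covers identities plus '+' columns.
    $count{positives} = $count{identical} + $count{conserved_not_identical};
    $count{columns}   = length $midline;
    return \%count;
}

# First ten columns of the second block:
#   Query: MRKAQSEGVS
#   Sbjct: KHAAKERGIA
my $c = classify_midline('   A+  G++');
printf "Identities = %d/%d, Positives = %d/%d\n",
    $c->{identical}, $c->{columns}, $c->{positives}, $c->{columns};
```

The denominator here is simply the number of alignment columns, which appears to be the value (117 in the full report) that the original question was after.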

-jason
On Oct 18, 2006, at 8:16 AM, Kevin Brown wrote:

> I just recently upgraded to 1.5.1 on WinXP to bring this version  
> closer
> to live to parse some locally created blast files.  I'm trying to find
> the method that returns the values that are underneath the Identities
> and Positives information as I'm trying to replicate the output of an
> old blast parser we have here written in RealBasic which is showing  
> its
> age.  Once I have it replicating the old output I then intend to add
> more features in terms of filtering returned hits (like not returning
> self->self hits or a->b so don't show b->a).
>
> Example:
> I'm looking for the methods that will return 117 from identities  
> and 117
> from positives.  I can't just use num_identical/percent_identity as  
> that
> isn't 100% accurate.
>
>> BurkM_2016
>           Length = 241
>
>  Score = 43.2 bits (88), Expect = 7e-005
>  Identities = 26/117 (22%), Positives = 51/117 (43%)
>
> Query: 298  
> QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL
> 357
>            Q   F  F  + A+    ++ +         + + L +R   GL   + P   E +  
> A+L
> Sbjct: 111  
> QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL
> 170
>
> Query: 358  
> MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
>               A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
> Sbjct: 171  
> KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227
>
> Thanks,
> Kevin
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From jason at bioperl.org  Thu Oct 19 01:00:28 2006
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Oct 2006 22:00:28 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
Message-ID: <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>

So I'm unsure what we should do here.

We can certainly fix the problem which you report which is relying on  
the "" method -- if you were to do instead:
print $_->database, ":", $_->primary_id, "\n";

you'll get the right answer.  At a minimum we can just fix the
auto-string converting method to do The Right Thing.

But I am not sure if we should keep the version out of the primary_id  
field.  This will require some rejiggering in several modules when it  
comes to printing DBlinks and I don't want to do this before the  
release. I also am not sure if there was an explicit reason why  
someone did put the version information in the primary_id. (I hope it  
wasn't me because I don't think I'm going to remember why).

Does anyone else have a strong feeling?

-jason
On Oct 17, 2006, at 12:01 PM, Erikjan wrote:

> Hello,
>
> I noticed a little problem with the Annotation "DBLink" from  
> GenBank entries
>
> When I run:
>
> perl -MBio::DB::GenBank -e 'my $gi =
> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations 
> ("dblink");
> for(@annotations) { print $_, "\n";} print $INC{
> "Bio/Annotation/DBLink.pm" }, "\n"; '
>
> This yields:
>
>    GenBank:AL591065.17.17
>
> and the place where the used Bio/Annotation/DBLink.pm resides.
>
> Can others repeat this?
>
> I have dug into the source a little and Bio::Annotation::DBLink  
> seems to
> be the place where this happens: it has a concatenation which leads to
> that repeated version number.
>
> Is this something that I should fix "client-side", so to speak, or  
> is it
> worthwhile to add some logic to that concatenation to prevent this?
>
>
> Thanks,
>
> Eric
>
>
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From n.haigh at sheffield.ac.uk  Thu Oct 19 02:41:02 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 19 Oct 2006 07:41:02 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <000401c6f2d6$5144e2f0$15327e82@pyrimidine>
References: <000401c6f2d6$5144e2f0$15327e82@pyrimidine>
Message-ID: <45371DFE.6050306@sheffield.ac.uk>


> As a followup in this, I tried bioperl-network and had similar failed tests
> with Graph 0.79 (the only PPM available from ActiveState).  However, the
> INSTALL docs state that Graph 0.80 is needed, and the test run gave several
> warnings about not having Graph 0.80 installed. 
>
> I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and
> everything passed.  Maybe we need to have a Graph PPM available for those
> who want bioperl-network?
>
> As for bioperl-run, all tests passed from a new CVS checkout even though I
> have none of the programs installed, so they seem to skip properly.  The
> test run also printed warnings when a program wasn't available or installed.
>
>
> Chris
>
>   
If you can pop the Graph 0.80 ppd files in DIST/RC/ I'll make 
modifications to integrate them into the package.xml file for PPM4 clients.

Nath


From n.haigh at sheffield.ac.uk  Thu Oct 19 06:40:21 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 19 Oct 2006 11:40:21 +0100
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
Message-ID: <45375615.1020603@sheffield.ac.uk>

Should line 25 read:
require Bio::Factory::EMBOSS;

instead of:
require Bio::EMBOSS::Factory;

Nath


From hlapp at gmx.net  Thu Oct 19 09:56:05 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 19 Oct 2006 09:56:05 -0400
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
Message-ID: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>

Here is the overload code:

use overload '""' => sub {
	(($_[0]->database ? $_[0]->database . ':' : '' )
	. ($_[0]->primary_id ? $_[0]->primary_id : '')
	. ($_[0]->version ? '.' . $_[0]->version : ''))
	|| '' };

Except that the last '||' is redundant and unnecessary (it either  
does nothing or replaces an empty string with an empty string), I  
don't see the potential for duplicating the version number here -  
unless primary_id() did that, which I don't see it doing.

So, to me this seems to come from a parsing error in the beginning,  
rather than an erroneous mangling of version into primary_id later.
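
A stand-alone sketch of that reasoning: the same concatenation logic, applied to a toy class, gives the doubled version only when primary_id already carries the version. My::DBLinkSketch is a hypothetical stand-in that stores plain hash fields; it is not Bio::Annotation::DBLink, and the redundant trailing '||' is left out.

```perl
use strict;
use warnings;

package My::DBLinkSketch;    # hypothetical stand-in, not Bio::Annotation::DBLink

# Same database:primary_id.version concatenation as the overload quoted above.
use overload '""' => sub {
    my ($self) = @_;
    return ($self->{database}   ? $self->{database} . ':' : '')
         . ($self->{primary_id} ? $self->{primary_id}     : '')
         . ($self->{version}    ? '.' . $self->{version}  : '');
}, fallback => 1;

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

package main;

# primary_id parsed without the version: stringifies as expected.
my $clean = My::DBLinkSketch->new(
    database => 'GenBank', primary_id => 'AL591065', version => 17);

# primary_id parsed with the version still attached: the version is doubled.
my $doubled = My::DBLinkSketch->new(
    database => 'GenBank', primary_id => 'AL591065.17', version => 17);

print "$clean\n";      # GenBank:AL591065.17
print "$doubled\n";    # GenBank:AL591065.17.17
```

If the parser is indeed storing "AL591065.17" as the primary_id, the stringification is behaving as written and the fix belongs at parse time.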

Is someone in the position to confirm this?

	-hilmar

On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:

> So I'm unsure what we should do here.
>
> We can certainly fix the problem which you report which is relying on
> the "" method -- if you were to do instead:
> print $_->database, ":", $_->primary_id, "\n";
>
> you'll get the right answer.  At a minimum we can just fix the
> auto-string converting method to do The Right Thing.
>
> But I am not sure if we should keep the version out of the primary_id
> field.  This will require some rejiggering in several modules when it
> comes to printing DBlinks and I don't want to do this before the
> release. I also am not sure if there was an explicit reason why
> someone did put the version information in the primary_id. (I hope it
> wasn't me because I don't think I'm going to remember why).
>
> Does anyone else have a strong feeling?
>
> -jason
> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>
>> Hello,
>>
>> I noticed a little problem with the Annotation "DBLink" from
>> GenBank entries
>>
>> When I run:
>>
>> perl -MBio::DB::GenBank -e 'my $gi =
>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>> ("dblink");
>> for(@annotations) { print $_, "\n";} print $INC{
>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>
>> This yields:
>>
>>    GenBank:AL591065.17.17
>>
>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>
>> Can others repeat this?
>>
>> I have dug into the source a little and Bio::Annotation::DBLink
>> seems to
>> be the place where this happens: it has a concatenation which  
>> leads to
>> that repeated version number.
>>
>> Is this something that I should fix "client-side", so to speak, or
>> is it
>> worthwhile to add some logic to that concatenation to prevent this?
>>
>>
>> Thanks,
>>
>> Eric
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From dmessina at wustl.edu  Thu Oct 19 09:55:31 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 19 Oct 2006 08:55:31 -0500
Subject: [Bioperl-l] missing documentation (request for help)
Message-ID: <69453D5F-7794-4DC7-BAE1-A8B2191752E6@wustl.edu>

Hi all,

There are a few modules missing a one-line description, and by one- 
line description, I'm referring to the part that comes after the  
module name in the POD.

e.g. in

=head1 NAME

Bio::SearchIO - Driver for parsing Sequence Database Searches
(BLAST, FASTA, ...)

=head1 SYNOPSIS

[etc...]

"Driver for parsing Sequence Database Searches (BLAST, FASTA, ...)"  
is the one-line description (even though it falls onto two lines) :).
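
For anyone who wants to check their own modules, here is a rough sketch of how the NAME section could be scanned for a missing description. It assumes the usual "Module::Name - description" layout; the helper name name_description is made up for illustration and is not part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper: return the one-line description found in the
# NAME section of a chunk of POD, or undef if there is none.
sub name_description {
    my ($pod) = @_;

    # Grab everything between "=head1 NAME" and the next POD command.
    my ($name_section) = $pod =~ /^=head1\s+NAME\s*\n(.*?)(?=^=\w|\z)/ms
        or return;

    # Expect "Module::Name - description", possibly wrapped over lines.
    my ($desc) = $name_section =~ /^\s*[\w:]+\s+-\s+(\S.*?)\s*\z/s
        or return;

    $desc =~ s/\s*\n\s*/ /g;    # join a wrapped description into one line
    return $desc;
}

my $pod = "=head1 NAME\n\nBio::SearchIO - Driver for parsing Sequence "
        . "Database Searches\n(BLAST, FASTA, ...)\n\n=head1 SYNOPSIS\n";

my $desc = name_description($pod);
print defined $desc ? $desc : 'MISSING', "\n";
```

A module whose NAME section holds only the module name (no " - description" part) makes the helper return undef, which is the case the list below is about.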

I fixed the modules that I knew something about, but there are some I  
haven't used. Perhaps the author, or someone else familiar with these  
modules, could fill in an appropriate short description?

Here is the list of affected modules:
Bio::DB::Expression
Bio::Expression::Contact
Bio::Expression::DataSet
Bio::Expression::Platform
Bio::Expression::Sample
Bio::Search::Processor
Bio::DB::EUtilities::ElinkData
Bio::DB::GFF::Adaptor::memory::feature_serializer
Bio::DB::SeqFeature::Store::DBI::Iterator
Bio::Expression::FeatureGroup::FeatureGroupMas50
Bio::Expression::FeatureSet::FeatureSetMas50
Bio::Matrix::PSM::PsmHeaderI
Bio::OntologyIO::Handlers::BaseSAXHandler

Some of these are missing other POD parts as well -- please add those  
too if you can.


Thanks,
Dave


From mckays at cshl.edu  Thu Oct 19 09:51:18 2006
From: mckays at cshl.edu (Sheldon McKay)
Date: Thu, 19 Oct 2006 09:51:18 -0400
Subject: [Bioperl-l] chromosome ideograms
Message-ID: <6b0de00426b3c04b0d0d7641bc8e14e3@cshl.edu>

Hi,

Sorry for the late reply.  I have been working on a karyotype drawing 
tool as part of the Generic Genome Browser that may be useful.  In 
addition to drawing features next to chromosome ideograms, it also 
supports making chromosome 'bands' from any kind of scored features to 
create a sort of heat map on the chromosome itself.

I have a demo running at

http://mckay.cshl.edu/cgi-bin/gbrowse_karyotype

and the source is available from the GMOD CVS HEAD 
http://www.gmod.org/cvs

Sheldon

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Sheldon McKay, PhD
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724


From n.haigh at sheffield.ac.uk  Thu Oct 19 11:37:31 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 19 Oct 2006 15:37:31 +0000
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
In-Reply-To: <45375615.1020603@sheffield.ac.uk>
References: <45375615.1020603@sheffield.ac.uk>
Message-ID: <45379BBB.1040400@sheffield.ac.uk>

Thanks for committing that change, Brian. Now that the tests proceed past this
point, I get the following error:

------------- EXCEPTION: Bio::Root::NotImplemented -------------
MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not
implemented by package Bio::Tools::Run::EMBOSSApplication.
This is not your fault - author of Bio::Tools::Run::EMBOSSApplication
should be blamed!

STACK: Error::throw
STACK: Bio::Root::Root::throw
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350
STACK: Bio::Root::RootI::throw_not_implemented
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522
STACK: Bio::Tools::Run::WrapperBase::program_dir
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346
STACK: Bio::Tools::Run::WrapperBase::program_path
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327
STACK: Bio::Tools::Run::WrapperBase::executable
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297
STACK: t/EMBOSS.t:58
----------------------------------------------------------------


From N.Haigh at sheffield.ac.uk  Thu Oct 19 11:03:00 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 16:03:00 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45379BBB.1040400@sheffield.ac.uk>
References: <45375615.1020603@sheffield.ac.uk>
	<45379BBB.1040400@sheffield.ac.uk>
Message-ID: <1161270180.453793a432e4f@webmail.shef.ac.uk>

I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be
consistent with other tests.

Failing that - Is there a good test writing style I should follow in one of the other test files?

Thanks
Nathan


From bosborne11 at verizon.net  Thu Oct 19 11:06:08 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 19 Oct 2006 11:06:08 -0400
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
In-Reply-To: <45379BBB.1040400@sheffield.ac.uk>
Message-ID: <C15D0CA0.AE2C%bosborne11@verizon.net>

Nathan,

Yes, I see. Those EMBOSS programs work a bit differently from the typical
app run by bioperl-run, there's no need for WrapperBase methods like
program_dir(), executable(), it seems. Well, I can try and take a look at
this tonight but there's probably someone better suited to this than me,
I've spent very little time with bioperl-run. Volunteer?

Brian O.


On 10/19/06 11:37 AM, "Nathan S. Haigh" <n.haigh at sheffield.ac.uk> wrote:

> Thanks for committing that change Brian. Now the tests proceed from this
> point, I get the following error:
> 
> ------------- EXCEPTION: Bio::Root::NotImplemented -------------
> MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not
> implemented by package Bio::Tools::Run::EMBOSSApplication.
> This is not your fault - author of Bio::Tools::Run::EMBOSSApplication
> should be blamed!
> 
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350
> STACK: Bio::Root::RootI::throw_not_implemented
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522
> STACK: Bio::Tools::Run::WrapperBase::program_dir
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346
> STACK: Bio::Tools::Run::WrapperBase::program_path
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327
> STACK: Bio::Tools::Run::WrapperBase::executable
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297
> STACK: t/EMBOSS.t:58
> ----------------------------------------------------------------


From niels at genomics.dk  Thu Oct 19 11:16:37 2006
From: niels at genomics.dk (Niels Larsen)
Date: Thu, 19 Oct 2006 17:16:37 +0200
Subject: [Bioperl-l] From EBI support re WU-Blast SOAP service
In-Reply-To: <4535EBF9.1090706@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>
	<4535EBF9.1090706@sendu.me.uk>
Message-ID: <453796D5.2070808@genomics.dk>

Sendu Bala wrote:
>> I invoked the EBI script
>>
>> http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip
>>
>> like this
>>
>> WSWUBlastClient.pl -p blastn -D embl test.fasta
>>
>> where the content of test.fasta is below, and got
>>
>> Can't find method element in the message at 
>> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.
> 
> As you admit, this is not a Bioperl issue. I would suggest you contact 
> EBI support.
> 

To use EBI's WU-blast SOAP interface from perl, EBI support
says one must use SOAP::Lite v 0.60 (no later version)
and include '--email you.example.com' on the command line.
This is evident neither from their web pages nor from the script
usage statement, but they promised to fix it.

------------------------------------------------------------------------

Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark

Telephone: +45-8942-5268
Telefax: +45-8620-1222

------------------------------------------------------------------------


From cjfields at uiuc.edu  Thu Oct 19 11:31:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 10:31:45 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45371DFE.6050306@sheffield.ac.uk>
Message-ID: <001501c6f393$b66bd4a0$15327e82@pyrimidine>

> > As a followup in this, I tried bioperl-network and had similar failed
> tests
> > with Graph 0.79 (the only PPM available from ActiveState).  However, the
> > INSTALL docs state that Graph 0.80 is needed, and the test run gave
> several
> > warnings about not having Graph 0.80 installed.
> >
> > I made a PPM of Graph 0.80, installed, retried bioperl-network tests,
> and
> > everything passed.  Maybe we need to have a Graph PPM available for
> those
> > who want bioperl-network?
> >
> > As for bioperl-run, all tests passed from a new CVS checkout even though
> I
> > have none of the programs installed, so they seem to skip properly.  The
> > test run also printed warnings when a program wasn't available or
> installed.
> >
> >
> > Chris
> >
> >
> If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make
> modifications to integrate them into the package.xml file for PPM4
> clients.
> 
> Nath

Will do.  Should these be forwarded to Mauricio?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From N.Haigh at sheffield.ac.uk  Thu Oct 19 11:38:05 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 16:38:05 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <001501c6f393$b66bd4a0$15327e82@pyrimidine>
References: <001501c6f393$b66bd4a0$15327e82@pyrimidine>
Message-ID: <1161272285.45379bdd1aea4@webmail.shef.ac.uk>


> > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make
> > modifications to integrate them into the package.xml file for PPM4
> > clients.
> > 
> > Nath
> 
> Will do.  Should these be forwarded to Mauricio?
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 


If you don't have access to the web, you can send them to me - I now have an account on that server.

Cheers
Nath


From cjfields at uiuc.edu  Thu Oct 19 11:45:00 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 10:45:00 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk>
Message-ID: <001601c6f395$8a752ed0$15327e82@pyrimidine>

> I thought I'd have my first proper try at writing some tests. I was
> wondering if there is a template test file that I should use/study in
> order to be
> consistent with other tests.
> 
> Failing that - Is there a good test writing style I should follow in one
> of the other test files?
> 
> Thanks
> Nathan

I would start with the Test::Simple and Test::More perldoc; they're pretty
self-explanatory.  You can look at the various test suites using Test::More
as well for pointers.  By far, most tests will use is().  You can use SKIP
blocks to skip tests that have a requirement, or skip all tests if they all
require something.  Pretty flexible.
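
As a minimal sketch of those two pieces (is() and a SKIP block), something like the following should work. The revcom() sub is only a stand-in for whatever method is really under test, and the two ok() calls are placeholders for real remote checks; only BIOPERLDEBUG comes from our existing test conventions.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Stand-in for a real method under test (hypothetical, for illustration).
sub revcom {
    my ($seq) = @_;
    # Scalar-context reverse flips the string, tr/// complements it.
    ( my $rc = reverse $seq ) =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# is() compares got vs. expected and reports both values on failure.
is( revcom('AACG'), 'CGTT', 'revcom() reverse-complements a sequence' );

SKIP: {
    # Skip the two tests in this block unless remote tests were requested.
    skip( 'remote tests need BIOPERLDEBUG set', 2 )
        unless $ENV{BIOPERLDEBUG};

    # Placeholders for tests that would actually hit a remote database.
    ok( 1, 'connected to remote server' );
    ok( 1, 'retrieved a record' );
}
```

The count passed to skip() has to match the number of tests in the block, so that the plan still adds up whether or not the block runs.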

We should probably get a wiki page for the developers underway, maybe a
HOWTO on writing tests.  At least have these focus on BioPerl, OOP, remote
DB tests, etc.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Thu Oct 19 12:23:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 11:23:40 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
Message-ID: <001b01c6f39a$f0288ba0$15327e82@pyrimidine>

> Here is the overload code:
> 
> use overload '""' => sub {
> 	(($_[0]->database ? $_[0]->database . ':' : '' )
> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
> 	|| '' };
> 
> Except that the last '||' is redundant and unnecessary (it either
> does nothing or replaces an empty string with an empty string), I
> don't see the potential for duplicating the version number here -
> unless primary_id() did that, which I don't see it doing.
> 
> So, to me this seems to come from a parsing error in the beginning,
> rather than an erroneous mangling of version into primary_id later.
> 
> Is someone in the position to confirm this?
> 
> 	-hilmar

I have attached a script to the bug report on bugzilla, as well as the test
output sequence and the actual GenBank record.  There are a number of
problems:

1)  primary_id() is assigned both the id and version.
2)  version() is still assigned the version.

The above explains the repeated version seen when printing the object directly
using the overload (it concatenates them).
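
To illustrate, here is a small self-contained sketch that reproduces the effect. MockDBLink is a made-up stand-in, not the real Bio::Annotation::DBLink; it only copies the stringification logic from the overload quoted above, and the accession/version values are taken from the original report.

```perl
use strict;
use warnings;

# Minimal mock of a DBLink-like object (NOT the real Bio::Annotation::DBLink),
# using the same stringification logic as the quoted overload.
package MockDBLink;

use overload '""' => sub {
    my $self = shift;
    return ( $self->{database}   ? $self->{database} . ':'  : '' )
         . ( $self->{primary_id} ? $self->{primary_id}      : '' )
         . ( $self->{version}    ? '.' . $self->{version}   : '' );
};

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

package main;

my ( $id, $version ) = ( 'AL591065', 17 );

# Version folded into primary_id and also stored separately,
# as described in points 1) and 2) above.
my $as_parsed = MockDBLink->new(
    database   => 'GenBank',
    primary_id => "$id.$version",
    version    => $version,
);

# Keeping the version out of primary_id avoids the repeat.
my $separated = MockDBLink->new(
    database   => 'GenBank',
    primary_id => $id,
    version    => $version,
);

print "$as_parsed\n";    # GenBank:AL591065.17.17
print "$separated\n";    # GenBank:AL591065.17
```

Either the parser or the overload could be changed to avoid the duplication; the sketch only shows why the two together produce it.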

However, there are a few more issues.  The ID is printed normally
(accession.version), but the source DB is not present when SeqIO handles the
sequence.  I have attached the output and the original GenBank record to the
bug report.  

I can look into it but it won't be today; got my hands full with enzyme
assays. 

Chris


From N.Haigh at sheffield.ac.uk  Thu Oct 19 12:50:57 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 17:50:57 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine>
References: <001601c6f395$8a752ed0$15327e82@pyrimidine>
Message-ID: <1161276657.4537acf1edc80@webmail.shef.ac.uk>

Quoting Chris Fields <cjfields at uiuc.edu>:

> > I thought I'd have my first proper try at writing some tests. I was
> > wondering if there is a template test file that I should use/study in
> > order to be
> > consistent with other tests.
> > 
> > Failing that - Is there a good test writing style I should follow in one
> > of the other test files?
> > 
> > Thanks
> > Nathan
> 
> I would start with the Test::Simple and Test::More perldoc; they're pretty
> self-explanatory.  You can look at the various test suites using Test::More
> as well for pointers.  By far, most tests will use is().  You can use SKIP
> blocks to skip tests that have a requirement, or skip all tests if they all
> require something.  Pretty flexible.
> 
> We should probably get a wiki page for the developers underway, maybe a
> HOWTO on writing tests.  At least have these focus on BioPerl, OOP, remote
> DB tests, etc.
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 
> 
> 


Just working through some test things now. I thought I'd start on the bioperl-run stuff as I thought it might be a bit more straightforward; I'm
familiar with some of them and they seem to get neglected.

I'm heavily commenting my tests with the thought of starting a wiki guide to testing Bioperl modules. See how far I get!

Nath


From hlapp at gmx.net  Thu Oct 19 13:11:27 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 19 Oct 2006 13:11:27 -0400
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
	<0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
	<8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
Message-ID: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>

Actually you did that Jason: http://tinyurl.com/ye2edk

Apparently the motivation was to "parse swissprot fields in genpept  
file (dbsource)"?

It clearly looks wrong to add the version. You've probably had a  
reason why you did this at the time but if we (you :) can't recover  
that I guess it's best to just fix it to do the right thing (in both  
places obviously).

	-hilmar

On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote:

> Well there is explicit addition of the version to the primary id so  
> it isn't so much a parsing error as a deliberate decision to append  
> it.
> see Bio::SeqIO::genbank
>
> to make the dblink
>                                               $annotation- 
> >add_Annotation
>                                                     ('dblink',
>                                                       
> Bio::Annotation::DBLink->new
>                                                      (-primary_id  
> => $id . "." . $version,
>                                                       -version =>  
> $version,
>                                                       -database =>  
> $db,
>                                                       -tagname =>  
> 'dblink'));
>
> and the code to print the dblink back out in the writer already  
> assumes the version number is appended...
>
>         foreach my $ref ( $seq->annotation->get_Annotations 
> ('dblink') ) {
>             # if ($ref->comment eq 'DBSOURCE') {
>             $self->_print('DBSOURCE    accession ',
>                           $ref->primary_id, "\n");
>             # }
>         }
>
> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:
>
>> Here is the overload code:
>>
>> use overload '""' => sub {
>> 	(($_[0]->database ? $_[0]->database . ':' : '' )
>> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
>> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
>> 	|| '' };
>>
>> Except that the last '||' is redundant and unnecessary (it either  
>> does nothing or replaces an empty string with an empty string), I  
>> don't see the potential for duplicating the version number here -  
>> unless primary_id() did that, which I don't see it doing.
>>
>> So, to me this seems to come from a parsing error in the  
>> beginning, rather than an erroneous mangling of version into  
>> primary_id later.
>>
>> Is someone in the position to confirm this?
>>
>> 	-hilmar
>>
>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>>
>>> So I'm unsure what we should do here.
>>>
>>> We can certainly fix the problem which you report which is  
>>> relying on
>>> the "" method -- if you were to do instead:
>>> print $_->database, ":", $_->primary_id, "\n";
>>>
>>> you'll get the right answer.  We can at a minimum just fix the
>>> auto-string converting method to do The Right Thing.
>>>
>>> But I am not sure if we should keep the version out of the  
>>> primary_id
>>> field.  This will require some rejiggering in several modules  
>>> when it
>>> comes to printing DBlinks and I don't want to do this before the
>>> release. I also am not sure if there was an explicit reason why
>>> someone did put the version information in the primary_id. (I  
>>> hope it
>>> wasn't me because I don't think I'm going to remember why).
>>>
>>> Does anyone else have a strong feeling?
>>>
>>> -jason
>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>>
>>>> Hello,
>>>>
>>>> I noticed a little problem with the Annotation "DBLink" from
>>>> GenBank entries
>>>>
>>>> When I run:
>>>>
>>>> perl -MBio::DB::GenBank -e 'my $gi =
>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my  
>>>> $seqio =
>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>>>> ("dblink");
>>>> for(@annotations) { print $_, "\n";} print $INC{
>>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>>>
>>>> This yields:
>>>>
>>>>    GenBank:AL591065.17.17
>>>>
>>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>>
>>>> Can others repeat this?
>>>>
>>>> I have dug into the source a little and Bio::Annotation::DBLink
>>>> seems to
>>>> be the place where this happens: it has a concatenation which  
>>>> leads to
>>>> that repeated version number.
>>>>
>>>> Is this something that I should fix "client-side", so to speak, or
>>>> is it
>>>> worthwhile to add some logic to that concatenation to prevent this?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Eric
>>>>
>>>>
>>>>
>>>
>>> --
>>> Jason Stajich, PhD
>>> Miller Research Fellow
>>> University of California
>>> Dept of Plant and Microbial Biology
>>> 321 Koshland Hall #3102
>>> Berkeley, CA 94720-3102
>>> lab: 510.642.8441
>>> http://pmb.berkeley.edu/~taylor/people/js.html
>>>
>>>
>>>
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From N.Haigh at sheffield.ac.uk  Thu Oct 19 13:17:33 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 18:17:33 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine>
References: <001601c6f395$8a752ed0$15327e82@pyrimidine>
Message-ID: <1161278253.4537b32dd3d15@webmail.shef.ac.uk>

Quoting Chris Fields <cjfields at uiuc.edu>:

> > I thought I'd have my first proper try at writing some tests. I was
> > wondering if there is a template test file that I should use/study in
> > order to be
> > consistent with other tests.
> > 
> > Failing that - Is there a good test writing style I should follow in one
> > of the other test files?
> > 
> > Thanks
> > Nathan
> 
> I would start with the Test::Simple and Test::More perldoc; they're pretty
> self-explanatory.  You can look at the various test suites using Test::More
> as well for pointers.  By far, most tests will use is().  You can use SKIP
> blocks to skip tests that have a requirement, or skip all tests if they all
> require something.  Pretty flexible.
> 
> We should probably get a wiki page for the developers underway, maybe a
> HOWTO on writing tests.  At least have these focus on BioPerl, OOP, remote
> DB tests, etc.
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 

Just wrote a partial and small test script for t/Amap.t in bioperl-run. When I run "perl -I. t/Amap.t" I get the following output:
1..10
ok 1 - use Bio::Tools::Run::Alignment::Amap;
ok 2 - use Bio::AlignIO;
ok 3 - use Bio::SeqIO;
ok 4 - use Bio::Root::IO;
ok 5 - All the required modules are present
ok 6 - new() returned something
ok 7 -   and its the right class
not ok 8 - executable() got the correct filename
#   Failed test 'executable() got the correct filename'
#   in t/Amap.t at line 90.
#          got: undef
#     expected: 'filename'
ok 9 # skip Got incorrect filename for executable
ok 10 # skip Got incorrect filename for executable
# Looks like you failed 1 test of 10.


So far this looks good (well, in that it's failing and passing the expected tests). However, when I run "make test" the output is unexpected and I don't know
why. It seems to die and produce the results of the testing before the rest of the test suite is run:
t/Amap....................NOK 8
#   Failed test 'executable() got the correct filename'
#   in t/Amap.t at line 90.
#          got: undef
#     expected: 'filename'
# Looks like you failed 1 test of 10.
t/Amap....................dubious
        Test returned status 1 (wstat 256, 0x100)
DIED. FAILED test 8
        Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay, 70.00%)
t/Analysis_soap...........ok 7/17make: *** wait: No child processes.  Stop.


Is there something I'm missing? If it's something less obvious, let me know and I'll post the whole test file.
Nath


From cjfields at uiuc.edu  Thu Oct 19 13:26:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 12:26:45 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161278253.4537b32dd3d15@webmail.shef.ac.uk>
Message-ID: <002001c6f3a3$c00b9080$15327e82@pyrimidine>

...
> Just wrote a partial and small test script for t/Amap.t in bioperl-run.
> When I run "perl -I. t/Amap.t" I get the following output:
> 1..10
> ok 1 - use Bio::Tools::Run::Alignment::Amap;
> ok 2 - use Bio::AlignIO;
> ok 3 - use Bio::SeqIO;
> ok 4 - use Bio::Root::IO;
> ok 5 - All the required modules are present
> ok 6 - new() returned something
> ok 7 -   and its the right class
> not ok 8 - executable() got the correct filename
> #   Failed test 'executable() got the correct filename'
> #   in t/Amap.t at line 90.
> #          got: undef
> #     expected: 'filename'
> ok 9 # skip Got incorrect filename for executable
> ok 10 # skip Got incorrect filename for executable
> # Looks like you failed 1 test of 10.
> 
> 
> So far this looks good (well, that it's failing passing expected tests).
> However, when i run "make test" the output is unexpected and I don't know
> why. It seems to die and produce the results of the testing before the
> rest of the test suit is run:
> t/Amap....................NOK 8
> #   Failed test 'executable() got the correct filename'
> #   in t/Amap.t at line 90.
> #          got: undef
> #     expected: 'filename'
> # Looks like you failed 1 test of 10.
> t/Amap....................dubious
>         Test returned status 1 (wstat 256, 0x100)
> DIED. FAILED test 8
>         Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay,
> 70.00%)
> t/Analysis_soap...........ok 7/17make: *** wait: No child processes.
> Stop.
> 
> 
> 
> Is there something I'm missing?? If it's something less obvious, let me
> know and i'll post whole test file.
> Nath

Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
the problem.  The only issue I can think of is that Test::More TODO blocks
require a newer version of Test::Harness (which most users have anyway).
Are you using a TODO block?

You can send me Amap.t and I'll give it a try, but I can't promise I'll get
to it immediately (busy day).

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From N.Haigh at sheffield.ac.uk  Thu Oct 19 13:38:25 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 18:38:25 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161279505.4537b811e143f@webmail.shef.ac.uk>


> Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
> the problem.  The only issue I can think of is that Test::More TODO blocks
> require a newer version of Test::Harness (which most users have anyway).
> Are you using a TODO block?
> 
> You can send me Amap.t and I'll give it a try, but I can't promise I'll get
> to it immediately (busy day).
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 
> 

No TODO blocks.

I must have done something wrong - it's the first time I've seen this - but then again, I don't look that closely at the output of "make test" unless
something shows as a fail. Anyway, below is the short bit of code.

Thanks
Nath

use strict;
use Bio::Root::IO;  # cant test for this, might be needed to get Test::More

BEGIN {
  # Things to do ASAP once the script is run
  # even before anything else in the file is parsed
  use vars qw($NUMTESTS $DEBUG $error);
  $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;

  # Use installed Test module, otherwise fall back
  # to copy of Test.pm located in the t dir
  eval { require Test::More; };
  if ( $@ ) {
    use lib Bio::Root::IO->catfile('t','lib');
  }

  # Currently no errors
  $error = 0;

  # Setup the number of tests to be run
  # what about using:
  # use Test::More 'no_plan';
  use Test::More;
  $NUMTESTS = 10;
  plan tests => $NUMTESTS;

  # Use modules that are needed in this test that are from
  # any of the Bioperl packages: Bioperl-core, Bioperl-run ... etc
  # use_ok('<module::to::use>');
  use_ok('Bio::Tools::Run::Alignment::Amap');
  use_ok('Bio::AlignIO');
  use_ok('Bio::SeqIO');
  use_ok('Bio::Root::IO');
}

# Multiple END blocks are run in reverse order of their definition
# Last In, First Out (LIFO)
END {
  # Things to do right at the very end, just
  # when the  interpreter finishes/exits
  # E.g. deleting intermediate files produced during the test

  foreach my $file ( qw(cysprot.dnd cysprot1a.dnd) ) {
    unlink $file;
    # check it was deleted

  }
  #unlink qw(cysprot.dnd cysprot1a.dnd)
}

END {
  # Not sure what this is doing?
  #for ( $Test::ntest..$NUMTESTS ) {
  #  skip("Amap program not found. Skipping.\n",1);
  #}
}

# if we got to here, thats OK!
# is this really needed?
ok( 1, 'All the required modules are present');

# setup input files etc
my $inputfilename = Bio::Root::IO->catfile("t","data","cysprot.fa");

# setup output files etc
# none in this test

# setup global objects that are to be used in more than one test
# Also test they were initialised correctly
my @params = ();
my $aln;
my $factory = Bio::Tools::Run::Alignment::Amap->new(@params);
ok( defined $factory,                                  'new() returned something' );
ok( $factory->isa('Bio::Tools::Run::Alignment::Amap'), '  and its the right class' );

# Now onto the nitty gritty tests of the modules methods
my $executable_file = $factory->executable();
#is( $factory->executable(), 'filename',                'executable() got the correct filename' );

# block of tests to skip if you know the tests will fail
# under some condition. E.g.:
#   Need network access,
#   Won't work on particular OS,
#   Can't find the executable
# Do not just skip tests that seem to fail for an unknown reason
SKIP: {
  # condition used to skip this block of tests
  #skip($why, $how_many_in_block);
  skip("Got incorrect filename for executable", 2)
    unless is($factory->executable(), 'filename',       'executable() got the correct filename');

  ok( -e $executable_file,                              'Found executable' );
  ok( $factory->version >= 2.0,                         'Code tested on Amap versions >= 2.0' );

}


From jason at bioperl.org  Thu Oct 19 13:44:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 10:44:51 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
	<0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
	<8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
	<7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID: <B9A24BF0-CD5A-40CA-B8AA-1A449CA9D7AA@bioperl.org>

Yikes - I was worried that it might have been me.....

Okay I'll look into fixing it -- ChrisF - check in with me before  
diving in, in case I've gotten it done and I expect your enzyme  
assays might take up the time.

-jason
On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote:

> Actually you did that Jason: http://tinyurl.com/ye2edk
>
> Apparently the motivation was to "parse swissprot fields in genpept  
> file (dbsource)"?
>
> It clearly looks wrong to add the version. You've probably had a  
> reason why you did this at the time but if we (you :) can't recover  
> that I guess it's best to just fix it to do the right thing (in  
> both places obviously).
>
> 	-hilmar
>
> On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote:
>
>> Well there is explicit addition of the version to the primary id  
>> so it isn't so much a parsing error as a deliberate decision to  
>> append it.
>> see Bio::SeqIO::genbank
>>
>> to make the dblink
>>     $annotation->add_Annotation
>>         ('dblink',
>>          Bio::Annotation::DBLink->new
>>              (-primary_id => $id . "." . $version,
>>               -version    => $version,
>>               -database   => $db,
>>               -tagname    => 'dblink'));
>>
>> and the code to print the dblink back out in the writer already  
>> assumes the version number is appended...
>>
>>         foreach my $ref ( $seq->annotation->get_Annotations('dblink') ) {
>>             # if ($ref->comment eq 'DBSOURCE') {
>>             $self->_print('DBSOURCE    accession ',
>>                           $ref->primary_id, "\n");
>>             # }
>>         }
>>
>> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:
>>
>>> Here is the overload code:
>>>
>>> use overload '""' => sub {
>>> 	(($_[0]->database ? $_[0]->database . ':' : '' )
>>> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
>>> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
>>> 	|| '' };
>>>
>>> Except that the last '||' is redundant and unnecessary (it either  
>>> does nothing or replaces an empty string with an empty string), I  
>>> don't see the potential for duplicating the version number here -  
>>> unless primary_id() did that, which I don't see it doing.
>>>
>>> So, to me this seems to come from a parsing error in the  
>>> beginning, rather than an erroneous mangling of version into  
>>> primary_id later.
>>>
>>> Is someone in the position to confirm this?
>>>
>>> 	-hilmar
>>>
>>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>>>
>>>> So I'm unsure what we should do here.
>>>>
>>>> We can certainly fix the problem which you report which is  
>>>> relying on
>>>> the "" method -- if you were to do instead:
>>>> print $_->database, ":", $_->primary_id, "\n";
>>>>
>>>> you'll get the right answer.  At a minimum we could just fix the
>>>> auto-string converting method to do The Right Thing.
>>>>
>>>> But I am not sure if we should keep the version out of the  
>>>> primary_id
>>>> field.  This will require some rejiggering in several modules  
>>>> when it
>>>> comes to printing DBlinks and I don't want to do this before the
>>>> release. I also am not sure if there was an explicit reason why
>>>> someone did put the version information in the primary_id. (I  
>>>> hope it
>>>> wasn't me because I don't think I'm going to remember why).
>>>>
>>>> Does anyone else have a strong feeling?
>>>>
>>>> -jason
>>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I noticed a little problem with the Annotation "DBLink" from
>>>>> GenBank entries
>>>>>
>>>>> When I run:
>>>>>
>>>>> perl -MBio::DB::GenBank -e 'my $gi =
>>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my  
>>>>> $seqio =
>>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>>>>> ("dblink");
>>>>> for(@annotations) { print $_, "\n";} print $INC{
>>>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>>>>
>>>>> This yields:
>>>>>
>>>>>    GenBank:AL591065.17.17
>>>>>
>>>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>>>
>>>>> Can others repeat this?
>>>>>
>>>>> I have dug into the source a little and Bio::Annotation::DBLink
>>>>> seems to
>>>>> be the place where this happens: it has a concatenation which  
>>>>> leads to
>>>>> that repeated version number.
>>>>>
>>>>> Is this something that I should fix "client-side", so to speak, or
>>>>> is it
>>>>> worthwhile to add some logic to that concatenation to prevent  
>>>>> this?
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Eric
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>> --
>>>> Jason Stajich, PhD
>>>> Miller Research Fellow
>>>> University of California
>>>> Dept of Plant and Microbial Biology
>>>> 321 Koshland Hall #3102
>>>> Berkeley, CA 94720-3102
>>>> lab: 510.642.8441
>>>> http://pmb.berkeley.edu/~taylor/people/js.html
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> Jason Stajich, PhD
>> Miller Research Fellow
>> University of California
>> Dept of Plant and Microbial Biology
>> 321 Koshland Hall #3102
>> Berkeley, CA 94720-3102
>> lab: 510.642.8441
>> http://pmb.berkeley.edu/~taylor/people/js.html
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>
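
The duplication discussed in the quoted thread can be reproduced without BioPerl. The sketch below is only an illustration: Sketch::DBLink is a hypothetical stand-in class with the same three accessors and a stringification modelled on the overload code quoted above. It is not Bio::Annotation::DBLink itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Annotation::DBLink: a plain hash-based
# class with only the three accessors the stringification needs.
package Sketch::DBLink;

use overload '""' => sub {
    my ($self) = @_;
    my $str = ($self->database   ? $self->database . ':' : '')
            . ($self->primary_id ? $self->primary_id     : '')
            . ($self->version    ? '.' . $self->version  : '');
    return $str;
};

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}
sub database   { $_[0]{database} }
sub primary_id { $_[0]{primary_id} }
sub version    { $_[0]{version} }

package main;

my ($id, $version) = ('AL591065', 17);

# Parser behaviour quoted in the thread: version appended to primary_id.
my $appended = Sketch::DBLink->new(
    database => 'GenBank', primary_id => "$id.$version", version => $version);

# Alternative: keep primary_id and version as separate fields.
my $separate = Sketch::DBLink->new(
    database => 'GenBank', primary_id => $id, version => $version);

print "$appended\n";   # GenBank:AL591065.17.17
print "$separate\n";   # GenBank:AL591065.17
```

With the version kept out of primary_id, the stringification yields the expected identifier, but any writer that prints primary_id alone would then need to append the version itself.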

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From cjfields at uiuc.edu  Thu Oct 19 14:03:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:03:52 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID: <000001c6f3a8$f0a46a00$15327e82@pyrimidine>

It also seems that the DBSOURCE line isn't parsed correctly: it gets stuffed
by default into a GenBank dblink (the dbsource in the test case is EMBL, not
GenBank).

http://bugzilla.open-bio.org/show_bug.cgi?id=2124

It looks like NCBI may be now using:

DBSOURCE    embl accession Z49548.1

instead of the old version:

DBSOURCE    embl locus SCYJR048W, accession Z49548.1

I don't recall NCBI mentioning changes regarding DBSOURCE in any of the
recent release notes.

Chris



From N.Haigh at sheffield.ac.uk  Thu Oct 19 14:06:11 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:06:11 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281171.4537be93b63c9@webmail.shef.ac.uk>


> 
> Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
> the problem.  The only issue I can think of is that Test::More TODO blocks
> require a newer version of Test::Harness (which most users have anyway).
> Are you using a TODO block?
> 
> You can send me Amap.t and I'll give it a try, but I can't promise I'll get
> to it immediately (busy day).
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 

Never mind about this - It's working as expected!

I got confused as a previous run threw errors but wasn't included in the final table of failed tests - working now.

Nath 


From N.Haigh at sheffield.ac.uk  Thu Oct 19 14:14:54 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:14:54 +0100
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>

I have a few questions about how bioperl-run modules work.

1) How does a module define the name of the executable it uses?
2) Is there a way to test what this is?
3) Does $factory->executable return this name, or does it return the name only if it successfully found the executable?

Thanks
Nath


From cjfields at uiuc.edu  Thu Oct 19 14:15:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:15:08 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <B9A24BF0-CD5A-40CA-B8AA-1A449CA9D7AA@bioperl.org>
Message-ID: <000001c6f3aa$82845ba0$15327e82@pyrimidine>

Go for it.  I haven't got the time to spare at the moment, sucky protein
assays....

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 



From cjfields at uiuc.edu  Thu Oct 19 14:35:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:35:08 -0500
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
Message-ID: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>

I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase
but I'm not sure.  I haven't used them very much myself but plan on making
wrappers at some point soon for some programs I use.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: Nathan Haigh [mailto:N.Haigh at sheffield.ac.uk]
> Sent: Thursday, October 19, 2006 1:15 PM
> To: Chris Fields
> Cc: 'bioperl-l'
> Subject: bioperl-run executable
> 
> I have a few questions about How bioperl-run modules.
> 
> 1) How do modules define what the name of the executable is that it uses?
> 2) Is there a way to test what this is?
> 3) Does $factory->executable return this or does it only return the name
> if it successfully found it?
> 
> Thanks
> Nath


From N.Haigh at sheffield.ac.uk  Thu Oct 19 14:47:01 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:47:01 +0100
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
Message-ID: <1161283620.4537c82501c43@webmail.shef.ac.uk>

Quoting Chris Fields <cjfields at uiuc.edu>:

> I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase
> but I'm not sure.  I haven't used them very much myself but plan on making
> wrappers at some point soon for some programs I use.
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 

On closer inspection of a couple of other modules (Clustalw.pm and TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME and have a sub
(program_name) that simply returns this value. I'd like to see the program_name become a getter/setter so users can change the default and have the
string stored in the factory object.
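
A minimal sketch of what such a getter/setter might look like, assuming a plain hash-based object. Sketch::Wrapper, the '_program_name' key and the default name 'amap' are made up for illustration and are not taken from Clustalw.pm, TCoffee.pm or any other bioperl-run module.

```perl
use strict;
use warnings;

# Hypothetical wrapper class; the names below are illustrative only.
package Sketch::Wrapper;

our $PROGRAM_NAME = 'amap';   # class-wide default executable name

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Getter/setter: with an argument, store a per-object override;
# without one, return the override if set, else the class default.
sub program_name {
    my ($self, $name) = @_;
    $self->{_program_name} = $name if defined $name;
    return defined $self->{_program_name}
        ? $self->{_program_name}
        : $PROGRAM_NAME;
}

package main;

my $factory = Sketch::Wrapper->new;
print $factory->program_name, "\n";     # amap
$factory->program_name('amap-2.2');
print $factory->program_name, "\n";     # amap-2.2
```

Storing the override on the object leaves the package default untouched, so other factory objects keep the hardcoded name.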

Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core not bioperl-run? I suppose not since bioperl-core is a prereq for bioperl-run but
wouldn't it make sense to go in bioperl-run?

Nath


From cjfields at uiuc.edu  Thu Oct 19 15:07:05 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 14:07:05 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <B9A24BF0-CD5A-40CA-B8AA-1A449CA9D7AA@bioperl.org>
Message-ID: <000701c6f3b1$c5914230$15327e82@pyrimidine>

Jason, Hilmar, 

How about changing the default parsed dblink in SeqIO::genbank (line 520) to

		if( $dbsource =~ /^(\S*?)\s*accession\s+(\S+)\.(\d+)/ ) {
		    my ($db,$id,$version) = ($1,$2,$3);
		    $annotation->add_Annotation
			('dblink',
			 Bio::Annotation::DBLink->new
			 (-primary_id => $id,
			  -version => $version,
			  -database => $db || 'GenBank',
			  -tagname => 'dblink'));
		} 

It passes tests and catches the optional database ('embl' for the bugzilla
report).  The output sequence still doesn't print the DB if it isn't GenBank
via write_seq(), but that shouldn't be too hard to fix (famous last words).
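
To make the behaviour of that pattern concrete, here is a standalone sketch that applies the same regular expression to three sample DBSOURCE values from this thread. The helper parse_dbsource() and the returned hash layout are assumptions for illustration, and $dbsource is assumed to hold only the text after the DBSOURCE tag.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the pattern proposed above.
sub parse_dbsource {
    my ($dbsource) = @_;
    if ( $dbsource =~ /^(\S*?)\s*accession\s+(\S+)\.(\d+)/ ) {
        my ($db, $id, $version) = ($1, $2, $3);
        return {
            database   => $db || 'GenBank',   # empty prefix falls back to GenBank
            primary_id => $id,
            version    => $version,
        };
    }
    return;   # no match
}

for my $value ('accession AL591065.17',
               'embl accession Z49548.1',
               'embl locus SCYJR048W, accession Z49548.1') {
    my $link = parse_dbsource($value);
    if ($link) {
        print "$value => $link->{database}:$link->{primary_id}",
              " (version $link->{version})\n";
    }
    else {
        print "$value => no match\n";
    }
}
```

In this sketch the first two values match (database 'GenBank' and 'embl' respectively), while the older "locus ..., accession ..." form does not match the anchored pattern as written, so it would need separate handling if it still occurs.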

Okay, okay, back to the assays...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 




From jason at bioperl.org  Thu Oct 19 14:48:28 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 11:48:28 -0700
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
	<1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
Message-ID: <67650240-D61B-4842-AE7C-75F15F608F6F@bioperl.org>

program_name()
  Should return the name of the program

executable()
  Is a function that you don't have to mess with that tries to find  
the executable named  program_name() based on your PATH.
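
A rough sketch of how the two fit together, in case it helps. This is not the actual Bio::Tools::Run::WrapperBase code: the My::Wrapper package is made up for illustration, and returning undef when nothing is found is an assumption of the sketch, not a statement about what the real executable() does.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical stand-in for a wrapper module; not the real WrapperBase code.
package My::Wrapper;

our $PROGRAM_NAME = 'clustalw';

sub new { return bless {}, shift }

# Returns the bare name of the program this wrapper drives.
sub program_name { return $PROGRAM_NAME }

# Looks in each PATH directory for an executable file named program_name().
# Returns the full path, or undef if nothing is found (sketch assumption).
sub executable {
    my ($self) = @_;
    for my $dir (File::Spec->path) {
        my $candidate = File::Spec->catfile($dir, $self->program_name);
        return $candidate if -f $candidate && -x _;
    }
    return;
}

package main;

my $factory = My::Wrapper->new;
my $exe     = $factory->executable;
print defined $exe
    ? "found $exe\n"
    : "not found: " . $factory->program_name . "\n";
```

The sketch ignores platform details such as an .exe suffix on Windows.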


-jason
On Oct 19, 2006, at 11:14 AM, Nathan Haigh wrote:

> I have a few questions about how bioperl-run modules work.
>
> 1) How do modules define what the name of the executable is that it  
> uses?
> 2) Is there a way to test what this is?
> 3) Does $factory->executable return this or does it only return the  
> name if it successfully found it?
>
> Thanks
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From jason at bioperl.org  Thu Oct 19 17:06:43 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 14:06:43 -0700
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161283620.4537c82501c43@webmail.shef.ac.uk>
References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
	<1161283620.4537c82501c43@webmail.shef.ac.uk>
Message-ID: <AA1A41EC-C0E1-49C3-818E-64210971E331@bioperl.org>

It can be reset now, but of course this is not a very nice way of doing it:

$Bio::Tools::Run::Alignment::Clustalw::PROGRAM_NAME = 'clustalw_smp';

I am not sure what the pros and cons of making it a getter/setter
are, but if you want to run with it, please do.

It has been hard to keep people adhering to a standard across the
whole run system (and the standard has changed a bit), so some
auditing is warranted.
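
For the getter/setter idea, something along these lines might do. It is only a sketch: the My::Aligner class and the _program_name hash key are invented for the example, and falling back to the package variable is one possible design, not how any existing wrapper behaves.

```perl
use strict;
use warnings;

package My::Aligner;   # hypothetical class, for illustration only

our $PROGRAM_NAME = 'clustalw';   # package-wide default

sub new { return bless {}, shift }

# Getter/setter: a value passed in is stored on this object only;
# without one, the object falls back to the package default.
sub program_name {
    my ($self, $name) = @_;
    $self->{_program_name} = $name if defined $name;
    return defined $self->{_program_name}
        ? $self->{_program_name}
        : $PROGRAM_NAME;
}

package main;

my $smp   = My::Aligner->new;
my $plain = My::Aligner->new;
$smp->program_name('clustalw_smp');

print $smp->program_name,   "\n";   # clustalw_smp
print $plain->program_name, "\n";   # clustalw
```

Storing the name on the object means one factory can use clustalw_smp without changing the default for every other factory in the same process.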

-jason

On Oct 19, 2006, at 11:47 AM, Nathan Haigh wrote:

> Quoting Chris Fields <cjfields at uiuc.edu>:
>
>> I think a lot of the bioperl-run modules use  
>> Bio::Tools::Run::WrapperBase
>> but I'm not sure.  I haven't used them very much myself but plan  
>> on making
>> wrappers at some point soon for some programs I use.
>>
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>
> On closer inspection of a couple of other modules (Clustalw.pm and  
> TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME  
> and have a sub
> (program_name) that simply returns this value. I'd like to see the  
> program_name become a getter/setter so users can change the default  
> and have the
> string stored in the factory object.
>
> Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core
> not bioperl-run? I suppose not, since bioperl-core is a prereq for
> bioperl-run, but wouldn't it make sense for it to go in bioperl-run?
>
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From torsten.seemann at infotech.monash.edu.au  Thu Oct 19 19:24:03 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Fri, 20 Oct 2006 09:24:03 +1000
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161279505.4537b811e143f@webmail.shef.ac.uk>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
	<1161279505.4537b811e143f@webmail.shef.ac.uk>
Message-ID: <45380913.3070506@infotech.monash.edu.au>

Nathan,

> use strict;
> use Bio::Root::IO;  # cant test for this, might be needed to get Test::More

use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway,
and File::Spec is "guaranteed" to be installed with Perl 5.6+.

>     use lib Bio::Root::IO->catfile('t','lib');

Simpler as:
	use lib 't/lib';
I understand the 'lib.pm' accepts Unix style directories REGARDLESS of native 
platform.
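
A small sketch of both suggestions together. The file name under t/data is made up for the example.

```perl
use strict;
use warnings;
use File::Spec;
use lib 't/lib';    # Unix-style path handed straight to lib.pm

# File::Spec joins the parts with the native separator.
# 't/data/test.fasta' is a hypothetical file name.
my $path = File::Spec->catfile('t', 'data', 'test.fasta');
print "$path\n";

# 't/lib' is now on the module search path.
print "t/lib is in \@INC\n" if grep { $_ eq 't/lib' } @INC;
```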

-- 
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia


From prabubio at gmail.com  Thu Oct 19 20:11:36 2006
From: prabubio at gmail.com (Prabu Raja)
Date: 20 Oct 2006 00:11:36 -0000
Subject: [Bioperl-l] Prabu Raja sent you this link
Message-ID: <20061020001136.86586.qmail@x05.namesdatabase.com>

Remember your link from Prabu Raja:

http://namesdatabase.com/2m.pl?k2=40892642637&s=nsender2


1 -> Use Prabu Raja's link by clicking above.

2 -> Enter your info for a membership connected to Prabu.

3 -> Share links with other friends, family and co-workers.

4 -> Use the members-only people search tools.

Prabu selected you for this on 09-02-2004 22:52 ET.


prabubio at gmail.com (Prabu Raja) initiated this to bioperl-l at lists.open-bio.org
at 10-05-2006 02:30 on namesdatabase.com from the IP address 59.92.88.99.
If you do not know a Prabu Raja, use http://namesdatabase.com/u.pl?bb2=40892642637&s=nsender2 to halt more reminders about this.
For reference, the address of The Names Database is 1253 N. Research Way, Suite Q-2500, Orem, UT 84097.


From cjfields at uiuc.edu  Thu Oct 19 20:29:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 19:29:11 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <45380913.3070506@infotech.monash.edu.au>
Message-ID: <000f01c6f3de$c3d91170$15327e82@pyrimidine>

> Nathan,
> 
> > use strict;
> > use Bio::Root::IO;  # cant test for this, might be needed to get
> Test::More
> 
> use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway,
> and File::Spec is "guaranteed" to be installed with Perl 5.6+.
> 
> >     use lib Bio::Root::IO->catfile('t','lib');
> 
> Simpler as:
> 	use lib 't/lib';
> I understand the 'lib.pm' accepts Unix style directories REGARDLESS of
> native
> platform.
> 
> --
> Torsten Seemann
> Victorian Bioinformatics Consortium, Monash University, Australia

That is true, at least for WinXP (not sure about older Windows versions out
there).  I was using 'Bio::Root::IO->catfile' but found that "use lib 't/lib'"
works.  I may have a few of the 'catfile' versions floating around out there,
which may be where that originated.

Note that if you plan on using Test::More with the bioperl-run test suite,
you should add it to the bioperl-run CVS distribution directory in 't/lib'.
Most people will have it installed, but you never know.

Chris


From cjfields at uiuc.edu  Thu Oct 19 20:33:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 19:33:22 -0500
Subject: [Bioperl-l] Prabu Raja sent you this link
In-Reply-To: <20061020001136.86586.qmail@x05.namesdatabase.com>
Message-ID: <001001c6f3df$598a24c0$15327e82@pyrimidine>

That Prabu Raja sure gets around...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Prabu Raja
> Sent: Thursday, October 19, 2006 7:12 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Prabu Raja sent you this link
> 
> Remember your link from Prabu Raja:
> 
> http://namesdatabase.com/2m.pl?k2=40892642637&s=nsender2
> 
> 
> 1 -> Use Prabu Raja's link by clicking above.
> 
> 2 -> Enter your info for a membership connected to Prabu.
> 
> 3 -> Share links with other friends, family and co-workers.
> 
> 4 -> Use the members-only people search tools.
> 
> Prabu selected you for this on 09-02-2004 22:52 ET.
> 
> 
> prabubio at gmail.com (Prabu Raja) initiated this to bioperl-l at lists.open-
> bio.org
> at 10-05-2006 02:30 on namesdatabase.com from the IP address 59.92.88.99.
> If you do not know a Prabu Raja, use
> http://namesdatabase.com/u.pl?bb2=40892642637&s=nsender2 to halt more
> reminders about this.
> For reference, the address of The Names Database is 1253 N. Research Way,
> Suite Q-2500, Orem, UT 84097.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From keithplayer at hotmail.com  Thu Oct 19 22:13:52 2006
From: keithplayer at hotmail.com (Keith Player)
Date: Fri, 20 Oct 2006 02:13:52 +0000 (UTC)
Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning
Message-ID: <loom.20061020T041338-193@post.gmane.org>

I know that there may be some changes resulting from new GFF3 implementations, 
but thought I would see if the following is useful anyway.

I implemented the R-tree binning schema as used by Bio::DB::GFF::Util::Binning 
and as mentioned in this article:

I tested the following query on a normal table (no binning), but it assumes 
that you know the longest range in the table.  Take, for example, a table of 
human genes, where the longest gene we know of is around 2.4Mb.

 SELECT COUNT(*) AS count FROM groups g WHERE g.start > max(0,[start-2.4Mb]) AND 
g.start < [end] AND g.end > [start] AND g.chromosome = '1'

so for 100Mb:101Mb

SELECT COUNT(*) AS count FROM groups g WHERE g.start > 97600000 AND g.start < 
101000000 AND g.end > 100000000 AND g.chromosome = '1'


where [start] and [end] define the region of interest.  This query outperforms 
the R-Tree implementation on all tests that I have performed (for lengths of 
200bp to 10Mb across a whole chromosome).  Could this be of some practical use?
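
In case it is easier to read as code, here is a rough sketch of how the bounds could be computed. The function name overlap_query is made up, and the placeholders assume the statement would be run through something like DBI with bind values; the table and column names are the ones from the query above.

```perl
use strict;
use warnings;

# Builds the overlap query for a region, given the longest feature length.
# Returns the SQL with placeholders followed by the bind values.
sub overlap_query {
    my ($chr, $start, $end, $max_len) = @_;

    # No feature starting before this point can reach the region.
    my $lower = $start - $max_len;
    $lower = 0 if $lower < 0;

    my $sql = 'SELECT COUNT(*) AS count FROM groups g '
            . 'WHERE g.start > ? AND g.start < ? '
            . 'AND g.end > ? AND g.chromosome = ?';
    return ($sql, $lower, $end, $start, $chr);
}

my ($sql, @bind) = overlap_query('1', 100_000_000, 101_000_000, 2_400_000);
print "$sql\n";
print join(', ', @bind), "\n";   # 97600000, 101000000, 100000000, 1
```

The lower bound on g.start is what lets an ordinary index on start narrow the scan; it only works if the longest range really is known.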


From jason at bioperl.org  Thu Oct 19 11:50:49 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 08:50:49 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
	<0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
Message-ID: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>

Well there is explicit addition of the version to the primary id so  
it isn't so much a parsing error as a deliberate decision to append it.
see Bio::SeqIO::genbank

to make the dblink
    $annotation->add_Annotation
        ('dblink',
         Bio::Annotation::DBLink->new
             (-primary_id => $id . "." . $version,
              -version    => $version,
              -database   => $db,
              -tagname    => 'dblink'));

and the code to print the dblink back out in the writer already  
assumes the version number is appended...

         foreach my $ref ( $seq->annotation->get_Annotations('dblink') ) {
             # if ($ref->comment eq 'DBSOURCE') {
             $self->_print('DBSOURCE    accession ',
                           $ref->primary_id, "\n");
             # }
         }
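
To show how the two pieces combine into the doubled version, here is a stripped-down sketch. My::DBLink is a made-up stand-in with plain hash fields, not Bio::Annotation::DBLink itself; only the shape of the string overload follows the one quoted in this thread.

```perl
use strict;
use warnings;

package My::DBLink;   # hypothetical stand-in, not Bio::Annotation::DBLink

# Stringification: database, then primary_id, then '.version'.
use overload '""' => sub {
    my $self = shift;
    return ($self->{database}   ? $self->{database} . ':' : '')
         . ($self->{primary_id} ? $self->{primary_id}     : '')
         . ($self->{version}    ? '.' . $self->{version}  : '');
}, fallback => 1;

sub new { my ($class, %args) = @_; return bless {%args}, $class }

package main;

my ($id, $version) = ('AL591065', 17);

# primary_id already carries the version, as in the parser code above
my $doubled = My::DBLink->new(database   => 'GenBank',
                              primary_id => "$id.$version",
                              version    => $version);

# primary_id kept free of the version
my $clean = My::DBLink->new(database   => 'GenBank',
                            primary_id => $id,
                            version    => $version);

print "$doubled\n";   # GenBank:AL591065.17.17
print "$clean\n";     # GenBank:AL591065.17
```

So either the parser stops appending the version, or the string overload stops adding it; doing both appends is what produces the repeat.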

On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:

> Here is the overload code:
>
> use overload '""' => sub {
> 	(($_[0]->database ? $_[0]->database . ':' : '' )
> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
> 	|| '' };
>
> Except that the last '||' is redundant and unnecessary (it either  
> does nothing or replaces an empty string with an empty string), I  
> don't see the potential for duplicating the version number here -  
> unless primary_id() did that, which I don't see it doing.
>
> So, to me this seems to come from a parsing error in the beginning,  
> rather than an erroneous mangling of version into primary_id later.
>
> Is someone in the position to confirm this?
>
> 	-hilmar
>
> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>
>> So I'm unsure what we should do here.
>>
>> We can certainly fix the problem which you report which is relying on
>> the "" method -- if you were to do instead:
>> print $_->database, ":", $_->primary_id, "\n";
>>
>> you'll get the right answer.  We can, at a minimum, just fix the
>> auto-string-converting method to do The Right Thing.
>>
>> But I am not sure if we should keep the version out of the primary_id
>> field.  This will require some rejiggering in several modules when it
>> comes to printing DBlinks and I don't want to do this before the
>> release. I also am not sure if there was an explicit reason why
>> someone did put the version information in the primary_id. (I hope it
>> wasn't me because I don't think I'm going to remember why).
>>
>> Does anyone else have a strong feeling?
>>
>> -jason
>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>
>>> Hello,
>>>
>>> I noticed a little problem with the Annotation "DBLink" from
>>> GenBank entries
>>>
>>> When I run:
>>>
>>> perl -MBio::DB::GenBank -e 'my $gi =
>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my  
>>> $seqio =
>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>>> ("dblink");
>>> for(@annotations) { print $_, "\n";} print $INC{
>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>>
>>> This yields:
>>>
>>>    GenBank:AL591065.17.17
>>>
>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>
>>> Can others repeat this?
>>>
>>> I have dug into the source a little and Bio::Annotation::DBLink
>>> seems to
>>> be the place where this happens: it has a concatenation which  
>>> leads to
>>> that repeated version number.
>>>
>>> Is this something that I should fix "client-side", so to speak, or
>>> is it
>>> worthwhile to add some logic to that concatenation to prevent this?
>>>
>>>
>>> Thanks,
>>>
>>> Eric

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From n.haigh at sheffield.ac.uk  Fri Oct 20 04:35:03 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 20 Oct 2006 08:35:03 +0000
Subject: [Bioperl-l] test::more template
In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
Message-ID: <45388A37.7040505@sheffield.ac.uk>

Chris Fields wrote:
>> Nathan,
>>
>>     
>>> use strict;
>>> use Bio::Root::IO;  # cant test for this, might be needed to get
>>>       
>> Test::More
>>
>> use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway,
>> and File::Spec is "guaranteed" to be installed with Perl 5.6+.
>>
>>     
>>>     use lib Bio::Root::IO->catfile('t','lib');
>>>       
>> Simpler as:
>> 	use lib 't/lib';
>> I understand the 'lib.pm' accepts Unix style directories REGARDLESS of
>> native
>> platform.
>>
>> --
>> Torsten Seemann
>> Victorian Bioinformatics Consortium, Monash University, Australia
>>     
>
> That is true, at least for WinXP (not sure about older Windows versions out
> there).  I was using 'Root::IO->catfile' but found 'use lib 't/lib' works.
> I may have a few of the 'catfile' versions floating around out there, which
> may be where that originated.
>
> Note that if you plan on using Test::More with the bioperl-run test suite,
> you should add it to the bioperl-run CVS distribution directory in 't/lib'.
> Most people will have it installed, but you never know.
>
> Chris
>
>
>   
What is the reason for including Test::More in 't/lib' rather than
having it as a prereq?

-- 
> A: Yes.
>> Q: Are you sure?
>>     
>>> A: Because it reverses the logical flow of conversation.
>>>       
>>>> Q: Why is top posting frowned upon?
>>>>         
Get Thunderbird <http://www.mozilla.org/products/thunderbird/>


From n.haigh at sheffield.ac.uk  Fri Oct 20 05:27:19 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 20 Oct 2006 10:27:19 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
Message-ID: <45389677.1000709@sheffield.ac.uk>

Is it really necessary to specify the number of tests that are to be
conducted in advance? It seems a bit annoying to have to count the
number of tests in the script or to run the test just to see how many
tests were done, we could just use:
use Test::More 'no_plan';

And then it's up to Test::More to keep a track of how many tests it's
run. The only thing then to worry about is how many tests are in a SKIP
block if the skip criteria are met. This is unless there is a good
reason to use it that I am unaware of.

Thanks
Nath


From bix at sendu.me.uk  Fri Oct 20 06:01:09 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 11:01:09 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45389677.1000709@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45389677.1000709@sheffield.ac.uk>
Message-ID: <45389E65.6080908@sendu.me.uk>

Nathan Haigh wrote:
> Is it really necessary to specify the number of tests that are to be
> conducted in advance? It seems a bit annoying to have to count the
> number of tests in the script or to run the test just to see how many
> tests were done, we could just use:
> use Test::More 'no_plan';

It's very important to have a plan. That way you know all the tests 
actually ran and weren't skipped (either due to an actual SKIP block or 
an if block that returned false due to a bug, or a for/foreach/while 
that didn't loop enough times due to a bug, or any number of other reasons).
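
A minimal illustration of what the plan buys you. The sequences are made-up data.

```perl
use strict;
use warnings;
use Test::More tests => 3;   # the plan: exactly three tests must run

my @seqs = ('ACGT', 'GGCC', 'TTAA');   # made-up data

# If a bug left @seqs with only two elements, just two tests would run
# and the harness would report the mismatch with the plan. With
# 'no_plan' the same bug would pass silently.
for my $seq (@seqs) {
    is(length($seq), 4, "length of $seq");
}
```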


From bix at sendu.me.uk  Fri Oct 20 06:04:48 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 11:04:48 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45388A37.7040505@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45388A37.7040505@sheffield.ac.uk>
Message-ID: <45389F40.5060601@sendu.me.uk>

Nathan S. Haigh wrote:
> Chris Fields wrote:
>
>> Note that if you plan on using Test::More with the bioperl-run test suite,
>> you should add it to the bioperl-run CVS distribution directory in 't/lib'.
>> Most people will have it installed, but you never know.
>
> What is the reason for including Test::More in 't/lib' rather than
> having it as a prereq?

Because we want to ensure that the test suite runs and tells you real 
problems (if any) about the code (Bioperl) that it is testing, not 
problems about actually running the tests (which are NOT required for 
using Bioperl, so cannot be considered 'pre-requisites').
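
One way the fall-back could be written, as a sketch only. It assumes a copy of Test::More is shipped under t/lib, and it is not copied from any existing Bioperl test.

```perl
use strict;
use warnings;

BEGIN {
    # Prefer an installed Test::More; otherwise fall back to a bundled copy.
    eval { require Test::More; };
    if ($@) {
        require lib;
        lib->import('t/lib');   # assumes a copy ships in t/lib
        require Test::More;
    }
    Test::More->import(tests => 1);
}

ok(1, 'Test::More loaded');
```

The sketch calls lib->import rather than writing 'use lib' inside the if, because a 'use' statement takes effect at compile time whether or not the condition is true.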


From n.haigh at sheffield.ac.uk  Fri Oct 20 06:54:30 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 20 Oct 2006 11:54:30 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45389E65.6080908@sendu.me.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk>
Message-ID: <4538AAE6.5070600@sheffield.ac.uk>

If there are known bugs in a particular version of software, what is the
best approach for dealing with tests that would fail due to this bug?
Simply skip those tests that would be affected by the bug, or to fail if
the affected version is detected and report the reason so the user is
informed? Or simply bump the minimum version to one above the affected
versions?

For example, t/Clustalw has a test for at least version 1.8. It then has
some profile alignment tests that are only run if version > 1.82 is
installed. It states that versions 1.81 and 1.82 are affected by a
profile alignment bug - which I assume would make the tests fail.

Cheers
Nath


From bix at sendu.me.uk  Fri Oct 20 07:06:07 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 12:06:07 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <4538AAE6.5070600@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk>
	<4538AAE6.5070600@sheffield.ac.uk>
Message-ID: <4538AD9F.8040003@sendu.me.uk>

Nathan Haigh wrote:
> If there are known bugs in a particular version of software, what is the
> best approach for dealing with tests that would fail due to this bug?
> Simply skip those tests that would be affected by the bug, or to fail if
> the affected version is detected and report the reason so the user is
> informed? Or simply bump the minimum version to one above the affected
> versions?
> 
> For example, t/Clustalw has a test for at least version 1.8. It then has
> some profile alignment tests that are only run if version > 1.82 is
> installed. It states that versions 1.81 and 1.82 are affected by a
> profile alignment bug - which I assume would make the tests fail.

Specific cases like this I'd discuss on the list or with the author of
the module in question. Maybe there is some great need to allow usage
with <1.81?

My view, based purely on what you've said above: bump the pre-requisite
to a version that works.
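
For comparison, the skip approach would look roughly like this. The version number is hard-coded for the sketch; a real test would get it from the installed program, and the ok(1, ...) lines stand in for real checks.

```perl
use strict;
use warnings;
use Test::More tests => 3;

my $version = 1.82;   # hypothetical: would come from the installed program

ok($version >= 1.8, 'minimum supported version');

SKIP: {
    # Versions 1.81 and 1.82 have the profile alignment bug,
    # so these two tests only run on anything newer.
    skip('profile alignment bug in versions 1.81 and 1.82', 2)
        unless $version > 1.82;

    ok(1, 'profile alignment runs');
    ok(1, 'profile alignment gives the expected result');
}
```

Skipped tests still count towards the plan, so the total stays at three either way.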


From cjfields at uiuc.edu  Fri Oct 20 08:36:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 07:36:37 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <45388A37.7040505@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45388A37.7040505@sheffield.ac.uk>
Message-ID: <80A2D210-B0DB-4CD2-9B56-A38097F4F63F@uiuc.edu>


>> ...
>>
> What is the reason for including Test::More in 't/lib' rather than
> having it as a prereq?

We could do that.  Many CPAN modules include it in 't/lib' b/c it is  
only needed for testing purposes.

Chris

>
> -- 
>> A: Yes.
>>> Q: Are you sure?
>>>
>>>> A: Because it reverses the logical flow of conversation.
>>>>
>>>>> Q: Why is top posting frowned upon?
>>>>>
> Get Thunderbird <http://www.mozilla.org/products/thunderbird/>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Fri Oct 20 10:44:29 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 15:44:29 +0100
Subject: [Bioperl-l] Updated Makefile.PL
Message-ID: <4538E0CD.1030908@sendu.me.uk>

Hi,
I've just committed an updated Makefile.PL to HEAD for bioperl-live. 
Could some people test it on multiple platforms and confirm it is ok 
(try out the different possible options as well)?

(NB. in the below, 'pre-reqs' are things the makefile considers optional 
dependencies)

Note that some pre-reqs have been removed:
# DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end 
up requiring it but only after the user makes an explicit choice by 
typing 'DBD::mysql' in their own code to supply as an option to Bioperl 
code)
# File::Temp (standard in 5.6.1)


This pre-req was wrong:
# Data::Stag::Writer
and has been replaced with:
Data::Stag::XMLWriter


Also, I note that very many Bioperl modules need IO::String, including 
Bio::SeqIO, so I'm not sure to what extent we can pretend it is an 
optional module. I didn't make any change though.


I don't know if these changes affect the Windows ppm Nathan, or anything 
else (Bundle?)?

The INSTALL docs need updating with these new and improved pre-reqs 
(note that some pre-reqs had wrong/not enough Bioperl modules listed as 
needing them); does someone want to correct the wiki (based on the new 
Makefile.PL) and then Chris can re-create the text version?


From hlapp at gmx.net  Fri Oct 20 11:03:34 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 20 Oct 2006 11:03:34 -0400
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <4538E0CD.1030908@sendu.me.uk>
References: <4538E0CD.1030908@sendu.me.uk>
Message-ID: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>


On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote:

> Also, I note that very many Bioperl modules need IO::String, including
> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
> optional module. I didn't make any change though.

I agree. There are really not that many terribly useful things you can
do with Bioperl w/o having IO::String installed, which is in stark
contrast to many other dependencies.

I don't have a problem with making it (and a few others used all over  
the place) required, to better contrast them with the dependencies  
that are really optional (and not needed for 90% of users).

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Oct 20 11:18:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 10:18:32 -0500
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <4538E0CD.1030908@sendu.me.uk>
Message-ID: <001501c6f45b$019103c0$15327e82@pyrimidine>

> Hi,
> I've just committed an updated Makefile.PL to HEAD for bioperl-live.
> Could some people test it on multiple platforms and confirm it is ok
> (try out the different possible options as well)?
> 
> (NB. in the below, 'pre-reqs' are things the makefile considers optional
> dependencies)
> 
> Note that some pre-reqs have been removed:
> # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end
> up requiring it but only after the user makes an explicit choice by
> typing 'DBD::mysql' in their own code to supply as an option to Bioperl
> code)
> # File::Temp (standard in 5.6.1)

I'll try it out on WinXP and Mac OS X.  BTW, do any of Lincoln's Bio::DB*
modules use DBD::mysql?  Bio::DB::GFF comes to mind.  I don't think it should
be an absolute requirement, though.

If we plan on removing those, then we should also remove them from
Bundle::Bioperl (if they are present).

> This pre-req was wrong:
> # Data::Stag::Writer
> and has been replaced with:
> Data::Stag::XMLWriter
> 
> 
> Also, I note that very many Bioperl modules need IO::String, including
> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
> optional module. I didn't make any change though.

Do they all require IO::String or is it an option?  There are a few
instances (modules implementing WebDBSeqI, for instance) where this is
presented as an option for most OSes (along with the default, pipeline, and
tempfile).  However, it is currently used by default on Windows due to the
lack of pipe/fork support at the time.

BTW, the latter may now work with WinXP ActivePerl.  ActiveState has been
working on WinXP fork() emulation for a while, but I think it is still
somewhat experimental.  

> I don't know if these changes affect the Windows ppm Nathan, or anything
> else (Bundle?)?
> 
> The INSTALL docs need updating with these new and improved pre-reqs
> (note that some pre-reqs had wrong/not enough Bioperl modules listed as
> needing them); does someone want to correct the wiki (based on the new
> Makefile.PL) and then Chris can re-create the text version?

Easier to just modify the text version based on what is changed in the wiki,
at least for the time being.  The text dumping from elinks/lynx isn't
foolproof re: tables and such, which is one reason I think we should move
the prereqs to a separate file, as it's easier to maintain long-term (this
seems to be where most changes occur anyway).

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Fri Oct 20 11:23:38 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 16:23:38 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk>
References: <45375615.1020603@sheffield.ac.uk>	<45379BBB.1040400@sheffield.ac.uk>
	<1161270180.453793a432e4f@webmail.shef.ac.uk>
Message-ID: <4538E9FA.60701@sendu.me.uk>

Nathan Haigh wrote:
> I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be
> consistent with other tests.
> 
> Failing that - Is there a good test writing style I should follow in one of the other test files?

I originally based mine on one of Chris's EUtilities tests, but now 
refer to t/ESEfinder.t since it is small and demonstrates all the major 
tricky things you might have to do - skip remote tests if no 
BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests 
under some condition, fall-back to t/lib for Test::More if necessary.

(Though I just spotted an oops in the latter...)
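
For anyone who wants the general shape without opening the file, here is a cut-down sketch of the remote-test pattern. It is not the contents of t/ESEfinder.t; fetch_remote() is a placeholder for whatever remote call a real test would make.

```perl
use strict;
use warnings;
use Test::More tests => 3;

ok(1, 'local test always runs');

SKIP: {
    # Remote tests only run when the user asks for them.
    skip('set BIOPERLDEBUG=1 to run tests needing network access', 2)
        unless $ENV{BIOPERLDEBUG};

    # A server failure skips the dependent tests rather than failing them.
    my $result = eval { fetch_remote() };
    skip("remote server problem: $@", 2) if $@;

    ok(defined $result, 'got a result');
    is($result, 'ok', 'result as expected');
}

# Placeholder for a real remote call; always succeeds in this sketch.
sub fetch_remote { return 'ok' }
```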


From cjfields at uiuc.edu  Fri Oct 20 11:38:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 10:38:02 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <4538E9FA.60701@sendu.me.uk>
Message-ID: <001601c6f45d$bb824350$15327e82@pyrimidine>

> Nathan Haigh wrote:
> > I thought I'd have my first proper try at writing some tests. I was
> wondering if there is a template test file that I should use/study in
> order to be
> > consistent with other tests.
> >
> > Failing that - Is there a good test writing style I should follow in one
> of the other test files?
> 
> I originally based mine on one of Chris's EUtilities tests, but now
> refer to t/ESEfinder.t since it is small and demonstrates all the major
> tricky things you might have to do - skip remote tests if no
> BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests
> under some condition, fall-back to t/lib for Test::More if necessary.
> 
> (Though I just spotted an oops in the latter...)

I agree.  The EUtilities tests are quite long.  I plan on eventually cutting
out some of them.  Making them somewhat less prone to changes in returned XML
data has also been a pain, as demonstrated by some of the tests from MAIN
now failing... d'oh!

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From bix at sendu.me.uk  Fri Oct 20 11:39:32 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 16:39:32 +0100
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <001501c6f45b$019103c0$15327e82@pyrimidine>
References: <001501c6f45b$019103c0$15327e82@pyrimidine>
Message-ID: <4538EDB4.3030500@sendu.me.uk>

Chris Fields wrote:
> BTW, do any of Lincoln's Bio::DB*
> use DBD::mysql?  Bio::DB::GFF comes to mind.

No, just a require on a user-passed variable as I described.


>> Also, I note that very many Bioperl modules need IO::String, including
>> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
>> optional module. I didn't make any change though.
> 
> Do they all require IO::String or is it an option?

Oops, I take that back. Bio::SeqIO doesn't use IO::String. That's what 
you get for relying on grep output...
It's still many modules that use it, but I suppose you could do useful 
things without. So actually, let's keep it optional.


From cjfields at uiuc.edu  Fri Oct 20 16:32:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 15:32:32 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
Message-ID: <000001c6f486$df508930$15327e82@pyrimidine>


Seth, 

Did you work out the problem here?  There was a recent CVS update to OBDA
tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
apparently left data from tests in the database, which caused problems with
repeated test runs.

Chris

> > -----Original Message-----
> > From: bioperl-l-bounces <at> lists.open-bio.org [mailto:bioperl-l-
> > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ...
>
> > FAILED tests 10-12
> >     Failed 3/12 tests, 75.00% okay
> > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> >
> --------------------------------------------------------------------------
> > -----
> > t\02species.t                 65    2   3.08%  63 65
> > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > t\16obda.t                    12    3  25.00%  10-12
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l <at> lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

-- 
Best Regards,

Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From olenka.m at gmail.com  Fri Oct 20 17:47:15 2006
From: olenka.m at gmail.com (Olena Morozova)
Date: Fri, 20 Oct 2006 14:47:15 -0700
Subject: [Bioperl-l] GO annotations
Message-ID: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com>

Dear all,

Does anyone know an easy way to get GO-BP annotations for ensembl genes?

Thank you very much for your help,
Olena


From sdavis2 at mail.nih.gov  Sat Oct 21 11:05:26 2006
From: sdavis2 at mail.nih.gov (Davis, Sean (NIH/NCI) [E])
Date: Sat, 21 Oct 2006 11:05:26 -0400
Subject: [Bioperl-l] GO annotations
References: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com>
Message-ID: <014DBF86B19310419F0DF8910FC56457240CE3@nihcesmlbx10.nih.gov>

You can use the ensembl perl API, or (more simply) use the Ensembl MART interface:

http://www.ensembl.org/Multi/martview

Sean


-----Original Message-----
From: Olena Morozova [mailto:olenka.m at gmail.com]
Sent: Fri 10/20/2006 5:47 PM
To: bioperl-l
Subject: [Bioperl-l] GO annotations
 
Dear all,

Does anyone know an easy way to get GO-BP annotations for ensembl genes?

Thank you very much for your help,
Olena
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l


From n.haigh at sheffield.ac.uk  Sun Oct 22 06:34:51 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 22 Oct 2006 10:34:51 +0000
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
References: <4538E0CD.1030908@sendu.me.uk>
	<3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
Message-ID: <453B494B.7040702@sheffield.ac.uk>

Hilmar Lapp wrote:
> On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote:
>
>   
>> Also, I note that very many Bioperl modules need IO::String, including
>> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
>> optional module. I didn't make any change though.
>>     
>
> I agree. There's really not that many terribly useful things you can  
> do with Bioperl w/o having IO::String installed, which is in stark  
> contrast to many other dependencies.
>
> I don't have a problem with making it (and a few others used all over  
> the place) required, to better contrast them with the dependencies  
> that are really optional (and not needed for 90% of users).
>
> 	-hilmar
>
>   

Is it possible to make a distinction in Makefile.PL between those
modules that are an absolute must for Bioperl-core and those which are
optional and should go into Bundle::BioPerl?

Once I'm sure what should be "optional" I'll do the Bundle::BioPerl
package and PPDs.

Cheers
Nath


From vitacolonna at appliedgenomics.org  Sun Oct 22 09:04:48 2006
From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna)
Date: Sun, 22 Oct 2006 15:04:48 +0200
Subject: [Bioperl-l] Submission proposal: ABIF module
Message-ID: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>

Hi everybody,
I would like to submit to CPAN a module for reading and parsing the  
ABIF files (with .ab1 suffix) produced by Applied Biosystems  
sequencers. The need for such a module arose in our lab because the  
existing ABI module we found on CPAN had too limited functionality.  
As an example, our module allows us to easily produce analysis  
reports similar to the ones generated by the Sequencing Analysis  
software.

May I call the module Bio::ABIF? Or should I follow other conventions?

Nicola


From cjfields at uiuc.edu  Sun Oct 22 09:54:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 08:54:51 -0500
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
Message-ID: <F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>


On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:

> Hi everybody,
> I would like to submit to CPAN a module for reading and parsing the
> ABIF files (with .ab1 suffix) produced by Applied Biosystems
> sequencers. The need for such a module arose in our lab because the
> existing ABI module we found on CPAN had too limited functionality.
> As an example, our module allows us to easily produce analysis
> reports similar to the ones generated by the Sequencing Analysis
> software.
>
> May I call the module Bio::ABIF? Or should I follow other conventions?
>
> Nicola

It depends.  Does it interact with bioperl in any way?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Oct 22 09:57:18 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 08:57:18 -0500
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <453B494B.7040702@sheffield.ac.uk>
References: <4538E0CD.1030908@sendu.me.uk>
	<3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
	<453B494B.7040702@sheffield.ac.uk>
Message-ID: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>


On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote:

> Is it possible to make a distinction in Makefile.PL between those
> modules that are an absolute must for Bioperl-core and those which are
> optional and should go into Bundle::BioPerl?
>
> Once I'm sure what should be "optional" I'll do the Bundle::BioPerl
> package and PPDs.
>
> Cheers
> Nath

We probably should steer this way eventually.  Do you plan on placing  
prereqs required for bioperl core in the bioperl PPD and the  
'optional' ones in the bundle?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From vitacolonna at appliedgenomics.org  Sun Oct 22 10:16:26 2006
From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna)
Date: Sun, 22 Oct 2006 16:16:26 +0200
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
	<F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>
Message-ID: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>


On 22/ott/06, at 15:54, Chris Fields wrote:

>
> On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:
>
>> Hi everybody,
>> I would like to submit to CPAN a module for reading and parsing the
>> ABIF files (with .ab1 suffix) [...]
>> May I call the module Bio::ABIF? Or should I follow other  
>> conventions?
>
> It depends.  Does it interact with bioperl in any way?

No. Can you suggest a suitable pattern for the name?

Nicola


From cjfields at uiuc.edu  Sun Oct 22 10:55:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 09:55:46 -0500
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
	<F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>
	<8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>
Message-ID: <B4155C40-8E3D-4AA0-88F5-7A1FFBD3A134@uiuc.edu>

On Oct 22, 2006, at 9:16 AM, Nicola Vitacolonna wrote:

> On 22/ott/06, at 15:54, Chris Fields wrote:
>
>>
>> On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:
>>
>>> Hi everybody,
>>> I would like to submit to CPAN a module for reading and parsing the
>>> ABIF files (with .ab1 suffix) [...]
>>> May I call the module Bio::ABIF? Or should I follow other
>>> conventions?
>>
>> It depends.  Does it interact with bioperl in any way?
>
> No. Can you suggest a suitable pattern for the name?
>
> Nicola

I don't think it will be a problem to name it Bio::ABIF; there is  
already a Bio::ASN1::EntrezGene, and Rutger Vos's Bio::Phylo modules  
(the latter doesn't require BioPerl either).

Saying that, if you plan on contributing more CPAN modules with  
similar functionality (such as parsing other trace files), you might  
want to consider using a namespace that isn't limiting but doesn't  
conflict with Bioperl core (like Bio::Trace or similar, then name  
your module Bio::Trace::ABIF).  You can use search.cpan.org to check  
namespaces for conflicts.

Just as a note: we have bioperl-ext, which also parses ABI and other  
trace file formats.  It's a bit old now and needs updating, but is  
supposed to be quite fast (it uses the Staden io_lib C library via  
PerlXS).

-c

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From arareko at campus.iztacala.unam.mx  Sun Oct 22 13:26:37 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Sun, 22 Oct 2006 12:26:37 -0500
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <4538E0CD.1030908@sendu.me.uk>
References: <4538E0CD.1030908@sendu.me.uk>
Message-ID: <453BA9CD.4060107@campus.iztacala.unam.mx>

Works fine on FreeBSD.

Mauricio.

Sendu Bala wrote:
> Hi,
> I've just committed an updated Makefile.PL to HEAD for bioperl-live. 
> Could some people test it on multiple platforms and confirm it is ok 
> (try out the different possible options as well)?
> 
> (NB. in the below, 'pre-reqs' are things the makefile considers optional 
> dependencies)
> 
> Note that some pre-reqs have been removed:
> # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end 
> up requiring it but only after the user makes an explicit choice by 
> typing 'DBD::mysql' in their own code to supply as an option to Bioperl 
> code)
> # File::Temp (standard in 5.6.1)
> 
> 
> This pre-req was wrong:
> # Data::Stag::Writer
> and has been replaced with:
> Data::Stag::XMLWriter
> 
> 
> Also, I note that very many Bioperl modules need IO::String, including 
> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an 
> optional module. I didn't make any change though.
> 
> 
> I don't know if these changes affect the Windows ppm Nathan, or anything 
> else (Bundle?)?
> 
> The INSTALL docs need updating with these new and improved pre-reqs 
> (note that some pre-reqs had wrong/not enough Bioperl modules listed as 
> needing them); does someone want to correct the wiki (based on the new 
> Makefile.PL) and then Chris can re-create the text version?
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From n.haigh at sheffield.ac.uk  Sun Oct 22 15:37:07 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 22 Oct 2006 20:37:07 +0100
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>
References: <4538E0CD.1030908@sendu.me.uk>
	<3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
	<453B494B.7040702@sheffield.ac.uk>
	<7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>
Message-ID: <453BC863.4090803@sheffield.ac.uk>

Chris Fields wrote:
>
> On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote:
>
>> Is it possible to make a distinction in Makefile.PL between those
>> modules that are an absolute must for Bioperl-core and those which are
>> optional and should go into Bundle::BioPerl?
>>
>> Once I'm sure what should be "optional" I'll do the Bundle::BioPerl
>> package and PPDs.
>>
>> Cheers
>> Nath
>
> We probably should steer this way eventually.  Do you plan on placing 
> prereqs required for bioperl core in the bioperl PPD and the 
> 'optional' ones in the bundle?
>
That's correct. However, PPM will always try to update packages to the 
latest available. Therefore, if at some point in the future a 
dependency is removed, and thus removed from Bundle::BioPerl, a 
situation may arise where an older version of BioPerl is running with 
a recent version of Bundle::BioPerl and could have missing 
dependencies - not ideal, but it is how things currently stand. The 
process of making the Bundle::BioPerl PPD would be simplified if these 
"optional" dependencies were separated from the "core" dependencies. If 
one of the following solutions is possible (I'm not sure if they are), 
it would be very useful:

1) Maintain 2 hashes in Makefile.PL that contain the "core" and 
"optional" dependencies. I'm unsure of the way dependencies are ordered 
during a "make ppd", but it may be possible to pass hash references of 
both to PREREQ_PM in WriteMakefile and have the "optional" dependencies 
grouped separately from "core" dependencies in the ppd file - thus making 
it easy to strip them out into a Bundle::BioPerl ppd.

2) Again, maintain 2 hashes in Makefile.PL that contain the "core" and 
"optional" dependencies. Have some Makefile setup that allows the 
generation of a Bundle::BioPerl ppd separately from the main Bioperl ppd.

Like I said, these are just some thoughts and I'm not sure if they are 
even viable options.
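
As a rough sketch of the two-hash idea: the assignment of modules to 
"core" and "optional" below is made up purely for illustration, and 
nothing here has been tried against "make ppd". A version of 0 in 
PREREQ_PM means any version will do.

```perl
use strict;
use warnings;

# Which module goes in which group is an assumption, for illustration only.
my %core = (
    'IO::String' => 0,
);
my %optional = (
    'Data::Stag::XMLWriter'   => 0,
    'Spreadsheet::ParseExcel' => 0,
);

# Everything is still handed to PREREQ_PM, as happens today.
my %prereq_pm = (%core, %optional);

# Only the optional group would be listed in Bundle::BioPerl.
my @bundle_contents = sort keys %optional;

print "PREREQ_PM: ", join(', ', sort keys %prereq_pm), "\n";
print "Bundle:    ", join(', ', @bundle_contents), "\n";
```

Keeping the two groups in separate hashes means the bundle list can be 
generated from %optional rather than maintained by hand.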

Nath


From chhalling at alumni.ls.berkeley.edu  Sun Oct 22 19:45:33 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Sun, 22 Oct 2006 19:45:33 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
Message-ID: <453C029D.1070708@alumni.ls.berkeley.edu>

I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 
that prevent these modules from being installed:

Data::Stag::Writer (listed as Data::Stag::writer)
HTTP::Request::Common (listed as HTTP::Request::Common-)
Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel)

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From cjfields at uiuc.edu  Sun Oct 22 22:24:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 21:24:07 -0500
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
Message-ID: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>

Thanks for letting us know!  Did PPM4 throw errors or just silently  
pass them over?

Chris

On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote:

> I have found three misspellings in Bundle::BioPerl 2.1.6 of 17- 
> Oct-2006
> that prevent these modules from being installed:
>
> Data::Stag::Writer (listed as Data::Stag::writer)
> HTTP::Request::Common (listed as HTTP::Request::Common-)
> Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel)
>
> -- 
> Conrad Halling
> chhalling at alumni.ls.berkeley.edu
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Mon Oct 23 02:45:29 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 06:45:29 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
Message-ID: <453C6509.90005@sheffield.ac.uk>

Chris Fields wrote:
> Thanks for letting us know!  Did PPM4 throw errors or just silently  
> pass them over?
>
> Chris
>
> On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote:
>
>   
I believe he is talking about the bundle on cpan and not the ppd. I will
get this updated as soon as possible.

Sendu/Chris - can you confirm to me which Bioperl modules are essential
to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
reason for not putting *all* dependencies into the bundle?

Nath


From bix at sendu.me.uk  Mon Oct 23 02:43:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 07:43:36 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
Message-ID: <453C6498.5@sendu.me.uk>

Conrad Halling wrote:
> I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 
> that prevent these modules from being installed:
> 
> Data::Stag::Writer (listed as Data::Stag::writer)

This should be Data::Stag::XMLWriter

> HTTP::Request::Common (listed as HTTP::Request::Common-)
> Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel)


From bix at sendu.me.uk  Mon Oct 23 02:52:47 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 07:52:47 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C6509.90005@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk>
Message-ID: <453C66BF.1060008@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu/Chris - can you confirm to me which Bioperl modules are essential
> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
> reason for not putting *all* dependencies into the bundle?

AFAIK, there are no essential external dependencies. Everything in 
%packages in Makefile.PL, for example, is optional.

We had the discussion about making all the easy-to-install ones a forced 
requirement anyway (so that most things work out of the box), but 
perhaps we'll hold off on making such a change until after 1.5.2.


From jyotikshah at gmail.com  Mon Oct 23 03:10:43 2006
From: jyotikshah at gmail.com (Jyoti Shah)
Date: Mon, 23 Oct 2006 00:10:43 -0700
Subject: [Bioperl-l] short motif searches
Message-ID: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com>

Hi,

I am interested in searching motifs as small as 6 or 7 nucleotides in
genomic databases. I need exact matches. Is there any bioperl module
available which can help me do this? I tried WU BLAST with word size one,
but I am getting warning messages such as "WARNING: the maximum achievable
score of 7 in context 0 (frame +1) is less than the ungapped cutoff score S2
(=13). Exit code 0...". Any suggestions?

Thanks in advance,
Jyoti


From bix at sendu.me.uk  Mon Oct 23 03:55:40 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 08:55:40 +0100
Subject: [Bioperl-l] short motif searches
In-Reply-To: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com>
References: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com>
Message-ID: <453C757C.1010408@sendu.me.uk>

Jyoti Shah wrote:
> Hi,
> 
> I am interested in searching motifs as small as 6 or 7 nucleotides in
> genomic databases. I need exact matches. Is there any bioperl module
> available which can help me do this?

At 6 or 7bp long doing a simple exact match I should point out you're 
going to get very many hits; are you sure this is an appropriate thing 
to do for your purposes?

Assuming yes, you can use Bio::SeqIO, Bio::Index or Bio::DB::<something> 
to get your genomic sequences of interest, then simply use a normal perl 
regexp on the resulting $seq->seq strings.

If your motifs are anything like transcription factor binding sites, and 
you have more information than just a single sequence string for the 
motif, investigate Bio::Matrix::PSM.
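
A minimal sketch of the regexp approach, in plain Perl so it runs 
without Bioperl installed. motif_positions() is a hypothetical helper, 
not a Bioperl method; in practice the sequence string would come from 
$seq->seq. It only searches the forward strand, so you would also want 
to search with the reverse complement of the motif.

```perl
use strict;
use warnings;

# Return the 1-based start of every exact, possibly overlapping, match.
sub motif_positions {
    my ($sequence, $motif) = @_;
    my $re = quotemeta $motif;
    my @starts;
    # A zero-width lookahead consumes nothing, so overlapping hits are found;
    # pos() then gives the 0-based offset at which the motif begins.
    while ($sequence =~ /(?=$re)/gi) {
        push @starts, pos($sequence) + 1;
    }
    return @starts;
}

my @hits = motif_positions('ttGATTACAgattaca', 'GATTACA');
print "match at $_\n" for @hits;
```

The /i flag is there because sequence files mix upper and lower case; 
drop it if case matters to you.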


From bix at sendu.me.uk  Mon Oct 23 04:29:52 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 09:29:52 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C7648.8030004@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk>
	<453C7648.8030004@sheffield.ac.uk>
Message-ID: <453C7D80.80207@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> Sendu/Chris - can you confirm to me which Bioperl modules are essential
>>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
>>> reason for not putting *all* dependencies into the bundle?
>> AFAIK, there are no essential external dependencies. Everything in
>> %packages in Makefile.PL, for example, is optional.
>>
>> We had the discussion about making all the easy-to-install ones a
>> forced requirement anyway (so that most things work out of the box),
>> but perhaps we'll hold off on making such a change until after 1.5.2.
 >
> How are they forced?

They're not. Right now they're optional. I'm suggesting we might change 
that in the future.

If you're asking how we /would/ force them, probably by adding 
PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs 
successfully (or should!) without its optional dependencies given in 
PREREQ_PM because make test succeeds (because tests skip ok when the 
optional dependency isn't there).
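
Something like the following sketch, where makefile_args() is a 
hypothetical helper and the NAME value and module list are placeholders 
rather than what our Makefile.PL actually contains. PREREQ_PM and 
PREREQ_FATAL are the real MakeMaker attributes.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the argument list for WriteMakefile().
sub makefile_args {
    my ($force_prereqs, %packages) = @_;
    my %args = (
        NAME      => 'Bio',            # placeholder
        PREREQ_PM => { %packages },
    );
    # With PREREQ_FATAL set, "perl Makefile.PL" dies on a missing
    # prerequisite instead of just warning about it.
    $args{PREREQ_FATAL} = 1 if $force_prereqs;
    return %args;
}

my %args = makefile_args(1, 'IO::String' => 0);
print join(', ', sort keys %args), "\n";

# In a real Makefile.PL this would be followed by:
#   use ExtUtils::MakeMaker;
#   WriteMakefile(%args);
```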

I don't really know how CPAN discovers dependencies and auto-installs 
them before a dependent module though. Anyone care to explain?


From n.haigh at sheffield.ac.uk  Mon Oct 23 06:09:12 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 10:09:12 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C7D80.80207@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk>
	<453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk>
Message-ID: <453C94C8.5040900@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Nathan S. Haigh wrote:
>>>> Sendu/Chris - can you confirm to me which Bioperl modules are
>>>> essential
>>>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
>>>> reason for not putting *all* dependencies into the bundle?
>>> AFAIK, there are no essential external dependencies. Everything in
>>> %packages in Makefile.PL, for example, is optional.
>>>
>>> We had the discussion about making all the easy-to-install ones a
>>> forced requirement anyway (so that most things work out of the box),
>>> but perhaps we'll hold off on making such a change until after 1.5.2.
> >
>> How are they forced?
>
> They're not. Right now they're optional. I'm suggesting we might
> change that in the future.
> If you're asking how we /would/ force them, probably by adding
> PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs
> successfully (or should!) without its optional dependencies given in
> PREREQ_PM because make test succeeds (because tests skip ok when the
> optional dependency isn't there).
>
> I don't really know how CPAN discovers dependencies and auto-installs
> them before a dependent module though. Anyone care to explain?

I thought so! I misunderstood something earlier which confused me. Just
to clarify for my own sanity's sake:

1) Currently all dependencies are optional.
2) All dependencies are in %packages
3) all these are passed to PREREQ_PM

As far as CPAN discovering dependencies goes, here is a snip from the CPAN FAQ:
--snip--

    I installed a Bundle and had a couple of fails. When I retried,
    everything resolved nicely. Can this be fixed to work on first try?

    The reason for this is that CPAN does not know the dependencies of
    all modules when it starts out. To decide about the additional items
    to install, it just uses data found in the META.yml file or the
    generated Makefile. An undetected missing piece breaks the process.
    But it may well be that your Bundle installs some prerequisite later
    than some depending item and thus your second try is able to resolve
    everything. Please note, CPAN.pm does not know the dependency tree
    in advance and cannot sort the queue of things to install in a
    topologically correct order. It resolves perfectly well IF all
    modules declare the prerequisites correctly with the PREREQ_PM
    attribute to MakeMaker or the |requires| stanza of Module::Build.
    For bundles which fail and you need to install often, it is
    recommended to sort the Bundle definition file manually.

--snip--

Therefore, recent modifications to Makefile.PL should result in a fully
operational Bioperl installation, if installed via CPAN, although only
Bioperl 1.4 is available via CPAN currently. It is possible to upload a
developer release to CPAN, which can only be downloaded via CPAN if
specifically asked for - this would be good for 1.5.x:
--snip--

    How do I install a "DEVELOPER RELEASE" of a module?

    By default, CPAN will install the latest non-developer release of a
    module. If you want to install a dev release, you have to specify
    the partial path starting with the author id to the tarball you wish
    to install, like so:

        cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz

    Note that you can use the |ls| command to get this path listed.

--snip--

HTH
Nath


From bix at sendu.me.uk  Mon Oct 23 05:41:52 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 10:41:52 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C94C8.5040900@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
Message-ID: <453C8E60.7000105@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>
>> I don't really know how CPAN discovers dependencies and auto-installs
>> them before a dependent module though. Anyone care to explain?
> 
> I thought so! I misunderstood something earlier which confused me. Just
> to clarify for my own sanity's sake:
> 
> 1) Currently all dependencies are optional.
> 2) All dependencies are in %packages
> 3) all these are passed to PREREQ_PM

All correct.


> As far as CPAN discovering dependencies, here is a snip from the CPAN FAQ's:
> --snip--
> 
>     I installed a Bundle and had a couple of fails. When I retried,
>     everything resolved nicely. Can this be fixed to work on first try?
> 
>     The reason for this is that CPAN does not know the dependencies of
>     all modules when it starts out. To decide about the additional items
>     to install, it just uses data found in the META.yml file or the
>     generated Makefile. An undetected missing piece breaks the process.
>     But it may well be that your Bundle installs some prerequisite later
>     than some depending item and thus your second try is able to resolve
>     everything. Please note, CPAN.pm does not know the dependency tree
>     in advance and cannot sort the queue of things to install in a
>     topologically correct order. It resolves perfectly well IF all
>     modules declare the prerequisites correctly with the PREREQ_PM
>     attribute to MakeMaker or the |requires| stanza of Module::Build.
>     For bundles which fail and you need to install often, it is
>     recommended to sort the Bundle definition file manually.
> 
> --snip--
>
> Therefore, recent modifications to Makefile.PL should result in a fully
> operational Bioperl installation, if installed via CPAN.

Right, thanks for that.


> Although only Bioperl 1.4 is available via CPAN currently, it is possible to upload a
> developer release to CPAN which can only be downloaded via CPAN if
> specifically asked for - would be good for 1.5.x:
> --snip--
> 
>     How do I install a "DEVELOPER RELEASE" of a module?
> 
>     By default, CPAN will install the latest non-developer release of a
>     module. If you want to install a dev release, you have to specify
>     the partial path starting with the author id to the tarball you wish
>     to install, like so:
> 
>         cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz
> 
>     Note that you can use the |ls| command to get this path listed.
> 
> --snip--

That's the user point of view - how does the developer actually tell 
CPAN that something is a developer release so that normal users don't 
automatically install it?


From bix at sendu.me.uk  Mon Oct 23 05:59:52 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 10:59:52 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C8E60.7000105@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
	<453C8E60.7000105@sendu.me.uk>
Message-ID: <453C9298.9000900@sendu.me.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> As far as CPAN discovering dependencies, here is a snip from the CPAN 
>> FAQ's:
>> --snip--
>>
>>     I installed a Bundle and had a couple of fails. When I retried,
>>     everything resolved nicely. Can this be fixed to work on first try?
>>
>>     The reason for this is that CPAN does not know the dependencies of
>>     all modules when it starts out. To decide about the additional items
>>     to install, it just uses data found in the META.yml file or the
>>     generated Makefile. An undetected missing piece breaks the process.
>>     But it may well be that your Bundle installs some prerequisite later
>>     than some depending item and thus your second try is able to resolve
>>     everything. Please note, CPAN.pm does not know the dependency tree
>>     in advance and cannot sort the queue of things to install in a
>>     topologically correct order. It resolves perfectly well IF all
>>     modules declare the prerequisites correctly with the PREREQ_PM
>>     attribute to MakeMaker or the |requires| stanza of Module::Build.
>>     For bundles which fail and you need to install often, it is
>>     recommended to sort the Bundle definition file manually.
>>
>> --snip--
>>
>> Therefore, recent modifications to Makefile.PL should result in a fully
>> operational Bioperl installation, if installed via CPAN.
> 
> Right, thanks for that.

Oh, so this effectively means that our 'optional' dependencies are 
installed for CPAN users, which matches up to my 'force the optional 
ones anyway' desire, leaving Bundle::BioPerl without any use.

Makefile.PL could be altered again to remove from PREREQ_PM those 
modules the user didn't already have installed, thus CPAN would only 
install Bioperl itself and nothing optional. The user could then install 
Bundle::BioPerl if they wanted a quick way of getting all the optional 
stuff to work.
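
A minimal sketch of what that filtering might look like. The %optional hash
and the is_installed() helper are made up for illustration (the real list
lives in Makefile.PL), and the second module name is deliberately one that
does not exist:

```perl
use strict;
use warnings;

# Hypothetical subset of optional modules => minimum version (0 = any).
my %optional = (
    'File::Spec'                      => 0,    # ships with Perl, so present
    'Bio::No::Such::Optional::Module' => 0,    # deliberately absent
);

# True if the module can be loaded on this system.
sub is_installed {
    my ($module) = @_;
    return eval("require $module; 1") ? 1 : 0;
}

# Keep only the optional modules the user already has.
my %prereq_pm = map  { $_ => $optional{$_} }
                grep { is_installed($_) } keys %optional;

print "PREREQ_PM would contain: ", join(', ', sort keys %prereq_pm), "\n";
```

The resulting hash is what would be handed to WriteMakefile as
PREREQ_PM => \%prereq_pm, so CPAN never sees the missing optional modules.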

I'm happy either way; what do other people think?


From n.haigh at sheffield.ac.uk  Mon Oct 23 07:22:17 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 11:22:17 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C9298.9000900@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
	<453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk>
Message-ID: <453CA5E9.1060406@sheffield.ac.uk>

Sendu Bala wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> As far as CPAN discovering dependencies, here is a snip from the
>>> CPAN FAQ's:
>>> --snip--
>>>
>>>     I installed a Bundle and had a couple of fails. When I retried,
>>>     everything resolved nicely. Can this be fixed to work on first try?
>>>
>>>     The reason for this is that CPAN does not know the dependencies of
>>>     all modules when it starts out. To decide about the additional items
>>>     to install, it just uses data found in the META.yml file or the
>>>     generated Makefile. An undetected missing piece breaks the process.
>>>     But it may well be that your Bundle installs some prerequisite later
>>>     than some depending item and thus your second try is able to resolve
>>>     everything. Please note, CPAN.pm does not know the dependency tree
>>>     in advance and cannot sort the queue of things to install in a
>>>     topologically correct order. It resolves perfectly well IF all
>>>     modules declare the prerequisites correctly with the PREREQ_PM
>>>     attribute to MakeMaker or the |requires| stanza of Module::Build.
>>>     For bundles which fail and you need to install often, it is
>>>     recommended to sort the Bundle definition file manually.
>>>
>>> --snip--
>>>
>>> Therefore, recent modifications to Makefile.PL should result in a fully
>>> operational Bioperl installation, if installed via CPAN.
>>
>> Right, thanks for that.
>
> Oh, so this effectively means that our 'optional' dependencies are
> installed for CPAN users, which matches up to my 'force the optional
> ones anyway' desire, leaving Bundle::BioPerl without any use.
>
> Makefile.PL could be altered again to remove from PREREQ_PM those
> modules the user didn't already have installed, thus CPAN would only
> install Bioperl itself and nothing optional. The user could then
> install Bundle::BioPerl if they wanted a quick way of getting all the
> optional stuff to work.
>
> I'm happy either way; what do other people think?

From my point of view, removing them from PREREQ_PM makes building
Bundle::BioPerl a bit of a pain :o(

I prefer the way it is currently set up - most people have fast internet
connections and gigabytes of hard drive space. Other than the reason "why
install something I won't ever need", I don't see much point in maintaining
Bundle::BioPerl and having "optional" dependencies. If there are
any modules which are not going to be used by the majority of users,
then perhaps that could be the rationale for moving them out of
bioperl-core into another package?

Nath


From n.haigh at sheffield.ac.uk  Mon Oct 23 07:38:05 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 11:38:05 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C8E60.7000105@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
	<453C8E60.7000105@sendu.me.uk>
Message-ID: <453CA99D.9060009@sheffield.ac.uk>


>> Only Bioperl 1.4 is currently available via CPAN, though. It is possible to
>> upload a developer release to CPAN which can only be downloaded via CPAN if
>> specifically asked for - would be good for 1.5.x:
>> --snip--
>>
>>     How do I install a "DEVELOPER RELEASE" of a module?
>>
>>     By default, CPAN will install the latest non-developer release of a
>>     module. If you want to install a dev release, you have to specify
>>     the partial path starting with the author id to the tarball you wish
>>     to install, like so:
>>
>>         cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz
>>
>>     Note that you can use the |ls| command to get this path listed.
>>
>> --snip--
>
> That's the user point of view - how does the developer actually tell
> CPAN that something is a developer release so that normal users don't
> automatically install it?

I found this:
http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt

It says that $VERSION should simply be changed from a naked number into
a single-quoted number and this should be recognized by the CPAN indexer.
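
For what it's worth, the quoted form usually goes together with an eval, as
described in perlmodstyle. A small sketch, where the package name
My::Module, the $VERSION_STRING variable and the version value are all
placeholders rather than anything Bioperl uses:

```perl
package My::Module;    # placeholder name for illustration
use strict;
use warnings;

# Quoted, so the underscore survives for the CPAN tools to see.
our $VERSION = '1.05_02';

# Keep the string form around (hypothetical variable), then numify
# so that runtime version comparisons see a plain number.
our $VERSION_STRING = $VERSION;
$VERSION = eval $VERSION;    # the underscore is dropped here

package main;

print "string: $My::Module::VERSION_STRING\n";
print "number: $My::Module::VERSION\n";
```

Without the quotes Perl treats 1.05_02 as an ordinary numeric literal and
silently drops the underscore, so nothing downstream could tell it apart
from 1.0502.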

Nath


From bix at sendu.me.uk  Mon Oct 23 06:47:38 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 11:47:38 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>	<4535EBF9.1090706@sendu.me.uk>
	<4536113D.1080307@sheffield.ac.uk>
	<E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>
Message-ID: <453C9DCA.4020802@sendu.me.uk>

Hilmar Lapp wrote:
> On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote:
> 
>> For example, I have made no effort to setup biosql-schema but I
>> thought that maybe there would be a test that would detect this
> 
> I'm afraid there isn't. Bioperl-db is meaningless without
> biosql-schema.

Can you suggest a way we might detect if biosql-schema has been 
installed prior to running the test suite, so we can give some 
meaningful error message?


From bix at sendu.me.uk  Mon Oct 23 08:43:30 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 13:43:30 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
Message-ID: <453CB8F2.7070703@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>
>> Makefile.PL could be altered again to remove from PREREQ_PM those
>> modules the user didn't already have installed, thus CPAN would only
>> install Bioperl itself and nothing optional. The user could then
>> install Bundle::BioPerl if they wanted a quick way of getting all the
>> optional stuff to work.
>>
>> I'm happy either way; what do other people think?
>
> From my point of view, removing them from PREREQ_PM means building the
> Bundle::BioPerl a bit of a pain :o(

Can I ask how you're generating Bundle::BioPerl? That is, how did the 
typos get in there? Is there a way to reliably avoid typos in the future?


From n.haigh at sheffield.ac.uk  Mon Oct 23 09:46:17 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 13:46:17 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CB8F2.7070703@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
	<453CB8F2.7070703@sendu.me.uk>
Message-ID: <453CC7A9.6090609@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>
>>> Makefile.PL could be altered again to remove from PREREQ_PM those
>>> modules the user didn't already have installed, thus CPAN would only
>>> install Bioperl itself and nothing optional. The user could then
>>> install Bundle::BioPerl if they wanted a quick way of getting all the
>>> optional stuff to work.
>>>
>>> I'm happy either way; what do other people think?
>>
>> From my point of view, removing them from PREREQ_PM means building the
>> Bundle::BioPerl a bit of a pain :o(
>
> Can I ask how you're generating Bundle::BioPerl? That is, how did the
> typos get in there? Is there a way to certainly avoid typos in the
> future?

I just modified the list by hand a while back :o( - I'm sure there must
be a better way.


From bix at sendu.me.uk  Mon Oct 23 08:58:13 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 13:58:13 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CC7A9.6090609@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
	<453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk>
Message-ID: <453CBC65.2020202@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> Sendu Bala wrote:
>>>
>>>> Makefile.PL could be altered again to remove from PREREQ_PM those
>>>> modules the user didn't already have installed, thus CPAN would only
>>>> install Bioperl itself and nothing optional. The user could then
>>>> install Bundle::BioPerl if they wanted a quick way of getting all the
>>>> optional stuff to work.
>>>>
>>>> I'm happy either way; what do other people think?
>>>
>>> From my point of view, removing them from PREREQ_PM means building the
>>> Bundle::BioPerl a bit of a pain :o(
>>
>> Can I ask how you're generating Bundle::BioPerl? That is, how did the
>> typos get in there? Is there a way to certainly avoid typos in the
>> future?
> 
> I just modified the list by hand a while back :o( - I'm sure there must
> be a better way.

I'm not sure I understand why removing things from PREREQ_PM would be a 
problem for you then; the %packages hash would remain unchanged (ie. 
have everything) so you have something to refer to when manually editing 
the Bundle.

http://www.cpan.org/misc/cpan-faq.html#How_make_bundle
might be helpful? I didn't really pay too much attention to the advice - 
does it offer a typo-avoiding solution?


From n.haigh at sheffield.ac.uk  Mon Oct 23 10:04:12 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 14:04:12 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CBC65.2020202@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
	<453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk>
	<453CBC65.2020202@sendu.me.uk>
Message-ID: <453CCBDC.6030904@sheffield.ac.uk>


> I'm not sure I understand why removing things from PREREQ_PM would be
> a problem for you then; the %packages hash would remain unchanged (ie.
> have everything) so you have something to refer to when manually
> editing the Bundle.
>
> http://www.cpan.org/misc/cpan-faq.html#How_make_bundle
> might be helpful? I didn't really pay too much attention to the advice
> - does it offer a typo-avoiding solution?

It's helpful in producing the Bundle PPD as all the XML tags are present
in the Bioperl PPD and they simply need to be copied over to a
Bundle-BioPerl PPD file.

Looks like manual editing of the relevant file is required for making a
CPAN bundle. Unfortunately - no typo-avoiding solution. :o(
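
One possible way around hand-editing would be to generate the CONTENTS
section from the same hash that Makefile.PL already uses. This is only a
sketch: the %packages layout (module name => minimum version), the version
numbers and the bundle_contents() helper are assumptions for illustration,
not the real Makefile.PL structure.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the %packages hash in Makefile.PL:
# module name => minimum version (0 means "any version").
# The entries and version numbers are placeholders.
my %packages = (
    'IO::String'  => 0,
    'XML::Parser' => '2.30',
);

# Build the pod CONTENTS section of a Bundle file: one module per
# paragraph, optionally followed by a version.
sub bundle_contents {
    my (%mods) = @_;
    my @lines = map { $mods{$_} ? "$_ $mods{$_}" : $_ } sort keys %mods;
    return join("\n\n", '=head1 CONTENTS', @lines) . "\n\n=cut\n";
}

print bundle_contents(%packages);
```

Since the module names are never retyped, any typo would have to be in
Makefile.PL too, where the existing prerequisite checks would show it up.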


From dhoworth at mrc-lmb.cam.ac.uk  Mon Oct 23 08:46:29 2006
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Mon, 23 Oct 2006 13:46:29 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CA99D.9060009@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453CA99D.9060009@sheffield.ac.uk>
Message-ID: <453CB9A5.2020409@mrc-lmb.cam.ac.uk>

>> That's the user point of view - how does the developer actually tell
>> CPAN that something is a developer release so that normal users don't
>> automatically install it?
> 
> I found this:
> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
> 
> It says that $VERSION should simply be changed from a naked number into
> a single quoted number and this should be recognized by the CPAN indexer.

<http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>

Cheers, Dave


From hlapp at gmx.net  Mon Oct 23 09:40:29 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 23 Oct 2006 09:40:29 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <453C9DCA.4020802@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>	<4535EBF9.1090706@sendu.me.uk>
	<4536113D.1080307@sheffield.ac.uk>
	<E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>
	<453C9DCA.4020802@sendu.me.uk>
Message-ID: <5C22B9C8-CEF0-457B-8565-793D56389A86@gmx.net>

You would need a lot of information to make that determination (host,  
port, db driver, db name, user, password; i.e., the entire connection  
information, and there is no 'standard').

You might just ask a simple question in Makefile.PL as to whether  
biosql is installed or not, similar to the DB::GFF tests.
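
Something along these lines, as a rough sketch. prompt() is the one
exported by ExtUtils::MakeMaker; the question wording, the is_yes() helper
and the $run_db_tests variable are made up for illustration.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker qw(prompt);

# Turn a free-form answer into a boolean; anything starting with "y" is yes.
sub is_yes {
    my ($answer) = @_;
    return (defined $answer && $answer =~ /^\s*y/i) ? 1 : 0;
}

# prompt() falls back to the default ('n' here) when run non-interactively.
my $answer = prompt('Have you loaded biosql-schema into a database? y/n', 'n');
my $run_db_tests = is_yes($answer);

print $run_db_tests
    ? "Database tests will be run.\n"
    : "Skipping database tests: biosql-schema not set up.\n";
```

Defaulting to 'n' means an unattended CPAN install skips the database tests
instead of failing on them.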

	-hilmar

On Oct 23, 2006, at 6:47 AM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote:
>>
>>> For example, I have made no effort to setup biosql-schema but I
>>> thought that maybe there would be a test that would detect this
>>
>> I'm afraid there isn't. Bioperl-db is meaningless without
>> biosql-schema.
>
> Can you suggest a way we might detect if biosql-schema has been
> installed prior to running the test suite, so we can give some
> meaningful error message?
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Mon Oct 23 09:59:23 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 14:59:23 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CB9A5.2020409@mrc-lmb.cam.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>
	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>
Message-ID: <453CCABB.2060308@sendu.me.uk>

Dave Howorth wrote:
>>> That's the user point of view - how does the developer actually tell
>>> CPAN that something is a developer release so that normal users don't
>>> automatically install it?
>> I found this:
>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>
>> It says that $VERSION should simply be changed from a naked number into
>> a single quoted number and this should be recognized by the CPAN indexer.
> 
> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>

Thanks for that.

I guess from that the 1.5.2 version number should be:

$VERSION = 1.05_02

And 1.6 would be

$VERSION = 1.06

But will this cause a problem wrt 1.4? 1.4 has:

$VERSION = 1.4;

Is 1.4 lower than 1.06? Should we keep to a single digit version, so 
1.5_02 and 1.6? Does this really not work with CPAN? Should we call them 
version fifty and version sixty? 1.50_02, 1.60?
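
To make the ordering question concrete, here is a quick sketch of how these
candidates compare as plain numbers. Note that it only shows Perl's numeric
comparison; whether the CPAN indexer orders them the same way is an
assumption I haven't checked.

```perl
use strict;
use warnings;

# Version numbers as plain decimals, the way a numeric comparison sees them.
# A bare literal with an underscore is just a number: 1.05_02 == 1.0502.
my @candidates = (1.4, 1.06, 1.05_02, 1.6, 1.50_02);

my @sorted = sort { $a <=> $b } @candidates;
print join(' < ', @sorted), "\n";
```

So numerically 1.05_02 and 1.06 both sort below the existing 1.4, whereas
1.50_02 and 1.6 (or 1.60) sort above it.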


From cjfields at uiuc.edu  Mon Oct 23 10:12:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 09:12:16 -0500
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C9298.9000900@sendu.me.uk>
Message-ID: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>

...
> > Right, thanks for that.
> 
> Oh, so this effectively means that our 'optional' dependencies are
> installed for CPAN users, which matches up to my 'force the optional
> ones anyway' desire, leaving Bundle::BioPerl without any use.
> 
> Makefile.PL could be altered again to remove from PREREQ_PM those
> modules the user didn't already have installed, thus CPAN would only
> install Bioperl itself and nothing optional. The user could then install
> Bundle::BioPerl if they wanted a quick way of getting all the optional
> stuff to work.
> 
> I'm happy either way; what do other people think?

I think that we should have it so Bioperl installs as-is (no additional
reqs) and have Bundle::BioPerl used as a convenient way to install all
optional modules for full functionality.  The catch is to make sure that any
optional installations do not crash tests during a CPAN bioperl
installation, otherwise they aren't considered optional by CPAN, and the
install won't work without forcing it.

Frankly, most users will find themselves wanting to install the Bundle
anyway to get full functionality, so we could always 'strongly recommend'
preceding the bioperl installation with a Bundle::BioPerl CPAN installation
to avoid problems, at least for this release. 

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct 23 10:23:04 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 09:23:04 -0500
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk>
Message-ID: <002101c6f6ae$c14d7860$15327e82@pyrimidine>

...
> >> Right, thanks for that.
> >
> > Oh, so this effectively means that our 'optional' dependencies are
> > installed for CPAN users, which matches up to my 'force the optional
> > ones anyway' desire, leaving Bundle::BioPerl without any use.
> >
> > Makefile.PL could be altered again to remove from PREREQ_PM those
> > modules the user didn't already have installed, thus CPAN would only
> > install Bioperl itself and nothing optional. The user could then
> > install Bundle::BioPerl if they wanted a quick way of getting all the
> > optional stuff to work.
> >
> > I'm happy either way; what do other people think?
> > From my point of view, removing them from PREREQ_PM means building the
> Bundle::BioPerl a bit of a pain :o(
> 
> I prefer the way it is currently set up - most people have fast internet
> connections and GB of harddrive space. Other than the reason "why
> install something I won't ever need" I don't see much point maintaining
> Bundle::BioPerl and having "optional" dependencies. I think if there are
> any modules which are not going to be used by the majority of users,
> then this could be used as the rationale for removing them from
> bioperl-core into another package?
> 
> Nath

I think you'll likely find it much easier to maintain a Bundle package
long-term and indicate that it should be installed along with bioperl, than
to have users complain about a particular Bioperl module failing b/c a
particular dependency wasn't installed.  

If we have the Bundle around in CPAN and in PPM for Win32 users, and
indicate in the INSTALL docs and the wiki our preference that it be
installed prior to or along with a Bioperl installation for beginners, we
can mitigate most of those problems.  Nip it in the bud, to quote a Mr.
Barney Fife.

My 2c

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 23 10:29:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 09:29:33 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CCABB.2060308@sendu.me.uk>
Message-ID: <002201c6f6af$a91e4200$15327e82@pyrimidine>

> Dave Howorth wrote:
> >>> That's the user point of view - how does the developer actually tell
> >>> CPAN that something is a developer release so that normal users don't
> >>> automatically install it?
> >> I found this:
> >> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
> >>
> >> It says that $VERSION should simply be changed from a naked number into
> >> a single quoted number and this should be recognized by the CPAN indexer.
> >
> > <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
> 
> Thanks for that.
> 
> I guess from that the 1.5.2 version number should be:
> 
> $VERSION = 1.05_02
> 
> And 1.6 would be
> 
> $VERSION = 1.06
> 
> But will this cause a problem wrt 1.4? 1.4 has:
> 
> $VERSION = 1.4;
> 
> Is 1.4 lower than 1.06? Should we keep to a single digit version, so
> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them
> version fifty and version sixty? 1.50_02, 1.60?

Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax?  It would be
much simpler to use that. 

Simon Cozens wrote about this a while back:

http://www.perl.com/pub/a/2000/04/whatsnew.html

...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From bix at sendu.me.uk  Mon Oct 23 10:41:24 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 15:41:24 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <002201c6f6af$a91e4200$15327e82@pyrimidine>
References: <002201c6f6af$a91e4200$15327e82@pyrimidine>
Message-ID: <453CD494.8070905@sendu.me.uk>

Chris Fields wrote:
>> Dave Howorth wrote:
>>>>> That's the user point of view - how does the developer actually tell
>>>>> CPAN that something is a developer release so that normal users don't
>>>>> automatically install it?
>>>> I found this:
>>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>>>
>>>> It says that $VERSION should simply be changed from a naked number into
>>>> a single quoted number and this should be recognized by the CPAN indexer.
>>>
>>> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
>>
>> Thanks for that.
>>
>> I guess from that the 1.5.2 version number should be:
>>
>> $VERSION = 1.05_02
>>
>> And 1.6 would be
>>
>> $VERSION = 1.06
>>
>> But will this cause a problem wrt 1.4? 1.4 has:
>>
>> $VERSION = 1.4;
>>
>> Is 1.4 lower than 1.06? Should we keep to a single digit version, so
>> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them
>> version fifty and version sixty? 1.50_02, 1.60?
> 
> Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax?  It would be
> much simpler to use that. 

That does not present us with a way to have 1.5.2 marked as a developer 
release in CPAN.

Also, see the discussion here: 
http://perldoc.perl.org/functions/require.html

Since we require 5.6.1 the backwards-compatibility issues maybe don't apply 
to us, but do these ideas work with modules, or just Perl itself? Is 
CPAN et al. happy with this form of versioning?

/Something/ needs to be done about Bioperl versioning, because the 
current 1.4 or 1.5 is completely inadequate.


From bix at sendu.me.uk  Mon Oct 23 10:51:25 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 15:51:25 +0100
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>
References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>
Message-ID: <453CD6ED.5050507@sendu.me.uk>

Chris Fields wrote:

[option 1]
>> Oh, so this effectively means that our 'optional' dependencies are 
>> installed for CPAN users, which matches up to my 'force the
>> optional ones anyway' desire, leaving Bundle::BioPerl without any
>> use.

[option 2]
>> Makefile.PL could be altered again to remove from PREREQ_PM those 
>> modules the user didn't already have installed, thus CPAN would
>> only install Bioperl itself and nothing optional. The user could
>> then install Bundle::BioPerl if they wanted a quick way of getting
>> all the optional stuff to work.
>> 
>> I'm happy either way; what do other people think?
> 
> I think that we should have it so Bioperl installs as-is (no
> additional reqs) and have Bundle::BioPerl used as a convenient way to
> install all optional modules for full functionality.

Note we're specifically considering a CPAN install here. If you download
the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::BioPerl is
still needed as a convenience if you want to install the optional
external dependencies.


> The catch is to make sure that any optional installations do not
> crash tests during a CPAN bioperl installation, otherwise they aren't
> considered optional by CPAN, and the install won't work without
> forcing it.

I'm pretty sure this isn't a problem, though it would be nice if someone 
could test it on a clean system: does 'make test' pass all ok with none 
of the optional modules installed?


Anyway, to reiterate the question: Do we care if CPAN users get all the 
optional external dependencies installed for them automatically, or do 
we want to force them to install Bundle?

The current situation is: CPAN users will get all optional external 
dependencies without using Bundle::BioPerl. Manual installers of bioperl 
(from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to 
get full functionality.


From n.haigh at sheffield.ac.uk  Mon Oct 23 12:30:34 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 16:30:34 +0000
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CCABB.2060308@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>
	<453CCABB.2060308@sendu.me.uk>
Message-ID: <453CEE2A.8000002@sheffield.ac.uk>

Sendu Bala wrote:
> Dave Howorth wrote:
>   
>>>> That's the user point of view - how does the developer actually tell
>>>> CPAN that something is a developer release so that normal users don't
>>>> automatically install it?
>>>>         
>>> I found this:
>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>>
>>> It says that $VERSION should simply be changed from a naked number into
>>> a single quoted number and this should be recognized by the CPAN indexer.
>>>       
>> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
>>     
>
> Thanks for that.
>
> I guess from that the 1.5.2 version number should be:
>
> $VERSION = 1.05_02
>
> And 1.6 would be
>
> $VERSION = 1.06
>
> But will this cause a problem wrt 1.4? 1.4 has:
>
> $VERSION = 1.4;
>
> Is 1.4 lower than 1.06? Should we keep to a single digit version, so 
> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them 
> version fifty and version sixty? 1.50_02, 1.60?
>   
I believe the link to the documentation above describes a common CPAN
versioning scheme as follows:

1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32

Therefore version 1.5 of Bioperl would be 1.50, and version 1.5.2 would
be better as 1.52. Then, to indicate that the 1.5 series is a developer
release, you append an underscore and at least 2 digits, resulting
in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be
1.52_01. The only thing I'm unsure about is when the _01 gets
incremented. I suspect we would probably not increment this number, since
each release would be an increment of the minor release number, e.g.
1.52_01, 1.53_01, 1.54_01 etc.

Although I'm still not sure how this versioning would affect bioperl 1.4
since 1.4 uses a non-standard versioning scheme :o(
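
A rough sketch of that mapping, plus a numeric check against 1.4. The
cpan_version() helper is hypothetical and assumes single-digit minor and
patch numbers, which holds for the releases mentioned here but not in
general.

```perl
use strict;
use warnings;

# Map a dotted release name (e.g. '1.5.2') onto the two-digit scheme,
# appending '_01' for developer releases. Assumes minor and patch are
# single digits.
sub cpan_version {
    my ($release, $is_dev) = @_;
    my ($major, $minor, $patch) = split /\./, $release;
    $patch = 0 unless defined $patch;
    my $v = sprintf '%d.%d%d', $major, $minor, $patch;
    return $is_dev ? "${v}_01" : $v;
}

for my $spec (['1.5', 1], ['1.5.2', 1], ['1.6', 0]) {
    my $v = cpan_version(@$spec);
    (my $n = $v) =~ tr/_//d;    # numeric value once the underscore is dropped
    printf "%-6s => %-8s (%s 1.4)\n", $spec->[0], $v,
        $n > 1.4 ? 'greater than' : 'not greater than';
}
```

Compared as plain numbers, all three come out greater than 1.4, so at least
numerically the old 1.4 would sort below them.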

As I understand it, the versioning of the Perl releases uses the x.y.z
scheme. But apparently CPAN modules should use the above versioning scheme.

Nath


From cjfields at uiuc.edu  Mon Oct 23 11:36:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 10:36:37 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CD6ED.5050507@sendu.me.uk>
Message-ID: <000c01c6f6b9$0781af40$15327e82@pyrimidine>

...
> 
> Note we're specifically considering a CPAN install here. If you download
> the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::BioPerl is
> still needed as a convenience if you want to install the optional
> external dependencies.
> 

Agreed.  I don't think the Bundle is dispensable.  For instance, it's very
easy for us to just tell beginners to install Bundle::BioPerl before
installing bioperl itself, as opposed to having them inundate the mail list
with questions about why script x.pl didn't work, which could simply be due
to a missing required module. 

> I'm pretty sure this isn't a problem, though it would be nice if someone
> could test it on a clean system: does 'make test' pass all ok with none
> of the optional modules installed?

So far on WinXP everything passes; I ran a clean perl installation a while
ago using nmake and tests passed.

> Anyway, to reiterate the question: Do we care if CPAN users get all the
> optional external dependencies installed for them automatically, or do
> we want to force them to install Bundle?
> 
> The current situation is: CPAN users will get all optional external
> dependencies without using Bundle::BioPerl. Manual installers of bioperl
> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
> get full functionality.

I don't think forcing is necessary, so a CPAN installation shouldn't force
someone to install optional modules.  Graph.pm, for instance, has a few
optional modules, and the tests which use those get skipped and pass so the
installation proceeds w/o problems.  We could do the same (any tests using
those optional modules display the reason why they are skipped).  
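
Something along these lines, as a rough sketch using Test::More rather than
anything we currently have in t/ (the module name Some::Optional::Module is
a made-up placeholder, so on most systems the second test is skipped):

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical optional dependency; substitute a real module name.
my $optional = 'Some::Optional::Module';

# Try to load it without dying if it is absent.
my $have_optional = eval "require $optional; 1" ? 1 : 0;

# Tests of core functionality always run.
ok(1, 'core test runs regardless');

SKIP: {
    # A skipped test is reported as ok, with the reason shown in the output.
    skip "$optional not installed", 1 unless $have_optional;
    ok($optional->can('new'), "$optional provides new()");
}
```

The installation then proceeds either way, and the user still sees which
module was missing.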

I would strongly state in the INSTALL and INSTALL.WIN docs that (new) users
should install Bundle::Bioperl before installing Bioperl core for full
functionality.  If you are an advanced user and know your way around
CPAN/Perl, then you can install the individual optional modules
depending on your particular needs.

Chris


From n.haigh at sheffield.ac.uk  Mon Oct 23 12:38:00 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 16:38:00 +0000
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CD6ED.5050507@sendu.me.uk>
References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>
	<453CD6ED.5050507@sendu.me.uk>
Message-ID: <453CEFE8.4000704@sheffield.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>
> [option 1]
>   
>>> Oh, so this effectively means that our 'optional' dependencies are 
>>> installed for CPAN users, which matches up to my 'force the
>>> optional ones anyway' desire, leaving Bundle::BioPerl without any
>>> use.
>>>       
>
> [option 2]
>   
>>> Makefile.PL could be altered again to remove from PREREQ_PM those 
>>> modules the user didn't already have installed, thus CPAN would
>>> only install Bioperl itself and nothing optional. The user could
>>> then install Bundle::BioPerl if they wanted a quick way of getting
>>> all the optional stuff to work.
>>>
>>> I'm happy either way; what do other people think?
>>>       
>> I think that we should have it so Bioperl installs as-is (no
>> additional reqs) and have Bundle::BioPerl used as a convenient way to
>> install all optional modules for full functionality.
>>     
>
> Note we're specifically considering a CPAN install here. If you download
> the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is
> still needed as a convenience if you want to install the optional
> external dependencies.
>
>
>   
>> The catch is to make sure that any optional installations do not
>> crash tests during a CPAN bioperl installation, otherwise they aren't
>> considered optional by CPAN, and the install won't work without
>> forcing it.
>>     
>
> I'm pretty sure this isn't a problem, though it would be nice if someone 
> could test it on a clean system: does 'make test' pass all ok with none 
> of the optional modules installed?
>
>   

I could definitely do this on WinXP and *possibly* on a Linux system.

> Anyway, to reiterate the question: Do we care if CPAN users get all the 
> optional external dependencies installed for them automatically, or do 
> we want to force them to install Bundle?
>
>   

I'd prefer any dependencies, whether they are seen as vital to the main
functionality of Bioperl or not, to be actually specified in PREREQ_PM (as
they currently are). A dependency is a dependency - is it not? If a
distinction is to be made based on whether the requiring module is
simply adding additional functionality to Bioperl-core, then shouldn't
it be moved out of core and into another package as with the run modules
if we are to have "optional" dependencies?

my 2p
Nath

> The current situation is: CPAN users will get all optional external 
> dependencies without using Bundle::BioPerl. Manual installers of bioperl 
> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to 
> get full functionality.


From cjfields at uiuc.edu  Mon Oct 23 11:39:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 10:39:09 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CD494.8070905@sendu.me.uk>
Message-ID: <000d01c6f6b9$62033d80$15327e82@pyrimidine>

...
> That does not present us with a way to have 1.5.2 marked as a developer
> release in CPAN.
> 
> Also, see the discussion here:
> http://perldoc.perl.org/functions/require.html
> 
> Since we require 5.6.1 the backwards-compatible issues maybe don't apply
> to us, but do these ideas work with modules, or just Perl itself? Is
> CPAN et al. happy with this form of versioning?
> 
> /Something/ needs to be done about Bioperl versioning, because the
> current 1.4 or 1.5 is completely inadequate.

I think using 'require Foo x.y.z' is applicable to modules as well.  There
is something in Programming Perl about this, just don't have it on hand...

Not sure about CPAN, so we need to look into it.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Mon Oct 23 11:42:15 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 16:42:15 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CEE2A.8000002@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>
	<453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk>
Message-ID: <453CE2D7.5080608@sendu.me.uk>

Nathan S. Haigh wrote:
> I believe the link to the documentation above describes a common CPAN
> versioning scheme as follows:
> 
> 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32
> 
> Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would
> be better as 1.52. Then to indicate that the 1.5 series is a developer
> release, you append the underscore and at least 2 digits. Thus resulting
> in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be
> 1.52_01. The only thing I'm unsure about would be when does the _01 get
> incremented? I suspect we would probably not increment this number since
> each release would be an increment of the minor release number e.g.
> 1.52_01, 1.53_01, 1.54_01 etc.
> 
> Although I'm still not sure how this versioning would affect bioperl 1.4
> since 1.4 uses a non-standard versioning scheme :o(

Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be 
treated higher than 1.4? Anyway, we can cross that bridge when we get 
there, but this seems appropriate now.


Cheers,
Sendu.


From bix at sendu.me.uk  Mon Oct 23 11:59:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 16:59:01 +0100
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
References: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
Message-ID: <453CE6C5.6000108@sendu.me.uk>

Chris Fields wrote:
> ...
>> The current situation is: CPAN users will get all optional external
>> dependencies without using Bundle::BioPerl. Manual installers of bioperl
>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
>> get full functionality.
> 
> I don't think forcing is necessary, so a CPAN installation shouldn't force
> someone to install optional modules.  Graph.pm, for instance has a few
> optional modules, and the tests which use those get skipped and pass so the
> installation proceeds w/o problems.  We could do the same (any tests using
> those optional modules display the reason why they are skipped).  

I should clarify and say that that's what happens in Bioperl as well. 
The 'forcing' that I talk about is simply what I assume will happen if 
the user has CPAN set to automatically install dependencies. The user 
could say 'no' to every question regarding the installation of 
dependencies that CPAN discovers and Bioperl would still install fine.

So really the difference between the current situation and, say, the 
situation when 1.5.1 was released, is that the CPAN user doesn't have to 
use Bundle::BioPerl for full functionality anymore, but can still choose 
not to install all the optional external modules.

The difference is the possible default behaviour. Those users that 
auto-install dependencies get all the optional ones, whereas in the past 
they would not have. I have to point out the benefit of this behaviour: 
those people that don't care and just want it to work are more likely to 
get an installation that does just work. People who know what they're 
doing can still do what they want.


Before we decide what to do I guess we need hard confirmation of how 
CPAN will actually behave with the current Makefile.PL. Any ideas how we 
can find out?

It would also be good to have more opinions to break the current tie 
(Nathan is for keeping PREREQ_PM populated, Chris is for having it 
empty, I can go either way)...
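
For reference, option 2 could look roughly like this inside Makefile.PL. It
is only a sketch: the %optional hash and the module names in it are
placeholders, not our real list, and the final WriteMakefile call is left
as a comment.

```perl
use strict;
use warnings;

# Placeholder list of optional modules and their minimum versions.
my %optional = (
    'File::Spec'             => 0,    # core module, so normally present
    'Some::Optional::Module' => 0,    # made-up name, normally absent
);

# Keep only the modules that can already be loaded on this system.
my %prereq_pm = map  { ($_ => $optional{$_}) }
                grep { eval "require $_; 1" }
                sort keys %optional;

print "PREREQ_PM would contain: ", join(', ', sort keys %prereq_pm), "\n";

# In a real Makefile.PL this hash would then be handed to
# ExtUtils::MakeMaker, e.g. WriteMakefile(..., PREREQ_PM => \%prereq_pm);
```

With option 1 the whole %optional hash would go into PREREQ_PM unfiltered.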


From dhoworth at mrc-lmb.cam.ac.uk  Mon Oct 23 11:55:42 2006
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Mon, 23 Oct 2006 16:55:42 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CD494.8070905@sendu.me.uk>
References: <002201c6f6af$a91e4200$15327e82@pyrimidine>
	<453CD494.8070905@sendu.me.uk>
Message-ID: <453CE5FE.9070001@mrc-lmb.cam.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>>> Dave Howorth wrote:
>>>>>> That's the user point of view - how does the developer actually tell
>>>>>> CPAN that something is a developer release so that normal users don't
>>>>>> automatically install it?
>>>>> I found this:
>>>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>>>>
>>>>> It says that $VERSION should simply be changed from a naked number into
>>>>> a single quoted number and this should be recognized by the CPAN
>>>>> indexer.
>>>> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
>>>
>>> Thanks for that.
>>>
>>> I guess from that the 1.5.2 version number should be:
>>>
>>> $VERSION = 1.05_02

I believe so - the underscore is key. Look at your favourite CPAN
modules and see what they do.

>>> And 1.6 would be
>>>
>>> $VERSION = 1.06
>>>
>>> But will this cause a problem wrt 1.4? 1.4 has:

I think it will cause a problem, yes: 1.4 > 1.06. As a workaround, you
could remove 1.4 from CPAN and require everybody who installs from CPAN
to uninstall it before installing 1.06.
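
To make the comparison concrete, here is a small sketch. It assumes the
tools compare these as plain decimal numbers, which is my assumption and
not something I have checked against the CPAN indexer:

```perl
use strict;
use warnings;

# Candidate numbers for the next release, compared against the 1.4
# that is already on CPAN. Plain decimal comparison is assumed here.
my $released = '1.4';
my %sorts_higher;

for my $candidate ('1.06', '1.6', '1.60') {
    $sorts_higher{$candidate} = ($candidate > $released) ? 1 : 0;
    printf "%-4s is %s than %s\n", $candidate,
        ($sorts_higher{$candidate} ? 'higher' : 'lower'), $released;
}
```

Under that assumption only the zero-padded 1.06 form falls below 1.4.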

>>> $VERSION = 1.4;
>>>
>>> Is 1.4 lower than 1.06? Should we keep to a single digit version, so
>>> 1.5_02 and 1.6? Does this really not work with CPAN?

I think that would work but see at the end.

>>> Should we call them
>>> version fifty and version sixty? 1.50_02, 1.60?

Then you can count 1.50_02, 1.50_03, 1.52, 1.53_01 ... if you wish.

>> Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax?  It would be
>> much simpler to use that. 
> 
> That does not present us with a way to have 1.5.2 marked as a developer 
> release in CPAN.
> 
> Also, see the discussion here: 
> http://perldoc.perl.org/functions/require.html
> 
> Since we require 5.6.1 the backwards-compatible issues maybe don't apply 
> to us, but do these ideas work with modules, or just Perl itself? Is 
> CPAN et al. happy with this form of versioning?

I'm not an expert :( It's my understanding that there is an awful lot of
flexibility in Perl module version numbering (as you might expect :)
However, I believe there are some gotchas. So I would recommend (a)
finding an expert and (b) trying an experiment!

> /Something/ needs to be done about Bioperl versioning, because the 
> current 1.4 or 1.5 is completely inadequate.


From n.haigh at sheffield.ac.uk  Mon Oct 23 13:37:13 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 17:37:13 +0000
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CE6C5.6000108@sendu.me.uk>
References: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
	<453CE6C5.6000108@sendu.me.uk>
Message-ID: <453CFDC9.8030107@sheffield.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>   
>> ...
>>     
>>> The current situation is: CPAN users will get all optional external
>>> dependencies without using Bundle::BioPerl. Manual installers of bioperl
>>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
>>> get full functionality.
>>>       
>> I don't think forcing is necessary, so a CPAN installation shouldn't force
>> someone to install optional modules.  Graph.pm, for instance has a few
>> optional modules, and the tests which use those get skipped and pass so the
>> installation proceeds w/o problems.  We could do the same (any tests using
>> those optional modules display the reason why they are skipped).  
>>     
>
> I should clarify and say that that's what happens in Bioperl as well. 
> The 'forcing' that I talk about is simply what I assume will happen if 
> the user has CPAN set to automatically install dependencies. The user 
> could say 'no' to every question regarding the installation of 
> dependencies that CPAN discovers and Bioperl would still install fine.
>
> So really the difference between the current situation and, say, the 
> situation when 1.5.1 was released, is that the CPAN user doesn't have to 
> use Bundle::BioPerl for full functionality anymore, but can still choose 
> not to install all the optional external modules.
>
>   
--snip--

Obviously, we could maintain a Bundle::BioPerl which includes all
dependencies required for a fully functional Bioperl. I think the whole
idea for a Bundle is to provide a common environment for a particular
package. If, for example, someone chooses not to install the dependencies
through CPAN (in the current setup), they can easily go back and install
Bundle::BioPerl and it would retrieve any missing dependencies for a
fully functional Bioperl-core.

Nath


From n.haigh at sheffield.ac.uk  Mon Oct 23 14:06:16 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 18:06:16 +0000
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CE2D7.5080608@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>
	<453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk>
Message-ID: <453D0498.8050206@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>   
>> I believe the link to the documentation above describes a common CPAN
>> versioning scheme as follows:
>>
>> 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32
>>
>> Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would
>> be better as 1.52. Then to indicate that the 1.5 series is a developer
>> release, you append the underscore and at least 2 digits. Thus resulting
>> in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be
>> 1.52_01. The only thing I'm unsure about would be when does the _01 get
>> incremented? I suspect we would probably not increment this number since
>> each release would be an increment of the minor release number e.g.
>> 1.52_01, 1.53_01, 1.54_01 etc.
>>
>> Although I'm still not sure how this versioning would affect bioperl 1.4
>> since 1.4 uses a non-standard versioning scheme :o(
>>     
>
> Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be 
> treated higher than 1.4? Anyway, we can cross that bridge when we get 
> there, but this seems appropriate now.
>
>
> Cheers,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>   
Just tried the suggested:
perl -MExtUtils::MakeMaker -le 'print MM->parse_version(shift)' \
    bioperl-1-5-2/Bio/Root/Version.pm

To see how it parses the various different version schemes - here are
the results:
1.5       -> 1.5
1.4       -> 1.4
1.60      -> 1.60
1.05_01   -> 1.0501
1.5_01    -> 1.501
1.50_01   -> 1.5001
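
Taking the parsed values above at face value, here is a quick sketch of how
they would order under a plain numeric sort. It assumes numeric comparison
is what matters; the hash simply restates the results listed above.

```perl
use strict;
use warnings;

# Version string as written => value reported by parse_version above.
my %parsed = (
    '1.5'     => '1.5',
    '1.4'     => '1.4',
    '1.60'    => '1.60',
    '1.05_01' => '1.0501',
    '1.5_01'  => '1.501',
    '1.50_01' => '1.5001',
);

# Order the written forms by their parsed numeric value.
my @order = sort { $parsed{$a} <=> $parsed{$b} } keys %parsed;

print join(' < ', @order), "\n";
# prints: 1.05_01 < 1.4 < 1.5 < 1.50_01 < 1.5_01 < 1.60
```

So 1.50_01 lands above both 1.4 and 1.5 and below 1.60, whereas 1.05_01
would sort below 1.4.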

Nath


From cjfields at uiuc.edu  Mon Oct 23 13:15:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:15:44 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CE6C5.6000108@sendu.me.uk>
Message-ID: <002701c6f6c6$e2622c40$15327e82@pyrimidine>

...
> I should clarify and say that that's what happens in Bioperl as well.
> The 'forcing' that I talk about is simply what I assume will happen if
> the user has CPAN set to automatically install dependencies. The user
> could say 'no' to every question regarding the installation of
> dependencies that CPAN discovers and Bioperl would still install fine.
> 
> So really the difference between the current situation and, say, the
> situation when 1.5.1 was released, is that the CPAN user doesn't have to
> use Bundle::BioPerl for full functionality anymore, but can still choose
> not to install all the optional external modules.
> 
> The difference is the possible default behaviour. Those users that
> auto-install dependencies get all the optional ones, whereas in the past
> they would not have. I have to point out the benefit of this behaviour:
> those people that don't care and just want it to work are more likely to
> get an installation that does just work. People who know what they're
> doing can still do what they want.

OK with me.  Any way we go about it, we have to assume that anyone who set
CPAN to automatically install dependencies would want this behavior.

> Before we decide what to do I guess we need hard confirmation of how
> CPAN will actually behave with the current Makefile.PL. Any ideas how we
> can find out?
> 
> It would also be good to have more options to break the current tie
> (Nathan is for keeping PREREQ_PM populated, Chris is for having it
> empty, I can go either way)...

Frankly I'm for whatever is easiest for the end-user.  I think we should
continue maintaining Bundle::Bioperl b/c of its convenience (easier for us
to say 'install Bundle::Bioperl' as opposed to 'install modules a b c d e f
g...'  ).  I should note that Chris D. maintains Bundle::Bioperl via CPAN
and can easily add/remove modules as needed, so all that would be necessary
prior to a release is to make sure the various modules present in the Bundle
are up-to-date.

The only difficulty would be updating the bundle PPM version for Win32; I
agree with Nathan that it would be nice if it were easier to maintain.  The
PPD file generated using 'nmake ppd' needs modifications, probably b/c it
is still generated as PPM3-compatible rather than PPM4-compatible.

I also think the idea of having the developer releases available via CPAN is
a good one, as long as they are marked as such (which you are taking care of
with versioning changes).  It makes them a little more official, even if
they are interim developer releases.

Chris


From cjfields at uiuc.edu  Mon Oct 23 13:19:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:19:08 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CFDC9.8030107@sheffield.ac.uk>
Message-ID: <002801c6f6c7$5a58ed60$15327e82@pyrimidine>

...
> > So really the difference between the current situation and, say, the
> > situation when 1.5.1 was released, is that the CPAN user doesn't have to
> > use Bundle::BioPerl for full functionality anymore, but can still choose
> > not to install all the optional external modules.
> >
> >
> --snip--
> 
> Obviously, we could maintain a Bundle::BioPerl which includes all
> dependencies required for a fully functional Bioperl. I think the whole
> idea for a Bundle is to provide a common environment for a particular
> package. If, for example, someone chooses not to install the dependencies
> through CPAN (in the current setup), they can easily go back and install
> Bundle::BioPerl and it would retrieve any missing dependencies for a
> fully functional Bioperl-core.
> 
> Nath

Succinctly put; I would've spent five paragraphs describing that!  Too much
coffee (from lab meetings...)

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 23 13:26:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:26:57 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880610230936t3024173s5d6f21526dac87a4@mail.gmail.com>
Message-ID: <002c01c6f6c8$7163dd20$15327e82@pyrimidine>

Seth, 

Did you try this with a clean, taxonomy-installed database?  There may be
some junk left over from the previous test runs.

I'm looking into it this week; it may not make the developer release but
we'll try to get it in.  BTW, the 03simpleseq.t test failures have to do
with a call to gzip.  I'll look into a workaround for that.  

Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but
introduces others.  One alternative which I found works is cygwin, but
there's a catch: DBD-mysql is hard to install.  If it isn't one thing it's
another...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
  _____  

From: Seth Johnson [mailto:johnson.biotech at gmail.com] 
Sent: Monday, October 23, 2006 11:37 AM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

 
Chris,

There's definite improvement:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed Test     Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/02species.t                 65    2   3.08%  63 65
t/03simpleseq.t    1   256    59  106 179.66%  7-59
t/04swiss.t                   52   14  26.92%  25 27-34 38-42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There's some weirdness going on during the 'swiss.t' test.  It almost seems
to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 &
41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): 
================================
not ok 25
# Test 25 got: '10097078' (t/04swiss.t at line 79)
#    Expected: '91309150'
ok 26
not ok 27
# Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t
at line 85) 
#    Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
not ok 28
# Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' (t/04swiss.t at line 86)
#    Expected: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' 
not ok 29
# Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
(t/04swiss.t at line 87)
#    Expected: 'Cell 66 (2), 383-394 (1991)'
not ok 30
# Test 30 got: <UNDEF> (t/04swiss.t at line 88) 
#    Expected: '91309150'
not ok 31
# Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t
at line 85 fail #2)
#    Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis, J.E. and Leffers,H.'
not ok 32
# Test 32 got: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' (t/04swiss.t at line 86 fail #2) 
#    Expected: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
not ok 33
# Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail
#2) 
#    Expected: 'Gene 134 (2), 283-287 (1993)'
not ok 34
# Test 34 got: <UNDEF> (t/04swiss.t at line 88 fail #2)
#    Expected: '94085792'
ok 35
ok 36
ok 37
not ok 38
# Test 38 got: <UNDEF> (t/04swiss.t at line 88 fail #3) 
#    Expected: '94253723'
not ok 39
# Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4)
#    Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.'
not ok 40
# Test 40 got: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
(t/04swiss.t at line 86 fail #4)
#    Expected: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' 
not ok 41
# Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail
#4)
#    Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
not ok 42
# Test 42 got: <UNDEF> (t/04swiss.t at line 88 fail #4) 
#    Expected: '99199225'
==============================


On 10/20/06, Chris Fields <cjfields at uiuc.edu> wrote:


Seth,

Did you work out the problem here?  There was a recent CVS update to OBDA
tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
apparently left data from tests in the database, which caused problems with 
repeated test runs.

Chris

> > -----Original Message-----
> > From: bioperl-l-bounces <at> lists.open-bio.org [mailto: bioperl-l-
> > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM 
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ...
>
> > FAILED tests 10-12
> >     Failed 3/12 tests, 75.00% okay
> > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> > -------------------------------------------------------------------------------
> > t\02species.t                 65    2   3.08%  63 65
> > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > t\04swiss.t                   52   14  26.92%  25 27-34 38-42 
> > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > t\16obda.t                    12    3  25.00%  10-12
> > _______________________________________________
> > Bioperl-l mailing list 
> > Bioperl-l <at> lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From johnson.biotech at gmail.com  Mon Oct 23 12:36:36 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 23 Oct 2006 12:36:36 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <000001c6f486$df508930$15327e82@pyrimidine>
References: <000001c6f486$df508930$15327e82@pyrimidine>
Message-ID: <b99962880610230936t3024173s5d6f21526dac87a4@mail.gmail.com>

Chris,

There's definite improvement:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed Test     Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/02species.t                 65    2   3.08%  63 65
t/03simpleseq.t    1   256    59  106 179.66%  7-59
t/04swiss.t                   52   14  26.92%  25 27-34 38-42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There's some weirdness going on during the 'swiss.t' test.  It almost seems
to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 &
41, 31 & 27, 32 & 28, 33 & 29, 39 & 31):
================================
not ok 25
# Test 25 got: '10097078' (t/04swiss.t at line 79)
#    Expected: '91309150'
ok 26
not ok 27
# Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t
at line 85)
#    Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
not ok 28
# Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' (t/04swiss.t at line 86)
#    Expected: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators'
not ok 29
# Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
(t/04swiss.t at line 87)
#    Expected: 'Cell 66 (2), 383-394 (1991)'
not ok 30
# Test 30 got: <UNDEF> (t/04swiss.t at line 88)
#    Expected: '91309150'
not ok 31
# Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t
at line 85 fail #2)
#    Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis,J.E. and Leffers,H.'
not ok 32
# Test 32 got: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' (t/04swiss.t at line 86 fail #2)
#    Expected: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
not ok 33
# Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail
#2)
#    Expected: 'Gene 134 (2), 283-287 (1993)'
not ok 34
# Test 34 got: <UNDEF> (t/04swiss.t at line 88 fail #2)
#    Expected: '94085792'
ok 35
ok 36
ok 37
not ok 38
# Test 38 got: <UNDEF> (t/04swiss.t at line 88 fail #3)
#    Expected: '94253723'
not ok 39
# Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4)
#    Expected: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.'
not ok 40
# Test 40 got: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
(t/04swiss.t at line 86 fail #4)
#    Expected: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein'
not ok 41
# Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail
#4)
#    Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
not ok 42
# Test 42 got: <UNDEF> (t/04swiss.t at line 88 fail #4)
#    Expected: '99199225'
==============================


On 10/20/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
>
>
> Seth,
>
> Did you work out the problem here?  There was a recent CVS update to OBDA
> tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
> apparently left data from tests in the database, which caused problems
> with repeated test runs.
>
> Chris
>
> > > -----Original Message-----
> > > From: bioperl-l-bounces <at> lists.open-bio.org [mailto: bioperl-l-
> > > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > > Sent: Saturday, September 30, 2006 6:35 PM
> > > To: Hilmar Lapp
> > > Cc: Chris Fields; Bioperl List
> > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> > >
> > > Here're complete test details:
> > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > ...
> >
> > > FAILED tests 10-12
> > >     Failed 3/12 tests, 75.00% okay
> > > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> > > -------------------------------------------------------------------------------
> > > t\02species.t                 65    2   3.08%  63 65
> > > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > > t\16obda.t                    12    3  25.00%  10-12
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l <at> lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
>
>


From n.haigh at sheffield.ac.uk  Mon Oct 23 16:08:00 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 20:08:00 +0000
Subject: [Bioperl-l] CPAN testing Service
Message-ID: <453D2120.9010301@sheffield.ac.uk>

We should also check the CPAN testing service (CPANTS) to see how "good"
our package is for CPAN and try to increase the Kwalitee score. There
only appears to be details for bioperl-1.2.3 for some reason:
http://cpants.perl.org/dist/bioperl

Nath


From pabloivan at gmail.com  Sun Oct 22 15:54:35 2006
From: pabloivan at gmail.com (Pablo Ivan)
Date: Sun, 22 Oct 2006 16:54:35 -0300
Subject: [Bioperl-l] Bioperl installation under Windows
Message-ID: <a8acf25f0610221254v18c8d172kc3ce6fa4ea34f676@mail.gmail.com>

Hello,

I have been trying to install Bioperl 1.4 on a Windows XP system, but I
didn't get too far; my perl installation was made using ActiveState
5.8.8build 816. I then tried the ppm method of searching for bioperl
in the
repositories and installing the core package 1.4. It says that the
installation was made successfully, but the /Bio folder doesn't show up in
/lib, and it's like nothing new was installed at all. I was wondering if
using that version of ActiveState could be causing it, but the uninstall
option for it isn't showing in Add/Remove, and I'm afraid just deleting the
folders and installing version 5.6 of AS could somehow damage and make
things worse. Or should I just forget about it and try using Cygwin?

Thank you,

Pablo.


From cjfields at uiuc.edu  Mon Oct 23 17:34:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 16:34:47 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880610231422o24029a0cu229fccc2b5809b85@mail.gmail.com>
Message-ID: <000401c6f6eb$111df040$15327e82@pyrimidine>

Don't know what that particular error is, but it looks ActivePerl-related
(PPM generates HTML from the blib directory).  You may need to run 'nmake
clean' in between test cycles to get rid of old blib and other files.

 
The carryover issue from old test runs was a definite problem.  Brian fixed
that in the bioperl-db CVS recently.  Also,  I tried Sendu's fixes from CVS
head to Bio::Root::Root and they seem to fix the problems with
Bio::Root::Root.  The issue came down to a use of indirect syntax (a bad
perl practice).  There are other errors popping up related to Bio::Species,
but these seem fixable at least.

 
I committed a few changes to bioperl-db CVS to fix 03simpleseq.t test
failures due to a lack of gzip on WinXP (I didn't see them b/c I had a copy
of GNU gzip in my path).  These should pass w/o problems now on WinXP.
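
For tests that shell out to an external program such as gzip, one option is
to look for it on the PATH first and skip (rather than fail) when it is
missing.  The sketch below is not the change that went into bioperl-db CVS;
find_in_path is a hypothetical helper written only to illustrate the idea:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the full path of $prog if it is found in one
# of the PATH directories, or nothing if it is not.
sub find_in_path {
    my ($prog) = @_;
    # On Windows the executable usually carries an extension.
    my @exts = $^O eq 'MSWin32' ? ('', '.exe', '.bat', '.cmd') : ('');
    for my $dir (File::Spec->path) {
        for my $ext (@exts) {
            my $candidate = File::Spec->catfile($dir, $prog . $ext);
            return $candidate if -f $candidate;
        }
    }
    return;
}

my $gzip = find_in_path('gzip');
my $msg  = defined $gzip
    ? "gzip found at $gzip\n"
    : "gzip not found; tests that need it should be skipped\n";
print $msg;
```

A test script could call such a helper once at the top and skip the
gzip-dependent tests when nothing is returned.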

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
  _____  

From: Seth Johnson [mailto:johnson.biotech at gmail.com] 
Sent: Monday, October 23, 2006 4:22 PM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

 
Chris,

I have not cleaned my test database yet.  I'll purge it and redo the tests. 

This error keeps popping up in unexpected places while running nmake during
installation: 
 "Undefined subroutine &main::UpdateHTML_blib called at -e line 1. 
NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code
'0xff'"

Is there a way around it??

Seth

On 10/23/06, Chris Fields <cjfields at uiuc.edu> wrote:

Seth, 

Did you try this with a clean, taxonomy-installed database?  There may be
some junk left over from the previous test runs.

I'm looking into it this week; it may not make the developer release but
we'll try to get it in.  BTW, the 03simpleseq.t test failures have to do
with a call to gzip.  I'll look into a workaround for that.  

Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but
introduces others.  One alternative which I found works is cygwin, but
there's a catch: DBD-mysql is hard to install.  If it isn't one thing it's
another...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
  _____  

From: Seth Johnson [mailto:johnson.biotech at gmail.com] 
Sent: Monday, October 23, 2006 11:37 AM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

 
Chris,

There's definite improvement:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed Test     Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/02species.t                 65    2   3.08%  63 65
t/03simpleseq.t    1   256    59  106 179.66%  7-59
t/04swiss.t                   52   14  26.92%  25 27-34 38-42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There's some weirdness going on during the 'swiss.t' test.  It almost seems
to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 &
41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): 
================================
not ok 25
# Test 25 got: '10097078' (t/04swiss.t at line 79)
#    Expected: '91309150'
ok 26
not ok 27
# Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t
at line 85) 
#    Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
not ok 28
# Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' (t/04swiss.t at line 86)
#    Expected: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' 
not ok 29
# Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
(t/04swiss.t at line 87)
#    Expected: 'Cell 66 (2), 383-394 (1991)'
not ok 30
# Test 30 got: <UNDEF> (t/04swiss.t at line 88) 
#    Expected: '91309150'
not ok 31
# Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t
at line 85 fail #2)
#    Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis, J.E. and Leffers,H.'
not ok 32
# Test 32 got: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' (t/04swiss.t at line 86 fail #2) 
#    Expected: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
not ok 33
# Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail
#2) 
#    Expected: 'Gene 134 (2), 283-287 (1993)'
not ok 34
# Test 34 got: <UNDEF> (t/04swiss.t at line 88 fail #2)
#    Expected: '94085792'
ok 35
ok 36
ok 37
not ok 38
# Test 38 got: <UNDEF> (t/04swiss.t at line 88 fail #3) 
#    Expected: '94253723'
not ok 39
# Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4)
#    Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.'
not ok 40
# Test 40 got: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
(t/04swiss.t at line 86 fail #4)
#    Expected: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' 
not ok 41
# Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail
#4)
#    Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
not ok 42
# Test 42 got: <UNDEF> (t/04swiss.t at line 88 fail #4) 
#    Expected: '99199225'
==============================


-- 
Best Regards,


Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358 


From cjfields at uiuc.edu  Mon Oct 23 17:53:27 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 16:53:27 -0500
Subject: [Bioperl-l] Bioperl installation under Windows
In-Reply-To: <a8acf25f0610221254v18c8d172kc3ce6fa4ea34f676@mail.gmail.com>
References: <a8acf25f0610221254v18c8d172kc3ce6fa4ea34f676@mail.gmail.com>
Message-ID: <9994CFF6-FCA1-4C7F-9A33-31765C6AE255@uiuc.edu>

It won't install in Perl\lib, but in Perl\site\lib.  Check there.
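
A quick way to check where (or whether) the modules landed is to look for
Bio/Perl.pm in every directory perl searches.  This is only a sketch; the
helper name find_module_paths is made up for illustration:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return every path under @INC where the file for the
# given module name exists.
sub find_module_paths {
    my ($module) = @_;
    my @parts = split /::/, $module;
    $parts[-1] .= '.pm';
    my @found = grep { -e $_ }
                map  { File::Spec->catfile($_, @parts) }
                grep { !ref $_ } @INC;   # skip code-ref hooks in @INC
    return @found;
}

my @hits = find_module_paths('Bio::Perl');
if (@hits) {
    print "Bio::Perl found at:\n", map { "  $_\n" } @hits;
}
else {
    print "Bio::Perl not found in any \@INC directory\n";
}
```

On an ActivePerl install the path printed would be expected to sit under
Perl\site\lib if the PPM install worked.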

We are working intently on the next developer release for BioPerl and  
plan on having several PPMs available, but we are only supporting  
ActivePerl 5.8.8.819.  I would suggest that you upgrade your  
ActivePerl installation to that if possible, since PPM has undergone  
major changes (they use PPM4 now, which has a GUI by default).  Most  
repositories are now moving over to PPM4, so you'll likely be  
seeing fewer PPM3-compatible packages being made.

Chris

On Oct 22, 2006, at 2:54 PM, Pablo Ivan wrote:

> Hello,
>
> I have been trying to install Bioperl 1.4 on a Windows XP system,  
> but I
> didn't get too far; my perl installation was made using ActiveState
> 5.8.8build 816. I then tried the ppm method of searching for bioperl
> in the
> repositories and installing the core package 1.4. It says that the
> installation was made successfully, but the /Bio folder doesn't  
> show up in
> /lib, and it's like nothing new was installed at all. I was  
> wondering if
> using that version of ActiveState could be causing it, but the  
> uninstall
> option for it isn't showing in Add/Remove, and I'm afraid just  
> deleting the
> folders and installing version 5.6 of AS could somehow damage and make
> things worse. Or should I just forget about it and try using Cygwin?
>
> Thank you,
>
> Pablo.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnson.biotech at gmail.com  Mon Oct 23 17:22:13 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 23 Oct 2006 17:22:13 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <002c01c6f6c8$7163dd20$15327e82@pyrimidine>
References: <b99962880610230936t3024173s5d6f21526dac87a4@mail.gmail.com>
	<002c01c6f6c8$7163dd20$15327e82@pyrimidine>
Message-ID: <b99962880610231422o24029a0cu229fccc2b5809b85@mail.gmail.com>

Chris,

I have not cleaned my test database yet.  I'll purge it and redo the tests.

This error keeps popping up in unexpected places while running nmake during
installation:
 "Undefined subroutine &main::UpdateHTML_blib called at -e line 1.
NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code
'0xff'"

Is there a way around it??

Seth

On 10/23/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
>  Seth,
>
> Did you try this with a clean, taxonomy-installed database?  There may be
> some junk left over from the previous test runs.
>
> I'm looking into it this week; it may not make the developer release but
> we'll try to get it in.  BTW, the 03simpleseq.t test failures have to do
> with a call to gzip.  I'll look into a workaround for that.
>
> Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but
> introduces others.  One alternative which I found works is cygwin, but
> there's a catch: DBD-mysql is hard to install.  If it isn't one thing it's
> another...
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>


-- 
Best Regards,


Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358


From chhalling at alumni.ls.berkeley.edu  Mon Oct 23 21:02:24 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Mon, 23 Oct 2006 21:02:24 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C6509.90005@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk>
Message-ID: <453D6620.5020401@alumni.ls.berkeley.edu>

Sorry, I should know better about giving all the details.

This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a 
fresh compile) with Mac OS X 10.4.8.

-- Conrad

Nathan S. Haigh wrote:
> Chris Fields wrote:
>   
>> Thanks for letting us know!  Did PPM4 throw errors or just silently  
>> pass them over?
>>
>> Chris
>>
>> On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote:
>>
>>   
>>     
> I believe he is talking about the bundle on cpan and not the ppd. I will
> get this updated as soon as possible.
>
> Sendu/Chris - can you confirm to me which Bioperl modules are essential
> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
> reason for not putting *all* dependencies into the bundle?
>
> Nath
>
>
>
>
>
>   


-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From n.haigh at sheffield.ac.uk  Tue Oct 24 03:05:53 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 24 Oct 2006 08:05:53 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453D6620.5020401@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453D6620.5020401@alumni.ls.berkeley.edu>
Message-ID: <453DBB51.6010505@sheffield.ac.uk>

Conrad Halling wrote:
> Sorry, I should know better about giving all the details.
>
> This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a 
> fresh compile) with Mac OS X 10.4.8.
>
> -- Conrad
>
>   
My apologies Conrad, this was my bad! Are you in need of the corrections 
being made swiftly or can you wait until the Bioperl 1.5.2 release when 
I'll ensure the Bundle is updated correctly for that release?

Cheers
Nath


From n.haigh at sheffield.ac.uk  Tue Oct 24 05:57:25 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 10:57:25 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CE2D7.5080608@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>
	<453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk>
Message-ID: <453DE385.8010700@sheffield.ac.uk>

--snip--
> Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be 
> treated higher than 1.4? Anyway, we can cross that bridge when we get 
> there, but this seems appropriate now.
>
>
> Cheers,
> Sendu.
>   
Just been having a think about this versioning. Does this work well and
is it intuitive with versioning the official 1.5.2 developer release and
also the 1.6 stable release? I'd like to put forward the following
versioning scheme for consideration (most is the same as what it is now,
but with some clarification - hopefully):
major-version . minor-version sub-version _ developer-release-version
RC-version

The sub-version represents bug-fixes and possibly some minor feature
enhancements with no API changes.
The minor-version represents some significant feature enhancements/API
changes/bug fixes.
The major-version represents significant rewrites of Bioperl.

For an RC of a developer release the version would have _0x (where x=the
RC number)
For a non RC of a developer release the version would have _10
For an RC of a stable release the version would have _0x (where x=RC number)
For a non RC of a stable release the version would not have the
underscore suffix

Therefore I would see the following $VERSION being applied:
1.5.2 RC1            = 1.52_01
1.5.2 RC2            = 1.52_02
1.5.2 RC3            = 1.52_03
1.5.2                = 1.52_10
1.6 RC1              = 1.60_01
1.6 RC2              = 1.60_02
1.6                  = 1.60
1.6.1 RC1            = 1.61_01
1.6.1                = 1.61

This should satisfy the requirement of CPAN for having underscores in
versions to indicate a developer release, which here is a Bioperl
release with an odd minor version number or any RC whether it be of a
developer release or a stable release. This should mean that we could
have the RC's on CPAN, but by default, CPAN would only install the
latest "non developer release" (i.e. the last package without an
underscore in the version).
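
The mapping in the table above can be written down as a small function.
This is only a sketch of the proposal as stated; the name scheme_version is
made up, and it assumes each part of the dotted version is a single digit:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a dotted release name (e.g. '1.5.2') and an
# optional RC number into the $VERSION string of the proposed scheme.
sub scheme_version {
    my ($dotted, $rc) = @_;
    my ($major, $minor, $sub) = split /\./, $dotted;
    $sub = 0 unless defined $sub;              # '1.6' is treated as '1.6.0'
    my $base = "$major.$minor$sub";
    return sprintf('%s_%02d', $base, $rc) if defined $rc;  # any RC: _0x
    return $base . '_10' if $minor % 2;        # odd minor: developer release
    return $base;                              # even minor: stable release
}

for my $case (['1.5.2', 1], ['1.5.2'], ['1.6', 2], ['1.6'], ['1.6.1']) {
    my ($dotted, $rc) = @$case;
    my $label = defined $rc ? "$dotted RC$rc" : $dotted;
    printf "%-10s = %s\n", $label, scheme_version($dotted, $rc);
}
```

The results are returned as strings so that the underscore and any trailing
zero survive.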

If we are going ahead with the new $VERSION scheme (as it currently is
in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
1.52 instead of Bioperl 1.5.2 and make an effort to sync the
documentation with regards to this.

Nath


From bix at sendu.me.uk  Tue Oct 24 06:19:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 11:19:05 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DE385.8010700@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>
	<453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk>
	<453DE385.8010700@sheffield.ac.uk>
Message-ID: <453DE899.4030603@sendu.me.uk>

Nathan Haigh wrote:
>
> Therefore I would see the following $VERSION being applied:
> 1.5.2 RC1            = 1.52_01
> 1.5.2 RC2            = 1.52_02
> 1.5.2 RC3            = 1.52_03
> 1.5.2                = 1.52_10
> 1.6 RC1              = 1.60_01
> 1.6 RC2              = 1.60_02
> 1.6                  = 1.60
> 1.6.1 RC1            = 1.61_01
> 1.6.1                = 1.61
> 
> This should satisfy the requirement of CPAN for having underscores in
> versions to indicate a developer release, which here is a Bioperl
> release with an odd minor version number or any RC whether it be of a
> developer release or a stable release. This should mean that we could
> have the RC's on CPAN, but by default, CPAN would only install the
> latest "non developer release" (i.e. the last package without an
> underscore in the version).

That all sounds good to me, except I worry about potential confusion if 
people look manually at the things available in CPAN, see 1.60_02 and 
think it is more recent than 1.60 and try to install it manually.

Since
$VERSION = 1.52_10;
is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, 
final release version should be
$VERSION = 1.6010.
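
The point about numeric evaluation can be checked directly, since perl
ignores underscores inside numeric literals.  A minimal sketch, with
variable names chosen only for this example:

```perl
use strict;
use warnings;

my $dev_final = 1.52_10;   # underscores in numeric literals are ignored
my $rc2       = 1.60_02;   # proposed RC2 of the stable release
my $final     = 1.60;      # proposed final stable release
my $final_alt = 1.60_10;   # alternative final release number

print "1.52_10 equals 1.5210\n"                 if $dev_final == 1.5210;
print "RC2 compares greater than 1.60\n"        if $rc2 > $final;
print "1.60_10 compares greater than the RC2\n" if $final_alt > $rc2;
```

This only covers bare numeric literals, which is the form used in the
assignments quoted above.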


> If we are going ahead with the new $VERSION scheme (as it currently is
> in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
> documentation with regards to this.

I might disagree with this though. I think perl people, and perhaps unix 
people in general, should be used to version numbers like '1.5.2', but 
then getting '1.52' from the code since such a number allows simple 
numerical comparisons while the former does not. The former is easier to 
read and understand. This is just how Perl itself behaves.

Most users who wouldn't expect such a behaviour aren't going to be 
checking the version number programmatically anyway.
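
For comparison, perl exposes its own version in both forms.  The sketch
below prints them side by side; note that perl pads each component to three
digits (5.8.8 is reported as 5.008008), which is not the same packing as
writing 1.5.2 as 1.52:

```perl
use strict;
use warnings;

# $] is the decimal form of the running perl's version; $^V formatted with
# %vd gives the dotted form of the same version.
my $decimal = $];
my $dotted  = sprintf '%vd', $^V;

print "dotted:  $dotted\n";
print "decimal: $decimal\n";
print "this perl is at least 5.6.0\n" if $decimal >= 5.006;
```

The decimal form is the one that supports a plain numeric comparison such
as the last line.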


BTW. do we have someone with a CPAN account, or should I get one?


From n.haigh at sheffield.ac.uk  Tue Oct 24 07:37:12 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 12:37:12 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DE899.4030603@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk>
Message-ID: <453DFAE8.5050602@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>   
>> Therefore I would see the following $VERSION being applied:
>> 1.5.2 RC1            = 1.52_01
>> 1.5.2 RC2            = 1.52_02
>> 1.5.2 RC3            = 1.52_03
>> 1.5.2                = 1.52_10
>> 1.6 RC1              = 1.60_01
>> 1.6 RC2              = 1.60_02
>> 1.6                  = 1.60
>> 1.6.1 RC1            = 1.61_01
>> 1.6.1                = 1.61
>>
>> This should satisfy the requirement of CPAN for having underscores in
>> versions to indicate a developer release, which here is a Bioperl
>> release with an odd minor version number or any RC whether it be of a
>> developer release or a stable release. This should mean that we could
>> have the RC's on CPAN, but by default, CPAN would only install the
>> latest "non developer release" (i.e. the last package without an
>> underscore in the version).
>>     
>
> That all sounds good to me, except I worry about potential confusion if 
> people look manually at the things available in CPAN, see 1.60_02 and 
> think it is more recent than 1.60 and try to install it manually.
>
>   

I'm not sure this would be a problem. As far as I understand, CPAN
treats packages with underscores in $VERSION as something
distinctly different from the other releases (i.e. as developer releases).
If you look at such a page, it is clearly evident that it is a
developer release. For example, if you search on CPAN for the latest
version of the CPAN module, it shows 1.8802. If you go to that page:
http://search.cpan.org/~andk/CPAN-1.8802/
There is also a link to the latest developer release, released 1 day
after 1.8802 with a version of 1.88_57 (which would convert to 1.8857).
This too appears to be later than 1.8802, but since it is dealt with as
a developer release it doesn't seem to matter: CPAN will only deal with
the stable (non-developer) releases, while the underscore versions remain
available as a convenient way to get at developer releases. Although I'm
thinking CPAN uses some hocus pocus with release dates too.
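
The distinction can be illustrated with a short sketch.  This is not how
CPAN itself decides anything; the helper names and the list of release
numbers are made up for the example, following the scheme proposed earlier
in the thread:

```perl
use strict;
use warnings;

# Numeric value of a version string, with the underscore dropped.
sub version_number {
    my ($v) = @_;
    (my $n = $v) =~ tr/_//d;
    return $n + 0;
}

# Highest version without an underscore, i.e. ignoring developer releases.
sub latest_stable {
    my @stable = grep { !/_/ } @_;
    my ($latest) = sort { version_number($b) <=> version_number($a) } @stable;
    return $latest;
}

# Highest version by number alone, developer releases included.
sub latest_any {
    my ($latest) = sort { version_number($b) <=> version_number($a) } @_;
    return $latest;
}

my @releases = qw(1.52_01 1.52_10 1.60_01 1.60_02 1.60);
print "latest stable:     ", latest_stable(@releases), "\n";
print "highest by number: ", latest_any(@releases), "\n";
```

Filtering on the underscore picks 1.60, while a purely numeric comparison
puts 1.60_02 on top, which is the situation being discussed here.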

> Since
> $VERSION = 1.52_10;
> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, 
> final release version should be
> $VERSION = 1.6010.
>
>
>   

Because they are dealt with separately, I don't think this is an issue
(see above).

>> If we are going ahead with the new $VERSION scheme (as it currently is
>> in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
>> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
>> documentation with regards to this.
>>     
>
> I might disagree with this though. I think perl people, and perhaps unix 
> people in general, should be used to version numbers like '1.5.2', but 
> then getting '1.52' from the code since such a number allows simple 
> numerical comparisons while the former does not. The former is easier to 
> read and understand. This is just how Perl itself behaves.
>
> Most users who wouldn't expect such a behaviour aren't going to be 
> checking the version number programmatically anyway.
>
>
> BTW. do we have someone with a CPAN account, or should I get one?
>   

It says Ewan Birney is the author of Bioperl - I assume it must be
possible to have multiple people have the permissions to update a single
package.

Nath


From chhalling at alumni.ls.berkeley.edu  Tue Oct 24 07:15:12 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Tue, 24 Oct 2006 07:15:12 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453DBB51.6010505@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk>
	<453D6620.5020401@alumni.ls.berkeley.edu>
	<453DBB51.6010505@sheffield.ac.uk>
Message-ID: <453DF5C0.3040104@alumni.ls.berkeley.edu>

Nathan S. Haigh wrote:
> Conrad Halling wrote:
>> Sorry, I should know better about giving all the details.
>>
>> This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 
>> (a fresh compile) with Mac OS X 10.4.8.
>>
>> -- Conrad  
> My apologies Conrad, this was my bad! Are you in need of the 
> corrections being made swiftly or can you wait until the Bioperl 1.5.2 
> release when I'll ensure the Bundle is updated correctly for that 
> release?
>
> Cheers
> Nath

No, I'm fine. I used the cpan utility to load the three modules manually.

-- Conrad

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From bix at sendu.me.uk  Tue Oct 24 08:16:54 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 13:16:54 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DFAE8.5050602@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
Message-ID: <453E0436.3050903@sendu.me.uk>

Nathan Haigh wrote:
> Sendu Bala wrote:
>
>> That all sounds good to me, except I worry about potential confusion if 
>> people look manually at the things available in CPAN, see 1.60_02 and 
>> think it is more recent than 1.60 and try to install it manually.
> 
> I not sure if this would be a problem. As far as I understand, CPAN
> treats these packages with underscores in $VERSION as something
> distinctly different to the others releases (i.e. developer releases).
> If you look at such a page, it is clearly evident that it is a
> developers release. For example, if you search on CPAN for the latest
> version of the CPAN module is shows 1.8802. if you go to that page:
> http://search.cpan.org/~andk/CPAN-1.8802/
> There is also a link for the latest developer release, released 1 day
> after 1.8802 with a version of 1.88_57 (which would convert to 1.8857).

[snip]

>> Since
>> $VERSION = 1.52_10;
>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, 
>> final release version should be
>> $VERSION = 1.6010.
>
> Because they are dealt with separately, I don't think this is an issue
> (see above).

If you don't notice the dates, or are doing numerical version number 
comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may 
not be automatic, but you can still choose to download the developer 
releases. Which means if we say to someone 'use Bioperl 1.6 or better' 
they may choose to get the latest version and think it is 1.6002 when 
in fact 1.60 was the more recent version. 1.6010 solves the problem, is 
consistent with your 1.50_10 suggestion, and doesn't cause any problems 
as far as I can see.
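
To make the ordering concern concrete, here is a minimal Perl sketch that
sorts the candidate numbers from this thread the way a plain numerical
comparison would. The labels and the helper name by_version are
illustrative only; nothing here comes from the Bioperl code base.

```perl
use strict;
use warnings;

# Each entry: [ $VERSION as written, what it would stand for ].
my @candidates = (
    [ '1.6002', '1.6.0 RC2' ],
    [ '1.60',   '1.6.0 final, if numbered 1.60' ],
    [ '1.6010', '1.6.0 final, if numbered 1.6010' ],
    [ '1.6001', '1.6.0 RC1' ],
);

# Order entries the way a numerical "is this newer?" check would.
sub by_version { return sort { $a->[0] <=> $b->[0] } @_ }

printf "%-8s %s\n", @$_ for by_version(@candidates);
```

With 1.60 as the final release number, both RCs sort after it; with
1.6010 they sort before it.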


>>> If we are going ahead with the new $VERSION scheme (as it currently is
>>> in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
>>> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
>>> documentation with regards to this.
>>>     
>> I might disagree with this though. I think perl people, and perhaps unix 
>> people in general, should be used to version numbers like '1.5.2', but 
>> then getting '1.52' from the code since such a number allows simple 
>> numerical comparisons while the former does not. The former is easier to 
>> read and understand. This is just how Perl itself behaves.
>>
>> Most users who wouldn't expect such a behaviour aren't going to be 
>> checking the version number programatically anyway.
>>
>>
>> BTW. do we have someone with a CPAN account, or should I get one?
>>   
> 
> It says Ewan Birney is the author of Bioperl - I assume it must be
> possible to have multiple people have the permissions to update a single
> package.

How did you get Bundle::BioPerl updated? Did you just ask Chris 
Dagdigian to do it for you? Or do you have access to his account? I'll 
ask Ewan about it.


From n.haigh at sheffield.ac.uk  Tue Oct 24 08:21:56 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 13:21:56 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0436.3050903@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
	<453E0436.3050903@sendu.me.uk>
Message-ID: <453E0564.9030302@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> Sendu Bala wrote:
>>
>>> That all sounds good to me, except I worry about potential confusion
>>> if people look manually at the things available in CPAN, see 1.60_02
>>> and think it is more recent than 1.60 and try to install it manually.
>>
>> I not sure if this would be a problem. As far as I understand, CPAN
>> treats these packages with underscores in $VERSION as something
>> distinctly different to the others releases (i.e. developer releases).
>> If you look at such a page, it is clearly evident that it is a
>> developers release. For example, if you search on CPAN for the latest
>> version of the CPAN module is shows 1.8802. if you go to that page:
>> http://search.cpan.org/~andk/CPAN-1.8802/
>> There is also a link for the latest developer release, released 1 day
>> after 1.8802 with a version of 1.88_57 (which would convert to 1.8857).
>
> [snip]
>
>>> Since
>>> $VERSION = 1.52_10;
>>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before
>>> release, final release version should be
>>> $VERSION = 1.6010.
>>
>> Because they are dealt with separately, I don't think this is an issue
>> (see above).
>
> If you don't notice the dates, or are doing numerical version number
> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
> not be automatic, but you can still chose to download the developer
> releases. Which means if we say to someone 'use Bioperl 1.6 or better'
> they may choose to get the latest version and think it is 1.6002 when
> infact 1.60 was the more recent version. 1.6010 solves the problem, is
> consistent with your 1.50_10 suggestion, and doesn't cause any
> problems as far as I can see.
>
>

I see - you mean for a non-RC release append 10 to the version number
and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
the version.

--snip--
>
> How did you get Bundle::BioPerl updated? Did you just ask Chris
> Dagdigian to do it for you? Or do you have access to his account? I'll
> ask Ewan about it.
I just asked Chris D. to do it for me :o)

Nath


From bix at sendu.me.uk  Tue Oct 24 09:01:22 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 14:01:22 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0564.9030302@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
	<453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk>
Message-ID: <453E0EA2.6050306@sendu.me.uk>

Nathan Haigh wrote:
> I see - you mean for a non-RC release append 10 to the version number
> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
> the version.

Precisely.

1.5.2 RC3 will have in Bio::Root::Version :

$VERSION = 1.52_03;
$VERSION = eval $VERSION; # $VERSION is 1.5203

1.5.2 final release would have:

$VERSION = 1.52_10;
$VERSION = eval $VERSION; # $VERSION is 1.5210

1.6.0 RC1 would have:

$VERSION = 1.60_01;
$VERSION = eval $VERSION; # $VERSION is 1.6001

1.6.0 final release would have:

$VERSION = 1.6010;
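
A small sketch of how Perl treats these values. It is not taken from
Bio::Root::Version, and the variable names are made up. One thing worth
knowing: an unquoted literal such as 1.52_03 already loses its underscore
when Perl parses it, so the widely used form of this idiom quotes the
string first and then evals it.

```perl
use strict;
use warnings;

# Unquoted literal: Perl ignores the underscore while parsing the number.
my $bare = 1.52_03;

# Quoted string: the underscore survives until the string is eval'ed.
my $quoted  = '1.52_03';
my $numeric = eval $quoted;

print "bare:    $bare\n";     # bare:    1.5203
print "quoted:  $quoted\n";   # quoted:  1.52_03
print "numeric: $numeric\n";  # numeric: 1.5203
```

Only the quoted form keeps the underscore visible to anything that reads
$VERSION before the eval line runs.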


Nice thing about putting RCs up on CPAN is that I suppose we'd see the 
test results from cpantesters. The more test results the better :)


From n.haigh at sheffield.ac.uk  Tue Oct 24 09:05:54 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 14:05:54 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0EA2.6050306@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
	<453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk>
	<453E0EA2.6050306@sendu.me.uk>
Message-ID: <453E0FB2.4080002@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> I see - you mean for a non-RC release append 10 to the version number
>> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
>> the version.
>
> Precisely.
>
> 1.5.2 RC3 will have in Bio::Root::Version :
>
> $VERSION = 1.52_03;
> $VERSION = eval $VERSION; # $VERSION is 1.5203
>
> 1.5.2 final release would have:
>
> $VERSION = 1.52_10;
> $VERSION = eval $VERSION; # $VERSION is 1.5210
>
> 1.6.0 RC1 would have:
>
> $VERSION = 1.60_01;
> $VERSION = eval $VERSION; # $VERSION is 1.6001
>
> 1.6.0 final release would have:
>
> $VERSION = 1.6010;
>
>
> Nice thing about putting RCs up on CPAN is that I suppose we'd see the
> test results from cpantesters. The more test results the better :)
Did you see the cpants site I sent earlier?
http://cpants.perl.org/dist/bioperl

But I'm not sure why 1.4 didn't make it in there instead of 1.2.3.


From bix at sendu.me.uk  Tue Oct 24 09:14:08 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 14:14:08 +0100
Subject: [Bioperl-l] CPAN testing Service
In-Reply-To: <453D2120.9010301@sheffield.ac.uk>
References: <453D2120.9010301@sheffield.ac.uk>
Message-ID: <453E11A0.20304@sendu.me.uk>

Nathan S. Haigh wrote:
> We should also check the CPAN testing service (CPANTS) to see how "good"
> our package is for CPAN and try to increase the Kwalitee score. There
> only appears to be details for bioperl-1.2.3 for some reason:
> http://cpants.perl.org/dist/bioperl

Yes, but I think it will be a pretty similar score this time round. We'll 
resolve the remaining issues for 1.6.


From cjfields at uiuc.edu  Tue Oct 24 10:24:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Oct 2006 09:24:44 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0436.3050903@sendu.me.uk>
Message-ID: <000501c6f778$279cee10$15327e82@pyrimidine>

...
> >> Since
> >> $VERSION = 1.52_10;
> >> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release,
> >> final release version should be
> >> $VERSION = 1.6010.
> >
> > Because they are dealt with separately, I don't think this is an issue
> > (see above).
> 
> If you don't notice the dates, or are doing numerical version number
> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
> not be automatic, but you can still chose to download the developer
> releases. Which means if we say to someone 'use Bioperl 1.6 or better'
> they may choose to get the latest version and think it is 1.6002 when
> infact 1.60 was the more recent version. 1.6010 solves the problem, is
> consistent with your 1.50_10 suggestion, and doesn't cause any problems
> as far as I can see.

CPAN looks like it can handle 'x.y.z', at least for Pugs:

http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/

From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':

our $VERSION = 6.002013;

That's also a very perlish way to do it.  And there are no developer
versions of Pugs, since it is always under active development.  We could try
something like:

our $VERSION = 1.005002_01;

just to tag it as a developer release or release candidate, if that's what
you want; I'm neutral on that point.  I don't think it's necessary to post
every RC to CPAN, though, unless you feel very strongly about it.  It just
seems like more hassle than it's worth, esp. since you've been releasing
about one per week leading up to a final 1.5.2 (due soon).
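
For illustration, a sketch of how an 'x.y.z' release name maps onto the
Pugs-style decimal, with three digits per component. The helper name
dotted_to_decimal is hypothetical, and it assumes that each component
after the first fits in three digits.

```perl
use strict;
use warnings;

# Hypothetical helper: 'major.minor.patch' -> 'major.MMMPPP'.
sub dotted_to_decimal {
    my ($dotted) = @_;
    my ($major, @rest) = split /\./, $dotted;
    # Zero-pad every component after the major number to three digits.
    return join '', "$major.", map { sprintf('%03d', $_) } @rest;
}

print dotted_to_decimal('6.2.13'), "\n";   # 6.002013
print dotted_to_decimal('1.5.2'),  "\n";   # 1.005002
```

Note that 1.005002 is numerically smaller than 1.4, which matters for
anything that compares against the existing 1.4 release.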

> >> I might disagree with this though. I think perl people, and perhaps
> unix
> >> people in general, should be used to version numbers like '1.5.2', but
> >> then getting '1.52' from the code since such a number allows simple
> >> numerical comparisons while the former does not. The former is easier
> to
> >> read and understand. This is just how Perl itself behaves.
> >>
> >> Most users who wouldn't expect such a behaviour aren't going to be
> >> checking the version number programatically anyway.
> >>
> >>
> >> BTW. do we have someone with a CPAN account, or should I get one?
> >>
> >
> > It says Ewan Birney is the author of Bioperl - I assume it must be
> > possible to have multiple people have the permissions to update a single
> > package.

As a quick response to the above, I would read 'rel. 1.5.2' as the second
patched release of the second revision (here in a developer cycle) of the
first major release.  I would read 'rel 1.52' as the 52nd release of the
major release (just can't quite make it to version 2, I guess).  I don't
think we can use the latter as it is just too confusing, especially since
we've adopted the 'major.minor.patch' versioning quite early on.  

As for CPAN, I believe there is usually a person or group responsible for
maintaining each distribution.  As Ewan seems to be the point man, you'll
have to ask him.  I suppose it is possible to add more if needed.

> How did you get Bundle::BioPerl updated? Did you just ask Chris
> Dagdigian to do it for you? Or do you have access to his account? I'll
> ask Ewan about it.

When I inquired about XML::Simple, I emailed Chris D. via his contact
information from CPAN.  He let me know that adding it would be pretty easy,
so all you need to do is let him know about any errors/additions/deletions.
I think his wiki page also has some contact info.  

Which reminds me, if anyone contacts him, could you make sure that
XML::Simple is added?  I can't remember if it has been.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 24 10:29:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Oct 2006 09:29:11 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0FB2.4080002@sheffield.ac.uk>
Message-ID: <000601c6f778$c639f0e0$15327e82@pyrimidine>

> Sendu Bala wrote:
> > Nathan Haigh wrote:
> >> I see - you mean for a non-RC release append 10 to the version number
> >> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
> >> the version.
> >
> > Precisely.
> >
> > 1.5.2 RC3 will have in Bio::Root::Version :
> >
> > $VERSION = 1.52_03;
> > $VERSION = eval $VERSION; # $VERSION is 1.5203
> >
> > 1.5.2 final release would have:
> >
> > $VERSION = 1.52_10;
> > $VERSION = eval $VERSION; # $VERSION is 1.5210
> >
> > 1.6.0 RC1 would have:
> >
> > $VERSION = 1.60_01;
> > $VERSION = eval $VERSION; # $VERSION is 1.6001
> >
> > 1.6.0 final release would have:
> >
> > $VERSION = 1.6010;
> >
> >
> > Nice thing about putting RCs up on CPAN is that I suppose we'd see the
> > test results from cpantesters. The more test results the better :)
> Did you see the cpants site I sent earlier:
> http://cpants.perl.org/dist/bioperl
> 
> But I'm not sure why 1.4 didn't make it in there instead of 1.2.3

Yes, odd.  Another thing to note is that CPAN also lists two bugs related to
bioperl 1.4.  We may need to have some way of either redirecting users from
there to bugzilla, or routinely checking the CPAN site.  Otherwise we'll
miss those.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From JK at novozymes.com  Tue Oct 24 10:45:26 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 16:45:26 +0200
Subject: [Bioperl-l] Keeping references around in the objects?
Message-ID: <934F95E71B6C9347A873C42AE3C196191299E011@NZT0004E.dknz.nzcorp.net>

Hi All. 

When getting a Bio::Seq object back from a feature it would be really 
nice to have access to the old objects through the new object as:

$featseq->feature()->parent_seq();

Would it be possible to keep the references around so that, for example,
the global information can be accessed through the particular feature?

Most of the annotation in the general header of an EMBL/Genbank record
also applies to the specific features.

Jesper


From JK at novozymes.com  Tue Oct 24 10:28:22 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 16:28:22 +0200
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
Message-ID: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>

Hi. 

We're trying to "extend" bioperl in our own setup. We have some functions
that we'd like to "always" have available on a Bio::Seq object. As an
example, I'd like to have the sequence digest available via ->digest, which
just returns a hex-encoded message digest of the sequence in the object.
This is really convenient when trying to figure out whether we've got some
computations stored in the cache for this particular sequence.

Another example is that we have some fields we want to be mandatory in
the objects, so adding additional checks in the constructor is necessary.

Our approach has been to "subclass" Bio::Seq in a new class (Nz::Seq)
and add the functionality there. This generally works fine (->translate()
calls ->can_call_new() and instantiates the correct subclassed object).

But the logic fails when the ->seq of a feature just instantiates a
Bio::PrimarySeq without trying to get the subclassed object.

So the question basically is:
What is the preferred way of extending/subclassing Bioperl objects
with our own methods?

Jesper


From bix at sendu.me.uk  Tue Oct 24 11:26:19 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 16:26:19 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <000501c6f778$279cee10$15327e82@pyrimidine>
References: <000501c6f778$279cee10$15327e82@pyrimidine>
Message-ID: <453E309B.9090007@sendu.me.uk>

Chris Fields wrote:
> ...
>>>> Since
>>>> $VERSION = 1.52_10;
>>>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release,
>>>> final release version should be
>>>> $VERSION = 1.6010.
>>> Because they are dealt with separately, I don't think this is an issue
>>> (see above).
>> If you don't notice the dates, or are doing numerical version number
>> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
>> not be automatic, but you can still chose to download the developer
>> releases. Which means if we say to someone 'use Bioperl 1.6 or better'
>> they may choose to get the latest version and think it is 1.6002 when
>> infact 1.60 was the more recent version. 1.6010 solves the problem, is
>> consistent with your 1.50_10 suggestion, and doesn't cause any problems
>> as far as I can see.
> 
> CPAN looks like it can handle 'x.y.z', at least for Pugs:
> 
> http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/

'handle'? I think it shows up as '6.2.13' simply because it was uploaded 
with the filename Perl6-Pugs-6.2.13.tar.gz


As you point out, the code has the kind of $VERSION number we've been 
suggesting in this thread:

> From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':
> 
> our $VERSION = 6.002013;
> 
> That's also a very perlish-way to do it.  And there are no developer
> versions of Pugs, since it is always under active development.  We could try
> something like:
> 
> our $VERSION = 1.005002_01;

Yes, this was already like one of my suggestions (1.0502_01), but I 
brought up the concern that 1.05 might be < 1.4.

So then we have a question: do we try and fumble a 1.4 compatible number 
by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if 
it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no 
room for RC numbering, or 1.006000010 (1.6.0.10) - the first final 
release following some 1.006000_001 (1.6.0.01 == rc1) RCs?


> just to tag it as a developer release or release candidate, if that's what
> you want; I'm neutral to that point.  I don't think it's necessary to post
> every RC to CPAN, though, unless you feel very strongly about it.  It just
> seems like more hassle than it's worth, esp. since you've been releasing
> about one per week leading up to a final 1.5.2 (due soon).  

I don't think it would be a hassle; on the contrary it would be very 
useful to know the CPAN distribution actually works. I'm very happy with 
the idea that a release candidate gets fully tested...


From bix at sendu.me.uk  Tue Oct 24 11:39:16 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 16:39:16 +0100
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
Message-ID: <453E33A4.5060004@sendu.me.uk>

JK (Jesper Agerbo Krogh) wrote:
> Hi. 
> 
> We're trying to "extend" bioperl in our own setup. We have some funtions
> that we'd like to "allways" have available on a Bio::Seq-object.
[snip]
> So the question basically is: 
> What is the preferred way of extending/subclassing Bio-perl -objects
> with our own methods? 

http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit


From hlapp at gmx.net  Tue Oct 24 12:24:09 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 24 Oct 2006 12:24:09 -0400
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
Message-ID: <F563FB85-D711-481A-A038-EDCA4B941C3C@gmx.net>

I think you've generally taken the right path, but see below.

First off, object factories are used extensively already but not yet  
in each and every place where Bioperl creates an object internally.  
Achieving your goal may entail fixes to Bioperl to use a factory  
instead of a hard-coded module name. Also be on the lookout for  
factory() or seq_factory() methods for classes whose work entails  
creating sequence objects and that already give you control over the  
type to be created.

The problem that hits you here though isn't one of determining the  
type of the object to be created, because the respective method  
doesn't create a sequence object. It only returns the sequence object  
that the feature has a reference to.

The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your
extension of the latter is that the Perl garbage collector can't deal
with circular references. The way we've circumvented the problem with
sequence objects (which hold references to their feature objects) and
feature objects (which need to hold a reference to their sequence
object) is to make Bio::Seq a wrapper around Bio::PrimarySeq (i.e.,
Bio::Seq implements Bio::PrimarySeqI by delegating all the
Bio::PrimarySeqI methods to an instance of Bio::PrimarySeq, and then
adds implementations of the Bio::SeqI methods), and then make feature
objects only hold a reference to the 'base' Bio::PrimarySeq instance.
This works because Bio::PrimarySeq doesn't hold features, only
Bio::SeqI objects do.
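
The wrapper arrangement described above can be sketched with plain Perl
classes. This is a simplified illustration, not the actual Bioperl
implementation: the My::* packages and the add_feature method are
stand-ins, and the DESTROY counter exists only to show that nothing is
kept alive by a reference cycle.

```perl
use strict;
use warnings;

my $destroyed = 0;

package My::PrimarySeq;   # stand-in for the 'base' sequence; holds no features
sub new     { my ($class, %args) = @_; return bless { seq => $args{seq} }, $class }
sub seq     { return $_[0]{seq} }
sub DESTROY { $destroyed++ }

package My::Feature;      # refers only to the base sequence
sub new        { return bless {}, shift }
sub attach_seq { my ($self, $pseq) = @_; $self->{seq} = $pseq; return }
sub entire_seq { return $_[0]{seq} }

package My::Seq;          # wrapper: delegates seq(), owns the features
sub new {
    my ($class, %args) = @_;
    return bless { primary => My::PrimarySeq->new(%args), features => [] }, $class;
}
sub seq { return $_[0]{primary}->seq }        # delegated to the base object
sub add_feature {
    my ($self, $feat) = @_;
    $feat->attach_seq($self->{primary});      # hand over the base object, not $self
    push @{ $self->{features} }, $feat;
    return;
}

package main;
{
    my $seq = My::Seq->new(seq => 'ATGC');
    $seq->add_feature(My::Feature->new);
}   # $seq goes out of scope; no cycle, so everything is freed
print "primary sequences destroyed: $destroyed\n";   # 1
```

Because the feature points at the base object and the base object points
at nothing, plain reference counting is enough to free all three objects.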

Having said all that, note that if all you want to do is define
computations on Bio::Seq objects, as opposed to storing values for
additional attributes, the best design approach is not to extend the
class but to create a class with those computations as static methods
(which would accept the seq object on which to compute as an argument;
e.g., print $seqComputations->message_digest($seq)).
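
A minimal sketch of that static-method approach, assuming an MD5 hex
digest is what is wanted (the thread does not say which digest is used).
The package names My::SeqComputations and My::MockSeq are hypothetical;
the mock only stands in for any object with a seq() accessor, such as a
Bio::Seq.

```perl
use strict;
use warnings;
use Digest::MD5 ();

package My::SeqComputations;   # hypothetical home for the computations
# Class method: takes the sequence object to compute on as an argument.
sub message_digest {
    my ($class, $seq) = @_;
    return Digest::MD5::md5_hex($seq->seq);
}

package My::MockSeq;           # stand-in for a Bio::Seq-like object
sub new { my ($class, $str) = @_; return bless { seq => $str }, $class }
sub seq { return $_[0]{seq} }

package main;
my $seq             = My::MockSeq->new('ATGC');
my $seqComputations = 'My::SeqComputations';
print $seqComputations->message_digest($seq), "\n";
```

The computation lives in one place and works on any sequence object,
whichever class created it.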

	-hilmar


On Oct 24, 2006, at 10:28 AM, JK ((Jesper Agerbo Krogh)) wrote:

> Hi.
>
> We're trying to "extend" bioperl in our own setup. We have some  
> funtions
>
> that we'd like to "allways" have available on a Bio::Seq-object. As an
> example,
> I'd like to have the sequence-digest available on ->digest that just
> returns
> A hex-encoded message-digest of the sequence in the object. This is
> really comfortable
> when trying to figure out wether we've got some computations stored in
> the cache
> for this particular sequence.
>
> Another example is that we have some fields we want to be mandatory in
> the objects,
> thus adding additional checks in the constructor is nessesary.
>
> Our approach has been to "subclass" Bio::Seq in a new object:  
> (Nz::Seq)
> and add
> the functionality there. This generally works fine (->translate()  
> calls
> ->can_call_new()
> and instantiates the correct subclassed object.
>
> But the logic fails when the ->seq of a feature just instantiates a
> Bio::PrimarySeq
> without trying to get the subclassed object.
>
> So the question basically is:
> What is the preferred way of extending/subclassing Bio-perl -objects
> with
> our own methods?
>
> Jesper

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct 24 12:45:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Oct 2006 11:45:25 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E309B.9090007@sendu.me.uk>
Message-ID: <000001c6f78b$d1c65a30$15327e82@pyrimidine>

...
> 
> 'handle'? I think it shows up as '6.2.13' simply because it was uploaded
> with the filename Perl6-Pugs-6.2.13.tar.gz

Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is
'6.002013'.  So maybe we should follow a similar convention.  Seems easier
and less confusing to me, at least.
 
> As you point out, the code has the kind of $VERSION number we've been
> suggesting in this thread:
> 
> > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':
> >
> > our $VERSION = 6.002013;
> >
> > That's also a very perlish-way to do it.  And there are no developer
> > versions of Pugs, since it is always under active development.  We could
> try
> > something like:
> >
> > our $VERSION = 1.005002_01;
> 
> Yes, this was already like one of my suggestions (1.0502_01), but I
> brought up the concern that 1.05 might be < 1.4.
> 
> So then we have a question: do we try and fumble a 1.4 compatible number
> by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if
> it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no
> room for RC numbering, or 1.006000010 (1.6.0.10) - the first final
> release following some 1.006000_001 (1.6.0.01 == rc1) RCs?

I would go for the clean break if it follows perl/CPAN convention.
'1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing.

If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6
RC1, 1.6 RC2 etc then that would be consistent and perl-compatible. 

BTW, the reason I looked at Pugs was to see what some of the Perl6
developers were using.  Who knows; they'll probably change it!

...

> I don't think it would be a hassle; on the contrary it would be very
> useful to know the CPAN distribution actually works. I'm very happy with
> the idea that a release candidate gets fully tested...

So you obviously feel strongly about it!  ;> 

I don't have a problem as long as we stick with doing this from now on (i.e.
have a consistent versioning scheme, release policy, CPAN release policy,
etc).  Would be nice for Jason/Brian/Hilmar to chime in as to the reasoning
behind the older versioning scheme.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From JK at novozymes.com  Tue Oct 24 13:59:10 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 19:59:10 +0200
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
	<F563FB85-D711-481A-A038-EDCA4B941C3C@gmx.net>
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net>

>  
> I think you've generally taken the right path, but see below.
> 
> First off, object factories are used extensively already but not yet  
> in each and every place where Bioperl creates an object internally.  
> Achieving your goal may entail fixes to Bioperl to use a factory  
> instead of a hard-coded module name. Also be on the lookout for  
> factory() or seq_factory() methods for classes whose work entails  
> creating sequence objects and that already give you control over the  
> type to be created.

Can you elaborate/describe this a bit more? 

> The problem that hits you here though isn't one of determining the  
> type of the object to be created, because the respective method  
> doesn't create a sequence object. It only returns the sequence object  
> that the feature has a reference to.

That is what Data::Dumper told me. Something else I would like to change
is to get a RichSeq object returned every time from Bio::Seq, adding in
the stuff that always seems appropriate.

> The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your  
> extension of the latter is that the Perl garbage collector can't deal  
> with circular references. 

Doesn't Scalar::Util::weaken solve that? 

> Having said all that, note that if all what you want to do is  
> defining computations on Bio::Seq objects, as opposed to storing  
> values for additional attributes, the best design approach is not to  
> extend the class but to create a class with those computations as  
> static methods (which would accept the seq object on which to compute  
> as an argument; e.g., print $seqComputations->message_digest($seq)).

I could, but there is some functionality that, by design, I would like to
have available on every sequence in the system. This way I would end up
coding the functionality for getting the message_digest in every place
that I needed the value (which would be quite often in this application),
whereas by design it belongs in the Bio::Seq stuff.

Jesper


From JK at novozymes.com  Tue Oct 24 13:59:19 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 19:59:19 +0200
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
	<453E33A4.5060004@sendu.me.uk>
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FD@NZT0004E.dknz.nzcorp.net>


> JK (Jesper Agerbo Krogh) wrote:
> > Hi. 
> > 
> > We're trying to "extend" bioperl in our own setup. We have some funtions
> > that we'd like to "allways" have available on a Bio::Seq-object.
> [snip]
> > So the question basically is: 
> > What is the preferred way of extending/subclassing Bio-perl -objects
> > with our own methods? 
> 
> http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit

That is definitely a way of extending Bioperl, thanks.

Jesper


From hlapp at gmx.net  Tue Oct 24 14:57:02 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 24 Oct 2006 14:57:02 -0400
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
	<F563FB85-D711-481A-A038-EDCA4B941C3C@gmx.net>
	<934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net>
Message-ID: <C8DB5DCD-E5BB-4AA0-9CDA-3C2EC7B88621@gmx.net>


On Oct 24, 2006, at 1:59 PM, JK ((Jesper Agerbo Krogh)) wrote:

>>
>> I think you've generally taken the right path, but see below.
>>
>> First off, object factories are used extensively already but not yet
>> in each and every place where Bioperl creates an object internally.
>> Achieving your goal may entail fixes to Bioperl to use a factory
>> instead of a hard-coded module name. Also be on the lookout for
>> factory() or seq_factory() methods for classes whose work entails
>> creating sequence objects and that already give you control over the
>> type to be created.
>
> Can you elaborate/describe this a bit more?

See for example the POD of Bio::SeqIO (sorry, the method is called  
sequence_factory()).

>
>> The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your
>> extension of the latter is that the Perl garbage collector can't deal
>> with circular references.
>
> Doesn't Scalar::Util::weaken solve that?

You're welcome to test and try. It should be a simple change in  
Bio::Seq::add_SeqFeature(). You will see that it is this method and  
not the feature object that makes sure the wrapped primarySeq gets  
passed as sequence reference. Just change that to creating a new  
reference to the sequence object and make it a weak reference before  
passing it to the feature object.

(The feature object has no requirement (or knowledge) that the  
referenced sequence object is a PrimarySeq.)
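
To illustrate the weak-reference idea discussed above, here is a minimal sketch using only core Perl. The two hashes are hypothetical stand-ins for a sequence and a feature; they are not the real Bio::Seq or Bio::SeqFeature classes, and nothing here claims to show what Bio::Seq::add_SeqFeature() actually does.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Hypothetical stand-ins for a sequence and a feature (plain hashes,
# NOT the real BioPerl classes).
my $seq     = { id => 'seq1', features => [] };
my $feature = { name => 'exon1' };

# The sequence holds a strong reference to its feature ...
push @{ $seq->{features} }, $feature;

# ... and the feature points back at the sequence. Weakening the back
# reference means it no longer counts toward the sequence's refcount,
# so the pair does not form a strong cycle.
$feature->{seq} = $seq;
weaken( $feature->{seq} );

print "back reference is weak: ",
    ( isweak( $feature->{seq} ) ? "yes" : "no" ), "\n";

# Dropping the last strong reference frees the sequence, and the weak
# back reference becomes undef.
undef $seq;
print "sequence freed: ",
    ( defined $feature->{seq} ? "no" : "yes" ), "\n";
```

The trade-off is that the feature must now cope with its sequence reference turning into undef once the owning sequence goes away.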

>
>> Having said all that, note that if all you want to do is
>> define computations on Bio::Seq objects, as opposed to storing
>> values for additional attributes, the best design approach is not to
>> extend the class but to create a class with those computations as
>> static methods (which would accept the seq object on which to compute
>> as an argument; e.g., print $seqComputations->message_digest($seq)).
>
> I could, but there is some functionality that I would, by design, like to
> have available on every sequence in the system. This way I would end up
> coding the functionality for getting the message_digest in every place
> that I needed the value (which would be quite often in this application),
> whereas by design it belongs in the Bio::Seq-stuff.

I'm not following why this would make any difference (it would be  
$seq->message_digest() compared to $seqCompute->message_digest($seq)),  
unless what you are saying is that you would like to cache the result  
of the computation.
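
As a rough sketch of the helper-class approach, the example below keeps the computation outside the sequence class. My::ToySeq and My::SeqCompute are hypothetical names invented for illustration; My::ToySeq only mimics the seq() accessor of a sequence object, and MD5 is an arbitrary choice of digest.

```perl
use strict;
use warnings;
use Digest::MD5 ();

# Hypothetical minimal sequence class standing in for Bio::Seq;
# only the seq() accessor is mimicked.
package My::ToySeq;
sub new { my ( $class, %args ) = @_; return bless { seq => $args{-seq} }, $class }
sub seq { return $_[0]->{seq} }

# Hypothetical helper class that holds the computations. The sequence
# object is passed in as an argument instead of being subclassed.
package My::SeqCompute;
sub new { my ($class) = @_; return bless {}, $class }

sub message_digest {
    my ( $self, $seq ) = @_;
    return Digest::MD5::md5_hex( $seq->seq );
}

package main;

my $seq        = My::ToySeq->new( -seq => 'ATGCATGC' );
my $seqCompute = My::SeqCompute->new;

# The call site differs from $seq->message_digest() only in shape.
my $digest = $seqCompute->message_digest($seq);
print "$digest\n";
```

The digest logic lives in one place either way; what changes is which object the method is called on.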

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Wed Oct 25 06:36:27 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 25 Oct 2006 11:36:27 +0100
Subject: [Bioperl-l] Lagan environment variable
Message-ID: <453F3E2B.2040309@sendu.me.uk>

Notification to say I'm changing the environmental variable that 
Bio::Tools::Run::Alignment::Lagan expects to define the location of the 
lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the 
default variable that the lagan installation and scripts themselves look 
for.

I hope this isn't too much of a burden, but it seems like the sensible 
approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.


Thank you,
Sendu.


From n.haigh at sheffield.ac.uk  Wed Oct 25 09:07:47 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 25 Oct 2006 13:07:47 +0000
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F3E2B.2040309@sendu.me.uk>
References: <453F3E2B.2040309@sendu.me.uk>
Message-ID: <453F61A3.4090904@sheffield.ac.uk>

Sendu Bala wrote:
> Notification to say I'm changing the environmental variable that 
> Bio::Tools::Run::Alignment::Lagan expects to define the location of the 
> lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the 
> default variable that the lagan installation and scripts themselves look 
> for.
>
> I hope this isn't too much of a burden, but it seems like the sensible 
> approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.
>
>
> Thank you,
> Sendu.
Wouldn't it make more sense to change the test? That is what I've just
done for t/Genscan.t

It seemed to fit in with the ENV variable syntax that other modules in
Bioperl-run used.

Nath



From bix at sendu.me.uk  Wed Oct 25 08:12:00 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 25 Oct 2006 13:12:00 +0100
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F61A3.4090904@sheffield.ac.uk>
References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk>
Message-ID: <453F5490.7060808@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Notification to say I'm changing the environmental variable that 
>> Bio::Tools::Run::Alignment::Lagan expects to define the location of the 
>> lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the 
>> default variable that the lagan installation and scripts themselves look 
>> for.
>>
>> I hope this isn't too much of a burden, but it seems like the sensible 
>> approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.
>
> Wouldn't it make more sense to change the test? That is what I've just
> done for t/Genscan.t

For Genscan.t, the test script looked at the wrong environment variable.

Here I'm talking about lagan itself (the thing you get from 
http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with 
Bioperl) needing the environment variable LAGAN_DIR to be set in order 
to work.

Since you need to set LAGAN_DIR to make lagan work, it makes sense that 
the Bioperl front-end to lagan also use the same variable.
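
A minimal sketch of how a wrapper might resolve the directory from the environment. The helper name lagan_dir() is hypothetical, and the fallback to the old LAGANDIR name is an assumption added for illustration; it is not something the change described here does.

```perl
use strict;
use warnings;

# Hypothetical helper: prefer LAGAN_DIR (the variable lagan itself reads).
# The fallback to the old LAGANDIR name is an assumption for this sketch.
sub lagan_dir {
    my ($env) = @_;

    return $env->{LAGAN_DIR}
        if defined $env->{LAGAN_DIR} && length $env->{LAGAN_DIR};

    if ( defined $env->{LAGANDIR} && length $env->{LAGANDIR} ) {
        warn "LAGANDIR is deprecated; please set LAGAN_DIR instead\n";
        return $env->{LAGANDIR};
    }

    return;    # nothing set
}

my $dir = lagan_dir( \%ENV );
print defined $dir ? "lagan directory: $dir\n" : "LAGAN_DIR is not set\n";
```

Passing the environment in as a hash reference keeps the helper easy to exercise without touching the real %ENV.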


From n.haigh at sheffield.ac.uk  Wed Oct 25 09:16:16 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 25 Oct 2006 13:16:16 +0000
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F5490.7060808@sendu.me.uk>
References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk>
	<453F5490.7060808@sendu.me.uk>
Message-ID: <453F63A0.7040609@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Notification to say I'm changing the environmental variable that
>>> Bio::Tools::Run::Alignment::Lagan expects to define the location of
>>> the lagan executables from LAGANDIR to LAGAN_DIR, since the latter
>>> is the default variable that the lagan installation and scripts
>>> themselves look for.
>>>
>>> I hope this isn't too much of a burden, but it seems like the
>>> sensible approach to getting Bio::Tools::Run::Alignment::Lagan to
>>> actually work.
>>
>> Wouldn't it make more sense to change the test? That is what I've just
>> done for t/Genscan.t
>
> For Genscan.t, the test script looked at the wrong environment variable.
>
> Here I'm talking about lagan itself (the thing you get from
> http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with
> Bioperl) needing the environment variable LAGAN_DIR to be set in order
> to work.
>
> Since you need to set LAGAN_DIR to make lagan work, it makes sense
> that the Bioperl front-end to lagan also use the same variable.
>
Ah, OK! :-[  That'll teach me to speak up about something I know nothing
about! :-)

FYI, I've been busy this morning installing as much Bioperl-run external
software as I could (those that have tests). Will be posting results shortly.

Nath


From massimo.ubaldi at gmail.com  Wed Oct 25 10:28:52 2006
From: massimo.ubaldi at gmail.com (Massimo Ubaldi)
Date: Wed, 25 Oct 2006 16:28:52 +0200
Subject: [Bioperl-l] blastxml format
Message-ID: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>

Hi,
I'm using the script below to parse blastn output for multiple query
sequences. I got the output from the BLAST web interface, asking for
XML-formatted output.
Everything works fine except that I cannot print the name of each input
sequence (see below).
That is, using $result->query_description (see below) I get just
the name of the first sequence. In fact, this is defined by the
<BlastOutput_query-def> tag.
What I really want is to extract the name that is defined by the
<Iteration_query-def> tag.
I have searched the bioperl mailing list and other sources but did not
find anything to solve this.
Can somebody help me?
Thanks a lot,
Massimo


 This is an example of the output I got

MRDNA_probe
46.1    PREDICTED: Danio rerio similar to mineralocorticoid receptor form B
(LOC562171), mRNA    68354945    XM_685568
81.8    Danio rerio VDR-B mRNA, partial cds    68132043    DQ017633
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA
68420187    XM_684078

This is what I'd like to get
MRDNA_probe
46.1    PREDICTED: Danio rerio similar to mineralocorticoid receptor form B
(LOC562171), mRNA    68354945    XM_685568
VDRacterm_probe
81.8    Danio rerio VDR-B mRNA, partial cds    68132043    DQ017633
ARalpcterm_probe
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA
68420187    XM_684078

This is the script
#!/usr/bin/perl
use strict;
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast',
                            -file   => 'Blastn_danio.bls');
open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file,
stopped";
my $result = $in->next_result;
print OUTFILE $result->algorithm, "\n";
print OUTFILE $result->database_name, "\n";

print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers",
"\t", "GenBank Accession", "\n";

while($result = $in->next_result ) {
    print OUTFILE $result->query_description, "\n";
      while( my $hit = $result->next_hit ) {
           while( my $hsp = $hit->next_hsp ) {

                my $acc=$hit->name;
                my $description= $hit->description;

                $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/;

                print OUTFILE

                  $hit->raw_score, "\t", # Score
                  $hit->description, "\t", # Description

                $1, "\t", $2, "\n";
         }
      }
}


From cjfields at uiuc.edu  Wed Oct 25 11:04:14 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Oct 2006 10:04:14 -0500
Subject: [Bioperl-l] blastxml format
In-Reply-To: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>
Message-ID: <000301c6f846$d6227760$15327e82@pyrimidine>

Iterations (which are related to PSIBLAST) aren't currently handled in
blastxml, which is why the tag isn't being parsed.  I'll give it a look but
I don't think it will be properly fixed anytime soon, since we're gearing up
for a developer release and are sorting out various bugs in relation to
that.

In the meantime, you could always try changing the relevant tag in the
%MAPPING hash in your local copy of Bio::SearchIO::blastxml from
'BlastOutput_query-def' to 'Iteration_query-def', which may do the trick for
you.  I'm a bit reluctant to change this in CVS as it would be better to add
this in when iterations are handled properly by blastxml, and I'm not sure
all BLAST XML varieties have the <Iteration_query-def> tag.

If you want you can add this to the bioperl bugzilla as an enhancement
request to remind us:

http://bugzilla.open-bio.org/

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 



From massimo.ubaldi at gmail.com  Wed Oct 25 11:20:49 2006
From: massimo.ubaldi at gmail.com (Massimo Ubaldi)
Date: Wed, 25 Oct 2006 17:20:49 +0200
Subject: [Bioperl-l] blastxml format
In-Reply-To: <000301c6f846$d6227760$15327e82@pyrimidine>
References: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>
	<000301c6f846$d6227760$15327e82@pyrimidine>
Message-ID: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com>

Thanks for the reply. I've already tried this but I got exactly the same
results as before.
What else can I try?
Massimo



From cjfields at uiuc.edu  Wed Oct 25 12:56:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Oct 2006 11:56:46 -0500
Subject: [Bioperl-l] blastxml format
In-Reply-To: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com>
Message-ID: <000001c6f856$8ee44bc0$15327e82@pyrimidine>


> Thanks for the reply. I've already tried this but I got exactly the same
> results as before.
> What else can I try?
> Massimo

If you don't mind me asking, what version of perl and Bioperl are you using,
and what version of BLAST is used?  

I want to point out that there are a number of problems with your script,
now that I have had a chance to look at it.  

1) You have the SearchIO format set to 'blast'.  It should be 'blastxml' if
you are parsing XML format.  

2) Every time you call next_result() you advance to the next BLAST report.
In effect, you're doing something like this:

  my $result = $in->next_result();
   ....# do something here (in first BLAST report)
 
  while ($result = $in->next_result()) { # change to second BLAST report
      # more stuff here (in second BLAST report, if there is one)
  }

I don't know if it's intentional though, but it's something to point out.

3) You also use raw_score(), which doesn't return a value for me (this may
be related to the bioperl version, which is why I asked above).  If you use
$hit->bits() or $hit->significance() you can get the bits or hit evalue,
respectively.

4) Also, I didn't see a difference with the two XML tags
<BlastOutput_query-def> and <Iteration_query-def> using BLAST 2.2.15 output
(WebBLAST at NCBI), which makes sense since they should originate from the
same query sequence anyway.  This could be related to the BLAST version.

Here's my version of your script, using WinXP and bioperl-live (CVS):

use strict;
use warnings;
use Bio::SearchIO;
my $file = shift @ARGV;

my $in = new Bio::SearchIO(-format => 'blastxml',
                            -file   => $file);

# 'or' (not '||') so that die applies to open() rather than to the filename
open OUTFILE, ">parsed_blastn_danio.txt"
    or die "Could not open file: $!";

while(my $result = $in->next_result ) {
    print OUTFILE $result->algorithm, "\n";
    print OUTFILE $result->database_name, "\n";
    print OUTFILE "Score", "\t",
                  "Description", "\t",
                  "NCBI gi identifiers", "\t",
                  "GenBank Accession", "\n";
    print OUTFILE $result->query_description, "\n";
    while( my $hit = $result->next_hit ) {
        while( my $hsp = $hit->next_hsp ) {
            my $acc=$hit->name;
            my $description= $hit->description;
            if ($acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/) {
                print OUTFILE $hit->bits, "\t", # Score
                  $hit->description, "\t", # Description
                  $1, "\t", $2, "\n";
            }
        }
    }
}
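
The regular expression used in the script can be tried on its own. The hit name below is an assumed example of an NCBI-style identifier; the "ref" database tag and the ".1" version suffix are illustrative, while the gi number and accession are taken from the sample output earlier in the thread.

```perl
use strict;
use warnings;

# Assumed example of an NCBI-style hit name (illustrative only).
my $name = 'gi|68354945|ref|XM_685568.1|';

my ( $gi, $acc );
if ( $name =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/ ) {
    # $1 is the gi number, $2 the accession without its version suffix
    ( $gi, $acc ) = ( $1, $2 );
    print "$gi\t$acc\n";    # prints "68354945<TAB>XM_685568"
}
else {
    warn "Unrecognised hit name: $name\n";
}
```

Testing the match before using $1 and $2 matters because, after a failed match, they still hold the captures from the last successful one.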

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

...


From n.haigh at sheffield.ac.uk  Thu Oct 26 04:47:27 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 09:47:27 +0100
Subject: [Bioperl-l] More extensive Bioperl-run 1.5.2RC2 tests
Message-ID: <4540761F.6010904@sheffield.ac.uk>

Oops, I posted this to the Biojava list the other day by mistake!

I have recently installed some more software for which there are
bioperl-run tests and run the test suite with several versions of the
software I could find. I've added info to
http://www.bioperl.org/wiki/Release_1.5.2#bioperl-run. If there were any
fails in any of the versions I tested I've noted them together with
versions that were ok (if any).

There may be another 6 or so programs I'm trying to get hold of to run
further tests - I'll update when I get them.
Nath


From n.haigh at sheffield.ac.uk  Thu Oct 26 05:14:07 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 10:14:07 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
Message-ID: <45407C5F.40104@sheffield.ac.uk>

I'm thinking that it's not wise to test for things like
overall_percentage_identity etc in alignments that are generated by
external software like T-Coffee, Clustalw etc. Changes to software
algorithms/efficiency, bug fixes etc may well alter the quality of the
alignment produced in different versions and thus affect the value
returned by such methods. Therefore, I think these methods should only
be tested from alignments loaded directly from t/data.

Nath


From bix at sendu.me.uk  Thu Oct 26 05:48:37 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 26 Oct 2006 10:48:37 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45407C5F.40104@sheffield.ac.uk>
References: <45407C5F.40104@sheffield.ac.uk>
Message-ID: <45408475.30903@sendu.me.uk>

Nathan Haigh wrote:
> I'm thinking that it's not wise to test for things like
> overall_percentage_identity etc in alignments that are generated by
> external software like T-Coffee, Clustalw etc. Changes to software
> algorithms/efficiency, bug fixes etc may well alter the quality of the
> alignment produced in different versions and thus affect the value
> returned by such methods. Therefore, I think these methods should only
> be tested from alignments loaded directly from t/data.

Did you discover some specific problem cases?


From n.haigh at sheffield.ac.uk  Thu Oct 26 06:04:54 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 11:04:54 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408475.30903@sendu.me.uk>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
Message-ID: <45408846.1050001@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> I'm thinking that it's not wise to test for things like
>> overall_percentage_identity etc in alignments that are generated by
>> external software like T-Coffee, Clustalw etc. Changes to software
>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>> alignment produced in different versions and thus affect the value
>> returned by such methods. Therefore, I think these methods should only
>> be tested from alignments loaded directly from t/data.
>
> Did you discover some specific problem cases?
My messages seem to be taking a while to come through, but, yes. It may
be due to the software changing default parameters, but it makes testing
the output for specific details pretty difficult and inconsistent. For
example, running T-Coffee, the following command from t/TCoffee.t
results in a slightly different alignment:
$aln = $factory->run('-type' => 'profile',
                     '-profile' => $aln1,
                     '-seq'  => Bio::Root::IO->catfile("t","data","cysprot1b.fa"));

Of particular note are the gaps on the last line of the sequences. In
4.45, there are two gaps in CATH_RAT/1-333 ('gk-nm---cg') whereas in
<v4.45 this is ('gkn----mcg').
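
One way to make such a test less sensitive to the aligner version is to assert properties that should not depend on gap placement. The sketch below uses only the two fragments quoted above and core Test::More; degap() is a hypothetical helper, and this is only an illustration of the idea, not a change made to t/TCoffee.t.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# The two fragments quoted above, from T-Coffee 4.45 and <4.45.
my $new = 'gk-nm---cg';
my $old = 'gkn----mcg';

# Hypothetical helper: strip gap characters from an aligned string.
sub degap {
    my ($aligned) = @_;
    ( my $residues = $aligned ) =~ tr/-//d;
    return $residues;
}

# The aligned strings differ between versions ...
isnt( $new, $old, 'gap placement differs between versions' );

# ... but the underlying residues are identical, so this assertion
# holds regardless of where the aligner puts the gaps.
is( degap($new), degap($old), 'ungapped residues are identical' );
```

Checks of this kind confirm that the wrapper ran and returned the right sequences, while leaving exact identity values to tests on fixed alignments in t/data.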

T-Coffee v4.45 returns the following alignment:

>CATH_RAT/1-333
------mwtalpllcagawllsagat----------aeltvnaiek------------fh
ftswmkqhqktyss-reyshrlqvfannwrkiqahn----qrnhtfkmglnqfsdmsfae
ikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqgacgscwtfs
ttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqafeyilynk
gimgedsypyigkngqckfnpekavafvknvv-nitlndeaamveavalynpvsfafevt
-edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivknswgsnwgnn
gyfliergk-nm---cglaacasypipqv
>CATL_HUMAN/1-333
--------------------------------mnptlilaafclgiasatltfdhsleaq
wtkwkamhnrlygmnee-gwrravweknmkmielhnqeyregkhsftmamnafgdmtsee
frqvmngfqnrkpr----kgkvfqeplfyeaprsvdwrekg-yvtpvknqgqcgscwafs
atgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdyafqyvqdng
gldseesypyeateesckynpkysvandtgfv-dip-kqekalmkavatvgpisvaidag
hesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvknswgeewgmg
gyvkmakdrrnh---cgiasaasyptv--
>CATL_RAT/1-334
--------------------------------mtpllllavlclgtalatpkfdqtfnaq
whqwksthrrlygtnee-ewrravweknmrmiqlhngeysngkhgftmemnafgdmtnee
frqivngyrhqkhk----kgrlfqeplmlqipktvdwrekg-cvtpvknqgqcgscwafs
asgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfafqyikeng
gldseesypyeakdgsckyraeyavandtgfv-dip-qqekalmkavatvgpisvamdas
hpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvknswgkewgmd
gyikiakdrnnh---cglataasypivn-
>PAPA_CARPA/1-345
mamipsiskllfvaiclfvymglsfg-------------dfsivgysqndltsterliql
feswmlkhnkiyknidekiyrfeifkdnlkyidetn----kknnsywlglnvfadmsnde
fkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgscgscwafs
avvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsalqlvaqy-
gihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysian-qpvsvvleaa
gkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yiliknswgtgwgen
gyirikrgtgnsygvcglytssfypvkn-
>ALEU_HORVU/1-362
maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtrhalr
farfavrygksyesaaevrrrfrifsesleevrstn----rkglpyrlginrfsdmswee
fqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqahcgscwtfs
ttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqafeyikyng
gidteesypykgvngvchykaenaavqvldsv-nitlnaedelknavglvrpvsvafqvi
-dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywliknswgadwgdn
gyfkmemgk-nm---caiatcasypvvaa
>CATH_HUMAN/1-335
------mwatlpllcagawllg--------vpvcgaaelsvnslek------------fh
fkswmskhrktys-teeyhhrlqtfasnwrkinahn----ngnhtfkmalnqfsdmsfae
ikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqgacgscwtfs
ttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqafeyilynk
gimgedtypyqgkdgyckfqpgkaigfvkdva-nitiydeeamveavalynpvsfafevt
-qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivknswgpqwgmn
gyfliergk-nm---cglaacasypiplv
>CYS1_DICDI/1-343
-----mkvillfvlavftvfvs---------------srgippeeq------------sq
flefqdkfnkkys-heeylerfeifksnlgkieelnliainhkadtkfgvnkfadlssde
fknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgqcgscwsfs
ttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpnaynyiikng
giqtessypytaetgtqcnfnsanigakisnf-tmipknetvmagyivstgplaiaadav
-e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivknswgadwgeq
gyiylrrgk-nt---cgvsnfvstsii--

While T-Coffee <4.45 returned:
>CATH_RAT/1-333
----------mwtalpllcagawllsagat----------aeltvnaiek----------
--fhftswmkqhqktyss-reyshrlqvfannwrkiqahn----q----rnhtfkmglnq
fsdmsfaeikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqga
cgscwtfsttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqa
feyilynkgimgedsypyigkngqckfnpekavafvknvvn-itlndeaamveavalynp
vsfafevt-edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivkns
wgsnwgnngyfliergkn----mcglaacasypipqv
>PAPA_CARPA/1-345
mamipsiskllfvaiclfvymglsfgdfsivgysqndltsterliqlfeswml-------
-------------khnkiyknidekiyrf-----eifkdnlkyidetnkknnsywlglnv
fadmsndefkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgs
cgscwafsavvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsa
lq-lvaqygihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysia-nqp
vsvvleaagkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yilikns
wgtgwgengyirikrgtgnsygvcglytssfypvkn-
>CATL_HUMAN/1-333
-----------------------------------------mnptlilaafclgiasatl
tfdhsleaqwtkwkamhnrlygmneegwrravweknmkmielhnqeyregkhsftmamna
fgdmtseefrqvmngfqnrkprkgkvfqeplf----yeaprsvdwrekg-yvtpvknqgq
cgscwafsatgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdya
fqyvqdnggldseesypyeateesckynpkysvandtgfvd--ipkqekalmkavatvgp
isvaidaghesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvkns
wgeewgmggyvkmakdrrnh---cgiasaasyptv--
>CATL_RAT/1-334
-----------------------------------------mtpllllavlclgtalatp
kfdqtfnaqwhqwksthrrlygtneeewrravweknmrmiqlhngeysngkhgftmemna
fgdmtneefrqivngyrhqkhkkgrlfqeplm----lqipktvdwrekg-cvtpvknqgq
cgscwafsasgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfa
fqyikenggldseesypyeakdgsckyraeyavandtgfvd--ipqqekalmkavatvgp
isvamdashpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvkns
wgkewgmdgyikiakdrnnh---cglataasypivn-
>ALEU_HORVU/1-362
----maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtr
halrfarfavrygksyesaaevrrrfrifsesleevrstn----r----kglpyrlginr
fsdmsweefqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqah
cgscwtfsttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqa
feyikynggidteesypykgvngvchykaenaavqvldsvn-itlnaedelknavglvrp
vsvafqvi-dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywlikns
wgadwgdngyfkmemgkn----mcaiatcasypvvaa
>CATH_HUMAN/1-335
----------mwatlpllcagawllg--------vpvcgaaelsvnslek----------
--fhfkswmskhrktys-teeyhhrlqtfasnwrkinahn----n----gnhtfkmalnq
fsdmsfaeikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqga
cgscwtfsttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqa
feyilynkgimgedtypyqgkdgyckfqpgkaigfvkdvan-itiydeeamveavalynp
vsfafevt-qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivkns
wgpqwgmngyfliergkn----mcglaacasypiplv
>CYS1_DICDI/1-343
---------mkvillfvlavftvfvs---------------srgippeeq----------
--sqflefqdkfnkkys-heeylerfeifksnlgkieelnliain----hkadtkfgvnk
fadlssdefknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgq
cgscwsfsttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpna
ynyiiknggiqtessypytaetgtqcnfnsanigakisnft-mipknetvmagyivstgp
laiaadav-e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivkns
wgadwgeqgyiylrrgkn----tcgvsnfvstsii--


From sanges at biogem.it  Thu Oct 26 06:26:36 2006
From: sanges at biogem.it (Remo Sanges)
Date: Thu, 26 Oct 2006 11:26:36 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408846.1050001@sheffield.ac.uk>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
	<45408846.1050001@sheffield.ac.uk>
Message-ID: <45408D5C.1000305@biogem.it>

Nathan Haigh wrote:
> Sendu Bala wrote:
>   
>> Nathan Haigh wrote:
>>     
>>> I'm thinking that it's not wise to test for things like
>>> overall_percentage_identity etc in alignments that are generated by
>>> external software like T-Coffee, Clustalw etc. Changes to software
>>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>>> alignment produced in different versions and thus affect the value
>>> returned by such methods. Therefore, I think these methods should only
>>> be tested from alignments loaded directly from t/data.
>>>       
>> Did you discover some specific problem cases?
>>     
> My messages seem to be taking a while to come through, but, yes. It may
> be due to the software changing default parameters, but it makes testing
> the output for specific details pretty difficult and inconsistent. For
> example, running T-Coffee, the following command from t/TCoffee.t
> results in slightly different alignment:
> $aln = $factory->run('-type' => 'profile',
>                      '-profile' => $aln1,
>                      '-seq'  =>
> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>
> Of particular note, is the gaps on the last line of the sequences. In
> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
> <v4.45 this is ('gkn----mcg').
>   
I'm not a T-Coffee user, but you usually come across
these problems when you use different scoring parameters
when aligning sequences.

Could it be that they have simply changed the
default parameters for gap penalties and that kind of
stuff? Is it possible to set them?

If so you can just run the test by defining
the scores in the param hash without using the default.

HTH

Remo


From n.haigh at sheffield.ac.uk  Thu Oct 26 06:33:55 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 11:33:55 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408D5C.1000305@biogem.it>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
	<45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it>
Message-ID: <45408F13.9020209@sheffield.ac.uk>

Remo Sanges wrote:
> Nathan Haigh wrote:
>> Sendu Bala wrote:
>>  
>>> Nathan Haigh wrote:
>>>    
>>>> I'm thinking that it's not wise to test for things like
>>>> overall_percentage_identity etc in alignments that are generated by
>>>> external software like T-Coffee, Clustalw etc. Changes to software
>>>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>>>> alignment produced in different versions and thus affect the value
>>>> returned by such methods. Therefore, I think these methods should only
>>>> be tested from alignments loaded directly from t/data.
>>>>       
>>> Did you discover some specific problem cases?
>>>     
>> My messages seem to be taking a while to come through, but, yes. It may
>> be due to the software changing default parameters, but it makes testing
>> the output for specific details pretty difficult and inconsistent. For
>> example, running T-Coffee, the following command from t/TCoffee.t
>> results in slightly different alignment:
>> $aln = $factory->run('-type' => 'profile',
>>                      '-profile' => $aln1,
>>                      '-seq'  =>
>> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>>
>> Of particular note, is the gaps on the last line of the sequences. In
>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
>> <v4.45 this is ('gkn----mcg').
>>   
> I'm not a T-coffee user but usually you can come across
> these problems when you use different scoring parameters
> when align sequences.
>
> Could it be possible that they have simply changed the
> default parameters for gap penalties and that kind of
> stuff? It is possible to set them?
>
> If so you can just run the test by defining
> the scores in the param hash without using the default.
>
> HTH
>
> Remo
That is true, but it depends on whether the wrapper is complete
enough to be able to set all the parameters provided by the software.

Nath


From n.haigh at sheffield.ac.uk  Thu Oct 26 12:13:03 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 17:13:03 +0100
Subject: [Bioperl-l] Bio::Restriction::Enzyme
Message-ID: <4540DE8F.7070501@sheffield.ac.uk>

I'm in the middle of writing some code that uses
Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
Bioperl from HEAD.

I seem to find that $enzyme->is_palindromic always seems to return true.
Can anyone verify this? If needs be, I can send some code.

Thanks
Nathan


From bosborne11 at verizon.net  Thu Oct 26 12:37:06 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 26 Oct 2006 12:37:06 -0400
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk>
Message-ID: <C1665C72.B068%bosborne11@verizon.net>

Nathan,

Perhaps because most restriction sites are palindromes. Anyway, I added
tests for palindromic() and is_palindromic() where the site is not a
palindrome, and these tests pass (t/RestrictionAnalysis.t).

Brian O.


On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:

> I'm in the middle of writing some code that uses
> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
> Bioperl from HEAD.
> 
> I seem to find that $enzyme->is_palindromic always seems to return true.
> Can anyone verify this? If needs be, I can send some code.
> 
> Thanks
> Nathan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From n.haigh at sheffield.ac.uk  Thu Oct 26 12:49:48 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 17:49:48 +0100
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <C1665C72.B068%bosborne11@verizon.net>
References: <C1665C72.B068%bosborne11@verizon.net>
Message-ID: <4540E72C.5020800@sheffield.ac.uk>

Brian Osborne wrote:
> Nathan,
>
> Perhaps because most restriction sites are palindromes. Anyway, I added
> tests for palindromic() and is_palindromic() where the site is not a
> palindrome, these tests pass (t/RestrictionAnalyis.t).
>
> Brian O.
>
>
> On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:
>
>   
>> I'm in the middle of writing some code that uses
>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>> Bioperl from HEAD.
>>
>> I seem to find that $enzyme->is_palindromic always seems to return true.
>> Can anyone verify this? If needs be, I can send some code.
>>
>> Thanks
>> Nathan
>>     
>
>
>   
Ok, thanks - nice to know :-)


From cjfields at uiuc.edu  Thu Oct 26 12:58:34 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 26 Oct 2006 11:58:34 -0500
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk>
Message-ID: <001301c6f91f$f9611770$15327e82@pyrimidine>

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Nathan Haigh
> Sent: Thursday, October 26, 2006 11:13 AM
> To: Bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Bio::Restriction::Enzyme
> 
> I'm in the middle of writing some code that uses
> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
> Bioperl from HEAD.
> 
> I seem to find that $enzyme->is_palindromic always seems to return true.
> Can anyone verify this? If needs be, I can send some code.
> 
> Thanks
> Nathan

You should file a bug report if you have found a test case where this method
isn't working as it should, especially if Brian's tests pass and you're
still getting the wrong results.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From jason at bioperl.org  Thu Oct 26 12:57:32 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 26 Oct 2006 09:57:32 -0700
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408F13.9020209@sheffield.ac.uk>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
	<45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it>
	<45408F13.9020209@sheffield.ac.uk>
Message-ID: <C2AC4DE8-7E99-4744-9FA9-B11C51788BDE@bioperl.org>

Nathan -

I agree - unfortunately the values tend to change with different
versions of the applications.  It would make sense to just test that
you get out sequences that are in valid alignment format and perhaps
that you end up with as many sequences as you started with.  The more
restrictive tests probably aren't reliable when mixing and matching
versions.

One thing we do for PAML is condition tests on the version used - but  
of course when a new version comes out we have to add more stuff to  
the tests (or just have some code that skips those tests).
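
A rough sketch of what version-conditioned tests could look like with
Test::More's SKIP blocks. The version string, the reference value and
run_analysis() are all made up for illustration; this is not the actual
PAML test code.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical: a real test would parse this from the program's output.
my $version = '3.15';

# Reference values keyed by program version (number invented for illustration).
my %expected_lnL = ( '3.14' => -1234.5 );

# Placeholder standing in for actually running the external program.
sub run_analysis { return -1234.5 }

ok(defined $version, 'program version detected');

SKIP: {
    # Skip both value tests when we have no reference data for this version.
    skip "no reference values for version $version", 2
        unless exists $expected_lnL{$version};

    my $lnL = run_analysis();
    like($lnL, qr{^-?\d+(?:\.\d+)?$}, 'lnL is a number');
    cmp_ok(abs($lnL - $expected_lnL{$version}), '<', 0.01,
           'lnL matches reference value');
}
```

With an unknown version the two value tests are reported as skipped rather
than failed, so a new release of the program does not break the test suite.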

-jason
On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote:

> Remo Sanges wrote:
>> Nathan Haigh wrote:
>>> Sendu Bala wrote:
>>>
>>>> Nathan Haigh wrote:
>>>>
>>>>> I'm thinking that it's not wise to test for things like
>>>>> overall_percentage_identity etc in alignments that are  
>>>>> generated by
>>>>> external software like T-Coffee, Clustalw etc. Changes to software
>>>>> algorithms/efficiency, bug fixes etc may well alter the quality  
>>>>> of the
>>>>> alignment produced in different versions and thus affect the value
>>>>> returned by such methods. Therefore, I think these methods  
>>>>> should only
>>>>> be tested from alignments loaded directly from t/data.
>>>>>
>>>> Did you discover some specific problem cases?
>>>>
>>> My messages seem to be taking a while to come through, but, yes.  
>>> It may
>>> be due to the software changing default parameters, but it makes  
>>> testing
>>> the output for specific details pretty difficult and  
>>> inconsistent. For
>>> example, running T-Coffee, the following command from t/TCoffee.t
>>> results in slightly different alignment:
>>> $aln = $factory->run('-type' => 'profile',
>>>                      '-profile' => $aln1,
>>>                      '-seq'  =>
>>> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>>>
>>> Of particular note, is the gaps on the last line of the  
>>> sequences. In
>>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
>>> <v4.45 this is ('gkn----mcg').
>>>
>> I'm not a T-coffee user but usually you can come across
>> these problems when you use different scoring parameters
>> when align sequences.
>>
>> Could it be possible that they have simply changed the
>> default parameters for gap penalties and that kind of
>> stuff? It is possible to set them?
>>
>> If so you can just run the test by defining
>> the scores in the param hash without using the default.
>>
>> HTH
>>
>> Remo
> That is true, but it depends on the whether the wrapper is complete
> enough to be able to set all the parameters provided by the software.
>
> Nath

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From cjfields at uiuc.edu  Thu Oct 26 18:01:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 26 Oct 2006 17:01:08 -0500
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <C2AC4DE8-7E99-4744-9FA9-B11C51788BDE@bioperl.org>
Message-ID: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>

I have been running into similar issues with EUtilities tests.  Since the
data on the server is constantly updated I have to try and future-proof the
tests so they don't constantly fail.  

I have been using Test::More and like/unlike or cmp_ok to get around some of
those 'fuzzy data' issues.  If some methods consistently return a particular
type of value, such as an integer, you could use:

like($foo->get_value, qr{^\d+$}, 'value test'); #integer

or similar.
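
A slightly fuller sketch of the same idea, checking the shape of a value
instead of its exact content. My::FakeResult and its methods are
hypothetical stand-ins for an object filled from a remote server; only
like(), unlike() and cmp_ok() are real Test::More functions.

```perl
use strict;
use warnings;
use Test::More tests => 3;

{
    # Hypothetical stand-in for a result object populated from a remote server.
    package My::FakeResult;
    sub new {
        my $class = shift;
        return bless { count => 1532, db => 'pubmed' }, $class;
    }
    sub get_count { return $_[0]{count} }
    sub get_db    { return $_[0]{db} }
}

my $foo = My::FakeResult->new;

# The exact count changes as the database grows, so only test its form.
like($foo->get_count, qr{^\d+$}, 'count is an integer');
cmp_ok($foo->get_count, '>', 0, 'count is positive');
unlike($foo->get_db, qr{\s}, 'database name contains no whitespace');
```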

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> Nathan -
> 
> I agree - the values tend to change with different versions of the
> applications unfortunately.  It would make sense to just test that
> you get out sequences that are in valid alignment format and perhaps
> have as many ending sequences as you started with.   The more
> restrictive tests probably aren't reliable with mixing and matching
> versions.
> 
> One thing we do for PAML is condition tests on the version used - but
> of course when a new version comes out we have to add more stuff to
> the tests (or just have some code that skips those tests).
> 
> -jason
> On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote:
> 
> > Remo Sanges wrote:
> >> Nathan Haigh wrote:
> >>> Sendu Bala wrote:
> >>>
> >>>> Nathan Haigh wrote:
> >>>>
> >>>>> I'm thinking that it's not wise to test for things like
> >>>>> overall_percentage_identity etc in alignments that are
> >>>>> generated by
> >>>>> external software like T-Coffee, Clustalw etc. Changes to software
> >>>>> algorithms/efficiency, bug fixes etc may well alter the quality
> >>>>> of the
> >>>>> alignment produced in different versions and thus affect the value
> >>>>> returned by such methods. Therefore, I think these methods
> >>>>> should only
> >>>>> be tested from alignments loaded directly from t/data.
> >>>>>
> >>>> Did you discover some specific problem cases?
> >>>>
> >>> My messages seem to be taking a while to come through, but, yes.
> >>> It may
> >>> be due to the software changing default parameters, but it makes
> >>> testing
> >>> the output for specific details pretty difficult and
> >>> inconsistent. For
> >>> example, running T-Coffee, the following command from t/TCoffee.t
> >>> results in slightly different alignment:
> >>> $aln = $factory->run('-type' => 'profile',
> >>>                      '-profile' => $aln1,
> >>>                      '-seq'  =>
> >>> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
> >>>
> >>> Of particular note, is the gaps on the last line of the
> >>> sequences. In
> >>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
> >>> <v4.45 this is ('gkn----mcg').
> >>>
> >> I'm not a T-coffee user but usually you can come across
> >> these problems when you use different scoring parameters
> >> when align sequences.
> >>
> >> Could it be possible that they have simply changed the
> >> default parameters for gap penalties and that kind of
> >> stuff? It is possible to set them?
> >>
> >> If so you can just run the test by defining
> >> the scores in the param hash without using the default.
> >>
> >> HTH
> >>
> >> Remo
> > That is true, but it depends on the whether the wrapper is complete
> > enough to be able to set all the parameters provided by the software.
> >
> > Nath
> 
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
> 
> 


From gbazykin at Princeton.EDU  Thu Oct 26 18:49:56 2006
From: gbazykin at Princeton.EDU (Georgii A Bazykin)
Date: Thu, 26 Oct 2006 18:49:56 -0400
Subject: [Bioperl-l] about PAML running within bioperl
In-Reply-To: <001901c6dbcf$9af4de50$0915020a@zchou>
References: <001901c6dbcf$9af4de50$0915020a@zchou>
Message-ID: <185431468.20061026184956@princeton.edu>

I just had the exact same problem, which (as in Caleb Davis's
case) was solved by switching from PAML 3.15 to 3.14.


------------------------------
Tuesday, September 19, 2006, 5:40:07 AM, you wrote:

> Hello, everyone,

> I used the code in the PAML HOWTO (running PAML from within Bioperl) on
> my Linux OS, and I set ENV as described in the instructions. At the
> beginning, ClustalW seems to run smoothly. However, when the
> programme calls the method "get_MLmatrix", something goes wrong. The
> following information was printed (what is the reason, and how can I solve this problem?):
> ........
> Sequences (2:3) Aligned. Score:  87
> Sequences (2:4) Aligned. Score:  88
> Sequences (2:5) Aligned. Score:  87
> Sequences (2:6) Aligned. Score:  87
> Sequences (2:7) Aligned. Score:  87
> Sequences (2:8) Aligned. Score:  87
> Sequences (3:4) Aligned. Score:  93
> Sequences (3:5) Aligned. Score:  93
> Sequences (3:6) Aligned. Score:  93
> Sequences (3:7) Aligned. Score:  92
> Sequences (3:8) Aligned. Score:  92
> Sequences (4:5) Aligned. Score:  99
> Sequences (4:6) Aligned. Score:  99
> Sequences (4:7) Aligned. Score:  98
> Sequences (4:8) Aligned. Score:  98
> Sequences (5:6) Aligned. Score:  100
> Sequences (5:7) Aligned. Score:  99
> Sequences (5:8) Aligned. Score:  99
> Sequences (6:7) Aligned. Score:  99
> Sequences (6:8) Aligned. Score:  99
> Sequences (7:8) Aligned. Score:  100
> Guide tree        file created:  
> [/home/zchou/TMPDIR/8QEqLivAKY/JU833u8OTP.dnd]
> Start of Multiple Alignment
> There are 7 groups
> Aligning...
> Group 1: Sequences:   2      Score:5875
> Group 2: Sequences:   2      Score:5877
> Group 3: Sequences:   4      Score:5864
> Group 4: Sequences:   5      Score:5537
> Group 5: Sequences:   6      Score:5727
> Group 6: Sequences:   7      Score:5608
> Group 7: Sequences:   8      Score:5607
> Alignment Score 43650
> GCG-Alignment file created     
> [/home/zchou/TMPDIR/8QEqLivAKY/CussPD56rZ]
> aligned aa sequences were: Bio::SimpleAlign=HASH(0x87b93f4)
> Can't call method "get_MLmatrix" on an undefined value at
> originalpaml.pl line 57, <GEN2> line 332.


> Zhuocheng Hou
> Department of Animal Genetics and Breeding
> China Agricultural University


From himanshu.ardawatia at bccs.uib.no  Thu Oct 26 21:54:36 2006
From: himanshu.ardawatia at bccs.uib.no (Himanshu Ardawatia)
Date: Fri, 27 Oct 2006 03:54:36 +0200
Subject: [Bioperl-l] Query on tree bootstrap values
Message-ID: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>

Hi,

2 questions :

1. I have a phylogenetic tree and I wish to set (or modify or query)
bootstrap values for all internal nodes. How do I do that using BioPerl?

2. I tried the general-purpose example script attached below on the
example Newick tree with bootstrap values (also attached below), and it gives
strange results even for branch length. It shows the parent ID as 0.71, which
is actually the bootstrap value for the last ancestral node of human and
chimp, and it shows the child node ID as 'Human'! Am I missing something in
the tree formatting? Results are also attached below. Also, how do I extract,
modify or add bootstrap values in this tree?

Thanks
Himanshu

EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
#################################
(
  ('Chimp'  : 0.052,
   'Human'  : 0.042) 0.71 : 0.007,
  'Gorilla'  : 0.060,
  ('Gibbon'  : 0.124,
   'Orangutan'  : 0.0971) 1 : 0.038
);
#################################

EXAMPLE SCRIPT:

#################################
#!/usr/bin/perl -w

use Bio::Seq;
# use Bio::TreeIO;
use Bio::Tree::TreeI;

# get a Tree::NodeI somehow
    # like from a TreeIO
    use Bio::TreeIO;
    # read in a clustalw NJ in phylip/newick format
    my $treeio = Bio::TreeIO->new(-format => 'newick',
                                  -file   => 'example_newick_tree.newick');

    # we'll assume it worked for demo purposes;
    # you might want to test that it was defined
    my $tree = $treeio->next_tree;

    my $rootnode = $tree->get_root_node;

    # process just the next generation
    foreach my $node ( $rootnode->each_Descendent() ) {
        print "branch len is ", $node->branch_length, "\n";
    }

    # process all the children
    my $example_leaf_node;
    foreach my $node ( $rootnode->get_Descendents() ) {
        if( $node->is_Leaf ) {
            print "node is a leaf ... ";
            # for example use below
            $example_leaf_node = $node unless defined $example_leaf_node;
        }
        print "branch len is ", $node->branch_length, "\n";
    }

    # The ancestor() method points to the parent of a node
    # A node can only have one parent

    my $parent = $example_leaf_node->ancestor;

    # parent likely won't have a description because it is an internal node
    # but child will because it is a leaf

    print "Parent id: ", $parent->id," child id: ",
          $example_leaf_node->id, "\n";

##########################################

RESULTS:
branch len is  0.007
branch len is  0.060
branch len is  0.038
node is a leaf ... branch len is  0.042
node is a leaf ... branch len is  0.052
branch len is  0.007
node is a leaf ... branch len is  0.060
node is a leaf ... branch len is  0.0971
node is a leaf ... branch len is  0.124
branch len is  0.038
Parent id: _0.71_ child id: ___'Human'__


From n.haigh at sheffield.ac.uk  Fri Oct 27 04:42:23 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 08:42:23 +0000
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <C1665C72.B068%bosborne11@verizon.net>
References: <C1665C72.B068%bosborne11@verizon.net>
Message-ID: <4541C66F.1020404@sheffield.ac.uk>

Hi Brian,

I wonder if I'm using is_prototype() correctly, as I don't seem to get
any returning true:

use Bio::Restriction::EnzymeCollection;
my $enz_coll = Bio::Restriction::EnzymeCollection->new();
my $prototype = 0;
foreach my $enz ($enz_coll->each_enzyme) {
    $prototype++ if $enz->is_prototype;
}
print "$prototype have unique recognition sites\n";

prints:
0 have unique recognition sites

Thanks
Nath

Brian Osborne wrote:
> Nathan,
>
> Perhaps because most restriction sites are palindromes. Anyway, I added
> tests for palindromic() and is_palindromic() where the site is not a
> palindrome, these tests pass (t/RestrictionAnalyis.t).
>
> Brian O.
>
>
> On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:
>
>   
>> I'm in the middle of writing some code that uses
>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>> Bioperl from HEAD.
>>
>> I seem to find that $enzyme->is_palindromic always seems to return true.
>> Can anyone verify this? If needs be, I can send some code.
>>
>> Thanks
>> Nathan
>>     
>
>
>   


-- 
> A: Yes.
>> Q: Are you sure?
>>     
>>> A: Because it reverses the logical flow of conversation.
>>>       
>>>> Q: Why is top posting frowned upon?
>>>>         


From n.haigh at sheffield.ac.uk  Fri Oct 27 04:47:21 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 08:47:21 +0000
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <001301c6f91f$f9611770$15327e82@pyrimidine>
References: <001301c6f91f$f9611770$15327e82@pyrimidine>
Message-ID: <4541C799.4090507@sheffield.ac.uk>

Chris Fields wrote:
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of Nathan Haigh
>> Sent: Thursday, October 26, 2006 11:13 AM
>> To: Bioperl-l at lists.open-bio.org
>> Subject: [Bioperl-l] Bio::Restriction::Enzyme
>>
>> I'm in the middle of writing some code that uses
>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>> Bioperl from HEAD.
>>
>> I seem to find that $enzyme->is_palindromic always seems to return true.
>> Can anyone verify this? If needs be, I can send some code.
>>
>> Thanks
>> Nathan
>>     
>
> You should file a bug report if you have found a test case where this method
> isn't working as it should, especially if Brian's tests pass and you're
> still getting the wrong results.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>
>   

I was doing some filtering of the default set of enzymes and happened to
remove the 2 that are not palindromic before I used is_palindromic().
Thus, I didn't see any that were not palindromic - if that makes sense!
Since I know very little about restriction enzymes, I'll trust that
these are correct :-)  and that I'm getting the correct results.

Thanks
Nath


From n.haigh at sheffield.ac.uk  Fri Oct 27 05:04:40 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 09:04:40 +0000
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>
References: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>
Message-ID: <4541CBA8.10006@sheffield.ac.uk>

Chris Fields wrote:
> I have been running into similar issues with EUtilities tests.  Since the
> data on the server is constantly updated I have to try an future-proof the
> tests so they don't constantly fail.  
>
> I have been using Test::More and like/unlike or cmp_ok to get around some of
> those 'fuzzy data' issues.  If some methods consistently return a particular
> type of value, such as an integer, you could use:
>
> like($foo->get_value, qr{^\d+$}, 'value test'); #integer
>
> or similar.
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>   
>> Nathan -
>>
>> I agree - the values tend to change with different versions of the
>> applications unfortunately.  It would make sense to just test that
>> you get out sequences that are in valid alignment format and perhaps
>> have as many ending sequences as you started with.   The more
>> restrictive tests probably aren't reliable with mixing and matching
>> versions.
>>
>> One thing we do for PAML is condition tests on the version used - but
>> of course when a new version comes out we have to add more stuff to
>> the tests (or just have some code that skips those tests).
>>
>> -jason
>> On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote:
>>
>>     
I think it makes sense to test that data of the expected type was
returned by the external resource but not to test the specifics of what
was returned. If specifics are tested we are then in the realm of testing
whether we believe the data returned by the external resource or not. We
should assume that the domain experts for these resources know what they
are doing - in some cases this might not be true :-)  but I think we
should stick to testing that the objects created hold the expected type
of data.

I like what Chris had to say (above) but wonder whether such checks
would/should be done in the module itself - i.e. testing that a
stored value is an integer and warn/throw if not?
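
To make the question concrete, here is a minimal sketch of a setter that
validates its argument. My::Result is a hypothetical class, not an existing
Bioperl module, and it uses Carp's croak() where a Bioperl object would
presumably call $self->throw().

```perl
use strict;
use warnings;
use Carp ();

{
    package My::Result;    # hypothetical class, not part of Bioperl

    sub new { my $class = shift; return bless {}, $class }

    # Combined getter/setter that only accepts non-negative integers.
    sub count {
        my ($self, $value) = @_;
        if (defined $value) {
            Carp::croak("count must be an integer, got '$value'")
                unless $value =~ /\A\d+\z/;
            $self->{count} = $value;
        }
        return $self->{count};
    }
}

my $result = My::Result->new;
$result->count(42);
print "count is ", $result->count, "\n";

# A malformed value is rejected and the stored value is left untouched.
my $accepted = eval { $result->count('4x2'); 1 };
print "bad value rejected: $@" unless $accepted;
```

The test script could then concentrate on whether the object was built at
all, since the module itself refuses data of the wrong type.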

Nath


From bix at sendu.me.uk  Fri Oct 27 05:08:18 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 10:08:18 +0100
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>
Message-ID: <4541CC82.2040705@sendu.me.uk>

Himanshu Ardawatia wrote:
> Hi,
> 
> 2 questions :
> 
> 1. I have a phylogenetic tree and I wish to set (or modify or query)
> bootstrap values for all internal nodes. How do I do that using BioPerl ?

Does bootstrap() not do what you need?


> 2. I tried the example script attached below for general purpose for the
> example newick tree with bootstrap values (also attached below) and It gives
> strange results even for branch length. It shows Parent ID as 0.71 which
> actually is the bootstrap value for the last ancestral node for human and
> chimp and It shows the Child node ID as 'Human' ! Am I missing something in
> the tree formatting ? Results also attached below. Also how to extract /
> modify/ add bootstrap values in this tree ?
[snip]
> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
> #################################
> (
>   ('Chimp'  : 0.052,
>    'Human'  : 0.042) 0.71 : 0.007,
>   'Gorilla'  : 0.060,
>   ('Gibbon'  : 0.124,
>    'Orangutan'  : 0.0971) 1 : 0.038
> );
> #################################

Are you sure this is in the correct format?

For example, with the tree:
( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, 
'Gorilla':0.060, 
('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038);

and your script (with a print "--\n" between the two printing loops for 
clarity) I get...

> ##########################################
> 
> RESULTS:
> branch len is  0.007
> branch len is  0.060
> branch len is  0.038
> node is a leaf ... branch len is  0.042
> node is a leaf ... branch len is  0.052
> branch len is  0.007
> node is a leaf ... branch len is  0.060
> node is a leaf ... branch len is  0.0971
> node is a leaf ... branch len is  0.124
> branch len is  0.038
> Parent id: _0.71_ child id: ___'Human'__

...

branch len is 0.007
branch len is 0.060
branch len is 0.038
--
branch len is 0.007
node is a leaf ... branch len is 0.052
node is a leaf ... branch len is 0.042
node is a leaf ... branch len is 0.060
branch len is 0.038
node is a leaf ... branch len is 0.124
node is a leaf ... branch len is 0.0971
Parent id: 'Human_Chimp_Ancestor' child id: 'Chimp'

This seems reasonable to me. What were you expecting?


From n.haigh at sheffield.ac.uk  Fri Oct 27 07:36:10 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 11:36:10 +0000
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541CC82.2040705@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>
	<4541CC82.2040705@sendu.me.uk>
Message-ID: <4541EF2A.4050600@sheffield.ac.uk>

Sendu Bala wrote:
> Himanshu Ardawatia wrote:
>   
>> Hi,
>>
>> 2 questions :
>>
>> 1. I have a phylogenetic tree and I wish to set (or modify or query)
>> bootstrap values for all internal nodes. How do I do that using BioPerl ?
>>     
>
> Does bootstrap() not do what you need?
>
>
>   
>> 2. I tried the example script attached below for general purpose for the
>> example newick tree with bootstrap values (also attached below) and It gives
>> strange results even for branch length. It shows Parent ID as 0.71 which
>> actually is the bootstrap value for the last ancestral node for human and
>> chimp and It shows the Child node ID as 'Human' ! Am I missing something in
>> the tree formatting ? Results also attached below. Also how to extract /
>> modify/ add bootstrap values in this tree ?
>>     
> [snip]
>   
>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>> #################################
>> (
>>   ('Chimp'  : 0.052,
>>    'Human'  : 0.042) 0.71 : 0.007,
>>   'Gorilla'  : 0.060,
>>   ('Gibbon'  : 0.124,
>>    'Orangutan'  : 0.0971) 1 : 0.038
>> );
>> #################################
>>     
>
> Are you sure this is in the correct format?
>   

He/she may have a tree that already contains bootstrap values output
from another program. If this is so, which program did you use? Without
reminding myself of the formats, you should look up newick format and
whether it is possible to store bootstraps in it. In addition you should
also look up the nhx format.

> For example, with the tree:
> ( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, 
> 'Gorilla':0.060, 
> ('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038);
>
>   

This tree does not contain any bootstrap values - only branch lengths.

Sorry I can't be much more help at the moment - if I get a spare 10 mins
I'll have a closer look.
Nath


From bix at sendu.me.uk  Fri Oct 27 07:16:08 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 12:16:08 +0100
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EF2A.4050600@sheffield.ac.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>	<4541CC82.2040705@sendu.me.uk>
	<4541EF2A.4050600@sheffield.ac.uk>
Message-ID: <4541EA78.3050404@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Himanshu Ardawatia wrote:
>>>
>>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>>> #################################
>>> (
>>>   ('Chimp'  : 0.052,
>>>    'Human'  : 0.042) 0.71 : 0.007,
>>>   'Gorilla'  : 0.060,
>>>   ('Gibbon'  : 0.124,
>>>    'Orangutan'  : 0.0971) 1 : 0.038
>>> );
>>> #################################
>>>     
>> Are you sure this is in the correct format?
>>   
> 
> He/she may have a tree that already contains bootstrap values output
> from another program. If this is so, which program did you use? Without
> reminding myself of the formats, you should lookup newick format and
> whther it is possible to store bootstraps in it. In addition you should
> also look up the nhx format.

Ah, well from a brief google it seems that some software does store
bootstrap values for internal nodes as the node ids when outputting in
Newick format. I don't think Bioperl should be able to tell the
difference between a normal id and a bootstrap value, so you'll have to
detect that yourself and manually use bootstrap() when you get an id
that looks like a number.

Or should Bioperl be making this assumption for you? Is that a safe 
thing to do? Maybe as an option only?
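
A minimal sketch of the "id that looks like a number" check, assuming
support values are non-negative and on either a 0-1 or a 0-100 scale. The
helper name id_looks_like_bootstrap is hypothetical and not part of
Bioperl; only Scalar::Util from core Perl is used.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Return 1 when an internal-node id is plausibly a bootstrap value.
# Assumption: support values are numbers between 0 and 100 inclusive.
sub id_looks_like_bootstrap {
    my ($id) = @_;
    return 0 unless defined $id && length $id;
    return 0 unless looks_like_number($id);
    return ($id >= 0 && $id <= 100) ? 1 : 0;
}

# With a Bio::Tree::Tree object in $tree (not constructed here), the check
# could be applied to each internal node roughly like this:
#
#   for my $node (grep { !$_->is_Leaf } $tree->get_nodes) {
#       next unless id_looks_like_bootstrap($node->id);
#       $node->bootstrap($node->id);
#   }
```

A purely numeric taxon or ancestor label would be misread by this check,
which is one argument for keeping it an explicit, opt-in step.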


From n.haigh at sheffield.ac.uk  Fri Oct 27 08:24:49 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 12:24:49 +0000
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EA78.3050404@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>	<4541CC82.2040705@sendu.me.uk>
	<4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk>
Message-ID: <4541FA91.3040505@sheffield.ac.uk>

--snip--
>
> Ah, well from a brief google it seemed like some software do store
> boostrap values for internal nodes as the node ids when outputting in
> Newick format. I don't think Bioperl should be able to tell the
> difference between a normal id and a bootstrap value, so you'll have
> to detect that yourself and manually use bootstrap() when you get an
> id that looks like a number.

If I remember rightly, in programs like Clustal you can specify where
bootstrap values are stored - node or branch. I can't remember which is
the default, but TreeView can only see bootstraps if they are stored
using the "non-default" setting. This "could" be the same issue here.

>
> Or should Bioperl be making this assumption for you? Is that a safe
> thing to do? Maybe as an option only?
I don't know without a closer look - I'd also need to check the Newick
format definition to see whether this is an "extension" to the format or
whether something is just flouting the Newick rules.

Nath


From n.haigh at sheffield.ac.uk  Fri Oct 27 08:59:51 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 12:59:51 +0000
Subject: [Bioperl-l] Caching sequences
Message-ID: <454202C7.1040701@sheffield.ac.uk>

I have a script that is capable of downloading sequences from GenBank
based on GI numbers. I retrieve them in FASTA format in order to save
bandwidth, but I'd like to take this one step further and cache the
sequences in case the user wants to rerun the script using some of the
GIs they used previously.

Does anyone have any guidance on how best to do this?

Cheers
Nath


From bix at sendu.me.uk  Fri Oct 27 08:35:13 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 13:35:13 +0100
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <454202C7.1040701@sheffield.ac.uk>
References: <454202C7.1040701@sheffield.ac.uk>
Message-ID: <4541FD01.6090803@sendu.me.uk>

Nathan S. Haigh wrote:
> I have a script that is capable of downloading sequences from GenBank
> based on GI numbers. I retrieve them if fasta format in order to save
> bandwidth, but I'd like to take this one step further and cache the
> sequences in case the user want to rerun the script using some of the
> GI's they used previously.
> 
> Does anyone have any guidance on how best to do this?

You'd probably write the sequences out in some suitable format and 
access them via Bio::Index

Or, I'm sure bioperl-db excels at this kind of thing, but is a little 
more involved if this is only a simple situation.


From bosborne11 at verizon.net  Fri Oct 27 09:09:30 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 27 Oct 2006 09:09:30 -0400
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <4541C66F.1020404@sheffield.ac.uk>
Message-ID: <C1677D4A.B0AF%bosborne11@verizon.net>

Nathan,

I don't know how this is supposed to work, there would be different ways to
make is_prototype true. One way would be to make the enzyme with the first
occurrence of a given restriction site the prototype (and the next enzymes
with the same site are isoschizomers). Or, one could wait until one site had
appeared twice, with 2 different enzymes, then make the first the prototype,
etc. I would have done it the first way myself but I took a quick look at
IO/withrefm.pm and it looks like it's doing it the second way. That means
one can read an enzyme file and end up with no duplicated restriction sites,
or prototypes and isoschizomers.
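
A minimal sketch of the two strategies described above, using plain
[name, site] pairs rather than Bio::Restriction::Enzyme objects. Both
function names are hypothetical, and this is not the code in
IO/withrefm.pm; it only illustrates how the two rules differ.

```perl
use strict;
use warnings;

# Each enzyme is [name, recognition_site]; list order is file order.

# Strategy 1: the first enzyme seen for a site is its prototype.
sub prototypes_first_seen {
    my @enzymes = @_;
    my (%seen, @prototypes);
    for my $enz (@enzymes) {
        my ($name, $site) = @$enz;
        push @prototypes, $name unless $seen{$site}++;
    }
    return @prototypes;
}

# Strategy 2: a site only gets a prototype once a second, different
# enzyme with the same site turns up; the first one is then promoted.
sub prototypes_on_second_hit {
    my @enzymes = @_;
    my (%first_for, %promoted, @prototypes);
    for my $enz (@enzymes) {
        my ($name, $site) = @$enz;
        if (!exists $first_for{$site}) {
            $first_for{$site} = $name;
        }
        elsif ($name ne $first_for{$site} && !$promoted{$site}++) {
            push @prototypes, $first_for{$site};
        }
    }
    return @prototypes;
}
```

Under the second rule an enzyme whose site occurs only once in the file
is never marked as a prototype, so a collection without repeated sites
yields no prototypes at all.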

Brian O.


On 10/27/06 4:42 AM, "Nathan S. Haigh" <n.haigh at sheffield.ac.uk> wrote:

> Hi Brian,
> 
> I wonder if i'm using is_prototype() correctly as I don't seem to get
> any returning true:
> 
> my $enz_coll = Bio::Restriction::EnzymeCollection->new();
> my $prototype = 0;
> foreach my $enz ($enz_coll->each_enzyme) {
>     $prototype++ if $enz->is_prototype;
> }
> print "$prototype have unique recognition sites\n";
> 
> prints:
> 0 have unique recognition sites
> 
> Thanks
> Nath
> 
> Brian Osborne wrote:
>> Nathan,
>> 
>> Perhaps because most restriction sites are palindromes. Anyway, I added
>> tests for palindromic() and is_palindromic() where the site is not a
>> palindrome, these tests pass (t/RestrictionAnalyis.t).
>> 
>> Brian O.
>> 
>> 
>> On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:
>> 
>>   
>>> I'm in the middle of writing some code that uses
>>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>>> Bioperl from HEAD.
>>> 
>>> I seem to find that $enzyme->is_palindromic always seems to return true.
>>> Can anyone verify this? If needs be, I can send some code.
>>> 
>>> Thanks
>>> Nathan
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>     
>> 
>> 
>>   
> 


From n.haigh at sheffield.ac.uk  Fri Oct 27 10:19:02 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 14:19:02 +0000
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <C1677D4A.B0AF%bosborne11@verizon.net>
References: <C1677D4A.B0AF%bosborne11@verizon.net>
Message-ID: <45421556.9060300@sheffield.ac.uk>

Brian Osborne wrote:
> Nathan,
>
> I don't know how this is supposed to work, there would be different ways to
> make is_prototype true. One way would be to make the enzyme with the first
> occurrence of a given restriction site the prototype (and the next enzymes
> with the same site are isoschizomers). Or, one could wait until one site had
> appeared twice, with 2 different enzymes, then make the first the prototype,
> etc. I would have done it the first way myself but I took a quick look at
> IO/withrefm.pm and it looks like it's doing it the second way. That means
> one can read an enzyme file and end up with no duplicated restriction sites,
> or prototypes and isoschizomers.
>
> Brian O.
>
>   
Hmm, I'd have done it the first way also. Doing it the second way would
mean you only ended up with something as a prototype if there were
multiple enzymes with the same restriction site - is that correct
biologically?

Nath


From n.haigh at sheffield.ac.uk  Fri Oct 27 10:23:20 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 14:23:20 +0000
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
Message-ID: <45421658.5000103@sheffield.ac.uk>

As you may be aware by now, I'm working with Bio::Restriction::Analysis
and friends.

I'm doing restriction analysis on large sequences - chromosomes. I need
to identify an appropriate enzyme based on the total length of fragments
that are of a certain size (e.g. 100 - 500 bp). However, the amount of
memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I
have the following code (bottom) which downloads 2 thaliana chromosomes
(mito and chloro - so pretty small), runs an analysis and then loops
through the fragments for all enzymes in the default collection.

My memory usage just keeps on climbing and none seems to get freed up
even when a $ra goes out of scope (when I start dealing with the next
sequence). Is this a memory leak of some sort? Is there a way to free up
memory as I go? I'd appreciate any help/advice on how to reduce the
amount of memory being consumed as I'd like to use all the thaliana
chromosomes (not just mito and chloro), which at the moment probably
won't work.

Cheers
Nath

use strict;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012,  26556996 );

my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
  print "Getting GI: $gi\n";
  push @seq_objs, $db->get_Seq_by_id($gi)
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
  print "Processing ", $seq->primary_id, "\n";
  my $ra = Bio::Restriction::Analysis->new(
                                         -seq=>$seq,
                                         -enzymes=>$enz_Coll,
  );

  my @all_enzymes = $ra->cutters->each_enzyme;
  print "  Calc total length of fragments in range: $min_fragment_size - $max_fragment_size\n";
  foreach my $enzyme ( @all_enzymes ) {
    # total is per enzyme, so reset it for each one
    my $tot_size = 0;
    # fragments() is a real memory hog
    foreach my $frag ($ra->fragments($enzyme)) {
      next if $min_fragment_size && (length($frag) < $min_fragment_size);
      next if $max_fragment_size && (length($frag) > $max_fragment_size);
      $tot_size += length($frag);
    }
    # do something based on value of $tot_size
    #print "    ", $enzyme->name, " total = $tot_size\n";
  }
  print "DONE\n";
}
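
Since only fragment sizes are needed, not the fragment strings
themselves, one possible way around the memory cost is to work from cut
positions. A minimal sketch, assuming a linear molecule and 1-based cut
positions where a cut at position p falls after base p; total_in_range is
a hypothetical helper. Bio::Restriction::Analysis documents a positions()
method that could supply the cut list, but whether calling it avoids the
memory growth described above is untested here.

```perl
use strict;
use warnings;

# Sum the lengths of fragments whose size lies in [$min, $max], given only
# the sequence length and the cut positions. No fragment strings are built.
sub total_in_range {
    my ($seq_len, $min, $max, @cuts) = @_;
    my $total = 0;
    my $prev  = 0;
    for my $cut ((sort { $a <=> $b } @cuts), $seq_len) {
        my $size = $cut - $prev;
        $prev = $cut;
        next if $size <= 0;                     # duplicate cut, or cut at the very end
        next if $size < $min || $size > $max;   # outside the wanted size window
        $total += $size;
    }
    return $total;
}
```

For a circular molecule such as a chloroplast genome the first and last
fragments would need to be joined before filtering, which this sketch
does not do.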


From avilella at gmail.com  Fri Oct 27 09:39:41 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:39:41 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
In-Reply-To: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
References: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
Message-ID: <358f4d650610270639q14870a6erae2e3c4e9063105d@mail.gmail.com>

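Replying to myself: I think I found a way:

my $tree = $treeio->next_tree;

# Sum all defined branch lengths (the root usually has none).
my $total_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless defined $branch_length;
    $total_branch_length += $branch_length;
}
die "Tree has no branch lengths to scale\n" unless $total_branch_length > 0;

# Divide each branch length by the total so that they sum to 1.
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless defined $branch_length;
    $node->branch_length($branch_length / $total_branch_length);
}

# Sanity check: this should now be 1, up to floating-point rounding.
my $new_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    $new_branch_length += $branch_length if defined $branch_length;
}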

On 10/27/06, Albert Vilella <avilella at gmail.com> wrote:
> Hi all,
>
> I am in need of a method that would scale the different branch lengths
> of a tree so that after the scaling they all sum up to exactly 1.
>
> Any pointers? Has anyone done that before?
>
> Thanks in advance,
>
>     Albert.
>


From cjfields at uiuc.edu  Fri Oct 27 10:35:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 09:35:35 -0500
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <4541CBA8.10006@sheffield.ac.uk>
Message-ID: <001501c6f9d5$2e33e120$15327e82@pyrimidine>

...
> I think it makes sense to test that data of the expected type was
> returned by the external resource but not to test the specifics of what
> was returned. If specifics are tested we are then in the realm of testing
> whether we believe the data returned by the external resource or not. We
> should assume that the domain experts for these resources know what they
> are doing - in some cases this might not be true :-)  but I think we
> should stick to testing that the objects created hold the expected type
> of data.
> 
> I like what Chris had to say (above) but wonder whether tests
> would/should be tested for in the module itself - i.e. testing that a
> stored value is an integer and warn/throw if not?
> 
> Nath

Yeah, sorry about the top post (stupid Outlook always sticks the sig at the
top of the page!).  

Testing in the module would be best but can be tricky for the very same
reasons that writing tests entail, even more so.  For instance, for NCBI
esummary data, I parse the data in a very generic way in order to have
access to as much data as possible.  

For tests, I have to assume that NCBI will always return a particular type
of value (string, integer, date).  I can test for each of those with a regex
in the module fairly simply and throw/warn, as you indicate.  However, if
they decide to add new data with a data tag other than the ones I test for
in the module (i.e. String, Integer, Date), I suddenly have warns/throws
showing up and cluttering/clobbering the code for perfectly valid data.  

However, if these are caught in tests and the tests fail, no big loss.  The
actual module still works, even if the tests are failing based on a new
unknown value being returned.  

For me, failed tests are sort of a warning light to let me know that
something has changed, but it doesn't necessarily mean a module doesn't
work.  I generally use throw/warn for something truly catastrophic, like no
response from the server or an error in the XML, which affects downstream
methods.  
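
A minimal sketch of that split, assuming the three tag names mentioned
above and a YYYY/MM/DD date layout (both are assumptions to check against
real esummary output). The check_item helper is hypothetical; it reports
a status instead of throwing, so a test script can fail on 'mismatch' or
'unknown-type' while the module itself keeps parsing.

```perl
use strict;
use warnings;

# Assumed tag names and patterns; not taken from NCBI documentation.
my %TYPE_RE = (
    Integer => qr/^-?\d+$/,
    Date    => qr{^\d{4}/\d{2}/\d{2}},
    String  => qr/^.*$/s,
);

# Returns 'ok', 'mismatch', or 'unknown-type' rather than warning or
# throwing, leaving the decision about severity to the caller.
sub check_item {
    my ($type, $value) = @_;
    my $re = $TYPE_RE{$type};
    return 'unknown-type' unless defined $re;
    return (defined $value && $value =~ $re) ? 'ok' : 'mismatch';
}
```

In a test file each result could be compared against 'ok' with
Test::More's is(), so a new tag from the server shows up as a failed test
rather than as warnings in user code.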

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Oct 27 11:09:36 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 10:09:36 -0500
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <454202C7.1040701@sheffield.ac.uk>
Message-ID: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>

> I have a script that is capable of downloading sequences from GenBank
> based on GI numbers. I retrieve them if fasta format in order to save
> bandwidth, but I'd like to take this one step further and cache the
> sequences in case the user want to rerun the script using some of the
> GI's they used previously.
> 
> Does anyone have any guidance on how best to do this?
> 
> Cheers
> Nath

There is Bio::DB::InMemoryCache, which is really an interface but appears to
have several methods defined; you could look for modules which implement it.
Sendu's suggestion of the Bio::Index modules and bioperl-db are also good
starting points.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Fri Oct 27 11:21:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 10:21:49 -0500
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <45421556.9060300@sheffield.ac.uk>
Message-ID: <001701c6f9db$9f90d160$15327e82@pyrimidine>

> Brian Osborne wrote:
> > Nathan,
> >
> > I don't know how this is supposed to work, there would be different ways
> to
> > make is_prototype true. One way would be to make the enzyme with the
> first
> > occurrence of a given restriction site the prototype (and the next
> enzymes
> > with the same site are isoschizomers). Or, one could wait until one site
> had
> > appeared twice, with 2 different enzymes, then make the first the
> prototype,
> > etc. I would have done it the first way myself but I took a quick look
> at
> > IO/withrefm.pm and it looks like it's doing it the second way. That
> means
> > one can read an enzyme file and end up with no duplicated restriction
> sites,
> > or prototypes and isoschizomers.
> >
> > Brian O.
> >
> >
> Hmm, I'd have done it the first way also. Doing it the second way would
> mean you only ended up with something as a prototype if there were
> multiple enzymes with the same restriction site - is that correct
> biologically?
> 
> Nath

I had a look at all the Restriction::IO modules a while back; most need
serious updating!  It just hasn't been a top priority unfortunately.

I think the prototype issue may depend on the IO format and whether or not
one is defined explicitly in the file being parsed or is just chosen based
on what Brian said (order in the file, similar cutting site).

By the strictest definition (and cheating by looking at the Fermentas web
site), the prototype is supposed to be the first enzyme discovered which
cleaves a unique sequence, so it may not be the first enzyme found in the
file.  Isoschizomers are those discovered to cleave the same sequence
subsequent to the prototype.  Neoschizomers cleave the same sequence as a
prototype but at a different site.

So this calls into question whether the prototype should be defined at all
unless it is specifically indicated in the file.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Fri Oct 27 12:47:53 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 16:47:53 +0000
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com>
References: <454202C7.1040701@sheffield.ac.uk>	
	<001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>
	<8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com>
Message-ID: <45423839.9040503@sheffield.ac.uk>

Jason Stajich wrote:
> Bio::DB::FileCache does one better and lets you cache the data in a
> persistent file.  Not sure this index is shareable among users though
> - bioperl-db is a better soln when that is desired.
Thanks I'll have a look into it. No need for being sharable among users
- not unless the script becomes heavily used.

Thanks
Nath


From cjfields at uiuc.edu  Fri Oct 27 12:15:00 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 11:15:00 -0500
Subject: [Bioperl-l] StandAloneFasta.t bioperl-run tests
Message-ID: <000101c6f9e3$0e5e95d0$15327e82@pyrimidine>

Nathan,

The test fails you posted on the wiki seem to indicate that using the
wrapper works but the order of the returned hits is off.  Does the order of
the returned hits match the actual FASTA report order?  If it does then the
tests need to be fixed in a way to make it more flexible, to account for
some data 'fuzziness' due to variations in output based on different
versions.  

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From jason at bioperl.org  Fri Oct 27 12:50:54 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 27 Oct 2006 09:50:54 -0700
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EA78.3050404@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>	<4541CC82.2040705@sendu.me.uk>
	<4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk>
Message-ID: <1230E110-01AB-4D4E-842F-20B939555299@bioperl.org>

I've answered to this effect multiple times in the past on the
mailing list.  Newick format does not distinguish between internal
ids and bootstrap values (or whatever else you want to attach
there).  Different programs have different conventions.  When both
values are present and encoded so that we can parse out the
bootstrap, like this: [BOOTSTRAP], the parser grabs it out.  If you
know all the internal ids are bootstraps you can just copy the values
over manually very simply:

# get all the internal nodes
for my $node ( grep { ! $_->is_Leaf } $tree->get_nodes ) {
  # copy id to bootstrap
  $node->bootstrap($node->id) if defined $node->id && length($node->id);
  $node->id(''); # set internal id to empty
}

If someone can make this clearer on a wiki page that would be great.

On Oct 27, 2006, at 4:16 AM, Sendu Bala wrote:

> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Himanshu Ardawatia wrote:
>>>>
>>>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>>>> #################################
>>>> (
>>>>   ('Chimp'  : 0.052,
>>>>    'Human'  : 0.042) 0.71 : 0.007,
>>>>   'Gorilla'  : 0.060,
>>>>   ('Gibbon'  : 0.124,
>>>>    'Orangutan'  : 0.0971) 1 : 0.038
>>>> );
>>>> #################################
>>>>
>>> Are you sure this is in the correct format?
>>>
>>
>> He/she may have a tree that already contains bootstrap values output
>> from another program. If this is so, which program did you use?  
>> Without
>> reminding myself of the formats, you should lookup newick format and
>> whther it is possible to store bootstraps in it. In addition you  
>> should
>> also look up the nhx format.
>
> Ah, well from a brief google it seemed like some software do store
> boostrap values for internal nodes as the node ids when outputting in
> Newick format. I don't think Bioperl should be able to tell the
> difference between a normal id and a bootstrap value, so you'll  
> have to
> detect that yourself and manually use bootstrap() when you get an id
> that looks like a number.
>
> Or should Bioperl be making this assumption for you? Is that a safe
> thing to do? Maybe as an option only?

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From avilella at gmail.com  Fri Oct 27 09:23:07 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:23:07 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
Message-ID: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>

Hi all,

I am in need of a method that would scale the different branch lengths
of a tree so that after the scaling they all sum up to exactly 1.

Any pointers? Has anyone done that before?

Thanks in advance,

    Albert.


From cjfields at uiuc.edu  Fri Oct 27 14:34:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 13:34:57 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
Message-ID: <000001c6f9f6$9ab12710$15327e82@pyrimidine>

I am working on refactoring the AlignIO::stockholm parser to get it reading
and writing Pfam/Rfam alignments, and noticed that many alignments have
EMBL-like annotations attached, which pertain to the entire alignment:

# STOCKHOLM 1.0
#=GF ID    ykkC-yxkD
#=GF AC    RF00442
#=GF DE    ykkC-yxkD element
#=GF AU    Moxon SJ
#=GF GA    20.0
#=GF NC    0.1
#=GF TC    59.4
#=GF SE    Barrick JE, Breaker RR
#=GF SS    Predicted; Barrick JE, Breaker RR
#=GF TP    Cis-reg; riboswitch;
#=GF BM    cmbuild CM SEED
#=GF BM    cmsearch -W 175 CM SEQDB
#=GF RN    [1]
#=GF RM    15096624
#=GF RT    New RNA motifs suggest an expanded scope for riboswitches in
#=GF RT    bacterial genetic control.
#=GF RA    Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee
#=GF RA    M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR;
#=GF RL    Proc Natl Acad Sci U S A 2004;101:6421-6426.
#=GF CC    This family represents the bacterial ykkC/yxkD element. The function of
#=GF CC    this family is unclear although it has been suggested that it may function
#=GF CC    to switch on efflux pumps and detoxification systems in response to harmful
#=GF CC    environmental molecules [1]. The Thermoanaerobacter tengcongensis sequence
#=GF CC    EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that the two
#=GF CC    riboswitches may work in conjunction to regulate the the upstream gene
#=GF CC    which codes for Swiss:Q8RC62, a member of Pfam:PF00860 (Personal obs. Moxon
#=GF CC    SJ).
#=GF SQ    16

SimpleAlign, as implemented, seemingly doesn't have a way to store this
information.

I'll work on getting the core alignment IO working, but would there be any
interest in having a way to store annotations in Bio::SimpleAlign?  I'm
guessing the methods would be similar to the various Bio::Seq Annotation
methods.
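
A minimal sketch of collecting the per-alignment "#=GF" lines shown above
into a simple structure, before deciding how SimpleAlign should hold
them. The parse_gf_lines helper is hypothetical and is not the
AlignIO::stockholm implementation; it assumes each annotation sits on one
line in the form "#=GF <tag> <text>".

```perl
use strict;
use warnings;

# Collect '#=GF <tag> <text>' lines into a hash of tag => [values, ...].
# Repeated tags (RT, RA, CC, BM ...) keep every line, in file order.
# Lines that are not #=GF annotations are skipped.
sub parse_gf_lines {
    my @lines = @_;
    my %annot;
    for my $line (@lines) {
        next unless $line =~ /^#=GF\s+(\S+)\s+(.*?)\s*$/;
        push @{ $annot{$1} }, $2;
    }
    return \%annot;
}
```

Multi-line fields such as CC can then be rebuilt with
join(' ', @{ $annot->{CC} }) before being turned into annotation objects.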

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From hlapp at gmx.net  Fri Oct 27 16:23:46 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 27 Oct 2006 16:23:46 -0400
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
Message-ID: <C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>

You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose  
this is what you meant by the 'various Bio::Seq Annotation methods'  
too.)

Just to make sure I'm not misunderstanding, I suppose the annotation  
pertains to the entire alignment?

	-hilmar

On Oct 27, 2006, at 2:34 PM, Chris Fields wrote:

> I am working an refactoring the AlignIO::stockholm parser to get it  
> reading
> and writing Pfam/Rfam alignments, and noticed that many alignments  
> have
> EMBL-like annotations attached, which pertain to the entire alignment:
>
> # STOCKHOLM 1.0
> #=GF ID    ykkC-yxkD
> #=GF AC    RF00442
> #=GF DE    ykkC-yxkD element
> #=GF AU    Moxon SJ
> #=GF GA    20.0
> #=GF NC    0.1
> #=GF TC    59.4
> #=GF SE    Barrick JE, Breaker RR
> #=GF SS    Predicted; Barrick JE, Breaker RR
> #=GF TP    Cis-reg; riboswitch;
> #=GF BM    cmbuild CM SEED
> #=GF BM    cmsearch -W 175 CM SEQDB
> #=GF RN    [1]
> #=GF RM    15096624
> #=GF RT    New RNA motifs suggest an expanded scope for riboswitches in
> #=GF RT    bacterial genetic control.
> #=GF RA    Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee
> #=GF RA    M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR;
> #=GF RL    Proc Natl Acad Sci U S A 2004;101:6421-6426.
> #=GF CC    This family represents the bacterial ykkC/yxkD element. The function of
> #=GF CC    this family is unclear although it has been suggested that it may function
> #=GF CC    to switch on efflux pumps and detoxification systems in response to harmful
> #=GF CC    environmental molecules [1]. The Thermoanaerobacter tengcongensis sequence
> #=GF CC    EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that the two
> #=GF CC    riboswitches may work in conjunction to regulate the the upstream gene
> #=GF CC    which codes for Swiss:Q8RC62, a member of Pfam:PF00860 (Personal obs. Moxon
> #=GF CC    SJ).
> #=GF SQ    16
>
> SimpleAlign, as implemented, seemingly doesn't have a way to store  
> this
> information.
>
> I'll work on getting the core alignment IO working, but would there  
> be any
> interest in having a way to store annotations in Bio::SimpleAlign?   
> I'm
> guessing the methods would be similar to the various Bio::Seq  
> Annotation
> methods.
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Oct 27 16:38:17 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 15:38:17 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
Message-ID: <000001c6fa07$d8659990$15327e82@pyrimidine>

Hilmar Lapp wrote:
> You could make SimpleAlign be a Bio::AnnotationHolderI. (I
> suppose this is what you meant by the 'various Bio::Seq Annotation
> methods' too.)
> 
> Just to make sure I'm not misunderstanding, I suppose the
> annotation pertains to the entire alignment?
> 
> 	-hilmar
...

Yes, that's correct.  I would probably use Bio::Seq::Meta for the
sequence-specific markup lines.  I would have to add another new method to
deal with non-sequence-based consensus data (like sec. structure) for now.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Fri Oct 27 11:38:05 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 27 Oct 2006 08:38:05 -0700
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>
References: <454202C7.1040701@sheffield.ac.uk>
	<001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>
Message-ID: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com>

Bio::DB::FileCache does one better and lets you cache the data in a
persistent file.  Not sure this index is shareable among users though -
bioperl-db is a better soln when that is desired.
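
A minimal sketch of the persistent-cache idea in plain Perl, assuming one
FASTA file per GI in a cache directory. The cached_fasta helper and the
$fetch callback are hypothetical stand-ins for whatever actually
downloads the record; this is not how Bio::DB::FileCache is implemented,
only an illustration of the hit/miss logic.

```perl
use strict;
use warnings;
use File::Spec;
use File::Path qw(make_path);

# Return the FASTA text for $gi, reading it from $cache_dir when present
# and calling $fetch (a code ref taking the GI) only on a cache miss.
sub cached_fasta {
    my ($gi, $cache_dir, $fetch) = @_;
    die "GI must be numeric: $gi\n" unless $gi =~ /^\d+$/;
    make_path($cache_dir) unless -d $cache_dir;
    my $path = File::Spec->catfile($cache_dir, "$gi.fasta");

    if (-s $path) {                      # cache hit: read the file back
        open my $in, '<', $path or die "Cannot read $path: $!";
        local $/;                        # slurp mode
        my $fasta = <$in>;
        close $in;
        return $fasta;
    }

    my $fasta = $fetch->($gi);           # cache miss: go to the network
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} $fasta;
    close $out or die "Cannot close $path: $!";
    return $fasta;
}
```

With the real modules the callback would wrap the GenBank lookup, and
Bio::DB::FileCache can sit in front of the database object instead; its
POD lists the constructor arguments.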

-jason

On 10/27/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> > I have a script that is capable of downloading sequences from GenBank
> > based on GI numbers. I retrieve them if fasta format in order to save
> > bandwidth, but I'd like to take this one step further and cache the
> > sequences in case the user want to rerun the script using some of the
> > GI's they used previously.
> >
> > Does anyone have any guidance on how best to do this?
> >
> > Cheers
> > Nath
>
> There is Bio::DB::InMemoryCache, which is really an interface but appears
> to
> have several methods defined; you could look for modules which implement
> it.
> Sendu's suggestion of the Bio::Index modules and bioperl-db are also good
> starting points.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>


-- 
Jason Stajich
jason at bioperl.org
http://www.duke.edu/~jes12/


From cjfields at uiuc.edu  Fri Oct 27 21:57:58 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 20:57:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
Message-ID: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>


On Oct 27, 2006, at 3:23 PM, Hilmar Lapp wrote:

> You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose
> this is what you meant by the 'various Bio::Seq Annotation methods'
> too.)
>
> Just to make sure I'm not misunderstanding, I suppose the annotation
> pertains to the entire alignment?
>
> 	-hilmar

BTW, was that supposed to be Bio::AnnotatableI, or  
Bio::AnnotationHolderI?  The latter isn't present in CVS HEAD.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From eric.ross at neuro.utah.edu  Sat Oct 28 17:24:30 2006
From: eric.ross at neuro.utah.edu (Eric Ross)
Date: Sat, 28 Oct 2006 15:24:30 -0600
Subject: [Bioperl-l] PAML
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>

I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object.

I am able to extract other data from the report, but there seems to be a conflict in the documentation.  One doc implies that there should be a get_posteriors method (it's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object.


I have been trying various methods, in the event I'm just "confused", but I've had no luck, thus far.  Anyone have suggestions?


code:

----begin code-------
#!/usr/bin/perl -w

use strict;


use Bio::Tools::Phylo::PAML;
my $parser = new Bio::Tools::Phylo::PAML
             (-file => "mlc");
my $result = $parser->next_result;
my @posteriors = $result->get_posteriors();

print "@posteriors";

exit(0);

---------end code-------------


---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab


From avilella at gmail.com  Sun Oct 29 05:52:04 2006
From: avilella at gmail.com (Albert Vilella)
Date: Sun, 29 Oct 2006 10:52:04 +0000
Subject: [Bioperl-l] PAML
In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>

I don't know if this method is implemented. I can't grep-find it.
Maybe it's simply not there yet, but was planned when the
documentation was written.



From cjfields at uiuc.edu  Sun Oct 29 09:23:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Oct 2006 08:23:45 -0600
Subject: [Bioperl-l] PAML
In-Reply-To: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
	<358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
Message-ID: <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>

Does the data show up in the object using Data::Dumper?

This should be filed as a bug since the docs imply the method
exists.  This could be written up fairly quickly if one had test data
and a script to work with (hint hint...)
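
A minimal sketch of that kind of inspection, using a stand-in object rather than a real parsed result (the class name and hash key below are invented for illustration):

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # stable key order makes dumps easier to compare
$Data::Dumper::Indent   = 1;   # compact indentation

# Hypothetical stand-in for a parsed result; class and key are invented.
my $result = bless { _example_field => [ 0.12, 0.88 ] }, 'My::Fake::Result';

# Dump the object's internals to see what data the parser actually stored.
my $dump = Dumper($result);
print $dump;

# can() reports whether the object's class (or a parent) defines a method.
my $has_method = $result->can('get_posteriors') ? 1 : 0;
print "get_posteriors available: $has_method\n";
```

With a real result object, the dump would show whether the parser stored the values at all, independent of whether an accessor for them exists.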

Chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From eric.ross at neuro.utah.edu  Sun Oct 29 12:06:54 2006
From: eric.ross at neuro.utah.edu (Eric Ross)
Date: Sun, 29 Oct 2006 10:06:54 -0700
Subject: [Bioperl-l] PAML
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
	<358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
	<9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>
Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu>

Thanks for all the help.

I've been looking at the code for the PAML rst parser.  It's a bit tricky. 

We have written a parser specific for our needs, but it looks to be a pretty complicated matter to make it generic.  

The output of PAML can vary a lot depending upon your options and this section can be repeated multiple times.  I'm sure someone with a good grasp of the potential output of PAML could come up with something, but I'll admit to being at a loss. 


---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab


From n.haigh at sheffield.ac.uk  Sun Oct 29 12:43:20 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 29 Oct 2006 17:43:20 +0000
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
In-Reply-To: <45421658.5000103@sheffield.ac.uk>
References: <45421658.5000103@sheffield.ac.uk>
Message-ID: <4544E838.7090400@sheffield.ac.uk>

Sorry for the repeat post but I haven't had a response. Just wondered if 
anyone had any idea about this?

Thanks
Nath

Nathan S. Haigh wrote:
> As you may be aware by now, I'm working with Bio::Restriction::Analysis
> and friends.
>
> I'm doing restriction analysis on large sequences - chromosomes. I need
> to identify an appropriate enzyme based on the total length of fragments
> that are of a certain size (e.g. 100 - 500 bp). However, the amount of
> memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I
> have the following code (bottom) which downloads 2 thaliana chromosomes
> (mito and chloro - so pretty small) and runs an analysis and then loops
> through the fragments for all enzymes in the default collection.
>
> My memory usage just keeps on climbing and none seems to get freed up
> even when a $ra goes out of scope (start dealing with the next
> sequence). Is this a memory leak of some sort, is there a way to free up
> memory as I go? I'd appreciate any help/advice on how to reduce the
> amount of memory being consumed as I'd like to use all the thaliana
> chromosomes (not just mito and chloro), which at the moment probably
> won't work.
>
> Cheers
> Nath
>
> use strict;
> use Bio::DB::GenBank;
> use Bio::Restriction::Analysis;
> use Bio::Restriction::EnzymeCollection;
>
> my @seq_objs;
> my @gis = ( 7525012,  26556996 );
>
> my $db = Bio::DB::GenBank->new(-format => "fasta");
> foreach my $gi (@gis) {
>   print "Getting GI: $gi\n";
>   push @seq_objs, $db->get_Seq_by_id($gi)
> }
>
> my $min_fragment_size = 100;
> my $max_fragment_size = 500;
> my $enz_Coll = Bio::Restriction::EnzymeCollection->new();
>
> foreach my $seq (@seq_objs) {
>   my $tot_size = 0;
>   print "Processing ", $seq->primary_id,"\n";
>   my $ra = Bio::Restriction::Analysis->new(
>                                          -seq=>$seq,
>                                          -enzymes=>$enz_Coll,
>   );
>  
>   my @all_enzymes = $ra->cutters->each_enzyme;
>   print "  Calc total length of fragments in range: $min_fragment_size - $max_fragment_size\n";
>   foreach my $enzyme ( @all_enzymes ) {
>     # fragments() is a real memory hog
>     foreach my $frag ($ra->fragments($enzyme)) {
>       next if $min_fragment_size && (length $frag < $min_fragment_size);
>       next if $max_fragment_size && (length $frag > $max_fragment_size);
>       $tot_size += length $frag;
>     }
>     # do something based on value of $tot_size
>     #print "    ", $enzyme->name, " total = $tot_size\n";
>   }
>   print "DONE\n";
> }


From cjfields at uiuc.edu  Sun Oct 29 13:09:54 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Oct 2006 12:09:54 -0600
Subject: [Bioperl-l] PAML
In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
	<358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
	<9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <C775A898-5D18-48F6-874F-3B359C1A10C5@uiuc.edu>

On Oct 29, 2006, at 11:06 AM, Eric Ross wrote:

> Thanks for all the help.
>
> I've been looking at the code for the PAML rst parser.  It's a bit  
> tricky.
>
> We have written a parser specific for our needs, but it looks to be  
> a pretty complicated matter to make it generic.
>
> The output of PAML can vary a lot depending upon your options and  
> this section can be repeated multiple times.  I'm sure someone with  
> a good grasp of the potential output of PAML could come up with  
> something, but I'll admit to being at a loss.

Eric,

I planned on looking at ways to integrate the protein-based PAML  
programs but I'm working on a different area at the moment.  I agree  
it may be hard to adequately genericize parsing/methods to accomplish  
this, but if you have any ideas feel free to post them.  Again, I  
would suggest adding any proposed enhancements or bugs to Bugzilla:

http://bugzilla.open-bio.org/

Suggestions or bug reports on the list sometimes get lost in the  
shuffle, esp. since we're planning on a new developer release soon.

Chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Oct 29 13:16:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Oct 2006 12:16:37 -0600
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
In-Reply-To: <4544E838.7090400@sheffield.ac.uk>
References: <45421658.5000103@sheffield.ac.uk>
	<4544E838.7090400@sheffield.ac.uk>
Message-ID: <6D9EAA04-199C-4BDD-AA60-4833BC1CE250@uiuc.edu>


On Oct 29, 2006, at 11:43 AM, Nathan S. Haigh wrote:

> Sorry for the repeat post but I haven't had a response. Just  
> wondered if
> anyone had any idea about this?
>
> Thanks
> Nath

...

I think Warnock's Dilemma applies here.  Likely no one is really sure, hence
they aren't answering.  It probably bears investigating by submitting  
and tracking as a bug.  My guess is something isn't garbage-collected  
properly (i.e. there are circular references present), leading to a  
memory leak.
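
To illustrate what a circular-reference leak looks like in general, here is a core-Perl sketch. It is not taken from Bio::Restriction::Analysis, and whether that module contains such a cycle is only a guess; the Node class is invented for the demonstration.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;    # counts objects actually freed

{
    package Node;     # invented demo class
    sub new     { my ($class) = @_; return bless {}, $class }
    sub DESTROY { $destroyed++ }
}

{
    my $parent = Node->new;
    my $child  = Node->new;
    $parent->{child} = $child;
    $child->{parent} = $parent;   # cycle: reference counts never reach zero
}
my $freed_with_cycle = $destroyed;

{
    my $parent = Node->new;
    my $child  = Node->new;
    $parent->{child} = $child;
    $child->{parent} = $parent;
    weaken($child->{parent});     # back-reference no longer keeps $parent alive
}
my $freed_with_weaken = $destroyed - $freed_with_cycle;

# prints "freed with cycle: 0, freed with weakened back-reference: 2"
print "freed with cycle: $freed_with_cycle, ",
      "freed with weakened back-reference: $freed_with_weaken\n";
```

Objects caught in a cycle are only released at interpreter shutdown, which in a long loop over many sequences looks exactly like steadily climbing memory.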

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From chhalling at alumni.ls.berkeley.edu  Sun Oct 29 14:16:36 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Sun, 29 Oct 2006 14:16:36 -0500
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
In-Reply-To: <4544E838.7090400@sheffield.ac.uk>
References: <45421658.5000103@sheffield.ac.uk>
	<4544E838.7090400@sheffield.ac.uk>
Message-ID: <4544FE14.7030701@alumni.ls.berkeley.edu>

Nathan S. Haigh wrote:
> Sorry for the repeat post but I haven't had a response. Just wondered if 
> anyone had any idea about this?
>
> Thanks
> Nath
>
Try this code, which creates a new Bio::Restriction::Analysis object for 
each digest. On my PowerBook, this doesn't use more than 13 MB of memory.

Reading the code for Bio::Restriction::Analysis reveals that the 
fragments() method calls the cut() method. The documentation for the cut 
method states:

Note: cut doesn't now re-initialize everything before figuring out
cuts. This is so that you can do multiple digests, or add more data or
whatever. You'll have to use new to reset everything.

This means there is no memory leak; it's just that the 
Bio::Restriction::Analysis object is retaining cut information for each 
enzyme, which takes a lot of memory.

use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012,  26556996 );

my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
  print "Getting GI: $gi\n";
  push @seq_objs, $db->get_Seq_by_id($gi)
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
  print "Processing ", $seq->primary_id, "\n";
  foreach my $enzyme ( $enz_Coll->each_enzyme() ) {
    my $ra = Bio::Restriction::Analysis->new(
      -seq => $seq,
      -enzymes => $enzyme );
    my $tot_size = 0;
 
    print "  Calc total length of fragments in range: " .
      "$min_fragment_size - $max_fragment_size\n";

    foreach my $frag ($ra->fragments($enzyme)) {
      next if $min_fragment_size && (length $frag < $min_fragment_size);
      next if $max_fragment_size && (length $frag > $max_fragment_size);
      $tot_size += length $frag;
    }
    # do something based on value of $tot_size
    print "    ", $enzyme->name, " total = $tot_size\n";
  }
  print "DONE\n";
}

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From n.haigh at sheffield.ac.uk  Mon Oct 30 03:51:49 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 30 Oct 2006 08:51:49 +0000
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
Message-ID: <4545BD25.3030107@sheffield.ac.uk>

In my script I retrieve sequences from GenBank in FASTA format by GI
numbers and optionally store the sequence in a cache using
Bio::DB::Fasta. On subsequent runs of the script, the cache is first
checked for the GI and returns the sequence if it is found or the
sequence is obtained from GenBank as above.

I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have
returned a Bio::Seq object but rather it returns a Bio::PrimarySeq
object which is defined within the Bio::DB::Fasta file. This is
annoying, since $seq_obj in my script would be either a Bio::Seq if it
was obtained from GenBank or a Bio::PrimarySeq if obtained from the
cache and calling primary_id() on it doesn't do the expected thing with
Bio::PrimarySeq:
ID:          Bio::PrimarySeq::Fasta=HASH(0x89b4508)

Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object?

Nath


From yuhki at ncifcrf.gov  Mon Oct 30 08:57:35 2006
From: yuhki at ncifcrf.gov (Naoya Yuhki)
Date: Mon, 30 Oct 2006 08:57:35 -0500
Subject: [Bioperl-l] bptutorial.pl 0
Message-ID: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov>

Hello,
I ran

perl bptutorial.pl 0

and I got the following error.

-------------------- WARNING ---------------------
MSG: id (ROA1_HUMAN) does not exist
---------------------------------------------------
Can't call method "display_id" on an undefined value at bptutorial.pl  
line 3945.

other tests all worked.

I would appreciate any suggestions.

NAOYA YUHKI.


From cjfields at uiuc.edu  Mon Oct 30 12:42:21 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 30 Oct 2006 11:42:21 -0600
Subject: [Bioperl-l] bptutorial.pl 0
In-Reply-To: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov>
Message-ID: <000601c6fc4a$c3e43450$15327e82@pyrimidine>

> Hello,
> I run
> 
> perl bptutorial.pl 0
> 
> and I got the following error.
> 
> -------------------- WARNING ---------------------
> MSG: id (ROA1_HUMAN) does not exist
> ---------------------------------------------------
> Can't call method "display_id" on an undefined value at bptutorial.pl
> line 3945. 
> 
> other tests all worked.
> 
> I thank any suggestions from you.
> 
> NAOYA YUHKI.

What version of Bioperl are you running?  

As a warning, the bptutorial.pl script has been removed from CVS and will
not be included in future versions of Bioperl.  It can be found on the
bioperl wiki instead:

http://www.bioperl.org/wiki/Bptutorial

chris


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Mon Oct 30 13:08:15 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 30 Oct 2006 10:08:15 -0800
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <29F47393-D134-4093-8751-E948BF521843@bioperl.org>

Bio::PrimarySeq makes sense because Fasta databases only provide  
sequences without features.  But you are actually getting a  
Bio::PrimarySeq::Fasta object which is a proxy object since the  
module won't pull a whole sequence into memory unless seq() is  
requested.

The problem is really why you are getting something useless set for  
primary_id.

What do you want it to be - the GI number?  You'll need to explicitly
set it because DB::Fasta has no concept of GI numbers encoded in the
header line.
AFAIK you also cannot set the primary_id to a value of your liking
because this is a proxy object.  The best bet is to create a Bio::Seq
object out of one of these and set the primary_id and display_id to
values that you can compute from the display_id.

At least that has been my strategy when using this - maybe someone
wants to code something new into the object itself.
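
As a rough sketch of the "compute it from the display_id" step, assuming the FASTA headers use the NCBI-style "gi|NUMBER|..." layout. The helper name gi_from_display_id and the accession in the sample ID are made up for illustration:

```perl
use strict;
use warnings;

# Pull the GI number out of an NCBI-style identifier such as
# "gi|7525012|ref|XX_000000.0|"; returns undef if there is none.
sub gi_from_display_id {
    my ($display_id) = @_;
    return unless defined $display_id;
    my ($gi) = $display_id =~ /(?:^|\|)gi\|(\d+)/;
    return $gi;
}

# The accession below is invented; only the GI comes from this thread.
for my $id ('gi|7525012|ref|XX_000000.0|', 'some_local_id') {
    my $gi = gi_from_display_id($id);
    printf "%-30s => %s\n", $id, defined $gi ? $gi : '(no GI)';
}
```

The extracted value could then be passed as -primary_id when building a Bio::Seq from the proxy object's seq() and display_id(), falling back to the display_id itself when no GI is present.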

-jason

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From golharam at umdnj.edu  Mon Oct 30 15:11:51 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 15:11:51 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String?
Message-ID: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>

I'm trying to parse some blast output w/o actually creating the output
file.  Instead, I'm capturing the output in a variable and would like to
use IO::String to represent the file:

	$_ = `megablast -d somedatabase -i somesequence -D 2`;
	my $blast_file = new IO::String($_);
	my $searchio = new Bio::SearchIO(-format => 'blast', -fh => $blast_file);
	my $results = $searchio->next_result;
	my $hit = $results->next_hit;
	if (! defined($hit)) {
		warn "No BLAST hit for $accession on chr $chr for Seq/$orth_id/$organism\n\n";
		return;
	}

Now, when Bio::SearchIO tries to read the output line by line, instead
it reads the entire output as 1 line.

If I provide the output in a file and use:

	my $searchio = new Bio::SearchIO(-format => 'blast', -file => '/tmp/somefile.blast');

This works...so is it possible to use IO::String to provide
Bio::SearchIO with BLAST output?  

Ryan


From golharam at umdnj.edu  Mon Oct 30 15:54:29 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 15:54:29 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com>
Message-ID: <00e801c6fc65$9849aee0$e6028a0a@GOLHARMOBILE1>

Thanks.  How are you getting the output?  system()?  BTW- I'm using
v1.5.1...


> -----Original Message-----
> From: Bernd Web [mailto:bernd.web at gmail.com] 
> Sent: Monday, October 30, 2006 3:45 PM
> To: golharam at umdnj.edu
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] Is it possible to parse BLAST output 
> using IO:String?
> 
> 
> Hi Ryan,
> 
> I parse blastn output using IO::String w/o problems:
> 
>  my $stringfh = new IO::String($input);
>  my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh);
> 
> however this input does not come via backticks.
> 
> 
> bernd
> 


From bix at sendu.me.uk  Mon Oct 30 16:27:58 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 30 Oct 2006 21:27:58 +0000
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
Message-ID: <45466E5E.9000504@sendu.me.uk>

Ryan Golhar wrote:
> I'm trying to parse some blast output w/o actually creating the output
> file.  Instead, I'm capturing the output in a variable and would like to
> use IO::String to represent the file:
> 
> 	$_ = `megablast -d somedatabase -i somesequence -D 2`;
> 	my $blast_file = new IO::String($_);
> 	my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
> $blast_file);
> 	my $results = $searchio->next_result;
> 	my $hit = $results->next_hit;
> 	if (! defined($hit)) {
> 		warn "No BLAST hit for $accession on chr $chr for
> Seq/$orth_id/$organism\n\n";
> 		return;
> 	}
> 
> Now, when Bio::SearchIO tries to read the output line by line, instead
> it reads the entire output as 1 line.
> 
> If I provide the output in a file and use:
> 
> 	my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
> '/tmp/somefile.blast');
> 
> This works...so is it possible to use IO::String to provide
> Bio::SearchIO with BLAST output?

Why must it be IO::String? Why not just open() your megablast and 
provide $searchio the real filehandle? It would be faster that way as well.

Read the docs for `. Your usage above is inappropriate.
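
A core-Perl sketch of the two options, for comparison: a piped open that streams a command's output, and an in-memory filehandle opened on an already captured string. The command here is a harmless stand-in for megablast (it just prints three lines), so nothing below is specific to BLAST.

```perl
use strict;
use warnings;

# Stand-in for the megablast command: any program that writes lines to
# STDOUT. $^X is the path of the running perl.
my @cmd = ($^X, '-e', 'print "line $_\n" for 1..3');

# List-form piped open: no shell is involved and output is streamed
# rather than collected into one big string first.
open(my $pipe_fh, '-|', @cmd) or die "Cannot run @cmd: $!";
my @from_pipe = <$pipe_fh>;
close($pipe_fh) or die "Command failed: $! (exit status $?)";

# If the output is already in a scalar, core Perl can open a read
# handle directly on it, without IO::String.
my $captured = join '', @from_pipe;
open(my $mem_fh, '<', \$captured) or die "Cannot open in-memory handle: $!";
my @from_string = <$mem_fh>;
close($mem_fh);

printf "%d lines from pipe, %d lines from string\n",
    scalar @from_pipe, scalar @from_string;
```

Either handle is an ordinary filehandle, so it should be usable wherever a -fh argument is accepted, as in the Bio::SearchIO calls in this thread.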


From golharam at umdnj.edu  Mon Oct 30 16:54:45 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 16:54:45 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <C3209DC5-433B-4BAD-A184-AC9D2A2B4A90@bioperl.org>
Message-ID: <00f901c6fc6e$03916460$e6028a0a@GOLHARMOBILE1>

Hmmm.  Yes, I suppose I could.  
 
I did it with the backtick because I based my code off of the "To and
From a String" from the SeqIO HOWTO...
 

-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason
Stajich
Sent: Monday, October 30, 2006 4:44 PM
To: Sendu Bala
Cc: golharam at umdnj.edu; 'bioperl-l'
Subject: Re: [Bioperl-l] Is it possible to parse BLAST output using
IO:String?


right - can't you just do: 

my $fh;
open($fh, "megablast -d ... | ") || die $!;
my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh);


--
Jason Stajich, PhD 
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From bernd.web at gmail.com  Mon Oct 30 15:44:31 2006
From: bernd.web at gmail.com (Bernd Web)
Date: Mon, 30 Oct 2006 21:44:31 +0100
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
Message-ID: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com>

Hi Ryan,

I parse blastn output using IO::String w/o problems:

 my $stringfh = new IO::String($input);
 my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh);

however, this input does not come via backticks.


bernd

On 10/30/06, Ryan Golhar <golharam at umdnj.edu> wrote:
> I'm trying to parse some blast output w/o actually creating the output
> file.  Instead, I'm capturing the output in a variable and would like to
> use IO::String to represent the file:
>
>         $_ = `megablast -d somedatabase -i somesequence -D 2`;
>         my $blast_file = new IO::String($_);
>         my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
> $blast_file);
>         my $results = $searchio->next_result;
>         my $hit = $results->next_hit;
>         if (! defined($hit)) {
>                 warn "No BLAST hit for $accession on chr $chr for
> Seq/$orth_id/$organism\n\n";
>                 return;
>         }
>
> Now, when Bio::SearchIO tries to read the output line by line, instead
> it reads the entire output as 1 line.
>
> If I provide the output in a file and use:
>
>         my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
> '/tmp/somefile.blast');
>
> This works...so is it possible to use IO::String to provide
> Bio::SearchIO with BLAST output?
>
> Ryan
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From jason at bioperl.org  Mon Oct 30 16:44:18 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 30 Oct 2006 13:44:18 -0800
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <45466E5E.9000504@sendu.me.uk>
References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
	<45466E5E.9000504@sendu.me.uk>
Message-ID: <C3209DC5-433B-4BAD-A184-AC9D2A2B4A90@bioperl.org>

right - can't you just do:

my $fh;
open($fh, "megablast -d ... | ") || die $!;
my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh);
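
For reference, here is a minimal core-Perl sketch of two ways to hand program
output to a parser as a filehandle: a pipe opened on the command, and an
in-memory handle on a string that was already captured. The child command is
only a stand-in for megablast (an assumption, so the sketch runs without BLAST
installed), and the list-form pipe open assumes a Unix-like system or a recent
Perl on Windows.

```perl
use strict;
use warnings;

# Stand-in for the megablast command line (assumption: any program that
# writes to STDOUT behaves the same way). The current perl prints 3 lines.
my @cmd = ($^X, '-e', 'print "line $_\n" for 1..3');

# List-form pipe open: no shell is involved and output arrives as a stream.
open(my $pipe_fh, '-|', @cmd) or die "Cannot start command: $!";
my @from_pipe = <$pipe_fh>;
close($pipe_fh) or die "Command failed (exit status $?)";

# When the output is already in a scalar, an in-memory filehandle
# (core Perl since 5.8) can stand in for a file.
my $captured = join '', @from_pipe;
open(my $mem_fh, '<', \$captured) or die "Cannot open in-memory handle: $!";
my @from_string = <$mem_fh>;
close($mem_fh);

printf "pipe: %d lines, string: %d lines\n",
    scalar(@from_pipe), scalar(@from_string);
```

Either handle could then be passed as -fh in the same way as $fh above.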

On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote:

> Ryan Golhar wrote:
>> I'm trying to parse some blast output w/o actually creating the  
>> output
>> file.  Instead, I'm capturing the output in a variable and would  
>> like to
>> use IO::String to represent the file:
>>
>> 	$_ = `megablast -d somedatabase -i somesequence -D 2`;
>> 	my $blast_file = new IO::String($_);
>> 	my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
>> $blast_file);
>> 	my $results = $searchio->next_result;
>> 	my $hit = $results->next_hit;
>> 	if (! defined($hit)) {
>> 		warn "No BLAST hit for $accession on chr $chr for
>> Seq/$orth_id/$organism\n\n";
>> 		return;
>> 	}
>>
>> Now, when Bio::SearchIO tries to read the output line by line,  
>> instead
>> it reads the entire output as 1 line.
>>
>> If I provide the output in a file and use:
>>
>> 	my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
>> '/tmp/somefile.blast');
>>
>> This works...so is it possible to use IO::String to provide
>> Bio::SearchIO with BLAST output?
>
> Why must it be IO::String? Why not just open() your megablast and
> provide $searchio the real filehandle? It would be faster that way  
> as well.
>
> Read the docs for `. Your usage above is inappropriate.
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From lstein at cshl.edu  Mon Oct 30 13:59:29 2006
From: lstein at cshl.edu (Lincoln Stein)
Date: Mon, 30 Oct 2006 13:59:29 -0500
Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase
Message-ID: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>

Hi All,

I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not
to validate. I have committed a new version to live and to the release
candidate branch. I hope it isn't too late to get this into the release.

Lincoln

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From huangyi1 at hkusua.hku.hk  Tue Oct 31 00:46:20 2006
From: huangyi1 at hkusua.hku.hk (Huang Yi)
Date: Tue, 31 Oct 2006 13:46:20 +0800
Subject: [Bioperl-l] bioperl1.5 and GD2.35
Message-ID: <200610310546.k9V5kQGT010481@hkusua.hku.hk>

Hi,

 
I just installed bioperl 1.4 from CPAN on my Gentoo Linux computer, but the
installation failed. I had to force the install.

 
However, the GD module couldn't be installed for some unknown reasons.

 
I therefore used Gentoo's "emerge" tool to get bioperl and GD again. They
installed fine. bioperl was upgraded to 1.5 and GD is now 2.35.

 
However, when I tested it by using the program in HOWTO wiki page
(http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me:

 
Can't locate object method "png" via package "GD::Image" at
/usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line 799, <> line 9.

 
On my other computer, bioperl 1.4 and GD 2.34 work fine. I therefore want to
remove the CPAN bioperl from the system and re-install it, but that seems to
be impossible.

 
Could you please give me some advice on how to get GD and bioperl to work?

 
Thanks!

 
Huang Yi

 
From bix at sendu.me.uk  Tue Oct 31 03:20:21 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 31 Oct 2006 08:20:21 +0000
Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase
In-Reply-To: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>
References: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>
Message-ID: <45470745.1050605@sendu.me.uk>

Lincoln Stein wrote:
> Hi All,
> 
> I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not
> to validate. I have committed a new version to live and to the release
> candidate branch. I hope it isn't too late to get this into the release.

It isn't too late, thank you.


From avilella at gmail.com  Tue Oct 31 08:54:39 2006
From: avilella at gmail.com (Albert Vilella)
Date: Tue, 31 Oct 2006 13:54:39 +0000
Subject: [Bioperl-l] catfile and catdir
Message-ID: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>

Hi,

I was testing bioperl-run/t/PAML.t and stumbled upon this
catdir/catfile error:

Can't locate object method "catdir" via package "Bio::Root::IO" at
/home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
113.
BEGIN failed--compilation aborted at
/home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
143.
Compilation failed in require at t/PAML.t line 64.
BEGIN failed--compilation aborted at t/PAML.t line 64.

Should we be using File::Spec for catdir and catfile instead of Root::IO?

Cheers,

    Albert.


From Kevin.M.Brown at asu.edu  Tue Oct 31 10:34:34 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Tue, 31 Oct 2006 08:34:34 -0700
Subject: [Bioperl-l] bioperl1.5 and GD2.35
Message-ID: <1A4207F8295607498283FE9E93B775B4023B5F3C@EX02.asurite.ad.asu.edu>

Not really a Bioperl issue per se, but sounds like when you had Gentoo
emerge GD it didn't include libpng and so didn't build the needed parts
to create PNG type graphics. 
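
A quick way to check this, as a hedged sketch: ask the installed GD whether
GD::Image has a png() method at all. This assumes, as the error message in
this thread suggests, that the method is simply absent when GD was built
without libpng. The $status variable is only illustrative.

```perl
use strict;
use warnings;

# Report whether the GD visible to this perl can write PNG output.
my $status;
if (eval { require GD; 1 }) {
    $status = GD::Image->can('png')
        ? sprintf('GD %s: PNG output available', GD->VERSION)
        : sprintf('GD %s: built without PNG support', GD->VERSION);
}
else {
    $status = 'GD is not installed for this perl';
}
print "$status\n";
```

If the second message appears, rebuilding GD with PNG support enabled would be
the thing to try.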

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Huang Yi
> Sent: Monday, October 30, 2006 10:46 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] bioperl1.5 and GD2.35
> 
> Hi,
> 
>  
> 
> I just installed bioperl 1.4 from CPAN to my Gentoo linux 
> computer. But the
> installation was failed. I had to install by force.
> 
>  
> 
> However, the GD module couldn't be installed for some unknown reasons.
> 
>  
> 
> I therefore use "emerge" tool of Gentoo to get bioperl and GD 
> again. They
> are fine. The version of bioperl became upgrade to1.5 and GD was 2.35.
> 
>  
> 
> However, when I tested it by using the program in HOWTO wiki page
> (http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me:
> 
>  
> 
> Can't locate object method "png" via package "GD::Image" at
> /usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line 
> 799, <> line 9.
> 
>  
> 
> In my other computer, bioperl1.4 and GD2.34 work fine. I 
> therefore want to
> remove the CPAN bioperl from the system and re-install it, 
> but it seems to
> be impossible.
> 
>  
> 
> Would you please give me some advices on how to let my GD and 
> bioperl work. 
> 
>  
> 
> Thanks!
> 
>  
> 
> Huang Yi
> 
>  
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From hlapp at gmx.net  Tue Oct 31 11:21:40 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 11:21:40 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
	<24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
Message-ID: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>


On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:

> BTW, was that supposed to be Bio::AnnotatableI, or  
> Bio::AnnotationHolderI?

Sorry, the former. I guess I got confused with FeatureHolders. Too  
bad Featureable isn't an English word.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Tue Oct 31 12:01:44 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:01:44 -0500
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net>

The only thing I would add to Jason's reply is that it is easy to do

	if (! $seq->isa("Bio::SeqI")) {
		my $bioseq = Bio::Seq->new();
		$bioseq->primary_seq($seq);
		$seq = $bioseq;
	}

and from that point on all your objects are Bio::SeqI compliant  
regardless of whether they were obtained that way or not.

Aside from that I wonder why there isn't a -primary_seq option in  
Bio::Seq::new - this would shorten the above into a (more perl'ish)  
single line:

	$seq = Bio::Seq->new(-primary_seq=>$seq) unless $seq->isa("Bio::SeqI");

Any takers to add that capability?

-hilmar

On Oct 30, 2006, at 3:51 AM, Nathan S. Haigh wrote:

> In my script I retrieve sequences from GenBank in FASTA format by GI
> numbers and optionally store the sequence in a cache using
> Bio::DB::Fasta. On subsequent runs of the script, the cache is first
> checked for the GI and returns the sequence if it is found or the
> sequence is obtained from GenBank as above.
>
> I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have
> returned a Bio::Seq object but rather it returns a Bio::PrimarySeq
> object which is defined within the Bio::DB::Fasta file. This is
> annoying, since $seq_obj in my script would be either a Bio::Seq if it
> was obtained from GenBank or a Bio::PrimarySeq if obtained from the
> cache and calling primary_id() on it doesn't do the expected thing  
> with
> Bio::PrimarySeq:
> ID:          Bio::PrimarySeq::Fasta=HASH(0x89b4508)
>
> Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object?
>
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct 31 12:08:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 11:08:56 -0600
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
Message-ID: <001401c6fd0f$4239aa50$15327e82@pyrimidine>

>> BTW, was that supposed to be Bio::AnnotatableI, or
>> Bio::AnnotationHolderI?
> 
> Sorry, the former. I guess I got confused with
> FeatureHolders. Too bad Featureable isn't an English word.
> 
> 	-hilmar

Having SimpleAlign be AnnotatableI shouldn't be too much of a burden, since
the only additional implemented method is annotation().  So, I think all the
various Stockholm tags can be placed somewhere.

A bit OT: were we planning on getting rid of the various *_tag_* methods in
AnnotatableI at some point?  I'm a bit confused as to why they were added.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Tue Oct 31 12:09:26 2006
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 31 Oct 2006 09:09:26 -0800
Subject: [Bioperl-l] catfile and catdir
In-Reply-To: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>
References: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>
Message-ID: <1AD4DB38-E08D-4E47-8A59-6539068474CB@bioperl.org>

Yep.  Unless we want this to also exist in Root::IO and delegate to  
File::Spec.
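
Both options can be sketched in a few lines. File::Spec's catdir() and
catfile() are core Perl class methods. The My::IO package and the path names
below are hypothetical, and only illustrate what delegating to File::Spec
could look like.

```perl
use strict;
use warnings;
use File::Spec;

# Option 1: call File::Spec directly.
my $dir  = File::Spec->catdir('t', 'data');            # hypothetical directory
my $file = File::Spec->catfile($dir, 'example.mlc');   # hypothetical file name

# Option 2: a hypothetical wrapper class that keeps its own
# catdir()/catfile() interface and forwards to File::Spec.
package My::IO;

sub catdir  { my ($class, @parts) = @_; return File::Spec->catdir(@parts) }
sub catfile { my ($class, @parts) = @_; return File::Spec->catfile(@parts) }

package main;

print "$file\n";
print My::IO->catfile('t', 'data', 'example.mlc'), "\n";
```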

-jason
On Oct 31, 2006, at 5:54 AM, Albert Vilella wrote:

> Hi,
>
> I was testing the bioperl-run/t/PAML.t and stumbled upon this a
> catdir/catfile error:
>
> Can't locate object method "catdir" via package "Bio::Root::IO" at
> /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
> 113.
> BEGIN failed--compilation aborted at
> /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
> 143.
> Compilation failed in require at t/PAML.t line 64.
> BEGIN failed--compilation aborted at t/PAML.t line 64.
>
> Should be be using File::Spec for catdir and catfile instead of  
> Root::IO?
>
> Cheers,
>
>     Albert.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From jason at bioperl.org  Tue Oct 31 12:10:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 31 Oct 2006 09:10:51 -0800
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
	<24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
	<8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
Message-ID: <65F92B54-33FD-4D8F-90B7-49E2697CDBA2@bioperl.org>

It just needs to have an annotation collection - so it would be  
Bio::AnnotatableI

On Oct 31, 2006, at 8:21 AM, Hilmar Lapp wrote:

>
> On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:
>
>> BTW, was that supposed to be Bio::AnnotatableI, or
>> Bio::AnnotationHolderI?
>
> Sorry, the former. I guess I got confused with FeatureHolders. Too
> bad Featureable isn't an English word.
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From hlapp at gmx.net  Tue Oct 31 12:44:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:44:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <C16CF3EE.B1A9%bosborne11@verizon.net>
References: <C16CF3EE.B1A9%bosborne11@verizon.net>
Message-ID: <ACF19E78-7FC3-42BE-8F41-86C45C710F4B@gmx.net>

Well isn't this a result of conflating some of the SeqFeatureI  
methods into the annotation collection?

If I'm not mistaken on this then those methods were introduced in  
1.5.0 and hence can go away without deprecation.

	-hilmar

On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote:

> Chris,
>
> I don't think the intent was to remove the methods, rather we'd  
> just call
> deprecated(). Example from AnnotatableI:
>
> sub remove_tag {
>   my ($self, @args) = @_;
>
>   #uncomment in 1.6
>   #$self->deprecated('remove_tag() is deprecated, use
> remove_Annotations()');
>
>   return $self->annotation->remove_Annotations(@args);
> }
>
> With regards to "why", I can't reconstruct the entire rationale  
> myself but I
> can say that the newer names make more sense. Take that example  
> above - it's
> function is to remove entire Annotations not just to remove tags, so
> remove_Annotations is a better name.
>
> Brian O.
>
>
> On 10/31/06 1:08 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:
>
>> A bit OT: were we planning on getting rid of the various *_tag_*  
>> methods in
>> AnnotatableI at some point?  I'm a bit confused as to why they  
>> were added.
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bosborne11 at verizon.net  Tue Oct 31 11:37:01 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 31 Oct 2006 12:37:01 -0400
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <001401c6fd0f$4239aa50$15327e82@pyrimidine>
Message-ID: <C16CF3EE.B1A9%bosborne11@verizon.net>

Chris,

I don't think the intent was to remove the methods, rather we'd just call
deprecated(). Example from AnnotatableI:

sub remove_tag {
  my ($self, @args) = @_;

  #uncomment in 1.6
  #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');

  return $self->annotation->remove_Annotations(@args);
}

With regard to "why", I can't reconstruct the entire rationale myself, but I
can say that the newer names make more sense. Take the example above: its
function is to remove entire Annotations, not just tags, so
remove_Annotations is a better name.
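
To make the pattern concrete, here is a self-contained sketch of an old
method name kept as a warning wrapper around the new one. My::Annotatable is
a hypothetical stand-in, not the real Bio::AnnotatableI, and carp() stands in
for the deprecated() call shown above.

```perl
use strict;
use warnings;

package My::Annotatable;    # hypothetical stand-in class
use Carp qw(carp);

sub new {
    my ($class, %annotations) = @_;
    return bless { annotations => {%annotations} }, $class;
}

# Preferred name: removes everything stored under the given keys and
# returns the removed values.
sub remove_Annotations {
    my ($self, @keys) = @_;
    return map { @{ delete($self->{annotations}{$_}) || [] } } @keys;
}

# Old name kept as a thin wrapper: warn, then delegate.
sub remove_tag {
    my ($self, @args) = @_;
    carp 'remove_tag() is deprecated, use remove_Annotations()';
    return $self->remove_Annotations(@args);
}

package main;

my $obj = My::Annotatable->new(comment => ['first', 'second']);
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };   # collect warnings
my @removed = $obj->remove_tag('comment');
print scalar(@removed), " annotation(s) removed, ",
      scalar(@warnings), " warning(s)\n";
```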

Brian O.


On 10/31/06 1:08 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> A bit OT: were we planning on getting rid of the various *_tag_* methods in
> AnnotatableI at some point?  I'm a bit confused as to why they were added.


From cjfields at uiuc.edu  Tue Oct 31 13:44:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 12:44:02 -0600
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <ACF19E78-7FC3-42BE-8F41-86C45C710F4B@gmx.net>
Message-ID: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>

Hilmar Lapp wrote:
> Well isn't this a result of conflating some of the
> SeqFeatureI methods into the annotation collection?
> 
> If I'm not mistaken on this then those methods were
> introduced in 1.5.0 and hence can go away without deprecation.
> 
> 	-hilmar
> 
> On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote:
> 
>> Chris,
>> 
>> I don't think the intent was to remove the methods, rather we'd just
>> call deprecated(). Example from AnnotatableI:
>> 
>> sub remove_tag {
>>   my ($self, @args) = @_;
>> 
>>   #uncomment in 1.6
>>   #$self->deprecated('remove_tag() is deprecated, use
>> remove_Annotations()'); 
>> 
>>   return $self->annotation->remove_Annotations(@args); }
>> 
>> With regards to "why", I can't reconstruct the entire rationale
>> myself but I can say that the newer names make more sense. Take that
>> example above - it's function is to remove entire Annotations not
>> just to remove tags, so remove_Annotations is a better name.
>> 
>> Brian O.
>> 
>> 
>> On 10/31/06 1:08 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:
>> 
>>> A bit OT: were we planning on getting rid of the various *_tag_*
>>> methods in AnnotatableI at some point?  I'm a bit confused as to why
>>> they were added.

Sorry Brian, what I meant was, based on CVS history, the various *tag*
methods in AnnotatableI were added all at once, with deprecations already
present in the commit.  So the methods weren't there to begin with, then
added only to be deprecated later?  Hence the confusion...

I think Hilmar's right; the CVS history indicates these were added just
prior to rel. 1.5 by Allen and seem to be related to SeqFeatureI.  I'm sure
the intent was good, but they contradict methods in the Feature/Annotation
HOWTO on retrieving Annotation objects via the Annotation::Collection
object.  I think that agrees with your point about the various Annotation*
method names being the more appropriate ones.  

Does everybody agree we should just remove them?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 31 13:53:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 12:53:16 -0600
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net>
Message-ID: <000001c6fd1d$d4359c80$15327e82@pyrimidine>


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp
> Sent: Tuesday, October 31, 2006 11:02 AM
> To: n.haigh at sheffield.ac.uk
> Cc: Bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
> 
> The only thing I would add to Jason's reply is that it is easy to do
> 
> 	if (! $seq->isa("Bio::SeqI")) {
> 		my $bioseq = Bio::Seq->new();
> 		$bioseq->primary_seq($seq);
> 		$seq = $bioseq;
> 	}
> 
> and from that point on all your objects are Bio::SeqI 
> compliant regardless of whether they were obtained that way or not.
> 
> Aside from that I wonder why there isn't a -primary_seq 
> option in Bio::Seq::new - this would shorten the above into a 
> (more perl'ish) single line:
> 
> 	$seq = Bio::Seq->new(-primary_seq=>$seq) unless 
> $seq->isa("Bio::SeqI");
> 
> Anyone takers to add that capability?
> 
> -hilmar

Sounds good to me!

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
From nhansen at nhgri.nih.gov  Tue Oct 31 14:51:23 2006
From: nhansen at nhgri.nih.gov (Nancy Hansen)
Date: Tue, 31 Oct 2006 14:51:23 -0500 (EST)
Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling
Message-ID: <Pine.GSO.4.58.0610311438470.17750@stout.nhgri.nih.gov>


Hello,

	As sequencing centers begin to deposit trace data from "Medical
Sequencing" projects into the public archives, there is now the need to
"anonymize" sequence trace files by removing embedded information which
might be used to identify the individual who was the original source of
the DNA being sequenced.

	I was hoping I might be able to use Bio::SeqIO to manipulate the
comments contained in an SCF-formatted trace file, but I'm finding that
Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information.
Since SCF is a widely-accepted standard for trace files, would it be
reasonable to include fields like "scf_comments" and "scf_header" in a
Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them?
Likewise, it would be great if write_seq could pull these values right
from a SequenceTrace object rather than requiring them as arguments.

	I'd be happy to help in this effort if necessary.

	Thanks,
	--Nancy

*************************************
Nancy F. Hansen, PhD	nhansen at nhgri.nih.gov
Bioinformatics Group
NIH Intramural Sequencing Center (NISC)
5625 Fishers Lane
Rockville, MD 20852
Phone: (301) 435-1560	Fax: (301) 435-6170


From lincoln.stein at gmail.com  Tue Oct 31 15:24:17 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 31 Oct 2006 15:24:17 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <000001c6f78b$d1c65a30$15327e82@pyrimidine>
References: <453E309B.9090007@sendu.me.uk>
	<000001c6f78b$d1c65a30$15327e82@pyrimidine>
Message-ID: <6dce9a0b0610311224x79256b29sf102eb5c35865caf@mail.gmail.com>

Are you going to go ahead with 1.52_XX ? If so, I will code GBrowse to look
for 1.52 or higher.
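
For anyone following along, a small sketch of why the number format matters
when versions are compared as plain numbers. The values are the ones
mentioned in this thread, and the variable names are only illustrative.

```perl
use strict;
use warnings;

# Plain numeric comparison of $VERSION-style numbers.
my $old      = 1.4;         # existing release number discussed in the thread
my $padded   = 1.005002;    # 1.5.2 written with three digits per component
my $proposed = 1.52;        # the 1.52 form asked about above

printf "%s %s %s\n", $padded,   ($padded   > $old ? '>' : '<='), $old;
printf "%s %s %s\n", $proposed, ($proposed > $old ? '>' : '<='), $old;

# In a numeric literal, underscores are only digit separators.
my $dev = 1.005002_01;
print "literal 1.005002_01 is the number $dev\n";
```

The padded form sorts below 1.4 numerically, which is the concern raised in
the quoted discussion below; 1.52 does not have that problem.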

Lincoln

On 10/24/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> ..
> >
> > 'handle'? I think it shows up as '6.2.13' simply because it was uploaded
> > with the filename Perl6-Pugs-6.2.13.tar.gz
>
> Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is
> '6.002013'.  So maybe we should follow a similar convention.  Seems easier
> and less confusing to me, at least.
>
> > As you point out, the code has the kind of $VERSION number we've been
> > suggesting in this thread:
> >
> > > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':
> > >
> > > our $VERSION = 6.002013;
> > >
> > > That's also a very perlish-way to do it.  And there are no developer
> > > versions of Pugs, since it is always under active development.  We
> could
> > try
> > > something like:
> > >
> > > our $VERSION = 1.005002_01;
> >
> > Yes, this was already like one of my suggestions (1.0502_01), but I
> > brought up the concern that 1.05 might be < 1.4.
> >
> > So then we have a question: do we try and fumble a 1.4 compatible number
> > by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if
> > it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no
> > room for RC numbering, or 1.006000010 (1.6.0.10) - the first final
> > release following some 1.006000_001 (1.6.0.01 == rc1) RCs?
>
> I would go for the clean break if it follows perl/CPAN convention.
> '1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing.
>
> If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6
> RC1, 1.6 RC2 etc then that would be consistent and perl-compatible.
>
> BTW, the reason I looked at Pugs was to see what some of the Perl6
> developers were using.  Who knows; they'll probably change it!
>
> ..
>
> > I don't think it would be a hassle; on the contrary it would be very
> > useful to know the CPAN distribution actually works. I'm very happy with
> > the idea that a release candidate gets fully tested...
>
> So you obviously feel strongly about it!  ;>
>
> I don't have a problem as long as we stick with doing this from now on (
> i.e.
> have a consistent versioning scheme, release policy, CPAN release policy,
> etc).  Would be nice for Jason/Brian/Hilmar to chime in as to the
> reasoning
> behind the older versioning scheme.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From hlapp at gmx.net  Tue Oct 31 16:53:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 16:53:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>
References: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>
Message-ID: <F244DEC6-0ADE-437E-9AED-1F864A54F7AD@gmx.net>


On Oct 31, 2006, at 1:44 PM, Chris Fields wrote:

> Does everybody agree we should just remove them?

I wish you could but I'm afraid that would break stuff? Otherwise why  
were they added in the first place? I thought  
Bio::SeqFeature::Annotated needs them maybe?

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct 31 17:41:17 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 16:41:17 -0600
Subject: [Bioperl-l] AnnotatableI tag methods,
	was  Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <F244DEC6-0ADE-437E-9AED-1F864A54F7AD@gmx.net>
Message-ID: <000001c6fd3d$ae37c240$15327e82@pyrimidine>


> On Oct 31, 2006, at 1:44 PM, Chris Fields wrote:
> 
> > Does everybody agree we should just remove them?
> 
> I wish you could but I'm afraid that would break stuff? 
> Otherwise why were they added in the first place? I thought 
> Bio::SeqFeature::Annotated needs them maybe?
> 
> 	-hilmar
> 
> --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Yep, removing them clobbers a ton of tests, including anything that requires
SeqIO::FTHelper.  Looks like SeqFeature::Generic and a few others use them.


I could understand if these were meant to be permanent methods, but why add
these in if they were to be deprecated in 1.6?  Something that was meant to
be a transition but wasn't finished?  That seems to be indicated in the
commented out lines for all the *tag* methods:

  #uncomment in 1.6
  #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
From lincoln.stein at gmail.com  Tue Oct 31 18:18:07 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 31 Oct 2006 18:18:07 -0500
Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning
In-Reply-To: <loom.20061020T041338-193@post.gmane.org>
References: <loom.20061020T041338-193@post.gmane.org>
Message-ID: <6dce9a0b0610311518l3bec852q5d04a9b488621377@mail.gmail.com>

Hi Keith,

The current Bio/DB/GFF/Util/Binning.pm file just contains the hierarchical
binning system that I implemented some time ago. Where is the R-tree system
that you describe? How much of an improvement did the R-tree scheme give
over the hierarchical scheme?

FYI, the GFF3 implementation uses a different binning scheme in which there
is a fixed-size bin. Every time a feature overlaps a bin, it creates a new
row in a table. So big features will have multiple rows and little features
that fit inside a bin will have only one row. The query for this is simpler
and seems to give the same relative speedup as the hierarchical binning
system. I'd really like to get these queries to go as fast as possible and
would love to work with you on this if you're interested.
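
To make the fixed-size scheme concrete, here is a minimal Perl sketch of the
bin assignment. It is an illustration only, not the code in the GFF3
implementation: the bin size and the sub name bins_for are made up, and
coordinates are assumed to be 1-based and inclusive.

```perl
use strict;
use warnings;

# Assumed bin width; the actual size used by the GFF3 code is not stated here.
my $BIN_SIZE = 1_000_000;

# Return the index of every fixed-size bin that the feature [start, end]
# overlaps. One table row would be written per returned index.
sub bins_for {
    my ($start, $end) = @_;
    my $first = int(($start - 1) / $BIN_SIZE);
    my $last  = int(($end   - 1) / $BIN_SIZE);
    return ($first .. $last);
}

# A small feature fits inside one bin, so it gets one row.
print join(',', bins_for(100, 200)), "\n";              # prints 0

# A large feature spans several bins, so it gets one row per bin.
print join(',', bins_for(900_000, 2_100_000)), "\n";    # prints 0,1,2
```

Under these assumptions a range query would compute bins_for() for the
region of interest, select the rows in those bins, and then check the real
start and end coordinates.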

Lincoln

On 10/19/06, Keith Player <keithplayer at hotmail.com> wrote:
>
> I know that there may be some changes resulting from new GFF3
> implementations,
> but thought I would see if the following is useful anyway.
>
> I implemented the R-tree binning schema as used by
> Bio::DB::GFF::Util::Binning
> and as mention in this article:
>
> I tested the following query on a normal table (no binning), but it
> assumes
> that you know the longest range in the table.  So for example with a table
> of
> human genes, where the longest gene we know of is around 2.4Mb.
>
> SELECT COUNT(*) as count FROM groups WHERE start > max(0,[start-2.4Mb])
> AND
> g.start < [end] AND g.end > [start] AND g.chromosome = '1'
>
> so for 100Mb:101Mb
>
> SELECT COUNT(*) as count FROM groups WHERE start > 97600000 AND g.start <
> 101000000 AND g.end > 100000000 AND g.chromosome = '1'
>
>
> where [start] and [end] define the region of interest.  This query
> outperforms
> the R-Tree implementation on all tests that I have performed (for lengths
> of
> 200bp to 10Mb across a whole chromsome).  Could this be of some practical
> use?
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From bosborne11 at verizon.net  Tue Oct 31 21:31:49 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 31 Oct 2006 22:31:49 -0400
Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling
In-Reply-To: <Pine.GSO.4.58.0610311438470.17750@stout.nhgri.nih.gov>
Message-ID: <C16D7F55.B1D9%bosborne11@verizon.net>

Nancy,

It looks like a good place to start would be the get_header() and
_get_header methods in Bio::SeqIO::scf. If you read t/scf.t you can see that
the author, at some point, wanted get_header to return meaningful
information but stepping through the test shows it returning a lot of UNDEF.
Now I don't know if this is due to the method or the source SCF file, but
you might be able to get these methods to work yourself.

But to answer your questions, yes, it certainly sounds reasonable that these
values would be extracted by Bio::SeqIO::scf.

Brian O.


On 10/31/06 3:51 PM, "Nancy Hansen" <nhansen at nhgri.nih.gov> wrote:

> 
> Hello,
> 
> As sequencing centers begin to deposit trace data from "Medical
> Sequencing" projects into the public archives, there is now the need to
> "anonymize" sequence trace files by removing embedded information which
> might be used to identify the individual who was the original source of
> the DNA being sequenced.
> 
> I was hoping I might be able to use Bio::SeqIO to manipulate the
> comments contained in an SCF-formatted trace file, but I'm finding that
> Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information.
> Since SCF is a widely-accepted standard for trace files, would it be
> reasonable to include fields like "scf_comments" and "scf_header" in a
> Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them?
> Likewise, it would be great if write_seq could pull these values right
> from a SequenceTrace object rather than requiring them as arguments.
> 
> I'd be happy to help in this effort if necessary.
> 
> Thanks,
> --Nancy
> 
> *************************************
> Nancy F. Hansen, PhD nhansen at nhgri.nih.gov
> Bioinformatics Group
> NIH Intramural Sequencing Center (NISC)
> 5625 Fishers Lane
> Rockville, MD 20852
> Phone: (301) 435-1560 Fax: (301) 435-6170
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Sun Oct  1 13:05:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 12:05:25 -0500
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
References: <000001c6e3e6$81630010$15327e82@pyrimidine>	<6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net>
	<79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu>
	<451E3707.4090400@sendu.me.uk>
	<0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu>
	<84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
Message-ID: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>


On Sep 30, 2006, at 4:43 PM, Hilmar Lapp wrote:

>
> On Sep 30, 2006, at 10:57 AM, Chris Fields wrote:
>
>> There should be a failed test to let us know of the problem.  As
>> currently set up, the XEMBL server failure doesn't show up in
>> Test::Harness test summaries.  Biblio_biofetch.t had the similar
>> problems before Brian's fixes.
>
> Just keep in mind that you may not want somebody's CPAN installation
> to fail (or require a 'forced' install) just because some server
> happens to be down for maintenance.
>
> 	-hilmar

I don't think this would be a problem unless users specifically set  
BIOPERLDEBUG to 1, which is something most people don't bother with  
before installation (and probably not something we should promote for  
normal installation anyway).  So, for CPAN installation we would  
suggest that BIOPERLDEBUG be 0 or not set at all, and outline the  
reasons why.

The idea is to retain current behavior (remote DB access will not be  
run unless BIOPERLDEBUG is set to 1) and apply it to all tests  
requiring such access.  Otherwise, just those tests are skipped (and  
not the rest of the tests, which occurs currently).  If BIOPERLDEBUG  
is set, the next tests would check the URL, which passes/fails (based  
on the specific value of $@), and runs/skips tests based on the mere  
presence of $@, which indicates some URL issue.  You can do this with  
Test::More, but I'm not sure this can be done with Test.pm or  
Test::Simple.
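
A minimal sketch of that logic with Test::More follows. It is not taken from
any existing test: fetch_remote is a hypothetical stand-in for the real
database call, and the test count and messages are placeholders.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical stand-in for a remote database call; here it always dies,
# as a real query would when the server cannot be reached.
sub fetch_remote {
    die "500 Can't connect to server\n";
}

SKIP: {
    # Remote tests are opt-in: skip all three unless BIOPERLDEBUG is set.
    skip('BIOPERLDEBUG not set, skipping remote DB tests', 3)
        unless $ENV{BIOPERLDEBUG};

    my $result = eval { fetch_remote() };
    my $error  = $@;    # copy at once, before anything can reset $@

    # The connection itself is a test, so a dead URL shows up as a failure.
    ok(!$error, 'remote server responded') or diag($error);

    # The tests that need the data are skipped rather than failed.
    skip('remote server unavailable', 2) if $error;

    ok(defined $result, 'query returned a result');
    like($result, qr/\S/, 'result is not empty');
}
```

With BIOPERLDEBUG unset all three tests are skipped and the script passes.
With it set, the first test fails and only the two dependent tests are
skipped.  A real test could also look at the text of the error to decide
whether it should count as a failure at all.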

The current behavior just skips all tests based on a single failed  
URL.  Then Test::Harness, as currently set up, shows skipped tests as  
passed.  For the run I posted previously, where the XEMBL_DB.t remote DB  
tests failed, I also ran all tests (make test) and got this, which  
doesn't tell us that the remote URL failed:

-----------------------------------------

...
t/WABA.......................ok
t/XEMBL_DB...................ok
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext  
is not installed or is installed incorrectly - skipping ztr.t tests
ok
All tests successful, 5 subtests skipped.

-----------------------------------------


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Oct  1 13:17:24 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 12:17:24 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880609301610w7b838543t5f8ba7313d285915@mail.gmail.com>
References: <b99962880609271039s75cc4af4nc109cd637b5b267@mail.gmail.com>
	<7A592EAB-A869-4A6C-BFA8-F73F3DFD8F5B@gmx.net>
	<09FB1EB0-2C1C-4FCF-8339-E78556EFEFF2@uiuc.edu>
	<b99962880609280842w47401efnd6d00ff2a6e7fd98@mail.gmail.com>
	<8D75FE6D-C02D-4A86-93FA-B7256050AF11@uiuc.edu>
	<b99962880609280910i68a649fw38a4a77d514eccf@mail.gmail.com>
	<40155903-555A-4662-BCCE-38E5E3784118@uiuc.edu>
	<54E79A5F-5446-4D8E-AD26-B70894048D60@gmx.net>
	<b99962880609301444h3e0a8bd2y5d3ecb2ca9e222e6@mail.gmail.com>
	<1D69005A-DF0E-4F37-93FE-7577A32CC625@gmx.net>
	<b99962880609301610w7b838543t5f8ba7313d285915@mail.gmail.com>
Message-ID: <CAD572AC-B108-4520-8335-6B2F138905C9@uiuc.edu>

The '-w' flag on the shebang line is the source of those errors.  I  
never set it anymore on Windows due to this; I just use the 'use  
warnings' pragma.

If you run a test directly with 'perl -I. t/test.t' you can normally get  
around the '-w' that 'make test' switches on.
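
The difference is one of scope.  A small sketch, assuming it is saved as its
own script and run without -w, shows the global flag that -w controls:

```perl
#!/usr/bin/perl
use strict;
use warnings;    # lexical: covers this file only

# $^W is the global flag that -w switches on. Unlike 'use warnings', it also
# applies to any loaded module that does not set its own warnings pragma.
print "global -w flag is ", ($^W ? "on" : "off"), "\n";
```

Run as 'perl script.pl' this reports the flag as off; run as
'perl -w script.pl' it reports it as on.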

I will try running tests on bioperl-db and bioperl tomorrow on WinXP  
to confirm these.

Chris

On Sep 30, 2006, at 6:10 PM, Seth Johnson wrote:

> How do I get rid of all of the warnings for "redefined subroutines"  
> during
> the test??  It clutters the output and I can't see the errors.
>
> On 9/30/06, Hilmar Lapp <hlapp at gmx.net> wrote:
>>
>> It doesn't shed more light but it does raise an alert flag. All tests
>> are supposed to pass. The fact that they don't means the problems you
>> are seeing have nothing to do with your specific data or script.
>>
>> First off - can anyone else confirm those errors using the latest
>> Bioperl-db and Bioperl?
>>
>> Second - Seth could you run those tests individually, e.g., using
>>
>>         $ make test test_02species TEST_VERBOSE=1
>>
>> and similarly for the other tests that have failures and post the
>> output. Let's start with 02species and 03simpleseq.
>>
>>         -hilmar
>>
>> On Sep 30, 2006, at 5:44 PM, Seth Johnson wrote:
>>
>>> There are errors during the test. Here's their summary:
>>> ____________________________
>>> Failed Test     Stat Wstat Total Fail  Failed  List of Failed
>>> -------------------------------------------------------------
>>> t\02species.t                 65    2   3.08%  63 65
>>> t\03simpleseq.t    1   256    59  106 179.66%  7-59
>>> t\04swiss.t                   52   14  26.92%  25 27-34 38-42
>>> t\12ontology.t     2   512   738 1471 199.32%  3-738
>>> t\16obda.t                    12    3  25.00%  10-12
>>> ____________________________
>>>
>>> May be that can shed some light on the problem?!?!
>>>
>>> On 9/29/06, Hilmar Lapp < hlapp at gmx.net> wrote:This may in fact be
>>> a knock-on effect of the fixes? <sigh>
>>>
>>> Seth, did you run the test suite that comes with bioperl-db, and did
>>> you get any errors?
>>>
>>>         -hilmar
>>>
>>> On Sep 28, 2006, at 2:26 PM, Chris Fields wrote:
>>>
>>>> Seth,
>>>>
>>>> The organism issue is a bug and has been reported, though I thought
>>>> it was fixed.
>>>>
>>>> The lack of the date and the version is a bit odd, but there have
>>>> been a lot of changes lately to bioperl-live (core bioperl in CVS),
>>>> and a few to bioperl-db.  How old is your bioperl and bioperl-db
>>>> installation.  Hilmar, any additional thoughts?
>>>>
>>>> Chris
>>>>
>>>> On Sep 28, 2006, at 11:10 AM, Seth Johnson wrote:
>>>>
>>>>> Thank you.  That takes care of that, however, I do have another
>>>>> gripe.  When
>>>>> running my script, quoted before, with "my $out =
>>>>> Bio::SeqIO->newFh('-format' => 'genbank');", I have several key
>>>>> pieces of
>>>>> information missing.  The most important one is the version
>>>>> number.  There's
>>>>> also a date missing, and source organism name is corrupted.
>>>>> Here's what I
>>>>> get:
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>> LOCUS       NM_014580               2145 bp    dna     linear    
>>>>> UNK
>>>>> DEFINITION  Homo sapiens solute carrier family 2, (facilitated
>>>>> glucose
>>>>>             transporter) member 8 (SLC2A8), mRNA.
>>>>> ACCESSION   NM_014580
>>>>> SOURCE      sapiens.
>>>>>   ORGANISM  sapiens
>>>>>             Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa;
>>>>> Bilateria;
>>>>>             Coelomata; Deuterostomia; Chordata; Craniata;
>>> Vertebrata;
>>>>>             Gnathostomata; Teleostomi; Euteleostomi;  
>>>>> Sarcopterygii;
>>>>> Tetrapoda;
>>>>>             Amniota; Mammalia; Theria; Eutheria; Euarchontoglires;
>>>>> Primates;
>>>>>             Haplorrhini; Simiiformes; Catarrhini; Hominoidea;
>>>>> Hominidae;
>>>>>             Homo/Pan/Gorilla group; Homo.
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>> All of the missing information is stored in BioSQL and
>>>>> theoretically should
>>>>> be in the outpu. Here's how NCBI genbank file looks:
>>>>>
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>> LOCUS       NM_014580               2145 bp    mRNA    linear
>>>>> PRI 17-OCT-2005
>>>>> DEFINITION  Homo sapiens solute carrier family 2, (facilitated
>>>>> glucose
>>>>>             transporter) member 8 (SLC2A8), mRNA.
>>>>> ACCESSION   NM_014580
>>>>> VERSION     NM_014580.3  GI:51870928
>>>>> KEYWORDS    .
>>>>> SOURCE      Homo sapiens (human)
>>>>>   ORGANISM  Homo sapiens
>>>>> <http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=9606 >
>>>>>             Eukaryota; Metazoa; Chordata; Craniata; Vertebrata;
>>>>> Euteleostomi;
>>>>>             Mammalia; Eutheria; Euarchontoglires; Primates;
>>>>> Haplorrhini;
>>>>>             Catarrhini; Hominidae; Homo.
>>>>>
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>>
>>>>> On 9/28/06, Chris Fields <cjfields at uiuc.edu> wrote:
>>>>>>
>>>>>> Those are from the excessively paranoid '-w' flag on the shebang
>>>>>> line.  If you remove the flag but add the 'use warnings' pragma
>>> the
>>>>>> 'subroutine x redefined' warnings go away.  This, BTW, is one
>>> of the
>>>>>> quirks of the ActivePerl distribution; other OSs don't have the
>>> same
>>>>>> problem.
>>>>>>
>>>>>> The 'solution' described on that page is actually a workaround,
>>>>>> not a
>>>>>> bugfix.  It causes problems with stack traces with error handling
>>>>>> but
>>>>>> seems harmless beyond that.  I haven't been able to find a
>>>>>> satisfactory fix which works on all OS's.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>
>>>>>> On Sep 28, 2006, at 10:42 AM, Seth Johnson wrote:
>>>>>>
>>>>>>> This is under Windows, but using ActiveState Komodo 3.5 and  
>>>>>>> their
>>>>>>> latest Perl for Windows and latest BioPerl & BioPerl-db from  
>>>>>>> CVS.
>>>>>>>
>>>>>>> I actually just stumbled upon a solution.  It's described in the
>>>>>>> "Installing Bioperl on Windows" by adding a comma after
>>> $class: in
>>>>>>> Bio::Root::Root throw() subroutine.  Thanks for hinting me about
>>>>>>> what I run it on.
>>>>>>>
>>>>>>> The code works now, BUT it spews whole bunch of warnings about
>>>>>>> "Subroutine .... redefined":
>>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\BioEntry
>>>>>>> .pm line 88.
>>>>>>> Subroutine object_id redefined at c:/Perl/site/lib/Bio
>>> \BioEntry.pm
>>>>>>> line 128.
>>>>>>> Subroutine version redefined at c:/Perl/site/lib/Bio\BioEntry.pm
>>>>>>> line 150.
>>>>>>> Subroutine authority redefined at c:/Perl/site/lib/Bio
>>> \BioEntry.pm
>>>>>>> line 171.
>>>>>>> Subroutine namespace redefined at c:/Perl/site/lib/Bio
>>> \BioEntry.pm
>>>>>>> line 192.
>>>>>>> Subroutine display_name redefined at c:/Perl/site/lib/Bio
>>>>>>> \BioEntry.pm line 217.
>>>>>>> Subroutine description redefined at c:/Perl/site/lib/Bio
>>>>>>> \BioEntry.pm line 241.
>>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\Root.pm
>>> line
>>>>>>> 201.
>>>>>>> Subroutine verbose redefined at c:/Perl/site/lib/Bio\Root 
>>>>>>> \Root.pm
>>>>>>> line 234.
>>>>>>> Subroutine _register_for_cleanup redefined at c:/Perl/site/lib/
>>> Bio
>>>>>>> \Root\Root.pm line 246.
>>>>>>> Subroutine _unregister_for_cleanup redefined at c:/Perl/site/ 
>>>>>>> lib/
>>>>>>> Bio
>>>>>>> \Root\Root.pm line 256.
>>>>>>> Subroutine _cleanup_methods redefined at c:/Perl/site/lib/Bio
>>> \Root
>>>>>>> \Root.pm line 263.
>>>>>>> Subroutine throw redefined at c:/Perl/site/lib/Bio\Root\Root.pm
>>>>>>> line 316.
>>>>>>> Subroutine debug redefined at c:/Perl/site/lib/Bio\Root\Root.pm
>>>>>>> line 379.
>>>>>>> Subroutine _load_module redefined at c:/Perl/site/lib/Bio\Root
>>>>>>> \Root.pm line 398.
>>>>>>> Subroutine DESTROY redefined at c:/Perl/site/lib/Bio\Root 
>>>>>>> \Root.pm
>>>>>>> line 426.
>>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\RootI.pm
>>> line
>>>>>>> 117.
>>>>>>> Subroutine _initialize redefined at c:/Perl/site/lib/Bio\Root
>>>>>>> \RootI.pm line 128.
>>>>>>> ...
>>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>>
>>>>>>>
>>>>>>> On 9/28/06, Chris Fields <cjfields at uiuc.edu> wrote: I had
>>> problems
>>>>>>> with bioperl-db on native WinXP (not cygwin), but I
>>>>>>> did manage to get it running in cygwin with some effort.  The
>>> issue
>>>>>>> on native WinXP was related to Bio::Root::Root::throw(), though.
>>>>>>>
>>>>>>> There is a bug and workaround filed on Bugzilla, but I haven't
>>>>>>> worked
>>>>>>> on it in a while (and the workaround has some problems as
>>> well).  I
>>>>>>> may try running it again to see what happens.
>>>>>>>
>>>>>>> http://bugzilla.open-bio.org/show_bug.cgi?id=1938
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On Sep 28, 2006, at 9:04 AM, Hilmar Lapp wrote:
>>>>>>>
>>>>>>>> Very odd. This is under Windows, presumably using Cygwin?
>>>>>>>>
>>>>>>>> The method Bio::Root::Root::throw() clearly exists, and
>>>>>>>> PersistentObject inherits from it. The exception it was
>>> trying to
>>>>>>>> throw has nothing to do with failure or success to find the
>>>>>>>> database
>>>>>>>> row (actually it did succeed since otherwise it wouldn't
>>> construct
>>>>>>>> the object) but with dynamically loading a class, presumably
>>>>>>>> Bio::DB::Persistent::Seq.
>>>>>>>>
>>>>>>>> Are you using the 1.5.x release of bioperl?
>>>>>>>>
>>>>>>>> Does anyone on the list have any experience with these sorts of
>>>>>>>> things on Windows?
>>>>>>>>
>>>>>>>> (Seth, I've moved this thread to the bioperl list, since  
>>>>>>>> this is
>>>>>>> what
>>>>>>>> the problem is about.)
>>>>>>>>
>>>>>>>>       -hilmar
>>>>>>>>
>>>>>>>> On Sep 27, 2006, at 1:39 PM, Seth Johnson wrote:
>>>>>>>>
>>>>>>>>> Hello guys,
>>>>>>>>>
>>>>>>>>> I successfully populated the biosql database, thanks to you.
>>>>>>>>> Now,
>>>>>>>>> I'm
>>>>>>>>> trying to retrieve a sequence from it following the example
>>> from
>>>>>>>>> BOSC2003
>>>>>>>>> slides and ran into uninformative error (at least to me it
>>>>>>>>> doesn't
>>>>>>>>> mean
>>>>>>>>> anyting).  I suspect that I'm missing something and hope you
>>> can
>>>>>>>>> point me in
>>>>>>>>> the right direction.  Here's my source code:
>>>>>>>>>
>>>>>>>
>>> -------------------------------------------------------------------
>>>>>>> --
>>>>>>>>> -
>>>>>>>>> ---
>>>>>>>>> #!/usr/bin/perl -w
>>>>>>>>> use strict;
>>>>>>>>> use warnings;
>>>>>>>>>
>>>>>>>>> use Bio::Seq;
>>>>>>>>> use Bio::Seq::SeqFactory;
>>>>>>>>> use Bio::DB::SimpleDBContext;
>>>>>>>>> use Bio::DB::BioDB;
>>>>>>>>>
>>>>>>>>> my $dbc = Bio::DB::SimpleDBContext->new(
>>>>>>>>>     -driver => 'mysql',
>>>>>>>>>     -dbname => 'BioSQL_1',
>>>>>>>>>     -host => ' 192.168.1.3',
>>>>>>>>>     -user => 'xxxxx',
>>>>>>>>>     -pass => 'xxxxxx'
>>>>>>>>> );
>>>>>>>>>
>>>>>>>>> my $db = Bio::DB::BioDB->new(-database  => 'biosql',
>>>>>>>>>                             -dbcontext => $dbc);
>>>>>>>>>
>>>>>>>>> my $seq = Bio::Seq->new(-accession_number => 'NM_014580', -
>>>>>>>>> namespace =>
>>>>>>>>> 'refseq_H_sapiens');
>>>>>>>>> my $seqfact = Bio::Seq::SeqFactory->new(-type => 'Bio::Seq');
>>>>>>>>> my $adp = $db->get_object_adaptor($seq);
>>>>>>>>> my $dbseq = $adp->find_by_unique_key($seq, -obj_factory =>
>>>>>>> $seqfact);
>>>>>>>>>
>>>>>>>>> my $out = Bio::SeqIO->newFh('-format' => 'EMBL');
>>>>>>>>> print $out $dbseq;
>>>>>>>>>
>>>>>>>>> exit;
>>>>>>>>>
>>> -----------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> Just when the "find_by_unique_key" function is executed I
>>> get the
>>>>>>>>> following
>>>>>>>>> error:
>>>>>>>>>
>>>>>>>>> ================================
>>>>>>>>> Undefined subroutine &Bio::Root::Root::throw called at
>>>>>>>>> c:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm line
>>> 199.
>>>>>>>>> ================================
>>>>>>>>>
>>>>>>>>> The sequence does exist in the database. I checked that.  Any
>>>>>>>>> ideas???
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Seth Johnson
>>>>>>>>> Senior Bioinformatics Associate
>>>>>>>>> _______________________________________________
>>>>>>>>> BioSQL-l mailing list
>>>>>>>>> BioSQL-l at lists.open-bio.org
>>>>>>>>> http://lists.open-bio.org/mailman/listinfo/biosql-l
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ===========================================================
>>>>>>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>>>>>>> ===========================================================
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Bioperl-l mailing list
>>>>>>>> Bioperl-l at lists.open-bio.org
>>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>>
>>>>>>> Christopher Fields
>>>>>>> Postdoctoral Researcher
>>>>>>> Lab of Dr. Robert Switzer
>>>>>>> Dept of Biochemistry
>>>>>>> University of Illinois Urbana-Champaign
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>>
>>>>>>>
>>>>>>> Seth Johnson
>>>>>>> Senior Bioinformatics Associate
>>>>>>>
>>>>>>> Ph: (202) 470-0900
>>>>>>> Fx: (775) 251-0358
>>>>>>
>>>>>> Christopher Fields
>>>>>> Postdoctoral Researcher
>>>>>> Lab of Dr. Robert Switzer
>>>>>> Dept of Biochemistry
>>>>>> University of Illinois Urbana-Champaign
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>>
>>>>>
>>>>> Seth Johnson
>>>>> Senior Bioinformatics Associate
>>>>>
>>>>> Ph: (202) 470-0900
>>>>> Fx: (775) 251-0358
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>> Christopher Fields
>>>> Postdoctoral Researcher
>>>> Lab of Dr. Robert Switzer
>>>> Dept of Biochemistry
>>>> University of Illinois Urbana-Champaign
>>>>
>>>>
>>>>
>>>
>>> --
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Best Regards,
>>>
>>>
>>> Seth Johnson
>>> Senior Bioinformatics Associate
>>>
>>> Ph: (202) 470-0900
>>> Fx: (775) 251-0358
>>
>> --
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>>
>
>
> -- 
> Best Regards,
>
>
> Seth Johnson
> Senior Bioinformatics Associate
>
> Ph: (202) 470-0900
> Fx: (775) 251-0358
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From osborne1 at optonline.net  Sun Oct  1 17:49:47 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Sun, 01 Oct 2006 17:49:47 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <20061001183214.GB12075@iucha.net>
Message-ID: <C145B03B.A8A5%osborne1@optonline.net>

Florin,

This is fixed in CVS now. What had happened is that the DIP file had some
minimal protein (node) entries where the only id available was DIP's
internal identifier. Not ideal to have to use these as accessions but
there's no other choice.

Thank you for the note, and in the future write to bioperl-l since there may
be others who are interested in hearing about what you've encountered.

Brian O.


On 10/1/06 2:32 PM, "Florin Iucha" <florin at iucha.net> wrote:

> Hello,
> 
> I have downloaded a CVS snapshot [1] of your module, bioperl-network, and
> I am using it to read the 20060402 edition release of the DIP [2] dataset.
> 
> Starting with the simple program you show in the man page:
> 
>    my $io = Bio::Network::IO->new(-format => 'psi',
>                                   -file   => $ARGV[0]);
> 
>    my $network = $io->next_network;
> 
> I get 772 instances of:
> 
>    Use of uninitialized value in string eq at
>    /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 326.
> 
> I don't know if it is just an annoyance or something bad, so you might
> want to take a look at it.
> 
> Thank you for your work,
> florin
> 
> [1] http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-network/
> [2] http://dip.doe-mbi.ucla.edu/


From osborne1 at optonline.net  Sun Oct  1 17:56:39 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Sun, 01 Oct 2006 17:56:39 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <20061001211844.GC12075@iucha.net>
Message-ID: <C145B1D7.A8A8%osborne1@optonline.net>

Florin,

I'm not seeing any segmentation fault using the same file you're using as
input (dip20060402.mif). I'm assuming you don't see this error when you use
smaller files as input, like those in the t/data directory.

When I watch the script in top I see Perl using about 135Mb (RSIZE) right
before the script exits. How much memory do you use?

Thank you for the note, and in the future write to bioperl-l since there may
be others who are interested in hearing about what you've encountered.

Brian O.


On 10/1/06 5:18 PM, "Florin Iucha" <florin at iucha.net> wrote:

> On Sun, Oct 01, 2006 at 01:32:14PM -0500, Florin Iucha wrote:
>> I have downloaded a CVS snapshot [1] of your module, bioperl-network, and
>> I am using it to read the 20060402 edition release of the DIP [2] dataset.
> 
> Using the attached script, I am getting a segmentation fault at the
> end, right after printing "That's all, Folks!"  Maybe some cleanup is
> going off in a wrong direction.
> 
> florin


From florin at iucha.net  Sun Oct  1 20:24:03 2006
From: florin at iucha.net (Florin Iucha)
Date: Sun, 1 Oct 2006 19:24:03 -0500
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <C145B1D7.A8A8%osborne1@optonline.net>
References: <20061001211844.GC12075@iucha.net>
	<C145B1D7.A8A8%osborne1@optonline.net>
Message-ID: <20061002002403.GD12075@iucha.net>

On Sun, Oct 01, 2006 at 05:56:39PM -0400, Brian Osborne wrote:
> I'm not seeing any segmentation fault using the same file you're using as
> input (dip20060402.mif). I'm assuming you don't see this error when you use
> smaller files as input, like those in the t/data directory.

The t/data files are fine.

Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the
MINT [1] database does not produce the crash.  It has a new warning, however:

   Can't call method "text" on an undefined value at
   /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290.

> When I watch the script in top I see Perl using about 135Mb (RSIZE) right
> before the script exits. How much memory do you use?

"ps ux" tells me VSZ = 272788 and RSZ = 254992. This is on x86-64 with
64 bit perl.  The box has 2 GB of physical memory so these numbers
don't seem to be a concern.

> Thank you for the note, and in the future write to bioperl-l since there may
> be others who are interested in hearing about what you've encountered.

D'oh! You have the list address loud and clear in three places, but I got
your contact info from the AUTHORS.  Will use the proper channel from now
on!

Thanks,
florin

[1] ftp://mint.bio.uniroma2.it/pub/release/psi1/

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra

From cjfields at uiuc.edu  Mon Oct  2 00:35:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 23:35:22 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880609301635w421fae0er3497ba655679f0bc@mail.gmail.com>
Message-ID: <000001c6e5dc$2eceabe0$15327e82@pyrimidine>

Seth,

What version of MySQL and perl are you using?  I'm using MySQL 5.0.18 (but
am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819.

I ran into a few problems with bioperl-db tests which were unrelated to the
ones below, but I'm wondering if it is a difference in MySQL versions.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Seth Johnson
> Sent: Saturday, September 30, 2006 6:35 PM
> To: Hilmar Lapp
> Cc: Chris Fields; Bioperl List
> Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> 
> Here're complete test details:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

...

> FAILED tests 10-12
>     Failed 3/12 tests, 75.00% okay
> Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> --------------------------------------------------------------------------
> -----
> t\02species.t                 65    2   3.08%  63 65
> t\03simpleseq.t    1   256    59  106 179.66%  7-59
> t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> t\12ontology.t     2   512   738 1471 199.32%  3-738
> t\16obda.t                    12    3  25.00%  10-12
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From torsten.seemann at infotech.monash.edu.au  Mon Oct  2 02:06:50 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Mon, 02 Oct 2006 16:06:50 +1000
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
References: <451C8ED8.2060003@infotech.monash.edu.au>
	<451CC40D.2030401@sendu.me.uk>
	<2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
Message-ID: <4520AC7A.1050009@infotech.monash.edu.au>


 >>> I have removed all use/@ISA Bio::Root::Object references from
 >>> bioperl-live, except for those in Bio::Root::* itself:

 >> So I'd say they're both relics that can be removed. In fact I was
 >> planning on getting rid off all references to both of these modules
 >> before you did, so thanks! :)

> I think they can go. It's probably a pre-1.0 deprecation that somehow  
> was never followed through on.

Today I did a fresh CVS checkout of bioperl-live, and deleted the 
following modules and tests, and all tests passed with BIOPERLDEBUG=0

     * Bio::Root::Err
     * Bio::Root::Global
     * Bio::Root::IOManager
     * Bio::Root::Object
     * Bio::Root::Storable
     * Bio::Root::Utilities  # may be used by third parties?
     * Bio::Root::Vector
     * Bio::Root::Xref
     * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
     * t/RootStorable.t

Should we schedule them for deprecation, or deprecate them immediately, 
since Hilmar suggested they were meant to be deprecated long ago?

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From bix at sendu.me.uk  Mon Oct  2 05:40:02 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 10:40:02 +0100
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
References: <000001c6e3e6$81630010$15327e82@pyrimidine>	<6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net>	<79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu>	<451E3707.4090400@sendu.me.uk>	<0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu>	<84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
	<3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
Message-ID: <4520DE72.4000603@sendu.me.uk>

Chris Fields wrote:
>
> The idea is to retain current behavior (remote DB access will not be  
> run unless BIOPERLDEBUG is set to 1) and apply it to all tests  
> requiring such access.  Otherwise, just those tests are skipped (and  
> not the rest of the tests, which occurs currently).  If BIOPERLDEBUG  
> is set, the next tests would check the URL, which passes/fails (based  
> on the specific value of $@), and runs/skips tests based on the mere  
> presence of $@, which indicates some URL issue.  You can do this with  
> Test::More, but I'm not sure this can be done with Test.pm or  
> Test::Simple.

Firstly, BIOPERLDEBUG should not be abused; it should be used only when 
you want to see extra debugging messages. There should be another 
variable that you can set to choose if network-requiring tests are run, 
and it should also be a configurable choice when you run perl Makefile.PL.

(But changing this isn't going to happen for 1.5.2)

When the server problem is ambiguous we should not fail the test. Just 
make the skip message visible and pass all ok...


> The current behavior just skips all tests based on a single failed  
> URL.  Then, Test::Harness, as currently set, shows skipped tests as  
> passed.  The last run I posted previously where XEMBL_DB.t remote DB  
> tests failed, I also ran all tests (make test) and get this, which  
> doesn't tell us that the remote URL failed:
> 
> -----------------------------------------
> 
> ...
> t/WABA.......................ok
> t/XEMBL_DB...................ok
> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext  
> is not installed or is installed incorrectly - skipping ztr.t tests
> ok
> All tests successful, 5 subtests skipped.

All you have to do to make it visible is start the skip message with the 
word 'Skip':

skip('Skip server may be down',1);

...
t/WABA.......................ok
t/XEMBL_DB...................ok
         1/9 skipped: server may be down
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is
not installed or is installed incorrectly - skipping ztr.t tests
t/ztr........................ok


It's nicer when using Test::More.
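
To illustrate the approach discussed in this thread, here is a minimal
sketch of a Test::More SKIP block that skips only the network-dependent
tests. It assumes a separate switch for network tests; the variable name
BIOPERL_NETWORK_TESTS and the fetch_remote_sequence() helper are
hypothetical and are not part of BioPerl.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical switch, kept separate from BIOPERLDEBUG.
sub network_tests_enabled {
    my $flag = $ENV{BIOPERL_NETWORK_TESTS};
    return (defined $flag && $flag eq '1') ? 1 : 0;
}

# Hypothetical stand-in for a real remote query; this sketch always
# fails so that it can run without network access.
sub fetch_remote_sequence {
    my ($id) = @_;
    die "no network access in this sketch\n";
}

# A local test that never needs the network always runs.
ok(1, 'local test that needs no network');

SKIP: {
    # Skip only the two remote tests, not the whole file.
    skip 'Skip network tests not requested', 2
        unless network_tests_enabled();

    my $seq = eval { fetch_remote_sequence('EXAMPLE_ID') };
    if (my $err = $@) {
        chomp $err;
        # An ambiguous server problem is reported as a skip, not a failure.
        skip "Skip server may be down ($err)", 2;
    }

    ok(defined $seq,     'got a sequence back');
    ok(length($seq) > 0, 'sequence is not empty');
}
```

Whichever branch is taken, the plan of three tests is satisfied, so a
server outage shows up as skipped subtests rather than as a failed
install.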


From bix at sendu.me.uk  Mon Oct  2 05:55:27 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 10:55:27 +0100
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au>
References: <451C8ED8.2060003@infotech.monash.edu.au>	<451CC40D.2030401@sendu.me.uk>	<2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
	<4520AC7A.1050009@infotech.monash.edu.au>
Message-ID: <4520E20F.6040406@sendu.me.uk>

Torsten Seemann wrote:
>  >>> I have removed all use/@ISA Bio::Root::Object references from
>  >>> bioperl-live, except for those in Bio::Root::* itself:
> 
>  >> So I'd say they're both relics that can be removed. In fact I was
>  >> planning on getting rid of all references to both of these modules
>  >> before you did, so thanks! :)
> 
>> I think they can go. It's probably a pre-1.0 deprecation that somehow  
>> was never followed through on.
> 
> Today I did a fresh CVS checkout of bioperl-live, and deleted the 
> following modules and tests, and all tests passed with BIOPERLDEBUG=0
> 
>      * Bio::Root::Err
>      * Bio::Root::Global
>      * Bio::Root::IOManager
>      * Bio::Root::Object
>      * Bio::Root::Storable
>      * Bio::Root::Utilities  # may be used by third parties?
>      * Bio::Root::Vector
>      * Bio::Root::Xref
>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
>      * t/RootStorable.t
> 
> Should we schedule for deprecation, or deprecate immediately as Hilmar 
> suggested they were meant to be deprecated long ago ?

I'm happy to get rid of them all straight away. Does anyone object?


From florin at iucha.net  Sun Oct  1 21:40:07 2006
From: florin at iucha.net (Florin Iucha)
Date: Sun, 1 Oct 2006 20:40:07 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on
	AMD64
Message-ID: <20061002014007.GG12075@iucha.net>

Hello,

I am trying to install bioperl-network from CVS.  I found this to
require bioperl from CVS, which requires bioperl-ext from CVS.
I have compiled and installed io_lib 1.10.1.

After running "perl Makefile.PL; make test" in bioperl-ext I see a lot of 
sources being compiled, then:

cc -c  -I./libs -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O2   -DVERSION=\"1.5.1\" -DXS_VERSION=\"1.5.1\" -fPIC "-I/usr/lib/perl/5.8/CORE"  -DPOSIX -DNOERROR Align.c
Running Mkbootstrap for Bio::Ext::Align ()
chmod 644 Align.bs
rm -f ../blib/arch/auto/Bio/Ext/Align/Align.so
cc  -shared -L/usr/local/lib Align.o  -o ../blib/arch/auto/Bio/Ext/Align/Align.so libs/libsw.a  \
           -lm          \

/usr/bin/ld: libs/libsw.a(aln.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
libs/libsw.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[1]: *** [../blib/arch/auto/Bio/Ext/Align/Align.so] Error 1
make[1]: Leaving directory `/scratch/dmbio/tools/bioperl-ext/Bio/Ext/Align'
make: *** [subdirs] Error 2

This is on a Debian AMD64 box:

florin at zeus $ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release x86_64-linux-gnu
Thread model: posix
gcc version 4.1.2 20060901 (prerelease) (Debian 4.1.1-13)
florin at zeus $ perl -V
Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.6.16-1-vserver-amd64-k8, archname=x86_64-linux-gnu-thread-multi
    uname='linux excelsior 2.6.16-1-vserver-amd64-k8 #2 smp tue apr 4 03:40:49 utc 2006 x86_64 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.8 -Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=define uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion='', gccversion='4.1.2 20060729 (prerelease) (Debian 4.1.1-10)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.3.6.so, so=so, useshrplib=true, libperl=libperl.so.5.8.8
    gnulibc_version='2.3.6'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'


Characteristics of this binary (from libperl):
  Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT
                        PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_64_BIT_ALL
                        USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES
                        USE_PERLIO USE_REENTRANT_API

The compiler command line for aln.o is lacking -fPIC:

cc -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN
-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -DPOSIX -DNOERROR   -c -o aln.o aln.c

Adding -fPIC to the CCFLAGS variable in Bio/Ext/Align/Makefile and
Makefile seems to take the build further, but it fails with a similar
error in Bio/SeqIO/staden/_Inline/build/Bio/SeqIO/staden/read. That
Makefile seems to be regenerated every time I run 'make test' in the
top level directory.

The error in ../staden/read is:

rm -f blib/arch/auto/Bio/SeqIO/staden/read/read.so
cc  -shared -L/usr/local/lib read.o  -o blib/arch/auto/Bio/SeqIO/staden/read/read.so    \
           -L/usr/local/lib -lread -lz          \

/usr/bin/ld: /usr/local/lib/libread.a(libread_a-Read.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
/usr/local/lib/libread.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [blib/arch/auto/Bio/SeqIO/staden/read/read.so] Error 1

So, the questions appear to be:
   - should "-fPIC" be appended to CFLAGS in the generated Makefiles?
   - is there anything wrong with io_lib flags?
   - has anybody built bioperl-ext on AMD64?

I can help with debugging or testing if given a gentle nudge in the right
direction, but I have little experience with the interactions between perl
and static libraries on 64 bit.

Thanks,
florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra

From bix at sendu.me.uk  Mon Oct  2 06:52:47 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 11:52:47 +0100
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
 on	AMD64
In-Reply-To: <20061002014007.GG12075@iucha.net>
References: <20061002014007.GG12075@iucha.net>
Message-ID: <4520EF7F.40908@sendu.me.uk>

Florin Iucha wrote:
> Hello,
> 
> I am trying to install bioperl-network from CVS.  I found this to
> require bioperl from CVS, which requires bioperl-ext from CVS.

I can't help with the compile problems you encountered (other than to 
say I also have problems under AMD64), but from where did you get the 
idea that bioperl (live/core) requires bioperl-ext? It doesn't, though 
recent changes to Makefile.PL may give that impression...


From cjfields at uiuc.edu  Mon Oct  2 08:26:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 07:26:57 -0500
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <4520DE72.4000603@sendu.me.uk>
References: <000001c6e3e6$81630010$15327e82@pyrimidine>	<6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net>	<79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu>	<451E3707.4090400@sendu.me.uk>	<0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu>	<84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
	<3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
	<4520DE72.4000603@sendu.me.uk>
Message-ID: <DAAC7FDC-0C03-4345-9E09-DBF04D521628@uiuc.edu>


On Oct 2, 2006, at 4:40 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>> The idea is to retain current behavior (remote DB access will not be
>> run unless BIOPERLDEBUG is set to 1) and apply it to all tests
>> requiring such access.  Otherwise, just those tests are skipped (and
>> not the rest of the tests, which occurs currently).  If BIOPERLDEBUG
>> is set, the next tests would check the URL, which passes/fails (based
>> on the specific value of $@), and runs/skips tests based on the mere
>> presence of $@, which indicates some URL issue.  You can do this with
>> Test::More, but I'm not sure this can be done with Test.pm or
>> Test::Simple.
>
> Firstly, BIOPERLDEBUG should not be abused; it should be used only when
> you want to see extra debugging messages. There should be another
> variable that you can set to choose if network-requiring tests are run,
> and it should also be a configurable choice when you run perl Makefile.PL.
>
> (But changing this isn't going to happen for 1.5.2)
>
> When the server problem is ambiguous we should not fail the test. Just
> make the skip message visible and pass all ok...

I agree, as well as with your assessment of BIOPERLDEBUG (which I  
alluded to in a previous post).  Torsten suggested creating a new  
env. variable for network tests.

It's obvious this won't be done before 1.5.2, but we can make plans  
towards the next release.

>> The current behavior just skips all tests based on a single failed
>> URL.  Then, Test::Harness, as currently set, shows skipped tests as
>> passed.  The last run I posted previously where XEMBL_DB.t remote DB
>> tests failed, I also ran all tests (make test) and get this, which
>> doesn't tell us that the remote URL failed:
>>
>> -----------------------------------------
>>
>> ...
>> t/WABA.......................ok
>> t/XEMBL_DB...................ok
>> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext
>> is not installed or is installed incorrectly - skipping ztr.t tests
>> ok
>> All tests successful, 5 subtests skipped.
>
> All you have to do to make it visible is start the skip message with the
> word 'Skip':
>
> skip('Skip server may be down',1);
>
> ...
> t/WABA.......................ok
>
> t/XEMBL_DB...................ok
>
>          1/9 skipped: server may be down
> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is
> not installed or is installed incorrectly - skipping ztr.t tests
> t/ztr........................ok
>
>
> It's nicer when using Test::More.

Okay, if Test::Harness picks that up it would be okay.  We could use  
skip blocks to skip subsets of tests that require remote access (like  
SeqFeature.t) as opposed to skipping all tests.

I think we want to avoid promoting running tests with BIOPERLDEBUG  
(or similar) set for everyday installation anyway (such as from CPAN,  
as Hilmar points out).  It's not something everybody installing a new  
BioPerl should be doing unless they run into problems.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From florin at iucha.net  Mon Oct  2 08:15:06 2006
From: florin at iucha.net (Florin Iucha)
Date: Mon, 2 Oct 2006 07:15:06 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
	on	AMD64
In-Reply-To: <4520EF7F.40908@sendu.me.uk>
References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk>
Message-ID: <20061002121506.GB14409@iucha.net>

On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
> Florin Iucha wrote:
> > I am trying to install bioperl-network from CVS.  I found this to
> > require bioperl from CVS, which requires bioperl-ext from CVS.
> 
> I can't help with the compile problems you encountered (other than to 
> say I also have problems under AMD64), but from where did you get the 
> idea that bioperl (live/core) requires bioperl-ext? It doesn't, though 
> recent changes to Makefile.PL may give that impression...

The tests for bioperl-live mention in some places that 'this
test has been skipped since $foo is not available' and I found the
'foos' in bioperl-ext.

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra

From bix at sendu.me.uk  Mon Oct  2 10:05:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 15:05:11 +0100
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
 on	AMD64
In-Reply-To: <20061002121506.GB14409@iucha.net>
References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk>
	<20061002121506.GB14409@iucha.net>
Message-ID: <45211C97.2060800@sendu.me.uk>

Florin Iucha wrote:
> On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
>> Florin Iucha wrote:
>>> I am trying to install bioperl-network from CVS.  I found this to
>>> require bioperl from CVS, which requires bioperl-ext from CVS.
>> I can't help with the compile problems you encountered (other than to 
>> say I also have problems under AMD64), but from where did you get the 
>> idea that bioperl (live/core) requires bioperl-ext? It doesn't, though 
>> recent changes to Makefile.PL may give that impression...
> 
> Running the tests for bioperl-live mention in some places that 'this
> test has been skipped since $foo is not available' and I found the
> 'foos' in bioperl-ext.

Right, yes. The idea is, you'd only need to install bioperl-ext if you 
wanted to use the modules that the complaining tests test.
So if none of the things that were skipped matter to you, don't install ext.

I guess this needs to be clarified in documentation somewhere.


From cjfields at uiuc.edu  Mon Oct  2 10:13:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:13:56 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au>
Message-ID: <001801c6e62d$02c883d0$15327e82@pyrimidine>


>  >>> I have removed all use/@ISA Bio::Root::Object references from
>  >>> bioperl-live, except for those in Bio::Root::* itself:
> 
>  >> So I'd say they're both relics that can be removed. In fact I was
>  >> planning on getting rid of all references to both of these modules
>  >> before you did, so thanks! :)
> 
> > I think they can go. It's probably a pre-1.0 deprecation that somehow
> > was never followed through on.
> 
> Today I did a fresh CVS checkout of bioperl-live, and deleted the
> following modules and tests, and all tests passed with BIOPERLDEBUG=0
> 
>      * Bio::Root::Err
>      * Bio::Root::Global
>      * Bio::Root::IOManager
>      * Bio::Root::Object
>      * Bio::Root::Storable
>      * Bio::Root::Utilities  # may be used by third parties?
>      * Bio::Root::Vector
>      * Bio::Root::Xref
>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
>      * t/RootStorable.t
> 
> Should we schedule for deprecation, or deprecate immediately as Hilmar
> suggested they were meant to be deprecated long ago ?

I vote for quick deprecation; I had also noticed that these were superfluous
and added them as possible deprecations to the wiki page.  However, we need
to be careful about that 'third-party use' caveat you have for
Bio::Root::Utilities; there's another one with Bio::Root::Storable and
Ensembl:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2924/focus=2924

and it seems to have its users:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/8242/focus=8242

The others (including Bio::Root::Utilities) haven't had any major threads on
the mail lists in a very long time.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


> --
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Mon Oct  2 10:16:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:16:31 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of
	bioperl-exton	AMD64
In-Reply-To: <20061002121506.GB14409@iucha.net>
Message-ID: <001901c6e62d$5c4fac80$15327e82@pyrimidine>

They're not absolutely necessary; the tests are skipped w/o failure because
bioperl-ext is optional.  These are only necessary if you want the ability
to read sequence trace files.  

BTW, you might have a rough time trying to install bioperl-ext depending
on your platform.  Note the following bug report:

http://bugzilla.open-bio.org/show_bug.cgi?id=2074

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Florin Iucha
> Sent: Monday, October 02, 2006 7:15 AM
> To: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-
> exton AMD64
> 
> On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
> > Florin Iucha wrote:
> > > I am trying to install bioperl-network from CVS.  I found this to
> > > require bioperl from CVS, which requires bioperl-ext from CVS.
> >
> > I can't help with the compile problems you encountered (other than to
> > say I also have problems under AMD64), but from where did you get the
> > idea that bioperl (live/core) requires bioperl-ext? It doesn't, though
> > recent changes to Makefile.PL may give that impression...
> 
> Running the tests for bioperl-live mention in some places that 'this
> test has been skipped since $foo is not available' and I found the
> 'foos' in bioperl-ext.
> 
> florin
> 
> --
> If we wish to count lines of code, we should not regard them as lines
> produced but as lines spent.                       -- Edsger Dijkstra


From osborne1 at optonline.net  Mon Oct  2 10:14:13 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 10:14:13 -0400
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520E20F.6040406@sendu.me.uk>
Message-ID: <C14696F5.A903%osborne1@optonline.net>

Sendu,

No objection but someone should check the scripts in examples/root to make
sure that they are not used there.

Brian O.


On 10/2/06 5:55 AM, "Sendu Bala" <bix at sendu.me.uk> wrote:

> Torsten Seemann wrote:
>>>>> I have removed all use/@ISA Bio::Root::Object references from
>>>>> bioperl-live, except for those in Bio::Root::* itself:
>> 
>>>> So I'd say they're both relics that can be removed. In fact I was
>>>> planning on getting rid of all references to both of these modules
>>>> before you did, so thanks! :)
>> 
>>> I think they can go. It's probably a pre-1.0 deprecation that somehow
>>> was never followed through on.
>> 
>> Today I did a fresh CVS checkout of bioperl-live, and deleted the
>> following modules and tests, and all tests passed with BIOPERLDEBUG=0
>> 
>>      * Bio::Root::Err
>>      * Bio::Root::Global
>>      * Bio::Root::IOManager
>>      * Bio::Root::Object
>>      * Bio::Root::Storable
>>      * Bio::Root::Utilities  # may be used by third parties?
>>      * Bio::Root::Vector
>>      * Bio::Root::Xref
>>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
>>      * t/RootStorable.t
>> 
>> Should we schedule for deprecation, or deprecate immediately as Hilmar
>> suggested they were meant to be deprecated long ago ?
> 
> I'm happy to get rid of them all straight away. Does anyone object?


From johnson.biotech at gmail.com  Mon Oct  2 10:21:50 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 2 Oct 2006 10:21:50 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <000001c6e5dc$2eceabe0$15327e82@pyrimidine>
References: <b99962880609301635w421fae0er3497ba655679f0bc@mail.gmail.com>
	<000001c6e5dc$2eceabe0$15327e82@pyrimidine>
Message-ID: <b99962880610020721j776d3801m4f5b49cd1bdf66c6@mail.gmail.com>

I'm using MySQL 5.0.19 and Perl v5.8.7 [MSWin32-x86-multi-thread]

On 10/2/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> Seth,
>
> What version of MySQL and perl are you using?  I'm using MySQL 5.0.18 (but
> am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819.
>
> I ran into a few problems with bioperl-db tests which were unrelated to
> the ones below, but I'm wondering if it is a difference in MySQL versions.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
> > -----Original Message-----
> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > bounces at lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ...
>
> > FAILED tests 10-12
> >     Failed 3/12 tests, 75.00% okay
> > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> > -------------------------------------------------------------------------------
> > t\02species.t                 65    2   3.08%  63 65
> > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > t\16obda.t                    12    3  25.00%  10-12
>
>


-- 
Best Regards,


Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358


From osborne1 at optonline.net  Mon Oct  2 10:08:50 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 10:08:50 -0400
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
 on AMD64
In-Reply-To: <20061002014007.GG12075@iucha.net>
Message-ID: <C14695B2.A900%osborne1@optonline.net>

Florin,

Minor correction here, the Bioperl package does not require bioperl-ext.
However we see there is a problem compiling bioperl-ext...

Brian O.


On 10/1/06 9:40 PM, "Florin Iucha" <florin at iucha.net> wrote:

> I am trying to install bioperl-network from CVS.  I found this to
> require bioperl from CVS, which requires bioperl-ext from CVS.


From JK at novozymes.com  Mon Oct  2 10:05:34 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Mon, 2 Oct 2006 16:05:34 +0200
Subject: [Bioperl-l] Blast parser.
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net>


Hi. 

I've tried to use the BLAST parser but I cannot get the original alignment
out of the parser. Is it possible to get that out of the 
Bio::Search::HSP::BlastHSP in some way? It seems quite odd to get a
clustalw alignment out when that isn't the type of alignment people are
used to getting from BLAST. 

Thanks 

Jesper


From cjfields at uiuc.edu  Mon Oct  2 10:36:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:36:31 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <C14696F5.A903%osborne1@optonline.net>
Message-ID: <001d01c6e630$27792fb0$15327e82@pyrimidine>

> Sendu,
> 
> No objection but someone should check the scripts in examples/root to make
> sure that they are not used there.
> 
> Brian O.

I suppose it's also possible that the other bioperl distributions (like
bioperl-run) could use them as well.  

If they do we can take care of them as they pop up.  These are really old
and haven't been revised in a long time.  

The only one I worry about is Bio::Root::Storable b/c of Ensembl.  Does
anyone know where Will Spooner is?  He's the maintainer for
Bio::Root::Storable.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct  2 11:01:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 10:01:44 -0500
Subject: [Bioperl-l] Blast parser.
In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net>
Message-ID: <000001c6e633$ad0a6ce0$15327e82@pyrimidine>

The alignment that you get should come from GenericHSP, not BlastHSP.
Either way, the HSP alignment that is retrieved using $hsp->get_aln() should
be a Bio::SimpleAlign object.  You can then output that to the proper
AlignIO format using an AlignIO stream object or use the Bio::SimpleAlign
methods for further analysis.  

use Bio::AlignIO;

# $hsp is an HSP object obtained from a Bio::SearchIO result
my $aln = $hsp->get_aln();
my $alnout = Bio::AlignIO->new(-format => 'msf',
                               -fh     => \*STDOUT);
$alnout->write_aln($aln);

Quick note: not all AlignIO formats have write_aln() support at this time,
but most do.
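
If what is wanted is the familiar BLAST-style pairwise layout rather than
an AlignIO format, one option is to print the three HSP strings directly.
The following is only a sketch: format_pairwise() is a hypothetical helper,
not a BioPerl method, and it assumes the three strings have equal length
(in BioPerl they would come from $hsp->query_string, $hsp->homology_string
and $hsp->hit_string). It does not print residue coordinates.

```perl
use strict;
use warnings;

# Hypothetical helper: lay out query, midline and subject strings
# BLAST-style in blocks of $width columns. Assumes equal-length inputs.
sub format_pairwise {
    my ($query, $midline, $hit, $width) = @_;
    $width ||= 60;
    my $out = '';
    for (my $pos = 0; $pos < length $query; $pos += $width) {
        $out .= sprintf("Query  %s\n       %s\nSbjct  %s\n\n",
            substr($query,   $pos, $width),
            substr($midline, $pos, $width),
            substr($hit,     $pos, $width));
    }
    return $out;
}

# Toy strings standing in for the values an HSP object would return.
print format_pairwise('MKT-AYIAK', 'MK  AY+AK', 'MKSWAYLAK', 60);
```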

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of JK (Jesper Agerbo Krogh)
> Sent: Monday, October 02, 2006 9:06 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Blast parser.
> 
> 
> Hi.
> 
> I've tried to use the blast-parser but I cannot get the original alignment
> out of the parser. Is it possible to get that out of the
> Bio::Search::HSP::BlastHSP in some way. It seems quite odd to get a
> clustalw alignment out when it isn't that type of alignment people are
> used to get from blast.
> 
> Thanks
> 
> Jesper
> 


From whs at ebi.ac.uk  Mon Oct  2 12:00:19 2006
From: whs at ebi.ac.uk (Will Spooner)
Date: Mon, 2 Oct 2006 17:00:19 +0100 (BST)
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <001d01c6e630$27792fb0$15327e82@pyrimidine>
References: <001d01c6e630$27792fb0$15327e82@pyrimidine>
Message-ID: <Pine.LNX.4.64.0610021651550.1560@parrot.ebi.ac.uk>

On Mon, 2 Oct 2006, Chris Fields wrote:

>> Sendu,
>>
>> No objection but someone should check the scripts in examples/root to make
>> sure that they are not used there.
>>
>> Brian O.
>
> I suppose it's also possible that the other bioperl distributions (like
> bioperl-run) could use them as well.
>
> If they do we can take care of them as they pop up.  These are really old
> and haven't been revised in a long time.
>
> The only one I worry about is Bio::Root::Storable b/c of Ensembl.  Does
> anyone know where Will Spooner is?  He's the maintainer for
> Bio::Root::Storable.
>

Hi Chris,

I'm still lurking...

If the tests for Bio::Root::Storable still pass (I assume that they do), 
then the module is working as advertised.

The idea behind Storable is very simple; object instances of any 
inheriting class can be serialised/retrieved from disk. BioPerl objects 
will probably not want this functionality by default, but it is trivial to 
implement if needed.
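
As a rough illustration of the serialise/retrieve idea, here is a sketch
using Perl's core Storable module. This is not the Bio::Root::Storable
interface, whose methods are not shown in this thread; the My::Feature
class is hypothetical and exists only for the example.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

{
    # Hypothetical class used only for this sketch.
    package My::Feature;
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub name { return $_[0]{name} }
}

my $feature = My::Feature->new(name => 'exon1', start => 10, end => 250);

# Write the object to a temporary file, then read it back.
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;
store($feature, $path);
my $copy = retrieve($path);

# The retrieved copy is still blessed into the original class.
print ref($copy), ' ', $copy->name, "\n";
```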

Will


From cjfields at uiuc.edu  Mon Oct  2 13:58:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 12:58:15 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <Pine.LNX.4.64.0610021651550.1560@parrot.ebi.ac.uk>
Message-ID: <000601c6e64c$5746f990$15327e82@pyrimidine>

> On Mon, 2 Oct 2006, Chris Fields wrote:
> 
> >> Sendu,
> >>
> >> No objection but someone should check the scripts in examples/root to
> make
> >> sure that they are not used there.
> >>
> >> Brian O.
> >
> > I suppose it's also possible that the other bioperl distributions (like
> > bioperl-run) could use them as well.
> >
> > If they do we can take care of them as they pop up.  These are really
> old
> > and haven't been revised in a long time.
> >
> > The only one I worry about is Bio::Root::Storable b/c of Ensembl.  Does
> > anyone know where Will Spooner is?  He's the maintainer for
> > Bio::Root::Storable.
> >
> 
> Hi Chris,
> 
> I'm still lurking...
> 
> If the tests for Bio::Root::Storable still pass (I assume that they do),
> then the module is working as advertised.
> 
> The idea behind Storable is very simple; object instances of any
> inheriting class can be serialised/retrieved from disk. BioPerl objects
> will probably not want this functionality by default, but it is trivial to
> implement if needed.
> 
> Will

Okay, nice to know you're listening in!  Based on that we should keep it in.
The rest that Torsten mentioned could probably be removed right away.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From osborne1 at optonline.net  Mon Oct  2 13:59:58 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 13:59:58 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <20061002002403.GD12075@iucha.net>
Message-ID: <C146CBDE.A938%osborne1@optonline.net>

Florin,

OK, this is fixed in CVS now. The problem is that there's some variability
in how the PSI MI "standard" is used. In this case there was a species that
was not given a value for its scientific name ("fullName"), I had to use
common name in its place. Fortunately there's an NCBI taxon id behind all
this.

Thanks again,

Brian O.


On 10/1/06 8:24 PM, "Florin Iucha" <florin at iucha.net> wrote:

> Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the
> MINT [1] database does not produce the crash.  It has a new warning, however:
> 
>    Can't call method "text" on an undefined value at
>    /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290.


From mmacho at gmail.com  Mon Oct  2 13:43:13 2006
From: mmacho at gmail.com (ende)
Date: Mon, 2 Oct 2006 19:43:13 +0200
Subject: [Bioperl-l] Variable scope
Message-ID: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>


	Hi

This may be a general Perl question and therefore off-topic for this
list.  My apologies for any inconvenience.

It is an annoying problem that is making me waste a lot of time.

I have a package with its new object, etc... and constants in it like:

#-----
use constant False => 0;
use constant True => 1;

our %CLRFG = (
               PLASMIDO      => RED,
               POLY_A        => GREEN,
               RESTR_SITES   => BLUE,
               CONECTORS     => MAGENTA,
               CONTAMINANTS  => CYAN,
           );

our %CLRBG = (
               PLASMIDO      => "",
               POLY_A        => "",
               RESTR_SITES   => "",
               CONECTORS     => "",
               CONTAMINANTS  => "",
           );
#------

These constants are included with require "h.pl" from the main package
file.

I use this module from the main command-line driver to test it,
"using" it.  In the command-line driver I can use the constants False
and True directly without complaint, for example "return True",
without any reference to the origin of the constant.

But with respect to the variables %CLRFG and %CLRBG (I would like these
to be constants too, but how?), I can't find a way of referring to
those in the module.  In the end I gave up and _copied_ the definitions
to where I needed them (in the sub where I print ANSI terminal
colouring sequences).  I can't find out how to refer to those
variables from outside the module.

I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Any help?


--
     Juan Falgueras
     Profesor del Depto. de Lenguajes y Ciencias de la Computación
     Universidad de Málaga


From cjfields at uiuc.edu  Mon Oct  2 16:52:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 15:52:11 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <C14696F5.A903%osborne1@optonline.net>
Message-ID: <000001c6e664$a25538d0$15327e82@pyrimidine>

I have updated the Deprecation page with the Bio::Root::* modules that we
plan on deprecating (note that I have them being removed for rel. 1.5.2).  I
have left out Bio::Root::Storable for now based on Will's response.  

http://www.bioperl.org/wiki/Deprecated_modules

I'll update the DEPRECATED doc in CVS as well.  There is a tentative
schedule for when warnings are added for modules before they are removed.  

In relation to the recent trend for house-cleaning, I noticed that the
Bio::Tools::BP* BLAST-related modules are all still present but haven't
been modified or had deprecation warnings added.  BPlite and the others
were marked for deprecation around rel 1.5 since their functionality is
present in Bio::SearchIO.  Judging by the mail list, no one has used these
in quite a while, and everyone has been redirected to use Bio::SearchIO
instead.  Based on that I have added warnings in CVS for deprecation to
BPlite and the related modules BPpsilite and BPbl2seq.
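
A minimal sketch of what such a warning can look like, assuming a
hypothetical module My::OldParser and the core Carp module; the warnings
actually committed to CVS may be worded and triggered differently:

```perl
use strict;
use warnings;

{
    package My::OldParser;    # hypothetical stand-in for a deprecated module
    use Carp qw(carp);

    sub new {
        my ($class, %args) = @_;
        # Warn from the caller's point of view every time an object is built.
        carp "$class is deprecated; use My::NewParser instead";
        return bless {%args}, $class;
    }
}

# Collect warnings instead of printing them, to show what the caller sees.
my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    My::OldParser->new(file => 'report.bls');
}
print "caught: $warnings[0]";
```

Warning at construction time keeps existing scripts running while still
telling their authors which replacement to move to.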

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Brian Osborne
> Sent: Monday, October 02, 2006 9:14 AM
> To: Sendu Bala; bioperl-l
> Subject: Re: [Bioperl-l] Do we need Bio::Root::Object anymore?
> 
> Sendu,
> 
> No objection but someone should check the scripts in examples/root to make
> sure that they are not used there.
> 
> Brian O.
> 
> 
> On 10/2/06 5:55 AM, "Sendu Bala" <bix at sendu.me.uk> wrote:
> 
> > Torsten Seemann wrote:
> >>>>> I have removed all use/@ISA Bio::Root::Object references from
> >>>>> bioperl-live, except for those in Bio::Root::* itself:
> >>
> >>>> So I'd say they're both relics that can be removed. In fact I was
> >>>> planning on getting rid off all references to both of these modules
> >>>> before you did, so thanks! :)
> >>
> >>> I think they can go. It's probably a pre-1.0 deprecation that somehow
> >>> was never followed through on.
> >>
> >> Today I did a fresh CVS checkout of bioperl-live, and deleted the
> >> following modules and tests, and all tests passed with BIOPERLDEBUG=0
> >>
> >>      * Bio::Root::Err
> >>      * Bio::Root::Global
> >>      * Bio::Root::IOManager
> >>      * Bio::Root::Object
> >>      * Bio::Root::Storable
> >>      * Bio::Root::Utilities  # may be used by third parties?
> >>      * Bio::Root::Vector
> >>      * Bio::Root::Xref
> >>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
> >>      * t/RootStorable.t
> >>
> >> Should we schedule for deprecation, or deprecate immediately as Hilmar
> >> suggested they were meant to be deprecated long ago ?
> >
> > I'm happy to get rid of them all straight away. Does anyone object?


From florin at iucha.net  Mon Oct  2 16:47:01 2006
From: florin at iucha.net (Florin Iucha)
Date: Mon, 2 Oct 2006 15:47:01 -0500
Subject: [Bioperl-l] Variable scope
In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
Message-ID: <20061002204701.GG14409@iucha.net>

On Mon, Oct 02, 2006 at 07:43:13PM +0200, ende wrote:
> It is a annoying problem that is making me waste lot of time.
> 
> I have a package with its new object, etc... and constants in it like:
> 
> #-----
> use constant False => 0;
> use constant True => 1;
> 
> our %CLRFG = (
>                PLASMIDO      => RED,
>                POLY_A        => GREEN,
>                RESTR_SITES   => BLUE,
>                CONECTORS     => MAGENTA,
>                CONTAMINANTS  => CYAN,
>            );
> 
> our %CLRBG = (
>                PLASMIDO      => "",
>                POLY_A        => "",
>                RESTR_SITES   => "",
>                CONECTORS     => "",
>                CONTAMINANTS  => "",
>            );
> #------
> 
> this constants are include with require "h.pl" from the main package  
> file.
> 
> I use this module from the mail command line driver to test it  
> "using" it.  In the command line driver I can use with no gripe the  
> constants False and True directly, for example "return True", etc  
> without any reference to the origin of that constant.

It is possible you get them from somewhere else.

> But, with respect to the variables (I would like they also were  
> constants.. but how?), %CLRFG and %CLRBG I can't find the way of  
> refering those int the module.  Finally I have desisted and _copy_  
> the definitions where  I have needed it (in the sub were I print Ansi  
> terminal colouring seqs...).  I don't find how to refer those  
> variables out of the module.
> 
> I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Did you actually declare a package name in "h.pl" ?
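
A minimal sketch of why the package declaration matters, assuming a
hypothetical package name Colors and plain strings in place of the colour
constants; it is not the content of the scrubbed attachments, and both
packages share one file only to keep the example self-contained:

```perl
use strict;
use warnings;

{
    package Colors;    # hypothetical package name

    # "our" creates a package variable, here %Colors::CLRFG.
    our %CLRFG = (
        PLASMIDO => 'red',
        POLY_A   => 'green',
    );
}

# From any other package the hash is reachable by its fully qualified name.
my $fg    = $Colors::CLRFG{POLY_A};
my @names = sort keys %Colors::CLRFG;

print "POLY_A foreground: $fg\n";
print "known features: @names\n";
```

Without a package statement the file's variables end up in whatever
package was current when it was compiled (normally main), so a name such
as %modulename::CLRFG refers to a different, empty hash.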

Is there any reason you don't call the file ".pm" and load it with
"use"?  I have attached a small example of importing that works.

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra
-------------- next part --------------
A non-text attachment was scrubbed...
Name: one.pm
Type: text/x-perl
Size: 118 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061002/cf339845/attachment-0009.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: two.pl
Type: text/x-perl
Size: 69 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061002/cf339845/attachment-0010.bin>

From Kevin.M.Brown at asu.edu  Mon Oct  2 19:44:50 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 2 Oct 2006 16:44:50 -0700
Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module
Message-ID: <1A4207F8295607498283FE9E93B775B4021960CD@EX02.asurite.ad.asu.edu>

Well, for anyone that wants to know, I found a way to capture the output
of ClustalW to get at things like the score.

# Copy STDOUT to another handle
open(OUTCOPY, ">&STDOUT") or die "Couldn't dup STDOUT: $!";

# Change where STDOUT goes
open(STDOUT, ">log.test") or die "Couldn't open log.test: $!";

# Run the alignment; its output will be captured by the STDOUT
# redirection
$aln = $factory->align(\@seq);

# Restore STDOUT to its normal location for the rest of the script
close STDOUT;
open(STDOUT, ">&OUTCOPY") or die "Couldn't restore STDOUT: $!";

I guess I can understand why most of this is just dropped by the
ClustalW.pm module since there doesn't seem to be a way to hold it all
in a SimpleAlign object.
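
The same steps can be wrapped in a small helper so that STDOUT is restored
even if the wrapped code dies. This is a sketch only; the name
with_stdout_to_file is hypothetical and nothing here is part of BioPerl:

```perl
use strict;
use warnings;

# Run $code with STDOUT redirected to $logfile, then restore STDOUT.
sub with_stdout_to_file {
    my ($logfile, $code) = @_;

    # Copy STDOUT to another handle so it can be restored later.
    open(my $saved, '>&', \*STDOUT) or die "Couldn't dup STDOUT: $!";

    # Point STDOUT at the log file.
    open(STDOUT, '>', $logfile) or die "Couldn't open $logfile: $!";

    # Call the code in scalar context and remember any exception.
    my $result = eval { $code->() };
    my $err    = $@;

    # Restore STDOUT whether or not $code died.
    close(STDOUT);
    open(STDOUT, '>&', $saved) or die "Couldn't restore STDOUT: $!";

    die $err if $err;
    return $result;
}
```

With the script above, the alignment call would become something like
with_stdout_to_file('log.test', sub { $factory->align(\@seq) }), after
which log.test can be read back and searched for the score line.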

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Kevin Brown
> Sent: Thursday, September 28, 2006 2:48 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module
> 
> I've gotten a very simple script to run using bioperl that creates an
> alignment using clustalw of two sequences.  I see that clustal outputs
> to stdout information like the score, but I don't see any way to store
> that or retrieve that from the alignment object that is 
> returned (unless
> I'm just blind).  What follows is my very basic script which used code
> found in the Wiki.
> 
> print $aln->score() spits out an error about using an uninitialized
> value.
> 
> 
> #!/usr/bin/perl -w
> 
> use strict;
> use Bio::SeqIO;
> use Bio::Perl;
> use Bio::AlignIO;
> use Getopt::Long qw(:config no_ignore_case bundling pass_through);
> use POSIX;
> use Bio::Tools::Run::Alignment::Clustalw;
> 
> my $fileName   = "";         # filename(s) to be parsed for information
> my $output_dir = "";
> my $format     = 'fasta';    # default format for SeqIO module
> 
> GetOptions(
>                    'file=s'   => \$fileName,
>                    'output=s' => \$output_dir,
>                   );
> 
> # Parse the input file for the needed information
> # SeqIO supports several normal formats including <tab>, <fasta> and <excel>
> 
> my @files = split(/\|/, $fileName);
> my @seq_array;
> 
> my $stream_out =
>   Bio::AlignIO->new(-file => '>test.msf', -format => 'msf', -flush =>
> 0);
> 
> foreach my $fileName (@files)
> {
>         my $file = Bio::SeqIO->new(-format => $format, -file =>
> $fileName);
>         my $seq;
>         while ($seq = $file->next_seq())
>         {
>                 push(@seq_array, $seq);
>         }
> }
> 
> my @params  = ('ktuple' => 2, 'matrix' => 'BLOSUM');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
> my $ktuple  = 3;
> $factory->ktuple($ktuple);    # change the parameter before executing
>     # where @seq_array is an array of Bio::Seq objects
> 
> open my $out, ">seq.txt";
> 
> for (my $i = 1 ; $i <= $#seq_array ; $i++)
> {
>         my @seq = ($seq_array[0], $seq_array[$i]);
>         my $aln = $factory->align(\@seq);
>         $stream_out->write_aln($aln);
>         print $aln->score;
>         for my $seq ($aln->each_seq) {
>                 print $out $seq->display_id() ."\t". $seq->seq()."\n";
>         }
> }
> 


From bix at sendu.me.uk  Mon Oct  2 19:48:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 00:48:34 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
Message-ID: <4521A552.60301@sendu.me.uk>

Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
upload tar.gz files when I have access to the server, then reply here 
with links.

In the meantime, see http://www.bioperl.org/wiki/Release_1.5.2 for 
instructions on getting and testing this RC.

Developers:
   Make sure you're in the AUTHORS file in all 4 packages, as
   appropriate.

Users:
   Even though 1.5.2 is a 'developer' release, we consider it the most
   stable and capable version of Bioperl, and recommend that you use
   it in all but the most critical production environments. Please
   try it out and let us know of any problems or difficulties you run
   into.


Thank you,
Sendu.


From lincoln.stein at gmail.com  Mon Oct  2 17:53:38 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 2 Oct 2006 21:53:38 +0000
Subject: [Bioperl-l] Variable scope
In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
Message-ID: <6dce9a0b0610021453va2132c7u73747b9253211a66@mail.gmail.com>

Hi,

Read the documentation for Exporter. It is much better to formally export
constants, variables and functions and to import them with "use" than to use
"require". Also be sure that you understand how namespaces and modules work.

This is not a BioPerl topic and should have been directed to a general Perl
discussion list, such as Perl Monks.
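
For the archive, the Exporter approach recommended above can be sketched
as follows. The package name My::Colors is hypothetical, plain strings
stand in for the colour constants, and the module is defined inline
(normally it would live in its own My/Colors.pm file):

```perl
use strict;
use warnings;

BEGIN {
    package My::Colors;    # hypothetical; normally in My/Colors.pm
    use Exporter 'import';

    use constant False => 0;
    use constant True  => 1;

    # Symbols a caller may ask for by name.
    our @EXPORT_OK = qw(%CLRFG True False);

    our %CLRFG = (
        PLASMIDO => 'red',
        POLY_A   => 'green',
    );

    # Mark the module as loaded so "use" below does not look for a file.
    $INC{'My/Colors.pm'} = __FILE__;
}

# Import the hash and one constant into the current package.
use My::Colors qw(%CLRFG True);

print "POLY_A is drawn in $CLRFG{POLY_A}\n";
print "True is ", True, "\n";
```

Listing names in @EXPORT_OK rather than @EXPORT means callers only get
what they request, which avoids silent name clashes.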

Lincoln

On 10/2/06, ende <mmacho at gmail.com> wrote:
>
>
>         Hi
>
> this may be a typical perl topic and then out of this list center
> topic.  My apologize for any inconvenience.
>
> It is a annoying problem that is making me waste lot of time.
>
> I have a package with its new object, etc... and constants in it like:
>
> #-----
> use constant False => 0;
> use constant True => 1;
>
> our %CLRFG = (
>                PLASMIDO      => RED,
>                POLY_A        => GREEN,
>                RESTR_SITES   => BLUE,
>                CONECTORS     => MAGENTA,
>                CONTAMINANTS  => CYAN,
>            );
>
> our %CLRBG = (
>                PLASMIDO      => "",
>                POLY_A        => "",
>                RESTR_SITES   => "",
>                CONECTORS     => "",
>                CONTAMINANTS  => "",
>            );
> #------
>
> this constants are include with require "h.pl" from the main package
> file.
>
> I use this module from the mail command line driver to test it
> "using" it.  In the command line driver I can use with no gripe the
> constants False and True directly, for example "return True", etc
> without any reference to the origin of that constant.
>
> But, with respect to the variables (I would like they also were
> constants.. but how?), %CLRFG and %CLRBG I can't find the way of
> refering those int the module.  Finally I have desisted and _copy_
> the definitions where  I have needed it (in the sub were I print Ansi
> terminal colouring seqs...).  I don't find how to refer those
> variables out of the module.
>
> I have tried %modulename::CLRFG, for example, but Perl gives me errors.
>
> Any help?
>
>
>
>
> --
>      Juan Falgueras
>      Profesor del Depto. de Lenguajes y Ciencias de la Computación
>      Universidad de Málaga
>
>
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From florin at iucha.net  Mon Oct  2 22:30:31 2006
From: florin at iucha.net (Florin Iucha)
Date: Mon, 2 Oct 2006 21:30:31 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <20061003023031.GI14409@iucha.net>

On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
> upload tar.gz files when I have access to the server, then reply here 
> with links.
> 
> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for 
> instructions on getting and testing this RC.

[I won't create a wiki account just to report this.]

Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
not set.  Lots of warnings about missing packages and all, but this
looks interesting:

   Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.

Otherwise:

   Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay.

The failed test is:

   t/ESEfinder..................dubious
      Test returned status 255 (wstat 65280, 0xff00)
   DIED. FAILED test 15

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra


From cjfields at uiuc.edu  Mon Oct  2 23:50:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 22:50:47 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu>

So far all tests pass on Mac OS X.  I'll add this to the release page.

This RC will throw warnings for four tests I didn't remove in time  
(BPlite.t, BPpsilite.t, BPbl2seq.t, and RestrictionEnzyme.t), which  
correspond to their namesake deprecated Bio::Tools modules.  These  
are no longer in CVS HEAD so should be gone by the next RC, and the  
relevant modules marked for deprecation.

I can verify the BioDBSeqFeature.t warning on Mac OS X that  
Florin reported, but ESEfinder.t works fine:

t/BioDBSeqFeature............Argument "+" isn't numeric in numeric lt  
(<) at Bio/DB/SeqFeature/Segment.pm line 423.
ok
....

I'll report WinXP tests tomorrow on the wiki.

Chris


On Oct 2, 2006, at 6:48 PM, Sendu Bala wrote:

> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll
> upload tar.gz files when I have access to the server, then reply here
> with links.
>
> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for
> instructions on getting and testing this RC.
>
> Developers:
>    Make sure you're in the AUTHORS file in all 4 packages, as
>    appropriate.
>
> Users:
>    Even though 1.5.2 is a 'developer' release, we consider it the most
>    stable and capable version of Bioperl, and recommend that you use
>    it in all but the most critical production environments. Please
>    try it out and let us know of any problems or difficulties you run
>    into.
>
>
> Thank you,
> Sendu.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct  2 23:54:29 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 22:54:29 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <20061003023031.GI14409@iucha.net>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
Message-ID: <DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>

> [I won't create a wiki account just to report this.]
>
> Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> not set.  Lots of warnings about missing packages and all, but this
> looks interesting:
>
>    Argument "+" isn't numeric in numeric lt (<) at
>    Bio/DB/SeqFeature/Segment.pm line 423.

This is verified on Mac OS X.

> Otherwise:
>
>    Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,  
> 99.99% okay.
>
> The failed test is:
>
>    t/ESEfinder..................dubious
>       Test returned status 255 (wstat 65280, 0xff00)
>    DIED. FAILED test 15

What do you get when you run that set of tests using
'perl -I. -w t/ESEfinder.t'?  The bad status code is odd and could be a
remote server issue.

Chris


>
> florin
>
> -- 
> If we wish to count lines of code, we should not regard them as lines
> produced but as lines spent.                       -- Edsger Dijkstra

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From torsten.seemann at infotech.monash.edu.au  Tue Oct  3 00:30:06 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 03 Oct 2006 14:30:06 +1000
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
Message-ID: <4521E74E.1040404@infotech.monash.edu.au>

My understanding is that all Bioperl-compliant classes should inherit 
from Bio::Root::Root, not Bio::Root::RootI.

Additionally, if functions such as throw() or _rearrange() are to be 
used without a class instance reference, they are to be used as class 
methods via Bio::Root::Root, not Bio::Root::RootI.

Is this correct?

My naive audit of bioperl-live CVS brought up the following statistics:

# Root.pm
/cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l
26
/cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l
346

# RootI.pm
/cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l
9
/cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l
79

My guess would be that all RootI should be changed to plain Root ?

Any help appreciated,

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From jason at bioperl.org  Tue Oct  3 02:03:17 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 2 Oct 2006 23:03:17 -0700
Subject: [Bioperl-l] t/ESEFinder.t fixed on branch
Message-ID: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>

Looks like good work everyone.

All tests pass for me on OSX perl 5.8.6. w and w/o BIOPERLDEBUG=1  
with RC1 except for the t/ESEFinder problem which I've fixed.

It skipped too few tests when BIOPERLDEBUG=0.
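
For anyone writing similar tests, here is a sketch of the pattern with
Test::More. The checks are placeholders, not the contents of
t/ESEfinder.t; the point is that the count passed to skip() has to equal
the number of tests inside the SKIP block, or the plan and the number of
tests run will disagree when BIOPERLDEBUG is unset:

```perl
use strict;
use warnings;
use Test::More tests => 5;

# Tests that need no network access always run.
ok(1, 'local check 1');
ok(1, 'local check 2');

SKIP: {
    # Three tests follow in this block, so three must be skipped.
    skip 'remote tests need BIOPERLDEBUG=1', 3 unless $ENV{BIOPERLDEBUG};

    # Placeholders standing in for real remote-server checks.
    ok(1, 'remote check 1');
    ok(1, 'remote check 2');
    ok(1, 'remote check 3');
}
```

Passing 2 instead of 3 to skip() here would leave only four tests run
against a plan of five, and the harness would report the script as failed.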

Don't forget to merge branch changes back to head for this test when  
it is done.   I don't want to muddy the waters, so I'm holding off  
migrating the changes to the main trunk as the file is substantially  
different (I presume pre-Test::More adoption?).

-jason


From bix at sendu.me.uk  Tue Oct  3 03:28:48 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:28:48 +0100
Subject: [Bioperl-l] t/ESEFinder.t fixed on branch
In-Reply-To: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>
References: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>
Message-ID: <45221130.2060405@sendu.me.uk>

Jason Stajich wrote:
> Looks like good work everyone.
> 
> All tests pass for me on OSX perl 5.8.6. w and w/o BIOPERLDEBUG=1  
> with RC1 except for the t/ESEFinder problem which I've fixed.
> 
> It skipped too few tests when BIOPERLDEBUG=0.
> 
> Don't forget to merge branch changes back to head for this test when  
> it is done.   I don't want to muddy water so I'm holding off  
> migrating the changes to main trunk as the files is substantially  
> different (I presume pre-Test::More adoption?).

Actually, it was the same until Torsten made his own (different) fixes 
to HEAD but not to branch. It was my mistake and I've corrected it in yet a 
third way, and now branch and HEAD match.

No harm done :)


From bix at sendu.me.uk  Tue Oct  3 03:31:10 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:31:10 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu>
References: <4521A552.60301@sendu.me.uk>
	<7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu>
Message-ID: <452211BE.6080107@sendu.me.uk>

Chris Fields wrote:
> So far all tests pass on Mac OS X.  I'll add this to the release page.
> 
> This RC will throw warnings for four tests I didn't remove in time  
> (BPlite.t, BPpsilite, BPbl2seq.t, and RestrictionEnzyme.t), which  
> correspond to their namesake deprecated Bio::Tools modules.  These  
> are no longer in CVS HEAD so should be gone by the next RC, and the  
> relevant modules marked for deprecation.

Thanks Chris. Sorry I missed these.


From bix at sendu.me.uk  Tue Oct  3 03:32:08 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:32:08 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <20061003023031.GI14409@iucha.net>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
Message-ID: <452211F8.8040104@sendu.me.uk>

Florin Iucha wrote:
> On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote:
>> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
>> upload tar.gz files when I have access to the server, then reply here 
>> with links.
>>
>> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for 
>> instructions on getting and testing this RC.
> 
> [I won't create a wiki account just to report this.]
> 
> Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> not set.  Lots of warnings about missing packages and all, but this
> looks interesting:
> 
>    Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.
> 
> Otherwise:
> 
>    Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay.
> 
> The failed test is:
> 
>    t/ESEfinder..................dubious
>       Test returned status 255 (wstat 65280, 0xff00)
>    DIED. FAILED test 15

Thanks for your feedback Florin. The ESEfinder fail will be fixed in the 
next RC.


From bix at sendu.me.uk  Tue Oct  3 04:29:37 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 09:29:37 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <45221F71.40206@sendu.me.uk>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
> upload tar.gz files when I have access to the server, then reply here 
> with links.

Live/core:
http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-1.5.2-RC1.zip

Run:
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.zip

DB:
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.zip

Network:
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.zip

Md5 checksums are in:
http://bioperl.org/DIST/SIGNATURES.md5
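
A minimal sketch for checking a downloaded archive against a published
checksum, using the core Digest::MD5 module. The helper names are
hypothetical, and the layout of SIGNATURES.md5 is not assumed here; the
expected value is simply passed in as a string:

```perl
use strict;
use warnings;
use Digest::MD5;

# Return the hexadecimal MD5 digest of the file at $path.
sub md5_of_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Couldn't open $path: $!";
    binmode($fh);    # archives are binary data
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close($fh);
    return $digest;
}

# True if the file's digest equals $expected (case-insensitive).
sub checksum_matches {
    my ($path, $expected) = @_;
    return lc(md5_of_file($path)) eq lc($expected);
}
```

Typical use would be checksum_matches('bioperl-1.5.2-RC1.tar.gz', $md5),
with $md5 copied from the checksum file.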


From jason at bioperl.org  Tue Oct  3 02:11:30 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 2 Oct 2006 23:11:30 -0700
Subject: [Bioperl-l]  Use of Root.pm versus RootI.pm
Message-ID: <87F9B64E-8BDA-464B-814D-3F117AA646A1@bioperl.org>

I only briefly saw your question - but RootI is for interfaces,  
Root.pm is for instantiated objects.


From florin at iucha.net  Tue Oct  3 07:39:12 2006
From: florin at iucha.net (Florin Iucha)
Date: Tue, 3 Oct 2006 06:39:12 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
	<DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
Message-ID: <20061003113912.GJ14409@iucha.net>

On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote:
> >Otherwise:
> >
> >   Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,  
> >99.99% okay.
> >
> >The failed test is:
> >
> >   t/ESEfinder..................dubious
> >      Test returned status 255 (wstat 65280, 0xff00)
> >   DIED. FAILED test 15

$ perl -I. -w t/ESEfinder.t
1..15
ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder;
ok 2 - use Data::Dumper;
ok 3 - use Bio::PrimarySeq;
ok 4 - use Bio::Seq;
ok 5
ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
# Looks like you planned 15 tests but only ran 14.
$ grep Id t/ESEfinder.t
# $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra


From hlapp at gmx.net  Tue Oct  3 08:27:46 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 3 Oct 2006 08:27:46 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au>
References: <4521E74E.1040404@infotech.monash.edu.au>
Message-ID: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net>

The interface classes (those ending in 'I') should actually inherit  
from RootI, not Root.

In practice I think this recommendation is more theoretical than
something that makes much of a difference. The motivation is that
interface classes should not determine the actual implementation of a
class (hash ref, array ref, whatever), and since Root.pm contains lots
of implementation using a hash ref, inheriting from it would basically
make that decision.

On the other hand, RootI contains implementation too, although I'm not
sure it prescribes the object implementation as opposed to merely
implementing static methods (like throw(), warn(), etc.). That would
need to be checked.

	-hilmar

On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote:

> My understanding is that all Bioperl-compliant classes should inherit
> from Bio::Root::Root, not Bio::Root::RootI.
>
> Additionally, if functions such as throw() or _rearrange() are to be
> used without a class instance reference, they are to be used as class
> methods via Bio::Root::Root, not Bio::Root::RootI.
>
> Is this correct?
>
> My naive audit of bioperl-live CVS brought up the following  
> statistics:
>
> # Root.pm
> /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l
> 26
> /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l
> 346
>
> # RootI.pm
> /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l
> 9
> /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l
> 79
>
> My guess would be that all RootI should be changed to plain Root ?
>
> Any help appreciated,
>
> -- 
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct  3 08:33:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 3 Oct 2006 07:33:37 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <20061003113912.GJ14409@iucha.net>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
	<DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
	<20061003113912.GJ14409@iucha.net>
Message-ID: <44724E16-74CD-4778-B04F-529475B47E37@uiuc.edu>

Florin,

Looks like this is fixed and should be working in the next release.

Chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct  3 10:29:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 3 Oct 2006 09:29:51 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net>
Message-ID: <002101c6e6f8$67b4ae10$15327e82@pyrimidine>

> The interface classes (those ending in 'I') should actually inherit
> from RootI, not Root.
> 
> In practice I think this recommendation is more theoretical than
> something that makes much of a difference. The motivation is that
> interface classes should not determine the actual implementation of a
> class (hash ref, array ref, whatever), and since Root.pm contains lots
> of implementation using a hash ref, inheriting from it would basically
> make that decision.
> 
> On the other hand, RootI contains implementation too, although I'm not
> sure it prescribes the object implementation as opposed to merely
> implementing static methods (like throw(), warn(), etc.). That would
> need to be checked.
> 
> 	-hilmar

The constructor in Bio::Root::RootI lets one know that its use is
deprecated, so you shouldn't have any cases of 'our qw(Bio::Root::RootI)';
there should be some way of inheriting Root directly or indirectly.  I would
say that any direct use of RootI is not good practice, though.  For the
current implementation we should only inherit Bio::Root::Root, which
implements RootI.

Is there any reason to shut off the warning with BIOPERLDEBUG?  

>From RootI:

sub new {
  my $class = shift;
  my @args = @_;
  unless ( $ENV{'BIOPERLDEBUG'} ) {
      carp("Use of new in Bio::Root::RootI is deprecated.  "
         . "Please use Bio::Root::Root instead");
  }
  eval "require Bio::Root::Root";
  return Bio::Root::Root->new(@args);
}
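
To make the effect of that condition concrete, here is a small sketch
that counts the warnings emitted with BIOPERLDEBUG unset-like (0) and
set (1). My::LegacyBase and warnings_from() are hypothetical stand-ins
written for this illustration; only the 'unless $ENV{BIOPERLDEBUG}'
test is taken from the snippet above.

```perl
use strict;
use warnings;

package My::LegacyBase;    # hypothetical stand-in for a deprecated base class
use Carp qw(carp);

sub new {
    my ($class, @args) = @_;
    # Same condition as the quoted snippet: warn unless BIOPERLDEBUG is true.
    carp('Use of new in My::LegacyBase is deprecated')
        unless $ENV{BIOPERLDEBUG};
    my $self = { @args };
    return bless $self, $class;
}

package main;

# Construct one object with BIOPERLDEBUG set to $debug and return the
# number of warnings that were raised while doing so.
sub warnings_from {
    my ($debug) = @_;
    my @seen;
    local $SIG{__WARN__} = sub { push @seen, @_ };
    local $ENV{BIOPERLDEBUG} = $debug;
    My::LegacyBase->new(name => 'x');
    return scalar @seen;
}

printf "BIOPERLDEBUG=0: %d warning(s)\n", warnings_from(0);
printf "BIOPERLDEBUG=1: %d warning(s)\n", warnings_from(1);
```

With this condition the deprecation notice appears in normal use and
disappears in debug mode, which is the behaviour being questioned.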


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From slenk at emich.edu  Tue Oct  3 13:31:47 2006
From: slenk at emich.edu (Stephen Gordon Lenk)
Date: Tue, 03 Oct 2006 13:31:47 -0400
Subject: [Bioperl-l] Perl 6 has 'roles' - may be cleanly applicable to the
	Root/RootI issue
Message-ID: <5147da5514e402.514e4025147da5@emich.edu>

I looked at the Perl 6 site; there is an RFC on interfaces:
http://dev.perl.org/perl6/rfc/265.html

Roles seem to be the Perl 6 answer to the Root/RootI issue in Bioperl. 
Maybe it is too early to suggest this.

http://dev.perl.org/perl6/doc/design/apo/A12.html:
The primary role of a class is to manage instances, that is, objects. 
So a class must worry about object creation and destruction, and 
everything that happens in between. Classes have a secondary role as 
units of software reuse, in that they can be inherited from or 
delegated to. However, because this is a secondary role, and because 
of weaknesses in models of inheritance, composition, and delegation, 
Perl 6 will split out the notion of software reuse into a separate 
class-like entity called a "role". Roles are an abstraction mechanism 
for use by classes that don't care about the secondary aspects of 
software reuse, or that (looking at it the other way) care so much 
about it that they want to encapsulate any decisions about 
implementation, composition, delegation, and maybe even inheritance. 
Sounds fancy, but just think of them as includes of partial classes, 
with some safety checks. Roles don't manage objects. They manage 
interfaces and other abstract behavior (like default implementations), 
and they help classes manage objects. As such, a role may only be 
composed into a class or into another role, never inherited from or 
delegated to. That's what classes are for.


From slenk at emich.edu  Tue Oct  3 12:45:15 2006
From: slenk at emich.edu (Stephen Gordon Lenk)
Date: Tue, 03 Oct 2006 12:45:15 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
Message-ID: <5120d6a511f5a7.511f5a75120d6a@emich.edu>

The separation of interface and implementation is generally
regarded as a good idea. Right now the Bioperl community is
doing this as part of the implementation of Bioperl. I suggest
that this is an example of something which you might want to
have as part of the Perl implementation. If Perl 6 (or even
Perl 5) does not have this as a core part of the language or
as a standard package (reusable by all in a common fashion),
you may want to suggest to the Perl implementers that a way
for interface/implementation distinctions be made part of the
core language. My 2 cents, as you people are the experts on 
your own code.


From cjfields at uiuc.edu  Tue Oct  3 13:49:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 3 Oct 2006 12:49:35 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <5120d6a511f5a7.511f5a75120d6a@emich.edu>
Message-ID: <000001c6e714$4c2cbb80$15327e82@pyrimidine>

Perl6 already has added flexibility for separation of
implementation/interface (I believe they are called roles).  

http://dev.perl.org/perl6/doc/design/syn/S12.html

To tell the truth, I'm not sure about Perl 5, except for the way the Bioperl
devs have set up the distinction between interface and implementation.  However,
I find the way we use interfaces is very simple (set up interface with
some/all methods as unimplemented, use the module as an abstract base class,
then override the unimplemented methods).  It works for me.
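
As a rough sketch of that pattern in plain Perl 5 (the My::ShapeI,
My::Square and My::Circle packages are hypothetical and are not part of
Bioperl), an interface can declare a method that dies until it is
overridden, while the implementations stay free to pick their own
object representation:

```perl
use strict;
use warnings;

package My::ShapeI;    # hypothetical interface: declares methods, stores nothing
use Carp qw(croak);

sub area {
    my ($self) = @_;
    croak((ref($self) || $self) . ' does not implement area()');
}

# Behaviour written purely against the interface.
sub describe {
    my ($self) = @_;
    return sprintf('%s with area %.2f', ref($self), $self->area);
}

package My::Square;    # implementation backed by a hash ref
our @ISA = ('My::ShapeI');
sub new  { my ($class, $side) = @_; return bless({ side => $side }, $class) }
sub area { my ($self) = @_; return $self->{side} ** 2 }

package My::Circle;    # implementation backed by an array ref
our @ISA = ('My::ShapeI');
sub new  { my ($class, $r) = @_; return bless([$r], $class) }
sub area { my ($self) = @_; return 3.14159265 * $self->[0] ** 2 }

package main;

for my $shape (My::Square->new(3), My::Circle->new(1)) {
    print $shape->describe, "\n";
}
```

Because the interface package never touches the object's internals, a
hash-based and an array-based class can both satisfy it.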

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cmlapid at up.edu.ph  Tue Oct  3 22:06:06 2006
From: cmlapid at up.edu.ph (Carlo Lapid)
Date: Wed, 4 Oct 2006 10:06:06 +0800
Subject: [Bioperl-l] genbank mirror
Message-ID: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>

Hi,

I'm trying to set up a local mirror of a large part of the Genbank database.
For users to access the local database, I need to create a web-based search
tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank
flat files I've downloaded based on a query entered by the user.

I'm trying to use Bioperl to create this from scratch, but I'm having a very
hard time, especially since I want the user to have reasonable flexibility
in customizing his search. The best that I've been able to accomplish is a
search function that retrieves genbank sequence objects based on their
primary IDs or accession numbers; by using the fetch method of the
Bio::Index::GenBank module. But this doesn't help users who don't know the
exact IDs for the sequences they want.

Can anybody suggest a way to use Bioperl to search for an ordinary word or
phrase, like "16S gene", which could be matched against the description
field, or the entire genbank entry? (Alternatively, is there some other
freely available tool or software that can do this?) I've been scouring the
Bioperl documentation, but I couldn't find anything. I just need to be
pointed in the right direction. What I thought was a relatively simple
problem has been driving me crazy for days; if anybody has any suggestions I
would really, really appreciate it.
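
For a sense of what a keyword match against the description field
involves, here is a minimal plain-Perl sketch. It is not a Bioperl API:
search_definitions() is a hypothetical helper, the two records are made
up for illustration, and a linear scan like this would be far too slow
for a real mirror without an index.

```perl
use strict;
use warnings;

# Toy records standing in for a GenBank flat file (made up for illustration).
my $flatfile = <<'END';
LOCUS       DEMO0001     12 bp    DNA     linear   BCT 01-JAN-2000
DEFINITION  Example bacterium 16S ribosomal RNA gene, partial
            sequence.
ACCESSION   DEMO0001
ORIGIN
        1 acgtacgtac gt
//
LOCUS       DEMO0002      6 bp    DNA     linear   PHG 01-JAN-2000
DEFINITION  Example phage capsid protein gene, complete cds.
ACCESSION   DEMO0002
ORIGIN
        1 ttgacc
//
END

# Return the accessions of records whose DEFINITION contains every query word.
sub search_definitions {
    my ($text, $query) = @_;
    my @words = split ' ', lc($query);
    my @hits;
    for my $record (split /^\/\/\s*$/m, $text) {
        my ($acc) = $record =~ /^ACCESSION\s+(\S+)/m          or next;
        my ($def) = $record =~ /^DEFINITION\s+(.*?)(?=^\S)/ms or next;
        $def = lc($def);
        $def =~ s/\s+/ /g;    # fold continuation lines into one line
        next if grep { index($def, $_) < 0 } @words;
        push @hits, $acc;
    }
    return @hits;
}

print join(', ', search_definitions($flatfile, '16S gene')), "\n";
```

Matching on individual words rather than the exact phrase lets a query
such as "16S gene" find a definition that reads "16S ribosomal RNA
gene".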


From torsten.seemann at infotech.monash.edu.au  Tue Oct  3 22:58:03 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 04 Oct 2006 12:58:03 +1000
Subject: [Bioperl-l] genbank mirror
In-Reply-To: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
References: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
Message-ID: <4523233B.7030505@infotech.monash.edu.au>

> I'm trying to set up a local mirror of a large part of the Genbank database.
> For users to access the local database, I need to create a web-based search
> tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank
> flat files I've downloaded based on a query entered by the user.

Have you considered bioperl-db / BioSQL ?

http://www.bioperl.org/wiki/BioPerl_db
http://lists.open-bio.org/pipermail/biosql-l/

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From osborne1 at optonline.net  Tue Oct  3 23:16:20 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Tue, 03 Oct 2006 23:16:20 -0400
Subject: [Bioperl-l] genbank mirror
In-Reply-To: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
Message-ID: <C1489FC4.AA43%osborne1@optonline.net>

Carlo,

You might want to look at the Bio::DB::Query::GenBank module:

http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database

However, this works through NCBI's own eutils API, so setting it up to query a
local mirror may be very difficult.


Brian O.


From osborne1 at optonline.net  Tue Oct  3 23:28:06 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Tue, 03 Oct 2006 23:28:06 -0400
Subject: [Bioperl-l] genbank mirror
In-Reply-To: <4523233B.7030505@infotech.monash.edu.au>
Message-ID: <C148A286.AA47%osborne1@optonline.net>

Torsten and Carlo,

Right. For some simple examples of using Bio::DB::Query::BioQuery to query a
BioSQL db take a look at Bio::DB::BioSQL::OBDA.

You may also want to take a look at NCBI's eutils API, it's quite powerful
but not local. Or the ENSEMBL API, people have set up their own local
ENSEMBL dbs. There's an example of this API here:

http://www.bioperl.org/wiki/Getting_Genomic_Sequences


Brian O.


From torsten.seemann at infotech.monash.edu.au  Wed Oct  4 01:21:24 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 04 Oct 2006 15:21:24 +1000
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
Message-ID: <452344D4.8070908@infotech.monash.edu.au>

Hi all,

Now that we have Perl 5.6.1 as a minimum, the following modules are 
standard: File::Spec, File::Temp, File::Path

Bio::Root::IO has functions catfile(), tempdir(), tempfile(), rmtree() 
which currently dispatch to the File:: version, or try to emulate it. We 
don't need to emulate anymore. Jason Stajich suggested in a previous 
post that they should be deprecated, and that users should use the 
File:: functions directly.

I have an uncommitted simplified version of Bio::Root::IO which does 
this, and "all tests pass". The functions currently (silently) dispatch 
directly to their native counterparts.

The only tricky function is tempfile() which is *mostly* like 
File::Temp::tempfile(), but does some voodoo of converting 
(TEMPLATE=>'xxx') to the non-hash first parameter of the File:: version, 
so I'm hesitant to commit. It may do other magic - Hilmar?
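
For reference, a short sketch of calling the core modules directly,
together with one way the TEMPLATE conversion could be kept in a thin
wrapper. tempfile_compat() is a hypothetical name, and the sketch does
not claim to reproduce everything Bio::Root::IO does:

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempfile tempdir);
use File::Path qw(rmtree);

# Hypothetical compatibility wrapper: accepts TEMPLATE => '...' in the
# argument hash and passes it to File::Temp::tempfile() as the leading,
# positional template argument.
sub tempfile_compat {
    my (%args) = @_;
    my $template = delete $args{TEMPLATE};
    return defined $template ? tempfile($template, %args) : tempfile(%args);
}

my $dir  = tempdir('bioperl_demo_XXXXXX', TMPDIR => 1);
my $path = File::Spec->catfile($dir, 'sub', 'seq.fa');   # path is only built, not created

my ($fh, $filename) =
    tempfile_compat(TEMPLATE => 'seqXXXXXX', DIR => $dir, SUFFIX => '.tmp');
print {$fh} ">demo\nACGT\n";
close $fh or die "close failed: $!";

print "joined path: $path\n";
print "temp file:   $filename\n";

rmtree($dir);    # remove the scratch directory and everything in it
```

Keeping the conversion in one small function would let callers that
still pass TEMPLATE keep working while the other functions are
deprecated.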

Comments?

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From gianluca.debellis at itb.cnr.it  Wed Oct  4 05:25:26 2006
From: gianluca.debellis at itb.cnr.it (Gianluca De Bellis)
Date: Wed, 04 Oct 2006 11:25:26 +0200
Subject: [Bioperl-l] Bioperl under WinXP
Message-ID: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>

I'm trying to use Bioperl under WinXP-SP2 (novice)

I have just downloaded Bioperl (v 1.2.3)

Even the simplest program with a single command (use Bio::Perl;) ends up in
an error of the Perl interpreter with these details

AppName: perl.exe AppVer: 5.8.8.819  ModName: win32.dll

ModVer: 0.0.0.0      Offset: 00003294

This comes from the Windows error reporting system

Where is the problem?

 
Thanks in advance


From epsteinj at mail.nih.gov  Wed Oct  4 07:25:57 2006
From: epsteinj at mail.nih.gov (Epstein, Jonathan A (NIH/NICHD) [E])
Date: Wed, 4 Oct 2006 07:25:57 -0400
Subject: [Bioperl-l] genbank mirror
References: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
Message-ID: <42504F69898FE546B3F0238C9BD03275532603@NIHCESMLBX7.nih.gov>

There's Seqhound:
  http://seqhound.blueprint.org/report.html

We set this up locally, and it's probably the most comprehensive free solution out there, but it's non-trivial to set up. Also, since Blueprint and BIND have lost most of their funding, I'm not sure how long you can count on SeqHound to remain operational (although for now it is being updated).

Jonathan


From bix at sendu.me.uk  Wed Oct  4 09:19:45 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 04 Oct 2006 14:19:45 +0100
Subject: [Bioperl-l] Bioperl under WinXP
In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>
References: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>
Message-ID: <4523B4F1.3010305@sendu.me.uk>

Gianluca De Bellis wrote:
> I'm trying to use Bioperl under WinXP-SP2 (novice)
> 
> I have just downloaded Bioperl (v 1.2.3)
> 
> Even the simplest program with a single command (use Bio::Perl;) ends up in
> an error of the Perl interpreter with these details
> 
> AppName: perl.exe AppVer: 5.8.8.819  ModName: win32.dll
> 
> ModVer: 0.0.0.0      Offset: 00003294
> 
> This comes from the Windows error reporting system
> 
> Where is the problem?

Hard to say. Do non-bioperl scripts work?

Make sure to follow the Bioperl installation instructions carefully:
http://bioperl.org/wiki/Installing_Bioperl_on_Windows

And make sure to install at least version 1.4. 1.2.3 is ancient and 
effectively unsupported.


From cjfields at uiuc.edu  Wed Oct  4 10:03:34 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 4 Oct 2006 09:03:34 -0500
Subject: [Bioperl-l] Bioperl under WinXP
In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>
Message-ID: <000601c6e7bd$e22ad190$15327e82@pyrimidine>

If you're using PPM, you can install a (much) newer version of BioPerl from
here:

http://www.gmod.org/ggb/ppm/

Add that as one of your repositories in PPM4 (seeing that you are using
ActivePerl 5.8.8.819), then search for bioperl.  The version should be
1.512.

In a few weeks we'll be releasing a new developer release.  A WinXP PPM is
expected, as well as a bundled package to install all prerequisites.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Gianluca De Bellis
> Sent: Wednesday, October 04, 2006 4:25 AM
> To: bioperl-l at bioperl.org
> Subject: [Bioperl-l] Bioperl under WinXP
> 
> I'm trying to use Bioperl under WinXP-SP2 (novice)
> 
> Bioperl has been just downloaded  (v 1.2.3)
> 
> Even the simplest program with a single command (use Bio::Perl;) ends up
> in
> an error of the Perl interpreter with these details
> 
> AppName: perl.exe AppVer: 5.8.8.819  ModName: win32.dll
> 
> ModVer: 0.0.0.0      Offset: 00003294
> 
> Coming from the Windows reporting system
> 
> Where is the problem?
> 
> 
> 
> Thanks in advance
> 


From hlapp at gmx.net  Wed Oct  4 10:25:23 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 4 Oct 2006 10:25:23 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <002101c6e6f8$67b4ae10$15327e82@pyrimidine>
References: <002101c6e6f8$67b4ae10$15327e82@pyrimidine>
Message-ID: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net>


On Oct 3, 2006, at 10:29 AM, Chris Fields wrote:

> The constructor in Bio::Root::RootI lets one know that its use is
> deprecated, so you shouldn't have any cases of 'our qw 
> (Bio::Root::RootI)';

Don't confuse the constructor with the inheritance tree.

Interface classes should never be instantiated, hence the  
constructor, consistent with the documentation, should never get  
executed.

> there should be some way of inheriting Root directly or  
> indirectly.  I would
> say that any direct use of RootI is not good practice, though.

I don't know what you mean by 'directly' or 'indirectly' but  
inheritance from interfaces, and interfaces extending (inheriting  
from) other interfaces, is certainly standard practice. I'm not sure  
at all why it would be a bad one.

> For the current implementation we should only inherit  
> Bio::Root::Root, which
> implements RootI.

For the implementation classes, yes. For the interface classes, no.
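
As a rough illustration of that split, here is a sketch only. My::RootI,
My::Root, My::SeqI and My::Seq are made-up stand-ins, not the real
Bio::Root classes, and seq_length() is a hypothetical method. Interface
classes extend the root interface, while implementation classes inherit
the root implementation:

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Root::RootI: an interface, never instantiated.
{
    package My::RootI;
    sub throw_not_implemented {
        my $self   = shift;
        my $method = (caller(1))[3];    # fully qualified name of the abstract method
        die "Abstract method '$method' is not implemented by " . ref($self) . "\n";
    }
}

# Hypothetical stand-in for Bio::Root::Root: the implementation of RootI.
{
    package My::Root;
    our @ISA = ('My::RootI');
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
}

# An interface class extends the root *interface* directly.
{
    package My::SeqI;
    our @ISA = ('My::RootI');
    sub seq_length { shift->throw_not_implemented }
}

# An implementation class inherits the root *implementation* plus its interface.
{
    package My::Seq;
    our @ISA = ('My::Root', 'My::SeqI');
    sub seq_length { return length $_[0]->{seq} }
}

my $seq = My::Seq->new(seq => 'ACGT');
print $seq->seq_length, "\n";
```

A class that inherits My::SeqI without overriding seq_length() dies with
the "not implemented" message as soon as the method is called, which is
the point of having the interface carry that method.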

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Oct  4 10:43:54 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 4 Oct 2006 10:43:54 -0400
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <452344D4.8070908@infotech.monash.edu.au>
References: <452344D4.8070908@infotech.monash.edu.au>
Message-ID: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>


On Oct 4, 2006, at 1:21 AM, Torsten Seemann wrote:

> Bio::Root::IO has functions catfile(), tmpdir(), tmpfile(), rmtree()
> which currently dispatch to the File:: version, or try to emulate  
> it. We
> don't need to emulate anymore. Jason Stajich suggested in a previous
> post that they should be deprecated, and that users should use  
> directly
> the File:: functions themselves.

I don't think there's a need to deprecate - if the methods just plain  
delegate to whatever File:: module is appropriate their  
implementation (supposedly) will become very simple and hence won't  
pose a maintenance burden anymore.

One can still recommend that all new scripts, modules, or other code
use the File:: modules directly; I'm just not sure there's a need to
tell users that they should start changing their existing code.

>
> I have an uncommitted simplified version of Bio::Root::IO which does
> this, and "all tests pass". The functions currently (silently)  
> dispatch
> directly to their native counterparts.
>
> The only tricky function is tempfile() which is *mostly* like
> File::Temp::tempfile(), but does some voodoo of converting
> (TEMPLATE=>'xxx') to the non-hash first parameter of the File::  
> version,
> so I'm hesitant to commit. It may do other magic - Hilmar?

Not that I would know of. If the tests pass (without having to change  
them!) I'd give it a try.
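
For illustration, plain delegation could look roughly like the sketch
below. My::IO is a made-up package, not the actual Bio::Root::IO code,
and the TEMPLATE handling is only my guess at what the conversion to the
positional first parameter of File::Temp::tempfile() amounts to:

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp ();
use File::Path ();

# Hypothetical wrapper package; the method names mirror the ones discussed
# above, but this is not the real Bio::Root::IO implementation.
{
    package My::IO;

    # Thin pass-throughs to the File:: modules.
    sub catfile { my $self = shift; return File::Spec->catfile(@_) }
    sub tmpdir  { my $self = shift; return File::Spec->tmpdir }
    sub rmtree  { my $self = shift; return File::Path::rmtree(@_) }

    # Accept a named TEMPLATE argument and hand it on as the positional
    # first parameter that File::Temp::tempfile() expects.
    sub tempfile {
        my ($self, %args) = @_;
        my $template = delete $args{TEMPLATE};
        return defined $template
            ? File::Temp::tempfile($template, %args)
            : File::Temp::tempfile(%args);
    }
}

my ($fh, $name) = My::IO->tempfile(
    TEMPLATE => 'sketchXXXXXX',
    DIR      => My::IO->tmpdir,
    UNLINK   => 1,                 # remove the file when the program exits
);
print {$fh} "scratch data\n";
close $fh;
print My::IO->catfile('t', 'data', 'test.fa'), "\n";
```

With wrappers this thin, existing callers keep working unchanged and
there is very little left to maintain.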

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Oct  4 11:35:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 4 Oct 2006 10:35:16 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net>
Message-ID: <001901c6e7ca$b12fd5b0$15327e82@pyrimidine>

...
> Don't confuse the constructor with the inheritance tree.
> 
> Interface classes should never be instantiated, hence the
> constructor, consistent with the documentation, should never get
> executed.

I know that interfaces shouldn't be instantiated.  I had noticed there are
cases of 'our qw (Bio::Root::RootI)' where it is completely acceptable to
inherit the interface.  Makes sense to me now.

> > there should be some way of inheriting Root directly or
> > indirectly.  I would
> > say that any direct use of RootI is not good practice, though.
> 
> I don't know what you mean by 'directly' or 'indirectly' but
> inheritance from interfaces, and interfaces extending (inheriting
> from) other interfaces, is certainly standard practice. I'm not sure
> at all why it would be a bad one.

I was talking specifically about inheriting RootI, and not about all Bioperl
interfaces in general.  I completely understand the use of
interface/implementation in Bioperl.  However, I missed one small fact until
yesterday (of course AFTER I posted my reply), which was that interfaces may
inherit RootI directly.  My oops.

I had understood that, in general, any Bioperl implementation should not
inherit the RootI interface directly (they should inherit Root, since that
implements RootI).  The 'constructor' present in RootI is essentially to
make sure that no one inherits from the wrong class.

Probably a bad use of the terms 'direct' and 'indirect', so maybe I didn't
get that across very well.  What I meant was that all classes inherit Root
in some way, either 'directly' (as the direct parent class) or 'indirectly'
(through the inheritance tree). Probably comes from being primarily a
molecular microbiologist and not a computer scientist.

OT, but it would be nice to have an updated class diagram to sort out the
inheritance hierarchy a bit more easily.  In the meantime, the Deobfuscator
does help quite a bit.

> > For the current implementation we should only inherit
> > Bio::Root::Root, which
> > implements RootI.
> 
> For the implementation classes, yes. For the interface classes, no.

I agree (see above).  That's the one small bit about interfaces I missed
along the way.  Makes sense; they use throw_not_implemented(), which is a
RootI method.

> 	-hilmar

Chris


From pmiguel at purdue.edu  Wed Oct  4 15:38:51 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Wed, 04 Oct 2006 15:38:51 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <45240DCB.2080204@purdue.edu>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
> upload tar.gz files when I have access to the server, then reply here 
> with links.
>
> In the meantime, see http://www.bioperl.org/wiki/Release_1.5.2 for 
> instructions on getting and testing this RC.
>
> Developers:
>    Make sure you're in the AUTHORS file in all 4 packages, as
>    appropriate.
>
> Users:
>    Even though 1.5.2 is a 'developer' release, we consider it the most
>    stable and capable version of Bioperl, and recommend that you use
>    it in all but the most critical production environments. Please
>    try it out and let us know of any problems or difficulties you run
>    into.
>
>
> Thank you,
> Sendu.
>   
I didn't see any tests done under Solaris, so I asked our sys admin to 
do the install on one of our machines.

Just another data point:

He installed this release candidate on a Sun E450 box running Solaris. 
uname -a gives:

SunOS descartes 5.10 Generic_118833-18 sun4u sparc SUNW,Ultra-4

perl -v gives:

This is perl, v5.8.8 built for sun4-solaris
(etc.)


$ time make test
PERL_DL_NONLAZY=1 /usr/local/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/AAChange...................ok
t/AAReverseMutate............ok
t/abi........................Bio::SeqIO::staden::read from bioperl-ext is not installed or is installed incorrectly - skipping abi.t tests
t/abi........................ok
t/ace........................ok
t/AlignIO....................ok
t/AlignStats.................ok
t/AlignUtil..................ok
t/alignUtilities.............ok
t/Allele.....................ok
t/Alphabet...................ok
t/Annotation.................ok
t/AnnotationAdaptor..........ok
t/asciitree..................ok
t/Assembly...................ok
        1/19 skipped:
t/Biblio.....................ok
t/Biblio_biofetch............ok
t/Biblio_eutils..............ok
t/BiblioReferences...........ok
t/BioDBGFF...................ok
t/BioDBSeqFeature............ok 1/46Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.
t/BioDBSeqFeature............ok
t/BioDBSeqFeature_BDB........ok
t/BioDBSeqFeature_mysql......ok 3/46prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT sequence,offset
   FROM sequence as s,locationlist as ll
   WHERE s.id=ll.id
     AND ll.seqname= ?
     AND offset >= ?
     AND offset <= ?
   ORDER BY offset
) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT sequence,offset
   FROM sequence as s,locationlist as ll
   WHERE s.id=ll.id
     AND ll.seqname= ?
     AND offset >= ?
     AND offset <= ?
   ORDER BY offset
) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
t/BioDBSeqFeature_mysql......ok
t/BioFetch_DB................ok
t/BioGraphics................ok
t/BlastIndex.................ok 1/13
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BlastIndex.................ok
t/BPbl2seq...................
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPbl2seq...................ok 1/108
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPbl2seq...................ok
t/BPlite.....................ok 1/97
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPlite.....................ok 52/97
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPlite.....................ok 88/97
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
STACK Bio::Tools::BPlite::new /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/Tools/BPlite.pm:197
STACK toplevel t/BPlite.t:127

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPlite.....................ok
t/BPpsilite..................
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPpsilite..................ok 4/11
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPpsilite..................ok
t/bsml_sax...................ok
t/Chain......................ok
t/chaosxml...................ok
t/cigarstring................ok
t/ClusterIO..................ok
t/Coalescent.................ok
t/CodonTable.................ok
t/Compatible.................ok
t/consed.....................ok
t/CoordinateGraph............ok
t/CoordinateMapper...........ok
t/Correlate..................ok
t/ctf........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ctf.t tests
t/ctf........................ok
t/CytoMap....................ok
t/DB.........................skipped
        all skipped: Skipping all tests since they require network access, set BIOPERLDEBUG=1 to test
t/DBCUTG.....................ok
        11/34 skipped: Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
t/DBFasta....................ok
t/DNAMutation................ok
t/Domcut.....................ok
t/ECnumber...................ok
t/ELM........................ok 1/13
-------------------- WARNING ---------------------
MSG: sleeping for 1 seconds

---------------------------------------------------
t/ELM........................ok
t/embl.......................ok
t/EMBL_DB....................ok
t/EMBOSS_Tools...............ok
t/EncodedSeq.................ok
t/entrezgene.................ok 491/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok 695/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok 723/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok 824/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok
t/ePCR.......................ok
t/ESEfinder..................ok 1/15# Looks like you planned 15 tests but only ran 14.
t/ESEfinder..................dubious
        Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED test 15
        Failed 1/15 tests, 93.33% okay (less 9 skipped tests: 5 okay, 33.33%)
t/est2genome.................ok
t/EUtilities.................skipped
        all skipped: Set BIOPERLDEBUG=1 to run tests
t/Exception..................ok
t/Exonerate..................ok
t/exp........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping exp.t tests
t/exp........................ok
t/fasta......................ok
t/FeatureIO..................ok 7/33
-------------------- WARNING ---------------------
MSG: '##feature-ontology' directive handling not yet implemented
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: '##attribute-ontology' directive handling not yet implemented
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: '##source-ontology' directive handling not yet implemented
---------------------------------------------------
t/FeatureIO..................ok
t/flat.......................ok
t/FootPrinter................ok
t/game.......................ok
t/GbrowseGFF.................ok
t/gcg........................ok
t/GDB........................ok
t/Gel........................ok
t/genbank....................ok
t/GeneCoordinateMapper.......ok
t/Geneid.....................ok
t/Genewise...................ok
        2/51 skipped:
t/Genomewise.................ok
t/Genpred....................ok
t/GFF........................ok
t/GOR4.......................ok
t/GOterm.....................ok
t/GraphAdaptor...............ok
t/GuessSeqFormat.............ok
t/hmmer......................ok
t/hmmer_pull.................ok
t/HNN........................ok
t/HtSNP......................ok
t/Index......................ok
t/InstanceSite...............ok
t/interpro...................ok
t/InterProParser.............ok
t/IUPAC......................ok
t/kegg.......................ok
t/largefasta.................ok
t/LargeLocatableSeq..........ok
t/largepseq..................ok
t/lasergene..................ok
t/LinkageMap.................ok
t/LiveSeq....................ok
t/LocatableSeq...............ok
t/Location...................ok
t/LocationFactory............ok
t/LocusLink..................ok
t/lucy.......................ok
t/Map........................ok
t/MapIO......................ok
t/masta......................ok
t/Matrix.....................ok
t/Measure....................ok
t/MeSH.......................ok
t/metafasta..................ok
t/MetaSeq....................ok
t/MicrosatelliteMarker.......ok
t/MiniMIMentry...............ok
t/MitoProt...................ok
t/Molphy.....................ok
t/MultiFile..................ok
t/multiple_fasta.............ok
t/Mutation...................ok
t/Mutator....................ok
t/NetPhos....................ok
        10/14 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test
t/Node.......................ok
t/obo_parser.................ok
t/OddCodes...................ok
t/OMIMentry..................ok
t/OMIMentryAllelicVariant....ok
t/OMIMparser.................ok
t/Ontology...................ok
t/OntologyEngine.............ok
t/OntologyStore..............ok
t/PAML.......................ok
t/Perl.......................ok
t/phd........................ok
t/Phenotype..................ok
t/PhylipDist.................ok
t/PhysicalMap................ok
t/pICalculator...............ok
t/Pictogram..................ok
t/pir........................ok
t/pln........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping pln.t tests
t/pln........................ok
t/PopGen.....................ok
        2/89 skipped:
t/PopGenSims.................ok
t/primaryqual................ok
t/PrimarySeq.................ok
t/primedseq..................ok
t/Primer.....................ok
t/primer3....................ok
t/Promoterwise...............ok
t/ProtDist...................ok
t/protgraph..................ok
t/ProtMatrix.................ok
t/ProtPsm....................ok
t/Pseudowise.................ok
t/psm........................ok
t/QRNA.......................ok
t/qual.......................ok
t/RandDistFunctions..........ok
t/RandomTreeFactory..........ok
t/Range......................ok
t/RangeI.....................ok
t/raw........................ok
t/RefSeq.....................ok
t/Registry...................ok
t/Relationship...............ok
t/RelationshipType...........ok
t/RemoteBlast................ok
        11/13 skipped: to avoid timeout
t/RepeatMasker...............ok
t/RestrictionAnalysis........ok
t/RestrictionEnzyme..........ok 1/14
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::RestrictionEnzyme is deprecatedUse Bio::Restriction classes instead
---------------------------------------------------
t/RestrictionEnzyme..........ok
t/RestrictionIO..............ok
t/RNAChange..................ok
t/rnamotif...................ok
t/RootI......................ok
t/RootIO.....................ok
        2/27 skipped: various reasons
t/RootStorable...............ok
t/Scansite...................ok
t/scf........................ok
t/SearchDist.................ok
t/SearchIO...................ok
t/Seg........................ok
t/Seq........................ok
t/seq_quality................ok
t/SeqAnalysisParser..........ok
t/SeqBuilder.................ok
t/SeqDiff....................ok
t/SeqFeatCollection..........ok
t/SeqFeature.................ok
t/seqfeaturePrimer...........ok
t/SeqHound_DB................ok 4/14Writing into 'shoundlog' log file.
t/SeqHound_DB................ok
t/SeqIO......................ok
t/SeqPattern.................ok
t/seqread_fail...............ok
t/SeqStats...................ok
t/SequenceFamily.............ok
t/sequencetrace..............ok
t/SeqUtils...................ok
t/SeqVersion.................ok
t/seqwithquality.............ok
t/SeqWords...................ok
t/Sigcleave..................ok
t/Signalp....................ok
t/Sim4.......................ok
t/SimilarityPair.............ok
t/SimpleAlign................ok
t/simpleGOparser.............ok
t/singlet....................ok
t/sirna......................ok
t/SiteMatrix.................ok
t/SNP........................ok
t/Sopma......................ok
t/Species....................ok
        5/20 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test
t/Spidey.....................ok
t/splicedseq.................ok
t/StandAloneBlast............ok
t/StructIO...................ok
t/Structure..................ok
t/swiss......................ok
t/Symbol.....................ok
t/tab........................ok
t/table......................ok
t/TagHaplotype...............ok
t/Taxonomy...................ok
        44/98 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test
t/TaxonTree..................ok
t/Tempfile...................ok
t/Term.......................ok
t/tigrxml....................ok
t/tinyseq....................ok
t/Tmhmm......................ok
t/Tools......................ok
t/Tree.......................ok
t/TreeBuild..................ok
t/TreeIO.....................ok
t/trim.......................ok
t/tRNAscanSE.................ok
t/UCSCParsers................ok
t/Unflattener................ok
t/Unflattener2...............ok
t/UniGene....................ok
t/Variation_IO...............ok
t/WABA.......................ok
t/XEMBL_DB...................ok
        1/9 skipped: server may be down
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ztr.t tests
t/ztr........................ok
Failed Test   Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/ESEfinder.t  255 65280    15    2  13.33%  15
2 tests and 98 subtests skipped.
Failed 1/240 test scripts, 99.58% okay. 1/11910 subtests failed, 99.99% okay.
*** Error code 29
make: Fatal error: Command failed for target `test_dynamic'

real    13m10.064s
user    11m14.891s
sys     0m45.417s

$ TEST_VERBOSE=1 perl t/ESEfinder.t
1..15
ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder;
ok 2 - use Data::Dumper;
ok 3 - use Bio::PrimarySeq;
ok 4 - use Bio::Seq;
ok 5
ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
# Looks like you planned 15 tests but only ran 14.
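
One plausible reading of the output above (I haven't looked at the test
itself): 5 tests ran and 9 were skipped, which makes 14 against a plan of
15, so the skip count may be one short of the number of remote tests. A
minimal Test::More sketch of the pattern, with made-up tests and a
stand-in for the remote call, where the plan and the skip count are kept
in step:

```perl
use strict;
use warnings;
use Test::More tests => 6;    # the plan: local tests + remote tests

# Hypothetical stand-in for a call to a remote server; no network is used.
sub fake_remote_lookup { return 'ok' }

# Local tests always run.
ok(1, 'module loads');
is(length('ACGT'), 4, 'local sequence length');
ok(defined &fake_remote_lookup, 'helper is defined');

my $remote_tests = 3;         # must equal the number of tests in the block
SKIP: {
    skip 'Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test',
        $remote_tests unless $ENV{BIOPERLDEBUG};

    is(fake_remote_lookup(), 'ok', "remote lookup $_") for 1 .. $remote_tests;
}
```

If $remote_tests were set to 2 here while the block still held three
tests, a run without BIOPERLDEBUG would end with "Looks like you planned
6 tests but only ran 5", which is the same symptom as above.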


From bix at sendu.me.uk  Thu Oct  5 03:19:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 08:19:39 +0100
Subject: [Bioperl-l] EUtilities term handling
Message-ID: <4524B20B.5010703@sendu.me.uk>

This is actually a general question and not limited to EUtilities. As I 
see it, EUtilities lets you do queries in Bioperl that you can do on a 
website. The question is, should a Bioperl module always work with 
queries that the website it is a front-end to works with?

So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is 
essentially a frontend onto:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=

With a web-browser you can complete that url by supplying a term. For 
example, the term 'BRCA2+9606[taxid]' works and returns results:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid]

If you supply the exact same term to EUtilities::esearch like so:

my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => 
"gene", -term "BRCA2+9606[taxid]");

The search fails. From my 'user' perspective this is highly unexpected. 
Chris (the author) and I both understand /why/ it fails, but Chris 
doesn't think it is a bug, or at least something that can/should be 
changed. What do other people think? At the very least, if something 
unexpected happens, I'd suggest making a note of it in the POD 
somewhere. Eg. "Do not use + in term strings, even though they might 
work on the website".

Chris: what is the disadvantage of always submitting '+' as '+' to the 
server?
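
To make the difference concrete, here is a small sketch. It assumes the
failure comes from the '+' being percent-encoded before the request is
sent, so the server sees a literal plus sign instead of a space. The two
helper subs are hand-rolled for illustration and are not what
Bio::DB::EUtilities actually calls:

```perl
use strict;
use warnings;

# Percent-encode everything except the RFC 3986 unreserved characters.
# Hand-rolled so the sketch needs no non-core modules.
sub percent_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Decode a query-string value the way a form handler typically does:
# '+' becomes a space, then %XX escapes are resolved.
sub form_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $text;
}

my $term = 'BRCA2+9606[taxid]';

# Typed straight into the URL, the '+' reaches the server as a space.
print form_decode($term), "\n";                    # BRCA2 9606[taxid]

# Encoded first, the '+' survives as a literal plus sign.
print percent_encode($term), "\n";                 # BRCA2%2B9606%5Btaxid%5D
print form_decode(percent_encode($term)), "\n";    # BRCA2+9606[taxid]
```

Under that assumption, the browser URL and the module are sending two
different queries even though the term string looks identical.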


From bix at sendu.me.uk  Thu Oct  5 03:24:45 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 08:24:45 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4524B20B.5010703@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
Message-ID: <4524B33D.9070607@sendu.me.uk>

Sendu Bala wrote:
>
> With a web-browser you can complete that url by supplying a term. For 
> example, the term 'BRCA2+9606[taxid]' works and returns results:
> 
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid] 
> 
> 
> If you supply the exact same term to EUtilities::esearch like so:
> 
> my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => 
> "gene", -term "BRCA2+9606[taxid]");

*cough*

my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
"gene", -term => "BRCA2+9606[taxid]");


> The search fails. 


From m.weimer at dkfz-heidelberg.de  Thu Oct  5 08:15:53 2006
From: m.weimer at dkfz-heidelberg.de (Marc Weimer)
Date: Thu, 05 Oct 2006 14:15:53 +0200
Subject: [Bioperl-l] Bio::DB::SwissProt Error
Message-ID: <1160050554.18691.11.camel@localhost>

When running


--------------------------------------------------------------

  #! /usr/bin/perl -w

  use strict;
  use Bio::DB::SwissProt;

  my $db_obj = new Bio::DB::SwissProt(-verbose=>1);

  my $seq_obj = $db_obj->get_Seq_by_acc('P43780');


-------------------------------------------------------------

using Bioperl 1.4-1 I get the error message

---------------------------------------------------------------------------------

  request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch
  Content-Length: 45
  Content-Type: application/x-www-form-urlencoded

  format=swissprot&db=swall&style=raw&id=P43780


  ------------- EXCEPTION: Bio::Root::Exception -------------
  MSG: swissprot stream with no ID. Not swissprot in my book
  STACK: Error::throw
  STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
  STACK: Bio::SeqIO::swiss::next_seq /usr/share/perl5/Bio/SeqIO/swiss.pm:179
  STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/share/perl5/Bio/DB/WebDBSeqI.pm:187
  STACK: ./putativeGele.pl:8
  -----------------------------------------------------------

--------------------------------------------------------------------------------

Any suggestions?

Thanks,

Marc


From bix at sendu.me.uk  Thu Oct  5 09:21:23 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 14:21:23 +0100
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <1160050554.18691.11.camel@localhost>
References: <1160050554.18691.11.camel@localhost>
Message-ID: <452506D3.5050501@sendu.me.uk>

Marc Weimer wrote:
[snip]
>   my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
> 
>   my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
[snip]
> using Bioperl 1.4-1 I get the error message
[snip]
>   ------------- EXCEPTION: Bio::Root::Exception -------------
>   MSG: swissprot stream with no ID. Not swissprot in my book
[snip]
> Any suggestions?

It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most 
recent official release), but 1.5.2 does 
(http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS 
(http://bioperl.org/wiki/Getting_BioPerl#CVS).


From m.weimer at dkfz-heidelberg.de  Thu Oct  5 09:35:06 2006
From: m.weimer at dkfz-heidelberg.de (Marc Weimer)
Date: Thu, 05 Oct 2006 15:35:06 +0200
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <452506D3.5050501@sendu.me.uk>
References: <1160050554.18691.11.camel@localhost>
	<452506D3.5050501@sendu.me.uk>
Message-ID: <1160055306.18691.14.camel@localhost>

Works fine with 1.5.2

Thanks,

Marc


> Marc Weimer wrote:
> [snip]
> >   my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
> > 
> >   my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
> [snip]
> > using Bioperl 1.4-1 I get the error message
> [snip]
> >   ------------- EXCEPTION: Bio::Root::Exception -------------
> >   MSG: swissprot stream with no ID. Not swissprot in my book
> [snip]
> > Any suggestions?
> 
> It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most 
> recent official release), but 1.5.2 does 
> (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS 
> (http://bioperl.org/wiki/Getting_BioPerl#CVS).
-- 
########################################

Dr. Marc Weimer
German Cancer Research Center
Central Unit Biostatistics
Im Neuenheimer Feld 280
D-69120 Heidelberg
Phone: +49 (0) 6221/42-2387
Fax: +49 (0) 6221/42-2397

########################################


From hlapp at gmx.net  Thu Oct  5 09:55:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 09:55:58 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4524B20B.5010703@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
Message-ID: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>


On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote:

> This is actually a general question and not limited to EUtilities.  
> As I
> see it EUtilities lets you do queries in Bioperl that you can do on a
> website. The question is, should a Bioperl module always work with
> queries that the website it is a front-end to works with?

I think yes, but stick to this definition.

Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez  
website it will actually not work. Hence, it should be no surprise  
that it doesn't work either using Bio::DB::EUtilities.

The URL you are using to make your point is much more an example for  
using a web-service (SOAP, REST, or not) than it is for using a  
website. Using the web-service URL with a space in place of the '+'  
works, but yields a different result (just searches for BRCA2), so if  
tested for correct result the test fails.

I.e., you don't expect an input form on a website to accept URL- 
encoded input. Instead, you expect it to do any URL-encoding for you  
that needs to be done. Conversely, if you are using a URL to retrieve  
stuff using e.g. wget or curl, it is clear that you will need to do  
URL encoding yourself unless there is a command line option that lets  
you instruct the querying program to do so.

I would be careful with mangling the two definitions into one,  
resulting in a module that needs to serve two masters. You could  
consider providing an option though that lets you turn off the URL  
encoding on demand.

Aside from that, one of the advantages of having the service wrapped  
in Bioperl is in fact that you can have it accept a wider variety of  
parameters than the actual service would allow you to have, e.g.,  
arrays, hashes, or whatever seems appropriate.

My $0.02.

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Thu Oct  5 10:08:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:08:01 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
Message-ID: <452511C1.5020709@sendu.me.uk>

Hilmar Lapp wrote:
> 
> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote:
> 
>> This is actually a general question and not limited to EUtilities. As I
>> see it EUtilities lets you do queries in Bioperl that you can do on a
>> website. The question is, should a Bioperl module always work with
>> queries that the website it is a front-end to works with?
> 
> I think yes, but stick to this definition.
> 
> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez 
> website it will actually not work. Hence, it should be no surprise that 
> it doesn't work either using Bio::DB::EUtilities.

On the contrary, I find it a surprise because EUtilities is an interface 
to NCBI's eutils, not the entrez website.

If I had previously read instructions on using eutils:
http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
I might (do) expect that I /should/ use + in my term.


> Aside from that, one of the advantages of having the service wrapped in 
> Bioperl is in fact that you can have it accept a wider variety of 
> parameters than the actual service would allow you to have, e.g., 
> arrays, hashes, or whatever seems appropriate.

I was going to suggest that terms be supplied as an array, leaving 
Bioperl code to decide how to 'AND' all the terms (elements in the 
array) together. It would also further force the user not to think of 
how eutils normally works, but to only consider the Bioperl instructions 
on how to form a query. But I'm not sure of the value of all that.


From cjfields at uiuc.edu  Thu Oct  5 10:06:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:06:50 -0500
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <452506D3.5050501@sendu.me.uk>
References: <1160050554.18691.11.camel@localhost>
	<452506D3.5050501@sendu.me.uk>
Message-ID: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>

On Oct 5, 2006, at 8:21 AM, Sendu Bala wrote:

> Marc Weimer wrote:
> [snip]
>>   my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
>>
>>   my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
> [snip]
>> using Bioperl 1.4-1 I get the error message
> [snip]
>>   ------------- EXCEPTION: Bio::Root::Exception -------------
>>   MSG: swissprot stream with no ID. Not swissprot in my book
> [snip]
>> Any suggestions?
>
> It works with the latest Bioperl. I'm not sure if 1.5.1 works (the  
> most
> recent official release), but 1.5.2 does
> (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS
> (http://bioperl.org/wiki/Getting_BioPerl#CVS).

Marc, you'll have to update to 1.5.2 or CVS, as Sendu suggested.   
There were server changes for biofetch which were fixed about 4-6  
months ago (post rel. 1.5.1); I think several changes were made to  
Bio::SeqIO::swiss as well during this period.

I think the error here results from Bio::SeqIO::swiss trying to parse  
an empty byte stream.  Sendu, do you think that Bio::SeqIO::swiss  
(and other SeqIO parsers) should throw a more specific message for  
getting an empty byte stream?  Or is it more trouble than it's worth?
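
A minimal sketch of what such a check could look like, assuming the
fetched content is available as a string before parsing.  The helper
name check_not_empty and the wording of the message are hypothetical;
this is not the actual Bio::SeqIO::swiss code.

```perl
use strict;
use warnings;

# Hypothetical guard (not part of Bio::SeqIO::swiss): reject an empty
# or whitespace-only response before handing it to a format parser.
sub check_not_empty {
    my ($content, $source) = @_;
    if (!defined $content || $content !~ /\S/) {
        die "No data returned from $source (empty stream); "
          . "the server may be down or the query URL may have changed\n";
    }
    return $content;
}

# An empty response now fails with a message that points at the cause.
my $ok = eval { check_not_empty('', 'the remote SwissProt server'); 1 };
print "Caught: $@" unless $ok;
```

The point of the sketch is only that an empty response can be told
apart from a malformed record before the parser complains about a
missing ID line.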

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 10:14:40 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:14:40 +0100
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>
References: <1160050554.18691.11.camel@localhost>
	<452506D3.5050501@sendu.me.uk>
	<1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>
Message-ID: <45251350.5030608@sendu.me.uk>

Chris Fields wrote:
>
>>>   ------------- EXCEPTION: Bio::Root::Exception -------------
>>>   MSG: swissprot stream with no ID. Not swissprot in my book
[snip]
> I think the error here results from Bio::SeqIO::swiss trying to parse an 
> empty byte stream.  Sendu, do you think that Bio::SeqIO::swiss (and 
> other SeqIO parsers) should throw a more specific message for getting an 
> empty byte stream?  Or is it more trouble than it's worth?

Trouble-wise, I've no idea without looking into it. Generally speaking 
though I can say that the error message is pretty useless and I'm always 
in favour of better error messages.


From hlapp at gmx.net  Thu Oct  5 10:21:49 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 10:21:49 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452511C1.5020709@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
Message-ID: <F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>


On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote:

>>
>> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote:
>>
>>> This is actually a general question and not limited to  
>>> EUtilities. As I
>>> see it EUtilities lets you do queries in Bioperl that you can do  
>>> on a
>>> website. The question is, should a Bioperl module always work with
>>> queries that the website it is a front-end to works with?
>>
>> I think yes, but stick to this definition.
>>
>> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez
>> website it will actually not work. Hence, it should be no surprise  
>> that
>> it doesn't work either using Bio::DB::EUtilities.
>
> On the contrary, I find it a surprise because EUtilities is an  
> interface
> to NCBI's eutils, not the entrez website.
>
> If I had previously read instructions on using eutils:
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
> I might (do) expect that I /should/ use + in my term.

This is my point - stick to your definitions. Are you wrapping a  
query form on a website or are you wrapping a web service (i.e., a URL)?

The examples you give are about wrapping a web-service. Your original  
question was about wrapping a website. Yet another question is what  
the author of Bio::DB::EUtilities intended to wrap.

The other thing to consider is user-friendliness. If you are wrapping  
a web-service, do you still make not URL-encoding the user input the  
default? What will 90% of the users probably want or expect to be  
able to do? URL-encode all input themselves or expect the module to  
do this for them unless they turn it off?

As far as I'm concerned, I'll happily count myself among those who  
are lazy and ignorant, don't read NCBI's documentation, don't want to  
know how to URL encode and why this needs to be done, but just want  
it to work.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Oct  5 10:31:06 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:31:06 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4524B20B.5010703@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
Message-ID: <A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>

On Oct 5, 2006, at 2:19 AM, Sendu Bala wrote:

> This is actually a general question and not limited to EUtilities.  
> As I
> see it EUtilities lets you do queries in Bioperl that you can do on a
> website. The question is, should a Bioperl module always work with
> queries that the website it is a front-end to works with?
>
> So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is
> essentially a frontend onto:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=
>
> With a web-browser you can complete that url by supplying a term. For
> example, the term 'BRCA2+9606[taxid]' works and returns results:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid]
>
> If you supply the exact same term to EUtilities::esearch like so:
>
> my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
> "gene", -term => "BRCA2+9606[taxid]");
>
> The search fails. From my 'user' perspective this is highly  
> unexpected.
> Chris (the author) and I both understand /why/ it fails, but Chris
> doesn't think it is a bug, or at least something that can/should be
> changed. What do other people think? At the very least, if something
> unexpected happens, I'd suggest making a note of it in the POD
> somewhere. Eg. "Do not use + in term strings, even though they might
> work on the website".
>
> Chris: what is the disadvantage of always submitting '+' as '+' to the
> server?

A few reasons:

1)  According to NCBI, you can use '+' in queries, but not as a  
boolean.  Global changes of '+' to a space may change the meaning of  
the query in a few rare occasions.  So, if you really wanted to  
search for the string 'BRCA2+ATG', NCBI looks for that term literally.

2)  '+' is a URI reserved symbol for a space delimiter.  Therefore,  
any parameters containing '+' are URI-encoded into %2B, which is  
decoded on NCBI's end back to '+' (This is demonstrable with current  
EUtilities output and the returned XML data).

3)  Why not just use a space (implicit AND)?  Or an explicit  
boolean?  Or '&' (which apparently works but is not specified in the  
NCBI Entrez docs)?

The bug is in the query and not in the code, i.e. it is a
user-generated bug, not an EUtilities bug.  And it shouldn't be  
unexpected, as NCBI has very specific rules for building queries for  
Entrez (just like any other database).  If I were to use nonstandard  
queries for MySQL, BioFetch, UCSC, or anything else, I would expect  
to get bad results.  As the old saying goes, garbage in, garbage out.

The following link has their updated rules:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.chapter.EntrezHelp

Here is their old one:

http://www.ncbi.nlm.nih.gov/entrez/query/static/help/helpdoc.html

We could, of course, put something in POD, but you never presented  
that option to me before.  I'll grant that the EUtilities API needs  
some cleaning up, which is not easy to do when the returned data varies
between utilities.  But it does get the URL encoding correct, at least in  
this case.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 10:32:49 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:32:49 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>
Message-ID: <45251791.9040409@sendu.me.uk>

Hilmar Lapp wrote:
> 
> On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote:
>
>> On the contrary, I find it a surprise because EUtilities is an interface
>> to NCBI's eutils, not the entrez website.
>>
>> If I had previously read instructions on using eutils:
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls 
>>
>> I might (do) expect that I /should/ use + in my term.
> 
> This is my point - stick to your definitions. Are you wrapping a query 
> form on a website or are you wrapping a web service (i.e., a URL)?
> 
> The examples you give are about wrapping a web-service. Your original 
> question was about wrapping a website.

Right... I don't see that that changes the answer to my question though 
does it?

"The question is, should a Bioperl module always work with
queries that the web-service it is a front-end to works with?"

For me, the answer is still yes.


> As far as I'm concerned, I'll happily count myself among those who are 
> lazy and ignorant, don't read NCBI's documentation, don't want to know 
> how to URL encode and why this needs to be done, but just want it to work.

That's a reasonable attitude to take. Which comes back to the question I 
asked of Chris - naively, if you send + as + you can please everyone, 
can't you? Both people who have read the docs on the web-service and 
those who haven't? Or are there real queries in which a user may want to 
search for a phrase with a literal + in it (and where such a search 
works via eutils)?


From bix at sendu.me.uk  Thu Oct  5 10:44:33 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:44:33 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
Message-ID: <45251A51.6020802@sendu.me.uk>

Chris Fields wrote:
> The bug is in the query and not in the code, i.e. it is a
> user-generated bug, not an EUtilities bug.  And it shouldn't be 
> unexpected, as NCBI has very specific rules for building queries for 
> Entrez (just like any other database).

So I guess this comes down to something Hilmar mentioned and I never 
even considered before. You consider your EUtilities stuff as a frontend 
to entrez, and therefore consider valid queries as queries that are 
valid for entrez and not eutils?

If that's the case, fine. I understand why you don't think this is a 
bug. Again, something that might warrant a mention in the POD.
Currently the naming of the modules and the explicit references to 
eutils (and me knowing the implementation uses eutils) got me confused.


From cjfields at uiuc.edu  Thu Oct  5 10:51:28 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:51:28 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452511C1.5020709@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
Message-ID: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>


On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote:

>>> This is actually a general question and not limited to  
>>> EUtilities. As I
>>> see it EUtilities lets you do queries in Bioperl that you can do  
>>> on a
>>> website. The question is, should a Bioperl module always work with
>>> queries that the website it is a front-end to works with?
>>
>> I think yes, but stick to this definition.
>>
>> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez
>> website it will actually not work. Hence, it should be no surprise  
>> that
>> it doesn't work either using Bio::DB::EUtilities.
>
> On the contrary, I find it a surprise because EUtilities is an  
> interface
> to NCBI's eutils, not the entrez website.

It uses NCBI's CGI interface for eutils, not the SOAP interface.   
Very different.  I have considered using the NCBI SOAP-based  
interface, but the web services are still somewhat incomplete, unlike  
the CGI interface.

> If I had previously read instructions on using eutils:
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
> I might (do) expect that I /should/ use + in my term.

You are looking at part of the naked URL on that page.  Here's what  
that page says:

"When constructing URLs for the eUtils, please use lowercase  
characters for all parameters except &WebEnv. There is no required  
order for the URL parameters in an eUtils URL, and null values or  
inappropriate parameters are ignored. Avoid placing spaces in the  
URLs, particularly in queries. If a space is required, use a plus  
sign (+) instead of a space:

     * Incorrect: &id=352, 25125, 234, ...
     * Correct: &id=352,25125,234,...
     * Incorrect: &term=biomol mrna[properties] AND mouse[organism]
     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]

Other special characters, such as the # symbol used in referring to a  
query key on the History server, should be represented by their URL  
encodings (%23 for #)."

I use URI for building the URL with the parameters.  URI specifically  
encodes all of this for you, so spaces convert to '+' and '+'  
converts to %2B.
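
The encoding behaviour described above can be illustrated with a small
sketch.  The function form_encode is a hand-rolled approximation of
form-style query encoding written for this example; it is an assumption
and does not reproduce the internals of the URI module.

```perl
use strict;
use warnings;

# Hand-rolled approximation of form-style query encoding (an assumption
# for illustration; it does not reproduce the URI module's internals).
sub form_encode {
    my ($value) = @_;
    # Percent-encode everything except unreserved characters and spaces.
    $value =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord($1))/ge;
    # Spaces become '+', the form-encoding convention.
    $value =~ s/ /+/g;
    return $value;
}

print form_encode('BRCA2 9606[taxid]'), "\n";  # BRCA2+9606%5Btaxid%5D
print form_encode('BRCA2+9606[taxid]'), "\n";  # BRCA2%2B9606%5Btaxid%5D
```

A term written with a space reaches the server as '+', while a term
that already contains '+' is sent as %2B, i.e. as a literal plus sign.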

>> Aside from that, one of the advantages of having the service  
>> wrapped in
>> Bioperl is in fact that you can have it accept a wider variety of
>> parameters than the actual service would allow you to have, e.g.,
>> arrays, hashes, or whatever seems appropriate.
>
> I was going to suggest that terms be supplied as an array, leaving
> Bioperl code to decide how to 'AND' all the terms (elements in the
> array) together. It would also further force the user not to think of
> how eutils normally works, but to only consider the Bioperl  
> instructions
> on how to form a query. But I'm not sure of the value of all that.

Why do we need to intuit what the user is thinking at any particular  
time?  How would I know that someone actually wanted to search using  
the literal string 'abc+123' as opposed to 'abc 123'?

I see value in your last suggestion but I think a class or set of  
classes would be best suited for that:

MySQL Query     |  in                      out   | MySQL Query
Entrez Query    |-----> Generic Query class----->| Entrez Query
SRS Query       |                                | SRS Query
ad infinitum...

The generic query object could then be used in DB searches as an  
option besides using a raw string.  Though it would get tricky with  
SQL's complexity...
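
A rough sketch of the idea, assuming a generic query is nothing more
than a list of [field, value] pairs that are all ANDed together.  The
names to_entrez and to_sql, and the default SQL column 'name', are
hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical generic query: [field, value] pairs, all ANDed together.
# An undefined field means a free-text term.
my @query = ( [ undef, 'BRCA2' ], [ 'taxid', '9606' ] );

# Render as an Entrez-style term string: value[field] joined by AND.
sub to_entrez {
    return join ' AND ',
        map { defined $_->[0] ? $_->[1] . '[' . $_->[0] . ']' : $_->[1] } @_;
}

# Render as an SQL WHERE clause with placeholders plus bind values.
# The default column 'name' is an assumption for free-text terms.
sub to_sql {
    my @clauses = map { (defined $_->[0] ? $_->[0] : 'name') . ' = ?' } @_;
    my @binds   = map { $_->[1] } @_;
    return (join(' AND ', @clauses), @binds);
}

print to_entrez(@query), "\n";      # BRCA2 AND 9606[taxid]
my ($where, @binds) = to_sql(@query);
print "$where (@binds)\n";          # name = ? AND taxid = ? (BRCA2 9606)
```

Anything beyond flat AND lists (nesting, OR, NOT, joins) is where the
SQL side would become considerably harder to cover.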

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Thu Oct  5 10:54:04 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 10:54:04 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251791.9040409@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>
	<45251791.9040409@sendu.me.uk>
Message-ID: <9916EDEE-EA3C-4C55-A004-A46F37B559BF@gmx.net>


On Oct 5, 2006, at 10:32 AM, Sendu Bala wrote:

>> The examples you give are about wrapping a web-service. Your  
>> original question was about wrapping a website.
>
> Right... I don't see that that changes the answer to my question  
> though does it?
>
> "The question is, should a Bioperl module always work with
> queries that the web-service it is a front-end to works with?"
>
> For me, the answer is still yes.

The answer is still yes. My point was the query that works with a  
website is not necessarily the query that works with a web-service,  
even if that web-service also powers the website.

>
>> As far as I'm concerned, I'll happily count myself among those who  
>> are lazy and ignorant, don't read NCBI's documentation, don't want  
>> to know how to URL encode and why this needs to be done, but just  
>> want it to work.
>
> That's a reasonable attitude to take. Which comes back to the  
> question I asked of Chris - naively, if you send + as + you can  
> please everyone, can't you? Both people who have read the docs on  
> the web-service and those who haven't? Or are there real queries in  
> which a user may want to search for a phrase with a literal + in it  
> (and where such a search works via eutils)?

So are you suggesting to URL-encode some characters but not others?  
This would move you into muddy waters and I'm wondering what the gain  
is from that, and for whom it is a gain.

It sounds like it will mostly benefit those who have studied the NCBI  
documentation and know exactly the URL they want to send and want to  
ignore the EUtilities POD.

My humble guess is the vast majority of people will either not read  
any documentation, or read the module's POD.

Maybe a better way to serve both types of people is to accept a  
parameter -querystring that is expected to include everything from  
'term=' onwards (including 'term=' itself) which gives you complete  
control and freedom if you know what you are doing, and otherwise  
implement what you suggested before:

> I was going to suggest that terms be supplied as an array, leaving
> Bioperl code to decide how to 'AND' all the terms (elements in the
> array) together. It would also further force the user not to think of
> how eutils normally works, but to only consider the Bioperl  
> instructions
> on how to form a query.
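
The two modes could be sketched as follows.  The function build_query
and its behaviour are hypothetical: -querystring is passed through
untouched, while -term takes an array of terms that are ANDed and then
form-encoded.  None of this is the existing Bio::DB::EUtilities API.

```perl
use strict;
use warnings;

# Hypothetical illustration of the two modes; not the EUtilities API.
sub build_query {
    my %args = @_;

    # Expert mode: the caller supplies the finished query string.
    return $args{-querystring} if defined $args{-querystring};

    # Default mode: AND the terms together, then form-encode the result.
    my $term = join ' AND ', @{ $args{-term} || [] };
    $term =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord($1))/ge;
    $term =~ s/ /+/g;
    return "term=$term";
}

print build_query(-term => [ 'BRCA2', '9606[taxid]' ]), "\n";
# term=BRCA2+AND+9606%5Btaxid%5D
print build_query(-querystring => 'term=BRCA2+9606[taxid]'), "\n";
# term=BRCA2+9606[taxid]
```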


	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Thu Oct  5 11:02:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 16:02:01 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
Message-ID: <45251E69.7040507@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote:
>
>> On the contrary, I find it a surprise because EUtilities is an interface
>> to NCBI's eutils, not the entrez website.
> 
> It uses NCBI's CGI interface for eutils, not the SOAP interface.  Very 
> different.  I have considered using the NCBI SOAP-based interface, but 
> the web services are still somewhat incomplete, unlike the CGI interface.

I don't know anything about the SOAP interface. I'm talking about the 
CGI interface that you use.


>> If I had previously read instructions on using eutils:
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls 
>>
>> I might (do) expect that I /should/ use + in my term.
> 
> You are looking at part of the naked URL on that page.  Here's what that 
> page says:

I know what it says...

>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]

The correct query is the one that has +s in it.


> I use URI for building the URL with the parameters.  URI specifically 
> encodes all of this for you, so spaces convert to '+' and '+' converts 
> to %2B.

Well, yes. This causes what I thought of as a bug. It prevents me from 
submitting a /correct/ eutils term. However it isn't a bug if you 
explain to users they shouldn't be submitting valid eutils terms, but 
only valid /entrez/ terms.


From cjfields at uiuc.edu  Thu Oct  5 11:15:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:15:49 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251A51.6020802@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
	<45251A51.6020802@sendu.me.uk>
Message-ID: <B0BCFFEE-9200-4CC7-9D53-7C2266950691@uiuc.edu>


On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> The bug is in the query and not in the code, i.e. it is a
>> user-generated bug, not an EUtilities bug.  And it shouldn't be  
>> unexpected, as NCBI has very specific rules for building queries  
>> for Entrez (just like any other database).
>
> So I guess this comes down to something Hilmar mentioned and I  
> never even considered before. You consider your EUtilities stuff as  
> a frontend to entrez, and therefore consider valid queries as  
> queries that are valid for entrez and not eutils?

The eutils tools access the same databases as the web page, in the  
same way, using the same search terms.  From the EUtilities docs:

"The eUtils access the core search and retrieval engine of the Entrez  
system and, therefore, are only capable of retrieving data that are  
already in Entrez."

> If that's the case, fine. I understand why you don't think this is  
> a bug. Again, something that might warrant a mention in the POD.
> Currently the naming of the modules and the explicit references to  
> eutils (and me knowing the implementation uses eutils) got me  
> confused.

I'll note in the POD that there is URI encoding, but that should be a  
no-brainer.  I don't think every Bio::DB* class specifies this,  
mainly because it is taken for granted.  Pretty much anything that  
builds URL strings needs to encode based on the URI standard, and any  
server that accepts URLs is expected to decode using the same standard.

So, again, why does that have to be specifically outlined in POD?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 11:24:39 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:24:39 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251E69.7040507@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
Message-ID: <BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>

>> I use URI for building the URL with the parameters.  URI  
>> specifically encodes all of this for you, so spaces convert to '+'  
>> and '+' converts to %2B.
>
> Well, yes. This causes what I thought of as a bug. It prevents me  
> from submitting a /correct/ eutils term. However it isn't a bug if  
> you explain to users they shouldn't be submitting valid eutils  
> terms, but only valid /entrez/ terms.

I can specify in POD that URI encoding is in effect if that placates  
you, and maybe add a bit about how terms are to be built (based on  
the website).  I also noticed that the esearch POD doesn't have a  
demo in the SYNOPSIS yet (my fault).

However, I think this is all a bit silly.  This is something most  
people already realize and take for granted (it's standard for any  
CGI interface to use URI encoding).

Also, most Entrez users do not use a term like 'BRCA2+Human[ORGANISM]'.
They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human[ORGANISM]', the
latter of which is an implicit AND.  All of this is on the Entrez
website.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From MEC at stowers-institute.org  Thu Oct  5 11:12:02 2006
From: MEC at stowers-institute.org (Cook, Malcolm)
Date: Thu, 5 Oct 2006 10:12:02 -0500
Subject: [Bioperl-l] using nfreeze instead of freeze in
	Bio::SeqFeature::Store
Message-ID: <CED81D34E37D5043A1211565277A51E5065E9879@exchkc02.stowers-institute.org>

Lincoln,

I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
freeze which should allow SeqFeature objects to survive database
freeze/thaw cycles across architectures.
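
For readers unfamiliar with the difference, here is a small sketch
using Storable directly.  A plain hash stands in for a SeqFeature
object, and the key names are made up for the example; this is not the
Bio::SeqFeature::Store code itself.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# A plain hash stands in for a SeqFeature object here (an assumption
# for illustration only).
my $feature = { seq_id => 'chr1', start => 100, end => 200 };

# nfreeze serializes in network (portable) byte order, so the frozen
# string can be thawed on a machine with a different architecture.
my $frozen = nfreeze($feature);
my $copy   = thaw($frozen);

print "$copy->{seq_id}:$copy->{start}-$copy->{end}\n";  # chr1:100-200
```

thaw() reads both the native and the network format, so data frozen
earlier with freeze() should still be readable on the machine that
wrote it.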

I hope I was not presumptuous or in error in doing this....

Regards,

Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri
 

From bix at sendu.me.uk  Thu Oct  5 11:28:55 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 16:28:55 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <B0BCFFEE-9200-4CC7-9D53-7C2266950691@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
	<45251A51.6020802@sendu.me.uk>
	<B0BCFFEE-9200-4CC7-9D53-7C2266950691@uiuc.edu>
Message-ID: <452524B7.5080003@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote:
> 
>> Chris Fields wrote:
>>> The bug is in the query and not in the code, i.e. it is a
>>> user-generated bug, not an EUtilities bug.  And it shouldn't be 
>>> unexpected, as NCBI has very specific rules for building queries for 
>>> Entrez (just like any other database).
>>
>> So I guess this comes down to something Hilmar mentioned and I never 
>> even considered before. You consider your EUtilities stuff as a 
>> frontend to entrez, and therefore consider valid queries as queries 
>> that are valid for entrez and not eutils?
> 
> The eutils tools access the same databases as the web page, in the same 
> way, using the same search terms.

It doesn't. The eutils interface behaves differently with +s than does 
the entrez website interface. In eutils + means space, whilst in entrez, 
+ means the plus symbol.
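
The server-side view of this can be sketched with a small decoder.
The function form_decode is a hand-written approximation of standard
form decoding, used here only to show what each submitted string turns
into; it is an assumption about, not a copy of, what NCBI's server does.

```perl
use strict;
use warnings;

# Hand-written approximation of form decoding (an assumption about
# what a CGI server does with the query string).
sub form_decode {
    my ($value) = @_;
    # A raw '+' in the query string stands for a space.
    $value =~ tr/+/ /;
    # Percent escapes are turned back into their characters.
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $value;
}

print form_decode('BRCA2+9606%5Btaxid%5D'), "\n";    # BRCA2 9606[taxid]
print form_decode('BRCA2%2B9606%5Btaxid%5D'), "\n";  # BRCA2+9606[taxid]
```

A '+' typed into the URL arrives as a space, whereas a '+' passed
through the module is sent as %2B and arrives as a literal plus sign.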


>> If that's the case, fine. I understand why you don't think this is a 
>> bug. Again, something that might warrant a mention in the POD.
>> Currently the naming of the modules and the explicit references to 
>> eutils (and me knowing the implementation uses eutils) got me confused.
> 
> I'll note in the POD that there is URI encoding, but that should be a 
> no-brainer.

Just that it is URI encoded isn't the problem. The problem is the 
difference in behaviour outlined above.


> I don't think every Bio::DB* class specifies this, mainly 
> because it is taken for granted.  Pretty much anything that builds URL 
> strings needs to encode based on the URI standard, and any server that 
> accepts URLs is expected to decode using the same standard.
> 
> So, again, why does that have to be specifically outlined in POD?

Because they're different. If I construct a valid eutils query it might 
not work. You ought to explain why.

"EUtilities takes any valid entrez query and transforms it into a valid 
eutils query for submission. Do not try and provide a valid eutils query 
of your own, or the extra transformation will result in no results"


From bix at sendu.me.uk  Thu Oct  5 11:30:44 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 16:30:44 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
Message-ID: <45252524.7030006@sendu.me.uk>

Chris Fields wrote:
>>> I use URI for building the URL with the parameters.  URI specifically 
>>> encodes all of this for you, so spaces convert to '+' and '+' 
>>> converts to %2B.
>>
>> Well, yes. This causes what I thought of as a bug. It prevents me from 
>> submitting a /correct/ eutils term. However it isn't a bug if you 
>> explain to users they shouldn't be submitting valid eutils terms, but 
>> only valid /entrez/ terms.
> 
> I can specify in POD that URI encoding is in effect if that placates 
> you, and maybe add a bit about how terms are to be built (based on the 
> website).  I also noticed that the esearch POD doesn't have a demo in 
> the SYNOPSIS yet (my fault).
> 
> However, I think this is all a bit silly.  This is something most people 
> already realize and take for granted (it's standard for any CGI 
> interface to use URI encoding).
> 
> Also, most Entrez users do not use a term like 'BRCA2+Human[ORGANISM]'.  
> They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human[ORGANISM]', the 
> latter which is implicit.  All of this is on the Entrez website.

Exactly. You're assuming an entrez user and expecting an entrez query. I 
don't think it's silly given the name of the modules for the user to 
assume the code needs an eutils query, which is a different thing with 
different behaviour /independent/ of URI encoding.


From cjfields at uiuc.edu  Thu Oct  5 11:50:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:50:51 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251E69.7040507@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
Message-ID: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>

> I know what it says...

Ah, that's the Sendu I know and love.

>
>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>
> The correct query is the one that has +s in it.

Yes, that's because it's a URL, not a raw search term string (it has  
been URI-encoded so spaces are converted to '+').  If you use that as  
a direct query in Entrez you will not get the same response.  You do  
get something if you use the new NCBI global query form on the main  
page, but clicking on the nucleotide or PMC hits reveals that the URL  
is malformed and no term is present.  That is exactly the same  
response in EUtilities:

<?xml version="1.0"?>
<!DOCTYPE eSearchResult PUBLIC "-//NLM//DTD eSearchResult, 11 May 2002//EN"
 "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eSearch_020511.dtd">
<eSearchResult>
         <Count>0</Count>
         <RetMax>0</RetMax>
         <RetStart>0</RetStart>
         <IdList>
         </IdList>
         <TranslationSet>
         </TranslationSet>
         <QueryTranslation></QueryTranslation>
</eSearchResult>

Note the QueryTranslation tag is empty.
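
A script can check for this case before going any further. The sketch below
pulls Count and QueryTranslation out of an eSearchResult like the one above
using plain regular expressions. The sample XML is abbreviated, and a real
XML parser would be the safer choice for anything beyond a quick test.

```perl
use strict;
use warnings;

# Abbreviated copy of the response shown above.
my $xml = <<'XML';
<eSearchResult>
         <Count>0</Count>
         <RetMax>0</RetMax>
         <RetStart>0</RetStart>
         <IdList>
         </IdList>
         <QueryTranslation></QueryTranslation>
</eSearchResult>
XML

# Extract the hit count and the term as the server translated it.
my ($count)       = $xml =~ m{<Count>(\d+)</Count>};
my ($translation) = $xml =~ m{<QueryTranslation>(.*?)</QueryTranslation>}s;

if ($count == 0 && $translation eq '') {
    print "No hits and no query translation: the term was probably not understood\n";
}
else {
    print "Hits: $count, translated as: $translation\n";
}
```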

The only noticeable difference is using egquery (which I just fixed  
in CVS yesterday).  The returned XML gives no hits for any database,  
which is true based on individual esearch queries for those databases,  
and is actually more consistent than the website version.

>> I use URI for building the URL with the parameters.  URI specifically
>> encodes all of this for you, so spaces convert to '+' and '+'  
>> converts
>> to %2B.
>
> Well, yes. This causes what I thought of as a bug. It prevents me from
> submitting a /correct/ eutils term. However it isn't a bug if you
> explain to users they shouldn't be submitting valid eutils terms, but
> only valid /entrez/ terms.

If you mean that most users will actually use a URL-like search term,  
then I would say you have a point.  But that simply isn't the case.

If clarifying the docs makes it better, then so be it.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 11:59:53 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:59:53 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45252524.7030006@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
Message-ID: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>


On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>>> I use URI for building the URL with the parameters.  URI  
>>>> specifically encodes all of this for you, so spaces convert to  
>>>> '+' and '+' converts to %2B.
>>>
>>> Well, yes. This causes what I thought of as a bug. It prevents me  
>>> from submitting a /correct/ eutils term. However it isn't a bug  
>>> if you explain to users they shouldn't be submitting valid eutils  
>>> terms, but only valid /entrez/ terms.
>> I can specify in POD that URI encoding is in effect if that  
>> placates you, and maybe add a bit about how terms are to be built  
>> (based on the website).  I also noticed that the esearch POD  
>> doesn't have a demo in the SYNOPSIS yet (my fault).
>> However, I think this is all a bit silly.  This is something most  
>> people already realize and take for granted (it's standard for any  
>> CGI interface to use URI encoding).
>> Also, most Entrez users do not use a term like 'BRCA2+Human 
>> [ORGANISM]'.  They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human 
>> [ORGANISM]', the latter which is implicit.  All of this is on the  
>> Entrez website.
>
> Exactly. You're assuming an entrez user and expecting an entrez  
> query. I don't think it's silly given the name of the modules for  
> the user to assume the code needs an eutils query, which is a  
> different thing with different behaviour /independent/ of URI  
> encoding.

It's a silly distinction.  The POD for Bio::DB::EUtilities states:

Bio::DB::EUtilities - interface for handling web queries and data  
retrieval from NCBI's Entrez Utilities.

My question is this : why would anyone (particularly the everyday  
bioperl user) want to use URL-encoded parameters for a query?  That  
seems to be your main argument here.  If so, wouldn't I just paste  
them together then send them off NCBI eutils?  Would I devote ~ 10  
classes to that?  I could do that in a short program using an array,  
join, and LWP::Simple.

The purpose is quite clearly stated, but if you feel that by  
badgering me to add something to POD I consider common sense, then  
you're right.  You've succeeded.  Bravo.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 12:02:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 17:02:05 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
Message-ID: <45252C7D.3050009@sendu.me.uk>

Chris Fields wrote:
>
>>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>
>> The correct query is the one that has +s in it.
> 
> Yes, that's because it's a URL, not a raw search term string (it has 
> been URI-encoded so spaces are converted to '+').  If you use that as a 
> direct query in Entrez you will not get the same response.

But we're not doing Entrez queries. We're using a module called 
EUtilities to do an eutils query, which involves forming a url in which 
spaces should be converted to +. That's the source of confusion. Is 
the user supposed to do this, or is EUtilities?

All you had to do 8 emails ago is tell me that EUtilities is supposed to 
do that. You /still/ haven't told me that. I give up.


From cjfields at uiuc.edu  Thu Oct  5 12:12:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 11:12:11 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45252C7D.3050009@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
	<45252C7D.3050009@sendu.me.uk>
Message-ID: <A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>


On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>>>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>>
>>> The correct query is the one that has +s in it.
>> Yes, that's because it's a URL, not a raw search term string (it  
>> has been URI-encoded so spaces are converted to '+').  If you use  
>> that as a direct query in Entrez you will not get the same response.
>
> But we're not doing Entrez queries. We're using a module called  
> EUtilities to do an eutils query, which involves forming a url in  
> which spaces should be converted to +. That's the source of  
> confusion. Is the user supposed to do this, or is EUtilities?
>
> All you had to do 8 emails ago is tell me that EUtilities is  
> supposed to do that. You /still/ haven't told me that. I give up.

It should be apparent from the documentation and the URLs posted in  
debugging output the first few times you used it.  Again, why would I  
dedicate ~ 10 classes to pasting together URI-encoded strings?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 12:22:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 17:22:36 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
Message-ID: <4525314C.7020205@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote:
>
>> Exactly. You're assuming an entrez user and expecting an entrez query. 
>> I don't think it's silly given the name of the modules for the user to 
>> assume the code needs an eutils query, which is a different thing with 
>> different behaviour /independent/ of URI encoding.
> 
> It's a silly distinction.  The POD for Bio::DB::EUtilities states:
> 
> Bio::DB::EUtilities - interface for handling web queries and data 
> retrieval from NCBI's Entrez Utilities.
> 
> My question is this : why would anyone (particularly the everyday 
> bioperl user) want to use URL-encoded parameters for a query?

Well I'll tell you why I was trying to use URL-encoded parameters, if 
that helps you any.

I read the pod for EUtilities but all the examples have very simple 
-term s defined with just a single word. So I wonder how I'm supposed to 
make an 'AND' term. I also have no idea what utilities I'm supposed to 
use, or what databases etc. I need to get the answer I want.

The POD points me here:
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
Combined with the EUtilities synopsis I know I'm supposed to start with 
esearch so I look at:
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html
And figure out what my terms are supposed to be.

Then I test some example terms in my web browser using the esearch base 
url (http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?) to see 
if they work, and copy/paste the terms into my EUtilities-using perl 
script, replacing variable terms with perl variables.

Then I find that my terms don't work, ask you about it, and you fail to 
tell me I should be testing my terms at 
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=gene.

If you think I'm stupid, fine, but I'm probably not the only stupid 
person on the planet. Which is why I suggested a POD addition. You don't 
have to make any POD change if you don't want to. I simply thought it 
might help avoid anyone 'badgering' you in the future with a similar 
problem.


From bix at sendu.me.uk  Thu Oct  5 12:28:51 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 17:28:51 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
	<45252C7D.3050009@sendu.me.uk>
	<A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>
Message-ID: <452532C3.9030804@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote:
> 
>> Chris Fields wrote:
>>>
>>>>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>>>
>>>> The correct query is the one that has +s in it.
>>> Yes, that's because it's a URL, not a raw search term string (it has 
>>> been URI-encoded so spaces are converted to '+').  If you use that as 
>>> a direct query in Entrez you will not get the same response.
>>
>> But we're not doing Entrez queries. We're using a module called 
>> EUtilities to do an eutils query, which involves forming a url in 
>> which spaces should be converted to +. That's the source of 
>> confusion. Is the user supposed to do this, or is EUtilities?
>>
>> All you had to do 8 emails ago is tell me that EUtilities is supposed 
>> to do that. You /still/ haven't told me that. I give up.
> 
> It should be apparent from the documentation and the URLs posted in 
> debugging output the first few times you used it.  Again, why would I 
> dedicate ~ 10 classes to pasting together URI-encoded strings?

I'm not sure how not doing URI-encoding would suddenly make your classes 
worthless. I find them to be very useful (even when I didn't know there 
was any URI-encoding, was incorrectly using +s and it happened to work 
anyway).


From bernd.web at gmail.com  Thu Oct  5 10:09:38 2006
From: bernd.web at gmail.com (Bernd Web)
Date: Thu, 5 Oct 2006 16:09:38 +0200
Subject: [Bioperl-l] Eutilities Batch
Message-ID: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>

Hi,

I am using the new EUtilities. It looks great.
I was trying to use epost followed by elink but i get an error. The
same error is actually given with the example on
http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
Can't call method "get_databases" on an undefined value at EU.pl line 25.

For completeness, the code is shown below too.

Any suggestions what is going wrong?

Regards,
Bernd

# chain EUtilities for complex queries

  use Bio::DB::EUtilities;

  my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                         -db         => 'pubmed',
                                         -term       => 'hutP',
                                         -usehistory => 'y');

  $esearch->get_response; # parse the response, fetch a cookie

  my $elink = Bio::DB::EUtilities->new(-eutil        => 'elink',
                                       -db           => 'protein,taxonomy',
                                       -dbfrom       => 'pubmed',
                                       -cookie       => $esearch->next_cookie,
                                       -cmd          => 'neighbor');

  # this retrieves the Bio::DB::EUtilities::ElinkData object

  my ($linkset) = $elink->next_linkset;
  my @ids;

  # step through IDs for each linked database in the ElinkData object

  for my $db ($linkset->get_databases) {
    @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
    # do something here
  }


From cjfields at uiuc.edu  Thu Oct  5 13:31:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:31:33 -0500
Subject: [Bioperl-l] Eutilities Batch
In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
Message-ID: <F53B83B9-E188-4715-8229-0B6D9C0C982A@uiuc.edu>

I'll look into it.  I'm busy updating the EUtilities tools now.

Chris

On Oct 5, 2006, at 9:09 AM, Bernd Web wrote:

> Hi,
>
> I am using the new EUtilities. It looks great.
> I was trying to use epost followed by elink but i get an error. The
> same error is actually given with the example on
> http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
> Can't call method "get_databases" on an undefined value at EU.pl  
> line 25.
>
> For completeness, the code is shown below too.
>
> Any suggestions what is going wrong?
>
> Regards,
> Bernd
>
> # chain EUtilities for complex queries
>
>   use Bio::DB::EUtilities;
>
>   my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                          -db         => 'pubmed',
>                                          -term       => 'hutP',
>                                          -usehistory => 'y');
>
>   $esearch->get_response; # parse the response, fetch a cookie
>
>   my $elink = Bio::DB::EUtilities->new(-eutil        => 'elink',
>                                        -db           =>  
> 'protein,taxonomy',
>                                        -dbfrom       => 'pubmed',
>                                        -cookie       => $esearch- 
> >next_cookie,
>                                        -cmd          => 'neighbor');
>
>   # this retrieves the Bio::DB::EUtilities::ElinkData object
>
>   my ($linkset) = $elink->next_linkset;
>   my @ids;
>
>   # step through IDs for each linked database in the ElinkData object
>
>   for my $db ($linkset->get_databases) {
>     @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
>     # do something here
>   }
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From daniel.lang at biologie.uni-freiburg.de  Thu Oct  5 13:12:02 2006
From: daniel.lang at biologie.uni-freiburg.de (Daniel Lang)
Date: Thu, 05 Oct 2006 19:12:02 +0200
Subject: [Bioperl-l] Bio::DB::SeqFeature
Message-ID: <45253CE2.1070208@biologie.uni-freiburg.de>

Hi,

we are storing Bio::SeqFeature::Gene::GeneStructure objects (with
multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db
(latest bioperl-live checkout).

The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch
out of a database.

The first observation is that it seems to work (fetched objects behave
like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we
get these warnings:

Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into
lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into
lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
        (in cleanup) Not a CODE reference at
/home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.
prepare_cached(SELECT f.id,f.object
  FROM feature as f
  WHERE (   f.seqid=?
   AND   f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?))
)

) statement handle DBI::st=HASH(0x1c317cf0) still Active at
/home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm
line 1422
        (in cleanup) Not a CODE reference at
/home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.

Is this something serious? Does this mean that the stored object doesn't
have everything it had before freezing? Or are we using
Bio::DB::SeqFeature inappropriately?
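
For reference, the "Can't store item CODE" text looks like the warning
Storable emits when it is asked to freeze a code reference while
$Storable::forgive_me is set. The sketch below reproduces that situation
with a made-up hash (not a real GeneStructure); whether the Store backend
actually sets forgive_me is an assumption here, not something confirmed.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Made-up stand-in for a feature object that carries a callback.
my $feature = {
    name     => 'gene1',
    callback => sub { return 42 },
};

my @warnings;
my $copy = do {
    # Assumption: warn about unstorable items instead of dying.
    local $Storable::forgive_me = 1;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    thaw(freeze($feature));
};

print "name after round trip: $copy->{name}\n";
print "callback is ",
      (ref($copy->{callback}) eq 'CODE' ? 'still' : 'no longer'),
      " a code reference\n";
print "warning: $_" for @warnings;
```

In a run like this the plain data should come back intact while the code
reference does not, so anything in a stored object that relied on such a
callback would need to be re-attached after fetching.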

The other question would be, if we can visualize these stored feature
objects easily using gbrowse? I didn't find a hint mentioning
Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages...
Is it working already? Will it?

Thanks in advance,
Daniel

-- 

Daniel Lang
University of Freiburg, Plant Biotechnology
Schaenzlestr. 1, D-79104 Freiburg
fax: +49 761 203 6945
phone: +49 761 203 6974
homepage:  http://www.plant-biotech.net/
e-mail: daniel.lang at biologie.uni-freiburg.de

#################################################
My software never has bugs.
It just develops random features.
#################################################


From cjfields at uiuc.edu  Thu Oct  5 13:45:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:45:40 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452532C3.9030804@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
	<45252C7D.3050009@sendu.me.uk>
	<A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>
	<452532C3.9030804@sendu.me.uk>
Message-ID: <003DD8C4-6E59-44C2-9A1C-117E036D93BC@uiuc.edu>


On Oct 5, 2006, at 11:28 AM, Sendu Bala wrote:

> I'm not sure how not doing URI-encoding would suddenly make your  
> classes worthless. I find them to be very useful (even when I  
> didn't know there was any URI-encoding, was incorrectly using +s  
> and it happened to work anyway).

That's not my point (and sincerest apologies for the 'badgering'  
bit).  If you made the assumption that all the parameters had to be  
URI-encoded, why couldn't I do something like:

my %param = (#make up your list of parameters here#);
my $eutil = 'esearch';
my $url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/$eutil.fcgi";
# join the key value pairs with '=', then join all those with &
# add to end of url
# post and retrieve via LWP::Simple

It's more user-friendly to set up the parameters so that you wouldn't  
have to encode everything yourself, esp. when the most reliable way  
to encode URI strings is to 'use URI'.
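
For comparison, the hand-built approach outlined in the comments above might
look like the sketch below. The parameter values are made up for
illustration and are already URI-encoded by hand, which is exactly the
burden the classes are meant to remove. The fetch itself is left as a
comment so the sketch runs without network access.

```perl
use strict;
use warnings;

# Illustrative parameters; the values must already be URI-encoded by hand.
my %param = (
    db   => 'gene',
    term => 'BRCA2+AND+Human%5BORGANISM%5D',
);

my $eutil = 'esearch';
my $url   = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/$eutil.fcgi";

# Join each key/value pair with '=', then join the pairs with '&'.
my $query    = join '&', map { "$_=$param{$_}" } sort keys %param;
my $full_url = "$url?$query";

print "$full_url\n";

# Fetching would then be a single call, for example:
#   use LWP::Simple;
#   my $xml = get($full_url);
```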

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 14:11:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 13:11:25 -0500
Subject: [Bioperl-l] Eutilities Batch
In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
Message-ID: <4A340977-C6AD-4728-8947-BF5A8A782807@uiuc.edu>


On Oct 5, 2006, at 9:09 AM, Bernd Web wrote:

> Hi,
>
> I am using the new EUtilities. It looks great.
> I was trying to use epost followed by elink but i get an error. The
> same error is actually given with the example on
> http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
> Can't call method "get_databases" on an undefined value at EU.pl  
> line 25.
>
> For completeness, the code is shown below too.
>
> Any suggestions what is going wrong?
>
> Regards,
> Bernd

Grr...that's my error, sorry Bernd.  The POD wasn't updated to match
the change I made and has a few errors.  For starters, the example
never calls get_response() on the elink object.  Also, the ElinkData
method has been renamed (get_all_linkdbs instead of get_databases) but
accomplishes the same thing.  Odd, since I copied and pasted that from
working code...

Just a note: these are considered highly experimental at the moment,  
though they should be ready for general use and toying around.  I  
would like any suggestions on methods and so on you may have (Sendu  
has made some very helpful ones off-list which I plan on implementing).

Feel free to let me know if something doesn't work.  Note that,  
because of their experimental nature, you will want to take note of  
any methods changes in particular as I try to solidify the API and  
clean up the POD, so expect some momentary 'outages'.  I plan on  
setting up a remedial interface for all the container objects (like  
ElinkData) which will help clarify things and solidify the API in the  
next few weeks, at least to a point where the class methods have a  
consistent naming scheme.  I plan on using this as a backend web  
agent for a general Entrez interface at some point to get data into  
Bio* objects.

In the meantime, try this:

use Bio::DB::EUtilities;

my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                        -db         => 'pubmed',
                                        -term       => 'hutP',
                                        -usehistory => 'y');

$esearch->get_response; # parse the response, fetch a cookie

my $elink = Bio::DB::EUtilities->new(-eutil        => 'elink',
                                      -db           => 'protein,taxonomy',
                                      -dbfrom       => 'pubmed',
                                      -cookie       => $esearch->next_cookie,
                                      -cmd          => 'neighbor');

$elink->get_response;

# this retrieves the Bio::DB::EUtilities::ElinkData object

my $linkset = $elink->next_linkset;
my @ids;

# step through IDs for each linked database in the ElinkData object

for my $db ($linkset->get_all_linkdbs) {
   @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
   print join q(,), @ids;
   # do something here
}
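
If you want the script to fail with a clearer message than "Can't call
method ... on an undefined value", a guard along these lines may help.
Local::FakeLinkSet is a made-up stand-in so the sketch runs without network
access; with the real classes you would pass in whatever
$elink->next_linkset returns. The ids_by_db helper is likewise hypothetical
and uses only the two ElinkData methods shown above.

```perl
use strict;
use warnings;

# Collect IDs per linked database, dying early if no linkset was returned.
sub ids_by_db {
    my ($linkset) = @_;
    die "No linkset returned; was get_response called on the elink object?\n"
        unless defined $linkset;
    my %ids;
    for my $db ($linkset->get_all_linkdbs) {
        $ids{$db} = [ $linkset->get_LinkIds_by_db($db) ];
    }
    return \%ids;
}

# Made-up stand-in that mimics the two methods used above.
{
    package Local::FakeLinkSet;
    sub new               { return bless { protein => [ 1, 2 ], taxonomy => [ 3 ] }, shift }
    sub get_all_linkdbs   { return sort keys %{ $_[0] } }
    sub get_LinkIds_by_db { return @{ $_[0]{ $_[1] } } }
}

my $ids = ids_by_db(Local::FakeLinkSet->new);
print "$_: @{ $ids->{$_} }\n" for sort keys %$ids;

# An undefined linkset now produces the explicit message instead.
my $ok = eval { ids_by_db(undef); 1 };
print "error: $@" unless $ok;
```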


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From dmessina at wustl.edu  Thu Oct  5 14:07:56 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 5 Oct 2006 13:07:56 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
Message-ID: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>

I'm pleased to announce a revised version of the BioPerl Deobfuscator  
is now available. Many thanks to Mauricio Cuadra for updating  
bioperl.org's installation:

http://bioperl.org/cgi-bin/deob_interface.cgi

I've incorporated many of the suggestions you all sent in after the  
first release, and many of the modules that had non-standard  
documentation have been updated in the meantime, too, so hopefully  
you'll find it much improved. There are still some issues with a few  
modules; please report any problems you see. Also, it's now indexing  
bioperl-live instead of 1.4, which should make it a little more  
useful, too. A complete list of changes is below.

I welcome your bug reports and suggestions for improvements, via  
email, this list, Bugzilla, or the Wiki page.


Thanks,
Dave


Changes

0.0.3  Mon Oct  2 20:01:45 CDT 2006
        FIX: change default $deob_detail_path to be a relative URL instead of
             having localhost hardcoded. Thanks to Jason Stajich for pointing
             this out.
        FIX: Bio::Ontology modules are no longer missing their prefix in the
             class list, and their methods are now shown in the lower pane
             as expected. Thanks to Hilmar Lapp for reporting this bug.
        FIX: can now handle (and ignore) VERSION POD section.
        FIX: missing SYNOPSIS section now handled properly. In fact, the
             SYNOPSIS and DESCRIPTION sections can be in reverse order now,
             although for consistency this is not recommended.
        FIX: Bug #2114: "Obfuscator doesn't show "Bio:Matrix:Generic" has been
             fixed. This bug turned out to afflict multiple modules, which
             weren't getting parsed correctly by deob_index.pl.
        NEW: Table cells have been padded out to get rid of that "scrunched"
             look. Thanks to Sendu Bala for this great suggestion.
        NEW: If the 'Returns' subsection of a method's documentation contains
             a POD L<> link, the Deobfuscator assumes this to be a package
             name, and wraps it in an href for display. This feature is
             not robust, but seems to work well enough for now.
        NEW: the list of classes is now sorted alphabetically depth-first, so
             that subclasses appear just after their parent class. Thanks to
             Amir Karger for noticing the strange sorting behavior.
        NEW: HTML page title now 'BioPerl Deobfuscator' to distinguish it from
             other Deobfuscators out there. Thanks to Amir Karger for
             suggesting this.
        NEW: 'No match' search string now more prominent. Yep, kudos to Amir
             Karger again -- another great idea!
        NEW: Search box caption now explicitly states that only package names
             can be searched. Big ups to Amir Karger for this suggestion.
             The ability to search method names is planned for a future
             version.
        NEW: added -x option to deob_index.pl. This allows the use of an
             'excluded modules' file. This feature was added to resolve an
             issue with four modules which rely on external modules to
             compile. Class::Inspector, used by the Deobfuscator needs to
             load a module to traverse its inheritance tree, and modules must
             compile before they can be loaded.
     CHANGE: using short name now when traversing with File::Find to help
             identify excluded modules (deob_index.pl).


From lincoln.stein at gmail.com  Thu Oct  5 14:41:08 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:41:08 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
	<DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
Message-ID: <6dce9a0b0610051141x6b61407ar1c0a13cf7616b35f@mail.gmail.com>

The non-numeric comparison bug in Bio::DB::SeqFeature is fixed in the
latest CVS. Do I need to do anything special to get the CVS fixes into
the release candidate?

Lincoln

On 10/2/06, Chris Fields <cjfields at uiuc.edu> wrote:
> > [I won't create a wiki account just to report this.]
> >
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> > not set.  Lots of warnings about missing packages and all, but this
> > looks interesting:
> >
> >    Argument "+" isn't numeric in numeric lt (<) at Bio/DB/
> > SeqFeature/Segment.pm line 423.
>
> This is verified on Mac OS X.
>
> > Otherwise:
> >
> >    Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay.
> >
> > The failed test is:
> >
> >    t/ESEfinder..................dubious
> >       Test returned status 255 (wstat 65280, 0xff00)
> >    DIED. FAILED test 15
>
> What do you get when you run that set of tests using
> 'perl -I. -w t/ESEFinder.t'?  The bad status code is odd and could be a
> remote server issue.
>
> Chris
>
>
> >
> > florin
> >
> > --
> > If we wish to count lines of code, we should not regard them as lines
> > produced but as lines spent.                       -- Edsger Dijkstra
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From MEC at stowers-institute.org  Thu Oct  5 15:18:08 2006
From: MEC at stowers-institute.org (Cook, Malcolm)
Date: Thu, 5 Oct 2006 14:18:08 -0500
Subject: [Bioperl-l] using nfreeze instead of freeze in
	Bio::SeqFeature::Store
Message-ID: <CED81D34E37D5043A1211565277A51E5065E9897@exchkc02.stowers-institute.org>


Yes, there is overhead (cf. perldoc Storable):

    "When writing in network order, all fields are written
    out as standard lengths, which allows full interworking, but takes
    longer to read and write."

And, I suppose there is also a risk of losing precision in using network
order:

    You can also store data in network order to allow easy sharing across
    multiple platforms, or when storing on a socket known to be remotely
    connected. The routines to call have an initial "n" prefix for
    *network*, as in "nstore" and "nstore_fd". At retrieval time, your data
    will be correctly restored so you don't have to know whether you're
    restoring from native or network ordered data. Double values are stored
    stringified to ensure portability as well, at the slight risk of losing
    some precision in the last decimals.

So, I agree, it should be a configuration option, perhaps defaulting to
using network order.
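
To make the freeze/nfreeze distinction concrete, here is a minimal sketch
that uses only Storable's documented freeze, nfreeze and thaw routines. The
sample hash and its keys are made up for illustration and are not taken
from Bio::DB::SeqFeature::Store.

```perl
use strict;
use warnings;
use Storable qw(freeze nfreeze thaw);

# Illustrative data only; a real feature object would be a blessed reference.
my $feature = { seq_id => 'chr1', start => 100, end => 200, score => 0.25 };

my $native  = freeze($feature);    # native byte order, not portable
my $network = nfreeze($feature);   # network byte order, portable

# thaw() works on either image, so the reading side does not need to
# know which routine produced it.
for my $frozen ($native, $network) {
    my $copy = thaw($frozen);
    print "$copy->{seq_id}:$copy->{start}-$copy->{end}\n";
}
```

Because thaw() handles both forms, switching the writer from freeze to
nfreeze should not require any change to the code that reads features back.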

However, given the factoring of ../Bio/DB/SeqFeature/Store.pm I'm not
sure how to best make it a configuration option since the two provided
serializers don't share a common interface.  Possibly something like:

=head1 Methods for Connecting and Initializing a Database

=head2 new

 Title   : new
 Usage   : $db = Bio::DB::SeqFeature::Store->new(@options)
 Function: connect to a database
 Returns : A descendant of Bio::DB::SeqFeature::Store
 Args    : several - see below
 Status  : public

This class method creates a new database connection. The following
-name=E<gt>$value arguments are accepted:

 Name               Value
 ----               -----

 -adaptor           The name of the Adaptor class (default DBI::mysql)

 -serializer        The name of the serializer class (default Storable)

 -network_order     Strive to 'preserve network order' (if the
                    serializer implements it).  Currently, only
                    Storable.pm does, and this will cause it to use
                    nfreeze instead of freeze.  (default 1)

 -index_subfeatures Whether or not to make subfeatures searchable
                    (default true)

 -cache             Activate LRU caching feature -- size of cache

 -compress          Compresses features before storing them in database
                    using Compress::Zlib
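
One way the proposed -network_order switch could be wired up is sketched
below. The choose_freezer() helper is hypothetical and exists only for this
illustration; it is not part of Bio::DB::SeqFeature::Store. Only
Storable::freeze, Storable::nfreeze and Storable::thaw are real routines.

```perl
use strict;
use warnings;
use Storable ();

# Hypothetical helper: map a -network_order flag (default 1, as proposed
# above) onto the matching Storable routine.
sub choose_freezer {
    my %opt = @_;
    my $network_order = exists $opt{-network_order} ? $opt{-network_order} : 1;
    return $network_order ? \&Storable::nfreeze : \&Storable::freeze;
}

my $freezer = choose_freezer(-network_order => 0);   # ask for native order
my $frozen  = $freezer->({ start => 1, end => 10 });
my $thawed  = Storable::thaw($frozen);
print "start=$thawed->{start} end=$thawed->{end}\n";
```

Returning a code reference keeps the decision in one place, so the code
that stores features can call the chosen routine without caring which it is.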


Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri
  

> -----Original Message-----
> From: Lincoln Stein [mailto:lincoln.stein at gmail.com] 
> Sent: Thursday, October 05, 2006 1:43 PM
> To: Cook, Malcolm
> Cc: lstein at cshl.org; bioperl-l
> Subject: Re: using nfreeze instead of freeze in Bio::SeqFeature::Store
> 
> I think it's fine unless there is a significant performance hit, in
> which case the change should be made into a configuration option. Do
> you know if there is any overhead on doing this?
> 
> Lincoln
> 
> On 10/5/06, Cook, Malcolm <MEC at stowers-institute.org> wrote:
> > Lincoln,
> >
> > I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
> > freeze which should allow SeqFeature objects to survive database
> > freeze/thaw cycles across architectures.
> >
> > I hope I was not presumptuous or in error in doing this....
> >
> > Regards,
> >
> > Malcolm Cook
> > Database Applications Manager - Bioinformatics
> > Stowers Institute for Medical Research - Kansas City, Missouri
> >
> >
> 
> 
> -- 
> Lincoln D. Stein
> Cold Spring Harbor Laboratory
> 1 Bungtown Road
> Cold Spring Harbor, NY 11724
> (516) 367-8380 (voice)
> (516) 367-8389 (fax)
> FOR URGENT MESSAGES & SCHEDULING,
> PLEASE CONTACT MY ASSISTANT,
> SANDRA MICHELSEN, AT michelse at cshl.edu
> 


From lincoln.stein at gmail.com  Thu Oct  5 14:32:40 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:32:40 -0400
Subject: [Bioperl-l] Bio::DB::SeqFeature
In-Reply-To: <45253CE2.1070208@biologie.uni-freiburg.de>
References: <45253CE2.1070208@biologie.uni-freiburg.de>
Message-ID: <6dce9a0b0610051132p7d7fcf84g27578731f9727f3f@mail.gmail.com>

Hi Daniel,

The warnings you are seeing are occurring because
Bio::SeqFeature::Gene::GeneStructure contains a CODE reference. I
think it must be registering a cleanup method via its Bio::Root::Root
ancestor. When Storable serializes the object, it complains that it
can't serialize the CODE reference and instead converts it into the
string "CODE(0xXXXXX)". Then, after you thaw the object,
Bio::Root::Root is complaining that the CODE reference is invalid
because it is a string, not a reference.

Yuck. I think, however, that I can fix this by setting some magic
variables in Storable version 2.05 that will decompile and compile the
CODE references. I will try this and send you a note when the code is
in CVS.
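
The "magic variables" are presumably $Storable::Deparse and $Storable::Eval,
which Storable documents for serializing CODE references. The sketch below
shows them on a made-up hash with a cleanup callback. It is not the actual
change made to Bio::DB::SeqFeature::Store, and the data is illustrative.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Both flags are off by default. local() restores the old values when the
# enclosing scope ends.
local $Storable::Deparse = 1;   # turn CODE refs into source via B::Deparse
local $Storable::Eval    = 1;   # rebuild CODE refs with eval when thawing

# Illustrative object holding a callback, loosely like a registered
# cleanup method.
my $object = {
    name    => 'gene1',
    cleanup => sub { return "cleaned $_[0]" },
};

my $copy = thaw(nfreeze($object));
print $copy->{cleanup}->($copy->{name}), "\n";
```

Note that $Storable::Eval runs code found in the stored image, so it only
makes sense for data from a trusted source. Closures that capture outer
variables may also not survive the round trip unchanged.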

GBrowse does run off Bio::DB::SeqFeature::Store and is noticeably
faster than the original Bio::DB::GFF adaptor. Nothing really changes
except that you set the db_adaptor option to
Bio::DB::SeqFeature::Store. I haven't tried it using
Bio::SeqFeature::Gene::GeneStructure, so no guarantees, but I am
hopeful that it will work.

Lincoln


On 10/5/06, Daniel Lang <daniel.lang at biologie.uni-freiburg.de> wrote:
> Hi,
>
> we are storing Bio::SeqFeature::Gene::GeneStructure objects (with
> multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db
> (latest bioperl-live checkout).
>
> The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch
> out of a database.
>
> The first observation is that it seems to work (fetched objects behave
> like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we
> get these warnings:
>
> Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into
> lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
> Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into
> lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
>         (in cleanup) Not a CODE reference at
> /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.
> prepare_cached(SELECT f.id,f.object
>   FROM feature as f
>   WHERE (   f.seqid=?
>    AND   f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?))
> )
>
> ) statement handle DBI::st=HASH(0x1c317cf0) still Active at
> /home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm
> line 1422
>         (in cleanup) Not a CODE reference at
> /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.
>
> Is this something serious? Does this mean that the stored object doesn't
> have everything it had before freezing? Or are we using
> Bio::DB::SeqFeature inappropriately?
>
> The other question would be, if we can visualize these stored feature
> objects easily using gbrowse? I didn't find a hint mentioning
> Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages...
> Is it working already? Will it?
>
> Thanks in advance,
> Daniel
>
> --
>
> Daniel Lang
> University of Freiburg, Plant Biotechnology
> Schaenzlestr. 1, D-79104 Freiburg
> fax: +49 761 203 6945
> phone: +49 761 203 6974
> homepage:  http://www.plant-biotech.net/
> e-mail: daniel.lang at biologie.uni-freiburg.de
>
> #################################################
> My software never has bugs.
> It just develops random features.
> #################################################
>
>
>
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From hlapp at gmx.net  Thu Oct  5 16:34:49 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 16:34:49 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4525314C.7020205@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
Message-ID: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>


On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote:

> If you think I'm stupid, fine, but I'm probably not the only stupid
> person on the planet.

That's a great suggestion that I hope we can all agree on? I'll  
happily count myself among the stupid ones too so you're not alone,  
and stupid people and even more so those who are lucky enough not to  
be stupid have an obligation to document stuff so that even the  
stupid can understand, no matter how silly the documentation might get.

Is that agreeable without causing yet more progressive hair loss?

Actually - I'm having second thoughts. Isn't it a distinguishing  
feature of stupid people that - among other things - they are stupid  
enough to believe they don't need to read documentation? You admitted  
publicly that you read documentation - are you just faking the stupid?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Oct  5 17:11:06 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:11:06 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
Message-ID: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>


On Oct 5, 2006, at 3:34 PM, Hilmar Lapp wrote:

>
> On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote:
>
>> If you think I'm stupid, fine, but I'm probably not the only stupid
>> person on the planet.
>
> That's a great suggestion that I hope we can all agree on? I'll  
> happily count myself among the stupid ones too so you're not alone,  
> and stupid people and even more so those who are lucky enough not  
> to be stupid have an obligation to document stuff so that even the  
> stupid can understand, no matter how silly the documentation might  
> get.
>
> Is that agreeable without causing yet more progressive hair loss?
>
> Actually - I'm having second thoughts. Isn't it a distinguishing  
> feature of stupid people that - among other things - they are  
> stupid enough to believe they don't need to read documentation? You  
> admitted publicly that you read documentation - are you just faking  
> the stupid?
>
> 	-hilmar

If lack of good documentation == stupid, I know of a few other  
modules in trouble besides mine.  Based on that we're in for a whole  
lot of stupid!  And I feel stupid for my earlier remarks, Sendu, so  
apologies.

And Hilmar, you're too late on the hair loss, at least on my end.

I have corrected the EUtilities POD to reflect that all text input  
needs to be raw as URI encoding is done in the module, which should  
work (I think).  I plan on committing it tonight.  It also indicates  
that EUtilities search queries need to be made as if they are regular  
Entrez queries.  Would that be sufficient?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From pmiguel at purdue.edu  Thu Oct  5 16:42:00 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Thu, 05 Oct 2006 16:42:00 -0400
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
Message-ID: <45256E18.3080103@purdue.edu>

David Messina wrote:
> I'm pleased to announce a revised version of the BioPerl Deobfuscator  
> is now available. Many thanks to Mauricio Cuadra for updating  
> bioperl.org's installation:
>
> http://bioperl.org/cgi-bin/deob_interface.cgi
>
> I've incorporated many of the suggestions you all sent in after the  
> first release, and many of the modules that had non-standard  
> documentation have been updated in the meantime, too, so hopefully  
> you'll find it much improved. There are still some issues with a few  
> modules; please report any problems you see. Also, it's now indexing  
> bioperl-live instead of 1.4, which should make it a little more  
> useful, too. A complete list of changes is below.
>
> I welcome your bug reports and suggestions for improvements, via  
> email, this list, Bugzilla, or the Wiki page.
>
>
> Thanks,
> Dave
>
>   
Here are some comments:
Would be good to have the column headings for the methods table in the 
fixed part of the page, rather than the scroll box. That way you could 
always see the column headings from anywhere in the list.

Second, I've noticed that there are a fair number of methods that have 
"not documented" for "Returns" and "Usage". But in every case I've 
checked both of these were documented. For example, consider methods for 
Bio::Seq::SeqWithQuality. The method "accession_number" is listed as 
"not documented". But if you click on the Bio::Seq::SeqWithQuality link to 
the documentation, usage is defined as: "$unique_biological_key = 
$obj->accession_number;" and returns is defined as "A string".

Finally, it would be good to have the version of bioperl being 
deobfuscated on the deob_interface.cgi page. Just as a quick 
sanity-checking measure. After poking around a bit I found that 
bioperl-live is being indexed in the wiki. But, I can tell, it is just 
the sort of thing I'm going to forget and look for every time I come back 
to the page after a few months...

Overall very nice, though. Just what is needed when I'm trying to 
remember "which was the method that returns subseq string and which one 
returns an object?"


Phillip SanMiguel
Purdue University


From bix at sendu.me.uk  Thu Oct  5 17:24:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 22:24:34 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
	<2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
Message-ID: <45257812.5050008@sendu.me.uk>

Chris Fields wrote:
> 
> I have corrected the EUtilities POD to reflect that all text input needs 
> to be raw as URI encoding is done in the module, which should work (I 
> think).  I plan on committing it tonight.  It also indicates that 
> EUtilities search queries need to be made as if they are regular Entrez 
> queries.  Would that be sufficient?

You may not even need to mention anything about URI encoding, which 
might frighten some people. Something as simple as:

=head1 SYNOPSIS

use Bio::DB::EUtilities;

   my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                          -db         => 'pubmed',
                                          -term       => 'hutP AND xyz',
...

and/or some POD for the new() method:

=head2 new

  Title   : new
...
  Args    : -eutil => ...
            -db    => ...
            -term  => string, an entrez-style query

=cut

would get the point across, I think.

BTW, can the term string be supplied anywhere else other than new()? It 
doesn't matter at all if it can't, I'm just idly wondering if I missed 
anything.


From dmessina at wustl.edu  Thu Oct  5 17:42:49 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 5 Oct 2006 16:42:49 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45256E18.3080103@purdue.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
	<45256E18.3080103@purdue.edu>
Message-ID: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu>

Thanks so much, Phillip, for taking the time to check out the new  
version and send your comments. I really appreciate it! I've added  
them to the wiki page so I can track them.

Best,
Dave


From cjfields at uiuc.edu  Thu Oct  5 17:50:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:50:11 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45257812.5050008@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
	<2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
	<45257812.5050008@sendu.me.uk>
Message-ID: <A0B37F41-7C33-49F6-A039-A35AB5696947@uiuc.edu>

Sendu,

I have the parameters all set up as get/sets at this point, but I'm
open to suggestions on that.  Note the heredoc eval {} block in the
BEGIN block.  Yes, nasty I know, but I hate AUTOLOAD.  It works as a
quick way of getting parameter get/sets up-and-running.  I plan on
making those explicit get/sets as soon as I can, then sorting out
particular ones to the various eutil modules where they are primarily
used.

Long story short, every parameter is a get/set at this time
(including term()).  The common ones needed for most EUtilities are
initialized in the parent EUtilities::_initialize(), and
eutil-specific parameters are initialized in the individual eutil
plugins.  Each eutil plugin only sets whatever parameters may be needed
for operation (though you could circumvent that, since all of them are
inherited via EUtilities).

We could always simplify it to accept simple key-value pairs, but
get/sets (at least to me) allow more flexibility as long as you
remember which parameters are set and to what.
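
For readers unfamiliar with the technique, generating get/set methods from
a heredoc eval in a BEGIN block can look roughly like the sketch below. The
My::Params package and its parameter list are invented for this example
and do not reproduce the actual Bio::DB::EUtilities code.

```perl
package My::Params;   # hypothetical package, for illustration only
use strict;
use warnings;

BEGIN {
    # One combined getter/setter per parameter name, built at compile time.
    for my $name (qw(db term retmax)) {
        eval <<"END_ACCESSOR" or die $@;
sub $name {
    my \$self = shift;
    \$self->{'_${name}'} = shift if \@_;
    return \$self->{'_${name}'};
}
1;
END_ACCESSOR
    }
}

sub new { return bless {}, shift }

package main;

my $params = My::Params->new;
$params->term('hutP AND xyz');   # set
print $params->term, "\n";       # get
```

Unlike AUTOLOAD, the generated subs are ordinary methods once the BEGIN
block has run, so can() and inheritance behave as they would for methods
written out by hand.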

Chris

On Oct 5, 2006, at 4:24 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> I have corrected the EUtilities POD to reflect that all text input  
>> needs to be raw as URI encoding is done in the module, which  
>> should work (I think).  I plan on committing it tonight.  It also  
>> indicates that EUtilities search queries need to be made as if  
>> they are regular Entrez queries.  Would that be sufficient?
>
> You may not even need to mention anything about URI encoding, which  
> might frighten some people. Something as simple as:
>
> =head1 SYNOPSIS
>
> use Bio::DB::EUtilities;
>
>   my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                          -db         => 'pubmed',
>                                          -term       => 'hutP AND xyz',
> ...
>
> and/or some POD for the new() method:
>
> =head2 new
>
>  Title   : new
> ...
>  Args    : -eutil => ...
>            -db    => ...
>            -term  => string, an entrez-style query
>
> =cut
>
> would get the point across, I think.
>
> BTW, can the term string be supplied anywhere else other than new()?
> It doesn't matter at all if it can't, I'm just idly wondering
> if I missed anything.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 17:51:06 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:51:06 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45257812.5050008@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
	<2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
	<45257812.5050008@sendu.me.uk>
Message-ID: <5B2E844F-7B8B-4F69-9005-138826B835FB@uiuc.edu>

> You may not even need to mention anything about URI encoding, which
> might frighten some people. Something as simple as:
>
> =head1 SYNOPSIS
>
> use Bio::DB::EUtilities;
>
>    my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                           -db         => 'pubmed',
>                                           -term       => 'hutP AND xyz',
> ...
>
> and/or some POD for the new() method:
>
> =head2 new
>
>   Title   : new
> ...
>   Args    : -eutil => ...
>             -db    => ...
>             -term  => string, an entrez-style query
>
> =cut
>
> would get the point across, I think.

Oops, forgot.  I'll add this in and update new() when I can.  Thanks!

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From arareko at campus.iztacala.unam.mx  Thu Oct  5 18:12:49 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 05 Oct 2006 17:12:49 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45256E18.3080103@purdue.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
	<45256E18.3080103@purdue.edu>
Message-ID: <45258361.8080803@campus.iztacala.unam.mx>

Phillip San Miguel wrote:
> Finally, it would be good to have the version of bioperl being 
> deobfuscated on the deob_interface.cgi page. Just as a quick 
> sanity-checking measure. After poking around a bit I found that 
> bioperl-live is being indexed in the wiki. But, I can tell, it is just 
> the sort of thing I'm going to forget and look for every time I come back 
> to the page after a few months...

Dave,

I think this value can be stored in one of the index files and passed as 
an argument to the deob_index.pl script. What do you think?

Mauricio.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From lincoln.stein at gmail.com  Thu Oct  5 14:42:41 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:42:41 -0400
Subject: [Bioperl-l] using nfreeze instead of freeze in
	Bio::SeqFeature::Store
In-Reply-To: <CED81D34E37D5043A1211565277A51E5065E9879@exchkc02.stowers-institute.org>
References: <CED81D34E37D5043A1211565277A51E5065E9879@exchkc02.stowers-institute.org>
Message-ID: <6dce9a0b0610051142h56479843ofc5429d959cb6e3@mail.gmail.com>

I think it's fine unless there is a significant performance hit, in
which case the change should be made into a configuration option. Do
you know if there is any overhead on doing this?

Lincoln

On 10/5/06, Cook, Malcolm <MEC at stowers-institute.org> wrote:
> Lincoln,
>
> I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
> freeze which should allow SeqFeature objects to survive database
> freeze/thaw cycles across architectures.
>
> I hope I was not presumptuous or in error in doing this....
>
> Regards,
>
> Malcolm Cook
> Database Applications Manager - Bioinformatics
> Stowers Institute for Medical Research - Kansas City, Missouri
>
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From torsten.seemann at infotech.monash.edu.au  Fri Oct  6 01:26:10 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Fri, 06 Oct 2006 15:26:10 +1000
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
References: <452344D4.8070908@infotech.monash.edu.au>
	<22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
Message-ID: <4525E8F2.1000704@infotech.monash.edu.au>

Hilmar,

> I don't think there's a need to deprecate - if the methods just plain  
> delegate to whatever File:: module is appropriate their  
> implementation (supposedly) will become very simple and hence won't  
> pose a maintenance burden anymore.

>> I have an uncommitted simplified version of Bio::Root::IO which does
>> this, and "all tests pass". The functions currently (silently)  
>> dispatch
>> directly to their native counterparts.
>>
>> The only tricky function is tempfile() which is *mostly* like
>> File::Temp::tempfile(), but does some voodoo of converting
>> (TEMPLATE=>'xxx') to the non-hash first parameter of the File::  
>> version,
>> so I'm hesitant to commit. It may do other magic - Hilmar?
> 
> Not that I would know of. If the tests pass (without having to change  
> them!) I'd give it a try.

Tempfile.t had two tests that failed. It seems that Bio::Root::IO had 
some magic whereby it would keep a list of all tempfilenames created 
with UNLINK != 0 and when the Bio::Root::IO object was destroyed (eg. 
undef $obj) it would MANUALLY unlink each of them. This would occur 
before File::Temp got to unlink them. Not sure why it was written like 
this (as File::Temp will delete them at the end of the script anyway) 
but maybe it was legacy for when File::Temp::tempfile WASN'T available.
Anyway, I've kept backward compatibility there, although I think 
eventually it should be removed and Tempfile.t adjusted.
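
The TEMPLATE conversion mentioned above can be sketched as follows. The
my_tempfile() wrapper is hypothetical and is not the Bio::Root::IO
implementation; it only shows how a TEMPLATE => '...' pair might be turned
into the positional template that File::Temp::tempfile() expects.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp ();

# Hypothetical wrapper: accept TEMPLATE as a named argument and pass it to
# File::Temp::tempfile() as the leading positional argument.
sub my_tempfile {
    my %args     = @_;
    my $template = delete $args{TEMPLATE};
    # Default to the system temp directory so files do not land in the cwd.
    $args{DIR} = File::Spec->tmpdir unless exists $args{DIR};
    return defined $template
        ? File::Temp::tempfile($template, %args)
        : File::Temp::tempfile(%args);
}

# The template must end in at least four X characters.
my ($fh, $filename) = my_tempfile(TEMPLATE => 'bioperlXXXXXX', UNLINK => 1);
print {$fh} "scratch data\n";
close $fh;
print "wrote $filename\n";
```

With UNLINK => 1, File::Temp removes the file when the program exits, which
is the clean-up behaviour described above as happening "at the end of the
script anyway".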

Although all tests pass with my new trim Bio/Root/IO.pm I am still 
concerned about committing as the assumption is that the BioPerl test 
suite is good enough to handle such a change to an important module, but 
the reality may be different :-)

Let me know if you think I should commit anyway,

Your advice is appreciated.

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From dmessina at wustl.edu  Fri Oct  6 01:25:56 2006
From: dmessina at wustl.edu (David Messina)
Date: Fri, 6 Oct 2006 00:25:56 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45258361.8080803@campus.iztacala.unam.mx>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
	<45256E18.3080103@purdue.edu>
	<45258361.8080803@campus.iztacala.unam.mx>
Message-ID: <CC2947DA-2F65-4318-B85A-99A8DC7835B0@wustl.edu>


On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote:
> I think this value can be stored in one of the index files and  
> passed as an argument to the deob_index.pl script. What do you think?

Yep, I think that works nicely. I added this feature and committed it
to CVS. Here's what the new header looks like if you do
deob_index.pl -s "bioperl-live" (see the attached deob_header.jpg):

Thanks for the suggestions, guys.

Dave

-------------- next part --------------
A non-text attachment was scrubbed...
Name: deob_header.jpg
Type: image/jpeg
Size: 25739 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061006/1c5819f9/attachment-0003.jpg>

From deep_ans at yahoo.com  Fri Oct  6 09:22:49 2006
From: deep_ans at yahoo.com (deepak shingan)
Date: Fri, 6 Oct 2006 06:22:49 -0700 (PDT)
Subject: [Bioperl-l] Sort blast file result according to evalues
Message-ID: <20061006132249.49450.qmail@web51711.mail.yahoo.com>

Hi ,
  Is there any way to sort the hits in a BLAST report by evalue? I want
  the output sorted according to hit evalue. I am using SearchIO and have
  already tried sorting the hits according to bits and gaps, but I am not
  able to sort the hits by evalue, as evalues are mainly associated with
  HSPs and each hit may have multiple HSPs.
   
  waiting for help.
   
  Thanks,
  Dun Dansi
   
   


From hlapp at gmx.net  Fri Oct  6 10:03:04 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 6 Oct 2006 10:03:04 -0400
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <4525E8F2.1000704@infotech.monash.edu.au>
References: <452344D4.8070908@infotech.monash.edu.au>
	<22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
	<4525E8F2.1000704@infotech.monash.edu.au>
Message-ID: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net>

This is a 1.5, i.e. developers release that's in the works, and also  
you'd be doing this on the main trunk. If you get the tests to pass  
there's no reason to hold back.

You may be right and in reality it has repercussions somewhere, but  
those will be the opportunities to improve our test suite.

	-hilmar

On Oct 6, 2006, at 1:26 AM, Torsten Seemann wrote:

> Although all tests pass with my new trim Bio/Root/IO.pm I am still  
> concerned about committing as the assumption is that the BioPerl  
> test suite is good enough to handle such a change to an important  
> module, but the reality may be different :-)
>
> Let me know if you think I should commit anyway,
>
> Your advice is appreciated.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Oct  6 10:58:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 6 Oct 2006 09:58:09 -0500
Subject: [Bioperl-l] Sort blast file result according to evalues
In-Reply-To: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
References: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
Message-ID: <F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>

The evalue for the hit is retrieved by the BlastHit::significance()  
method, if I remember correctly.  So if $hit is a  
Bio::Search::Hit::BlastHit object, you use $hit->significance.  If  
you want individual HSP evalues, you would use $hsp->evalue for the  
individual HSP objects.

The output is normally sorted by the order they appear in the  
alignments and table, which is typically by increasing evalue or  
decreasing bits (score).  So they are already sorted.  If you wanted  
to run a sort yourself you could use a sort block using  
'{$a->significance() <=> $b->significance()} @hits', but as pointed out  
on the wiki it may be safer to run a Schwartzian transform instead:

http://www.bioperl.org/wiki/Bioperl_Best_Practices#Sorting
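
A minimal sketch of such a transform follows. The DemoHit package is a hypothetical stand-in so the snippet runs without BioPerl installed; with real data the hits would be Bio::Search::Hit::BlastHit objects, and only the significance() call is taken from the discussion above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Search::Hit::BlastHit (assumption for
# illustration only); real hits would come from a SearchIO result.
package DemoHit;
sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}
sub name         { $_[0]{name} }
sub significance { $_[0]{significance} }

package main;

my @hits = (
    DemoHit->new(name => 'hitA', significance => 1e-5),
    DemoHit->new(name => 'hitB', significance => 3e-40),
    DemoHit->new(name => 'hitC', significance => 0.002),
);

# Schwartzian transform: call significance() once per hit, sort on the
# cached value (ascending), then unwrap the hit objects again.
my @sorted =
    map  { $_->[1] }
    sort { $a->[0] <=> $b->[0] }
    map  { [ $_->significance, $_ ] } @hits;

print join(', ', map { $_->name } @sorted), "\n";
```

The point of the transform is that each accessor is called once per hit rather than once per comparison, which matters when the accessor is expensive.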

Chris

On Oct 6, 2006, at 8:22 AM, deepak shingan wrote:

> Hi ,
>   Is  there any way to parse the blast file according to evalue for  
> each hit. I want the output sorted according to hit evalue. I am  
> using SearchIO algorithm and already tried sorting the hits  
> according to bits, gaps, but I am not able to sort the hits by evalue.
>   As evalues are mainly associated with hsp and each hit may have  
> multiple hsps.
>
>   waiting for help.
>
>   Thanks,
>   Dun Dansi
>
>
>
>
>  		
> ---------------------------------
> How low will we go? Check out Yahoo! Messenger?s low  PC-to-Phone  
> call rates.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Oct  6 11:03:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 6 Oct 2006 10:03:45 -0500
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net>
References: <452344D4.8070908@infotech.monash.edu.au>
	<22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
	<4525E8F2.1000704@infotech.monash.edu.au>
	<074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net>
Message-ID: <265AD609-F74E-4545-B3DD-FF94290BE0B4@uiuc.edu>

On Oct 6, 2006, at 9:03 AM, Hilmar Lapp wrote:

> This is a 1.5, i.e. developers release that's in the works, and also
> you'd be doing this on the main trunk. If you get the tests to pass
> there's no reason to hold back.
>
> You may be right and in reality it has repercussions somewhere, but
> those will be the opportunities to improve our test suite.
>
> 	-hilmar

Agreed, though I think Sendu only wants bug fixes for 1.5.2.  You  
could always commit to CVS HEAD and it could be in 1.5.3.

Let me rethink that.  There were some subtle tempfile/tempdir issues  
that were popping up on WinXP where some tempfiles were not being  
deleted b/c of permissions issues; I had planned on adding that to  
Bugzilla today or tomorrow.  Maybe changing to File::Temp would fix  
that, so in essence it would be a bug fix!

I'll go ahead and post the bug.

Chris

>> Although all tests pass with my new trim Bio/Root/IO.pm I am still
>> concerned about committing as the assumption is that the BioPerl
>> test suite is good enough to handle such a change to an important
>> module, but the reality may be different :-)
>>
>> Let me know if you think I should commit anyway,
>>
>> Your advice is appreciated.
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From pmiguel at purdue.edu  Fri Oct  6 11:06:56 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Fri, 06 Oct 2006 11:06:56 -0400
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>	<45256E18.3080103@purdue.edu>
	<5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu>
Message-ID: <45267110.7030905@purdue.edu>

David Messina wrote:
> Thanks so much, Phillip, for taking the time to check out the new  
> version and send your comments. I really appreciate it! I've added  
> them to the wiki page so I can track them.
>
> Best,
> Dave
>   
Dave,
    No problem.
    I've just added a "keyword" to search BioPerl Deobfuscator to my 
Firefox browser. That way I can just type "deob qual" in my URL bar in 
firefox and the browser jumps directly to BioPerl Deobfuscator (like a 
bookmark) but it pre-submits the search item "qual".
    I heard about the Firefox "keywords" in a TWiT/FLOSS episode on 
mozilla. You just go to any search page and right-click in the search 
box of interest and one of the choices is "Add a Keyword for this 
Search". Then you just have to fill out "Name" and "Keyword" fields and 
drop the keyword into whatever folder you like. The "Keyword" then 
becomes the word to invoke that search with parameters that follow it 
when it is typed into the URL bar.
Phillip


From arareko at campus.iztacala.unam.mx  Fri Oct  6 11:18:02 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Fri, 06 Oct 2006 10:18:02 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <CC2947DA-2F65-4318-B85A-99A8DC7835B0@wustl.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>	<45256E18.3080103@purdue.edu>	<45258361.8080803@campus.iztacala.unam.mx>
	<CC2947DA-2F65-4318-B85A-99A8DC7835B0@wustl.edu>
Message-ID: <452673AA.7070305@campus.iztacala.unam.mx>

Looks great! I'll update it during the weekend.

Mauricio.

David Messina wrote:
> 
> On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote:
>> I think this value can be stored in one of the index files and passed 
>> as an argument to the deob_index.pl script. What do you think?
> 
> Yep, I think that works nicely. I added this feature and committed it to 
> CVS. Here's what the new header looks like if you do deob_index.pl -s 
> "bioperl-live":
> 
> 
> Thanks for the suggestions, guys.
> 
> Dave
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From bix at sendu.me.uk  Fri Oct  6 11:27:14 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 06 Oct 2006 16:27:14 +0100
Subject: [Bioperl-l] Sort blast file result according to evalues
In-Reply-To: <F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>
References: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
	<F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>
Message-ID: <452675D2.9090803@sendu.me.uk>

Chris Fields wrote:
> The evalue for the hit is retrieved by the BlastHit::signifiance()  
> method, if I remember correctly.  So if $hit is a  
> Bio::Search::Hit::BlastHit object, you use $hit->significance.  If  
> you want individual HSP evalues, you would use $hsp->evalue for the  
> individual HSP objects.
> 
> The output is normally sorted by the order they appear in the  
> alignments and table, which is typically by increasing evalue or  
> decreasing bits (score).  So they are already sorted.

Concur.


> If you wanted to run a sort yourself you could use a sort block using
> '{$a->significance() <=> $b->significance()} @hits'

Actually, it is best to use the sort_hits() method of the result object 
prior to asking for any hits. (As this allows for potential optimization 
in the parser.)

->significance is still the thing you need to sort on though.


From cjfields at uiuc.edu  Fri Oct  6 11:52:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 6 Oct 2006 10:52:57 -0500
Subject: [Bioperl-l] Sort blast file result according to evalues
In-Reply-To: <452675D2.9090803@sendu.me.uk>
References: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
	<F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>
	<452675D2.9090803@sendu.me.uk>
Message-ID: <31A6FC3A-8BEB-42B8-B51D-66E659EF7495@uiuc.edu>


On Oct 6, 2006, at 10:27 AM, Sendu Bala wrote:

>> If you wanted to run a sort yourself you could use a sort block using
>> '{$a->significance() <=> $b->significance()} @hits'
>
> Actually, it is best to use the sort_hits() method of the result  
> object
> prior to asking for any hits. (As this allows for potential  
> optimization
> in the parser.)

Ah, forgot about that one!

Chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Fri Oct  6 14:36:49 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 6 Oct 2006 11:36:49 -0700
Subject: [Bioperl-l] tempfile cleanup
In-Reply-To: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu>
References: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu>
Message-ID: <0FCEC6B2-E190-4800-AAB1-89559C552FA6@bioperl.org>

I think the magic trickery in there for cleanup is that File::Temp  
only cleans up tempfiles when Perl exits, not when the Root::IO object  
goes out of scope -- so this can be a problem for people running CGI  
scripts that stay resident in memory and never have their tempfiles  
cleaned up.  Managing the list ourselves allows us to call _cleanup  
periodically (perhaps before the start of every Blast run) to ensure  
that tempfiles are removed.  Perhaps newer File::Temp versions can  
solve this better now, but I believe that was the behavior we were  
trying to deal with by having the Root::IO object manage the list of  
to-be-deleted files.
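
The list-managing idea can be sketched roughly as below. This is only an illustration under stated assumptions: @pending_tempfiles, make_tempfile() and cleanup_tempfiles() are hypothetical names, not the actual Bio::Root::IO internals. Only File::Temp's tempfile() function is a real API here.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical bookkeeping: names of tempfiles we still have to delete.
my @pending_tempfiles;

sub make_tempfile {
    # UNLINK => 0: do not rely on File::Temp removing the file at exit.
    my ($fh, $filename) = tempfile('demoXXXXXX', TMPDIR => 1, UNLINK => 0);
    push @pending_tempfiles, $filename;
    return ($fh, $filename);
}

sub cleanup_tempfiles {
    # Can be called periodically in a long-running process.
    my $removed = 0;
    while (defined(my $file = shift @pending_tempfiles)) {
        $removed += unlink($file) if -e $file;
    }
    return $removed;
}

my ($fh, $name) = make_tempfile();
print {$fh} "scratch data\n";
close $fh;    # close before unlinking; some platforms require it

my $removed = cleanup_tempfiles();
print "removed $removed file(s)\n";
```

Calling the cleanup routine explicitly means a resident process does not have to wait for interpreter exit before its tempfiles disappear.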

This is some hackery that also had to do with not expecting  
File::Temp to be installed I believe.

-jason


From torsten.seemann at infotech.monash.edu.au  Mon Oct  9 00:52:29 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Mon, 09 Oct 2006 14:52:29 +1000
Subject: [Bioperl-l] Multiple packages in the one .pm file
Message-ID: <4529D58D.1080004@infotech.monash.edu.au>

Hi all,

The following modules have more than one "package xxxx;" declaration in 
them. For small, internal classes I guess this is fine, but for others,
they should be split up into the filesystem - otherwise they are 
troublesome to locate and the online documentation doesn't list them!

eg.
bioperl-run/Bio/Tools/Run/Analysis/Job.pm
is in
bioperl-run/Bio/Tools/Run/Analysis.pm

Here's the culprits:

% for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | 
sed 's/:.*$//' | sort | uniq -d ; done

bioperl-live/Bio/AnalysisI.pm
bioperl-live/Bio/DB/Fasta.pm
bioperl-live/Bio/DB/GFF.pm
bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
bioperl-live/Bio/SeqIO/interpro.pm

bioperl-run/Bio/Tools/Run/Analysis.pm
bioperl-run/Bio/Tools/Run/Analysis/soap.pm

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From pmiguel at purdue.edu  Mon Oct  9 15:57:12 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Mon, 09 Oct 2006 15:57:12 -0400
Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC?
Message-ID: <452AA998.5010104@purdue.edu>

I found a bug in Bio::SeqIO::phd and am wondering if the fix will 
propagate into the next release candidate?

The bug is here:

http://bugzilla.open-bio.org/show_bug.cgi?id=2120

I also created a patch that fixes it (on my machine, anyway).  It is a 
fairly minor change, so it seems like it would be worth propagating it 
into the next release candidate.

-- 
Phillip SanMiguel


From bix at sendu.me.uk  Mon Oct  9 16:57:28 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 09 Oct 2006 21:57:28 +0100
Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC?
In-Reply-To: <452AA998.5010104@purdue.edu>
References: <452AA998.5010104@purdue.edu>
Message-ID: <452AB7B8.4040404@sendu.me.uk>

Phillip San Miguel wrote:
> I found a bug in Bio::SeqIO::phd and am wondering if the fix will 
> propagate into the next release candidate?
> 
> The bug is here:
> 
> http://bugzilla.open-bio.org/show_bug.cgi?id=2120
> 
> I also created a patch that fixes it (on my machine, anyway).  It is a 
> fairly minor change, so it seems like it would be worth propagating it 
> into the next release candidate.

If it gets committed to HEAD before I make the next candidate, then yes.
I'll do that if no one beats me to it (and if someone does, please add a 
new test for this).

BTW Phillip, thank you for the bug report but in future use the 
attachment capabilities for files, please don't paste them into the 
comments box.


From bix at sendu.me.uk  Mon Oct  9 17:01:56 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 09 Oct 2006 22:01:56 +0100
Subject: [Bioperl-l] Analysis soap problem
Message-ID: <452AB8C4.1010704@sendu.me.uk>

I thought I'd 'advertise' this bug on the list so more people see it:
http://bugzilla.open-bio.org/show_bug.cgi?id=2117

I don't want to make the next 1.5.2 release candidate until it's fixed. 
Does anyone have any idea about it? Even if you can't fix it, just 
explaining what's (supposed) to be going on would help a lot.

Thank you,
Sendu.


From Kevin.M.Brown at asu.edu  Mon Oct  9 18:40:54 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 9 Oct 2006 15:40:54 -0700
Subject: [Bioperl-l] Analysis soap problem
Message-ID: <1A4207F8295607498283FE9E93B775B40219690B@EX02.asurite.ad.asu.edu>

If I had to guess from looking at the snippet provided, the variable
$seq holds no data, so when you try to set up the regex /^$seq$/ you end
up with /^$/ (blank line) and the warning.
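
A small sketch of that effect, assuming $seq is an empty string (the actual value in the failing test is not shown here):

```perl
use strict;
use warnings;

# Assumption for illustration: $seq ended up empty.
my $seq     = '';
my $pattern = qr/^$seq$/;    # interpolates to the same pattern as /^$/

# The pattern now matches only an empty (blank) string.
my $matches_empty = ('' =~ $pattern)     ? 1 : 0;
my $matches_seq   = ('ACGT' =~ $pattern) ? 1 : 0;

print "empty string matches: $matches_empty\n";
print "'ACGT' matches: $matches_seq\n";
```

If $seq were undef rather than empty, Perl with warnings enabled would additionally report an uninitialized value when compiling the pattern.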

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Monday, October 09, 2006 2:02 PM
> To: bioperl-l List
> Subject: [Bioperl-l] Analysis soap problem
> 
> I thought I'd 'advertise' this bug on the list so more people see it:
> http://bugzilla.open-bio.org/show_bug.cgi?id=2117
> 
> I don't want to make the next 1.5.2 release candidate until 
> its fixed. 
> Does anyone have any idea about it? Even if you can't fix it, just 
> explaining what's (supposed) to be going on would help a lot.
> 
> Thank you,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From cjfields at uiuc.edu  Mon Oct  9 22:34:23 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 9 Oct 2006 21:34:23 -0500
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <452AB8C4.1010704@sendu.me.uk>
References: <452AB8C4.1010704@sendu.me.uk>
Message-ID: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>

I have 'fixed' this in CVS.  Note the quotes; it depends on what you  
might consider fixed.  Multiple calls to results() were returning  
empty hash refs, so no data was being returned.   For now, I stored  
the hash reference in a variable then tested each one.  All tests now  
pass, including the 'outseq' one.
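
Roughly, the workaround looks like the sketch below. DemoJob is a toy stand-in whose results() hands back its data only once. That behavior is an assumption made to illustrate the symptom, not the actual Bio::Tools::Run::Analysis::soap code.

```perl
use strict;
use warnings;

# Toy job object (hypothetical): the first results() call returns the
# data, later calls return an empty hash ref.
package DemoJob;
sub new {
    my $class = shift;
    return bless { data => { outseq => 'ACGT', report => 'ok' } }, $class;
}
sub results {
    my $self = shift;
    my $data = $self->{data};
    $self->{data} = {};
    return $data;
}

package main;

my $job = DemoJob->new;

# Workaround: call results() once, keep the hash ref, test each entry.
my $results = $job->results;
for my $key (sort keys %$results) {
    print "$key => $results->{$key}\n";
}

# A second call returns nothing useful in this toy model.
my $again = $job->results;
print "second call returned ", scalar(keys %$again), " keys\n";
```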

Maybe it's just me, but shouldn't results() either consistently  
return the same information, or contain documentation that it doesn't  
do so?  Anyway, I have left the bugzilla report open for now.

Chris

On Oct 9, 2006, at 4:01 PM, Sendu Bala wrote:

> I thought I'd 'advertise' this bug on the list so more people see it:
> http://bugzilla.open-bio.org/show_bug.cgi?id=2117
>
> I don't want to make the next 1.5.2 release candidate until its fixed.
> Does anyone have any idea about it? Even if you can't fix it, just
> explaining what's (supposed) to be going on would help a lot.
>
> Thank you,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bosborne11 at verizon.net  Mon Oct  9 22:09:45 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 09 Oct 2006 22:09:45 -0400
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au>
Message-ID: <C1507929.AB8F%bosborne11@verizon.net>

Torsten,

Fixed interpro.pm, it could have been written more simply (or more like
other SeqIO modules). Can't really address the others.

Brian O.


On 10/9/06 12:52 AM, "Torsten Seemann"
<torsten.seemann at infotech.monash.edu.au> wrote:

> Hi all,
> 
> The following modules have more than one "package xxxx;" declaration in
> them. For small, internal classes I guess this is fine, but for others,
> they should be split up into the filesystem - otherwise they are
> troublesome to locate and the online documentation doesn't list them!
> 
> eg.
> bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> is in
> bioperl-run/Bio/Tools/Run/Analysis.pm
> 
> Here's the culprits:
> 
> % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio |
> sed 's/:.*$//' | sort | uniq -d ; done
> 
> bioperl-live/Bio/AnalysisI.pm
> bioperl-live/Bio/DB/Fasta.pm
> bioperl-live/Bio/DB/GFF.pm
> bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> bioperl-live/Bio/SeqIO/interpro.pm
> 
> bioperl-run/Bio/Tools/Run/Analysis.pm
> bioperl-run/Bio/Tools/Run/Analysis/soap.pm


From bix at sendu.me.uk  Tue Oct 10 03:03:20 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 08:03:20 +0100
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
References: <452AB8C4.1010704@sendu.me.uk>
	<86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
Message-ID: <452B45B8.8010401@sendu.me.uk>

Chris Fields wrote:
> I have 'fixed' this in CVS.  Note the quotes; it depends on what you  
> might consider fixed.  Multiple calls to results() were returning  
> empty hash refs, so no data was being returned.   For now, I stored  
> the hash reference in a variable then tested each one.  All tests now  
> pass, including the 'outseq' one.
> 
> Maybe it's just me, but shouldn't results() either consistently  
> return the same information, or contain documentation that it doesn't  
> do so?  Anyway, I have left the bugzilla report open for now.

Judging by the tests there seems a clear expectation that multiple calls 
to results() should work, and certainly that makes sense and seems 
natural. So I'd say that results() should be fixed and the test script 
reverted.


From cjfields at uiuc.edu  Tue Oct 10 07:42:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 06:42:33 -0500
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <452B45B8.8010401@sendu.me.uk>
References: <452AB8C4.1010704@sendu.me.uk>
	<86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
	<452B45B8.8010401@sendu.me.uk>
Message-ID: <A161CB8B-3997-4E09-A6E3-4A44B70BE4A0@uiuc.edu>

I agree, though I think Martin Senger should be contacted, at least  
to get his thoughts.  Has anyone tried yet?

Chris

On Oct 10, 2006, at 2:03 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> I have 'fixed' this in CVS.  Note the quotes; it depends on what you
>> might consider fixed.  Multiple calls to results() were returning
>> empty hash refs, so no data was being returned.   For now, I stored
>> the hash reference in a variable then tested each one.  All tests now
>> pass, including the 'outseq' one.
>>
>> Maybe it's just me, but shouldn't results() either consistently
>> return the same information, or contain documentation that it doesn't
>> do so?  Anyway, I have left the bugzilla report open for now.
>
> Judging by the tests there seems a clear expectation that multiple  
> calls
> to results() should work, and certainly that makes sense and seems
> natural. So I'd say that results() should be fixed and the test script
> reverted.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Oct 10 08:14:31 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 13:14:31 +0100
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <A161CB8B-3997-4E09-A6E3-4A44B70BE4A0@uiuc.edu>
References: <452AB8C4.1010704@sendu.me.uk>
	<86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
	<452B45B8.8010401@sendu.me.uk>
	<A161CB8B-3997-4E09-A6E3-4A44B70BE4A0@uiuc.edu>
Message-ID: <452B8EA7.1080800@sendu.me.uk>

Chris Fields wrote:
> I agree, though I think Martin Senger should be contacted, at least to 
> get his thoughts.  Has anyone tried yet?

He's CCd on the bug report, but I haven't tried directly, no. Do you 
want to tackle this (contacting him and/or fixing the bug)?

Cheers,
Sendu.


From cjfields at uiuc.edu  Tue Oct 10 09:20:03 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 08:20:03 -0500
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <452B8EA7.1080800@sendu.me.uk>
Message-ID: <001801c6ec6e$cc016900$15327e82@pyrimidine>

I'll try giving it a closer look, just didn't have much time yesterday.
I'll also try contacting Martin.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Tuesday, October 10, 2006 7:15 AM
> To: bioperl-l
> Subject: Re: [Bioperl-l] Analysis soap problem
> 
> Chris Fields wrote:
> > I agree, though I think Martin Senger should be contacted, at least to
> > get his thoughts.  Has anyone tried yet?
> 
> He's CCd on the bug report, but I haven't tried directly, no. Do you
> want to tackle this (contacting him and/or fixing the bug)?
> 
> Cheers,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From pmiguel at purdue.edu  Tue Oct 10 10:26:35 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Tue, 10 Oct 2006 10:26:35 -0400
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452AB7B8.4040404@sendu.me.uk>
References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk>
Message-ID: <452BAD9B.5010903@purdue.edu>

Sendu Bala wrote:
>
> BTW Phillip, thank you for the bug report but in future use the 
> attachment capabilities for files, please don't paste them into the 
> comments box.
>   
Sendu,
    Sounds reasonable to me. I should note, however, that when I entered the 
bug, I was looking for some method to attach files. There is none on the 
"Enter Bug: Bioperl" page:

http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl

Also, "bug writing guidelines" makes no mention of it. I vaguely 
remembered there being some method to do it--but given the "bug writing 
guidelines" exhortations to be specific and detailed, I thought I must 
put the information somewhere. So I put them in the only place offered 
(on that page)--"Description:"
    I see that, once submitted, attachments can be added to a bug 
report. Is that normally how it is done? Doesn't each attachment result 
in a separate email to the bioperl guts email list?
    Anyway,  I've just added the files to the bug report as attachments, 
in case someone needs them to construct a test.
   
-- 
Phillip


From bix at sendu.me.uk  Tue Oct 10 11:10:25 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 16:10:25 +0100
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452BAD9B.5010903@purdue.edu>
References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk>
	<452BAD9B.5010903@purdue.edu>
Message-ID: <452BB7E1.5020200@sendu.me.uk>

Phillip San Miguel wrote:
> Sendu Bala wrote:
>> BTW Phillip, thank you for the bug report but in future use the 
>> attachment capabilities for files, please don't paste them into the
>>  comments box.
>> 
> Sendu, Sounds reasonable to me. I should note, however; when I
> entered the bug, I was looking for some method to attach files. There
> is none on the "Enter Bug: Bioperl" page:
> 
> http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl
> 
> Also, "bug writing guidelines" makes no mention of it. I vaguely 
> remembered there being some method to do it--but given the "bug
> writing guidelines" exhortations to be specific and detailed, I
> thought I must put the information somewhere. So I put them them the
> only place offered (on that page)--"Description:"

I agree that things could be better here. Who looks after bugzilla, and
is this an alterable feature?


> I see that, once submitted, attachments can be added to a bug report.
> Is that normally how it is done?

Yes, AFAIK.


> Doesn't each attachment result in a separate email to the bioperl
> guts email list?

Yes, but that's not a problem. In fact, doing it this way means you
don't email everyone subscribed to guts your big files in plain text,
but instead they get a small email with a link to the download.


> Anyway,  I've just added the files to the bug report as attachments,
>  in case someone needs them to construct a test.

Thank you.


From arareko at campus.iztacala.unam.mx  Tue Oct 10 11:14:00 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Tue, 10 Oct 2006 10:14:00 -0500
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452BAD9B.5010903@purdue.edu>
References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk>
	<452BAD9B.5010903@purdue.edu>
Message-ID: <452BB8B8.40409@campus.iztacala.unam.mx>

Phillip San Miguel wrote:
> I see that, once submitted, attachments can be added to a bug report.
>  Is that normally how it is done?

Yes, it's the normal method: create the bug report, then attach files.

> Doesn't each attachment result in a separate email to the bioperl 
> guts email list?

Adding a file will generate an informative email per bug change 
(attaching the file in this case) but won't send the attachment to the list.

Regards,
Mauricio.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From cjfields at uiuc.edu  Tue Oct 10 11:20:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 10:20:55 -0500
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452BAD9B.5010903@purdue.edu>
Message-ID: <002801c6ec7f$ae8d85f0$15327e82@pyrimidine>

> Also, "bug writing guidelines" makes no mention of it. I vaguely
> remembered there being some method to do it--but given the "bug writing
> guidelines" exhortations to be specific and detailed, I thought I must
> put the information somewhere. So I put them them the only place offered
> (on that page)--"Description:"
>     I see that, once submitted, attachments can be added to a bug
> report. Is that normally how it is done? Doesn't each attachment result
> in a separate email to the bioperl guts email list?
>     Anyway,  I've just added the files to the bug report as attachments,
> in case someone needs them to construct a test.

Phillip,

Initial bug reports only require the general description, OS used, bioperl
version, etc.  That's quite normal.  Any relevant attachments are added
afterward.  We should probably make that clearer upfront on the wiki page; I
don't know if anyone can make similar changes to bugzilla.

Any bug changes, CVS commits, etc are mailed to bioperl-guts, yes.  That
isn't an issue though; it keeps the developers updated on the various
bugs/commits that are going on and is a pretty common practice.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 10 12:48:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 11:48:22 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au>
References: <4529D58D.1080004@infotech.monash.edu.au>
Message-ID: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>

There are a number of other bioperl-run examples (the  
Bio::Tools::Run::Analysis::soap issue I looked into revealed such).

I agree with both points, 1) that it depends on the size of the  
classes, and 2) from a maintainability standpoint, it can be very  
frustrating when looking for documentation.  Is there really any  
advantage to doing this?

Chris

On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote:

> Hi all,
>
> The following modules have more than one "package xxxx;"  
> declaration in
> them. For small, internal classes I guess this is fine, but for  
> others,
> they should be split up into the filesystem - otherwise they are
> troublesome to locate and the online documentation doesn't list them!
>
> eg.
> bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> is in
> bioperl-run/Bio/Tools/Run/Analysis.pm
>
> Here's the culprits:
>
> % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio |
> sed 's/:.*$//' | sort | uniq -d ; done
>
> bioperl-live/Bio/AnalysisI.pm
> bioperl-live/Bio/DB/Fasta.pm
> bioperl-live/Bio/DB/GFF.pm
> bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> bioperl-live/Bio/SeqIO/interpro.pm
>
> bioperl-run/Bio/Tools/Run/Analysis.pm
> bioperl-run/Bio/Tools/Run/Analysis/soap.pm
>
> -- 
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From lzhtom at hotmail.com  Tue Oct 10 15:42:48 2006
From: lzhtom at hotmail.com (zhihua li)
Date: Tue, 10 Oct 2006 19:42:48 +0000
Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise?
Message-ID: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>

Hi netters.

I've installed Bioperl 1.5.1, both core and run modules.  But when I tried 
to use the Pise module, an error occurred saying that there's no "new" 
method in this package.

My script is:

use strict;
use warnings;
use Bio::Tools::Run::AnalysisFactory::Pise;
my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
my $program=$factory->program('mfold');
$program->seq('my_input_file');
my $job = $program->run();
print STDERR $job->contect('mfold.out');

The error message I got is:

Can't locate object method "new" via package 
"Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load 
"Bio::Tools::Run::AnalysisFactor::Pise"?)

I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm and 
it DOES contain a sub new.

So what's going on? Could anyone give me a hint?

Thanks a lot!


From cjfields at uiuc.edu  Tue Oct 10 16:27:27 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 15:27:27 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
	<6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
Message-ID: <E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>

Makes sense to me.  I think, as long as they're documented, it  
shouldn't be a problem.

I think the main point is that the class methods for these don't show  
up using perldoc (something I ran into with Bio::DB::Fasta's  
inclusion of Bio::PrimarySeq::Fasta), but they do show up when using  
other documentation.  So 'perldoc Bio::DB::Fasta' works, but 'perldoc  
Bio::PrimarySeq::Fasta' doesn't.  So these can be problematic when  
looking for specific methods.
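
A rough sketch of why the second lookup fails: tools that find documentation by module name generally turn the name into a relative file path and look for that file under @INC, and no file exists for a package that is only declared inside another module. The helper module_to_path is a name invented here, and the real perldoc search is more elaborate (it also looks for .pod files and scripts), so this is a simplification.

```perl
use strict;
use warnings;

# Convert a module name to the relative file path Perl looks for in @INC.
sub module_to_path {
    my ($module) = @_;
    ( my $path = $module ) =~ s{::}{/}g;
    return "$path.pm";
}

for my $module ( 'Bio::DB::Fasta', 'Bio::PrimarySeq::Fasta' ) {
    my $rel = module_to_path($module);

    # Search each plain directory in @INC for a file at that relative path.
    my ($hit) = grep { -f "$_/$rel" } grep { !ref } @INC;
    printf "%-24s %s\n", $module,
        defined $hit ? "$hit/$rel" : "no file $rel in \@INC";
}
```

With BioPerl installed, the first name should resolve to a file while the second should not, which matches the perldoc behaviour described above.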

However, I think pod2html handles multiple package declarations in  
one module, and the PDOC online docs do as well.  Does the Deobfuscator?

Chris

On Oct 10, 2006, at 3:11 PM, Lincoln Stein wrote:

> Hi,
>
> These ones are all mine:
>
> > bioperl-live/Bio/DB/Fasta.pm
> > bioperl-live/Bio/DB/GFF.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
>
> In each case, the second modules are teeny tiny ones that implement  
> iterators which are at most two methods long (typically a new() and  
> a next()). I prefer not to split them out because they will just  
> clutter up the file tree with stuff that is already well documented  
> in the "parent ship" modules.
>
> Lincoln
>
>
> On 10/10/06, Chris Fields <cjfields at uiuc.edu> wrote: There are a  
> number of other bioperl-run examples (the
> Bio::Tools::Run::Analysis::soap issue I looked into revealed such).
>
> I agree with both points, 1) that it depends on the size of the
> classes, and 2) from a maintainability standpoint, it can be very
> frustrating when looking for documentation.  Is there really any
> advantage to doing this?
>
> Chris
>
> On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote:
>
> > Hi all,
> >
> > The following modules have more than one "package xxxx;"
> > declaration in
> > them. For small, internal classes I guess this is fine, but for
> > others,
> > they should be split up into the filesystem - otherwise they are
> > troublesome to locate and the online documentation doesn't list  
> them!
> >
> > eg.
> > bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> > is in
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> >
> > Here's the culprits:
> >
> > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/ 
> Bio |
> > sed 's/:.*$//' | sort | uniq -d ; done
> >
> > bioperl-live/Bio/AnalysisI.pm
> > bioperl-live/Bio/DB/Fasta.pm
> > bioperl-live/Bio/DB/GFF.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> > bioperl-live/Bio/SeqIO/interpro.pm
> >
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> > bioperl-run/Bio/Tools/Run/Analysis/soap.pm
> >
> > --
> > Dr Torsten Seemann               http://www.vicbioinformatics.com
> > Victorian Bioinformatics Consortium, Monash University, Australia
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> -- 
> Lincoln D. Stein
> Cold Spring Harbor Laboratory
> 1 Bungtown Road
> Cold Spring Harbor, NY 11724
> (516) 367-8380 (voice)
> (516) 367-8389 (fax)
> FOR URGENT MESSAGES & SCHEDULING,
> PLEASE CONTACT MY ASSISTANT,
> SANDRA MICHELSEN, AT michelse at cshl.edu

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 10 16:30:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 15:30:16 -0500
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
In-Reply-To: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <870B7500-AA83-42D7-965B-865B91AA8E7F@uiuc.edu>


On Oct 10, 2006, at 2:42 PM, zhihua li wrote:

> Hi netters.
>
> I've installed Bioperl 1.5.1, both core and run modules.  But when  
> I tried to use the Pise module, an error occurred saying that  
> there's no "new" method in this package.
>
> My script is:
>
> use strict;
> use warnings;
> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
> my $program=$factory->program('mfold');
> $program->seq('my_input_file');
> my $job = $program->run();
> print STDERR $job->contect('mfold.out');
>
> The error message I got is:
>
> Can't locate object method "new" via package  
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load  
> "Bio::Tools::Run::AnalysisFactor::Pise"?)
>
> I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/ 
> Pise.pm and it DOES contain a sub new.
>
> So what's going on? Anyone could give me a hint?
>
> Thanks a lot!

Well, according to your error output you have AnalysisFactory  
misspelled ('AnalysisFactor'), which should tell you what the problem  
is.  Look for the same thing in your script.
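
The failure can be reproduced without BioPerl at all. In this small sketch the class My::AnalysisFactory is a stand-in invented for illustration; the point is only that Perl reports the package name exactly as it was typed in the call, so the misspelling is visible in the error text.

```perl
use strict;
use warnings;

# A stand-in class; the name is hypothetical and only mirrors the thread.
{
    package My::AnalysisFactory;
    sub new { my ($class) = @_; return bless {}, $class }
}

# Calling new() on the misspelled name fails at run time, because no
# package called My::AnalysisFactor has ever been defined or loaded.
my $factory = eval { My::AnalysisFactor->new() };
my $error   = $@;
print "Error: $error" if $error;

# The correctly spelled name works.
my $ok = My::AnalysisFactory->new();
print "Created a ", ref($ok), " object\n";
```

Note that "use strict" does not catch this at compile time, since a bareword before "->" is simply treated as a class name.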

Chris


>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Oct 10 16:43:06 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 21:43:06 +0100
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
In-Reply-To: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <452C05DA.5050803@sendu.me.uk>

zhihua li wrote:
> Hi netters.
> 
> I've installed Bioperl 1.5.1, both core and run modules.  But when I 
> tried to use the Pise module, an error occurred saying that there's no 
> "new" method in this package.
> 
> My script is:
> 
> use strict;
> use warnings;
> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
> my $program=$factory->program('mfold');
> $program->seq('my_input_file');
> my $job = $program->run();
> print STDERR $job->contect('mfold.out');
> 
> The error message I got is:
> 
> Can't locate object method "new" via package 
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load 
> "Bio::Tools::Run::AnalysisFactor::Pise"?)
> 
> I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm 
> and it DOES contain a sub new.
> 
> So what's going on? Anyone could give me a hint?

You have a typo.

Bio::Tools::Run::AnalysisFactory::Pise, not
Bio::Tools::Run::AnalysisFactor::Pise


From lincoln.stein at gmail.com  Tue Oct 10 16:11:00 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 10 Oct 2006 16:11:00 -0400
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
Message-ID: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>

Hi,

These ones are all mine:

> bioperl-live/Bio/DB/Fasta.pm
> bioperl-live/Bio/DB/GFF.pm
> bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> bioperl-live/Bio/DB/SeqFeature/Store/memory.pm

In each case, the second modules are teeny tiny ones that implement
iterators which are at most two methods long (typically a new() and a
next()). I prefer not to split them out because they will just clutter up
the file tree with stuff that is already well documented in the "parent
ship" modules.
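
To make the pattern concrete, here is a minimal sketch of a main class with a tiny iterator class kept in the same file. The names My::Store and My::Store::Iterator are hypothetical and are not taken from any of the modules listed above; the real iterators may well differ in detail.

```perl
use strict;
use warnings;

# Main class: a trivial in-memory store (hypothetical, for illustration).
{
    package My::Store;

    sub new {
        my ( $class, @items ) = @_;
        return bless { items => [@items] }, $class;
    }

    # Hand back an iterator over a copy of the stored items.
    sub get_iterator {
        my ($self) = @_;
        return My::Store::Iterator->new( @{ $self->{items} } );
    }
}

# Helper class living in the same file: just new() and next().
{
    package My::Store::Iterator;

    sub new {
        my ( $class, @items ) = @_;
        return bless { queue => [@items] }, $class;
    }

    # Return the next item, or undef once the queue is exhausted.
    sub next {
        my ($self) = @_;
        return shift @{ $self->{queue} };
    }
}

my $store = My::Store->new(qw(chr1 chr2 chr3));
my $iter  = $store->get_iterator;
my @seen;
while ( defined( my $item = $iter->next ) ) {
    push @seen, $item;
}
print join( ',', @seen ), "\n";
```

The trade-off discussed in this thread still applies: the helper has no file of its own, so it can only be found through the parent module and its documentation.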

Lincoln


On 10/10/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> There are a number of other bioperl-run examples (the
> Bio::Tools::Run::Analysis::soap issue I looked into revealed such).
>
> I agree with both points, 1) that it depends on the size of the
> classes, and 2) from a maintainability standpoint, it can be very
> frustrating when looking for documentation.  Is there really any
> advantage to doing this?
>
> Chris
>
> On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote:
>
> > Hi all,
> >
> > The following modules have more than one "package xxxx;"
> > declaration in
> > them. For small, internal classes I guess this is fine, but for
> > others,
> > they should be split up into the filesystem - otherwise they are
> > troublesome to locate and the online documentation doesn't list them!
> >
> > eg.
> > bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> > is in
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> >
> > Here's the culprits:
> >
> > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio |
> > sed 's/:.*$//' | sort | uniq -d ; done
> >
> > bioperl-live/Bio/AnalysisI.pm
> > bioperl-live/Bio/DB/Fasta.pm
> > bioperl-live/Bio/DB/GFF.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> > bioperl-live/Bio/SeqIO/interpro.pm
> >
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> > bioperl-run/Bio/Tools/Run/Analysis/soap.pm
> >
> > --
> > Dr Torsten Seemann               http://www.vicbioinformatics.com
> > Victorian Bioinformatics Consortium, Monash University, Australia
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From asjo at koldfront.dk  Tue Oct 10 16:04:35 2006
From: asjo at koldfront.dk (Adam Sjøgren)
Date: Tue, 10 Oct 2006 22:04:35 +0200
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <871wpglyy4.fsf@topper.koldfront.dk>

On Tue, 10 Oct 2006 19:42:48 +0000, zhihua wrote:

> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
                                               ^
                                               y
[...]

> Can't locate object method "new" via package
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load
> "Bio::Tools::Run::AnalysisFactor::Pise"?)

You missed a 'y' in "Factory".


  Best wishes,

-- 
 "We've reached a special place... Spiritually...             Adam Sjøgren
  ecumenically... grammatically."                        asjo at koldfront.dk


From dmessina at wustl.edu  Tue Oct 10 17:08:45 2006
From: dmessina at wustl.edu (David Messina)
Date: Tue, 10 Oct 2006 16:08:45 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
	<6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
	<E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>
Message-ID: <A38D8092-7DCE-4C1E-A00A-2F7E78276ED9@wustl.edu>

> However, I think pod2html handles multiple package declarations in
> one module, and the PDOC online do as well.  Does the Deobfuscator?

Nope. From my cursory examination at the time they mostly were, as  
Lincoln said, short and sweet, so I didn't consider it a big deal.

I do think the Deobfuscator should theoretically handle such cases  
anyway, though. I'll add it as a feature request on the wiki page. Or  
if you're chomping at the bit for it, I could certainly be beer- 
suaded to do it sooner rather than later... :)

Dave


From cjfields at uiuc.edu  Tue Oct 10 17:33:39 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 16:33:39 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <A38D8092-7DCE-4C1E-A00A-2F7E78276ED9@wustl.edu>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
	<6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
	<E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>
	<A38D8092-7DCE-4C1E-A00A-2F7E78276ED9@wustl.edu>
Message-ID: <7F35F565-7D28-4B06-A501-4D4083652C5C@uiuc.edu>

Me?  I'm a lowly postdoc.  Lincoln's got the cash!

Chris

On Oct 10, 2006, at 4:08 PM, David Messina wrote:

>> However, I think pod2html handles multiple package declarations in
>> one module, and the PDOC online do as well.  Does the Deobfuscator?
>
> Nope. From my cursory examination at the time they mostly were, as  
> Lincoln said, short and sweet, so I didn't consider it a big deal.
>
> I do think the Deobfuscator should theoretically handle such cases  
> anyway, though. I'll add it as a feature request on the wiki page.  
> Or if you're chomping at the bit for it, I could certainly be beer- 
> suaded to do it sooner rather than later... :)
>
> Dave
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From sdavis2 at mail.nih.gov  Wed Oct 11 05:43:35 2006
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 11 Oct 2006 05:43:35 -0400
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
In-Reply-To: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <452CBCC7.30108@mail.nih.gov>

zhihua li wrote:
> Hi netters.
>
> I've installed Bioperl 1.5.1, both core and run modules. But when I
> tried to use the Pise module, an error occurred saying that there's no
> "new" method in this package.
>
> My script is:
>
> use strict;
> use warnings;
> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
> my $program=$factory->program('mfold');
> $program->seq('my_input_file');
> my $job = $program->run();
> print STDERR $job->contect('mfold.out');
>
> The error message I got is:
>
> Can't locate object method "new" via package
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load
> "Bio::Tools::Run::AnalysisFactor::Pise"?)
>
> I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm
> and it DOES contain a sub new.
>
> So what's going on? Anyone could give me a hint?
>
> Thanks a lot!

The module name is Bio::Tools::Run::AnalysisFactory::Pise. Note that it
is not "factor" but "factory". That should probably fix your problem.

Sean


From jay at jays.net  Sat Oct  7 18:34:23 2006
From: jay at jays.net (Jay Hannah)
Date: Sat, 07 Oct 2006 17:34:23 -0500
Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult
Message-ID: <45282B6F.1030308@jays.net>

I just updated my bioperl-live this morning, so I think I'm current. :)

perldoc Bio::Search::Result::GenericResult
------------
SYNOPSIS
           # typically one gets Results from a SearchIO stream
           use Bio::SearchIO;
           my $io = new Bio::SearchIO(-format => 'blast',
                                       -file   => 't/data/HUMBETGLOA.tblastx');
           while( my $result = $io->next_result) {
               # process all search results within the input stream
               while( my $hit = $result->next_hits()) {
-------------

Except that "next_hits()" does not exist. Should be "next_hit()".

(Should I have posted a patch instead?)

Thanks,

j


From bosborne11 at verizon.net  Tue Oct 10 18:42:25 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 10 Oct 2006 18:42:25 -0400
Subject: [Bioperl-l] Documentation typo:
	Bio::Search::Result::GenericResult
In-Reply-To: <45282B6F.1030308@jays.net>
Message-ID: <C1519A11.ABD1%bosborne11@verizon.net>

j,

No need, not for something so simple.

Brian O.


On 10/7/06 6:34 PM, "Jay Hannah" <jay at jays.net> wrote:

> Except that "next_hits()" does not exist. Should be "next_hit()".
> 
> (Should I have posted a patch instead?)


From zchou at cau.edu.cn  Wed Oct 11 02:34:24 2006
From: zchou at cau.edu.cn (zhuocheng Hou)
Date: Wed, 11 Oct 2006 14:34:24 +0800
Subject: [Bioperl-l] about retreive alinged sequence
Message-ID: <000a01c6ecff$4ea4b2f0$0915020a@zchou>

Hello,everyone,

I am a new user of Bioperl. I want to align multiple DNA sequences based on translated proteins by using clustalw. However, I don't know how to retrieve the aligned sequences.

The code is as follows (from the HOWTOPAML tutorial):

         #
         # This code runs, and the clustalw output is printed to the screen
         .......
         my $aa_aln = $aln_factory->align(\@prots,@params);
         # project the protein alignment back to CDS coordinates
         my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs);  
         my @each = $dna_aln->each_seq();         
         
         # The following code was written by me. However, it doesn't work. I want to get the $dna_aln contents and write them to a file. 


         my $in  = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta');
         my $aln=$dna_aln;
         my $out = Bio::AlignIO->new(-file => ">out.msf" ,
                                   -format => 'msf');
         #print $out $_ while <$in>; 
         while ($aln = $in->next_aln() ) {
               my $out->write_aln($aln);
         }
         

Best regards,

Zhuocheng
CAU


From n.haigh at sheffield.ac.uk  Wed Oct 11 10:00:33 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 11 Oct 2006 15:00:33 +0100
Subject: [Bioperl-l] about retreive alinged sequence
In-Reply-To: <000a01c6ecff$4ea4b2f0$0915020a@zchou>
References: <000a01c6ecff$4ea4b2f0$0915020a@zchou>
Message-ID: <452CF901.6020409@sheffield.ac.uk>

Dear Zhuocheng

I'm not familiar with the aa_to_dna_aln method, but from your code it 
appears to return an alignment object. Please find comments 
inserted below - hope they help!

Nathan

zhuocheng Hou wrote:
> Hello,everyone,
>
> I am a new user of Bioperl. I want to align multiple DNA sequences based on translated proteins by using clustalw. However, I don't know how to retrieve the aligned sequences.
>
> The codes as follows (from the tutorials of HOWTOPAML):
>
>          #
>          # These codes run  and can find the screen print out of clustalw
>          .......
>          my $aa_aln = $aln_factory->align(\@prots,@params);
>          # project the protein alignment back to CDS coordinates
>          my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs);  
>   
$dna_aln should be an alignment object (typically a Bio::SimpleAlign) 
rather than a Bio::AlignIO stream, so all you need to do is set up 
an output stream to write it, similar to what you wrote below, i.e.

my $out = Bio::AlignIO->new(-file => ">out.msf" ,
                                   -format => 'msf');

Then simply write the alignment ($dna_aln) to the output stream 
with this (note: no "my" here, since $out is already declared):

$out->write_aln($dna_aln);


>          my @each = $dna_aln->each_seq();         
>          
>          # The following code was written by me. However, it doesn't work. I want to get the $dna_aln contents and write them to a file. 
>
>
>          my $in  = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta');
>          my $aln=$dna_aln;
>          my $out = Bio::AlignIO->new(-file => ">out.msf" ,
>                                    -format => 'msf');
>          #print $out $_ while <$in>; 
>          while ($aln = $in->next_aln() ) {
>                my $out->write_aln($aln);
>          }
>          
>
> Best regards,
>
> Zhuocheng
> CAU
>
>   
> ------------------------------------------------------------------------
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From melcher at rescomp.berkeley.edu  Wed Oct 11 17:09:17 2006
From: melcher at rescomp.berkeley.edu (Graham Melcher)
Date: Wed, 11 Oct 2006 14:09:17 -0700
Subject: [Bioperl-l] Accessing GO through MYSQL?
Message-ID: <20061011210917.GA783@rescomp.berkeley.edu>

Hey all,

Preface:: This is my first post to this list, please redirect if my
questions belong elsewhere.  

I need to lookup GO ontology information given GO:Accessors, and I have
a local mysql db that mirrors the GO db from that website.  I am not
sure if the Bio::Ontology::* libraries were designed to be used in a
dynamic, load-as-you-need sort of way, and am wondering how other people
have gone about solving this problem.  Details follow...

Right now I'm using Class::DBI to access the MySQL database, and I have
made a new set of subclasses of Bio::Ontology::TermI and
Bio::Ontology::RelationshipI which use these Class::DBI objects to
access the relevant information in the database on the fly.
Unfortunately, I got stuck implementing some of the other
Bio::Ontology::*I interfaces, especially Ontology.  Making all of these
subclasses seems infeasible, or at least enough work that it might
already be available somewhere.  Are MySQL accessors out there, and I just
haven't found them, or is Bio::Ontology possibly not the way to go?
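
To illustrate the load-as-you-need idea in isolation, here is a minimal sketch in which the database query is replaced by a plain coderef. Everything in it is an assumption made for illustration: My::LazyTerm is a hypothetical class that does not inherit from Bio::Ontology::TermI (it only borrows the method names identifier and name), and the %fake_db hash stands in for the MySQL lookup.

```perl
use strict;
use warnings;

# Hypothetical term class: holds only an accession until a field is needed.
{
    package My::LazyTerm;

    sub new {
        my ( $class, %args ) = @_;
        # 'fetch' is a coderef standing in for a real database query.
        return bless { acc => $args{acc}, fetch => $args{fetch} }, $class;
    }

    sub identifier { return $_[0]{acc} }

    # Load the record on first access and cache it for later calls.
    sub _record {
        my ($self) = @_;
        $self->{record} ||= $self->{fetch}->( $self->{acc} );
        return $self->{record};
    }

    sub name { return $_[0]->_record->{name} }
}

# Stand-in for the MySQL lookup; counts how often it is called.
my $queries = 0;
my %fake_db = ( 'GO:0008150' => { name => 'biological_process' } );
my $fetch   = sub { $queries++; return $fake_db{ $_[0] } };

my $term = My::LazyTerm->new( acc => 'GO:0008150', fetch => $fetch );
print $term->identifier, "\n";    # no lookup yet
print $term->name,       "\n";    # triggers the first lookup
print $term->name,       "\n";    # served from the cache
print "queries: $queries\n";
```

In a real implementation the coderef would wrap whatever database layer is in use, and the class would need the rest of the interface filled in.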

Alternatively, if I end up having to write this sort of Bio::Ontology -
Class::DBI interface, would anyone be interested in it being made
generally usable and available?

Finally, I just found go-perl, but although I haven't had a lot of time
to look into it, it doesn't seem to use mysql either.

Thanks!

Graham

-- 
Graham Melcher


From sdavis2 at mail.nih.gov  Thu Oct 12 07:51:14 2006
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 12 Oct 2006 07:51:14 -0400
Subject: [Bioperl-l] Accessing GO through MYSQL?
In-Reply-To: <20061011210917.GA783@rescomp.berkeley.edu>
References: <20061011210917.GA783@rescomp.berkeley.edu>
Message-ID: <452E2C32.7070502@mail.nih.gov>

Graham Melcher wrote:
> Finally, I just found go-perl, but although I haven't had a lot of time
> to look into it, it doesn't seem to use mysql either.
>   
Yep.  Keep going.  Go-perl and Go-db-perl:

http://www.godatabase.org/dev/go-db-perl/doc/go-db-perl-doc.html

Sean


From hlapp at gmx.net  Thu Oct 12 00:44:49 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 12 Oct 2006 00:44:49 -0400
Subject: [Bioperl-l] NESCent Phyloinformatics Hackathon
Message-ID: <939B253E-2F87-450A-A277-78B5645D3494@gmx.net>

(apologies in advance to those who receive this multiple times)

The National Evolutionary Synthesis Center (NESCent) in collaboration  
with Arlin Stoltzfus (U. Maryland, NIST), Aaron Mackey (GSK), Rutger  
Vos (UBC), and Mark Holder (FSU) sponsors a Phyloinformatics  
Hackathon to take place Dec 11-15 in Durham, NC.

The (wiki) website with more information and a formal proposal is at

	https://www.nescent.org/wg_phyloinformatics/

In short, the goal is to leverage the Bio* toolkits to provide the  
"glue" for evolutionary analyses of various types that depend on  
automation, interoperability, and data integration.

CALL FOR INPUT:

The specific objectives are driven by "use cases", that is, specific  
target problems of interest to evolutionary biologists (click 'Use  
Cases' at the above website). We invite community input in order to  
focus efforts on the most urgent or pervasive problems. The wiki for  
the hackathon allows direct editing of the use cases after  
registration. You may also upload data files, or add comments to the  
"Forum" page. Alternatively, send email to hlapp at nescent.org. You  
may also contact any of the organizers with questions or comments.

ATTENDANCE:

The hackathon is scheduled for Dec 11-15, 2006 in Durham NC. Space is  
limited, and attendance is by invitation. If you have not been  
contacted but desire to attend, please contact Hilmar Lapp (hlapp at  
nescent.org).

ORGANIZERS:

Hilmar Lapp (NESCent; hlapp at nescent.org)
Aaron Mackey (GSK; aaron.j.mackey at gsk.com)
Mark Holder (FSU; mholder at scs.fsu.edu)
Arlin Stoltzfus (CARB, NIST; arlin.stoltzfus at nist.gov)
Todd Vision (NESCent; tjv at bio.unc.edu)
Rutger Vos (UBC; rvosa at sfu.ca)


From neetisomaiya at gmail.com  Thu Oct 12 02:03:20 2006
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 12 Oct 2006 11:33:20 +0530
Subject: [Bioperl-l] need help urgently
Message-ID: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>

We are using BioPerl to parse a BLAST output file, and then we want to load
full alignments into a CLOB column in one of our database tables. We are
trying to use sql loader for this. Does anyone have an idea how we can go
about it?
We have tried loading sequences into CLOB columns using sql loader, and that
works fine, but the same syntax does not work when used for loading
alignments.

-- 
-Neeti
Even my blood says, B positive


From sayali_salodkar at persistent.co.in  Thu Oct 12 06:16:34 2006
From: sayali_salodkar at persistent.co.in (Sayali)
Date: Thu, 12 Oct 2006 15:46:34 +0530
Subject: [Bioperl-l] regarding polyphred output
Message-ID: <006e01c6ede7$7f10bef0$00b1580a@persistent.co.in>

Hi, 

I want to parse the output of polyphred
(http://droog.gs.washington.edu/PolyPhred.html). Is there a parser already
available in Bioperl that would help me do this?

Thanks,

Sayali

 


From sdavis2 at mail.nih.gov  Thu Oct 12 06:40:12 2006
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 12 Oct 2006 06:40:12 -0400
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
Message-ID: <200610120640.12250.sdavis2@mail.nih.gov>

On Thursday 12 October 2006 02:03, neeti somaiya wrote:
> We are using BioPerl to parse a BLAST output file, and then we want to load
> full alignments into a CLOB column in one of our database tables. We are
> trying to use sql loader for the same. Anyone has an idea how we can go
> about it?
> We have tried loading sequences into CLOB columns using sql loader, and
> that works fine, but the same syntax when used for loading alignments, is
> not working.

Neeti,

You'll need to be a bit more specific about what you are doing.  Can you post 
the code you are using and error messages?  Also, what is "sql loader"?  And 
what database are you trying to use?

Sean


From crabtree at tigr.ORG  Thu Oct 12 07:28:06 2006
From: crabtree at tigr.ORG (Jonathan Crabtree)
Date: Thu, 12 Oct 2006 07:28:06 -0400
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
Message-ID: <452E26C6.6040800@tigr.org>


Hi Neeti-

neeti somaiya wrote:
> We are using BioPerl to parse a BLAST output file, and then we want to load
> full alignments into a CLOB column in one of our database tables. We are
> trying to use sql loader for the same. Anyone has an idea how we can go
> about it?
>   

This doesn't sound like a BioPerl issue per se, so this list might not
be the best venue for your question.  Since SQL*Loader is an Oracle
utility you may have better luck in a forum frequented by Oracle DBAs
and/or general bioinformatics people.  (Not that this isn't such a
forum, but unless your difficulty is actually being caused by BioPerl,
or there's some kind of SQL*Loader wrapper in BioPerl--which I don't
think is the case--you run the risk of having people complain that your
question doesn't have enough to do with BioPerl.)

> We have tried loading sequences into CLOB columns using sql loader, and that
> works fine, but the same syntax when used for loading alignments, is not
> working.
>   

It's been a while since I've done any work with SQL*Loader, but I'd
guess that the reason it works with sequences and not alignments is
because there are characters in the alignments (newlines, perhaps?) that
SQL*Loader is incorrectly interpreting as either column (field) or row
(record) delimiters.  You may need to change your flat file encoding to
use delimiters other than the defaults (and alter the SQL*Loader control
file accordingly.)  As Sean pointed out, however, it's difficult to be
much help without seeing an example of a failed input and the
corresponding error(s)!  One other thing I remember about SQL*Loader (as
of Oracle 8-9 or so) is that all the CLOB values had to appear *last* in
the SQL*Loader record, at least if you were using variable-length
fields.  But since you've loaded sequences successfully, I doubt this is
the issue.  One final thought is that I believe SQL*Loader has an option
whereby you can place your LOB values in files external to the main
SQL*Loader input file, which sidesteps the field/row delimiter issue
completely; you may want to look into this if you're not already loading
your Oracle database this way.
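
A minimal sketch of that external-file idea, in Perl. The record layout (an ID, a pipe, then a file name), the helper name write_lob_files and the aln_*.txt naming are assumptions made up for illustration, and the IDs are assumed to be safe to use in file names. The matching control-file syntax is not shown here; it should be taken from the Oracle SQL*Loader documentation.

```perl
use strict;
use warnings;
use File::Spec;

# Write each alignment to its own file and return the lines of a main data
# file that holds only "id|filename", so newlines inside an alignment can
# never be mistaken for record or field delimiters.
sub write_lob_files {
    my ($dir, $alignments) = @_;    # $alignments: hashref of id => text
    my @records;
    for my $id (sort keys %$alignments) {
        my $name = "aln_$id.txt";   # assumes $id is filename-safe
        my $path = File::Spec->catfile($dir, $name);
        open my $fh, '>', $path or die "Cannot write $path: $!";
        print {$fh} $alignments->{$id};
        close $fh or die "Cannot close $path: $!";
        push @records, "$id|$name";
    }
    return @records;
}
```

The returned records would then be written out as the main input file, with the alignment text itself living only in the side files.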

Jonathan


From bix at sendu.me.uk  Fri Oct 13 04:56:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 13 Oct 2006 09:56:01 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au>
References: <4521E74E.1040404@infotech.monash.edu.au>
Message-ID: <452F54A1.7010908@sendu.me.uk>

Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's 
certainly interface-like, but doesn't follow the normal interface naming 
convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed 
WrapperBaseI? Left alone?


From cjfields at uiuc.edu  Fri Oct 13 08:20:58 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 13 Oct 2006 07:20:58 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <452F54A1.7010908@sendu.me.uk>
References: <4521E74E.1040404@infotech.monash.edu.au>
	<452F54A1.7010908@sendu.me.uk>
Message-ID: <43CC4E80-8F15-4C83-929D-DDC719360C8F@uiuc.edu>

I would say, according to BioPerl convention, it should be renamed  
WrapperBaseI.  It has a few interface-like methods and (importantly)  
lacks a constructor.  Unless someone else out there has other reasoning?

Note that this will require lots of bioperl-run changes as well, at  
least I think it will.

Chris

On Oct 13, 2006, at 3:56 AM, Sendu Bala wrote:

> Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's
> certainly interface-like, but doesn't follow the normal interface  
> naming
> convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed
> WrapperBaseI? Left alone?
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From avilella at gmail.com  Fri Oct 13 11:26:47 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 13 Oct 2006 16:26:47 +0100
Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method
Message-ID: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com>

Hi all,

While using the remove_gaps method in Bio::SimpleAlign I found that
if the alignment is bad enough to have no gap-free columns at all,
the method gives a "Use of uninitialized value in split" warning at
this line in add_seq:

map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq);

So my idea was to tweak this line to something like:

map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || '');

But I am unsure about any other side effects this may have.
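
For what it's worth, here is a small stand-alone sketch of the difference, with no BioPerl involved (collect_symbols is a made-up helper, not the add_seq code). Using || '' silences the warning, but it also replaces any false string such as "0", whereas a defined test only replaces undef. For real sequence strings that rarely matters, so this only illustrates the one side effect I can think of.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the symbol-collecting line in add_seq.
sub collect_symbols {
    my ($str) = @_;
    $str = '' unless defined $str;    # only undef becomes the empty string
    my %symbols;
    $symbols{$_} = 1 for split //, $str;
    return \%symbols;
}

my $from_undef = collect_symbols(undef);    # no warning, no symbols
my $from_zero  = collect_symbols('0');      # '0' is kept, unlike with || ''
print scalar(keys %$from_undef), " ", join(',', sort keys %$from_zero), "\n";
```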

Anyone?

    Albert.


From cjfields at uiuc.edu  Fri Oct 13 11:51:38 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 13 Oct 2006 10:51:38 -0500
Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method
In-Reply-To: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com>
References: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com>
Message-ID: <EE9FE57F-EE17-44FE-B298-CD4084675085@uiuc.edu>

You can check to see if it passes all tests.  I'm guessing  
SimpleAlign.t tests this method out in some way (though it's always  
safer to check).

Chris

On Oct 13, 2006, at 10:26 AM, Albert Vilella wrote:

> Hi all,
>
> While using the remove_gaps method in Bio::SimpleAlign I found out
> that if the alignment is (bad enough for) having no columns without
> any gap at all, the method will give a:
>
> Use of uninitialized value in split at this line in add_seq:
>
> map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq);
>
> So my idea was to tweak this line to something like:
>
> map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || '');
>
> But I am unsure about any other side effects this may have.
>
> Anyone?
>
>     Albert.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jay at jays.net  Fri Oct 13 12:09:16 2006
From: jay at jays.net (Jay Hannah)
Date: Fri, 13 Oct 2006 11:09:16 -0500
Subject: [Bioperl-l] Documentation typo:
	Bio::Search::Result::GenericResult
In-Reply-To: <C1519A11.ABD1%bosborne11@verizon.net>
References: <C1519A11.ABD1%bosborne11@verizon.net>
Message-ID: <452FBA2C.7070003@jays.net>

Thanks Brian! 

My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :)

/home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v
----------------------------
revision 1.27
date: 2006/10/10 22:41:46;  author: bosborne;  state: Exp;  lines: +4 -4
next_hit, not next_hits
----------------------------

I'm a simple man who takes great satisfaction in the simple things. :)

j


Brian Osborne wrote:
> j,
> 
> No need, not for something so simple.
> 
> Brian O.
> 
> 
> On 10/7/06 6:34 PM, "Jay Hannah" <jay at jays.net> wrote:
>> Except that "next_hits()" does not exist. Should be "next_hit()".
>>
>> (Should I have posted a patch instead?)
> 


From jay at jays.net  Fri Oct 13 12:24:48 2006
From: jay at jays.net (Jay Hannah)
Date: Fri, 13 Oct 2006 11:24:48 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
Message-ID: <452FBDD0.2070008@jays.net>

So I'm doing the following:

1) Using Bio::SeqIO to read in a genbank file and kick out fasta.
2) Reading that fasta file w/ command line formatdb.
3) Using that output for command line blastall.
4) Using Bio::SearchIO to read the blast results.

(If there's a better way, do tell. -grin-)

This sequence is working great for nucleotide BLASTing, but I'm stuck on step 1 when trying protein BLAST. 

my $seq_in  = Bio::SeqIO->new(
   -file => "<Organism1.genbank", 
   -format => "genbank", 
   -alphabet => "protein"
);
my $seq_out_protein = Bio::SeqIO->new(
   -file => ">out",
   -format => 'fasta',
   -alphabet => 'protein'
);
while (my $inseq = $seq_in->next_seq) {
   $inseq->molecule("protein");
   $seq_out_protein->write_seq($inseq);
}

This creates a nucleotide file "out". Setting -alphabet doesn't seem to do anything. Setting molecule("protein") doesn't seem to do anything either.

I was expecting that it would just pull all the CDS strings out of the genbank file and dump those into fasta format?

Am I missing something obvious?

Thanks,

j


From bosborne11 at verizon.net  Fri Oct 13 12:54:02 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 13 Oct 2006 12:54:02 -0400
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <452FBDD0.2070008@jays.net>
Message-ID: <C1553CEA.AC2E%bosborne11@verizon.net>

Jay,

You're looking for the "translation" string in the CDS section, yes? You
need to delve a bit into features: the CDS is considered to be a feature of
the main or parent nucleotide sequence, and the translation is part of the
CDS feature:

http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank
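
A rough sketch of that approach, assuming the tag-based feature methods (primary_tag, has_tag, get_tag_values) described in the HOWTO. The name cds_to_fasta is made up, and the header layout is only an example. Bio::SeqIO is loaded only when a file name is given on the command line.

```perl
use strict;
use warnings;

# Return one FASTA record per CDS feature that carries a /translation tag.
# $seq is expected to behave like a Bio::Seq object read from a GenBank file.
sub cds_to_fasta {
    my ($seq) = @_;
    my @records;
    for my $feat ($seq->get_SeqFeatures) {
        next unless $feat->primary_tag eq 'CDS';
        next unless $feat->has_tag('translation');    # e.g. pseudogenes
        my ($protein) = $feat->get_tag_values('translation');
        # Fall back to the parent sequence ID when there is no protein_id.
        my ($id) = $feat->has_tag('protein_id')
                 ? $feat->get_tag_values('protein_id')
                 : ($seq->display_id);
        push @records, ">$id\n$protein\n";
    }
    return @records;
}

# Usage: perl cds2fasta.pl Organism1.genbank > proteins.fasta
if (@ARGV) {
    require Bio::SeqIO;
    my $in = Bio::SeqIO->new(-file => $ARGV[0], -format => 'genbank');
    while (my $seq = $in->next_seq) {
        print cds_to_fasta($seq);
    }
}
```

The records come back as plain strings rather than going through a fasta writer, which keeps the sketch short but leaves the protein on a single unwrapped line.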


Brian O.


On 10/13/06 12:24 PM, "Jay Hannah" <jay at jays.net> wrote:

> Am I missing something 


From bix at sendu.me.uk  Fri Oct 13 12:59:46 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 13 Oct 2006 17:59:46 +0100
Subject: [Bioperl-l] Documentation
	typo:	Bio::Search::Result::GenericResult
In-Reply-To: <452FBA2C.7070003@jays.net>
References: <C1519A11.ABD1%bosborne11@verizon.net> <452FBA2C.7070003@jays.net>
Message-ID: <452FC602.3080302@sendu.me.uk>

Jay Hannah wrote:
> Thanks Brian! 
> 
> My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :)
> 
> /home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v
> ----------------------------
> revision 1.27
> date: 2006/10/10 22:41:46;  author: bosborne;  state: Exp;  lines: +4 -4
> next_hit, not next_hits
> ----------------------------

Congratulations! :D

Next it will be two byte corrections and from there, the sky's the limit! :)


From hlapp at gmx.net  Fri Oct 13 13:28:50 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 13 Oct 2006 13:28:50 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <452F54A1.7010908@sendu.me.uk>
References: <4521E74E.1040404@infotech.monash.edu.au>
	<452F54A1.7010908@sendu.me.uk>
Message-ID: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>

What does the POD (and the code) say about instantiating it?

	-hilmar

On Oct 13, 2006, at 4:56 AM, Sendu Bala wrote:

> Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's
> certainly interface-like, but doesn't follow the normal interface  
> naming
> convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed
> WrapperBaseI? Left alone?

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jay at jays.net  Fri Oct 13 14:56:38 2006
From: jay at jays.net (Jay Hannah)
Date: Fri, 13 Oct 2006 13:56:38 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <C1553CEA.AC2E%bosborne11@verizon.net>
References: <C1553CEA.AC2E%bosborne11@verizon.net>
Message-ID: <452FE166.5080405@jays.net>

Brian Osborne wrote:
> You're looking for the "translation" string in the CDS section, yes? You
> need to delve a bit into features, the CDS is considered to be a feature of
> the main or parent nucleotide sequence and the translation is part of CDS
> feature:
> 
> http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank

Yes. Thanks. I "rolled my own" -- I'm now doing this:

while (my $inseq = $seq_in->next_seq) {
   my @features = $inseq->get_SeqFeatures();
   foreach my $feat ( @features ) {
      next unless ($feat->primary_tag eq "CDS");
      my @db_xrefs = $feat->annotation->get_Annotations("db_xref");
      @db_xrefs = grep { /^GI:/ } @db_xrefs;
      die "Panic! More than one GI: db_xref?"     if (@db_xrefs > 1);
      die "Panic! No GI: db_xref?"            unless (@db_xrefs == 1);
      my $gi = $db_xrefs[0];
      $gi =~ s/^GI://;
      my @translations = $feat->annotation->get_Annotations("translation");
      die "Panic! More than one translation?" if (@translations > 1);
      my @protein_ids = $feat->annotation->get_Annotations("protein_id");
      die "Panic! More than one protein_id?"  if (@protein_ids > 1);
      my @product = $feat->annotation->get_Annotations("product");
      die "Panic! More than one product?"  if (@product > 1);
      print ">gi|$gi|gb|$protein_ids[0]|";
      print $inseq->id . " $product[0]\n";
      print "$translations[0]\n";
   }
}

To generate a homebrew fasta file for a protein BLAST.

I just thought that -alphabet and molecule() would do that stuff for me? What else would "protein" mean in those? Does anyone use -alphabet and/or molecule()? For what? How? Again, here's what I'm talking about:

==========
my $seq_out_protein = Bio::SeqIO->new(
   -file => ">out",
   -format => 'fasta',
   -alphabet => 'protein'    # No effect?
);
while (my $inseq = $seq_in->next_seq) {
   $inseq->molecule("protein");    # No effect?
==========

Thanks,

j


From bosborne11 at verizon.net  Fri Oct 13 17:20:40 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 13 Oct 2006 17:20:40 -0400
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <452FE166.5080405@jays.net>
Message-ID: <C1557B68.AC3E%bosborne11@verizon.net>

Jay,

Yes, people use the -alphabet parameter. If you set it to something then
Bioperl will not try to determine whether the sequence is protein, rna, or
dna and this is particularly useful when the sequence contains characters
that Bioperl would object to (sequences with distasteful characters can be
created by various applications, for example, or you might introduce some
weird character for some reason). Setting the -alphabet would also speed up
Bioperl a bit, for the same reason.

Brian O.


On 10/13/06 2:56 PM, "Jay Hannah" <jay at jays.net> wrote:

> 
> I just thought that -alphabet and molecule() would do that stuff for me? What
> else would "protein" mean in those? 


From jay at jays.net  Sat Oct 14 11:25:05 2006
From: jay at jays.net (Jay Hannah)
Date: Sat, 14 Oct 2006 10:25:05 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <C1557B68.AC3E%bosborne11@verizon.net>
References: <C1557B68.AC3E%bosborne11@verizon.net>
Message-ID: <45310151.5050901@jays.net>

Brian Osborne wrote:
> Yes, people use the -alphabet parameter. If you set it to something then
> Bioperl will not try to determine whether the sequence is protein, rna, or
> dna and this is particularly useful when the sequence contains characters
> that Bioperl would object to (sequences with distasteful characters can be
> created by various applications, for example, or you might introduce some
> weird character for some reason). Setting the -alphabet would also speed up
> Bioperl a bit, for the same reason.

Huh. That's what I assumed when I stumbled into the -alphabet parameter. So I thought this would read the protein sequences out of my genbank file and write a fasta file for me:

my $seq_in  = Bio::SeqIO->new(
   -file     => "<$file",  
   -format   => "genbank",
   -alphabet => "protein"  # No effect?
);
my $seq_out = Bio::SeqIO->new(
   -file     => ">$outfile",
   -format   => "fasta",
   -alphabet => "protein"  # No effect?
);
while (my $inseq = $seq_in->next_seq) {
   $inseq->molecule("protein");    # No effect?
   $seq_out->write_seq($inseq);
}

It didn't. Would it be a Good Thing if it did what I was expecting? (Like I said I rolled my own, but I'm always looking for ways to enhance BioPerl that other people might find useful... Someday I will contribute something useful, by golly. -grin-)

(Background: I'm doing protein BLASTs from genbank files. To make formatdb happy I have to have fasta files full of the protein sequences.)

j


From bosborne11 at verizon.net  Sat Oct 14 14:40:21 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Sat, 14 Oct 2006 14:40:21 -0400
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <45310151.5050901@jays.net>
Message-ID: <C156A755.AC52%bosborne11@verizon.net>

Jay,

What you expected was that setting the -alphabet to "protein" would make
Bioperl translate the input nucleotide sequence to output protein. In
Bioperl this is accomplished by using the translate() method, no surprise
there. If you take a look at the documentation on translate() in the online
Bioperl Tutorial you'll see that this is a fairly sophisticated method, you
can do all sorts of different things with it. So using -alphabet for this
purpose won't really work, there are too many different ways to translate.

Brian O.


On 10/14/06 11:25 AM, "Jay Hannah" <jay at jays.net> wrote:

> Would it be a Good Thing if it did what I was expecting?


From cjfields at uiuc.edu  Sat Oct 14 20:44:04 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 14 Oct 2006 19:44:04 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <45310151.5050901@jays.net>
Message-ID: <000601c6eff3$084663c0$15327e82@pyrimidine>

...
> Huh. That's what I assumed when I stumbled into the -alphabet parameter.
> So I thought this would read the protein sequences out of my genbank file
> and write a fasta file for me:

You have to think about it this way: the GenBank record you are using is for
the nucleotide sequence only, and all other information in that record
describes the sequence.  Similarly, if you used a 'GenPept' sequence, the
focus would be the protein sequence.  Both normally contain annotations
which describe the sequence globally, such as references, organism info,
etc.  Both also may contain features (or SeqFeatures), which describe a
feature bound to a particular location on the sequence.  However, features
are not an absolute requirement for a sequence; they're sort of 'window
dressing', albeit almost always essential for describing the main sequence.

I would do exactly as Brian suggests.  See the Feature/Annotation HOWTO for
ideas on how to screen out the particular features you want and either grab
the 'translation' tag data or get the sequence object from the feature and
translate it directly.  You should get the same result either way though
getting the tag may be faster.

...

> It didn't. Would it be a Good Thing if it did what I was expecting? (Like
> I said I rolled my own, but I'm always looking for ways to enhance BioPerl
> that other people might find useful... Someday I will contribute something
> useful, by golly. -grin-)
> 
> (Background: I'm doing protein BLASTs from genbank files. To make formatdb
> happy I have to have fasta files full of the protein sequences.)
> 
> j

You could, theoretically, write a method that retrieves only the features
corresponding to coding regions (CDS).  You may want to optionally screen
out pseudogenes, but that's up to you.

Chris


From avilella at gmail.com  Sun Oct 15 07:08:23 2006
From: avilella at gmail.com (Albert Vilella)
Date: Sun, 15 Oct 2006 12:08:23 +0100
Subject: [Bioperl-l] no_residues test in SimpleAlign.t
Message-ID: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com>

Hi all,

Can somebody check the SimpleAlign.t test?

perl t/SimpleAlign.t

I get a few errors, and I am looking at the one that deals with
no_residues. I don't understand whether this is supposed to work:

sub no_residues {
    my $self = shift;
    my $count = 0;

    foreach my $seq ($self->each_seq) {
        my $str = $seq->seq();

        $count += ($str =~ s/[^A-Za-z]//g);
        # is this the same as:
        # $str =~ s/[^A-Za-z]//g;
        # $count += length($str);
    }
    return $count;
}

Cheers,

    Albert.
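
On the question in the code comment: as far as I know, the two forms are not the same. In Perl, s///g returns the number of substitutions it made, so the first form adds the number of non-letter characters that were removed (gaps and the like), while the second adds the number of letters left over. A stand-alone check in plain Perl follows; the sub names are invented for the comparison, and tr/// is just one way to count letters without modifying the string.

```perl
use strict;
use warnings;

# Form 1: the return value of s///g is how many characters were removed.
sub removed_count {
    my ($str) = @_;
    my $n = ($str =~ s/[^A-Za-z]//g);
    return $n || 0;    # s/// returns an empty string when nothing matched
}

# Form 2: length of what is left after removing the non-letters.
sub remaining_length {
    my ($str) = @_;
    $str =~ s/[^A-Za-z]//g;
    return length $str;
}

# Counting residues directly, leaving the string untouched.
sub residue_count {
    my ($str) = @_;
    return ($str =~ tr/A-Za-z//);
}

my $row = 'AC--GT.A';
printf "removed=%d remaining=%d residues=%d\n",
    removed_count($row), remaining_length($row), residue_count($row);
```

For the row 'AC--GT.A' the first form gives 3 (two dashes and a dot) and the other two give 5, so a method meant to count residues would want the second form or the tr/// count.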


From cjfields at uiuc.edu  Sun Oct 15 13:53:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 15 Oct 2006 12:53:50 -0500
Subject: [Bioperl-l] no_residues test in SimpleAlign.t
In-Reply-To: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com>
References: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com>
Message-ID: <FE798536-21DA-4377-96E2-0BF98C235970@uiuc.edu>

Albert,

I get all 75 tests passing.  SimpleAlign.t was recently switched over  
to Test::More, so you should be seeing more explicit test  
descriptions.  It looks like test 27 is no_residues().  Were there  
any more that failed?

I usually run 'perl -I. t/test.t' from the main bioperl directory to  
check individual tests from the local directory.  Otherwise you are  
checking your installed version which may be older (and may not match  
tests and recent bug fixes).  Could that be the problem?

Chris

On Oct 15, 2006, at 6:08 AM, Albert Vilella wrote:

> Hi all,
>
> Can somebody check the SimpleAlign.t test?
>
> perl t/SimpleAlign.t
>
> I get a few errors, and I am looking at the one that deals with
> no_residues. I don't understand whether this is supposed to work:
>
> sub no_residues {
>     my $self = shift;
>     my $count = 0;
>
>     foreach my $seq ($self->each_seq) {
>         my $str = $seq->seq();
>
>         $count += ($str =~ s/[^A-Za-z]//g);
>         # is this the same as:
>         # $str =~ s/[^A-Za-z]//g;
>         # $count += length($str);
>     }
>     return $count;
> }
>
> Cheers,
>
>     Albert.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From DGroskreutz at twt.com  Mon Oct 16 02:00:39 2006
From: DGroskreutz at twt.com (DGroskreutz at twt.com)
Date: Mon, 16 Oct 2006 01:00:39 -0500
Subject: [Bioperl-l] CN=Deb Groskreutz/OU=MSN/O=TWT is out of the office.
Message-ID: <OF66FF39D7.C58855EB-ON86257209.002104F9-86257209.002104F9@twt.com>


I will be out of the office starting 10/13/2006 and will not return until
10/30/2006. I will reply to your message at that time.

Thanks,
Deb


NOTICE OF CONFIDENTIALITY:
The information contained in this communication, including attachments, is intended for the specific delivery to and use by the individual(s) to whom it is addressed. This email includes confidential information that may be attorney-client privileged. Any review, retransmission, dissemination, or unauthorized use of this communication is strictly prohibited and may be unlawful. If you have received this communication in error, please reply to the sender immediately and delete the original communication and any copy of it from your computer system, including all attachments.


From bix at sendu.me.uk  Mon Oct 16 04:08:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 09:08:34 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>
References: <4521E74E.1040404@infotech.monash.edu.au>	<452F54A1.7010908@sendu.me.uk>
	<5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>
Message-ID: <45333E02.9070808@sendu.me.uk>

Hilmar Lapp wrote:
> What does the POD (and the code) say about instantiating it?

=head1 SYNOPSIS

   # do not use this object directly, it provides the following methods
   # for its subclasses

...


=head1 DESCRIPTION

This is a basic module from which to build executable wrapper modules.
It has some basic methods to help when implementing new modules.


There is no new() method.


From bix at sendu.me.uk  Mon Oct 16 09:23:41 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 14:23:41 +0100
Subject: [Bioperl-l] Bio::WebAgent sleep warning
Message-ID: <453387DD.3040105@sendu.me.uk>

Hi,

Does anyone think it's appropriate for Bio::WebAgent to issue warnings 
every time it sleeps? I'd consider the sleeping part of its normal, 
expected and desired behaviour so I don't need to be warned about it. 
Perhaps change the $self->warn to a $self->debug?
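
To make the suggestion concrete, here is a toy stand-alone class, not the Bio::WebAgent code. TinyAgent and its methods are invented, and the sketch only mimics what I understand to be the usual Bio::Root convention that debug() produces output when verbose is above zero.

```perl
use strict;
use warnings;

package TinyAgent;

sub new {
    my ($class, %args) = @_;
    return bless { verbose => $args{verbose} || 0, log => [] }, $class;
}

# Debug messages are recorded (a real class would print to STDERR) only
# when verbosity has been turned up.
sub debug {
    my ($self, $msg) = @_;
    push @{ $self->{log} }, $msg if $self->{verbose} > 0;
    return;
}

# Pretend to wait between requests; the notice goes through debug(), so a
# default (verbose = 0) run stays quiet.
sub sleep_between_requests {
    my ($self, $seconds) = @_;
    $self->debug("sleeping for $seconds seconds\n");
    # sleep $seconds;   # omitted so the sketch runs instantly
    return;
}

package main;

my $quiet  = TinyAgent->new;
my $chatty = TinyAgent->new(verbose => 1);
$_->sleep_between_requests(3) for $quiet, $chatty;
print scalar(@{ $quiet->{log} }), " vs ", scalar(@{ $chatty->{log} }), "\n";
```

Anyone who wants to confirm that the sleep is happening can still turn verbosity up, while a normal run sees nothing.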


From cjfields at uiuc.edu  Mon Oct 16 10:12:10 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 09:12:10 -0500
Subject: [Bioperl-l] Bio::WebAgent sleep warning
In-Reply-To: <453387DD.3040105@sendu.me.uk>
Message-ID: <000c01c6f12d$121b5000$15327e82@pyrimidine>

> Hi,
> 
> Does anyone think it's appropriate for Bio::WebAgent to issue warnings
> every time it sleeps? I'd consider the sleeping part of its normal,
> expected and desired behaviour so I don't need to be warned about it.
> Perhaps change the $self->warn to a $self->debug?

That sounds fine.  Using debugging output for sleep would be similar
behavior to Bio::DB::NCBIHelper and Bio::DB::GenericWebDBI.  You may want to
pass it by Heikki (I think that's his module).  

The only reason I would want to see sleep output, personally, is to make
sure it is working properly.

Almost looks like that class has the same intent that GenericWebDBI has
(even down to using LWP::UserAgent as a superclass).  I may look into it to
see if I can use this as a superclass for GenericWebDBI.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Mon Oct 16 10:26:21 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Mon, 16 Oct 2006 15:26:21 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
Message-ID: <4533968D.6040009@sheffield.ac.uk>

Did anyone reconfigure the bioperl web server (whichever server hosts
http://bioperl.org/DIST) by adding the following line to the httpd.conf
file:

RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*) http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1

This will be required as a workaround to a bug in ActivePerl 5.8.8.819
which will result in a failed install of Bioperl via PPM.

Cheers
Nath


From n.haigh at sheffield.ac.uk  Mon Oct 16 11:30:16 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Mon, 16 Oct 2006 16:30:16 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A257.2000207@campus.iztacala.unam.mx>
References: <4533968D.6040009@sheffield.ac.uk>
	<4533A257.2000207@campus.iztacala.unam.mx>
Message-ID: <4533A588.9020505@sheffield.ac.uk>

Mauricio Herrera Cuadra wrote:
> Done. Could you please check if it works as it should?
>
> Cheers,
> Mauricio.
Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
someone to pop it in http://bioperl/DIST

Volunteers?

BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
the PPD? I seem to remember that there was talk about having to maintain
a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
this front?

Nath


From arareko at campus.iztacala.unam.mx  Mon Oct 16 11:16:39 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 16 Oct 2006 10:16:39 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533968D.6040009@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>
Message-ID: <4533A257.2000207@campus.iztacala.unam.mx>

Done. Could you please check if it works as it should?

Cheers,
Mauricio.

Nathan Haigh wrote:
> Did anyone reconfigure the bioperl web server (which ever server hosts
> http://bioperl.org/DIST) by adding the following lines to the httpd.conf
> file:
> 
> RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*)
> http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1
> 
> This will be required as a workaround to a bug in ActivePerl 5.8.8.819
> which will result in a failed install of Bioperl via PPM.
> 
> Cheers
> Nath

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From arareko at campus.iztacala.unam.mx  Mon Oct 16 11:33:33 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 16 Oct 2006 10:33:33 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A588.9020505@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>
	<4533A257.2000207@campus.iztacala.unam.mx>
	<4533A588.9020505@sheffield.ac.uk>
Message-ID: <4533A64D.6040203@campus.iztacala.unam.mx>

Nathan Haigh wrote:
> Mauricio Herrera Cuadra wrote:
>> Done. Could you please check if it works as it should?
>>
>> Cheers,
>> Mauricio.
> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
> someone to pop it in http://bioperl/DIST
> 
> Volunteers?

You can send it to me.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From akarger at CGR.Harvard.edu  Mon Oct 16 11:54:33 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 16 Oct 2006 11:54:33 -0400
Subject: [Bioperl-l] Bio::Location::Split
Message-ID: <B9182BFF5B004245BABC12956EA6322E016B1ED7@huls5.nucleus.harvard.edu>

I recently came across bug 2101, where Bio::Location::Split::to_FTstring
gives the incorrect order for multi-sublocation locations on the minus
strand. That is, I found it by getting incorrect results, and then found
it in Bugzilla and in the September archives.

I'm converting CDS files from one format to another. E.g., I read an
EMBL file with a chromosome and CDS features, and want to output the
location in a FASTA header. If I do something like:

# $in is a Bio::SeqIO stream opened on the EMBL file
while (my $seq = $in->next_seq) {
    foreach my $feat ($seq->get_SeqFeatures) {
        print $feat->location->to_FTstring(), "\n";
    }
}

I get the wrong results for multi-exon CDSs on the -1 strand, as
described in the bug report.

Is there a relatively easy way around this? I assume I can't get at the
original string of the location, which in this case is all I need. Can I
just flip the order of the exons in certain cases? Chris F, can you tell
me the preliminary solution you mentioned?

I must say I'm sort of surprised this wasn't found before. It seems like
a not-that-rare occurrence. Oh well.
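
To make the "flip the order of the exons" idea concrete, here is a minimal
sketch in plain Perl. It does not use Bio::Location::Split and is not the fix
for bug 2101; the two sub names are hypothetical. It only shows the two
feature-table spellings of a minus-strand split location, which the INSDC
feature table definition treats as equivalent: ascending order inside a single
complement(join(...)), or descending (5'->3') order when each exon carries its
own complement().

```perl
use strict;
use warnings;

# Each exon is [start, end] in forward-strand coordinates (start <= end).

# Form 1: one complement() around a join() listed in ascending order.
sub minus_strand_ftstring {
    my @exons = sort { $a->[0] <=> $b->[0] } @_;
    my $inner = join ',', map { "$_->[0]..$_->[1]" } @exons;
    return @exons > 1 ? "complement(join($inner))" : "complement($inner)";
}

# Form 2: a join() of per-exon complement()s listed in descending order,
# i.e. the order in which the exons are read on the minus strand.
sub minus_strand_ftstring_per_exon {
    my @exons = sort { $b->[0] <=> $a->[0] } @_;
    my @parts = map { "complement($_->[0]..$_->[1])" } @exons;
    return @parts > 1 ? 'join(' . join(',', @parts) . ')' : $parts[0];
}

print minus_strand_ftstring([4918, 5163], [2691, 4571]), "\n";
print minus_strand_ftstring_per_exon([2691, 4571], [4918, 5163]), "\n";
```

Both subs sort their input, so the caller does not need to know in which order
the sublocations were stored.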

Thanks,

- Amir Karger
Research Computing
Life Sciences Division
Harvard University


From bix at sendu.me.uk  Mon Oct 16 12:14:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 17:14:39 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A588.9020505@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>	<4533A257.2000207@campus.iztacala.unam.mx>
	<4533A588.9020505@sheffield.ac.uk>
Message-ID: <4533AFEF.8080103@sendu.me.uk>

Nathan Haigh wrote:
> Mauricio Herrera Cuadra wrote:
>> Done. Could you please check if it works as it should?
>>
>> Cheers,
>> Mauricio.
> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
> someone to pop it in http://bioperl/DIST
> 
> Volunteers?

I'm sure Mauricio would be happy to do it, but so am I. You may want to 
hold off a little while until I release rc2, which may be a few hours away.


> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
> the PPD? I seem to remember that there was talk about having to maintain
> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
> this front?

It depends on what is in the PPD and what kind of auto-dependency 
features the ActiveState installer has. Given Perl 5.8 and your current 
PPD, does Bioperl install with the same or fewer number of skips if you 
also install Bundle::BioPerl first? That is, does Bundle::BioPerl even 
do anything useful anymore? If not, obviously don't bother making it a 
pre-req. If it does, my opinion is that you make it a pre-req. If people 
really don't want to install the optional stuff they can download the 
.zip file and install manually without even a make.


From Kevin.M.Brown at asu.edu  Mon Oct 16 12:14:51 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 16 Oct 2006 09:14:51 -0700
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
Message-ID: <1A4207F8295607498283FE9E93B775B402196FAA@EX02.asurite.ad.asu.edu>

> > Yes, people use the -alphabet parameter. If you set it to 
> something then
> > Bioperl will not try to determine whether the sequence is 
> protein, rna, or
> > dna and this is particularly useful when the sequence 
> contains characters
> > that Bioperl would object to (sequences with distasteful 
> characters can be
> > created by various applications, for example, or you might 
> introduce some
> > weird character for some reason). Setting the -alphabet 
> would also speed up
> > Bioperl a bit, for the same reason.
> 
> Huh. That's what I assumed when I stumbled into the -alphabet 
> parameter. So I thought this would read the protein sequences 
> out of my genbank file and write a fasta file for me:
> 
> my $seq_in  = Bio::SeqIO->new(
>    -file     => "<$file",  
>    -format   => "genbank",
>    -alphabet => "protein"  # No effect?
> );
> my $seq_out = Bio::SeqIO->new(
>    -file     => ">$outfile",
>    -format   => "fasta",
>    -alphabet => "protein"  # No effect?
> );
> while (my $inseq = $seq_in->next_seq) {
>    $inseq->molecule("protein");    # No effect?
>    $seq_out->write_seq($inseq);
> }
> 
> It didn't. Would it be a Good Thing if it did what I was 
> expecting? (Like I said I rolled my own, but I'm always 
> looking for ways to enhance BioPerl that other people might 
> find useful... Someday I will contribute something useful, by 
> golly. -grin-)
> 
> (Background: I'm doing protein BLASTs from genbank files. To 
> make formatdb happy I have to have fasta files full of the 
> protein sequences.)

This might work for your needs (CDS to protein FASTA).

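use Bio::SeqIO;

my $seq_in  = Bio::SeqIO->new(
   -file     => "<$file",
   -format   => "genbank",
);

open my $seq_out, '>', $outfile or die "Cannot write $outfile: $!";

while (my $inseq = $seq_in->next_seq) {
   print $seq_out ">". $inseq->display_id(). "\n";
   # translate() translates the record's whole sequence from its first base
   # and returns a sequence object; ->seq() gives the residue string
   print $seq_out $inseq->translate()->seq() ."\n";
}

close $seq_out;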


From bix at sendu.me.uk  Mon Oct 16 11:44:19 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 16:44:19 +0100
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
Message-ID: <4533A8D3.90709@sendu.me.uk>

I think Chris recently deprecated this, but should it be? For me, its 
POD description justifies its existence, and perhaps more importantly, 
Bio::Index::Blast relies on it.

I took a quick peek at the latter and it didn't seem trivial to move it 
over to Bio::SearchIO instead.

Should it be undeprecated?


From n.haigh at sheffield.ac.uk  Mon Oct 16 12:39:02 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Mon, 16 Oct 2006 17:39:02 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533AFEF.8080103@sendu.me.uk>
References: <4533968D.6040009@sheffield.ac.uk>	<4533A257.2000207@campus.iztacala.unam.mx>
	<4533A588.9020505@sheffield.ac.uk> <4533AFEF.8080103@sendu.me.uk>
Message-ID: <4533B5A6.1070701@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> Mauricio Herrera Cuadra wrote:
>>> Done. Could you please check if it works as it should?
>>>
>>> Cheers,
>>> Mauricio.
>> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
>> someone to pop it in http://bioperl/DIST
>>
>> Volunteers?
>
> I'm sure Mauricio would be happy to do it, but so am I. You may want
> to hold off a little while until I release rc2, which may be a few
> hours away.

Just e-mailed Mauricio links to the files off list. It's not a big job
for me to remake the bioperl PPD, so Mauricio, it's up to you if you want
to wait 18hrs for me to make the PPDs for 1.5.2-rc2.
>
>
>> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
>> the PPD? I seem to remember that there was talk about having to maintain
>> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
>> this front?
>
> It depends on what is in the PPD and what kind of auto-dependency
> features the ActiveState installer has. Given Perl 5.8 and your
> current PPD, does Bioperl install with the same or fewer number of
> skips if you also install Bundle::BioPerl first? That is, does
> Bundle::BioPerl even do anything useful anymore? If not, obviously
> don't bother making it a pre-req. If it does, my opinion is that you
> make it a pre-req. If people really don't want to install the optional
> stuff they can download the .zip file and install manually without
> even a make.
As far as the PPDs are concerned, no tests are run during installation.
PPM more or less just copies files into the correct place for Perl to
find, so both approaches result in the same thing. However, I've not
tried making a CPAN distribution file for either Bioperl or
Bundle::BioPerl - I wouldn't know where to start!

Makefile.PL now only documents the prereqs in one place (%packages), and
this is used to add the prereqs to the bioperl PPD when issuing "nmake
ppd". This way, each release of BioPerl should be up-to-date with its
prereqs as long as developers add their modules' prereqs to %packages. If
we have Bundle::BioPerl, most of those prereqs need to be moved from the
Bioperl PPD to the Bundle::BioPerl PPD - a bit of a pain because there are
no guidelines as to what should/should not go in Bundle::BioPerl.
Therefore, as far as the PPDs are concerned, it is far easier to do away
with Bundle::BioPerl.
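
As a rough illustration of the "one place" idea, the sketch below keeps a
single prereq table and derives both the MakeMaker form and PPD-style
dependency lines from it. The table contents, the flat module => version
layout, and the sub names are assumptions made for the example; the real
%packages in Makefile.PL may be structured differently.

```perl
use strict;
use warnings;

# Illustrative prereq table; the module names and versions are examples,
# not the actual contents of %packages in Makefile.PL.
my %packages = (
    'IO::String'     => '0',
    'XML::Parser'    => '2.31',
    'HTML::Entities' => '0',
);

# Form suitable for ExtUtils::MakeMaker's PREREQ_PM argument.
sub prereq_pm {
    return { %packages };
}

# PPD files name distributions with '-' in place of '::'.
sub ppd_dependencies {
    return map {
        (my $dist = $_) =~ s/::/-/g;
        qq{<DEPENDENCY NAME="$dist" />};
    } sort keys %packages;
}

print "$_\n" for ppd_dependencies();
```

Because both outputs come from the same hash, adding a module there is the
only step a developer has to remember.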

Nath


From hlapp at gmx.net  Mon Oct 16 13:04:24 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:04:24 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <45333E02.9070808@sendu.me.uk>
References: <4521E74E.1040404@infotech.monash.edu.au>	<452F54A1.7010908@sendu.me.uk>
	<5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>
	<45333E02.9070808@sendu.me.uk>
Message-ID: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>

So it looks like an abstract base class, not an interface that  
defines a contract or API? Should use Root.pm then, would be my vote.
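
A small sketch of the distinction, using made-up package names (My::*) rather
than the real Bio::Root classes: the interface only names what must be
implemented, while the abstract base class carries working helper methods and
therefore wants a constructor from a Root-like parent.

```perl
use strict;
use warnings;

# Interface: names the methods a class must provide, implements nothing.
package My::WrapperI;
sub program_name { die ref(shift) . " must implement program_name()\n" }

# Root-like class (hypothetical): supplies the constructor.
package My::Root;
sub new { my ($class, %args) = @_; return bless {%args}, $class }

# Abstract base class: shares working helper methods with its subclasses,
# so it inherits the constructor instead of only declaring a contract.
package My::WrapperBase;
our @ISA = ('My::Root', 'My::WrapperI');
sub executable {
    my $self = shift;
    return join '/', $self->{program_dir}, $self->program_name;
}

# Concrete subclass: fills in the one method the base class leaves open.
package My::Wrapper::Blastall;
our @ISA = ('My::WrapperBase');
sub program_name { 'blastall' }

package main;
my $wrapper = My::Wrapper::Blastall->new(program_dir => '/usr/local/bin');
print $wrapper->executable, "\n";
```

Calling executable() on a bare My::WrapperBase object dies with the "must
implement" message, which is the behaviour one would expect from a class that
is not meant to be used directly.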

	-hilmar

On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> What does the POD (and the code) say about instantiating it?
>
> =head1 SYNOPSIS
>
>    # do not use this object directly, it provides the following  
> methods
>    # for its subclasses
>
> ...
>
>
> =head1 DESCRIPTION
>
> This is a basic module from which to build executable wrapper modules.
> It has some basic methods to help when implementing new modules.
>
>
> There is no new() method.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Mon Oct 16 13:08:28 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:08:28 -0400
Subject: [Bioperl-l] Bio::WebAgent sleep warning
In-Reply-To: <453387DD.3040105@sendu.me.uk>
References: <453387DD.3040105@sendu.me.uk>
Message-ID: <B6C9C0EE-00DD-4B16-98D9-2BBBE6CD835E@gmx.net>

It depends. What triggers the sleeping? If it's part of every request
that it processes then I'd agree. If it is triggered by a failure
preceding the next try then the failure is probably not expected
(though possible), and hence should be reported by warn().

If it is just part of the polling cycle then there should probably be  
a limit up to which the time waited is considered 'normal' and after  
which it is considered 'excessive' and hence should be reported  
through warn().

My $0.02.

	-hilmar

On Oct 16, 2006, at 9:23 AM, Sendu Bala wrote:

> Hi,
>
> Does anyone think it's appropriate for Bio::WebAgent to issue warnings
> every time it sleeps? I'd consider the sleeping part of its normal,
> expected and desired behaviour so I don't need to be warned about it.
> Perhaps change the $self->warn to a $self->debug?
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Mon Oct 16 13:13:53 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 18:13:53 +0100
Subject: [Bioperl-l] Bio::WebAgent sleep warning
In-Reply-To: <B6C9C0EE-00DD-4B16-98D9-2BBBE6CD835E@gmx.net>
References: <453387DD.3040105@sendu.me.uk>
	<B6C9C0EE-00DD-4B16-98D9-2BBBE6CD835E@gmx.net>
Message-ID: <4533BDD1.8060204@sendu.me.uk>

Hilmar Lapp wrote:
> It depends. What triggers the sleeping? If it's part of every request 
> that it processes then I'd agree. If it is triggered by failure to 
> precede the next try then the failure is probably not expected (though 
> possible), and hence should be reported by warn().
> 
> If it is just part of the polling cycle then there should probably be a 
> limit up to which the time waited is considered 'normal' and after which 
> it is considered 'excessive' and hence should be reported through warn().

=head2 sleep

  Title   : sleep
  Usage   : $self->sleep
  Function: sleep for a number of seconds indicated by the delay policy
  Returns : none
  Args    : none

NOTE: This method keeps track of the last time it was called and only
imposes a sleep if it was called more recently than the delay_policy()
allows.

=cut

It issues a warning every time it actually sleeps. I find it 
inappropriate that a method warns me that it did what I asked it to do.
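
For what it's worth, here is a minimal sketch of how the two suggestions in
this thread could be combined: report a routine sleep through debug() and keep
warn() for waits that exceed some limit. My::Throttle, its arguments and the
10 second limit are invented for the example; this is not the Bio::WebAgent
code. The clock and sleeper are injectable only so the logic can be exercised
without really waiting.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an agent with a delay policy.
package My::Throttle;

sub new {
    my ($class, %args) = @_;
    my $self = {
        delay      => exists $args{delay}      ? $args{delay}      : 3,
        warn_after => exists $args{warn_after} ? $args{warn_after} : 10,
        verbose    => $args{verbose} ? 1 : 0,
        clock      => $args{clock}   || sub { time },
        sleeper    => $args{sleeper} || sub { sleep $_[0] },
        last_call  => undef,
    };
    return bless $self, $class;
}

# Only printed when the caller asked for verbose output.
sub debug {
    my ($self, $msg) = @_;
    print STDERR "DEBUG: $msg\n" if $self->{verbose};
}

# Sleep only if called sooner than the delay allows; returns seconds slept.
sub sleep_if_needed {
    my $self = shift;
    my $wait = 0;
    if (defined $self->{last_call}) {
        my $elapsed = $self->{clock}->() - $self->{last_call};
        $wait = $self->{delay} - $elapsed if $elapsed < $self->{delay};
    }
    if ($wait > 0) {
        if ($wait > $self->{warn_after}) {
            warn "sleeping for an excessive $wait seconds\n";
        }
        else {
            $self->debug("sleeping for $wait seconds");
        }
        $self->{sleeper}->($wait);
    }
    $self->{last_call} = $self->{clock}->();
    return $wait;
}

package main;
my $throttle = My::Throttle->new(delay => 1, verbose => 1);
$throttle->sleep_if_needed for 1 .. 2;
```

With verbose left at 0 the normal case is silent, which matches the view that
a method should not warn about doing what it was asked to do.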


From arareko at campus.iztacala.unam.mx  Mon Oct 16 13:14:06 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 16 Oct 2006 12:14:06 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533B5A6.1070701@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>	<4533A257.2000207@campus.iztacala.unam.mx>	<4533A588.9020505@sheffield.ac.uk>
	<4533AFEF.8080103@sendu.me.uk> <4533B5A6.1070701@sheffield.ac.uk>
Message-ID: <4533BDDE.2040801@campus.iztacala.unam.mx>

Nathan Haigh wrote:
> Sendu Bala wrote:
>> Nathan Haigh wrote:
>>> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
>>> someone to pop it in http://bioperl/DIST
>>>
>>> Volunteers?
>> I'm sure Mauricio would be happy to do it, but so am I. You may want
>> to hold off a little while until I release rc2, which may be a few
>> hours away.
> 
> Just e-mailed Mauricio links to the files off list, It's not a big job
> for me to remake the bioperl PPD, so Mauricio it's up to you if you want
> to wait 18hrs for me to make the PPDs for 1.5.2-rc2.

Too late, I've already placed 1.5.2-rc1 in DIST. hehe :)

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From bix at sendu.me.uk  Mon Oct 16 12:32:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 17:32:11 +0100
Subject: [Bioperl-l] Swissprot problems
Message-ID: <4533B40B.2030908@sendu.me.uk>

t/Biofetch.t and t/DB.t are skipping their swissprot database fetches.
Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for 
maintenance but is now back up. However I'm guessing the databases must 
have changed. I've manually looked for the test case 'YNB3_YEAST' in 
database 'UniProtKB' and it came back with no result, even though I can 
find the test case manually at the expasy website.

Is this an EBI bug or deliberate change that makes sense to someone?


From m.weimer at dkfz-heidelberg.de  Mon Oct 16 12:43:38 2006
From: m.weimer at dkfz-heidelberg.de (Marc Weimer)
Date: Mon, 16 Oct 2006 18:43:38 +0200
Subject: [Bioperl-l] Bio::DB::SwissProt Problem
Message-ID: <1161017019.5203.6.camel@localhost>

Dear list members,

when running 

######################################################################
#! /usr/bin/perl -w

use strict;
use Bio::DB::SwissProt;

my $db_obj = new Bio::DB::SwissProt(-verbose => 1);

my $seq_obj = $db_obj->get_Seq_by_acc("O02938");
######################################################################

using Bioperl 1.5.2 I get the following message:

##########################################################################################

request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch
Content-Length: 49
Content-Type: application/x-www-form-urlencoded

format=swissprot&db=UniProtKB&style=raw&id=O02938


------------- EXCEPTION: Bio::Root::Exception -------------
MSG: acc O02938 does not exist
STACK: Error::throw
STACK:
Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350
STACK:
Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181
STACK: ./get.test.pl:8
-----------------------------------------------------------

##########################################################################################

But the accession number does exist. Surprisingly, everything worked
fine a few days ago. Any ideas of what might have happened?

Thanks and best regards,

Marc

 
From hlapp at gmx.net  Mon Oct 16 13:15:50 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:15:50 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533A8D3.90709@sendu.me.uk>
References: <4533A8D3.90709@sendu.me.uk>
Message-ID: <C53CD6F5-6E0D-49B7-B280-A51C3A2B991F@gmx.net>

The problem is it is not maintained, and there are outstanding
bug reports.

If you un-deprecate it, then we need a response to people who come
across problems with it when using it. Either you change the POD to
say exactly who should use it and when (or rather not) and point
to the fact that it is unsupported for all other cases.

Or what would you suggest?

	-hilmar

On Oct 16, 2006, at 11:44 AM, Sendu Bala wrote:

> I think Chris recently deprecated this, but should it be? For me, its
> POD description justifies its existence, and perhaps more importantly,
> Bio::Index::Blast relies on it.
>
> I took a quick peek at the latter and it didn't seem trivial to  
> move it
> over to Bio::SearchIO instead.
>
> Should it be undeprecated?

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Oct 16 13:21:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:21:46 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533A8D3.90709@sendu.me.uk>
Message-ID: <000001c6f147$8efdfd60$15327e82@pyrimidine>

Bio::Tools::BPlite was placed on the deprecation list a while back (~ rel
1.5); the other related Bio::Tools::BP* modules were supposed to be on
that list as well.

If we want to undeprecate (de-deprecate? reprecate?) BPlite we also would
need to do the same for the others.  They must be updated to parse current
BLAST/PSI-BLAST/bl2seq text output, something that Bio::SearchIO::blast is
currently capable of (so the functionality is redundant).  And someone needs
to take them over.

In my opinion it may be more trouble than it's worth as they haven't been
touched in a while.    Seems if we 'revive' BPlite we're not really moving
forward esp. since you have added the PullParser recently and made
substantial improvements to SearchIO.  

Maybe Bio::Index::Blast just needs to be deprecated or rewritten to use
SearchIO?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Monday, October 16, 2006 10:44 AM
> To: bioperl-l
> Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
> 
> I think Chris recently deprecated this, but should it be? For me, its
> POD description justifies its existence, and perhaps more importantly,
> Bio::Index::Blast relies on it.
> 
> I took a quick peek at the latter and it didn't seem trivial to move it
> over to Bio::SearchIO instead.
> 
> Should it be undeprecated?


From bix at sendu.me.uk  Mon Oct 16 13:21:58 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 18:21:58 +0100
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C53CD6F5-6E0D-49B7-B280-A51C3A2B991F@gmx.net>
References: <4533A8D3.90709@sendu.me.uk>
	<C53CD6F5-6E0D-49B7-B280-A51C3A2B991F@gmx.net>
Message-ID: <4533BFB6.5070504@sendu.me.uk>

Hilmar Lapp wrote:
> The problem is it is not maintained, and there are outstanding been bug 
> reports.
> 
> If you un-deprecate it, then we need a response to people who come 
> across problems with it when using it. Either you change the POD to say 
> exactly who and when one should use it (or rather not) and point to the 
> fact that it is unsupported for all other cases.
> 
> Or what would you suggest?

I'm not sure.

Does Bio::Index::Blast even work correctly? Does it suffer from whatever 
bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should 
that be deprecated as well?

Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO 
and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't 
seem trivial (or even appropriate).

Ultimately I just wanted to solve the warnings in the test suite. 
Thoughts, Chris?


From cjfields at uiuc.edu  Mon Oct 16 13:30:05 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:30:05 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A588.9020505@sheffield.ac.uk>
Message-ID: <000101c6f148$b8538b20$15327e82@pyrimidine>

> Mauricio Herrera Cuadra wrote:
> > Done. Could you please check if it works as it should?
> >
> > Cheers,
> > Mauricio.
> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
> someone to pop it in http://bioperl/DIST
> 
> Volunteers?
> 
> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
> the PPD? I seem to remember that there was talk about having to maintain
> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
> this front?
> 
> Nath

Nathan,

I think Chris Dagdigian still maintains Bundle::BioPerl on CPAN.  That
version should be the common basis for prereqs for any Bioperl core
installation.

It's relatively easy to add modules to Bundle::BioPerl or remove them from
it.  Contact Chris D. and let him know if anything needs to be changed.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 16 13:33:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:33:50 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
Message-ID: <000201c6f149$3ed63490$15327e82@pyrimidine>

> So it looks like an abstract base class, not an interface that
> defines a contract or API? Should use Root.pm then, would be my vote.
> 
> 	-hilmar

Makes sense to me.  Maybe another audit is needed to catch similar
instances, or has this been done already?

Chris

> On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote:
> 
> > Hilmar Lapp wrote:
> >> What does the POD (and the code) say about instantiating it?
> >
> > =head1 SYNOPSIS
> >
> >    # do not use this object directly, it provides the following
> > methods
> >    # for its subclasses
> >
> > ...
> >
> >
> > =head1 DESCRIPTION
> >
> > This is a basic module from which to build executable wrapper modules.
> > It has some basic methods to help when implementing new modules.
> >
> >
> > There is no new() method.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct 16 13:57:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:57:35 -0500
Subject: [Bioperl-l] Bio::Location::Split
In-Reply-To: <B9182BFF5B004245BABC12956EA6322E016B1ED7@huls5.nucleus.harvard.edu>
Message-ID: <000301c6f14c$8fb0e060$15327e82@pyrimidine>

> I recently came across bug 2101, where Bio::Location::Split::to_FTstring
> gives the incorrect order for multi-sublocation locations on the minus
> strand. That is, I found it by getting incorrect results, and then found
> it in Bugzilla and in the September archives.
>
> I'm converting CDS files from one format to another. E.g., I read an
> EMBL file with a chromosome and CDS features, and want to output the
> location in a FASTA header. If I do something like:
> 
> foreach (<$in>) {
>     foreach my $feat ($seq->getSeqFeatures) {
>         print $feat->location->to_FTstring()
>     }
> }
> 
> I get the wrong results for multi-exon CDSs on the -1 strand, as
> described in the bug report.
> 
> Is there a relatively easy way around this? I assume I can't get at the
> original string of the location, which in this case is all I need. Can I
> just flip the order of the exons in certain cases? Chris F, can you tell
> me the preliminary solution you mentioned?
> 
> I must say I'm sort of surprised this wasn't found before. It seems like
> a not-that-rare occurrence. Oh well.
> 
> Thanks,
> 
> - Amir Karger
> Research Computing
> Life Sciences Division
> Harvard University

Could you let me know specifically which EMBL file contains the odd
locations?  The bug report uses theoretical locations, not actual ones, so
it would be nice to have a real-world example to test against.  

As for the lack of catching this, the particular types of locations that
cause the issue are quite rare.  Note that there are two bugs for that bug
report.  The first (and more serious) is still unresolved.  The second
(where remote locations are treated differently in Location::Split, which
caused more problems than it was worth) had a fix committed about a month
ago.  

Any fixes I have made for the first bug invariably break several other
methods, which use the current Location::Split object logic for retrieving
sequences, building feature strings, etc.  Since a new RC is imminent and
the bug only affects a small number of locations, I have held off until
after a final release is made (the last thing I want to do is fix something
that breaks ~6-8 other methods), but I'll try looking at it again this week.


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct 16 14:29:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 13:29:02 -0500
Subject: [Bioperl-l] Swissprot problems
In-Reply-To: <4533B40B.2030908@sendu.me.uk>
Message-ID: <000401c6f150$f57dfc30$15327e82@pyrimidine>


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Monday, October 16, 2006 11:32 AM
> To: bioperl-l
> Subject: [Bioperl-l] Swissprot problems
> 
> t/Biofetch.t and t/DB.t are skipping their swissprot database fetches.
> Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for
> maintenance but is now back up. However I'm guessing the databases must
> have changed. I've manually looked for the test case 'YNB3_YEAST' in
> database 'UniProtKB' and it came back with no result, even though I can
> find the test case manually at the expasy website.
> 
> Is this an EBI bug or deliberate change that makes sense to someone?

I can confirm that.  It's not our end, though.  Entering the same data on
the DBFetch web page also gets no data.  I have emailed EBI about the
problem and will let you know if I hear anything; I think it's related to
the maintenance issue.

Notably, nothing on the web page indicates any database name changes yet.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 16 14:29:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 13:29:52 -0500
Subject: [Bioperl-l] Bio::DB::SwissProt Problem
In-Reply-To: <1161017019.5203.6.camel@localhost>
Message-ID: <000501c6f151$12918710$15327e82@pyrimidine>

We think there is a problem on the SwissProt (DBFetch) server.  I have
contacted them about the problem and will post when I hear back.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Marc Weimer
> Sent: Monday, October 16, 2006 11:44 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Bio::DB::SwissProt Problem
> 
> Dear list members,
> 
> when running
> 
> ######################################################################
> #! /usr/bin/perl -w
> 
> use strict;
> use Bio::DB::SwissProt;
> 
> my $db_obj = new Bio::DB::SwissProt(-verbose => 1);
> 
> my $seq_obj = $db_obj->get_Seq_by_acc("O02938");
> ######################################################################
> 
> using Bioperl 1.5.2 I get the following message:
> 
> ##########################################################################
> ################
> 
> request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch
> Content-Length: 49
> Content-Type: application/x-www-form-urlencoded
> 
> format=swissprot&db=UniProtKB&style=raw&id=O02938
> 
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: acc O02938 does not exist
> STACK: Error::throw
> STACK:
> Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350
> STACK:
> Bio::DB::WebDBSeqI::get_Seq_by_acc
> /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181
> STACK: ./get.test.pl:8
> -----------------------------------------------------------
> 
> ##########################################################################
> ################
> 
> But the accession number does exist. Surprisingly, everything worked
> fine a few days ago. Any ideas of what might have happened?
> 
> Thanks and best regards,
> 
> Marc
> 
> 
> 
> 
> 


From cjfields at uiuc.edu  Mon Oct 16 14:39:28 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 13:39:28 -0500
Subject: [Bioperl-l] SwissProt Down
Message-ID: <000601c6f152$6997dbd0$15327e82@pyrimidine>

Looks like the swissprot problem stems from maintenance at EBI.  From the
EBI page http://www.ebi.ac.uk/Information/ (not on the DBFetch page, BTW):

Please Note: Monday October 16th 12:00-15:00 -  Due to general maintenance,
some services from the EBI may be temporarily unavailable. We apologise for
any inconvenience.

At least we know that Test::More skips are working!
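
For anyone who hasn't looked at how the skips are arranged, the pattern is
roughly the following. This is a self-contained sketch, not one of the actual
t/*.t files: fetch_entry() is a made-up stand-in for a remote call such as
get_Seq_by_acc(), and here it always dies to imitate a server that is down.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical fetcher; always fails, as if the remote server were down.
sub fetch_entry {
    my ($id) = @_;
    die "server unavailable\n";
}

ok(1, 'tests that need no network always run');

SKIP: {
    # Remote tests are only attempted when explicitly requested.
    skip 'network tests not requested (BIOPERLDEBUG is not set)', 2
        unless $ENV{BIOPERLDEBUG};

    # A failed fetch skips the dependent tests instead of failing them.
    my $entry = eval { fetch_entry('YNB3_YEAST') };
    if ($@) {
        chomp(my $why = $@);
        skip "remote server problem: $why", 2;
    }

    ok(defined $entry, 'entry retrieved');
    like($entry, qr/YNB3_YEAST/, 'entry has the requested ID');
}
```

Either way the two remote tests are reported as skipped, so only the block
that needs the server is affected and the rest of the script still runs.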

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From bix at sendu.me.uk  Mon Oct 16 14:51:31 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 19:51:31 +0100
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C15946CA.ACA9%bosborne11@verizon.net>
References: <C15946CA.ACA9%bosborne11@verizon.net>
Message-ID: <4533D4B3.2000809@sendu.me.uk>

Brian Osborne wrote:
> Sendu,
> 
> I just made a commit that makes Bio::Index::Blast use SearchIO instead of
> BPlite.

I was concerned about the whole id_parser thing. Did you determine that
your change still allows for id_parser to be used and have the intended
effect, or that id_parser is in some way meaningless and should be
removed as a method?


From cjfields at uiuc.edu  Mon Oct 16 15:03:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 14:03:33 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533BFB6.5070504@sendu.me.uk>
Message-ID: <000301c6f155$c7029ff0$15327e82@pyrimidine>

> Hilmar Lapp wrote:
> > The problem is it is not maintained, and there are outstanding been bug
> > reports.
> >
> > If you un-deprecate it, then we need a response to people who come
> > across problems with it when using it. Either you change the POD to say
> > exactly who and when one should use it (or rather not) and point to the
> > fact that it is unsupported for all other cases.
> >
> > Or what would you suggest?
> 
> I'm not sure.
> 
> Does Bio::Index::Blast even work correctly? Does it suffer from whatever
> bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should
> that be deprecated as well?
> 
> Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO
> and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't
> seem trivial (or even appropriate).
> 
> Ultimately I just wanted to solve the warnings in the test suite.
> Thoughts, Chris?

My opinion is we either have to completely support BPlite (and the others)
or drop it altogether.  I don't think we can state "use BPlite only with
Bio::Index::Blast, use SearchIO everywhere else."  That's too inconsistent.


It seems simpler to deprecate the various Bio::Tools::BP* classes and either
fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working
on) or deprecate Bio::Index::Blast as well.  

The warnings in the test suite belong to BlastIndex.t, correct?  I updated
using Brian's Bio::Index::Blast fix and it passes now w/o warnings.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From akarger at CGR.Harvard.edu  Mon Oct 16 15:00:28 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 16 Oct 2006 15:00:28 -0400
Subject: [Bioperl-l] Bio::Location::Split
Message-ID: <B9182BFF5B004245BABC12956EA6322E016B1F89@huls5.nucleus.harvard.edu>

 
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu] 
> >
> > I'm converting CDS files from one format to another. E.g., I read an
> > EMBL file with a chromosome and CDS features, and want to output the
> > location in a FASTA header.
> > 
> > I get the wrong results for multi-exon CDSs on the -1 strand, as
> > described in the bug report.
> > 
>
> Could you let me know specifically which EMBL file contains the odd
> locations?  The bug report uses theoretical locations, not 
> actual ones, so
> it would be nice to have a real-world example to test against. 

I downloaded candida glabrata chromosome B from EBI:
http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948

testportal>perl location.pl new_glabrata_B.embl > bio
testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/'
new_glabrata_B.embl > nonbio
testportal>wc bio nonbio
 217  217 4537 bio
 217  217 4549 nonbio
 434  434 9086 total
testportal>diff bio nonbio
4c4
< complement(join(10632..11157,10347..10372))
---
> join(complement(10632..11157),complement(10347..10372))

Just one example here, but see below.
 
> As for the lack of catching this, the particular types of 
> locations that
> cause the issue are quite rare.  

Really? I guess our definitions of rare depend on which sequences we're
working with. I'm doing fungal genomes, and here's a grep for a few
species' entire genomes:

testportal>foreach i ( *.embl )
foreach? echo $i
foreach? grep CDS $i | grep join | grep -c complement
foreach? end
glabrata_orf.embl
29
hansenii_orf.embl
151
lactis_orf.embl
70
lipolytica_orf.embl
337
pombe_orf.embl
1137

You might like to use pombe as a test case, as it has lots of these
complement joins, including ones with multiple introns.
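
For anyone without a csh prompt handy, the grep pipeline above can be approximated in a few lines of Python. This is only a sketch: the function name and the sample text are made up for illustration, and like the grep it only inspects single lines, so a location that wraps onto a continuation line may be missed.

```python
def count_complement_joins(embl_text):
    """Count lines mentioning CDS, join and complement.

    Mirrors `grep CDS | grep join | grep -c complement`: a plain
    substring test per line, with no real EMBL parsing.
    """
    count = 0
    for line in embl_text.splitlines():
        if "CDS" in line and "join" in line and "complement" in line:
            count += 1
    return count


# Hypothetical four-line excerpt; only the first line matches all three tests.
sample = (
    "FT   CDS             join(complement(10632..11157),complement(10347..10372))\n"
    "FT   CDS             complement(100..200)\n"
    "FT   CDS             join(1..5,10..20)\n"
    "FT   gene            complement(join(1..5,10..20))\n"
)
print(count_complement_joins(sample))
```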

Anyway, I'd question the "rare" designation. It seems to me like any
species that has introns will have situations like this in their CDSs.
Not to mention any other sequence that uses Bio::Location::Split. (Since
I'm not a Real Biologist, I can't think up more examples here, but I'm
sure they exist.)

Or are you saying it's rare to use join (complement(C..D),
complement(A..B)) instead of complement(join(A..B, C..D)). In that case,
I guess I just got really unlucky in that five fungal genomes I was
using decided to use the "rare" syntax. 

> Note that there are two bugs 
> for that bug
> report.  The first (and more serious) is still unresolved.  The second
> (where remote locations are treated differently in 
> Location::Split, which
> caused more problems than it was worth) had a fix committed 
> about a month
> ago.  

Sadly, it's the first (and in my case, more common (I have no remote
locations.)) bug for me.

> Any fixes I have made for the first bug invariably break several other
> methods, which use the current Location::Split object logic 
> for retrieving
> sequences, building feature strings, etc.  Since a new RC is 
> imminent and
> the bug only affects a small number of locations, I have held 
> off until
> after a final release is made (the last thing I want to do is 
> fix something
> that breaks ~6-8 other methods), but I'll try looking at it 
> again this week.

IMO this is a pretty serious bug (if these kinds of sequences aren't
that rare as I've shown above), because you're outputting sequence
descriptions that are just plain wrong. Anyone who uses
FTLocationFactory to read these output description will have incorrect
sequence, incorrect translated proteins, etc. And it's even more serious
if other methods are depending on it.

I know I can't dictate your time, and should be volunteering to work on
fixing it. But if it affects other modules, then I will no doubt break
things even more than you have in your attempts.  

-Amir

> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 


From bosborne11 at verizon.net  Mon Oct 16 14:25:14 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 14:25:14 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533A8D3.90709@sendu.me.uk>
Message-ID: <C15946CA.ACA9%bosborne11@verizon.net>

Sendu,

I just made a commit that makes Bio::Index::Blast use SearchIO instead of
BPlite. The BlastIndex.t test is giving a few warnings so I need to take a
look at that but all tests are passing.

An awful lot of work has gone into the SearchIO system, for more on why its
approach is deemed to be superior in the context of Bioperl see the SearchIO
HOWTO. One key feature of this upcoming release is an emphasis on removing
extraneous modules, I think it's safe to say that BPlite has been considered
extraneous for a number of years now.

Brian O.


On 10/16/06 11:44 AM, "Sendu Bala" <bix at sendu.me.uk> wrote:

> I think Chris recently deprecated this, but should it be? For me, its
> POD description justifies its existence, and perhaps more importantly,
> Bio::Index::Blast relies on it.
> 
> I took a quick peek at the latter and it didn't seem trivial to move it
> over to Bio::SearchIO instead.
> 
> Should it be undeprecated?


From bosborne11 at verizon.net  Mon Oct 16 14:59:38 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 14:59:38 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533D4B3.2000809@sendu.me.uk>
Message-ID: <C1594EDA.ACB9%bosborne11@verizon.net>

Sendu,

OK. I _think_ this change shouldn't affect id_parser() but I will test this
in BlastIndex.t. The id_parser() method is relevant to all these Index*
modules - don't know how much it's used but it certainly is nice to have it
available.

Brian O.


On 10/16/06 2:51 PM, "Sendu Bala" <bix at sendu.me.uk> wrote:

> Brian Osborne wrote:
>> Sendu,
>> 
>> I just made a commit that makes Bio::Index::Blast use SearchIO instead of
>> BPlite.
> 
> I was concerned about the whole id_parser thing. Did you determine that
> your change still allows for id_parser to be used and have the intended
> effect, or that id_parser is in some way meaningless and should be
> removed as a method?


From cjfields at uiuc.edu  Mon Oct 16 16:51:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 15:51:08 -0500
Subject: [Bioperl-l] Bio::Location::Split
In-Reply-To: <B9182BFF5B004245BABC12956EA6322E016B1F89@huls5.nucleus.harvard.edu>
Message-ID: <000001c6f164$d1380190$15327e82@pyrimidine>

...
> I downloaded candida glabrata chromosome B from EBI:
> http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948
> 
> testportal>perl location.pl new_glabrata_B.embl > bio
> testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/'
> new_glabrata_B.embl > nonbio
> testportal>wc bio nonbio
>  217  217 4537 bio
>  217  217 4549 nonbio
>  434  434 9086 total
> testportal>diff bio nonbio
> 4c4
> < complement(join(10632..11157,10347..10372))
> ---
> > join(complement(10632..11157),complement(10347..10372))
> 
> Just one example here, but see below.
> 
> > As for the lack of catching this, the particular types of
> > locations that
> > cause the issue are quite rare.
> 
> Really? I guess our definitions of rare depend on which sequences we're
> working with. I'm doing fungal genomes, and here's a grep for a few
> species' entire genomes:
> 
> testportal>foreach i ( *.embl )
> foreach? echo $i
> foreach? grep CDS $i | grep join | grep -c complement
> foreach? end
> glabrata_orf.embl
> 29
> hansenii_orf.embl
> 151
> lactis_orf.embl
> 70
> lipolytica_orf.embl
> 337
> pombe_orf.embl
> 1137
> 
> You might like to use pombe as a test case, as it has lots of these
> complement joins, including ones with multiple introns.

I'll use those.  I'll see if an analogous GenBank file exists as well.  

I can probably make a preliminary fix for to_FTstring() so that it arranges
the sublocations correctly, but I think the best way to go is to have
FTLocationFactory not modify the various sublocations to start with, which
it currently does when it sets strand() (strand() propagates the strand info
to sublocations). 

> Anyway, I'd question the "rare" designation. It seems to me like any
> species that has introns will have situations like this in their CDSs.
> Not to mention any other sequence that uses Bio::Location::Split. (Since
> I'm not a Real Biologist, I can't think up more examples here, but I'm
> sure they exist.)

I think that additional tests are definitely needed for pulling out
sequences.  

What I mean by 'rare' is that the majority of sequences do not have
problems.  Also, this seems to be a 'silent' bug since the error shows up in
to_FTstring() but the object sublocations seem to be processed correctly when
using the location object directly (such as via SeqFeatureI).  

Round-tripping the sequence should pick it up though, since
complement(join(10632..11157,10347..10372)) is not the same as
join(complement(10632..11157),complement(10347..10372)).

That is essentially what you are doing, correct? i.e. getting the sequences
using Bioperl, saving them (which passes them through SeqIO), reading them
again (back through SeqIO with the malformed location string).

> Or are you saying it's rare to use join (complement(C..D),
> complement(A..B)) instead of complement(join(A..B, C..D)). In that case,
> I guess I just got really unlucky in that five fungal genomes I was
> using decided to use the "rare" syntax.

Location::Split is supposed to handle all variations, but apparently it
isn't.  

> > Note that there are two bugs
> > for that bug
> > report.  The first (and more serious) is still unresolved.  The second
> > (where remote locations are treated differently in
> > Location::Split, which
> > caused more problems than it was worth) had a fix committed
> > about a month
> > ago.
> 
> Sadly, it's the first (and in my case, more common (I have no remote
> locations.)) bug for me.
> 
> > Any fixes I have made for the first bug invariably break several other
> > methods, which use the current Location::Split object logic
> > for retrieving
> > sequences, building feature strings, etc.  Since a new RC is
> > imminent and
> > the bug only affects a small number of locations, I have held
> > off until
> > after a final release is made (the last thing I want to do is
> > fix something
> > that breaks ~6-8 other methods), but I'll try looking at it
> > again this week.
> 
> IMO this is a pretty serious bug (if these kinds of sequences aren't
> that rare as I've shown above), because you're outputting sequence
> descriptions that are just plain wrong. Anyone who uses
> FTLocationFactory to read these output description will have incorrect
> sequence, incorrect translated proteins, etc. And it's even more serious
> if other methods are depending on it.
> 
> I know I can't dictate your time, and should be volunteering to work on
> fixing it. But if it affects other modules, then I will no doubt break
> things even more than you have in your attempts.
> 
> -Amir

I'll give it a look over the next week.  Like I mentioned above, I may be
able to fix it in Split::to_FTstring() w/o breaking other tests (in which
case I'll commit it for the 1.5.2 release), but it would be a temporary hack
until I can work out why other tests are failing.

Chris


From jason at bioperl.org  Mon Oct 16 18:45:21 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Oct 2006 15:45:21 -0700
Subject: [Bioperl-l] split location problems
Message-ID: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>

The whole point of split locations is to represent genes with introns  
so that is not the "rare" case.

I'm confused where the problem is.  The locations that I get out with  
to_FTstring on the location object are exactly the same as those input.

I have processed the genbank fungal genomes into GFF3 and have had no  
problems so I'm confused where you are breaking down.  If I write  
them out as embl I also get the correct thing.  This is using the CVS  
version of bioperl from the HEAD.

I've added code to test this to bug 2101 including a C.glabrata  
chromosome downloaded from genbank.  Perhaps the problem is on the
EMBL parsing side, I didn't test that.

On the technical side, I still am not sure I fully know where the  
strand information should be stored - the top level container or the  
sub-features.  I'll try and stay up on the discussion if anything has  
been decided that I should know about.

-jason


From torsten.seemann at infotech.monash.edu.au  Mon Oct 16 18:23:23 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 17 Oct 2006 08:23:23 +1000
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <000201c6f149$3ed63490$15327e82@pyrimidine>
References: <000201c6f149$3ed63490$15327e82@pyrimidine>
Message-ID: <4534065B.9020309@infotech.monash.edu.au>

Chris Fields wrote:
>> So it looks like an abstract base class, not an interface that
>> defines a contract or API? Should use Root.pm then, would be my vote.
>> 	-hilmar
> 
> Makes sense to me.  Maybe another audit is needed to catch similar
> instances, or has this been done already?

The purpose of my original (poorly phrased) question was to try and sort 
out where Root and RootI were being used the wrong way around.

I'm currently "all-audited out" so I leave this task to another volunteer.

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From cjfields at uiuc.edu  Mon Oct 16 21:07:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 20:07:55 -0500
Subject: [Bioperl-l] split location problems
In-Reply-To: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
Message-ID: <BE15BACB-6076-4F27-82FF-3DFE10FFD1C0@uiuc.edu>


On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote:

> The whole point of split locations is to represent genes with  
> introns so that is not the "rare" case.
>
> I'm confused where the problem is.  The locations that I get out  
> with to_FTstring on the location object are exactly the same as  
> those input.

The problem is with a subset of split locations described in the
bug report.  The following works:

complement(join(2691..4571,4918..5163))

whereas this:

join(complement(4918..5163),complement(2691..4571))

gives this:

complement(join(4918..5163,2691..4571))

which is not syntactically the same.  It should be:

complement(join(2691..4571,4918..5163))

since 'join' implies that the order of the segments to be joined is  
important ('order' and 'bond' do not, I guess).
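
To make the expected rewrite concrete, here is a rough sketch in plain Python. It is not the Bio::Location::Split code and not a proposed patch; the function name is hypothetical and it only handles simple start..end ranges with no fuzzy or remote locations. It rewrites a join of complements as a complement of a join, reversing the segment order on the way.

```python
import re

_RANGE = r"\d+\.\.\d+"
# Matches only join(complement(a..b),complement(c..d),...) with plain ranges.
_JOIN_OF_COMPLEMENTS = re.compile(
    r"^join\((complement\(%s\)(?:,complement\(%s\))*)\)$" % (_RANGE, _RANGE)
)


def to_complement_join(location):
    """Rewrite join(complement(a),complement(b)) as complement(join(b,a)).

    Any location string that does not fit the simple pattern above is
    returned unchanged.
    """
    match = _JOIN_OF_COMPLEMENTS.match(location.replace(" ", ""))
    if not match:
        return location
    ranges = re.findall(_RANGE, match.group(1))
    ranges.reverse()  # moving complement() outside the join flips the order
    return "complement(join(%s))" % ",".join(ranges)


print(to_complement_join("join(complement(4918..5163),complement(2691..4571))"))
# prints complement(join(2691..4571,4918..5163))
```

The reversal is the step that appears to be missing from the current output; without it the segments come out as 4918..5163,2691..4571.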

> I have processed the genbank fungal genomes into GFF3 and have had  
> no problems so I'm confused where you are breaking down.  If I  
> write them out as embl I also get the correct thing.  This is using  
> the CVS version of bioperl from the HEAD.
>
> I've added code to test this to bug 2101 including a C.glabrata  
> chromosome downloaded from genbank.  Perhaps the problem is on the
> EMBL parsing side, I didn't test that.
>
> On the technical side, I still am not sure I fully know where the  
> strand information should be stored - the top level container or  
> the sub-features.  I'll try and stay up on the discussion if  
> anything has been decided that I should know about.
>
> -jason

Split::strand() sets the sublocations as well, which seems to confuse  
the situation more but it is consistent with LocationI, as Hilmar  
points out.  I'm looking into a few solutions now, including a fix in  
Split::to_FTstring().

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Mon Oct 16 22:48:14 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Oct 2006 19:48:14 -0700
Subject: [Bioperl-l] split location problems
In-Reply-To: <BE15BACB-6076-4F27-82FF-3DFE10FFD1C0@uiuc.edu>
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
	<BE15BACB-6076-4F27-82FF-3DFE10FFD1C0@uiuc.edu>
Message-ID: <8273f6c20610161948w201537a5v2fcfa189eb809283@mail.gmail.com>

This probably was exposed by the fact that the Split object used to
explicitly sort the features by start*strand always.  But with remote
locations and needing to be able to explicitly set the order (for features
that are not required to be 5' -> 3') that code must have been removed.   I
think there is just one place that must be missing a 'reverse' on the list
of sub-locations when the top-level feature is a complement.  I'll wait for
your fix before wading in - we probably might want to figure out a
'consolidate' method to shrink redundant and equivalent representations to
the shortest possible form. Ugh this really starts to resemble trying to
write a boolean logic toolkit....
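
As a reminder of what the old start*strand sort did, here is a small sketch in Python rather than the original Perl. The tuple layout and the function name are assumptions made for illustration, not the Split object's internals.

```python
def sort_five_to_three(sublocations):
    """Order (start, end, strand) tuples by the key start * strand.

    On the +1 strand this sorts by ascending start; on the -1 strand the
    negated key puts the highest start first, which is 5' -> 3' there.
    """
    return sorted(sublocations, key=lambda loc: loc[0] * loc[2])


minus_strand = [(2691, 4571, -1), (4918, 5163, -1)]
print(sort_five_to_three(minus_strand))
# prints [(4918, 5163, -1), (2691, 4571, -1)]
```

Always sorting like this throws away any order the caller set explicitly, which is presumably why it had to go once order needed to be preserved.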
-jason

On 10/16/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
>
> On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote:
>
> > The whole point of split locations is to represent genes with
> > introns so that is not the "rare" case.
> >
> > I'm confused where the problem is.  The locations that I get out
> > with to_FTstring on the location object are exactly the same as
> > those input.
>
> The problem is with a subset of split locations described in the
> bug report.  The following works:
>
> complement(join(2691..4571,4918..5163))
>
> whereas this:
>
> join(complement(4918..5163),complement(2691..4571))
>
> gives this:
>
> complement(join(4918..5163,2691..4571))
>
> which is not syntactically the same.  It should be:
>
> complement(join(2691..4571,4918..5163))
>
> since 'join' implies that the order of the segments to be joined is
> important ('order' and 'bond' do not, I guess).
>
> > I have processed the genbank fungal genomes into GFF3 and have had
> > no problems so I'm confused where you are breaking down.  If I
> > write them out as embl I also get the correct thing.  This is using
> > the CVS version of bioperl from the HEAD.
> >
> > I've added code to test this to bug 2101 including a C.glabrata
> > chromosome downloaded from genbank.  Perhaps the problem is on the
> > EMBL parsing side, I didn't test that.
> >
> > On the technical side, I still am not sure I fully know where the
> > strand information should be stored - the top level container or
> > the sub-features.  I'll try and stay up on the discussion if
> > anything has been decided that I should know about.
> >
> > -jason
>
> Split::strand() sets the sublocations as well, which seems to confuse
> the situation more but it is consistent with LocationI, as Hilmar
> points out.  I'm looking into a few solutions now, including a fix in
> Split::to_FTstring().
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


-- 
Jason Stajich
jason at bioperl.org
http://www.duke.edu/~jes12/


From cjfields at uiuc.edu  Mon Oct 16 23:34:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 22:34:25 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C159C54B.ACD5%bosborne11@verizon.net>
References: <C159C54B.ACD5%bosborne11@verizon.net>
Message-ID: <AE334107-1639-468E-ABA8-2F992693809A@uiuc.edu>


On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote:

> Chris and Sendu,
>
> Sendu was correct in wondering whether id_parser() in Blast.pm  
> would work
> after the module was altered to use SearchIO but what I've found  
> out from my
> local tests is that id_parser() didn't work when BPlite was being used
> either. I can continue to work on this but it's safe to say that  
> removing
> BPlite doesn't cause a problem with id_parser, it was already there.
>
> Brian O.

....

It may be one reason (the main reason?) the method wasn't tested.   
Maybe it should be removed if it can't be easily fixed; I don't think  
it makes sense keeping it otherwise.

Chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bosborne11 at verizon.net  Mon Oct 16 23:24:59 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 23:24:59 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <000301c6f155$c7029ff0$15327e82@pyrimidine>
Message-ID: <C159C54B.ACD5%bosborne11@verizon.net>

Chris and Sendu,

Sendu was correct in wondering whether id_parser() in Blast.pm would work
after the module was altered to use SearchIO but what I've found out from my
local tests is that id_parser() didn't work when BPlite was being used
either. I can continue to work on this, but it's safe to say that removing
BPlite doesn't cause the problem with id_parser; the problem was already there.

Brian O.


On 10/16/06 3:03 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:

>> Hilmar Lapp wrote:
>>> The problem is it is not maintained, and there are outstanding bug
>>> reports.
>>> 
>>> If you un-deprecate it, then we need a response to people who come
>>> across problems with it when using it. Either you change the POD to say
>>> exactly who and when one should use it (or rather not) and point to the
>>> fact that it is unsupported for all other cases.
>>> 
>>> Or what would you suggest?
>> 
>> I'm not sure.
>> 
>> Does Bio::Index::Blast even work correctly? Does it suffer from whatever
>> bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should
>> that be deprecated as well?
>> 
>> Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO
>> and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't
>> seem trivial (or even appropriate).
>> 
>> Ultimately I just wanted to solve the warnings in the test suite.
>> Thoughts, Chris?
> 
> My opinion is we either have to completely support BPlite (and the others)
> or drop it altogether.  I don't think we can state "use BPlite only with
> Bio::Index::Blast, use SearchIO everywhere else."  That's too inconsistent.
> 
> 
> It seems simpler to deprecate the various Bio::Tools::BP* classes and either
> fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working
> on) or deprecate Bio::Index::Blast as well.
> 
> The warnings in the test suite belong to BlastIndex.t, correct?  I updated
> using Brian's Bio::Index::Blast fix and it passes now w/o warnings.
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 


From bosborne11 at verizon.net  Mon Oct 16 23:48:56 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 23:48:56 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <AE334107-1639-468E-ABA8-2F992693809A@uiuc.edu>
Message-ID: <C159CAE8.ACD9%bosborne11@verizon.net>

Chris,

OK. In fact there's no written guarantee that all Bio::Index* modules have
an id_parser() method. It happens that most do, and it's useful. I'll fix
the documentation in Bio::Index::Blast and add an enhancement request to
Bugzilla; I may be able to get around to it before the 1.5.2 release, but no
promises.

Brian O.


On 10/16/06 11:34 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> 
> On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote:
> 
>> Chris and Sendu,
>> 
>> Sendu was correct in wondering whether id_parser() in Blast.pm
>> would work
>> after the module was altered to use SearchIO but what I've found
>> out from my
>> local tests is that id_parser() didn't work when BPlite was being used
>> either. I can continue to work on this but it's safe to say that
>> removing
>> BPlite doesn't cause a problem with id_parser, it was already there.
>> 
>> Brian O.
> 
> ....
> 
> It may be one reason (the main reason?) the method wasn't tested.
> Maybe it should be removed if it can't be easily fixed; I don't think
> it makes sense keeping it otherwise.
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 


From n.haigh at sheffield.ac.uk  Tue Oct 17 02:35:43 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 07:35:43 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
Message-ID: <453479BF.90408@sheffield.ac.uk>

I'm a bit unclear as to what is happening with these files.

Are these files now superseded by the wikified versions? If so, should 
these files now just simply contain a link to the wikified versions - 
otherwise things could get in a mess since I updated the wiki version of 
INSTALL.WIN and I see Sendu updated the cvs versions a couple of weeks 
ago - hopefully these differences aren't that big.

Nath


From faruque at ebi.ac.uk  Tue Oct 17 04:19:44 2006
From: faruque at ebi.ac.uk (Nadeem Faruque)
Date: Tue, 17 Oct 2006 09:19:44 +0100
Subject: [Bioperl-l] split location problems
Message-ID: <F2A2DB48-8EDF-43AA-AFCF-45B48AF43B1C@ebi.ac.uk>

EMBL currently outputs join-complements in the format
join(complement(30..40),complement(10..20))
instead of the Genbank preferred
complement(join(10..20,30..40))

EMBL's may reflect what happens in the cell a little more than  
Genbank's, but it is less readable and less concise.
NB I've also seen a couple of people construct these incorrectly
eg join(complement(10..20),complement(30..40))

I believe we are moving to the complement-join format but I can't  
give a date for the transition.

Having said that, trans-splicing will still give us the joys of  
complex locations,
eg
join(1..5,complement(join(10..20,30..40)))
complement(join(30..40,10..20)) <- looks wrong (unless it is a very  
small circle) but mis-ordered exons are resolved by the trans- 
splicing machinery.
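
A quick way to flag the incorrectly constructed form is to check that the complement segments appear in descending order. The sketch below is plain Python with a made-up function name and only understands simple start..end ranges. Because trans-splicing can legitimately produce unusual orders, a False result is a prompt for a closer look rather than proof of an error.

```python
import re


def complements_in_descending_order(location):
    """True if each complement(a..b) segment starts below the previous one.

    Locations with fewer than two such segments trivially return True.
    """
    starts = [int(s) for s in re.findall(r"complement\((\d+)\.\.\d+\)", location)]
    return all(a > b for a, b in zip(starts, starts[1:]))


print(complements_in_descending_order("join(complement(30..40),complement(10..20))"))
# prints True
print(complements_in_descending_order("join(complement(10..20),complement(30..40))"))
# prints False
```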

Nadeem


--
S.M. Nadeem N. Faruque
EMBL Nucleotide Database Curation Team
EMBL Outstation
Tel: +44 1223 494611                     Fax: +44 1223 494472
The European Bioinformatics Institute    URL: http://www.ebi.ac.uk/
Email for data submissions: datasubs at ebi.ac.uk
Email for updates: update at ebi.ac.uk
========================================================


From bix at sendu.me.uk  Tue Oct 17 04:59:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 09:59:36 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
References: <4521E74E.1040404@infotech.monash.edu.au>	<452F54A1.7010908@sendu.me.uk>	<5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>	<45333E02.9070808@sendu.me.uk>
	<1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
Message-ID: <45349B78.8090905@sendu.me.uk>

Hilmar Lapp wrote:
> So it looks like an abstract base class, not an interface that  
> defines a contract or API? Should use Root.pm then, would be my vote.

Agreed, that was actually what I did in my local copy when I made a new 
inheriting class (so discovering the problem). This change is harmless 
to other modules, but does mean they'll have redundant use of 
Bio::Root::Root which will want cleaning up at some stage.


From bix at sendu.me.uk  Tue Oct 17 06:32:54 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 11:32:54 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
Message-ID: <4534B156.4090501@sendu.me.uk>

Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
See http://www.bioperl.org/wiki/Release_1.5.2 for
instructions on getting and testing this RC.

Developers:
   This should be the last RC before release ~next Monday. Now would
   be a good time for last-minute documentation updates and additions.

Users:
   Even though 1.5.2 is a 'developer' release, we consider it the most
   stable and capable version of Bioperl, and recommend that you use
   it in all but the most critical production environments. Please
   try it out and let us know of any problems or difficulties you run
   into.


Thank you,
Sendu.


From cjfields at uiuc.edu  Tue Oct 17 07:16:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 06:16:47 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <453479BF.90408@sheffield.ac.uk>
References: <453479BF.90408@sheffield.ac.uk>
Message-ID: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>

The general consensus was to keep text versions available; we could  
add URL links to the wiki pages for the most up-to-date version.  BTW,
I have modified INSTALL already.  INSTALL.WIN is next in line (I was  
waiting for your changes).

Chris

On Oct 17, 2006, at 1:35 AM, Nathan S. Haigh wrote:

> I'm a bit unclear as to what is happening with these files.
>
> Are these files now superseded by the wikified versions? If so, should
> these files now just simply contain a link to the wikified versions -
> otherwise things could get in a mess since I updated the wiki  
> version of
> INSTALL.WIN and I see Sendu updated the cvs versions a couple of weeks
> ago - hopefully these differences aren't that big.
>
> Nath

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Tue Oct 17 07:45:45 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 12:45:45 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>
References: <453479BF.90408@sheffield.ac.uk>
	<72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>
Message-ID: <4534C269.5050704@sheffield.ac.uk>

Chris Fields wrote:
> The general consensus was to keep text versions available; we could 
> add URL links to the wiki pages for the most up-to-date version.  BTW,
> I have modified INSTALL already.  INSTALL.WIN is next in line (I was 
> waiting for your changes).
>
Is it possible to generate these files from the wiki whenever there is a 
release? I know edits shouldn't be too severe or too often - but I can
see things getting a little messy/annoying if edits have to be made in 2 
places.

Nath


From cjfields at uiuc.edu  Tue Oct 17 10:04:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:04:32 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534C269.5050704@sheffield.ac.uk>
Message-ID: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>

There isn't a very easy way since so many links have to be removed/modified.
I have found a few CPAN modules that could help, but for now I just dump the
text output from a text browser (elinks) using the 'printable version' page
and hand-edit, which works very quickly.  That works for the time being
until I can find another more automated solution.
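
A rough sketch of how part of that hand-editing could be scripted.
Assumptions, not taken from the thread: the dump comes from something like
a text-browser dump of the 'printable version' page, inline links show up
as bracketed numbers such as [12], and the link targets are listed after a
line reading "References". The sub name strip_link_markers is made up for
this example; it is not the procedure actually used for INSTALL.

```perl
use strict;
use warnings;

# Hypothetical helper: tidy the text of a browser dump of a wiki page.
# Assumes inline link markers look like [12] and that the link targets
# follow a line reading "References". A page whose own text contains a
# line reading just "References" would be cut short, so check the result.
sub strip_link_markers {
    my ($text) = @_;

    # Drop the "References" heading and everything after it.
    $text =~ s/^\s*References\s*\n.*\z//ms;

    # Remove inline numeric link markers such as [3] or [27].
    $text =~ s/\[\d+\]//g;

    # Trim trailing blanks left behind on each line.
    $text =~ s/[ \t]+$//mg;

    return $text;
}

# Small made-up dump to show the effect.
my $dump = join "\n",
    'See the [1]Installing BioPerl page.   ',
    '',
    'References',
    '',
    '   Visible links',
    '   1. http://www.bioperl.org/wiki/Installing_BioPerl',
    '';
print strip_link_markers($dump);
```

The output would still need a read-through before being committed, since
wiki-only wording (edit links, category lines) is not touched by this.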

Fortunately there have been very few edits to either INSTALL wiki page so
they should remain relatively stable.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 



From cjfields at uiuc.edu  Tue Oct 17 10:12:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:12:09 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C159CAE8.ACD9%bosborne11@verizon.net>
Message-ID: <000401c6f1f6$424b5580$15327e82@pyrimidine>


> Chris,
> 
> OK. In fact there's no written guarantee that all Bio::Index* modules have
> an id_parser() method. It happens that most do, and it's useful. I'll fix
> the documentation in Bio::Index::Blast and add an enhancement request to
> Bugzilla; I may be able to get around to it before the 1.5.2 release, but
> no promises.
> 
> Brian O.

Do the various Bio::Index* modules share a common interface?  

I wouldn't worry too much about it for this release, unless you really have
time.  It is still, after all, a developer's release, and you've noted it in
Bugzilla.  We could try for another dev release in winter (rel 1.5.3, I
guess) to get any bug fixes or new modules added.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


> On 10/16/06 11:34 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:
> 
> >
> > On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote:
> >
> >> Chris and Sendu,
> >>
> >> Sendu was correct in wondering whether id_parser() in Blast.pm would
> >> work after the module was altered to use SearchIO but what I've found
> >> out from my local tests is that id_parser() didn't work when BPlite
> >> was being used either. I can continue to work on this but it's safe to
> >> say that removing BPlite doesn't cause a problem with id_parser, it
> >> was already there.
> >>
> >> Brian O.
> >
> > ....
> >
> > It may be one reason (the main reason?) the method wasn't tested.
> > Maybe it should be removed if it can't be easily fixed; I don't think
> > it makes sense keeping it otherwise.
> >
> > Chris
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Tue Oct 17 10:15:17 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 15:15:17 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
Message-ID: <4534E575.5050308@sheffield.ac.uk>

Chris Fields wrote:
> There isn't a very easy way since so many links have to be removed/modified.
> I have found a few CPAN modules that could help, but for now I just dump the
> text output from a text browser (elinks) using the 'printable version' page
> and hand-edit, which works very quickly.  That works for the time being
> until I can find another more automated solution.
>
> Fortunately there have been very few edits to either INSTALL wiki page so
> they should remain relatively stable.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>   
So am I correct in saying that the best way is to make all updates to 
the wikified versions of these files, and then at regular 
intervals/major releases you (or someone else) will update the CVS 
version of the files in the way described above?

Cheers
Nath


From bix at sendu.me.uk  Tue Oct 17 10:00:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 15:00:39 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534E09C.9030707@genomics.dk>
References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk>
Message-ID: <4534E207.8030508@sendu.me.uk>

Niels Larsen wrote:
> Greetings,
> 
> I am no perl beginner, but I am a BioPerl beginner. Today I looked
> for remote similarity services that can be used from Perl. I found
> the EBI SOAP interface where their example script returns
> 
> Can't find method element in the message at 
> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

What script exactly? There was a problem with the SOAP server that was 
fixed earlier today.


> and the DDBJ service which (from Denmark) returns
> 
> undef

What returned undef? Specifics please.


> and then the NCBI server accessed through BioPerl's RemoteBlast which
> seems to spin in a loop that fills TMPDIR with many tempfiles. Will
> release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall
> is working towards that).

What version of Bioperl were you testing with? What did you do to get it 
to 'spin in a loop'? I can tell you that remote blasting certainly works 
in Bioperl 1.5.2, but you'll have to give more details on the things you 
tried and the problems you encountered.

You can also answer the questions yourself by trying the release candidate.


From B.Beckert at ibmc.u-strasbg.fr  Tue Oct 17 09:59:30 2006
From: B.Beckert at ibmc.u-strasbg.fr (Bertrand Beckert)
Date: Tue, 17 Oct 2006 15:59:30 +0200
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
Message-ID: <CE7A1962-05A0-46CD-8832-BC9C5A8C7656@ibmc.u-strasbg.fr>


hi,

I am running a large number of blasts via a connection to the NCBI BLAST
page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi').
I am trying to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have
some problems. I made a simple example with only one sequence in
order to understand how this module works. This is my simple input
file, a DNA sequence in FASTA format:

> test
>
TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA

I have made some modifications to the example available in the BioPerl
documentation.
It gives me a RID which contains the results of my blast, but I have a
problem with "$result=$factory->retrieve_blast($rid)" in my script.
The documentation says that $result=$factory->retrieve_blast($rid)
returns, when it works, a Bio::Tools::BPlite or Bio::Tools::Blast
object. In my case it returns a Bio::SearchIO::blast... I don't
understand why I don't get the right type of object back (see PART I).

I also tried to resolve the problem by replacing the foreach loop in my
script with a new one in order to explore the BLAST result page, but it
doesn't work either (see PART II).

Could you help me please? Thank you.

Bertrand Beckert.

PART I:

Here is my script, lightly annotated:
------------------------------------------------------------------------
#!/usr/bin/perl -w
use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;

sub blast {
    my $prog  = 'blastn';
    my $db    = 'refseq_genomic';
    my $e_val = '1e-10';
    my $Input = 'Seq.fasta';
    my @params = ('-prog' => $prog, '-data' => $db, '-expect' => $e_val,
                  '-readmethod' => 'SearchIO');
    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
    # change parameters
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Bacteria [ORGN]';
    $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}  = 'BLOSUM25';
    $factory->submit_blast($Input);
    print STDERR "waiting...\n";
    while (my @rids = $factory->each_rid) {
        # prints the ID of the submitted blast, i.e.
        # RID: 1161079157-766-185099855365.BLASTQ2
        # (this page contains the result of my blast)
        print "my rid: ", @rids, "\n";
        foreach my $rid (@rids) {
            $result = $factory->retrieve_blast($rid);
            # shows what type of object is returned by retrieve_blast
            print "rc:", $result, "\n";
        }
    }
}

&blast;
------------------------------------------------------------------------

Here is the shell output:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...

PARTII:

I also tried to resolve the problem by replacing the foreach loop in my
script with:
------------------------------------------------------------------------
foreach my $rid (@rids) {
    while (1) {
        $result = $factory->retrieve_blast($rid)->next_result();
        print "rc:", $result, "\n";
        if ($result) {
            print $result->num_hits(), "\n";
        }
    }
}
------------------------------------------------------------------------
With this loop I could explore the BLAST result page. This is what I
obtain in the shell window:
		
bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834)


----
-- 
Bertrand BECKERT
PhD student
IBMC - UPR 9002 du CNRS - ARN
15, rue Rene Descartes
F-67084 STRASBOURG Cedex
b.beckert at ibmc.u-strasbg.fr


From niels at genomics.dk  Tue Oct 17 09:54:36 2006
From: niels at genomics.dk (Niels Larsen)
Date: Tue, 17 Oct 2006 15:54:36 +0200
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4534E09C.9030707@genomics.dk>

Greetings,

I am no perl beginner, but I am a BioPerl beginner. Today I looked
for remote similarity services that can be used from Perl. I found
the EBI SOAP interface where their example script returns

Can't find method element in the message at 
/ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

and the DDBJ service which (from Denmark) returns

undef

and then the NCBI server accessed through BioPerl's RemoteBlast which
seems to spin in a loop that fills TMPDIR with many tempfiles. Will
release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall
is working towards that).

Niels L


------------------------------------------------------------------------

Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark

Telephone: +45-8942-5268
Telefax: +45-8620-1222

------------------------------------------------------------------------


From cjfields at uiuc.edu  Tue Oct 17 10:28:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:28:40 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534E575.5050308@sheffield.ac.uk>
Message-ID: <000501c6f1f8$8b78efe0$15327e82@pyrimidine>

...
> So am I correct in saying that the best way is to make all updates to
> the wikified versions of these files, and then at regular
> intervals/major releases you (or someone else) will update the CVS
> version of the files in the way described above?
> 
> Cheers
> Nath

Yes.  I think the online docs will stay relatively stable.  A week or so ago
Mauricio and I were discussing moving the dependencies list to its own CVS
document (since they pertain to all Bioperl installations, not just UNIX'y
flavors).  I haven't done that yet since I was waiting on the INSTALL.WIN
changes before I made any more changes.  Well, that and I've been really
busy doing other things.

One way we could make sure that changes to the online docs would match the
CVS docs would be to only allow certain wiki users (such as sysadmins) to make
modifications to those pages.  That way any changes would have to go through
someone who also has CVS access and could make similar changes to the
distribution docs.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From n.haigh at sheffield.ac.uk  Tue Oct 17 10:37:38 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 15:37:38 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <000501c6f1f8$8b78efe0$15327e82@pyrimidine>
References: <000501c6f1f8$8b78efe0$15327e82@pyrimidine>
Message-ID: <4534EAB2.50609@sheffield.ac.uk>

Chris Fields wrote:
> ...
>   
>> So am I correct in saying that the best way is to make all updates to
>> the wikified versions of these files, and then at regular
>> intervals/major releases you (or someone else) will update the CVS
>> version of the files in the way described above?
>>
>> Cheers
>> Nath
>>     
>
> Yes.  I think the online docs will stay relatively stable.  A week or so ago
> Mauricio and I were discussing moving the dependencies list to its own CVS
> document (since they pertain to all Bioperl installations, not just UNIX'y
> flavors).  I haven't done that yet since I was waiting on the INSTALL.WIN
> changes before I made any more changes.  Well, that and I've been really
> busy doing other things.
>   
Sounds good.
> One way we could make sure that changes to the online docs would match the
> CVS docs would be to only allow certain wiki users (such as sysadmins) to make
> modifications to those pages.  That way any changes would have to go through
> someone who also has CVS access and could make similar changes to the
> distribution docs.
>   
Ugh, not sure I like the sound of maintaining 2 copies of any files - 
sounds like a future headache even if they are pretty stable. It also 
makes it unclear which of the two files should be considered first (i.e.
is the most up-to-date) on pages such as:
http://www.bioperl.org/wiki/Installing_BioPerl

It suggests that INSTALL and INSTALL.WIN should be looked at first, but 
there are online copies of those files available - this should now be 
the other way around - shouldn't it? I might just be making a mountain 
out of a molehill, so I'll shut up on this topic and make any future 
edits to the wiki pages instead.
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>
>   


From bosborne11 at verizon.net  Tue Oct 17 10:48:54 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 17 Oct 2006 10:48:54 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <000401c6f1f6$424b5580$15327e82@pyrimidine>
Message-ID: <C15A6596.AD0B%bosborne11@verizon.net>

Chris,

The Bio::Index modules either 'use base qw(Bio::Index::Abstract)' or 'use
base qw(Bio::Index::AbstractSeq)'. Neither of these modules has an
id_parser() method.
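
One way to see which classes actually provide a given method is to ask
each of them with can(). The following is only a sketch: the sub name
classes_with_method is invented here, and the class list is just an
example (Bio::Index::Fasta is assumed to be installed alongside the
others). A class that cannot be loaded is simply reported as lacking the
method.

```perl
use strict;
use warnings;

# Report which of the given classes provide $method (own or inherited).
# Classes that cannot be loaded are reported as not providing it.
sub classes_with_method {
    my ( $method, @classes ) = @_;
    my %has;
    for my $class (@classes) {
        # Turn Some::Class into Some/Class.pm and try to load it,
        # ignoring failure (for example when BioPerl is not installed).
        ( my $file = "$class.pm" ) =~ s{::}{/}g;
        eval { require $file };
        $has{$class} = UNIVERSAL::can( $class, $method ) ? 1 : 0;
    }
    return \%has;
}

my $report = classes_with_method(
    'id_parser',
    qw(Bio::Index::Abstract Bio::Index::AbstractSeq
       Bio::Index::Fasta    Bio::Index::Blast)
);
for my $class ( sort keys %$report ) {
    printf "%-26s %s\n", $class, $report->{$class} ? 'yes' : 'no';
}
```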

Brian O.


On 10/17/06 10:12 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> Do the various Bio::Index* modules share a common interface?  


From cjfields at uiuc.edu  Tue Oct 17 10:45:53 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:45:53 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534EAB2.50609@sheffield.ac.uk>
Message-ID: <000601c6f1fa$f260b560$15327e82@pyrimidine>

...
> > One way we could make sure that changes to the online docs would match
> > the CVS docs would be to only allow certain wiki users (such as
> > sysadmins) to make modifications to those pages.  That way any changes
> > would have to go through someone who also has CVS access and could make
> > similar changes to the distribution docs.
> >
> Ugh, not sure I like the sound of maintaining 2 copies of any files -
> sounds like a future headache even if they are pretty stable. It also
> makes it unclear which of the two files should be considered first (i.e.
> is the most up-to-date) on pages such as:
> http://www.bioperl.org/wiki/Installing_BioPerl
> 
> It suggests that INSTALL and INSTALL.WIN should be looked at first, but
> there are online copies of those files available - this should now be
> the other way around - shouldn't it? I might just be making a mountain
> out of a molehill, so I'll shut up on this topic and make any future
> edits to the wiki pages instead.

Yes that should be the other way around (the wiki would be the most
up-to-date), so the CVS docs should point to the wiki, not vice-versa.

Getting the docs right is as important as getting the code to work.  So I
don't consider it a 'mountain-out-of-a-molehill' problem.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Tue Oct 17 11:07:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 10:07:49 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534E207.8030508@sendu.me.uk>
Message-ID: <001001c6f1fe$02fd4de0$15327e82@pyrimidine>

> Niels Larsen wrote:
> > Greetings,
> >
> > I am no perl beginner, but I am a BioPerl beginner. Today I looked
> > for remote similarity services that can be used from Perl. I found
> > the EBI SOAP interface where their example script returns
> >
> > Can't find method element in the message at
> > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.
> 
> What script exactly? There was a problem with the SOAP server that was
> fixed earlier today.
> 
> 
> > and the DDBJ service which (from Denmark) returns
> >
> > undef
> 
> What returned undef? Specifics please.
> 

The first problem, like Sendu mentions, was fixed on the remote server (I
get them to pass now).  Those were from bioperl-run, though, not the bioperl
core distribution.

As for DDBJ, do you mean EBI or SwissProt?  I ask b/c you mention Denmark.
EBI were having server maintenance outages yesterday, which was announced
here.

As Sendu mentions, please be more specific.

> > and then the NCBI server accessed through BioPerl's RemoteBlast which
> > seems to spin in a loop that fills TMPDIR with many tempfiles. Will
> > release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall
> > is working towards that).
> 
> What version of Bioperl were you testing with? What did you do to get it
> to 'spin in a loop'? I can tell you that remote blasting certainly works
> in Bioperl 1.5.2, but you'll have to give more details on the things you
> tried and the problems you encountered.
> 
> You can also answer the questions yourself by trying the release
> candidate.

The tempfiles showing up are from the repeated RID requests and are deleted
after the BLAST run (at least they should be); this is quite normal.  They
don't 'spin in a loop' unless the BLAST query is taking a particularly long
time, which can happen depending on how the BLAST query is set up, i.e. what
type of BLAST program is requested, if comp-based stats are requested,
length of query, database requested, etc.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Oct 17 11:14:07 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 16:14:07 +0100
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
In-Reply-To: <CE7A1962-05A0-46CD-8832-BC9C5A8C7656@ibmc.u-strasbg.fr>
References: <CE7A1962-05A0-46CD-8832-BC9C5A8C7656@ibmc.u-strasbg.fr>
Message-ID: <4534F33F.3070809@sendu.me.uk>

Bertrand Beckert wrote:
> hi,
> 
> I am running a large number of blasts via a connection to the NCBI BLAST
> page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi').
> I am trying to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I
> have some problems.
[snip]
> The documentation says that $result=$factory->retrieve_blast($rid)
> returns, when it works, a Bio::Tools::BPlite or Bio::Tools::Blast
> object. In my case it returns a Bio::SearchIO::blast... I don't
> understand why I don't get the right type of object back (see PART I).

I take it you're using some old version of Bioperl where unfortunately 
the documentation was incorrect. In fact you're supposed to get a 
Bio::SearchIO object, so it is a good thing that you are. The latest 
version of Bioperl has (as far as I can see) correct documentation and 
behaviour.

Bio::Tools::BPlite and Bio::Tools::Blast are deprecated. You want
Bio::SearchIO::blast. All is well.


> I also tried to resolve the problem by replacing the foreach loop in my
> script with a new one in order to explore the BLAST result page, but it
> doesn't work either (see PART II).

I'm not really sure what problem you might be facing there, but take a 
look at some up-to-date documentation, using the new example code:

http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html
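
For reference, the polling pattern from that documentation can be sketched
as a small sub. It relies only on the each_rid(), retrieve_blast() and
remove_rid() methods of the factory; retrieve_blast() is assumed to return
a negative number on failure, 0 while the job is still running, and a
Bio::SearchIO object once the report is ready. The sub name
collect_blast_results and its callback argument are made up for this
sketch. Most likely the repeated "my rid:" lines in the script above come
from never calling remove_rid(), so each_rid() keeps handing back the same
RID.

```perl
use strict;
use warnings;

# Poll a RemoteBlast-style factory until every submitted request has been
# resolved. $factory must provide each_rid(), retrieve_blast() and
# remove_rid(); $on_result is called once per parsed result.
sub collect_blast_results {
    my ( $factory, $on_result, $wait_seconds ) = @_;
    $wait_seconds = 5 unless defined $wait_seconds;

    while ( my @rids = $factory->each_rid ) {
        foreach my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( !ref $rc ) {
                # Not a report yet: a negative code means the request
                # failed, anything else means "still running".
                if ( $rc < 0 ) {
                    $factory->remove_rid($rid);
                    next;
                }
                sleep $wait_seconds;
            }
            else {
                # $rc is a report parser: read its results once, then
                # drop the RID so the outer while loop can finish.
                while ( my $result = $rc->next_result ) {
                    $on_result->($result);
                }
                $factory->remove_rid($rid);
            }
        }
    }
    return;
}
```

With a real factory this would be called after
$factory->submit_blast('Seq.fasta'), for example as
collect_blast_results($factory, sub { print $_[0]->num_hits, "\n" });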


From n.haigh at sheffield.ac.uk  Tue Oct 17 12:10:15 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 17:10:15 +0100
Subject: [Bioperl-l] [Fwd: Re: Bundle::BioPerl]
Message-ID: <45350067.6070604@sheffield.ac.uk>

FYI on Bundle::BioPerl

Nathan

-------- Original Message --------
Subject: 	Re: Bundle::BioPerl
Date: 	Tue, 17 Oct 2006 11:52:00 -0400
From: 	Chris Dagdigian <dag at sonsorol.org>
To: 	Nathan S. Haigh <n.haigh at sheffield.ac.uk>
References: 	<45348FB8.4050009 at sheffield.ac.uk>


Hi Nathan,

I've updated the Bundle and uploaded it to CPAN.

I *think* the rationale for keeping it still exists but I'm removed  
enough from Bioperl now that I'll defer to others on the decision.

The basic idea was that BioPerl has a heck of a lot of dependencies
(other Perl modules) that it requires in order to get all the
functionality out of it. Many of these dependencies may not be
present in default Perl installations.  Tracking down all of the
dependencies and installing them (along with all of the
dependencies-of-the-dependencies) by hand is a massive pain.

The nice thing about the Bundle is that it lists the core module  
dependencies and it works great with the CPAN.pm module to automate  
the downloading and installation of everything that BioPerl requires.  
The CPAN module is smart enough that when processing *our* bundle it  
will also track down and install anything that our bundle entries  
themselves list as a dependency.

So for unix/Linux systems the Bundle is a great one-liner
("perl -MCPAN -e 'install Bundle::BioPerl'") way to auto-install or
update the many perl modules that BioPerl makes use of.

On the windows side, not sure if it is of any help though.

Regards,
Chris


On Oct 17, 2006, at 4:09 AM, Nathan S. Haigh wrote:

> Hi Chris
>
> I've been working on making a PPD for the upcoming Bioperl 1.5.2  
> release. During this time I also updated Bundle::BioPerl to include  
> up-to-date prereqs. I was wondering if you could update the CPAN  
> package? The updated BioPerl.pm file is attached.
>
> There is some talk about why and if we need Bundle::BioPerl  
> anymore. What was the rationale for having it in the first place,  
> and does it still hold true now?
>
> Cheers
> Nath
>


From plu5even at gmail.com  Tue Oct 17 12:26:34 2006
From: plu5even at gmail.com (Peter H. Baenziger)
Date: Tue, 17 Oct 2006 12:26:34 -0400
Subject: [Bioperl-l] LocatableSeq object vs Sequence Object
Message-ID: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com>

All,
This is my first bioperl script (but not my first Perl script) so
please forgive my naivety.  I've read through documentation and looked
through cookbooks and the like but to no avail.  Any advice is
appreciated.
 So...I am working with an alignment object of several sequences.  My
intention is to loop through all the sequences of the alignment to
find what amino acid they have at a known position in the alignment
(not the position in the sequence).  I was thinking I could use:
foreach $seq ($alignment->each_seq())
to loop through the sequences and call:
$seq->location_from_column($pos)
on each of the sequences.  However, I don't think I have
"LocatableSequences" (the type of object that has method
"location_from_column") being returned by $alignment->each_seq().
So, how do I bridge this gap here?  Or is there a better way?
My appreciation in advance!
Peter

 code:
my $swissObj = $swissdb->get_Seq_by_acc($query);  # put several of these in @sequenceObjects
...
my $alignFactory = Bio::Tools::Run::Alignment::Clustalw->new();
    my $alignment = $alignFactory->align(\@sequenceObjects);
    #print $alignment->overall_percentage_identity(); #works

    # now we find the "alignment position" of the mutation we have on the
    # human version and get the amino acid at that "alignment position"
    # for all seq
    my $humanSequence = $prefix."HUMAN";
    # this is the "alignment position" equivalent to the mutation position
    my $pos = $alignment->column_from_residue_number($humanSequence, $aa_seqpos);

    # we'll keep track of what amino acid each species has at the
    # "alignment equivalent" location listed as being a mutation on the
    # human version
    foreach $seq ($alignment->each_seq())
    {
        # print $seq->species() . "\n";  # won't work because
        # $alignment->each_seq() actually returns a LocatableSeq object,
        # not a normal sequence object
        $speciesAA{$species} = $seq->location_from_column($pos);
    }
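
One possible way to get the residue itself, sketched under an assumption:
that calling seq() on a sequence taken from an alignment returns the
gapped string, so that the 1-based alignment column indexes straight into
it. The helper name residue_at_column and the toy rows below are made up;
in a real script the rows would come from $seq->display_id and $seq->seq
inside the each_seq() loop.

```perl
use strict;
use warnings;

# Return the character at a 1-based alignment column of a gapped
# sequence string, or undef when the column is out of range.
sub residue_at_column {
    my ( $gapped, $column ) = @_;
    return undef if $column < 1 || $column > length $gapped;
    return substr $gapped, $column - 1, 1;
}

# Toy aligned rows standing in for display_id => gapped sequence.
my %rows = (
    PROT_HUMAN => 'MK-TAYIA',
    PROT_MOUSE => 'MKQTA-IA',
);

my $column = 3;
my %aa_at;
for my $id ( keys %rows ) {
    $aa_at{$id} = residue_at_column( $rows{$id}, $column );
}
print "$_: $aa_at{$_}\n" for sort keys %aa_at;
```

A '-' in the result means that species has a gap at that column rather
than an amino acid.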


-- 
<<->>
Peter H. Baenziger


From akarger at CGR.Harvard.edu  Tue Oct 17 12:53:19 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Tue, 17 Oct 2006 12:53:19 -0400
Subject: [Bioperl-l] split location problems
Message-ID: <B9182BFF5B004245BABC12956EA6322E018E6735@huls5.nucleus.harvard.edu>

> From: Jason Stajich [mailto:jason.stajich at gmail.com]
> 
> The whole point of split locations is to represent genes with introns
> so that is not the "rare" case.

Absolutely.

> I have processed the genbank fungal genomes into GFF3 and have had no
> problems so I'm confused where you are breaking down.  If I write them
> out as embl I also get the correct thing.  This is using the CVS
> version of bioperl from the HEAD.
>
> I've added code to test this to bug 2101 including a C.glabrata
> chromosome downloaded from genbank.  Perhaps the problem is on the
> EMBL parsing side, I didn't test that.

Well, I don't know whether it's EMBL parsing, or a bit further down the
pipe, but I downloaded C.glabrata chromosome B for GenBank (NC_005968),
and it describes the complement/joins in the way that Bioperl is
handling correctly.

GenBank:
     CDS             complement(join(10347..10372,10632..11157))
                     /locus_tag="CAGL0B00242g"

EMBL:
FT   CDS             join(complement(10632..11157),complement(10347..10372))
FT                   /locus_tag="CAGL0B00242g"

Here's the diff when I run the location-printing script I posted
yesterday:

diff biogb bio
1c1,5
< complement(join(10347..10372,10632..11157))
---
> complement(1701..2651)
> complement(2635..3345)
> complement(3980..4408)
> complement(join(10632..11157,10347..10372))
> 10379..10615
209a214,217
> 498198..498890
> 499712..500062
> 499851..500702
> 500579..501364

As you can see, the complement/join CDS is written out in a different
order, which is Bad.
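
To make the difference concrete, here is a small stand-alone sketch that
expands a location string into its pieces in reading order. It assumes the
usual feature-table reading, where complement(join(a,b)) means "join, then
complement", so the pieces are read in reverse order on the minus strand.
It handles only N..M, join() and complement(), has nothing to do with how
Bioperl itself parses locations, and the sub names are invented for the
example.

```perl
use strict;
use warnings;

# Split "a,b(c,d),e" on the commas that are not inside parentheses.
sub split_top_level {
    my ($text) = @_;
    my ( $depth, $current, @parts ) = ( 0, '' );
    for my $char ( split //, $text ) {
        $depth++ if $char eq '(';
        $depth-- if $char eq ')';
        if ( $char eq ',' && $depth == 0 ) {
            push @parts, $current;
            $current = '';
            next;
        }
        $current .= $char;
    }
    return ( @parts, $current );
}

# Expand a location into [start, end, strand] triples in reading order.
sub exon_order {
    my ($loc) = @_;
    $loc =~ s/\s+//g;
    if ( $loc =~ /^complement\((.*)\)$/ ) {
        # Complementing flips the strand and reverses the reading order.
        return reverse map { [ $_->[0], $_->[1], -$_->[2] ] } exon_order($1);
    }
    if ( $loc =~ /^join\((.*)\)$/ ) {
        return map { exon_order($_) } split_top_level($1);
    }
    if ( $loc =~ /^(\d+)\.\.(\d+)$/ ) {
        return [ $1 + 0, $2 + 0, 1 ];
    }
    die "unsupported location: $loc\n";
}

# Render the triples as text so two locations can be compared.
sub describe {
    my @text;
    for my $exon (@_) {
        push @text, sprintf '%d..%d(%s)', $exon->[0], $exon->[1],
            $exon->[2] < 0 ? '-' : '+';
    }
    return join ',', @text;
}

for my $loc (
    'complement(join(10347..10372,10632..11157))',                # GenBank file
    'join(complement(10632..11157),complement(10347..10372))',    # EMBL file
    'complement(join(10632..11157,10347..10372))',                # written from EMBL input
  )
{
    printf "%s\n    => %s\n", $loc, describe( exon_order($loc) );
}
```

Under that reading the first two strings expand to the same pieces in the
same order, while the third one comes out reversed.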

(I looked at at least one of the other differences: the GB file says
it's a "misc feature" and EMBL says it's a CDS. But they don't seem to
be relevant here.)

-Amir

> 
> On the technical side, I still am not sure I fully know where the
> strand information should be stored - the top level container or the
> sub-features.  I'll try and stay up on the discussion if anything has
> been decided that I should know about.
> 
> -jason
> 
> 
> 
> 


From paul.boutros at utoronto.ca  Tue Oct 17 12:57:19 2006
From: paul.boutros at utoronto.ca (Paul Boutros)
Date: Tue, 17 Oct 2006 12:57:19 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
Message-ID: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>

Hi,
Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed  
tests, the first seems to be just a result of me not having DBD::mysql  
installed.
Paul

Test Summary
============

Failed Test               Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/BioDBSeqFeature_mysql.t               46   46  1-46
t/SearchIO.t                22  5632  1337 2671  2-1337
2 tests and 106 subtests skipped.
Failed 2/236 test scripts. 1382/11688 subtests failed.
Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =  
159.61 CPU)

BioDBSeqFeature_mysql
=====================
pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
1..46
install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC  
contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t  
/db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8  
/db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi  
/db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at  
(eval 37) line 3.
Perhaps the DBD::mysql perl module hasn't been fully installed,
or perhaps the capitalisation of 'mysql' isn't right.
Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
  at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
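
A failure like the one above could be turned into a skip by probing for the driver before any test runs. The following is only a minimal sketch under stated assumptions: the helper name module_available is made up for illustration, and the commented usage assumes the test script uses Test::More, which may not be how t/BioDBSeqFeature_mysql.t is actually written.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded, 0 otherwise.
# (Hypothetical helper; not part of Bioperl.)
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # DBD::mysql -> DBD/mysql.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Possible use near the top of a Test::More based script
# (skip_all ends the script cleanly instead of failing every test):
#
#   use Test::More;
#   plan skip_all => 'DBD::mysql not installed'
#       unless module_available('DBD::mysql');
#   plan tests => 46;
```

With a guard like this the script would be reported as skipped rather than as 46 failed subtests when the driver is missing.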

SearchIO
========
pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
1..1337
ok 1

-------------------- WARNING ---------------------
MSG: XML::SAX::Expat not currently supported; must have local copies  
of NCBI DTD docs!
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: error in parsing a report:

404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'  
does not exist  
file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
Handler couldn't resolve external entity at line 2, column 82, byte 104
error in processing external entity reference at line 2, column 82,  
byte 104 at  
/db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line  
187

---------------------------------------------------
not ok 2
# Failed test 2 in t/SearchIO.t at line 68
Can't call method "database_name" on an undefined value at  
t/SearchIO.t line 69.

------------------------------

Message: 10
Date: Tue, 17 Oct 2006 11:32:54 +0100
From: Sendu Bala <bix at sendu.me.uk>
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
To: bioperl-l at bioperl.org
Message-ID: <4534B156.4090501 at sendu.me.uk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
See http://www.bioperl.org/wiki/Release_1.5.2 for
instructions on getting and testing this RC.

Developers:
    This should be the last RC before release ~next monday. Now would
    be a good time for last minute documentation updates and additions.

Users:
    Even though 1.5.2 is a 'developer' release, we consider it the most
    stable and capable version of Bioperl, and recommend that you use
    it in all but the most critical production environments. Please
    try it out and let us know of any problems or difficulties you run
    into.


Thank you,
Sendu.


From barry.moore at genetics.utah.edu  Tue Oct 17 12:57:48 2006
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 17 Oct 2006 10:57:48 -0600
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
Message-ID: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>

lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix

does a reasonable job of converting the HTML to text.  You get the links as
numbered references at the bottom.  Or use:

lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |  
perl -ane 's/\[?\[\d+\](edit\])?//g;print'

to remove the link markers altogether.
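
The substitution from the one-liner can also be written as a small routine, which makes it easier to see what it removes. This is a sketch only: the name strip_link_markers and the sample line are made up for illustration. Note that the pattern removes any bracketed number, so text that legitimately contains something like "[3]" would lose it as well.

```perl
use strict;
use warnings;

# Remove lynx link markers such as "[12]" and wiki section
# links such as "[[13]edit]" from a chunk of dumped text.
# (Hypothetical helper; same regex as the one-liner.)
sub strip_link_markers {
    my ($text) = @_;
    $text =~ s/\[?\[\d+\](edit\])?//g;
    return $text;
}

# Illustrative input, not taken from a real dump.
my $sample = "[[7]edit] See [8]THE BIOPERL BUNDLE, below.\n";
print strip_link_markers($sample);
```

The numbered reference list that lynx appends at the bottom of the dump is not touched by this pattern and would still have to be trimmed separately.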

Barry

P.S.  Looks like this:

    #Creative Commons copyright

Installing Bioperl for Unix

 From BioPerl

    Jump to: navigation, search

Contents

      * 1 BIOPERL INSTALLATION
      * 2 SYSTEM REQUIREMENTS
      * 3 OPTIONAL
      * 4 ADDITIONAL INSTALLATION INFORMATION
      * 5 THE BIOPERL BUNDLE
      * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN
      * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make'
      * 8 WHERE ARE THE MAN PAGES?
      * 9 EXTERNAL PROGRAMS
           + 9.1 Environment Variables
      * 10 INSTALLING BIOPERL SCRIPTS
      * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA
      * 12 INSTALLING BIOPERL MODULES THE HARD WAY
      * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION
      * 14 THE TEST SYSTEM
      * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE
           + 15.1 CONFIGURING for BSD and Solaris boxes
           + 15.2 INSTALLATION
         * 16 DEPENDENCIES AND Bundle::BioPerl


BIOPERL INSTALLATION

    Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP,
    and on Mac OS X (see the PLATFORMS file for more details). Following
    are instructions for installing Bioperl for Unix/Linux/Mac OS X;
    Windows installation instructions can be found here. For installing
    Bioperl for Mac OS X using Fink, see Getting BioPerl.


SYSTEM REQUIREMENTS

      * Perl 5.005 or later; version 5.6 and greater are recommended.
        Note that most modules will work with earlier versions of Perl.
        The only ones that will not are Bio::SimpleAlign and the
        Bio::Index::* modules. If you don't need these modules and you
        want to install Bioperl using an earlier version of Perl, edit
        the "require 5.005;" line in Makefile.PL as necessary.

      * External modules: Bioperl uses functionality provided in other
        Perl modules. Some of these are included in the standard perl
        package but some need to be obtained from the CPAN site. The
        list of external modules is included at the bottom of this
        document.

    The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of these
    external modules easy. Simply install the bundle using your CPAN shell
    and all necessary modules will be installed. See THE BIOPERL BUNDLE,
    below.


OPTIONAL

      * ANSI  C  or  GNU  C  compiler  (gcc)  for  XS  extensions (the
        bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext
        PACKAGE, below).


ADDITIONAL INSTALLATION INFORMATION

      * Additional information on Bioperl and MAC OS:
           + OS 9 - http://bioperl.org/Core/mac-bioperl.html
           + OS X - http://www.tc.umn.edu/~cann0010/Bioperl_OSX_install.html
           + OS X - Installing using Fink (in Getting BioPerl)


THE BIOPERL BUNDLE

    You typically need root privileges to install using CPAN. If you don't
    have these privileges please see INSTALLING BIOPERL IN A PERSONAL
    MODULE AREA for additional information.

    Install Bundle::BioPerl using CPAN. One way:
 >perl -MCPAN -e "install Bundle::BioPerl"

    Another way:
 >perl -MCPAN -e shell
cpan>install Bundle::BioPerl


On Oct 17, 2006, at 8:04 AM, Chris Fields wrote:

> There isn't a very easy way since so many links have to be removed/modified.
> I have found a few CPAN modules that could help, but for now I just dump the
> text output from a text browser (elinks) using the 'printable version' page
> and hand-edit, which works very quickly.  That works for the time being
> until I can find another more automated solution.
>
> Fortunately there have been very few edits to either INSTALL wiki page so
> they should remain relatively stable.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>> -----Original Message-----
>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
>> Sent: Tuesday, October 17, 2006 6:46 AM
>> To: Chris Fields
>> Cc: bioperl-l
>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>>
>> Chris Fields wrote:
>>> The general consensus was to keep text versions available; we could
>>> add URL links to the wiki pages for the most up-to-date version.  BTW,
>>> I have modified INSTALL already.  INSTALL.WIN is next in line (I was
>>> waiting for your changes).
>>>
>> Is it possible to generate these files from the wiki whenever there is a
>> release? I know edits shouldn't be too severe or too often - but I can
>> see things getting a little messy/annoying if edits have to be made in 2
>> places.
>>
>> Nath


From niels at genomics.dk  Tue Oct 17 12:58:14 2006
From: niels at genomics.dk (Niels Larsen)
Date: Tue, 17 Oct 2006 18:58:14 +0200
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534E207.8030508@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk>
	<4534E207.8030508@sendu.me.uk>
Message-ID: <45350BA6.3040102@genomics.dk>

Ok, here are ways to reproduce; I sure apologize if I made the
test scripts wrong. And I suppose EBI/DDBJ's interfaces are not
a bioperl issue really.

Niels

------------ EBI

I invoked the EBI script

http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip

like this

WSWUBlastClient.pl -p blastn -D embl test.fasta

where the content of test.fasta is below, and got

Can't find method element in the message at 
/ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

 >Planctomyces sp. 282; Genbank Taxonomy ID: 79927
AATGAACGTTGGCGGCATGGATTAGGCATGCAAGTCGAGGGAGAACCCGCAAGGGGACACCGGCG
AACGGGGTAGGAATACATAGGTAACGTACCCTCAGGACGGGGATAGCCAAGGGAAACTTTGGGTA
ATACCCGATGTGATGGCAAGATGTGAATGCTTGTCATCAAAGGTGAGATTCCACCTGAGGAGCGG
CTTATGCATCATTAGCTTGTTGGCGGGGTAACGGCCCACCAAGGCTGCGATGATTAGGGGGTGTG
AGAGCATGGCCCCCACCACTGGCACTGAGACACTGGCCAGACACCTACGGGTGGCTGCAGTCGAG

I tried with this test sequence in fasta format and with just the
sequence.

------------ DDBJ

Inspired by this page,

http://xml.nig.ac.jp/doc/Blast.txt

I made this test script

------ cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

my ( $service, $seqstr, $result );

use SOAP::Lite;
use Data::Dumper;

$service = SOAP::Lite->service('http://xml.nig.ac.jp/wsdl/Blast.wsdl');

$seqstr = "MSSRIARALALVVTLLHLTRLALSTCPAACHCPLEAPKCAPGVGLVRDGCGCCKVCAKQL";

$result = $service->searchSimple( "blastp", "SWISS", $seqstr );

print Dumper( $result );
------ cut --

which for me prints undef.

------------- NCBI/Bioperl

I installed 1.5.2-RC2, looked at the RemoteBlast example in

http://www.bioperl.org/wiki/Bptutorial.pl

and then put that into this test code, more or less cut/paste,

--- cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

use Bio::Tools::Run::RemoteBlast;
use Data::Dumper;

my ( $remote_blast, $r, $rc, $rid, @rids );

$remote_blast = Bio::Tools::Run::RemoteBlast->new (
                 -prog => 'blastn', -data => 'ecoli', -expect => '1e-10' );

$r = $remote_blast->submit_blast("ecoli.fasta");

while ( @rids = $remote_blast->each_rid )
{
#    print Dumper( \@rids );

     for $rid ( @rids ) {
         $rc = $remote_blast->retrieve_blast($rid);
#        print Dumper( $rc );
     }

     sleep 10;
}
--- cut --

which saves the same blast report to TMPDIR every 10 seconds.
The "ecoli.fasta" file contains this

 >test
gggggctctgttggttctcccgcaacgctactctgtttaccaggtcaggtccggaaggaa
gcagccaaggcagatgacgcgtgtgccgggatgtagctggcagggcccccaccc

Maybe I am supposed to add a check for content in $rc and then stop
the inner loop? I could figure that out maybe, but I wish there was a
function which simply takes a single sequence + arguments and only
returns a list of matches when done, and does not return until then
(or until a specified timeout).
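
A blocking wrapper along those lines can be sketched around the same three calls the RemoteBlast synopsis uses (each_rid, retrieve_blast, remove_rid). Everything else here is an assumption made for illustration: the name wait_for_blast, the timeout and interval options and their defaults are not part of Bioperl. The sketch treats a reference returned by retrieve_blast as a finished report and a negative value as a failed job, as the synopsis does, and removes the RID in both cases so the outer loop can end.

```perl
use strict;
use warnings;

# Poll a RemoteBlast-style factory until every submitted job has
# finished or failed, or die once the timeout has passed.
# Returns the finished report objects.
# (Hypothetical helper; option names and defaults are assumptions.)
sub wait_for_blast {
    my ($factory, %opt) = @_;
    my $timeout  = exists $opt{timeout}  ? $opt{timeout}  : 600;   # seconds
    my $interval = exists $opt{interval} ? $opt{interval} : 10;    # seconds
    my $deadline = time + $timeout;
    my @reports;

    while ( my @rids = $factory->each_rid ) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( ref $rc ) {                  # finished: keep the report
                push @reports, $rc;
                $factory->remove_rid($rid);
            }
            elsif ( $rc < 0 ) {               # failed: give up on this job
                warn "BLAST job $rid failed\n";
                $factory->remove_rid($rid);
            }
            # anything else: still pending, keep the RID and poll again
        }
        my @left = $factory->each_rid;
        last unless @left;
        die "Timed out waiting for BLAST jobs: @left\n" if time >= $deadline;
        sleep $interval;
    }
    return @reports;
}

# Possible use with the script above (needs Bioperl and network access):
#
#   $remote_blast->submit_blast("ecoli.fasta");
#   my @reports = wait_for_blast( $remote_blast, timeout => 300 );
```

Because finished and failed RIDs are removed, each report is retrieved once instead of being saved again on every pass.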


------------------------------------------------------------------------

Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark

Telephone: +45-8942-5268
Telefax: +45-8620-1222

------------------------------------------------------------------------


From bertrand.beckert at gmail.com  Tue Oct 17 10:52:36 2006
From: bertrand.beckert at gmail.com (bertrand beckert)
Date: Tue, 17 Oct 2006 16:52:36 +0200
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
Message-ID: <500217090610170752q565cfc08t5208e3b64f99ef7f@mail.gmail.com>

hi,

I am running a large number of BLAST searches through the NCBI BLAST
page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi').
I am trying to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have
some problems. I made a simple example with only one sequence in
order to understand how this module works. This is my simple input
file, a DNA sequence in FASTA format:

>test
TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA

I made some modifications to the example available in the Bioperl docs.
It gives me a RID, which holds the results of my blast, but I have a
problem with "$result=$factory->retrieve_blast($rid)" in my script.
The documentation says that, when it works, $result=$factory->retrieve_blast
($rid) returns a Bio::Tools::BPlite or Bio::Tools::Blast
object. In my case it returns a Bio::SearchIO::blast object... I don't
understand why I don't get the expected type of object (see PART I).
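
A Bio::SearchIO::blast object is what one would expect when '-readmethod' => 'SearchIO' is passed to the factory, as in the script of PART I. Such an object is a parser rather than the result itself, so the hits have to be pulled out of it with next_result and next_hit. The following is a minimal sketch of that; the helper name summarize_report and the tab-separated output format are made up for illustration, while next_result, next_hit, query_name, name and significance are the usual Bio::SearchIO / Bio::Search accessors.

```perl
use strict;
use warnings;

# Walk a Bio::SearchIO-style parser and return one line per hit:
# query name, hit name and hit significance, separated by tabs.
# (Hypothetical helper; not part of Bioperl.)
sub summarize_report {
    my ($searchio) = @_;
    my @lines;
    while ( my $result = $searchio->next_result ) {
        while ( my $hit = $result->next_hit ) {
            push @lines,
                join( "\t", $result->query_name, $hit->name, $hit->significance );
        }
    }
    return @lines;
}

# Possible use inside the RID loop (needs Bioperl and network access):
#
#   my $rc = $factory->retrieve_blast($rid);
#   if ( ref $rc ) {                      # a finished report
#       print "$_\n" for summarize_report($rc);
#       $factory->remove_rid($rid);       # do not fetch it again
#   }
```

Checking ref($rc) before calling any method matters because retrieve_blast returns a plain number while the job is still running.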

I also tried to solve the problem by replacing the foreach loop in my
script with a new one in order to explore the blast result page, but that
doesn't work either (see PART II).

Could you help me, please? Thank you.

Bertrand Beckert.

PART I:

Here is my script with a little annotation and also the shell window
printing:
------------------------------------------------------------------------
#!/usr/bin/perl -w
use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;
sub blast {
my $prog='blastn';
my $db='refseq_genomic';
my $e_val='1e-10';
my $Input='Seq.fasta';
my @params = ('-prog' =>  $prog, '-data' =>  $db, '-expect' =>
$e_val, '-readmethod' => 'SearchIO');
my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
#changes parameters
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
$Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';
$factory->submit_blast($Input);
print STDERR "waiting...\n";
while (my @rids=$factory->each_rid) {
    print "my rid: ",@rids,"\n";
    # returns the ID of the submitted blast, i.e. RID:
    # 1161079157-766-185099855365.BLASTQ2
    # this page contains the result of my blast...
    foreach my $rid (@rids) {
        $result=$factory->retrieve_blast($rid);
        # line to show what type of object is returned by retrieve_blast
        print "rc:", $result,"\n";
    }
}
}

&blast;
------------------------------------------------------------------------

here you can see the shell window:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...

PARTII:

I also tried to solve the problem by replacing the foreach loop in my
script with:
------------------------------------------------------------------------

foreach my $rid (@rids) {
    while(1) {
        $result=$factory->retrieve_blast($rid)->next_result();
        print "rc:", $result,"\n";
        if ($result) {
            print  $result->num_hits(),"\n";
        }
    }
}
------------------------------------------------------------------------

With this loop I can explore the blast result page. This is what I
get in the shell window:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834)


----
-- 
Bertrand BECKERT
PhD student
IBMC - UPR 9002 du CNRS - ARN
15, rue Rene Descartes
F-67084 STRASBOURG Cedex
b.beckert at ibmc.u-strasbg.fr
bertrand.beckert at gmail.com


From cjfields at uiuc.edu  Tue Oct 17 13:50:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 12:50:49 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>
Message-ID: <001201c6f214$c8934440$15327e82@pyrimidine>

(Apologies for the top post, but I thought my response might get lost below)

I use elinks in a similar fashion.  It tends to format the tables a bit
better than lynx.

Chris

> -----Original Message-----
> From: Barry Moore [mailto:barry.moore at genetics.utah.edu]
> Sent: Tuesday, October 17, 2006 11:58 AM
> To: Chris Fields
> Cc: 'Nathan S. Haigh'; 'bioperl-l'
> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
> 
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix
> 
> does a reasonable job of textifying html.  You get the links as
> numbered references at the bottom or:
> 
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |
> perl -ane 's/\[?\[\d+\](edit\])?//g;print'
> 
> to remove the links all together.
> 
> Barry
> 


From cjfields at uiuc.edu  Tue Oct 17 13:52:36 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 12:52:36 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
Message-ID: <001301c6f215$07a9a070$15327e82@pyrimidine>

What do you get when you run the SearchIO.t test by itself using 'perl -I.
t/SearchIO.t'?  It looks like something pretty catastrophic happened.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
> Sent: Tuesday, October 17, 2006 11:57 AM
> To: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
> 
> Hi,
> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
> tests, the first seems to be just a result of me not having DBD::mysql
> installed.
> Paul


From paul.boutros at utoronto.ca  Tue Oct 17 13:59:33 2006
From: paul.boutros at utoronto.ca (Paul Boutros)
Date: Tue, 17 Oct 2006 13:59:33 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
Message-ID: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca>

Hi Chris,

Here it is:
pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
1..1337
ok 1

-------------------- WARNING ---------------------
MSG: XML::SAX::Expat not currently supported; must have local copies  
of NCBI DTD docs!
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: error in parsing a report:

404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'  
does not exist  
file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
Handler couldn't resolve external entity at line 2, column 82, byte 104
error in processing external entity reference at line 2, column 82,  
byte 104 at  
/db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line  
187

---------------------------------------------------
not ok 2
# Failed test 2 in t/SearchIO.t at line 68
Can't call method "database_name" on an undefined value at  
t/SearchIO.t line 69.


Quoting Chris Fields <cjfields at uiuc.edu>:

> What do you get when you run the SearchIO.t test by itself using 'perl -I.
> t/SearchIO.t'?  It looks like something pretty catastrophic happened.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
>> Sent: Tuesday, October 17, 2006 11:57 AM
>> To: bioperl-l at lists.open-bio.org
>> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
>>
>> Hi,
>> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
>> tests, the first seems to be just a result of me not having DBD::mysql
>> installed.
>> Paul
>>
>> Test Summary
>> ============
>>
>> Failed Test               Stat Wstat Total Fail  List of Failed
>> --------------------------------------------------------------------------
>> -----
>> t/BioDBSeqFeature_mysql.t               46   46  1-46
>> t/SearchIO.t                22  5632  1337 2671  2-1337
>> 2 tests and 106 subtests skipped.
>> Failed 2/236 test scripts. 1382/11688 subtests failed.
>> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =
>> 159.61 CPU)
>>
>> BioDBSeqFeature_mysql
>> =====================
>> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
>> 1..46
>> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC
>> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t
>> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8
>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi
>> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at
>> (eval 37) line 3.
>> Perhaps the DBD::mysql perl module hasn't been fully installed,
>> or perhaps the capitalisation of 'mysql' isn't right.
>> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
>>   at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
>>
>> SearchIO
>> ========
>> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
>> 1..1337
>> ok 1
>>
>> -------------------- WARNING ---------------------
>> MSG: XML::SAX::Expat not currently supported; must have local copies
>> of NCBI DTD docs!
>> ---------------------------------------------------
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>
>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
>> does not exist
>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>> Handler couldn't resolve external entity at line 2, column 82, byte 104
>> error in processing external entity reference at line 2, column 82,
>> byte 104 at
>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
>> 187
>>
>> ---------------------------------------------------
>> not ok 2
>> # Failed test 2 in t/SearchIO.t at line 68
>> Can't call method "database_name" on an undefined value at
>> t/SearchIO.t line 69.
>>
>> ------------------------------
>>
>> Message: 10
>> Date: Tue, 17 Oct 2006 11:32:54 +0100
>> From: Sendu Bala <bix at sendu.me.uk>
>> Subject: [Bioperl-l] Bioperl 1.5.2 RC2
>> To: bioperl-l at bioperl.org
>> Message-ID: <4534B156.4090501 at sendu.me.uk>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
>> See http://www.bioperl.org/wiki/Release_1.5.2 for
>> instructions on getting and testing this RC.
>>
>> Developers:
>>     This should be the last RC before release ~next monday. Now would
>>     be a good time for last minute documentation updates and additions.
>>
>> Users:
>>     Even though 1.5.2 is a 'developer' release, we consider it the most
>>     stable and capable version of Bioperl, and recommend that you use
>>     it in all but the most critical production environments. Please
>>     try it out and let us know of any problems or difficulties you run
>>     into.
>>
>>
>> Thank you,
>> Sendu.
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From barry.moore at genetics.utah.edu  Tue Oct 17 14:07:12 2006
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 17 Oct 2006 12:07:12 -0600
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <C15A8DE6.AD40%bosborne11@verizon.net>
References: <C15A8DE6.AD40%bosborne11@verizon.net>
Message-ID: <588DE26B-8F18-4540-BAEE-2B479CBDE8B3@genetics.utah.edu>

In fact, I think it was you who taught me that trick in the first place.

B

On Oct 17, 2006, at 11:40 AM, Brian Osborne wrote:

> Barry,
>
> I second that. lynx does the best job of converting HTML to text  
> I've seen.
>
> Brian O.
>
>
> On 10/17/06 12:57 PM, "Barry Moore" <barry.moore at genetics.utah.edu>  
> wrote:
>
>> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix
>>
>> does a reasonable job of textifying html.  You get the links as
>> numbered references at the bottom or:
>>
>> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |
>> perl -ane 's/\[?\[\d+\](edit\])?//g;print'
>>
>> to remove the links all together.
>>
>> Barry
>>
>> P.S.  Looks like this:
>>
>>     #Creative Commons copyright
>>
>> Installing Bioperl for Unix
>>
>>  From BioPerl
>>
>>     Jump to: navigation, search
>>
>> Contents
>>
>>       * 1 BIOPERL INSTALLATION
>>       * 2 SYSTEM REQUIREMENTS
>>       * 3 OPTIONAL
>>       * 4 ADDITIONAL INSTALLATION INFORMATION
>>       * 5 THE BIOPERL BUNDLE
>>       * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN
>>       * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make'
>>       * 8 WHERE ARE THE MAN PAGES?
>>       * 9 EXTERNAL PROGRAMS
>>            + 9.1 Environment Variables
>>       * 10 INSTALLING BIOPERL SCRIPTS
>>       * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA
>>       * 12 INSTALLING BIOPERL MODULES THE HARD WAY
>>       * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION
>>       * 14 THE TEST SYSTEM
>>       * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE
>>            + 15.1 CONFIGURING for BSD and Solaris boxes
>>            + 15.2 INSTALLATION
>>          * 16 DEPENDENCIES AND Bundle::BioPerl
>>
>>
>> BIOPERL INSTALLATION
>>
>>     Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP,
>>     and on Mac OS X (see the PLATFORMS file for more details). Following
>>     are instructions for installing Bioperl for Unix/Linux/Mac OS X;
>>     Windows installation instructions can be found here. For installing
>>     Bioperl for Mac OS X using Fink, see Getting BioPerl.
>>
>>
>> SYSTEM REQUIREMENTS
>>
>>       * Perl 5.005 or later; version 5.6 and greater are recommended.
>>         Note that most modules will work with earlier versions of Perl.
>>         The only ones that will not are Bio::SimpleAlign and the
>>         Bio::Index::* modules. If you don't need these modules and you
>>         want to install Bioperl using an earlier version of Perl, edit
>>         the "require 5.005;" line in Makefile.PL as necessary.
>>
>>       * External modules: Bioperl uses functionality provided in other
>>         Perl modules. Some of these are included in the standard perl
>>         package but some need to be obtained from the CPAN site. The
>>         list of external modules is included at the bottom of this
>>         document.
>>
>>     The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of
>>     these external modules easy. Simply install the bundle using your
>>     CPAN shell and all necessary modules will be installed. See THE
>>     BIOPERL BUNDLE, below.
>>
>>
>> OPTIONAL
>>
>>       * ANSI C or GNU C compiler (gcc) for XS extensions (the
>>         bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext
>>         PACKAGE, below).
>>
>>
>>
>> ADDITIONAL INSTALLATION INFORMATION
>>
>>       * Additional information on Bioperl and MAC OS:
>>            + OS 9 - http://bioperl.org/Core/mac-bioperl.html
>>            + OS X - http://www.tc.umn.edu/~cann0010/Bioperl_OSX_install.html
>>            + OS X - Installing using Fink (in Getting BioPerl)
>>
>>
>>
>> THE BIOPERL BUNDLE
>>
>>     You typically need root privileges to install using CPAN. If you
>>     don't have these privileges please see INSTALLING BIOPERL IN A
>>     PERSONAL MODULE AREA for additional information.
>>
>>     Install Bundle::BioPerl using CPAN. One way:
>>     > perl -MCPAN -e "install Bundle::BioPerl"
>>
>>     Another way:
>>     > perl -MCPAN -e shell
>>     cpan> install Bundle::BioPerl
>>
>>
>>
>> On Oct 17, 2006, at 8:04 AM, Chris Fields wrote:
>>
>>> There isn't a very easy way since so many links have to be
>>> removed/modified. I have found a few CPAN modules that could help, but
>>> for now I just dump the text output from a text browser (elinks) using
>>> the 'printable version' page and hand-edit, which works very quickly.
>>> That works for the time being until I can find another more automated
>>> solution.
>>>
>>> Fortunately there have been very few edits to either INSTALL wiki page
>>> so they should remain relatively stable.
>>>
>>> Christopher Fields
>>> Postdoctoral Researcher - Switzer Lab
>>> Dept. of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>
>>>> -----Original Message-----
>>>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
>>>> Sent: Tuesday, October 17, 2006 6:46 AM
>>>> To: Chris Fields
>>>> Cc: bioperl-l
>>>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>>>>
>>>> Chris Fields wrote:
>>>>> The general consensus was to keep text versions available; we could
>>>>> add URL links to the wiki pages for the most up-to-date version.
>>>>> BTW, I have modified INSTALL already.  INSTALL.WIN is next in line
>>>>> (I was waiting for your changes).
>>>>>
>>>> Is it possible to generate these files from the wiki whenever there
>>>> is a release? I know edits shouldn't be too severe or too often - but
>>>> I can see things getting a little messy/annoying if edits have to be
>>>> made in 2 places.
>>>>
>>>> Nath
>>>
>


From bix at sendu.me.uk  Tue Oct 17 14:07:04 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 19:07:04 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
References: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
Message-ID: <45351BC8.9080507@sendu.me.uk>

Paul Boutros wrote:
> Hi,
> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed  
> tests, the first seems to be just a result of me not having DBD::mysql  
> installed.
[snip]

Thanks for those, very useful. Not something that's come up before 
afaik; I'll look into them.


From cjfields at uiuc.edu  Tue Oct 17 14:31:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 13:31:51 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca>
Message-ID: <001401c6f21a$836f9fc0$15327e82@pyrimidine>

Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
backend parser.  For some reason BLAST XML parsing doesn't work with that
parser (it tries to verify the XML first before parsing, hence the DTD
error).  I may try getting this to work again, but so far I haven't found an
easy way to prevent XML verification via XML::SAX::Expat.

There are two options: 1) install XML::SAX::ExpatXS (the better option),
which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
parser in the ParserDetails.ini file of your local XML::SAX installation
to XML::SAX::PurePerl.

BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
hasn't officially happened yet); the latter hasn't had significant
development in about three years.
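
For anyone who wants to check which backend they would get, here is a
minimal sketch, assuming XML::SAX is installed. The helper name
pick_parser and the preference order are made up for illustration and are
not part of BioPerl; XML::SAX->parsers and $XML::SAX::ParserPackage are
the hooks documented by XML::SAX itself.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first preferred backend that is registered.
sub pick_parser {
    my ($available, @preferred) = @_;
    my %have = map { $_ => 1 } @$available;
    for my $name (@preferred) {
        return $name if $have{$name};
    }
    return;    # nothing suitable found
}

# Names registered in ParserDetails.ini; empty list if XML::SAX is absent.
my @registered = eval {
    require XML::SAX;
    map { $_->{Name} } @{ XML::SAX->parsers };
};

my $choice = pick_parser(\@registered,
    'XML::SAX::ExpatXS', 'XML::SAX::PurePerl');

if (defined $choice) {
    no warnings 'once';
    # Override consulted by XML::SAX::ParserFactory when it builds a parser.
    $XML::SAX::ParserPackage = $choice;
    print "Using $choice\n";
}
else {
    print "No supported SAX backend found\n";
}
```

Setting the package variable only affects the current process, so it
avoids editing ParserDetails.ini on a shared installation.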

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: Paul Boutros [mailto:paul.boutros at utoronto.ca]
> Sent: Tuesday, October 17, 2006 1:00 PM
> To: Chris Fields
> Cc: bioperl-l at lists.open-bio.org
> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2
> 
> Hi Chris,
> 
> Here it is:
> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
> 1..1337
> ok 1
> 
> -------------------- WARNING ---------------------
> MSG: XML::SAX::Expat not currently supported; must have local copies
> of NCBI DTD docs!
> ---------------------------------------------------
> 
> -------------------- WARNING ---------------------
> MSG: error in parsing a report:
> 
> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
> does not exist
> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
> Handler couldn't resolve external entity at line 2, column 82, byte 104
> error in processing external entity reference at line 2, column 82,
> byte 104 at
> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
> 187
> 
> ---------------------------------------------------
> not ok 2
> # Failed test 2 in t/SearchIO.t at line 68
> Can't call method "database_name" on an undefined value at
> t/SearchIO.t line 69.
> 
> 
> Quoting Chris Fields <cjfields at uiuc.edu>:
> 
> > What do you get when you run the SearchIO.t test by itself using
> > 'perl -I. t/SearchIO.t'?  It looks like something pretty catastrophic
> > happened.
> >
> > Christopher Fields
> > Postdoctoral Researcher - Switzer Lab
> > Dept. of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
> >> Sent: Tuesday, October 17, 2006 11:57 AM
> >> To: bioperl-l at lists.open-bio.org
> >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
> >>
> >> Hi,
> >> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
> >> tests, the first seems to be just a result of me not having DBD::mysql
> >> installed.
> >> Paul
> >>
> >> Test Summary
> >> ============
> >>
> >> Failed Test               Stat Wstat Total Fail  List of Failed
> >> -------------------------------------------------------------------------------
> >> t/BioDBSeqFeature_mysql.t               46   46  1-46
> >> t/SearchIO.t                22  5632  1337 2671  2-1337
> >> 2 tests and 106 subtests skipped.
> >> Failed 2/236 test scripts. 1382/11688 subtests failed.
> >> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =
> >> 159.61 CPU)
> >>
> >> BioDBSeqFeature_mysql
> >> =====================
> >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
> >> 1..46
> >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC
> >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t
> >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8
> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi
> >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at
> >> (eval 37) line 3.
> >> Perhaps the DBD::mysql perl module hasn't been fully installed,
> >> or perhaps the capitalisation of 'mysql' isn't right.
> >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
> >>   at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
> >>
> >> SearchIO
> >> ========
> >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
> >> 1..1337
> >> ok 1
> >>
> >> -------------------- WARNING ---------------------
> >> MSG: XML::SAX::Expat not currently supported; must have local copies
> >> of NCBI DTD docs!
> >> ---------------------------------------------------
> >>
> >> -------------------- WARNING ---------------------
> >> MSG: error in parsing a report:
> >>
> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
> >> does not exist
> >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
> >> Handler couldn't resolve external entity at line 2, column 82, byte 104
> >> error in processing external entity reference at line 2, column 82,
> >> byte 104 at
> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
> >> 187
> >>
> >> ---------------------------------------------------
> >> not ok 2
> >> # Failed test 2 in t/SearchIO.t at line 68
> >> Can't call method "database_name" on an undefined value at
> >> t/SearchIO.t line 69.
> >>
> >> ------------------------------
> >>
> >> Message: 10
> >> Date: Tue, 17 Oct 2006 11:32:54 +0100
> >> From: Sendu Bala <bix at sendu.me.uk>
> >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2
> >> To: bioperl-l at bioperl.org
> >> Message-ID: <4534B156.4090501 at sendu.me.uk>
> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> >>
> >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
> >> See http://www.bioperl.org/wiki/Release_1.5.2 for
> >> instructions on getting and testing this RC.
> >>
> >> Developers:
> >>     This should be the last RC before release ~next monday. Now would
> >>     be a good time for last minute documentation updates and additions.
> >>
> >> Users:
> >>     Even though 1.5.2 is a 'developer' release, we consider it the most
> >>     stable and capable version of Bioperl, and recommend that you use
> >>     it in all but the most critical production environments. Please
> >>     try it out and let us know of any problems or difficulties you run
> >>     into.
> >>
> >>
> >> Thank you,
> >> Sendu.
> >>
> >>
> >>
> >
> >
> 


From cjfields at uiuc.edu  Tue Oct 17 15:05:59 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 14:05:59 -0500
Subject: [Bioperl-l] split location problems
In-Reply-To: <B9182BFF5B004245BABC12956EA6322E018E6735@huls5.nucleus.harvard.edu>
Message-ID: <001b01c6f21f$48640420$15327e82@pyrimidine>

> > From: Jason Stajich [mailto:jason.stajich at gmail.com]
> >
> > The whole point of split locations is to represent genes with
> > introns
> > so that is not the "rare" case.
> 
> Absolutely.

Right, but that specific kind of join statement is not commonly used  in
GenBank files, which seems to be the format predominately used (no offense
to EBI).  This may explain why we haven't seen this pop up more often.  

I believe what we're seeing is a difference in the way these locations are
described at NCBI vs EBI, which Nadeem Faruque seems to corroborate.  He
indicated that EBI may move to using similar GenBank-like location strings.
Regardless, FTLocationFactory and Bio::Location::Split should handle both
forms when present, but currently they only seem to like the GenBank version.

> > I've added code to test this to bug 2101 including a C.glabrata
> > chromsome downloaded from genbank.  Perhaps the problem is on the
> > EMBL parsing side, I didn't test that.
> 
> Well, I don't know whether it's EMBL parsing, or a bit further down the
> pipe, but I downloaded C.glabrata chromosome B for GenBank (NC_005968),
> and it describes the complement/joins in the way that Bioperl is
> handling correctly.
> 
> GenBank:
>      CDS             complement(join(10347..10372,10632..11157))
>                      /locus_tag="CAGL0B00242g"
> 
> EMBL:
> FT   CDS
> join(complement(10632..11157),complement(10347..10372))
> FT                   /locus_tag="CAGL0B00242g"

Yes, something that I found out independently (and corroborated by Nadeem).

> Here's the diff when I run the location-printing script I posted
> yesterday:
> 
> diff biogb bio
> 1c1,5
> < complement(join(10347..10372,10632..11157))
> ---
> > complement(1701..2651)
> > complement(2635..3345)
> > complement(3980..4408)
> > complement(join(10632..11157,10347..10372))
> > 10379..10615
> 209a214,217
> > 498198..498890
> > 499712..500062
> > 499851..500702
> > 500579..501364
> 
> As you can see, the complement/join CDS is written out in a different
> order, which is Bad.

I think this can be handled directly in to_FTstring().  I'll have to add a
method to get the strand info from the Split object w/o going through
strand().  
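
To make the intended output concrete, here is a rough string-level
sketch, not the actual to_FTstring() code. The function name
embl_to_genbank_style is hypothetical, and it assumes every sub-location
is a simple complemented range with no remote accessions or fuzzy
positions.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite join(complement(a),complement(b),...) as
# complement(join(...,b,a)). Any other string is returned unchanged.
sub embl_to_genbank_style {
    my ($loc) = @_;
    return $loc unless $loc =~ /^join\((.+)\)$/;
    my @parts = split /,/, $1;
    my @ranges;
    for my $part (@parts) {
        # Bail out unless this piece is a plain complemented range.
        return $loc unless $part =~ /^complement\(([^(),]+)\)$/;
        push @ranges, $1;
    }
    # Reversing restores the order used in the GenBank record.
    return 'complement(join(' . join(',', reverse @ranges) . '))';
}

print embl_to_genbank_style(
    'join(complement(10632..11157),complement(10347..10372))'
), "\n";
```

For the CAGL0B00242g CDS above this prints the GenBank form,
complement(join(10347..10372,10632..11157)).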

However, I'm thinking about trying a different tack which is a bit simpler
and, if it proves fruitful, may simplify Split locations somewhat.  It won't
be ready for 1.5.2 but maybe the next release.

> (I looked at at least one of the other differences: the GB file says
> it's a "misc feature" and EMBL says it's a CDS. But they don't seem to
> be relevant here.)
> -Amir

Probably not but something to keep in mind.
 
-c

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From er at xs4all.nl  Tue Oct 17 15:01:48 2006
From: er at xs4all.nl (Erikjan)
Date: Tue, 17 Oct 2006 21:01:48 +0200 (CEST)
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
Message-ID: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>

Hello,

I noticed a little problem with the Annotation "DBLink" from GenBank entries

When I run:

perl -MBio::DB::GenBank -e 'my $gi =
56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
$db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
$ac=$seq->annotation(); my @annotations = $ac->get_Annotations("dblink");
for(@annotations) { print $_, "\n";} print $INC{
"Bio/Annotation/DBLink.pm" }, "\n"; '

This yields:

   GenBank:AL591065.17.17

and the place where the used Bio/Annotation/DBLink.pm resides.

Can others repeat this?

I have dug into the source a little and Bio::Annotation::DBLink seems to
be the place where this happens: it has a concatenation which leads to
that repeated version number.

Is this something that I should fix "client-side", so to speak, or is it
worthwhile to add some logic to that concatenation to prevent this?
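
The kind of logic I have in mind is sketched below. This is not the
actual Bio::Annotation::DBLink code; the function name dblink_string is
hypothetical, and it assumes the doubling happens because the primary id
already carries the version when the version is appended.

```perl
use strict;
use warnings;

# Hypothetical helper: build "database:id.version" without doubling a
# version that the id already ends with.
sub dblink_string {
    my ($database, $primary_id, $version) = @_;
    my $id = $primary_id;
    if (defined $version && length $version
        && $id !~ /\.\Q$version\E$/) {
        $id .= ".$version";
    }
    return "$database:$id";
}

print dblink_string('GenBank', 'AL591065.17', 17), "\n";  # id already versioned
print dblink_string('GenBank', 'AL591065',    17), "\n";  # version appended
```

Both calls print GenBank:AL591065.17. The check is purely textual, so an
unversioned id that happens to end in ".17" would be left without its
version; that trade-off would need discussing before anything like this
goes in.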


Thanks,

Eric


From bosborne11 at verizon.net  Tue Oct 17 13:40:54 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 17 Oct 2006 13:40:54 -0400
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>
Message-ID: <C15A8DE6.AD40%bosborne11@verizon.net>

Barry,

I second that. lynx does the best job of converting HTML to text I've seen.

Brian O.


On 10/17/06 12:57 PM, "Barry Moore" <barry.moore at genetics.utah.edu> wrote:

> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix
> 
> does a reasonable job of textifying html.  You get the links as
> numbered references at the bottom or:
> 
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |
> perl -ane 's/\[?\[\d+\](edit\])?//g;print'
> 
> to remove the links all together.
> 
> Barry
> 
> P.S.  Looks like this:
> 
>     #Creative Commons copyright
> 
> Installing Bioperl for Unix
> 
>  From BioPerl
> 
>     Jump to: navigation, search
> 
> Contents
> 
>       * 1 BIOPERL INSTALLATION
>       * 2 SYSTEM REQUIREMENTS
>       * 3 OPTIONAL
>       * 4 ADDITIONAL INSTALLATION INFORMATION
>       * 5 THE BIOPERL BUNDLE
>       * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN
>       * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make'
>       * 8 WHERE ARE THE MAN PAGES?
>       * 9 EXTERNAL PROGRAMS
>            + 9.1 Environment Variables
>       * 10 INSTALLING BIOPERL SCRIPTS
>       * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA
>       * 12 INSTALLING BIOPERL MODULES THE HARD WAY
>       * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION
>       * 14 THE TEST SYSTEM
>       * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE
>            + 15.1 CONFIGURING for BSD and Solaris boxes
>            + 15.2 INSTALLATION
>          * 16 DEPENDENCIES AND Bundle::BioPerl
> 
> 
> BIOPERL INSTALLATION
> 
>     Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP,
>     and on Mac OS X (see the PLATFORMS file for more details). Following
>     are instructions for installing Bioperl for Unix/Linux/Mac OS X;
>     Windows installation instructions can be found here. For installing
>     Bioperl for Mac OS X using Fink, see Getting BioPerl.
> 
> 
> SYSTEM REQUIREMENTS
> 
>       * Perl 5.005 or later; version 5.6 and greater are recommended.
>         Note that most modules will work with earlier versions of Perl.
>         The only ones that will not are Bio::SimpleAlign and the
>         Bio::Index::* modules. If you don't need these modules and you
>         want to install Bioperl using an earlier version of Perl, edit
>         the "require 5.005;" line in Makefile.PL as necessary.
>
>       * External modules: Bioperl uses functionality provided in other
>         Perl modules. Some of these are included in the standard perl
>         package but some need to be obtained from the CPAN site. The
>         list of external modules is included at the bottom of this
>         document.
>
>     The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of
>     these external modules easy. Simply install the bundle using your
>     CPAN shell and all necessary modules will be installed. See THE
>     BIOPERL BUNDLE, below.
> 
> 
> OPTIONAL
> 
>       * ANSI  C  or  GNU  C  compiler  (gcc)  for  XS  extensions (the
>         bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext
>         PACKAGE, below).
> 
> 
> 
> ADDITIONAL INSTALLATION INFORMATION
> 
>       * Additional information on Bioperl and MAC OS:
>            + OS 9 - http://bioperl.org/Core/mac-bioperl.html
>            + OS X - http://www.tc.umn.edu/~cann0010/Bioperl_OSX_install.html
>            + OS X - Installing using Fink (in Getting BioPerl)
> 
> 
> 
> THE BIOPERL BUNDLE
> 
>     You typically need root privileges to install using CPAN. If you
>     don't have these privileges please see INSTALLING BIOPERL IN A
>     PERSONAL MODULE AREA for additional information.
>
>     Install Bundle::BioPerl using CPAN. One way:
>     > perl -MCPAN -e "install Bundle::BioPerl"
>
>     Another way:
>     > perl -MCPAN -e shell
>     cpan> install Bundle::BioPerl
> 
> 
> 
> On Oct 17, 2006, at 8:04 AM, Chris Fields wrote:
> 
>> There isn't a very easy way since so many links have to be
>> removed/modified. I have found a few CPAN modules that could help, but
>> for now I just dump the text output from a text browser (elinks) using
>> the 'printable version' page and hand-edit, which works very quickly.
>> That works for the time being until I can find another more automated
>> solution.
>>
>> Fortunately there have been very few edits to either INSTALL wiki page
>> so they should remain relatively stable.
>> 
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>> 
>>> -----Original Message-----
>>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
>>> Sent: Tuesday, October 17, 2006 6:46 AM
>>> To: Chris Fields
>>> Cc: bioperl-l
>>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>>> 
>>> Chris Fields wrote:
>>>> The general consensus was to keep text versions available; we could
>>>> add URL links to the wiki pages for the most up-to-date version.
>>>> BTW, I have modified INSTALL already.  INSTALL.WIN is next in line
>>>> (I was waiting for your changes).
>>>> 
>>> Is it possible to generate these files from the wiki whenever there
>>> is a release? I know edits shouldn't be too severe or too often - but
>>> I can see things getting a little messy/annoying if edits have to be
>>> made in 2 places.
>>> 
>>> Nath
>> 


From cjfields at uiuc.edu  Tue Oct 17 16:30:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 15:30:15 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
Message-ID: <0FB91820-B2A1-4F7F-866C-8D4791DD8306@uiuc.edu>

I can confirm this using bioperl-live:

GenBank:AL591065.17.17
/Users/cjfields/src/bioperl-live/Bio/Annotation/DBLink.pm

Could you file a bug report via bugzilla?

Chris

On Oct 17, 2006, at 2:01 PM, Erikjan wrote:

> Hello,
>
> I noticed a little problem with the Annotation "DBLink" from  
> GenBank entries
>
> When I run:
>
> perl -MBio::DB::GenBank -e 'my $gi =
> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations 
> ("dblink");
> for(@annotations) { print $_, "\n";} print $INC{
> "Bio/Annotation/DBLink.pm" }, "\n"; '
>
> This yields:
>
>    GenBank:AL591065.17.17
>
> and the place where the used Bio/Annotation/DBLink.pm resides.
>
> Can others repeat this?
>
> I have dug into the source a little and Bio::Annotation::DBLink  
> seems to
> be the place where this happens: it has a concatenation which leads to
> that repeated version number.
>
> It this something that I should fix "client-side", so to speak, or  
> is it
> worthwhile to add some logic to that concatenation to prevent this?
>
>
> Thanks,
>
> Eric
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From paul.boutros at utoronto.ca  Tue Oct 17 19:49:52 2006
From: paul.boutros at utoronto.ca (Paul Boutros)
Date: Tue, 17 Oct 2006 19:49:52 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <001401c6f21a$836f9fc0$15327e82@pyrimidine>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine>
Message-ID: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>

Hi Chris,

Yup, that's it.  I installed XML::SAX::ExpatXS (make test output  
below).  Should there be a note somewhere in the INSTALL docs saying  
basically what you just wrote?  Or maybe it's already there somewhere  
and I missed it.

Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks  
if DBD::mysql can be loaded, and if not doesn't run the test.  Since  
the file is only one-line long, here's the modified file rather than a  
patch:
################################################################
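BEGIN {
         # DBD::mysql is required; skip the whole file if it cannot be loaded
         eval {
                 require DBD::mysql;
                 };
         if ( $@ ) {
                 # 'use' runs at compile time whatever the condition says,
                 # so load Test::More and set the plan at run time instead
                 require Test::More;
                 Test::More::plan( skip_all => "DBD::mysql is not installed "
                         . "or is installed incorrectly - skipping "
                         . "BioDBSeqFeature_mysql.t" );
                 }
         }

system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 -dsn test";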
################################################################

And when I run it I get:
t/BioDBSeqFeature_mysql......skipped
         all skipped: DBD::mysql is not installed or is installed  
incorrectly - skipping BioDBSeqFeature_mysql.t

And for the overall make test:
All tests successful, 3 tests and 106 subtests skipped.
Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys =  
164.24 CPU)

Hope this helps,
Paul


Quoting Chris Fields <cjfields at uiuc.edu>:

> Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
> backend parser.  For some reason BLAST XML parsing doesn't work with that
> parser (it tries to verify the XML first before parsing, hence the DTD
> error).  I may try getting this to work again, but so far I haven't found an
> easy way to prevent XML verification via XML::SAX::Expat.
>
> There are two options: 1) install XML::SAX::ExpatXS (the better option),
> which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
> parser in the ParserDetails.ini file of your local XML::SAX installation
> to XML::SAX::PurePerl.
>
> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
> hasn't officially happened yet); the latter hasn't had significant
> development in about three years.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>> -----Original Message-----
>> From: Paul Boutros [mailto:paul.boutros at utoronto.ca]
>> Sent: Tuesday, October 17, 2006 1:00 PM
>> To: Chris Fields
>> Cc: bioperl-l at lists.open-bio.org
>> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2
>>
>> Hi Chris,
>>
>> Here it is:
>> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
>> 1..1337
>> ok 1
>>
>> -------------------- WARNING ---------------------
>> MSG: XML::SAX::Expat not currently supported; must have local copies
>> of NCBI DTD docs!
>> ---------------------------------------------------
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>
>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
>> does not exist
>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>> Handler couldn't resolve external entity at line 2, column 82, byte 104
>> error in processing external entity reference at line 2, column 82,
>> byte 104 at
>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
>> 187
>>
>> ---------------------------------------------------
>> not ok 2
>> # Failed test 2 in t/SearchIO.t at line 68
>> Can't call method "database_name" on an undefined value at
>> t/SearchIO.t line 69.
>>
>>
>> Quoting Chris Fields <cjfields at uiuc.edu>:
>>
>> > What do you get when you run the SearchIO.t test by itself using
>> > 'perl -I. t/SearchIO.t'?  It looks like something pretty catastrophic
>> > happened.
>> >
>> > Christopher Fields
>> > Postdoctoral Researcher - Switzer Lab
>> > Dept. of Biochemistry
>> > University of Illinois Urbana-Champaign
>> >
>> >
>> >> -----Original Message-----
>> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
>> >> Sent: Tuesday, October 17, 2006 11:57 AM
>> >> To: bioperl-l at lists.open-bio.org
>> >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
>> >>
>> >> Hi,
>> >> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
>> >> tests, the first seems to be just a result of me not having DBD::mysql
>> >> installed.
>> >> Paul
>> >>
>> >> Test Summary
>> >> ============
>> >>
>> >> Failed Test               Stat Wstat Total Fail  List of Failed
>> >> -------------------------------------------------------------------------------
>> >> t/BioDBSeqFeature_mysql.t               46   46  1-46
>> >> t/SearchIO.t                22  5632  1337 2671  2-1337
>> >> 2 tests and 106 subtests skipped.
>> >> Failed 2/236 test scripts. 1382/11688 subtests failed.
>> >> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =
>> >> 159.61 CPU)
>> >>
>> >> BioDBSeqFeature_mysql
>> >> =====================
>> >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
>> >> 1..46
>> >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC
>> >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t
>> >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8
>> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi
>> >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at
>> >> (eval 37) line 3.
>> >> Perhaps the DBD::mysql perl module hasn't been fully installed,
>> >> or perhaps the capitalisation of 'mysql' isn't right.
>> >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
>> >>   at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
>> >>
>> >> SearchIO
>> >> ========
>> >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
>> >> 1..1337
>> >> ok 1
>> >>
>> >> -------------------- WARNING ---------------------
>> >> MSG: XML::SAX::Expat not currently supported; must have local copies
>> >> of NCBI DTD docs!
>> >> ---------------------------------------------------
>> >>
>> >> -------------------- WARNING ---------------------
>> >> MSG: error in parsing a report:
>> >>
>> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
>> >> does not exist
>> >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>> >> Handler couldn't resolve external entity at line 2, column 82, byte 104
>> >> error in processing external entity reference at line 2, column 82,
>> >> byte 104 at
>> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
>> >> 187
>> >>
>> >> ---------------------------------------------------
>> >> not ok 2
>> >> # Failed test 2 in t/SearchIO.t at line 68
>> >> Can't call method "database_name" on an undefined value at
>> >> t/SearchIO.t line 69.
>> >>
>> >> ------------------------------
>> >>
>> >> Message: 10
>> >> Date: Tue, 17 Oct 2006 11:32:54 +0100
>> >> From: Sendu Bala <bix at sendu.me.uk>
>> >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2
>> >> To: bioperl-l at bioperl.org
>> >> Message-ID: <4534B156.4090501 at sendu.me.uk>
>> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>> >>
>> >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
>> >> See http://www.bioperl.org/wiki/Release_1.5.2 for
>> >> instructions on getting and testing this RC.
>> >>
>> >> Developers:
>> >>     This should be the last RC before release ~next monday. Now would
>> >>     be a good time for last minute documentation updates and additions.
>> >>
>> >> Users:
>> >>     Even though 1.5.2 is a 'developer' release, we consider it the most
>> >>     stable and capable version of Bioperl, and recommend that you use
>> >>     it in all but the most critical production environments. Please
>> >>     try it out and let us know of any problems or difficulties you run
>> >>     into.
>> >>
>> >>
>> >> Thank you,
>> >> Sendu.
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> Bioperl-l mailing list
>> >> Bioperl-l at lists.open-bio.org
>> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >
>> >
>>
>
>
>


From cjfields at uiuc.edu  Tue Oct 17 20:51:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 19:51:35 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine>
	<20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
Message-ID: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>

On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote:

> Hi Chris,
>
> Yup, that's it.  I installed XML::SAX::ExpatXS (make test output
> below).  Should there be a note somewhere in the INSTALL docs saying
> basically what you just wrote?  Or maybe it's already there somewhere
> and I missed it.

The INSTALL docs should have this, yes.  I'll double-check though.

Pretty much anything that plugs into XML::SAX except XML::SAX::Expat  
works (XML::LibXML also works, I found).
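
For anyone who would rather not edit the XML::SAX config file, here is a
minimal sketch of steering XML::SAX away from XML::SAX::Expat from inside
a script.  It assumes XML::SAX is installed; pick_sax_parser() is a
hypothetical helper of mine (not part of XML::SAX or BioPerl), while
XML::SAX->parsers and $XML::SAX::ParserPackage are the hooks described
in the XML::SAX documentation.  I have not checked this against
SearchIO's blastxml parser specifically.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first parser name in the list that is
# not XML::SAX::Expat, or undef if there is none.
sub pick_sax_parser {
    my @names = @_;
    for my $name (@names) {
        return $name unless $name eq 'XML::SAX::Expat';
    }
    return;
}

# Only consult XML::SAX if it is actually installed.
if ( eval { require XML::SAX; 1 } ) {
    # Each registered parser is described by a hash with a 'Name' key.
    my @installed = map { $_->{Name} } @{ XML::SAX->parsers };
    my $choice    = pick_sax_parser(@installed);
    if ( defined $choice ) {
        no warnings 'once';
        # Ask XML::SAX::ParserFactory to use this backend from now on.
        $XML::SAX::ParserPackage = $choice;
        print "SAX backend set to $choice\n";
    }
    else {
        print "No SAX backend other than XML::SAX::Expat is registered\n";
    }
}
```

Picking the first non-Expat entry is an arbitrary choice for the sketch;
a real script would probably prefer XML::SAX::ExpatXS explicitly when it
is present.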

> Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
> if DBD::mysql can be loaded, and if not doesn't run the test.  Since
> the file is only one-line long, here's the modified file rather than a
> patch:
> ################################################################
> BEGIN {
>          # DBD::mysql is required
>          eval {
>                  require DBD::mysql;
>                  };
>          if ( $@ ) {
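>                  # Load Test::More at run time: a 'use' here would run at
>                  # compile time and skip the file unconditionally.
>                  require Test::More;
>                  Test::More->import( skip_all =>
>                      "DBD::mysql is not installed or is installed incorrectly - skipping BioDBSeqFeature_mysql.t" );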
>                  exit(0);
>                  }
>          }
>
> system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1
> -dsn test";
> ################################################################
>
> And when I run it I get:
> t/BioDBSeqFeature_mysql......skipped
>          all skipped: DBD::mysql is not installed or is installed
> incorrectly - skipping BioDBSeqFeature_mysql.t
>
> And for the overall make test:
> All tests successful, 3 tests and 106 subtests skipped.
> Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys =
> 164.24 CPU)

It should check this when using 'perl Makefile.PL', since the tests  
are only set up if MySQL is present (so you would assume that it  
checks for DBD::mysql).  I'll look into it.

Chris

> Hope this helps,
> Paul
>
>
> Quoting Chris Fields <cjfields at uiuc.edu>:
>
>> Your local copy of XML::SAX has XML::SAX::Expat set as the default  
>> SAX
>> backend parser.  For some reason BLAST XML parsing doesn't work  
>> with that
>> parser (it tries to verify the XML first before parsing, hence the  
>> DTD
>> error).  I may try getting this to work again, but so far I  
>> haven't found an
>> easy way to prevent XML verification via XML::SAX::Expat.
>>
>> There are two options: 1) install XML::SAX::ExpatXS (the better option),
>> which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
>> parser in your local ParserDetails.ini file to XML::SAX::PurePerl.
>>
>> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it  
>> just
>> hasn't officially happened yet); the latter hasn't had significant
>> development in about three years.
>>
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>> -----Original Message-----
>>> From: Paul Boutros [mailto:paul.boutros at utoronto.ca]
>>> Sent: Tuesday, October 17, 2006 1:00 PM
>>> To: Chris Fields
>>> Cc: bioperl-l at lists.open-bio.org
>>> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2
>>>
>>> Hi Chris,
>>>
>>> Here it is:
>>> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
>>> 1..1337
>>> ok 1
>>>
>>> -------------------- WARNING ---------------------
>>> MSG: XML::SAX::Expat not currently supported; must have local copies
>>> of NCBI DTD docs!
>>> ---------------------------------------------------
>>>
>>> -------------------- WARNING ---------------------
>>> MSG: error in parsing a report:
>>>
>>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
>>> does not exist
>>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>>> Handler couldn't resolve external entity at line 2, column 82,  
>>> byte 104
>>> error in processing external entity reference at line 2, column 82,
>>> byte 104 at
>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm  
>>> line
>>> 187
>>>
>>> ---------------------------------------------------
>>> not ok 2
>>> # Failed test 2 in t/SearchIO.t at line 68
>>> Can't call method "database_name" on an undefined value at
>>> t/SearchIO.t line 69.
>>>
>>>
>>> Quoting Chris Fields <cjfields at uiuc.edu>:
>>>
>>>> What do you get when you run the SearchIO.t test by itself using
>>>> 'perl -I. t/SearchIO.t'?  It looks like something pretty catastrophic
>>>> happened.
>>>>
>>>> Christopher Fields
>>>> Postdoctoral Researcher - Switzer Lab
>>>> Dept. of Biochemistry
>>>> University of Illinois Urbana-Champaign
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>>>>> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
>>>>> Sent: Tuesday, October 17, 2006 11:57 AM
>>>>> To: bioperl-l at lists.open-bio.org
>>>>> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
>>>>>
>>>>> Hi,
>>>>> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two  
>>>>> failed
>>>>> tests, the first seems to be just a result of me not having  
>>>>> DBD::mysql
>>>>> installed.
>>>>> Paul
>>>>>
>>>>> Test Summary
>>>>> ============
>>>>>
>>>>> Failed Test               Stat Wstat Total Fail  List of Failed
>>>>> -------------------------------------------------------------------------------
>>>>> t/BioDBSeqFeature_mysql.t               46   46  1-46
>>>>> t/SearchIO.t                22  5632  1337 2671  2-1337
>>>>> 2 tests and 106 subtests skipped.
>>>>> Failed 2/236 test scripts. 1382/11688 subtests failed.
>>>>> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14  
>>>>> csys =
>>>>> 159.61 CPU)
>>>>>
>>>>> BioDBSeqFeature_mysql
>>>>> =====================
>>>>> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
>>>>> 1..46
>>>>> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC  
>>>>> (@INC
>>>>> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t
>>>>> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8
>>>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi
>>>>> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/ 
>>>>> site_perl) at
>>>>> (eval 37) line 3.
>>>>> Perhaps the DBD::mysql perl module hasn't been fully installed,
>>>>> or perhaps the capitalisation of 'mysql' isn't right.
>>>>> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
>>>>>   at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
>>>>>
>>>>> SearchIO
>>>>> ========
>>>>> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
>>>>> 1..1337
>>>>> ok 1
>>>>>
>>>>> -------------------- WARNING ---------------------
>>>>> MSG: XML::SAX::Expat not currently supported; must have local  
>>>>> copies
>>>>> of NCBI DTD docs!
>>>>> ---------------------------------------------------
>>>>>
>>>>> -------------------- WARNING ---------------------
>>>>> MSG: error in parsing a report:
>>>>>
>>>>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/ 
>>>>> NCBI_BlastOutput.dtd'
>>>>> does not exist
>>>>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>>>>> Handler couldn't resolve external entity at line 2, column 82,  
>>>>> byte 104
>>>>> error in processing external entity reference at line 2, column  
>>>>> 82,
>>>>> byte 104 at
>>>>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/ 
>>>>> Parser.pm line
>>>>> 187
>>>>>
>>>>> ---------------------------------------------------
>>>>> not ok 2
>>>>> # Failed test 2 in t/SearchIO.t at line 68
>>>>> Can't call method "database_name" on an undefined value at
>>>>> t/SearchIO.t line 69.
>>>>>
>>>>> ------------------------------
>>>>>
>>>>> Message: 10
>>>>> Date: Tue, 17 Oct 2006 11:32:54 +0100
>>>>> From: Sendu Bala <bix at sendu.me.uk>
>>>>> Subject: [Bioperl-l] Bioperl 1.5.2 RC2
>>>>> To: bioperl-l at bioperl.org
>>>>> Message-ID: <4534B156.4090501 at sendu.me.uk>
>>>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>>>>
>>>>> Bioperl 1.5.2 Release Candidate 2 is ready and available for  
>>>>> testing.
>>>>> See http://www.bioperl.org/wiki/Release_1.5.2 for
>>>>> instructions on getting and testing this RC.
>>>>>
>>>>> Developers:
>>>>>     This should be the last RC before release ~next monday. Now would
>>>>>     be a good time for last minute documentation updates and additions.
>>>>>
>>>>> Users:
>>>>>     Even though 1.5.2 is a 'developer' release, we consider it  
>>>>> the most
>>>>>     stable and capable version of Bioperl, and recommend that  
>>>>> you use
>>>>>     it in all but the most critical production environments.  
>>>>> Please
>>>>>     try it out and let us know of any problems or difficulties  
>>>>> you run
>>>>>     into.
>>>>>
>>>>>
>>>>> Thank you,
>>>>> Sendu.
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>>
>>>
>>
>>
>>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Wed Oct 18 02:52:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 07:52:05 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4535CF15.4090502@sendu.me.uk>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
> See http://www.bioperl.org/wiki/Release_1.5.2 for
> instructions on getting and testing this RC.
> 
> Developers:
>    This should be the last RC before release ~next monday. Now would
>    be a good time for last minute documentation updates and additions.

Given the few issues that have come up, it would be prudent to have 
another RC, so expect one around the time the 'Needs investigation' 
issues on the release page have been solved.

If you think there are more things that need investigation, please add 
them, but note the bias toward things that affect the successful 
completion of the test suite as opposed to general bugs which should go 
to Bugzilla as normal.


From bix at sendu.me.uk  Wed Oct 18 04:55:21 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 09:55:21 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45350BA6.3040102@genomics.dk>
References: <4534B156.4090501@sendu.me.uk>
	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>
	<45350BA6.3040102@genomics.dk>
Message-ID: <4535EBF9.1090706@sendu.me.uk>

Niels Larsen wrote:

> ------------ EBI
> 
> I invoked the EBI script
> 
> http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip
> 
> like this
> 
> WSWUBlastClient.pl -p blastn -D embl test.fasta
> 
> where the content of test.fasta is below, and got
> 
> Can't find method element in the message at 
> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

As you admit, this is not a Bioperl issue. I would suggest you contact 
EBI support.

In the meantime, or as an alternative, I'd suggest investigating the Bioperl
interface to the SOAP server, which is part of the Bioperl-run package.

http://doc.bioperl.org/releases/bioperl-current/bioperl-run/Bio/Tools/Run/Analysis.html


> ------------ DDBJ
> 
> Inspired by this page,
> 
> http://xml.nig.ac.jp/doc/Blast.txt
> 
> I made this test script
[snip]
> which for me prints undef.

Again, not something I can really help you with. You'll need to 
triple-check your code and then seek support from the providers of that 
SOAP service.


> ------------- NCBI/Bioperl
> 
> I installed 1.5.2-RC2, looked at the RemoteBlast example in
> 
> http://www.bioperl.org/wiki/Bptutorial.pl
> 
> and then put that into this test code, more or less cut/paste,
[snip]
> Maybe I am supposed to add a check for content in $rc and then stop
> the inner loop?

Yes, the wiki page example isn't really adequate. I'll update it. For a 
better code example see the RemoteBlast documentation:

http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html


> I could figure that out maybe, but I wish there was a
> function which simply takes a single sequence + arguments and only
> returns a list of matches when done, and does not return until then
> (or until a specified timeout).

Yes, I hardly find dealing with RIDs that pleasant. You might like to 
add a feature request to Bugzilla.
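
Until something like that exists, here is a minimal sketch of the
"block until done or timed out" wrapper being asked for.  Everything in
it is hypothetical: wait_for_result() is not a Bioperl function, and the
poll callback is assumed to follow the convention in the RemoteBlast
docs (a reference when the result is ready, 0 while still waiting, a
negative number on error).  With RemoteBlast the callback would
presumably wrap retrieve_blast() for one RID.

```perl
use strict;
use warnings;

# Hypothetical helper: call $arg{poll} repeatedly until it returns a
# reference (the finished result), reports an error (negative number),
# or $arg{timeout} seconds have passed.  Returns undef on timeout.
sub wait_for_result {
    my %arg      = @_;
    my $deadline = time + $arg{timeout};
    while (1) {
        my $rc = $arg{poll}->();
        return $rc if ref $rc;    # finished: hand back the result
        die "Remote job reported an error (code $rc)\n"
            if defined $rc && $rc < 0;
        return if time >= $deadline;    # gave up waiting
        sleep $arg{interval} if $arg{interval};
    }
}

# Stand-in for a remote service: "not ready" twice, then a result.
my @replies = ( 0, 0, { hits => 3 } );
my $result  = wait_for_result(
    poll     => sub { shift @replies },
    timeout  => 10,
    interval => 0,
);
print defined $result ? "got $result->{hits} hits\n" : "timed out\n";
```

Passing the polling step in as a callback keeps the waiting logic
independent of any particular service, so the same loop could sit in
front of a different remote job interface.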


From n.haigh at sheffield.ac.uk  Wed Oct 18 05:58:00 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 10:58:00 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
Message-ID: <4535FAA8.2050506@sheffield.ac.uk>

I get all tests passing except for BioDBSeqFeature_mysql which fails all
tests (1-46).

During perl Makefile.PL I get:
"I see you have Berkeleydb installed. I will create the DBD tests for
Bio::DB::SeqFeature::Store..."

I notice that under "Needs investigation" there is a mention of tests
being generated even if DBD::mysql isn't installed. I assume this is the
problem? If so, should DBD::mysql be added to the dependencies in
Makefile.PL?

Is there an easy way to find out what tests are being skipped due to
absent modules?

Cheers
Nath


From n.haigh at sheffield.ac.uk  Wed Oct 18 07:34:21 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 12:34:21 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4535EBF9.1090706@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>
	<4535EBF9.1090706@sendu.me.uk>
Message-ID: <4536113D.1080307@sheffield.ac.uk>

I've just added test results for 1.5.2 RC2 to the wiki.

There are lots of failures for packages other than bioperl-live. I'm not
sure exactly how the test failures/skips are/should be handled, since my
setups are as follows.

Clean WinXP Pro:
This is a clean install of WinXP Pro SP2 with no major software
installed, other than ActivePerl 5.8.8.819 and a few tools for archive
extracting, anti virus etc. Therefore, I'm unsure how tests in
bioperl-network and bioperl-db should return. For example, I have made
no effort to setup biosql-schema but I thought that maybe there would be
a test that would detect this, and fail, then skip over other tests
gracefully - like the bioperl-run tests when a piece of software is not
installed???

Debian Linux:
This is a Bio-Linux machine with quite a lot of bioinformatics software
installed in the Path. So most of the tests in bioperl-run should
probably have passed. The same goes for bioperl-network and bioperl-db
as with my Windows setup.

If my thoughts are totally wrong - let me know!
Nath


From bix at sendu.me.uk  Wed Oct 18 08:03:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 13:03:11 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <4535FAA8.2050506@sheffield.ac.uk>
References: <4535FAA8.2050506@sheffield.ac.uk>
Message-ID: <453617FF.9080508@sendu.me.uk>

Nathan Haigh wrote:
> I get all tests passing except for BioDBSeqFeature_mysql which fails all
> tests (1-46).
> 
> During perl Makefile.PL I get:
> "I see you have Berkeleydb installed. I will create the DBD tests for
> Bio::DB::SeqFeature::Store..."
> 
> I notice under the "needs investigation" there is mention about tests
> being generated even if DBD::mysql isn't installed. I assume this is the
> problem? 

Probably. I'm looking into it. Not sure why it wasn't causing a problem 
before now.

 > If this is the problem should DBD::mysql be added to the
 > dependencies in Makefile.PL?

No. You can use the modules in question without mysql (presumably; i.e.
you have a different sql setup), so it makes no sense to warn people
they don't have a module they absolutely do not need.


> Is there an easy way to find out what tests are being skipped due to
> absent modules?

Ideally, when the skip occurs the test script will issue a message. I 
think that happens in most, if not all cases.


From bix at sendu.me.uk  Wed Oct 18 09:02:50 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 14:02:50 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <453617FF.9080508@sendu.me.uk>
References: <4535FAA8.2050506@sheffield.ac.uk> <453617FF.9080508@sendu.me.uk>
Message-ID: <453625FA.6090907@sendu.me.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
...
>> I notice under the "needs investigation" there is mention about tests
>> being generated even if DBD::mysql isn't installed. I assume this is the
>> problem? 
> 
> Probably. I'm looking into it. Not sure why it wasn't causing a problem 
> before now.
> 
>  > If this is the problem should DBD::mysql be added to the
>  > dependencies in Makefile.PL?
> 
> No. You can use the modules in question without mysql (presumably; ie. 
> you have a different sql setup), so it makes no sense to warn people 
> they don't have a module they absolutely do not need.

Oops. It /is/ in the pre-reqs in Makefile.PL. Maybe DBD::mysql is the 
only supported driver?


From bix at sendu.me.uk  Wed Oct 18 09:16:24 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 14:16:24 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine>	<20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
	<67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>
Message-ID: <45362928.8070104@sendu.me.uk>

Chris Fields wrote:
> On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote:
> 
>> Hi Chris,
>>
>> Yup, that's it.  I installed XML::SAX::ExpatXS (make test output
>> below).  Should there be a note somewhere in the INSTALL docs saying
>> basically what you just wrote?  Or maybe it's already there somewhere
>> and I missed it.
> 
> The INSTALL docs should have this, yes.  I'll double-check though.
> 
> Pretty much anything that plugs into XML::SAX except XML::SAX::Expat  
> works (XML::LibXML also works, I found).
> 
>> Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
>> if DBD::mysql can be loaded,
[snip]
> It should check this when using 'perl Makefile.PL', since the tests  
> are only set up if MySQL is present (so you would assume that it  
> checks for DBD::mysql).  I'll look into it.

This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in 
my t directory when I packed it up for release.

I'm tweaking Makefile.PL right now in any case; there are a few errors 
and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean.


From cjfields at uiuc.edu  Wed Oct 18 09:55:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 08:55:37 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
Message-ID: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>

Ding dong the witch is dead!  As announced previously, from the latest
GenBank release (156.0):

-----------------------------------------------

1.3.8 Feature location syntax X.Y no longer supported

  The Feature Table has supported feature locations of the form 'X.Y', to
represent a base position which is greater or equal to X, and less than or
equal to Y. For example:

	misc_feature    1.10..20
	misc_feature    join(100..150,200.210..250)

  In the first example, the misc_feature starts somewhere between bases 1
and 10 (inclusive), and ends at basepair 20. In the second, the 51 bases
from 100..150 are joined together with a second basepair interval, which
could be anywhere from 200..250 to 210..250 .

  Although this syntax seems like a reasonable way to capture an uncertain
interval, it is used for features on a vanishingly small number of sequence
records, most database submission mechanisms don't support it, and the
meaning of its use in a join() context is not entirely clear.

  As of October 2006, this type of location is no longer supported.
Those records with features which utilize X.Y locations will be reviewed and
converted to a non-uncertain format.

-----------------------------------------------

EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
Not sure about UniProt/SwissProt.

I guess we're keeping this in for backwards compatibility only, but how do
we handle any bugs that pop up related to this?  
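
For reference while we still carry this around, here is a minimal sketch
in plain Perl of what the X.Y syntax means for a simple start..end
range.  It is not the Bioperl location parser: parse_position() and
parse_range() are hypothetical names, the 'WITHIN'/'EXACT' labels are
just my own tags for the sketch, and join(), complement() and the other
location operators are deliberately not handled.

```perl
use strict;
use warnings;

# Parse a single position: '20' (exact) or '1.10' (one base somewhere
# in the inclusive interval 1..10).
sub parse_position {
    my ($pos) = @_;
    if ( $pos =~ /\A(\d+)\.(\d+)\z/ ) {
        return { min => $1, max => $2, type => 'WITHIN' };
    }
    if ( $pos =~ /\A(\d+)\z/ ) {
        return { min => $1, max => $1, type => 'EXACT' };
    }
    die "Unsupported position syntax: '$pos'\n";
}

# Parse a simple 'start..end' range into its two positions.
sub parse_range {
    my ($range) = @_;
    my ( $start, $end ) = split /\.\./, $range, 2;
    die "Not a start..end range: '$range'\n" unless defined $end;
    return {
        start => parse_position($start),
        end   => parse_position($end),
    };
}

# First example from the release notes: starts within 1..10, ends at 20.
my $loc = parse_range('1.10..20');
printf "start %d-%d (%s), end %d\n",
    $loc->{start}{min}, $loc->{start}{max},
    $loc->{start}{type}, $loc->{end}{min};
```

Keeping min and max for every position means an exact position is just
the degenerate case where the two are equal, which is one way to retain
the old records without a separate code path.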

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Wed Oct 18 10:10:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 09:10:07 -0500
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <453617FF.9080508@sendu.me.uk>
Message-ID: <001f01c6f2bf$20737270$15327e82@pyrimidine>

> Nathan Haigh wrote:
> > I get all tests passing except for BioDBSeqFeature_mysql which fails all
> > tests (1-46).
> >
> > During perl Makefile.PL I get:
> > "I see you have Berkeleydb installed. I will create the DBD tests for
> > Bio::DB::SeqFeature::Store..."
> >
> > I notice under the "needs investigation" there is mention about tests
> > being generated even if DBD::mysql isn't installed. I assume this is the
> > problem?
> 
> Probably. I'm looking into it. Not sure why it wasn't causing a problem
> before now.

Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP
because 'perl Makefile.PL' doesn't detect my MySQL installation, so the
MySQL-based tests don't run even though I have DBD::mysql installed.  I
thought this might just be a WinXP issue, but apparently not.  If I can get
to it I'll run a few checks.

>  > If this is the problem should DBD::mysql be added to the
>  > dependencies in Makefile.PL?
> 
> No. You can use the modules in question without mysql (presumably; ie.
> you have a different sql setup), so it makes no sense to warn people
> they don't have a module they absolutely do not need.

Agreed, though I don't know whether other relational DBs, such as
PostgreSQL, are supported.

> > Is there an easy way to find out what tests are being skipped due to
> > absent modules?
> 
> Ideally, when the skip occurs the test script will issue a message. I
> think that happens in most, if not all cases.

Yes, though we may run into the same issue we had with XEMBL tests not
reporting the reasons it skipped.  Each test suite should run an eval{} to
check the required modules, then only skip blocks of tests that rely on
those modules.  I think we have caught most of those, but who knows w/o
doing a complete test suite audit?
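
To make that concrete, here is a minimal sketch of the pattern using
Test::More's SKIP blocks.  module_available() is a hypothetical helper,
and the two tests inside the block are placeholders rather than anything
from our test suite; the point is only that the reason for the skip ends
up in the test output and the unrelated tests still run.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical helper: true if the named module can be loaded.
sub module_available {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Tests with no optional requirements always run.
ok( 1, 'test that needs no optional module' );

SKIP: {
    # Skip exactly the tests in this block, and say why.
    skip 'DBD::mysql not installed', 2
        unless module_available('DBD::mysql');

    # Placeholder tests that only make sense with the module loaded.
    ok( DBD::mysql->can('driver'),   'DBD::mysql provides driver()' );
    ok( defined DBD::mysql->VERSION, 'DBD::mysql reports a version' );
}
```

The skip count has to match the number of tests in the block so that
the plan still adds up whether or not the module is there.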

Our eventual complete switchover to Test::More should hopefully clean these
up.  I don't consider it a pressing issue for this release, though Sendu may
feel differently.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Wed Oct 18 10:12:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 09:12:52 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45362928.8070104@sendu.me.uk>
Message-ID: <002001c6f2bf$807849c0$15327e82@pyrimidine>

...
> This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in
> my t directory when I packed it up for release.
> 
> I'm tweaking Makefile.PL right now in any case; there are a few errors
> and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean.

Okay, makes sense now.  No big deal, it's still an RC (a developer's RC at
that!).

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From n.haigh at sheffield.ac.uk  Wed Oct 18 10:17:35 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 15:17:35 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <001f01c6f2bf$20737270$15327e82@pyrimidine>
References: <001f01c6f2bf$20737270$15327e82@pyrimidine>
Message-ID: <4536377F.6000408@sheffield.ac.uk>

Chris Fields wrote:
>> Nathan Haigh wrote:
>>     
>>> I get all tests passing except for BioDBSeqFeature_mysql which fails all
>>> tests (1-46).
>>>
>>> During perl Makefile.PL I get:
>>> "I see you have Berkeleydb installed. I will create the DBD tests for
>>> Bio::DB::SeqFeature::Store..."
>>>
>>> I notice under the "needs investigation" there is mention about tests
>>> being generated even if DBD::mysql isn't installed. I assume this is the
>>> problem?
>>>       
>> Probably. I'm looking into it. Not sure why it wasn't causing a problem
>> before now.
>>     
>
> Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP
> because 'perl Makefile.PL' doesn't detect my MySQL installation, so the
> MySQL-based tests don't run even though I have DBD::mysql installed.  I
> thought this might just be a WinXP issue, but apparently not.  If I can get
> to it I'll run a few checks.
>
>   

This was on WinXP.

>>  > If this is the problem should DBD::mysql be added to the
>>  > dependencies in Makefile.PL?
>>
>> No. You can use the modules in question without mysql (presumably; ie.
>> you have a different sql setup), so it makes no sense to warn people
>> they don't have a module they absolutely do not need.
>>     
>
> Agreed, though I don't know if other relational DB's are supported like
> PostgreSQL.
>
>   
>>> Is there an easy way to find out what tests are being skipped due to
>>> absent modules?
>>>       
>> Ideally, when the skip occurs the test script will issue a message. I
>> think that happens in most, if not all cases.
>>     
>
> Yes, though we may run into the same issue we had with XEMBL tests not
> reporting the reasons it skipped.  Each test suite should run an eval{} to
> check the required modules, then only skip blocks of tests that rely on
> those modules.  I think we have caught most of those, but who knows w/o
> doing a complete test suite audit?
>
> Our eventual complete switchover to Test::More should hopefully clean these
> up.  I don't consider it a pressing issue for this release, though Sendu may
> feel differently.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>
>   


From hlapp at gmx.net  Wed Oct 18 10:36:31 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 18 Oct 2006 10:36:31 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>
References: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>
Message-ID: <B8036AB0-741F-427A-9EB1-7E80A28EC79F@gmx.net>


On Oct 18, 2006, at 9:55 AM, Chris Fields wrote:

> how do we handle any bugs that pop up related to this?

By an evil grin, followed by deflecting the blame to NCBI, followed  
by another evil grin.
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Oct 18 10:43:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 09:43:31 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
In-Reply-To: <B8036AB0-741F-427A-9EB1-7E80A28EC79F@gmx.net>
Message-ID: <002401c6f2c3$c83c7e30$15327e82@pyrimidine>

> On Oct 18, 2006, at 9:55 AM, Chris Fields wrote:
> 
> > how do we handle any bugs that pop up related to this?
> 
> By an evil grin, followed by deflecting the blame to NCBI, followed
> by another evil grin.
> --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Sounds good to me!  One less thing to worry about.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From n.haigh at sheffield.ac.uk  Wed Oct 18 10:45:57 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 15:45:57 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4536113D.1080307@sheffield.ac.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>	<4535EBF9.1090706@sendu.me.uk>
	<4536113D.1080307@sheffield.ac.uk>
Message-ID: <45363E25.8010806@sheffield.ac.uk>

Nathan Haigh wrote:
> I've just added test results for 1.5.2 RC2 to the wiki.
>
> There are lots of fails for packages other than bioperl-live. I'm not
> sure exactly how the test fails/skips are/should be handled since my
> setups are as follows.
>
> Clean WinXP Pro:
> This is a clean install of WinXP Pro SP2 with no major software
> installed, other than ActivePerl 5.8.8.819 and a few tools for archive
> extracting, anti virus etc. Therefore, I'm unsure how tests in
> bioperl-network and bioperl-db should return. For example, I have made
> no effort to setup biosql-schema but I thought that maybe there would be
> a test that would detect this, and fail, then skip over other tests
> gracefully - like the bioperl-run tests when a piece of software is not
> installed???
>
> Debian Linux:
> This is a Bio-Linux machine with quite a lot of bioinformatics software
> installed in the Path. So most of the tests in bioperl-run should
> probably have passed. The same goes for bioperl-network and bioperl-db
> as with my Windows setup.
>
> If my thoughts are totally wrong - let me know!
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>   
Just looking into the failed Linux tests.

Several of the tests result in errors like:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Alignment::Exonerate::AUTOLOAD
/home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:126
STACK: Bio::Tools::Run::Alignment::Exonerate::new
/home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:154
STACK: t/Exonerate.t:32
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: 'arguments' !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Hmmer::AUTOLOAD Bio/Tools/Run/Hmmer.pm:172
STACK: Bio::Tools::Run::Hmmer::_run Bio/Tools/Run/Hmmer.pm:253
STACK: Bio::Tools::Run::Hmmer::run Bio/Tools/Run/Hmmer.pm:228
STACK: t/Hmmer.t:54
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Phrap::AUTOLOAD Bio/Tools/Run/Phrap.pm:137
STACK: Bio::Tools::Run::Phrap::new Bio/Tools/Run/Phrap.pm:165
STACK: t/Phrap.t:34
-----------------------------------------------------------

Any ideas??

Nath


From hlapp at gmx.net  Wed Oct 18 10:51:36 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 18 Oct 2006 10:51:36 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4536113D.1080307@sheffield.ac.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>
	<4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk>
Message-ID: <E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>


On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote:

>  For example, I have made
> no effort to setup biosql-schema but I thought that maybe there  
> would be
> a test that would detect this

I'm afraid there isn't. Bioperl-db is meaningless without biosql-schema.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bosborne11 at verizon.net  Wed Oct 18 10:43:06 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 10:43:06 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
 GenBank/EMBL/DDBJ
In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>
Message-ID: <C15BB5BA.ADAA%bosborne11@verizon.net>

Chris,

I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all
of the more recent examples in t/LocationFactory.t come from there.

Brian O.


On 10/18/06 9:55 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
> Not sure about UniProt/SwissProt.


From cjfields at uiuc.edu  Wed Oct 18 11:00:30 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 10:00:30 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
In-Reply-To: <C15BB5BA.ADAA%bosborne11@verizon.net>
Message-ID: <002501c6f2c6$27625540$15327e82@pyrimidine>

Do they still use the X.Y notations?  Those are the most troublesome.  I
guess we still don't support the ones containing '?'.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: Brian Osborne [mailto:bosborne11 at verizon.net]
> Sent: Wednesday, October 18, 2006 9:43 AM
> To: Chris Fields; bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in
> GenBank/EMBL/DDBJ
> 
> Chris,
> 
> I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all
> of the more recent examples in t/LocationFactory.t come from there.
> 
> Brian O.
> 
> 
> On 10/18/06 9:55 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:
> 
> > EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
> > Not sure about UniProt/SwissProt.


From Kevin.M.Brown at asu.edu  Wed Oct 18 11:16:50 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 18 Oct 2006 08:16:50 -0700
Subject: [Bioperl-l] Blast information
Message-ID: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>

I just recently upgraded to 1.5.1 on WinXP to bring this version closer
to live to parse some locally created blast files.  I'm trying to find
the method that returns the values that are underneath the Identities
and Positives information as I'm trying to replicate the output of an
old blast parser we have here written in RealBasic which is showing its
age.  Once I have it replicating the old output I then intend to add
more features in terms of filtering returned hits (like not returning
self->self hits or a->b so don't show b->a).

Example:
I'm looking for the methods that will return 117 from identities and 117
from positives.  I can't just use num_identical/percent_identity as that
isn't 100% accurate.

>BurkM_2016
          Length = 241

 Score = 43.2 bits (88), Expect = 7e-005
 Identities = 26/117 (22%), Positives = 51/117 (43%)

Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL 357
           Q   F  F  + A+    ++ +         + + L +R   GL   + P   E + A+L
Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL 170

Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
              A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227

Thanks,
Kevin


From cjfields at uiuc.edu  Wed Oct 18 11:25:59 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 10:25:59 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4536113D.1080307@sheffield.ac.uk>
Message-ID: <002601c6f2c9$b6d04a90$15327e82@pyrimidine>

> I've just added test results for 1.5.2 RC2 to the wiki.
> 
> There are lots of fails for packages other than bioperl-live. I'm not
> sure exactly how the test fails/skips are/should be handled since my
> setups are as follows.
> 
> Clean WinXP Pro:
> This is a clean install of WinXP Pro SP2 with no major software
> installed, other than ActivePerl 5.8.8.819 and a few tools for archive
> extracting, anti virus etc. Therefore, I'm unsure how tests in
> bioperl-network and bioperl-db should return. For example, I have made
> no effort to setup biosql-schema but I thought that maybe there would be
> a test that would detect this, and fail, then skip over other tests
> gracefully - like the bioperl-run tests when a piece of software is not
> installed???
> 
> Debian Linux:
> This is a Bio-Linux machine with quite a lot of bioinformatics software
> installed in the Path. So most of the tests in bioperl-run should
> probably have passed. The same goes for bioperl-network and bioperl-db
> as with my Windows setup.
> 
> If my thoughts are totally wrong - let me know!
> Nath

The bioperl-db tests rely on a local BioSQL database and on having a
properly set up configuration file (these are detailed in the bioperl-db
INSTALL doc).  Furthermore, there are serious problems with bioperl-db and
WinXP (see Bug 1938 in bugzilla).  There is a workaround, but it isn't
perfect by any means.  

http://bugzilla.open-bio.org/show_bug.cgi?id=1938

Many of the bioperl-run tests rely on env. variables being set properly, so
maybe that's why they failed.  These should all be detailed in the INSTALL
file (but maybe they aren't?).

I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac OS
X yet but intended on doing this within the week.  The INSTALL file details
the requirements for the packages (Graph 0.80 is the only one for
bioperl-network, for instance, and there isn't a PPM for that version
available yet).  

It would be nice to skip the tests based on absence of the particular
modules or installed programs, and I think the final goal is to possibly
attempt to do this.  However, all of the bioperl-related distributions have
their own documentation which outlines their installation, requirements, and
use.  At least we can point to that, which works for now.  We could always
start up a wiki page for the various bioperl distributions to monitor
problems or issues with each based on OS, proposed enhancements/ideas, etc.
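
A minimal sketch of the skip-on-missing-dependency idea, assuming
Test::More is acceptable in the test suite.  The helper name
module_available() is hypothetical and only for illustration; Graph 0.80
is the bioperl-network requirement mentioned above.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical helper: true if $module loads and, when a minimum
# version is given, is at least that version.
sub module_available {
    my ($module, $min_version) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval {
        require $file;
        $module->VERSION($min_version) if defined $min_version;
        1;
    } ? 1 : 0;
}

SKIP: {
    # Skip (rather than fail) both tests when the dependency is absent.
    skip 'Graph 0.80 or later is not installed', 2
        unless module_available('Graph', '0.80');

    my $g = Graph->new;
    $g->add_edge('a', 'b');
    ok($g->has_vertex('a'), 'vertex added via edge');
    ok($g->has_edge('a', 'b'), 'edge present');
}
```

With this layout a missing or outdated Graph shows up as skipped tests in
the harness summary instead of as failures.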


Also, most (if not all, including core) have been primarily tested on some
*nix-related system, which means that they may not work on Win32 systems.
Though the Windows support is light-years ahead of what it used to be circa
rel 0.7, I don't think it is foolproof yet, as witnessed by the bioperl-db
bug.  Frankly, we need more WinXP users for those packages willing to test
them out and offer suggestions.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bosborne11 at verizon.net  Wed Oct 18 11:13:51 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 11:13:51 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
 GenBank/EMBL/DDBJ
In-Reply-To: <002501c6f2c6$27625540$15327e82@pyrimidine>
Message-ID: <C15BBCEF.ADB8%bosborne11@verizon.net>

Chris,

No, I don't think they use the form X.Y. As the excerpt below from
t/LocationFactory.t shows, we do support most of the forms using '?'.
These tests are supposed to cover all of the possible fuzzy locations
encountered in Swissprot; I wrote them a year or so ago.

Brian O.


        # UNCERTAIN locations and positions (Swissprot)
   "?2465..2774" => [$fuzzy_impl,
       2465, 2465, "UNCERTAIN", 2774, 2774, "EXACT", "EXACT", 1, 1],
   "22..?64" => [$fuzzy_impl,
       22, 22, "EXACT", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
   "?22..?64" => [$fuzzy_impl,
       22, 22, "UNCERTAIN", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
   "?..>393" => [$fuzzy_impl,
       undef, undef, "UNCERTAIN", 393, undef, "AFTER", "UNCERTAIN", 1, 1],
   "<1..?" => [$fuzzy_impl,
       undef, 1, "BEFORE", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
   "?..536" => [$fuzzy_impl,
       undef, undef, "UNCERTAIN", 536, 536, "EXACT", "UNCERTAIN", 1, 1],
   "1..?" => [$fuzzy_impl,
       1, 1, "EXACT", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
   "?..?" => [$fuzzy_impl,
       undef, undef, "UNCERTAIN", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
   # Not working yet:
   #"12..?1" => [$fuzzy_impl,
   #    1, 1, "UNCERTAIN", 12, 12, "EXACT", "EXACT", 1, 1]


On 10/18/06 11:00 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> Do they still use the X.Y notations?  Those are the most troublesome.  I
> guess we still don't support the ones containing '?'.
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
>> -----Original Message-----
>> From: Brian Osborne [mailto:bosborne11 at verizon.net]
>> Sent: Wednesday, October 18, 2006 9:43 AM
>> To: Chris Fields; bioperl-l at lists.open-bio.org
>> Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in
>> GenBank/EMBL/DDBJ
>> 
>> Chris,
>> 
>> I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all
>> of the more recent examples in t/LocationFactory.t come from there.
>> 
>> Brian O.
>> 
>> 
>> On 10/18/06 9:55 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:
>> 
>>> EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
>>> Not sure about UniProt/SwissProt.
> 
> 


From cjfields at uiuc.edu  Wed Oct 18 12:56:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 11:56:07 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <002601c6f2c9$b6d04a90$15327e82@pyrimidine>
Message-ID: <000401c6f2d6$5144e2f0$15327e82@pyrimidine>

All,

...
> I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac
> OS X yet but intended on doing this within the week.  The INSTALL file
> details the requirements for the packages (Graph 0.80 is the only one for
> bioperl-network, for instance, and there isn't a PPM for that version
> available yet).
...

As a follow-up to this, I tried bioperl-network and had similar failed tests
with Graph 0.79 (the only PPM available from ActiveState).  However, the
INSTALL docs state that Graph 0.80 is needed, and the test run gave several
warnings about not having Graph 0.80 installed. 

I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and
everything passed.  Maybe we need to have a Graph PPM available for those
who want bioperl-network?

As for bioperl-run, all tests passed from a new CVS checkout even though I
have none of the programs installed, so they seem to skip properly.  The
test run also printed warnings when a program wasn't available or installed.


Chris


From bosborne11 at verizon.net  Wed Oct 18 13:10:34 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 13:10:34 -0400
Subject: [Bioperl-l] Blast information
In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
Message-ID: <C15BD84A.ADCC%bosborne11@verizon.net>

Kevin,

Are you looking for hsp_length()? See the SearchIO HOWTO for a list of
methods:

http://www.bioperl.org/wiki/HOWTO:SearchIO
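
A rough sketch of how the numbers on the Identities/Positives line could
be reproduced with the methods listed in the HOWTO.  It assumes BioPerl is
installed and that num_conserved() corresponds to the "Positives" count
for this kind of report; the file name 'report.bls' is only a placeholder.

```perl
use strict;
use warnings;
use Bio::SearchIO;

# Path to a BLAST report; 'report.bls' is a placeholder default.
my $file = @ARGV ? $ARGV[0] : 'report.bls';

my $in = Bio::SearchIO->new(-format => 'blast', -file => $file);

while (my $result = $in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            # hsp_length() is the total alignment length, i.e. the
            # denominator printed after Identities and Positives.
            printf "%s vs %s: Identities = %d/%d, Positives = %d/%d\n",
                $result->query_name, $hit->name,
                $hsp->num_identical, $hsp->hsp_length,
                $hsp->num_conserved, $hsp->hsp_length;
        }
    }
}
```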


Brian O.


On 10/18/06 11:16 AM, "Kevin Brown" <Kevin.M.Brown at asu.edu> wrote:

> I just recently upgraded to 1.5.1 on WinXP to bring this version closer
> to live to parse some locally created blast files.  I'm trying to find
> the method that returns the values that are underneath the Identities
> and Positives information as I'm trying to replicate the output of an
> old blast parser we have here written in RealBasic which is showing its
> age.  Once I have it replicating the old output I then intend to add
> more features in terms of filtering returned hits (like not returning
> self->self hits or a->b so don't show b->a).
> 
> Example:
> I'm looking for the methods that will return 117 from identities and 117
> from positives.  I can't just use num_identical/percent_identity as that
> isn't 100% accurate.
> 
>> BurkM_2016
>           Length = 241
> 
>  Score = 43.2 bits (88), Expect = 7e-005
>  Identities = 26/117 (22%), Positives = 51/117 (43%)
> 
> Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL
> 357
>            Q   F  F  + A+    ++ +         + + L +R   GL   + P   E + A+L
> Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL
> 170
> 
> Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
>               A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
> Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227
> 
> Thanks,
> Kevin
> 


From Kevin.M.Brown at asu.edu  Wed Oct 18 17:25:48 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 18 Oct 2006 14:25:48 -0700
Subject: [Bioperl-l] Blast information
Message-ID: <1A4207F8295607498283FE9E93B775B4022A71C3@EX02.asurite.ad.asu.edu>

Yes, that does indeed look like what I was after. 

> -----Original Message-----
> From: Brian Osborne [mailto:bosborne11 at verizon.net] 
> Sent: Wednesday, October 18, 2006 10:11 AM
> To: Kevin Brown; bioperl-l
> Subject: Re: [Bioperl-l] Blast information
> 
> Kevin,
> 
> Are you looking for hsp_length()? See the SearchIO HOWTO for a list of
> methods:
> 
> http://www.bioperl.org/wiki/HOWTO:SearchIO
> 
> 
> Brian O.
> 
> 
> On 10/18/06 11:16 AM, "Kevin Brown" <Kevin.M.Brown at asu.edu> wrote:
> 
> > I just recently upgraded to 1.5.1 on WinXP to bring this 
> version closer
> > to live to parse some locally created blast files.  I'm 
> trying to find
> > the method that returns the values that are underneath the 
> Identities
> > and Positives information as I'm trying to replicate the 
> output of an
> > old blast parser we have here written in RealBasic which is 
> showing its
> > age.  Once I have it replicating the old output I then intend to add
> > more features in terms of filtering returned hits (like not 
> returning
> > self->self hits or a->b so don't show b->a).
> > 
> > Example:
> > I'm looking for the methods that will return 117 from 
> identities and 117
> > from positives.  I can't just use 
> num_identical/percent_identity as that
> > isn't 100% accurate.
> > 
> >> BurkM_2016
> >           Length = 241
> > 
> >  Score = 43.2 bits (88), Expect = 7e-005
> >  Identities = 26/117 (22%), Positives = 51/117 (43%)
> > 
> > Query: 298 
> QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL
> > 357
> >            Q   F  F  + A+    ++ +         + + L +R   GL   + 
> P   E + A+L
> > Sbjct: 111 
> QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL
> > 170
> > 
> > Query: 358 
> MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
> >               A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
> > Sbjct: 171 
> KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227
> > 
> > Thanks,
> > Kevin


From n.appleby at uq.edu.au  Wed Oct 18 17:58:06 2006
From: n.appleby at uq.edu.au (Nikki Appleby)
Date: Thu, 19 Oct 2006 07:58:06 +1000
Subject: [Bioperl-l] CONTIG dealing
Message-ID: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>


I have just entered the wonderful new world of BioPerl, so the answer to my
question may be obvious to any of the gurus reading this.

I need to collect sequence features and ontology annotations. Here goes.

I am retrieving sequences from SwissProt via Bio::DB::SwissProt and
get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into an RDBMS
format that I am happy with I can get at the xref ids. In this case, they
are 

AP003451; BAB86144.1; -; Genomic_DNA. 
AP008207; BAF07116.1; -; Genomic_DNA. 
AB103395; BAC81207.1; -; mRNA. 

I can happily go off and fetch those from Bio::DB::GenBank (first column),
and Bio::DB::GenPept (second). All good, except...

AP008207 is a contig. I don't want to get all of the features for the entire
thing, just the single contig that actually matches the original sequence.
It takes a couple of hours to get at it and then it gives me way too much.

I will come across this problem with other sequences. How do I (a) find out
if it is a contig without downloading it in its entirety and (b) extract
the list of sequences that are about to be contigged together?

I have searched the web for answers, including this list, but see nothing.
Help!
 
Nikki Appleby.


From bosborne11 at verizon.net  Wed Oct 18 20:54:04 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 20:54:04 -0400
Subject: [Bioperl-l] LocatableSeq object vs Sequence Object
In-Reply-To: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com>
Message-ID: <C15C44EC.ADF8%bosborne11@verizon.net>

Peter,

I'm not understanding your question, partly because your letter and your
code are saying different things. You say you want to call
location_from_column() but your code shows you calling species(). What
happens when you call location_from_column? Do you see errors?

Brian O.


On 10/17/06 12:26 PM, "Peter H. Baenziger" <plu5even at gmail.com> wrote:

> I was thinking I could use:
> foreach $seq ($alignment->each_seq())
> to loop through the sequences and call:
> $seq->location_from_column($pos)
> on each of the sequences.  


From cjfields at uiuc.edu  Wed Oct 18 22:46:14 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 21:46:14 -0500
Subject: [Bioperl-l] CONTIG dealing
In-Reply-To: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>
References: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>
Message-ID: <FAEAE9E1-EF95-4B79-AD75-B54D3E24E827@uiuc.edu>

On Oct 18, 2006, at 4:58 PM, Nikki Appleby wrote:

>
> I have just entered the wonderful new world of BioPerl, so the  
> answer to my
> question may be obvious to any of the gurus reading this.
>
> I need to collect sequence features and ontology annotations. Here  
> goes.
>
> I am retrieving sequences from SwissProt via Bio::DB::SwissProt and
> get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into  
> an RDBMS
> format that I am happy with I can get at the xref ids. In this  
> case, they
> are
>
> AP003451; BAB86144.1; -; Genomic_DNA.
> AP008207; BAF07116.1; -; Genomic_DNA.
> AB103395; BAC81207.1; -; mRNA.
>
> I can happily go off and fetch those from Bio::DB::GenBank (first  
> column),
> and Bio::DB::GenPept (second). All good, except...
>
> AP008207 is a contig. I don't want to get all of the features for  
> the entire
> thing, just the single contig that actually matches the original  
> sequence.
> It takes a couple of hours to get at it and then it gives me way  
> too much.
>
> I will come across this problem with other sequences. How do I (a)  
> find out
> if it is a contig without downloading it in its entirety and (b)  
> extract
> the list of sequences that are about to be contigged together.
>
> I have searched the web for answers, including this list, but see  
> nothing.
> Help!
>
> Nikki Appleby.

The default setting for the retrieval format for GenBank is  
'gbwithparts' (which gets the full sequence at all times).  You can  
set this to 'gb' using request_format() to retrieve the sequence file  
with the contig information instead of the sequence, if it contains  
such (otherwise it just retrieves the sequence anyway).
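
A minimal sketch of the request_format() switch described above, assuming
BioPerl is installed and NCBI is reachable.  The default accession is just
the one from this thread, and looking for a 'contig' annotation key is an
assumption about how the GenBank parser stores the CONTIG line, so treat
that check as illustrative only.

```perl
use strict;
use warnings;
use Bio::DB::GenBank;

# Accession to fetch; defaults to one of the xrefs discussed in this thread.
my $acc = @ARGV ? $ARGV[0] : 'AP003451';

my $gb = Bio::DB::GenBank->new;

# Ask for plain 'gb' instead of the default 'gbwithparts', so that contig
# records come back with their CONTIG information rather than the full
# assembled sequence.
$gb->request_format('gb');

my $seq = $gb->get_Seq_by_acc($acc)
    or die "Could not retrieve $acc\n";

my @features = $seq->get_SeqFeatures;

# Assumption: the parser exposes the CONTIG line under the 'contig' key.
my @contig = $seq->annotation->get_Annotations('contig');

printf "%s: %d features, %s\n",
    $seq->accession_number,
    scalar @features,
    @contig ? 'has CONTIG information' : 'no CONTIG information';
```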

However, I have noticed this particular file does not represent a  
true contig record but is the entire chromosome sequence.  The contig  
information is in the comments section, probably b/c the record is  
converted over.  You could just download the sequence record and run  
regexp to grab the comments section, then parse out the contigs (a  
pain) if you really want that.  Or you could try to find the  
equivalent GenBank record, such as the ones derived from the WGS  
records.

I did notice that the list of dbxrefs in your SwissProt record indicates  
three EMBL sequences.  If the order is consistent for the SwissProt  
entries you want, they probably represent:

The contig (what you want): AP003451; BAB86144.1; -; Genomic_DNA.
The supercontig (chromosome) : AP008207; BAF07116.1; -; Genomic_DNA.
The cDNA : AB103395; BAC81207.1; -; mRNA.

I checked the first one (AP003451), which seems to confirm this.

Since the chromosome supercontig is built from the smaller sequence  
contigs you could just grab the first EMBL dbxref instead of all of  
them.  It parses much faster than the chromosome file.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Wed Oct 18 11:47:14 2006
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Oct 2006 08:47:14 -0700
Subject: [Bioperl-l] Blast information
In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
References: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
Message-ID: <6B7D24F3-69F1-498D-AB53-B4CEB14E4F3D@bioperl.org>

I think this will work for you.

The seq_inds method parses the middle homology sequence, classifies  
each alignment column, and returns a list of the columns meeting the  
criteria.  You can interrogate either the query or the hit in this case,  
since you are requiring the columns to be identical.

my $identicalbases = scalar $hsp->seq_inds('query', 'identical');
my $conservedbases =  scalar $hsp->seq_inds('query','conserved');

Conserved returns those identical or conserved, if you want just  
those with conservative replacements use 'conserved-not-identical'

See http://bioperl.org/wiki/HOWTO:SearchIO#Table_of_Methods for more  
info.
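
To make the column classification concrete, here is a BioPerl-free
illustration that counts identical and positive columns directly from the
middle lines of the HSP in the report quoted below.  It is only a sketch
of the idea, not how seq_inds is implemented; count_columns() is a
hypothetical helper name.

```perl
use strict;
use warnings;

# Middle (homology) lines copied from the HSP in the quoted report.
my @midlines = (
    'Q   F  F  + A+    ++ +         + + L +R   GL   + P   E + A+L',
    'A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L',
);

# Classify every column of a protein BLAST middle line:
#   letter -> identical, '+' -> conservative substitution, space -> neither.
# Returns (identical, positives), where positives include the identities.
sub count_columns {
    my @lines = @_;
    my ($identical, $similar) = (0, 0);
    for my $line (@lines) {
        $identical += () = $line =~ /[A-Za-z]/g;
        $similar   += () = $line =~ /\+/g;
    }
    return ($identical, $identical + $similar);
}

my ($identities, $positives) = count_columns(@midlines);
print "Identities = $identities, Positives = $positives\n";
# prints "Identities = 26, Positives = 51"
```

These match the 26 and 51 printed on the Identities/Positives line of the
report.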

-jason
On Oct 18, 2006, at 8:16 AM, Kevin Brown wrote:

> I just recently upgraded to 1.5.1 on WinXP to bring this version  
> closer
> to live to parse some locally created blast files.  I'm trying to find
> the method that returns the values that are underneath the Identities
> and Positives information as I'm trying to replicate the output of an
> old blast parser we have here written in RealBasic which is showing  
> its
> age.  Once I have it replicating the old output I then intend to add
> more features in terms of filtering returned hits (like not returning
> self->self hits or a->b so don't show b->a).
>
> Example:
> I'm looking for the methods that will return 117 from identities  
> and 117
> from positives.  I can't just use num_identical/percent_identity as  
> that
> isn't 100% accurate.
>
>> BurkM_2016
>           Length = 241
>
>  Score = 43.2 bits (88), Expect = 7e-005
>  Identities = 26/117 (22%), Positives = 51/117 (43%)
>
> Query: 298  
> QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL
> 357
>            Q   F  F  + A+    ++ +         + + L +R   GL   + P   E +  
> A+L
> Sbjct: 111  
> QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL
> 170
>
> Query: 358  
> MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
>               A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
> Sbjct: 171  
> KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227
>
> Thanks,
> Kevin
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From jason at bioperl.org  Thu Oct 19 01:00:28 2006
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Oct 2006 22:00:28 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
Message-ID: <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>

So I'm unsure what we should do here.

We can certainly fix the problem which you report which is relying on  
the "" method -- if you were to do instead:
print $_->database, ":", $_->primary_id, "\n";

you'll get the right answer.  At a minimum, we can just fix the  
auto-string-converting method to do The Right Thing.

But I am not sure if we should keep the version out of the primary_id  
field.  This will require some rejiggering in several modules when it  
comes to printing DBlinks and I don't want to do this before the  
release. I also am not sure if there was an explicit reason why  
someone did put the version information in the primary_id. (I hope it  
wasn't me because I don't think I'm going to remember why).

Does anyone else have a strong feeling?

-jason
On Oct 17, 2006, at 12:01 PM, Erikjan wrote:

> Hello,
>
> I noticed a little problem with the Annotation "DBLink" from  
> GenBank entries
>
> When I run:
>
> perl -MBio::DB::GenBank -e 'my $gi =
> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations 
> ("dblink");
> for(@annotations) { print $_, "\n";} print $INC{
> "Bio/Annotation/DBLink.pm" }, "\n"; '
>
> This yields:
>
>    GenBank:AL591065.17.17
>
> and the place where the used Bio/Annotation/DBLink.pm resides.
>
> Can others repeat this?
>
> I have dug into the source a little and Bio::Annotation::DBLink  
> seems to
> be the place where this happens: it has a concatenation which leads to
> that repeated version number.
>
> Is this something that I should fix "client-side", so to speak, or  
> is it
> worthwhile to add some logic to that concatenation to prevent this?
>
>
> Thanks,
>
> Eric
>
>
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From n.haigh at sheffield.ac.uk  Thu Oct 19 02:41:02 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 19 Oct 2006 07:41:02 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <000401c6f2d6$5144e2f0$15327e82@pyrimidine>
References: <000401c6f2d6$5144e2f0$15327e82@pyrimidine>
Message-ID: <45371DFE.6050306@sheffield.ac.uk>


> As a followup in this, I tried bioperl-network and had similar failed tests
> with Graph 0.79 (the only PPM available from ActiveState).  However, the
> INSTALL docs state that Graph 0.80 is needed, and the test run gave several
> warnings about not having Graph 0.80 installed. 
>
> I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and
> everything passed.  Maybe we need to have a Graph PPM available for those
> who want bioperl-network?
>
> As for bioperl-run, all tests passed from a new CVS checkout even though I
> have none of the programs installed, so they seem to skip properly.  The
> test run also printed warnings when a program wasn't available or installed.
>
>
> Chris
>
>   
If you can pop the Graph 0.80 ppd files in DIST/RC/ I'll make 
modifications to integrate them into the package.xml file for PPM4 clients.

Nath


From n.haigh at sheffield.ac.uk  Thu Oct 19 06:40:21 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 19 Oct 2006 11:40:21 +0100
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
Message-ID: <45375615.1020603@sheffield.ac.uk>

Should line 25 read:
require Bio::Factory::EMBOSS;

instead of:
require Bio::EMBOSS::Factory;

Nath


From hlapp at gmx.net  Thu Oct 19 09:56:05 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 19 Oct 2006 09:56:05 -0400
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
Message-ID: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>

Here is the overload code:

use overload '""' => sub {
	(($_[0]->database ? $_[0]->database . ':' : '' )
	. ($_[0]->primary_id ? $_[0]->primary_id : '')
	. ($_[0]->version ? '.' . $_[0]->version : ''))
	|| '' };

Except that the last '||' is redundant and unnecessary (it either  
does nothing or replaces an empty string with an empty string), I  
don't see the potential for duplicating the version number here -  
unless primary_id() did that, which I don't see it doing.

So, to me this seems to come from a parsing error in the beginning,  
rather than an erroneous mangling of version into primary_id later.

Is someone in the position to confirm this?

	-hilmar

On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:

> So I'm unsure what we should do here.
>
> We can certainly fix the problem which you report which is relying on
> the "" method -- if you were to do instead:
> print $_->database, ":", $_->primary_id, "\n";
>
> you'll get the right answer.  We at a minimum just fix the auto-
> string converting method to do The Right Thing.
>
> But I am not sure if we should keep the version out of the primary_id
> field.  This will require some rejiggering in several modules when it
> comes to printing DBlinks and I don't want to do this before the
> release. I also am not sure if there was an explicit reason why
> someone did put the version information in the primary_id. (I hope it
> wasn't me because I don't think I'm going to remember why).
>
> Does anyone else have a strong feeling?
>
> -jason
> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>
>> Hello,
>>
>> I noticed a little problem with the Annotation "DBLink" from
>> GenBank entries
>>
>> When I run:
>>
>> perl -MBio::DB::GenBank -e 'my $gi =
>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>> ("dblink");
>> for(@annotations) { print $_, "\n";} print $INC{
>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>
>> This yields:
>>
>>    GenBank:AL591065.17.17
>>
>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>
>> Can others repeat this?
>>
>> I have dug into the source a little and Bio::Annotation::DBLink
>> seems to
>> be the place where this happens: it has a concatenation which  
>> leads to
>> that repeated version number.
>>
>> It this something that I should fix "client-side", so to speak, or
>> is it
>> worthwhile to add some logic to that concatenation to prevent this?
>>
>>
>> Thanks,
>>
>> Eric
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From dmessina at wustl.edu  Thu Oct 19 09:55:31 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 19 Oct 2006 08:55:31 -0500
Subject: [Bioperl-l] missing documentation (request for help)
Message-ID: <69453D5F-7794-4DC7-BAE1-A8B2191752E6@wustl.edu>

Hi all,

There are a few modules missing a one-line description, and by one- 
line description, I'm referring to the part that comes after the  
module name in the POD.

e.g. in

=head1 NAME

Bio::SearchIO - Driver for parsing Sequence Database Searches
(BLAST, FASTA, ...)

=head1 SYNOPSIS

[etc...]

"Driver for parsing Sequence Database Searches (BLAST, FASTA, ...)"  
is the one-line description (even though it falls onto two lines) :).
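
For anyone who wants to check their own modules, here is a rough sketch of 
how the missing descriptions could be detected. It is only an illustration: 
the helper name has_name_description() is made up, and it assumes the NAME 
section follows the usual "Module::Name - description" layout.

```perl
use strict;
use warnings;

# Return 1 if the POD text has a NAME section whose content looks like
# "Module::Name - some description", and 0 otherwise.
# (Hypothetical helper, not part of BioPerl.)
sub has_name_description {
    my ($pod) = @_;
    # Grab everything between "=head1 NAME" and the next POD command
    # (or the end of the text).
    my ($name_section) = $pod =~ /^=head1\s+NAME\s*\n(.*?)(?=^=\w|\z)/ms
        or return 0;
    # Join wrapped lines so a description that spills onto a second
    # line still counts.
    (my $flat = $name_section) =~ s/\s+/ /g;
    return $flat =~ /^\s*[\w:]+\s+-\s+\S/ ? 1 : 0;
}

# Report every file named on the command line that lacks a description.
for my $file (@ARGV) {
    open my $fh, '<', $file or do { warn "Cannot open $file: $!\n"; next };
    my $pod = do { local $/; <$fh> };
    close $fh;
    print "$file: missing one-line description\n"
        unless has_name_description($pod);
}
```

Run it over the .pm files of interest; any file it prints either has no 
NAME section or has a module name there with nothing after it.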

I fixed the modules that I knew something about, but there are some I  
haven't used. Perhaps the author, or someone else familiar with these  
modules, could fill in an appropriate short description?

Here is the list of affected modules:
Bio::DB::Expression
Bio::Expression::Contact
Bio::Expression::DataSet
Bio::Expression::Platform
Bio::Expression::Sample
Bio::Search::Processor
Bio::DB::EUtilities::ElinkData
Bio::DB::GFF::Adaptor::memory::feature_serializer
Bio::DB::SeqFeature::Store::DBI::Iterator
Bio::Expression::FeatureGroup::FeatureGroupMas50
Bio::Expression::FeatureSet::FeatureSetMas50
Bio::Matrix::PSM::PsmHeaderI
Bio::OntologyIO::Handlers::BaseSAXHandler

Some of these are missing other POD parts as well -- please add those  
too if you can.


Thanks,
Dave


From mckays at cshl.edu  Thu Oct 19 09:51:18 2006
From: mckays at cshl.edu (Sheldon McKay)
Date: Thu, 19 Oct 2006 09:51:18 -0400
Subject: [Bioperl-l] chromosome ideograms
Message-ID: <6b0de00426b3c04b0d0d7641bc8e14e3@cshl.edu>

Hi,

Sorry for the late reply.  I have been working on a karyotype drawing 
tool as part of the Generic Genome Browser that may be useful.  In 
addition to drawing features next to chromosome ideograms, it also 
supports making chromosome 'bands' from any kind of scored features to 
create a sort of heat map on the chromosome itself.

I have a demo running at

http://mckay.cshl.edu/cgi-bin/gbrowse_karyotype

and the source is available from the GMOD CVS HEAD 
http://www.gmod.org/cvs

Sheldon

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Sheldon McKay, PhD
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724


From n.haigh at sheffield.ac.uk  Thu Oct 19 11:37:31 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 19 Oct 2006 15:37:31 +0000
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
In-Reply-To: <45375615.1020603@sheffield.ac.uk>
References: <45375615.1020603@sheffield.ac.uk>
Message-ID: <45379BBB.1040400@sheffield.ac.uk>

Thanks for committing that change, Brian. Now that the tests proceed past
this point, I get the following error:

------------- EXCEPTION: Bio::Root::NotImplemented -------------
MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not
implemented by package Bio::Tools::Run::EMBOSSApplication.
This is not your fault - author of Bio::Tools::Run::EMBOSSApplication
should be blamed!

STACK: Error::throw
STACK: Bio::Root::Root::throw
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350
STACK: Bio::Root::RootI::throw_not_implemented
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522
STACK: Bio::Tools::Run::WrapperBase::program_dir
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346
STACK: Bio::Tools::Run::WrapperBase::program_path
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327
STACK: Bio::Tools::Run::WrapperBase::executable
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297
STACK: t/EMBOSS.t:58
----------------------------------------------------------------


From N.Haigh at sheffield.ac.uk  Thu Oct 19 11:03:00 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 16:03:00 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45379BBB.1040400@sheffield.ac.uk>
References: <45375615.1020603@sheffield.ac.uk>
	<45379BBB.1040400@sheffield.ac.uk>
Message-ID: <1161270180.453793a432e4f@webmail.shef.ac.uk>

I thought I'd have my first proper try at writing some tests. I was
wondering if there is a template test file that I should use/study in
order to be consistent with other tests.

Failing that, is there a good test-writing style I should follow in one
of the other test files?

Thanks
Nathan


From bosborne11 at verizon.net  Thu Oct 19 11:06:08 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 19 Oct 2006 11:06:08 -0400
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
In-Reply-To: <45379BBB.1040400@sheffield.ac.uk>
Message-ID: <C15D0CA0.AE2C%bosborne11@verizon.net>

Nathan,

Yes, I see. Those EMBOSS programs work a bit differently from the typical
app run by bioperl-run, there's no need for WrapperBase methods like
program_dir(), executable(), it seems. Well, I can try and take a look at
this tonight but there's probably someone better suited to this than me,
I've spent very little time with bioperl-run. Volunteer?
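
For whoever picks this up, a minimal sketch of the pattern behind that
exception. The My::* packages are made-up stand-ins, not the real
WrapperBase or EMBOSSApplication code, and this is not meant as the actual
fix, just an illustration of why executable() dies when a subclass does not
supply program_dir():

```perl
use strict;
use warnings;

# Hypothetical stand-in for a wrapper base class: executable() relies on
# program_dir(), which subclasses are expected to provide.
package My::WrapperBase;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub program_name { return $_[0]->{program_name} }
sub program_dir {
    my ($self) = @_;
    die sprintf(qq{Abstract method "%s::program_dir" is not implemented by package %s\n},
                __PACKAGE__, ref $self);
}
sub executable {
    my ($self) = @_;
    my $dir = $self->program_dir;    # dies here if the subclass lacks it
    return defined $dir ? "$dir/" . $self->program_name : $self->program_name;
}

# A subclass that forgets program_dir() fails in the same way as above.
package My::Incomplete;
our @ISA = ('My::WrapperBase');

# A subclass that supplies it (undef means "no directory" in this sketch).
package My::Complete;
our @ISA = ('My::WrapperBase');
sub program_dir { return $_[0]->{program_dir} }

package main;

my $ok = My::Complete->new(program_name => 'water', program_dir => '/usr/local/bin');
print $ok->executable, "\n";

my $bad = My::Incomplete->new(program_name => 'water');
eval { $bad->executable; 1 } or print "Caught: $@";
```

So one option would be for the EMBOSS wrapper to implement program_dir()
itself; another would be to avoid calling executable() on it in the test.
Which is right depends on how the EMBOSS factory locates its programs.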

Brian O.


On 10/19/06 11:37 AM, "Nathan S. Haigh" <n.haigh at sheffield.ac.uk> wrote:

> Thanks for committing that change Brian. Now the tests proceed from this
> point, I get the following error:
> 
> ------------- EXCEPTION: Bio::Root::NotImplemented -------------
> MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not
> implemented by package Bio::Tools::Run::EMBOSSApplication.
> This is not your fault - author of Bio::Tools::Run::EMBOSSApplication
> should be blamed!
> 
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350
> STACK: Bio::Root::RootI::throw_not_implemented
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522
> STACK: Bio::Tools::Run::WrapperBase::program_dir
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346
> STACK: Bio::Tools::Run::WrapperBase::program_path
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327
> STACK: Bio::Tools::Run::WrapperBase::executable
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297
> STACK: t/EMBOSS.t:58
> ----------------------------------------------------------------


From niels at genomics.dk  Thu Oct 19 11:16:37 2006
From: niels at genomics.dk (Niels Larsen)
Date: Thu, 19 Oct 2006 17:16:37 +0200
Subject: [Bioperl-l] From EBI support re WU-Blast SOAP service
In-Reply-To: <4535EBF9.1090706@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>
	<4535EBF9.1090706@sendu.me.uk>
Message-ID: <453796D5.2070808@genomics.dk>

Sendu Bala wrote:
>> I invoked the EBI script
>>
>> http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip
>>
>> like this
>>
>> WSWUBlastClient.pl -p blastn -D embl test.fasta
>>
>> where the content of test.fasta is below, and got
>>
>> Can't find method element in the message at 
>> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.
> 
> As you admit, this is not a Bioperl issue. I would suggest you contact 
> EBI support.
> 

To use EBI's WU-blast SOAP interface from perl, EBI support
says one must use SOAP::Lite v 0.60 (no later version)
and include '--email you.example.com' on the command line.
This is evident neither from their web pages nor from the script
usage statement, but they promised to fix it.

------------------------------------------------------------------------

Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark

Telephone: +45-8942-5268
Telefax: +45-8620-1222

------------------------------------------------------------------------


From cjfields at uiuc.edu  Thu Oct 19 11:31:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 10:31:45 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45371DFE.6050306@sheffield.ac.uk>
Message-ID: <001501c6f393$b66bd4a0$15327e82@pyrimidine>

> > As a followup in this, I tried bioperl-network and had similar failed
> tests
> > with Graph 0.79 (the only PPM available from ActiveState).  However, the
> > INSTALL docs state that Graph 0.80 is needed, and the test run gave
> several
> > warnings about not having Graph 0.80 installed.
> >
> > I made a PPM of Graph 0.80, installed, retried bioperl-network tests,
> and
> > everything passed.  Maybe we need to have a Graph PPM available for
> those
> > who want bioperl-network?
> >
> > As for bioperl-run, all tests passed from a new CVS checkout even though
> I
> > have none of the programs installed, so they seem to skip properly.  The
> > test run also printed warnings when a program wasn't available or
> installed.
> >
> >
> > Chris
> >
> >
> If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make
> modifications to integrate them into the package.xml file for PPM4
> clients.
> 
> Nath

Will do.  Should these be forwarded to Mauricio?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From N.Haigh at sheffield.ac.uk  Thu Oct 19 11:38:05 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 16:38:05 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <001501c6f393$b66bd4a0$15327e82@pyrimidine>
References: <001501c6f393$b66bd4a0$15327e82@pyrimidine>
Message-ID: <1161272285.45379bdd1aea4@webmail.shef.ac.uk>


> > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make
> > modifications to integrate them into the package.xml file for PPM4
> > clients.
> > 
> > Nath
> 
> Will do.  Should these be forwarded to Mauricio?
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 


If you don't have access to the web, you can send them to me - I now have an account on that server.

Cheers
Nath


From cjfields at uiuc.edu  Thu Oct 19 11:45:00 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 10:45:00 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk>
Message-ID: <001601c6f395$8a752ed0$15327e82@pyrimidine>

> I thought I'd have my first proper try at writing some tests. I was
> wondering if there is a template test file that I should use/study in
> order to be
> consistent with other tests.
> 
> Failing that - Is there a good test writing style I should follow in one
> of the other test files?
> 
> Thanks
> Nathan

I would start with the Test::Simple and Test::More perldoc; they're pretty
self-explanatory.  You can look at the various test suites using Test::More
as well for pointers.  By far, most tests will use is().  You can use SKIP
blocks to skip tests that have a requirement, or skip all tests if they all
require something.  Pretty flexible.
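
A minimal sketch of what I mean, assuming a made-up revcom() function in
place of a real module method and BIOPERLDEBUG as the switch for tests
that need network access:

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical function under test; stands in for a real module method.
sub revcom {
    my ($seq) = @_;
    (my $rc = reverse $seq) =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Most assertions are a plain comparison of got vs. expected.
is( revcom('ATGC'), 'GCAT', 'revcom() reverse-complements a short sequence' );
is( revcom(''),     '',     'revcom() of an empty string is empty' );

SKIP: {
    # Skip (rather than fail) when a requirement is not met. The count
    # passed to skip() must match the number of tests in the block.
    skip( 'network tests not requested', 2 ) unless $ENV{BIOPERLDEBUG};

    # Placeholders for tests that would contact a remote database.
    ok( 1, 'remote lookup returned something' );
    ok( 1, 'remote record has the expected ID' );
}
```

The skipped tests still count toward the plan, so the total stays at 4
whether or not the block runs.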

We should probably get a wiki page for the developers underway, maybe a
HOWTO on writing tests.  At least have these focus on BioPerl, OOP, remote
DB tests, etc.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Thu Oct 19 12:23:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 11:23:40 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
Message-ID: <001b01c6f39a$f0288ba0$15327e82@pyrimidine>

> Here is the overload code:
> 
> use overload '""' => sub {
> 	(($_[0]->database ? $_[0]->database . ':' : '' )
> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
> 	|| '' };
> 
> Except that the last '||' is redundant and unnecessary (it either
> does nothing or replaces an empty string with an empty string), I
> don't see the potential for duplicating the version number here -
> unless primary_id() did that, which I don't see it doing.
> 
> So, to me this seems to come from a parsing error in the beginning,
> rather than an erroneous mangling of version into primary_id later.
> 
> Is someone in the position to confirm this?
> 
> 	-hilmar

I have attached a script to the bug report on bugzilla, as well as the test
output sequence and the actual GenBank record.  There are a number of
problems:

1)  primary_id() is assigned both the id and version.
2)  version() is still assigned the version.

The above explains the doubled version seen when printing the object
directly via the overload (it concatenates them).  
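
To make that concrete, here is a small sketch using a made-up My::DBLink
class that models only the three accessors the overload touches (it is not
the real Bio::Annotation::DBLink):

```perl
use strict;
use warnings;

# Minimal stand-in for the DBLink annotation; only database, primary_id
# and version are modelled (an assumption made for this sketch).
package My::DBLink;
use overload '""' => sub {
    my ($self) = @_;
    return ( $self->database   ? $self->database . ':' : '' )
         . ( $self->primary_id ? $self->primary_id     : '' )
         . ( $self->version    ? '.' . $self->version  : '' );
};
sub new        { my ($class, %args) = @_; return bless {%args}, $class }
sub database   { return $_[0]->{database} }
sub primary_id { return $_[0]->{primary_id} }
sub version    { return $_[0]->{version} }

package main;

# primary_id already carries the version, and version is set as well.
my $doubled = My::DBLink->new(
    database => 'GenBank', primary_id => 'AL591065.17', version => 17 );
print "$doubled\n";    # prints GenBank:AL591065.17.17

# Storing the bare accession lets the overload add the version once.
my $clean = My::DBLink->new(
    database => 'GenBank', primary_id => 'AL591065', version => 17 );
print "$clean\n";      # prints GenBank:AL591065.17
```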

However, there are a few more issues.  The ID is printed normally
(accession.version), but the source DB is not present when SeqIO handles the
sequence.  I have attached the output and the original GenBank record to the
bug report.  

I can look into it but it won't be today; got my hands full with enzyme
assays. 

Chris


From N.Haigh at sheffield.ac.uk  Thu Oct 19 12:50:57 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 17:50:57 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine>
References: <001601c6f395$8a752ed0$15327e82@pyrimidine>
Message-ID: <1161276657.4537acf1edc80@webmail.shef.ac.uk>

Quoting Chris Fields <cjfields at uiuc.edu>:

> > I thought I'd have my first proper try at writing some tests. I was
> > wondering if there is a template test file that I should use/study in
> > order to be
> > consistent with other tests.
> > 
> > Failing that - Is there a good test writing style I should follow in one
> > of the other test files?
> > 
> > Thanks
> > Nathan
> 
> I would start with the Test::Simple and Test::More perldoc; they're pretty
> self-explanatory.  You can look at the various test suites using Test::More
> as well for pointers.  By far, most tests will use is().  You can use SKIP
> blocks to skip tests that have a requirement, or skip all tests if they all
> require something.  Pretty flexible.
> 
> We should probably get a wiki page for the developers underway, maybe a
> HOWTO on writing tests.  At least have these focus on BioPerl, OOP, remote
> DB tests, etc.
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 
> 
> 


Just working through some test things now. I thought I'd start on the
bioperl-run stuff as I thought it might be a bit more straightforward, I'm
familiar with some of them, and they seem to get neglected.

I'm heavily commenting my tests with the thought of starting a wiki guide
to testing Bioperl modules. See how far I get!

Nath


From hlapp at gmx.net  Thu Oct 19 13:11:27 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 19 Oct 2006 13:11:27 -0400
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
	<0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
	<8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
Message-ID: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>

Actually you did that Jason: http://tinyurl.com/ye2edk

Apparently the motivation was to "parse swissprot fields in genpept  
file (dbsource)"?

It clearly looks wrong to add the version. You've probably had a  
reason why you did this at the time but if we (you :) can't recover  
that I guess it's best to just fix it to do the right thing (in both  
places obviously).
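
Roughly what I have in mind, as a sketch only: split_accession() and  
format_accession() are hypothetical helpers, not existing SeqIO code. The  
reader would keep accession and version in separate fields, and the writer  
would join them again on output:

```perl
use strict;
use warnings;

# Hypothetical reader-side helper: split "ACC.VER" once, at parse time,
# so the accession and the version live in separate fields.
sub split_accession {
    my ($acc) = @_;
    my ($id, $version) = $acc =~ /^(.+?)(?:\.(\d+))?$/;
    return ($id, $version);
}

# Hypothetical writer-side helper: join the two fields back together,
# adding the version only when there is one.
sub format_accession {
    my ($id, $version) = @_;
    return defined $version && length $version ? "$id.$version" : $id;
}

my ($id, $version) = split_accession('AL591065.17');
print 'DBSOURCE    accession ', format_accession($id, $version), "\n";
```

With primary_id holding only the accession, the overload and the writer  
each add the version exactly once.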

	-hilmar

On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote:

> Well there is explicit addition of the version to the primary id so  
> it isn't so much a parsing error as a deliberate decision to append  
> it.
> see Bio::SeqIO::genbank
>
> to make the dblink
>     $annotation->add_Annotation
>         ('dblink',
>          Bio::Annotation::DBLink->new
>              (-primary_id => $id . "." . $version,
>               -version    => $version,
>               -database   => $db,
>               -tagname    => 'dblink'));
>
> and the code to print the dblink back out in the writer already  
> assumes the version number is appended...
>
>         foreach my $ref ( $seq->annotation->get_Annotations('dblink') ) {
>             # if ($ref->comment eq 'DBSOURCE') {
>             $self->_print('DBSOURCE    accession ',
>                           $ref->primary_id, "\n");
>             # }
>         }
>
> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:
>
>> Here is the overload code:
>>
>> use overload '""' => sub {
>> 	(($_[0]->database ? $_[0]->database . ':' : '' )
>> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
>> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
>> 	|| '' };
>>
>> Except that the last '||' is redundant and unnecessary (it either  
>> does nothing or replaces an empty string with an empty string), I  
>> don't see the potential for duplicating the version number here -  
>> unless primary_id() did that, which I don't see it doing.
>>
>> So, to me this seems to come from a parsing error in the  
>> beginning, rather than an erroneous mangling of version into  
>> primary_id later.
>>
>> Is someone in the position to confirm this?
>>
>> 	-hilmar
>>
>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>>
>>> So I'm unsure what we should do here.
>>>
>>> We can certainly fix the problem which you report which is  
>>> relying on
>>> the "" method -- if you were to do instead:
>>> print $_->database, ":", $_->primary_id, "\n";
>>>
>>> you'll get the right answer.  We at a minimum just fix the auto-
>>> string converting method to do The Right Thing.
>>>
>>> But I am not sure if we should keep the version out of the  
>>> primary_id
>>> field.  This will require some rejiggering in several modules  
>>> when it
>>> comes to printing DBlinks and I don't want to do this before the
>>> release. I also am not sure if there was an explicit reason why
>>> someone did put the version information in the primary_id. (I  
>>> hope it
>>> wasn't me because I don't think I'm going to remember why).
>>>
>>> Does anyone else have a strong feeling?
>>>
>>> -jason
>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>>
>>>> Hello,
>>>>
>>>> I noticed a little problem with the Annotation "DBLink" from
>>>> GenBank entries
>>>>
>>>> When I run:
>>>>
>>>> perl -MBio::DB::GenBank -e 'my $gi =
>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my  
>>>> $seqio =
>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>>>> ("dblink");
>>>> for(@annotations) { print $_, "\n";} print $INC{
>>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>>>
>>>> This yields:
>>>>
>>>>    GenBank:AL591065.17.17
>>>>
>>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>>
>>>> Can others repeat this?
>>>>
>>>> I have dug into the source a little and Bio::Annotation::DBLink
>>>> seems to
>>>> be the place where this happens: it has a concatenation which  
>>>> leads to
>>>> that repeated version number.
>>>>
>>>> It this something that I should fix "client-side", so to speak, or
>>>> is it
>>>> worthwhile to add some logic to that concatenation to prevent this?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Eric
>>>>
>>>>
>>>>
>>>
>>> --
>>> Jason Stajich, PhD
>>> Miller Research Fellow
>>> University of California
>>> Dept of Plant and Microbial Biology
>>> 321 Koshland Hall #3102
>>> Berkeley, CA 94720-3102
>>> lab: 510.642.8441
>>> http://pmb.berkeley.edu/~taylor/people/js.html
>>>
>>>
>>>
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From N.Haigh at sheffield.ac.uk  Thu Oct 19 13:17:33 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 18:17:33 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine>
References: <001601c6f395$8a752ed0$15327e82@pyrimidine>
Message-ID: <1161278253.4537b32dd3d15@webmail.shef.ac.uk>

Quoting Chris Fields <cjfields at uiuc.edu>:

> > I thought I'd have my first proper try at writing some tests. I was
> > wondering if there is a template test file that I should use/study in
> > order to be
> > consistent with other tests.
> > 
> > Failing that - Is there a good test writing style I should follow in one
> > of the other test files?
> > 
> > Thanks
> > Nathan
> 
> I would start with the Test::Simple and Test::More perldoc; they're pretty
> self-explanatory.  You can look at the various test suites using Test::More
> as well for pointers.  By far, most tests will use is().  You can use SKIP
> blocks to skip tests that have a requirement, or skip all tests if they all
> require something.  Pretty flexible.
> 
> We should probably get a wiki page for the developers underway, maybe a
> HOWTO on writing tests.  At least have these focus on BioPerl, OOP, remote
> DB tests, etc.
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 

Just wrote a partial and small test script for t/Amap.t in bioperl-run. When I run "perl -I. t/Amap.t" I get the following output:
1..10
ok 1 - use Bio::Tools::Run::Alignment::Amap;
ok 2 - use Bio::AlignIO;
ok 3 - use Bio::SeqIO;
ok 4 - use Bio::Root::IO;
ok 5 - All the required modules are present
ok 6 - new() returned something
ok 7 -   and its the right class
not ok 8 - executable() got the correct filename
#   Failed test 'executable() got the correct filename'
#   in t/Amap.t at line 90.
#          got: undef
#     expected: 'filename'
ok 9 # skip Got incorrect filename for executable
ok 10 # skip Got incorrect filename for executable
# Looks like you failed 1 test of 10.


So far this looks good (well, in that it's failing and passing the tests I
expect). However, when I run "make test" the output is unexpected and I
don't know why. It seems to die and print the test results before the rest
of the test suite is run:
t/Amap....................NOK 8
#   Failed test 'executable() got the correct filename'
#   in t/Amap.t at line 90.
#          got: undef
#     expected: 'filename'
# Looks like you failed 1 test of 10.
t/Amap....................dubious
        Test returned status 1 (wstat 256, 0x100)
DIED. FAILED test 8
        Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay, 70.00%)
t/Analysis_soap...........ok 7/17make: *** wait: No child processes.  Stop.


Is there something I'm missing? If it's something less obvious, let me know and I'll post the whole test file.
Nath


From cjfields at uiuc.edu  Thu Oct 19 13:26:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 12:26:45 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161278253.4537b32dd3d15@webmail.shef.ac.uk>
Message-ID: <002001c6f3a3$c00b9080$15327e82@pyrimidine>

...
> Just wrote a partial and small test script for t/Amap.t in bioperl-run.
> When I run "perl -I. t/Amap.t" I get the following output:
> 1..10
> ok 1 - use Bio::Tools::Run::Alignment::Amap;
> ok 2 - use Bio::AlignIO;
> ok 3 - use Bio::SeqIO;
> ok 4 - use Bio::Root::IO;
> ok 5 - All the required modules are present
> ok 6 - new() returned something
> ok 7 -   and its the right class
> not ok 8 - executable() got the correct filename
> #   Failed test 'executable() got the correct filename'
> #   in t/Amap.t at line 90.
> #          got: undef
> #     expected: 'filename'
> ok 9 # skip Got incorrect filename for executable
> ok 10 # skip Got incorrect filename for executable
> # Looks like you failed 1 test of 10.
> 
> 
> So far this looks good (well, that it's failing passing expected tests).
> However, when i run "make test" the output is unexpected and I don't know
> why. It seems to die and produce the results of the testing before the
> rest of the test suit is run:
> t/Amap....................NOK 8
> #   Failed test 'executable() got the correct filename'
> #   in t/Amap.t at line 90.
> #          got: undef
> #     expected: 'filename'
> # Looks like you failed 1 test of 10.
> t/Amap....................dubious
>         Test returned status 1 (wstat 256, 0x100)
> DIED. FAILED test 8
>         Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay,
> 70.00%)
> t/Analysis_soap...........ok 7/17make: *** wait: No child processes.
> Stop.
> 
> 
> 
> Is there something I'm missing?? If it's something less obvious, let me
> know and i'll post whole test file.
> Nath

Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
the problem.  The only issue I can think of is that Test::More TODO blocks
require a newer version of Test::Harness (which most users have anyway).
Are you using a TODO block?

You can send me Amap.t and I'll give it a try, but I can't promise I'll get
to it immediately (busy day).

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From N.Haigh at sheffield.ac.uk  Thu Oct 19 13:38:25 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 18:38:25 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161279505.4537b811e143f@webmail.shef.ac.uk>


> Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
> the problem.  The only issue I can think of is that Test::More TODO blocks
> require a newer version of Test::Harness (which most users have anyway).
> Are you using a TODO block?
> 
> You can send me Amap.t and I'll give it a try, but I can't promise I'll get
> to it immediately (busy day).
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 
> 

No TODO blocks.

I must have done something wrong - it's the first time I've seen this - but then again, I don't look that closely at the output of "make test" unless
something shows as a fail. Anyway, below is the short bit of code.

Thanks
Nath

use strict;
use Bio::Root::IO;  # cant test for this, might be needed to get Test::More

BEGIN {
  # Things to do ASAP once the script is run
  # even before anything else in the file is parsed
  use vars qw($NUMTESTS $DEBUG $error);
  $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;

  # Use installed Test module, otherwise fall back
  # to copy of Test.pm located in the t dir
  eval { require Test::More; };
  if ( $@ ) {
    use lib Bio::Root::IO->catfile('t','lib');
  }

  # Currently no errors
  $error = 0;

  # Setup the number of tests to be run
  # what about using:
  # use Test::More 'no_plan';
  use Test::More;
  $NUMTESTS = 10;
  plan tests => $NUMTESTS;

  # Use modules that are needed in this test that are from
  # any of the Bioperl packages: Bioperl-core, Bioperl-run ... etc
  # use_ok('<module::to::use>');
  use_ok('Bio::Tools::Run::Alignment::Amap');
  use_ok('Bio::AlignIO');
  use_ok('Bio::SeqIO');
  use_ok('Bio::Root::IO');
}

# Multiple END blocks are run in reverse order of their definition
# Last In, First Out (LIFO)
END {
  # Things to do right at the very end, just
  # when the  interpreter finishes/exits
  # E.g. deleting intermediate files produced during the test

  foreach my $file ( qw(cysprot.dnd cysprot1a.dnd) ) {
    unlink $file;
    # check it was deleted

  }
  #unlink qw(cysprot.dnd cysprot1a.dnd)
}

END {
  # Not sure what this is doing?
  #for ( $Test::ntest..$NUMTESTS ) {
  #  skip("Amap program not found. Skipping.\n",1);
  #}
}

# if we got to here, that's OK!
# is this really needed?
ok( 1, 'All the required modules are present');

# setup input files etc
my $inputfilename = Bio::Root::IO->catfile("t","data","cysprot.fa");

# setup output files etc
# none in this test

# setup global objects that are to be used in more than one test
# Also test they were initialised correctly
my @params = ();
my $aln;
my $factory = Bio::Tools::Run::Alignment::Amap->new(@params);
ok( defined $factory,                                  'new() returned something' );
ok( $factory->isa('Bio::Tools::Run::Alignment::Amap'), '  and its the right class' );

# Now onto the nitty-gritty tests of the module's methods
my $executable_file = $factory->executable();
#is( $factory->executable(), 'filename',                'executable() got the correct filename' );

# block of tests to skip if you know the tests will fail
# under some condition. E.g.:
#   Need network access,
#   Won't work on a particular OS,
#   Can't find the executable
# Do not just skip tests that seem to fail for an unknown reason
SKIP: {
  # condition used to skip this block of tests
  #skip($why, $how_many_in_block);
  skip("Got incorrect filename for executable", 2)
    unless is($factory->executable(), 'filename',       'executable() got the correct filename');

  ok( -e $executable_file,                              'Found executable' );
  ok( $factory->version >= 2.0,                         'Code tested on Amap versions >= 2.0' );

}
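
A minimal sketch of the same SKIP pattern, keyed on whether the program
can be found rather than on a placeholder filename. It uses only core
modules; find_in_path() and the program name 'amap' are assumptions made
for illustration and are not part of bioperl-run.

```perl
use strict;
use warnings;
use File::Spec;
use Test::More;

# Hypothetical helper: return the full path of $name if an executable
# file with that name exists in one of the PATH directories.
sub find_in_path {
    my ($name) = @_;
    for my $dir ( File::Spec->path ) {
        my $candidate = File::Spec->catfile( $dir, $name );
        return $candidate if -f $candidate && -x _;
    }
    return;    # undef in scalar context: not found
}

my $program = 'amap';                    # assumed executable name
my $path    = find_in_path($program);

SKIP: {
    # Skip both tests in this block when the program is not installed
    skip( "$program not found in PATH", 2 ) unless defined $path;

    ok( -e $path, 'Found executable' );
    ok( -x $path, 'Executable is runnable' );
}

done_testing();
```

Deciding the skip condition before the block keeps the number of tests
that run or skip the same on every machine.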


From jason at bioperl.org  Thu Oct 19 13:44:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 10:44:51 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
	<0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
	<8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
	<7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID: <B9A24BF0-CD5A-40CA-B8AA-1A449CA9D7AA@bioperl.org>

Yikes - I was worried that it might have been me.....

Okay, I'll look into fixing it -- ChrisF, check in with me before
diving in, in case I've already gotten it done; I expect your enzyme
assays might take up your time.

-jason
On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote:

> Actually you did that Jason: http://tinyurl.com/ye2edk
>
> Apparently the motivation was to "parse swissprot fields in genpept  
> file (dbsource)"?
>
> It clearly looks wrong to add the version. You've probably had a  
> reason why you did this at the time but if we (you :) can't recover  
> that I guess it's best to just fix it to do the right thing (in  
> both places obviously).
>
> 	-hilmar
>
> On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote:
>
>> Well there is explicit addition of the version to the primary id  
>> so it isn't so much a parsing error as a deliberate decision to  
>> append it.
>> see Bio::SeqIO::genbank
>>
>> to make the dblink
>>     $annotation->add_Annotation
>>         ('dblink',
>>          Bio::Annotation::DBLink->new
>>          (-primary_id => $id . "." . $version,
>>           -version    => $version,
>>           -database   => $db,
>>           -tagname    => 'dblink'));
>>
>> and the code to print the dblink back out in the writer already  
>> assumes the version number is appended...
>>
>>         foreach my $ref ( $seq->annotation->get_Annotations('dblink') ) {
>>             # if ($ref->comment eq 'DBSOURCE') {
>>             $self->_print('DBSOURCE    accession ',
>>                           $ref->primary_id, "\n");
>>             # }
>>         }
>>
>> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:
>>
>>> Here is the overload code:
>>>
>>> use overload '""' => sub {
>>> 	(($_[0]->database ? $_[0]->database . ':' : '' )
>>> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
>>> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
>>> 	|| '' };
>>>
>>> Except that the last '||' is redundant and unnecessary (it either  
>>> does nothing or replaces an empty string with an empty string), I  
>>> don't see the potential for duplicating the version number here -  
>>> unless primary_id() did that, which I don't see it doing.
>>>
>>> So, to me this seems to come from a parsing error in the  
>>> beginning, rather than an erroneous mangling of version into  
>>> primary_id later.
>>>
>>> Is someone in the position to confirm this?
>>>
>>> 	-hilmar
>>>
>>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>>>
>>>> So I'm unsure what we should do here.
>>>>
>>>> We can certainly fix the problem which you report which is  
>>>> relying on
>>>> the "" method -- if you were to do instead:
>>>> print $_->database, ":", $_->primary_id, "\n";
>>>>
>>>> you'll get the right answer.  We at a minimum just fix the auto-
>>>> string converting method to do The Right Thing.
>>>>
>>>> But I am not sure if we should keep the version out of the  
>>>> primary_id
>>>> field.  This will require some rejiggering in several modules  
>>>> when it
>>>> comes to printing DBlinks and I don't want to do this before the
>>>> release. I also am not sure if there was an explicit reason why
>>>> someone did put the version information in the primary_id. (I  
>>>> hope it
>>>> wasn't me because I don't think I'm going to remember why).
>>>>
>>>> Does anyone else have a strong feeling?
>>>>
>>>> -jason
>>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I noticed a little problem with the Annotation "DBLink" from
>>>>> GenBank entries
>>>>>
>>>>> When I run:
>>>>>
>>>>> perl -MBio::DB::GenBank -e 'my $gi =
>>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my  
>>>>> $seqio =
>>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>>>>> ("dblink");
>>>>> for(@annotations) { print $_, "\n";} print $INC{
>>>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>>>>
>>>>> This yields:
>>>>>
>>>>>    GenBank:AL591065.17.17
>>>>>
>>>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>>>
>>>>> Can others repeat this?
>>>>>
>>>>> I have dug into the source a little and Bio::Annotation::DBLink
>>>>> seems to
>>>>> be the place where this happens: it has a concatenation which  
>>>>> leads to
>>>>> that repeated version number.
>>>>>
>>>>> Is this something that I should fix "client-side", so to speak, or
>>>>> is it worthwhile to add some logic to that concatenation to prevent
>>>>> this?
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Eric
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>> --
>>>> Jason Stajich, PhD
>>>> Miller Research Fellow
>>>> University of California
>>>> Dept of Plant and Microbial Biology
>>>> 321 Koshland Hall #3102
>>>> Berkeley, CA 94720-3102
>>>> lab: 510.642.8441
>>>> http://pmb.berkeley.edu/~taylor/people/js.html
>>>>
>>>>
>>>>
>>>
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> Jason Stajich, PhD
>> Miller Research Fellow
>> University of California
>> Dept of Plant and Microbial Biology
>> 321 Koshland Hall #3102
>> Berkeley, CA 94720-3102
>> lab: 510.642.8441
>> http://pmb.berkeley.edu/~taylor/people/js.html
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>
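
For reference, the doubled version in the quoted thread can be reproduced
without BioPerl. This is only a sketch: My::DBLink is a hypothetical
stand-in whose stringification mirrors the overload code quoted above;
it is not the real Bio::Annotation::DBLink.

```perl
use strict;
use warnings;

package My::DBLink;    # hypothetical stand-in, not the BioPerl class

# Stringify as database:primary_id.version, like the quoted overload
use overload
    '""' => sub {
        my $self = shift;
        return ( $self->{database} ? $self->{database} . ':' : '' )
             . ( $self->{primary_id} ? $self->{primary_id} : '' )
             . ( $self->{version} ? '.' . $self->{version} : '' );
    },
    fallback => 1;

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

package main;

# Version already appended to primary_id, as the parser does it today
my $appended = My::DBLink->new(
    database   => 'GenBank',
    primary_id => 'AL591065.17',
    version    => 17,
);

# Version kept out of primary_id
my $separate = My::DBLink->new(
    database   => 'GenBank',
    primary_id => 'AL591065',
    version    => 17,
);

print "$appended\n";    # GenBank:AL591065.17.17
print "$separate\n";    # GenBank:AL591065.17
```

The stringification itself only appends the version once; the repeat
appears when primary_id already carries it.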

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From cjfields at uiuc.edu  Thu Oct 19 14:03:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:03:52 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID: <000001c6f3a8$f0a46a00$15327e82@pyrimidine>

It also seems that the DBSOURCE line isn't parsed correctly: the parser
stuffs it by default into a GenBank dblink (the dbsource in the test case
is EMBL, not GenBank).

http://bugzilla.open-bio.org/show_bug.cgi?id=2124

It looks like NCBI may be now using:

DBSOURCE    embl accession Z49548.1

instead of the old version:

DBSOURCE    embl locus SCYJR048W, accession Z49548.1

I don't recall NCBI mentioning changes regarding DBSOURCE in any of the
recent release notes.

Chris

> Actually you did that Jason: http://tinyurl.com/ye2edk
> 
> Apparently the motivation was to "parse swissprot fields in genpept
> file (dbsource)"?
> 
> It clearly looks wrong to add the version. You've probably had a
> reason why you did this at the time but if we (you :) can't recover
> that I guess it's best to just fix it to do the right thing (in both
> places obviously).
> 
> 	-hilmar


From N.Haigh at sheffield.ac.uk  Thu Oct 19 14:06:11 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:06:11 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281171.4537be93b63c9@webmail.shef.ac.uk>


> 
> Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
> the problem.  The only issue I can think of is that Test::More TODO blocks
> require a newer version of Test::Harness (which most users have anyway).
> Are you using a TODO block?
> 
> You can send me Amap.t and I'll give it a try, but I can't promise I'll get
> to it immediately (busy day).
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 

Never mind about this - it's working as expected!

I got confused as a previous run threw errors but wasn't included in the final table of failed tests - working now.

Nath 


From N.Haigh at sheffield.ac.uk  Thu Oct 19 14:14:54 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:14:54 +0100
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>

I have a few questions about how bioperl-run modules work.

1) How does a module define the name of the executable that it uses?
2) Is there a way to test what this is?
3) Does $factory->executable return this or does it only return the name if it successfully found it?

Thanks
Nath


From cjfields at uiuc.edu  Thu Oct 19 14:15:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:15:08 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <B9A24BF0-CD5A-40CA-B8AA-1A449CA9D7AA@bioperl.org>
Message-ID: <000001c6f3aa$82845ba0$15327e82@pyrimidine>

Go for it.  I haven't got the time to spare at the moment, sucky protein
assays....

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Jason Stajich
> Sent: Thursday, October 19, 2006 12:45 PM
> To: Hilmar Lapp
> Cc: bioperl-l at lists.open-bio.org; Erikjan
> Subject: Re: [Bioperl-l] Annotation-DBLink- version numbers repeating
> 
> Yikes - I was worried that it might have been me.....
> 
> Okay I'll look into fixing it -- ChrisF - check in with me before
> diving in, in case I've gotten it done and I expect your enzyme
> assays might take up the time.
> 
> -jason


From cjfields at uiuc.edu  Thu Oct 19 14:35:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:35:08 -0500
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
Message-ID: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>

I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase
but I'm not sure.  I haven't used them very much myself but plan on making
wrappers at some point soon for some programs I use.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: Nathan Haigh [mailto:N.Haigh at sheffield.ac.uk]
> Sent: Thursday, October 19, 2006 1:15 PM
> To: Chris Fields
> Cc: 'bioperl-l'
> Subject: bioperl-run executable
> 
> I have a few questions about how bioperl-run modules work.
> 
> 1) How does a module define the name of the executable that it uses?
> 2) Is there a way to test what this is?
> 3) Does $factory->executable return this or does it only return the name
> if it successfully found it?
> 
> Thanks
> Nath


From N.Haigh at sheffield.ac.uk  Thu Oct 19 14:47:01 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:47:01 +0100
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
Message-ID: <1161283620.4537c82501c43@webmail.shef.ac.uk>

Quoting Chris Fields <cjfields at uiuc.edu>:

> I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase
> but I'm not sure.  I haven't used them very much myself but plan on making
> wrappers at some point soon for some programs I use.
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 

On closer inspection of a couple of other modules (Clustalw.pm and
TCoffee.pm), they seem to hardcode the executable name in $PROGRAM_NAME
and have a sub (program_name) that simply returns this value. I'd like to
see program_name become a getter/setter so users can change the default
and have the string stored in the factory object.
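
A rough sketch of what such a getter/setter could look like. My::Wrapper
and the '_program_name' key are made up for illustration; this is not
the existing Clustalw.pm or WrapperBase code.

```perl
use strict;
use warnings;

package My::Wrapper;    # hypothetical wrapper class

# Package-level default, as the existing modules appear to have
our $PROGRAM_NAME = 'clustalw';

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Getter/setter: store an override in the object, else use the default
sub program_name {
    my ( $self, $name ) = @_;
    $self->{_program_name} = $name if defined $name;
    return defined $self->{_program_name}
        ? $self->{_program_name}
        : $PROGRAM_NAME;
}

package main;

my $factory = My::Wrapper->new;
print $factory->program_name, "\n";    # clustalw
$factory->program_name('clustalw2');
print $factory->program_name, "\n";    # clustalw2
```

Keeping the override per object means one factory can point at a renamed
binary without changing the default for every other factory.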

Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core rather
than bioperl-run? I suppose not, since bioperl-core is a prerequisite for
bioperl-run, but wouldn't it make sense for it to go in bioperl-run?

Nath


From cjfields at uiuc.edu  Thu Oct 19 15:07:05 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 14:07:05 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <B9A24BF0-CD5A-40CA-B8AA-1A449CA9D7AA@bioperl.org>
Message-ID: <000701c6f3b1$c5914230$15327e82@pyrimidine>

Jason, Hilmar, 

How about changing the default parsed dblink in SeqIO::genbank (line 520) to

		if( $dbsource =~ /^(\S*?)\s*accession\s+(\S+)\.(\d+)/ ) {
		    my ($db,$id,$version) = ($1,$2,$3);
		    $annotation->add_Annotation
			('dblink',
			 Bio::Annotation::DBLink->new
			 (-primary_id => $id,
			  -version => $version,
			  -database => $db || 'GenBank',
			  -tagname => 'dblink'));
		} 

It passes tests and catches the optional database ('embl' for the bugzilla
report).  write_seq() still doesn't print the DB in the output sequence if
it isn't GenBank, but that shouldn't be too hard to fix (famous last words).
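
To see what the pattern above does on its own, here is a small standalone
sketch. It assumes $dbsource holds the text that follows the DBSOURCE
tag (the real parser may hand it something different), and
parse_dbsource() is a name made up for this example.

```perl
use strict;
use warnings;

# Assumption: $dbsource is the text after the DBSOURCE tag.
# Returns a hash ref with database, primary_id and version, or nothing.
sub parse_dbsource {
    my ($dbsource) = @_;
    if ( $dbsource =~ /^(\S*?)\s*accession\s+(\S+)\.(\d+)/ ) {
        return {
            database   => $1 || 'GenBank',    # default when no db is given
            primary_id => $2,                 # accession without version
            version    => $3,
        };
    }
    return;
}

for my $line (
    'accession AL591065.17',
    'embl accession Z49548.1',
    'embl locus SCYJR048W, accession Z49548.1',
  )
{
    my $link = parse_dbsource($line);
    if ($link) {
        print "$line => $link->{database}:$link->{primary_id} ",
              "(version $link->{version})\n";
    }
    else {
        print "$line => no match\n";
    }
}
```

Under that assumption the first two lines parse as intended, while the
older "embl locus ..., accession ..." form does not match, because the
pattern is anchored at the start and the database name must be followed
directly by "accession".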

Okay, okay, back to the assays...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Jason Stajich
> Sent: Thursday, October 19, 2006 12:45 PM
> To: Hilmar Lapp
> Cc: bioperl-l at lists.open-bio.org; Erikjan
> Subject: Re: [Bioperl-l] Annotation-DBLink- version numbers repeating
> 
> Yikes - I was worried that it might have been me.....
> 
> Okay I'll look into fixing it -- ChrisF - check in with me before
> diving in, in case I've gotten it done and I expect your enzyme
> assays might take up the time.
> 
> -jason
> On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote:
> 
> > Actually you did that Jason: http://tinyurl.com/ye2edk
> >
> > Apparently the motivation was to "parse swissprot fields in genpept
> > file (dbsource)"?
> >
> > It clearly looks wrong to add the version. You've probably had a
> > reason why you did this at the time but if we (you :) can't recover
> > that I guess it's best to just fix it to do the right thing (in
> > both places obviously).
> >
> > 	-hilmar
> >
> > On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote:
> >
> >> Well there is explicit addition of the version to the primary id
> >> so it isn't so much a parsing error as a deliberate decision to
> >> append it.
> >> see Bio::SeqIO::genbank
> >>
> >> to make the dblink
> >>     $annotation->add_Annotation
> >>         ('dblink',
> >>          Bio::Annotation::DBLink->new
> >>          (-primary_id => $id . "." . $version,
> >>           -version    => $version,
> >>           -database   => $db,
> >>           -tagname    => 'dblink'));
> >>
> >> and the code to print the dblink back out in the writer already
> >> assumes the version number is appended...
> >>
> >>         foreach my $ref ( $seq->annotation->get_Annotations('dblink') ) {
> >>             # if ($ref->comment eq 'DBSOURCE') {
> >>             $self->_print('DBSOURCE    accession ',
> >>                           $ref->primary_id, "\n");
> >>             # }
> >>         }
> >>
> >> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:
> >>
> >>> Here is the overload code:
> >>>
> >>> use overload '""' => sub {
> >>> 	(($_[0]->database ? $_[0]->database . ':' : '' )
> >>> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
> >>> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
> >>> 	|| '' };
> >>>
> >>> Except that the last '||' is redundant and unnecessary (it either
> >>> does nothing or replaces an empty string with an empty string), I
> >>> don't see the potential for duplicating the version number here -
> >>> unless primary_id() did that, which I don't see it doing.
> >>>
> >>> So, to me this seems to come from a parsing error in the
> >>> beginning, rather than an erroneous mangling of version into
> >>> primary_id later.
> >>>
> >>> Is someone in the position to confirm this?
> >>>
> >>> 	-hilmar
> >>>
> >>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
> >>>
> >>>> So I'm unsure what we should do here.
> >>>>
> >>>> We can certainly fix the problem which you report which is
> >>>> relying on
> >>>> the "" method -- if you were to do instead:
> >>>> print $_->database, ":", $_->primary_id, "\n";
> >>>>
> >>>> you'll get the right answer.  We at a minimum just fix the auto-
> >>>> string converting method to do The Right Thing.
> >>>>
> >>>> But I am not sure if we should keep the version out of the
> >>>> primary_id
> >>>> field.  This will require some rejiggering in several modules
> >>>> when it
> >>>> comes to printing DBlinks and I don't want to do this before the
> >>>> release. I also am not sure if there was an explicit reason why
> >>>> someone did put the version information in the primary_id. (I
> >>>> hope it
> >>>> wasn't me because I don't think I'm going to remember why).
> >>>>
> >>>> Does anyone else have a strong feeling?
> >>>>
> >>>> -jason
> >>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
> >>>>
> >>>>> Hello,
> >>>>>
> >>>>> I noticed a little problem with the Annotation "DBLink" from
> >>>>> GenBank entries
> >>>>>
> >>>>> When I run:
> >>>>>
> >>>>> perl -MBio::DB::GenBank -e 'my $gi =
> >>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my
> >>>>> $seqio =
> >>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
> >>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
> >>>>> ("dblink");
> >>>>> for(@annotations) { print $_, "\n";} print $INC{
> >>>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
> >>>>>
> >>>>> This yields:
> >>>>>
> >>>>>    GenBank:AL591065.17.17
> >>>>>
> >>>>> and the place where the used Bio/Annotation/DBLink.pm resides.
> >>>>>
> >>>>> Can others repeat this?
> >>>>>
> >>>>> I have dug into the source a little and Bio::Annotation::DBLink
> >>>>> seems to
> >>>>> be the place where this happens: it has a concatenation which
> >>>>> leads to
> >>>>> that repeated version number.
> >>>>>
> >>>>> Is this something that I should fix "client-side", so to speak, or
> >>>>> is it
> >>>>> worthwhile to add some logic to that concatenation to prevent
> >>>>> this?
> >>>>>
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Eric
> >>>>>
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> Bioperl-l mailing list
> >>>>> Bioperl-l at lists.open-bio.org
> >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>>
> >>>> --
> >>>> Jason Stajich, PhD
> >>>> Miller Research Fellow
> >>>> University of California
> >>>> Dept of Plant and Microbial Biology
> >>>> 321 Koshland Hall #3102
> >>>> Berkeley, CA 94720-3102
> >>>> lab: 510.642.8441
> >>>> http://pmb.berkeley.edu/~taylor/people/js.html
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Bioperl-l mailing list
> >>>> Bioperl-l at lists.open-bio.org
> >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>>
> >>>
> >>> --
> >>> ===========================================================
> >>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> >>> ===========================================================
> >>>
> >>>
> >>>
> >>>
> >>>
> >>
> >> --
> >> Jason Stajich, PhD
> >> Miller Research Fellow
> >> University of California
> >> Dept of Plant and Microbial Biology
> >> 321 Koshland Hall #3102
> >> Berkeley, CA 94720-3102
> >> lab: 510.642.8441
> >> http://pmb.berkeley.edu/~taylor/people/js.html
> >>
> >>
> >
> > --
> > ===========================================================
> > : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> > ===========================================================
> >
> >
> >
> >
> >
> 
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From jason at bioperl.org  Thu Oct 19 14:48:28 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 11:48:28 -0700
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
	<1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
Message-ID: <67650240-D61B-4842-AE7C-75F15F608F6F@bioperl.org>

program_name()
  Should return the name of the program

executable()
  Is a function that you don't have to mess with that tries to find  
the executable named  program_name() based on your PATH.
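
As a rough illustration of what a PATH lookup of this kind involves, here is a minimal sketch. It is not the actual WrapperBase code; find_on_path is a hypothetical helper name, and only the core File::Spec module is assumed.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the full path of the first executable
# called $name found in the PATH directories, or undef if none is found.
sub find_on_path {
    my ($name) = @_;
    # On Windows the file usually carries an .exe suffix
    my @candidates = $^O eq 'MSWin32' ? ($name, "$name.exe") : ($name);
    for my $dir (File::Spec->path) {
        for my $candidate (@candidates) {
            my $full = File::Spec->catfile($dir, $candidate);
            return $full if -f $full && -x _;
        }
    }
    return;
}

my $exe = find_on_path('clustalw');
print defined $exe ? "found: $exe\n" : "clustalw not found on PATH\n";
```

So a caller can tell "not installed" apart from "installed here" by checking whether the result is defined.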


-jason
On Oct 19, 2006, at 11:14 AM, Nathan Haigh wrote:

> I have a few questions about how bioperl-run modules work.
>
> 1) How does a module define the name of the executable that it  
> uses?
> 2) Is there a way to test what this is?
> 3) Does $factory->executable return this or does it only return the  
> name if it successfully found it?
>
> Thanks
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From jason at bioperl.org  Thu Oct 19 17:06:43 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 14:06:43 -0700
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161283620.4537c82501c43@webmail.shef.ac.uk>
References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
	<1161283620.4537c82501c43@webmail.shef.ac.uk>
Message-ID: <AA1A41EC-C0E1-49C3-818E-64210971E331@bioperl.org>

It can be reset now, but of course this is not a very nice way of doing it:

$Bio::Tools::Run::Alignment::Clustalw::PROGRAM_NAME = 'clustalw_smp';

I am not sure if there are pros and cons to making it a getter- 
setter, but if you want to run with it, please do.
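
For what it's worth, a getter/setter along those lines could look roughly like this. It is only a sketch: My::Wrapper is a made-up class, not any existing bioperl-run module, and the idea is simply that a value set on the object overrides the package default.

```perl
use strict;
use warnings;

package My::Wrapper;   # hypothetical class, for illustration only

our $PROGRAM_NAME = 'clustalw';   # package-wide default

sub new { return bless {}, shift }

# Getter/setter: a value set on the object overrides the package default
sub program_name {
    my ($self, $name) = @_;
    $self->{_program_name} = $name if defined $name;
    return defined $self->{_program_name}
        ? $self->{_program_name}
        : $PROGRAM_NAME;
}

package main;

my $factory = My::Wrapper->new;
print $factory->program_name, "\n";          # clustalw
$factory->program_name('clustalw_smp');
print $factory->program_name, "\n";          # clustalw_smp
print My::Wrapper->new->program_name, "\n";  # clustalw
```

The override stays with the one factory object, so other objects still see the default.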

It has been hard to keep people adhering to a standard across the  
whole run system (and the standard has changed a bit), so some  
auditing is warranted.

-jason

On Oct 19, 2006, at 11:47 AM, Nathan Haigh wrote:

> Quoting Chris Fields <cjfields at uiuc.edu>:
>
>> I think a lot of the bioperl-run modules use  
>> Bio::Tools::Run::WrapperBase
>> but I'm not sure.  I haven't used them very much myself but plan  
>> on making
>> wrappers at some point soon for some programs I use.
>>
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>
> On closer inspection of a couple of other modules (Clustalw.pm and  
> TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME  
> and have a sub
> (program_name) that simply returns this value. I'd like to see the  
> program_name become a getter/setter so users can change the default  
> and have the
> string stored in the factory object.
>
> Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core  
> and not bioperl-run? I suppose not, since bioperl-core is a prereq  
> for bioperl-run, but
> wouldn't it make sense for it to go in bioperl-run?
>
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From torsten.seemann at infotech.monash.edu.au  Thu Oct 19 19:24:03 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Fri, 20 Oct 2006 09:24:03 +1000
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161279505.4537b811e143f@webmail.shef.ac.uk>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
	<1161279505.4537b811e143f@webmail.shef.ac.uk>
Message-ID: <45380913.3070506@infotech.monash.edu.au>

Nathan,

> use strict;
> use Bio::Root::IO;  # cant test for this, might be needed to get Test::More

Use File::Spec->catfile, as Bio::Root::IO->catfile just delegates to it
anyway, and File::Spec is "guaranteed" to be installed with Perl 5.6+.

>     use lib Bio::Root::IO->catfile('t','lib');

Simpler as:
	use lib 't/lib';
I understand that 'lib.pm' accepts Unix-style directories REGARDLESS of
the native platform.
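
A minimal sketch of the fallback idea, assuming a bundled copy of Test::More sits in t/lib (that directory layout is an assumption for illustration):

```perl
use strict;
use warnings;
use File::Spec;

BEGIN {
    # Prefer an installed Test::More; fall back to a bundled copy in t/lib
    eval { require Test::More; 1 } or do {
        require lib;
        lib->import('t/lib');
        require Test::More;
    };
    Test::More->import(tests => 2);
}

# File::Spec builds the native form when an explicit path is needed
my $dir = File::Spec->catfile('t', 'lib');
ok(defined $dir, 'catfile returned a path');
like($dir, qr/^t.lib$/, "native path is '$dir'");
```

The BEGIN block makes sure the test functions are imported at compile time, whichever copy ends up being loaded.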

-- 
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia


From cjfields at uiuc.edu  Thu Oct 19 20:29:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 19:29:11 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <45380913.3070506@infotech.monash.edu.au>
Message-ID: <000f01c6f3de$c3d91170$15327e82@pyrimidine>

> Nathan,
> 
> > use strict;
> > use Bio::Root::IO;  # cant test for this, might be needed to get
> Test::More
> 
> use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway,
> and File::Spec is "guaranteed" to be installed with Perl 5.6+.
> 
> >     use lib Bio::Root::IO->catfile('t','lib');
> 
> Simpler as:
> 	use lib 't/lib';
> I understand the 'lib.pm' accepts Unix style directories REGARDLESS of
> native
> platform.
> 
> --
> Torsten Seemann
> Victorian Bioinformatics Consortium, Monash University, Australia

That is true, at least for WinXP (not sure about older Windows versions out
there).  I was using Bio::Root::IO->catfile but found that "use lib 't/lib'"
works.  I may have a few of the 'catfile' versions floating around out there,
which may be where that originated.

Note that if you plan on using Test::More with the bioperl-run test suite,
you should add it to the bioperl-run CVS distribution directory in 't/lib'.
Most people will have it installed, but you never know.

Chris


From cjfields at uiuc.edu  Thu Oct 19 20:33:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 19:33:22 -0500
Subject: [Bioperl-l] Prabu Raja sent you this link
In-Reply-To: <20061020001136.86586.qmail@x05.namesdatabase.com>
Message-ID: <001001c6f3df$598a24c0$15327e82@pyrimidine>

That Prabu Raja sure gets around...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 



From keithplayer at hotmail.com  Thu Oct 19 22:13:52 2006
From: keithplayer at hotmail.com (Keith Player)
Date: Fri, 20 Oct 2006 02:13:52 +0000 (UTC)
Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning
Message-ID: <loom.20061020T041338-193@post.gmane.org>

I know that there may be some changes resulting from new GFF3 implementations, 
but thought I would see if the following is useful anyway.

I implemented the R-tree binning schema as used by Bio::DB::GFF::Util::Binning 
and as mentioned in this article:

I tested the following query on a normal table (no binning); it assumes 
that you know the longest range in the table.  For example, take a table of 
human genes, where the longest gene we know of is around 2.4Mb.

 SELECT COUNT(*) AS count FROM groups g WHERE g.start > max(0,[start-2.4Mb]) AND 
g.start < [end] AND g.end > [start] AND g.chromosome = '1'

so for 100Mb:101Mb

SELECT COUNT(*) AS count FROM groups g WHERE g.start > 97600000 AND g.start < 
101000000 AND g.end > 100000000 AND g.chromosome = '1'


where [start] and [end] define the region of interest.  This query outperforms 
the R-Tree implementation on all tests that I have performed (for lengths of 
200bp to 10Mb across a whole chromosome).  Could this be of some practical use?
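
To make the logic of the three conditions concrete, here is a small in-memory sketch in Perl. The feature list is made-up toy data and count_overlaps is a hypothetical name; it only mirrors the WHERE clause and says nothing about database performance.

```perl
use strict;
use warnings;

# Assumption: no feature is longer than $max_len
my $max_len = 2_400_000;

# Toy stand-in for the table: [start, end] pairs, hypothetical data
my @features = (
    [ 97_000_000,  97_500_000],
    [ 98_000_000, 100_200_000],
    [100_500_000, 100_600_000],
    [101_500_000, 101_600_000],
);

# Count features overlapping [$start, $end] using the same three
# conditions as the WHERE clause
sub count_overlaps {
    my ($start, $end) = @_;
    my $lower = $start - $max_len;
    $lower = 0 if $lower < 0;
    return scalar grep {
        $_->[0] > $lower && $_->[0] < $end && $_->[1] > $start
    } @features;
}

print count_overlaps(100_000_000, 101_000_000), "\n";   # 2
```

The lower bound on start is what lets an index on start cut the candidate rows down; it is only safe when the maximum feature length really is known.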


From jason at bioperl.org  Thu Oct 19 11:50:49 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 08:50:49 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
	<0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
Message-ID: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>

Well, there is an explicit addition of the version to the primary id, so  
it isn't so much a parsing error as a deliberate decision to append it;
see Bio::SeqIO::genbank.

to make the dblink

    $annotation->add_Annotation
        ('dblink',
         Bio::Annotation::DBLink->new
             (-primary_id => $id . "." . $version,
              -version    => $version,
              -database   => $db,
              -tagname    => 'dblink'));

and the code to print the dblink back out in the writer already  
assumes the version number is appended...

    foreach my $ref ( $seq->annotation->get_Annotations('dblink') ) {
        # if ($ref->comment eq 'DBSOURCE') {
        $self->_print('DBSOURCE    accession ',
                      $ref->primary_id, "\n");
        # }
    }
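
To show how the two pieces interact, here is a self-contained sketch. My::DBLink is a made-up stand-in class, not Bio::Annotation::DBLink itself; it only copies the stringification logic from the overload quoted below and is built the way the parser builds the dblink.

```perl
use strict;
use warnings;

package My::DBLink;   # hypothetical stand-in, not Bio::Annotation::DBLink

# Same stringification logic as the quoted overload
use overload '""' => sub {
    my $self = shift;
    return join '',
        ($self->{database}   ? $self->{database} . ':' : ''),
        ($self->{primary_id} ? $self->{primary_id}     : ''),
        ($self->{version}    ? '.' . $self->{version}  : '');
};

sub new { my ($class, %args) = @_; return bless {%args}, $class }

package main;

my ($id, $version, $db) = ('AL591065', 17, 'GenBank');

# The version is stored twice: inside primary_id and in its own field
my $link = My::DBLink->new(primary_id => "$id.$version",
                           version    => $version,
                           database   => $db);

print "$link\n";                                          # GenBank:AL591065.17.17
print $link->{database}, ':', $link->{primary_id}, "\n";  # GenBank:AL591065.17
```

So the repeat comes from the combination: either the parser or the overload has to stop adding the version.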

On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:

> Here is the overload code:
>
> use overload '""' => sub {
> 	(($_[0]->database ? $_[0]->database . ':' : '' )
> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
> 	|| '' };
>
> Except that the last '||' is redundant and unnecessary (it either  
> does nothing or replaces an empty string with an empty string), I  
> don't see the potential for duplicating the version number here -  
> unless primary_id() did that, which I don't see it doing.
>
> So, to me this seems to come from a parsing error in the beginning,  
> rather than an erroneous mangling of version into primary_id later.
>
> Is someone in the position to confirm this?
>
> 	-hilmar
>
> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>
>> So I'm unsure what we should do here.
>>
>> We can certainly fix the problem which you report which is relying on
>> the "" method -- if you were to do instead:
>> print $_->database, ":", $_->primary_id, "\n";
>>
>> you'll get the right answer.  We at a minimum just fix the auto-
>> string converting method to do The Right Thing.
>>
>> But I am not sure if we should keep the version out of the primary_id
>> field.  This will require some rejiggering in several modules when it
>> comes to printing DBlinks and I don't want to do this before the
>> release. I also am not sure if there was an explicit reason why
>> someone did put the version information in the primary_id. (I hope it
>> wasn't me because I don't think I'm going to remember why).
>>
>> Does anyone else have a strong feeling?
>>
>> -jason
>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>
>>> Hello,
>>>
>>> I noticed a little problem with the Annotation "DBLink" from
>>> GenBank entries
>>>
>>> When I run:
>>>
>>> perl -MBio::DB::GenBank -e 'my $gi =
>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my  
>>> $seqio =
>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>>> ("dblink");
>>> for(@annotations) { print $_, "\n";} print $INC{
>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>>
>>> This yields:
>>>
>>>    GenBank:AL591065.17.17
>>>
>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>
>>> Can others repeat this?
>>>
>>> I have dug into the source a little and Bio::Annotation::DBLink
>>> seems to
>>> be the place where this happens: it has a concatenation which  
>>> leads to
>>> that repeated version number.
>>>
>>> Is this something that I should fix "client-side", so to speak, or
>>> is it
>>> worthwhile to add some logic to that concatenation to prevent this?
>>>
>>>
>>> Thanks,
>>>
>>> Eric
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> --
>> Jason Stajich, PhD
>> Miller Research Fellow
>> University of California
>> Dept of Plant and Microbial Biology
>> 321 Koshland Hall #3102
>> Berkeley, CA 94720-3102
>> lab: 510.642.8441
>> http://pmb.berkeley.edu/~taylor/people/js.html
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From n.haigh at sheffield.ac.uk  Fri Oct 20 04:35:03 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 20 Oct 2006 08:35:03 +0000
Subject: [Bioperl-l] test::more template
In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
Message-ID: <45388A37.7040505@sheffield.ac.uk>

Chris Fields wrote:
>> Nathan,
>>
>>     
>>> use strict;
>>> use Bio::Root::IO;  # cant test for this, might be needed to get
>>>       
>> Test::More
>>
>> use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway,
>> and File::Spec is "guaranteed" to be installed with Perl 5.6+.
>>
>>     
>>>     use lib Bio::Root::IO->catfile('t','lib');
>>>       
>> Simpler as:
>> 	use lib 't/lib';
>> I understand the 'lib.pm' accepts Unix style directories REGARDLESS of
>> native
>> platform.
>>
>> --
>> Torsten Seemann
>> Victorian Bioinformatics Consortium, Monash University, Australia
>>     
>
> That is true, at least for WinXP (not sure about older Windows versions out
> there).  I was using 'Root::IO->catfile' but found 'use lib 't/lib' works.
> I may have a few of the 'catfile' versions floating around out there, which
> may be where that originated.
>
> Note that if you plan on using Test::More with the bioperl-run test suite,
> you should add it to the bioperl-run CVS distribution directory in 't/lib'.
> Most people will have it installed, but you never know.
>
> Chris
>
>
>   
What is the reason for including Test::More in 't/lib' rather than
having it as a prereq?

-- 
> A: Yes.
>> Q: Are you sure?
>>     
>>> A: Because it reverses the logical flow of conversation.
>>>       
>>>> Q: Why is top posting frowned upon?
>>>>         
Get Thunderbird <http://www.mozilla.org/products/thunderbird/>


From n.haigh at sheffield.ac.uk  Fri Oct 20 05:27:19 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 20 Oct 2006 10:27:19 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
Message-ID: <45389677.1000709@sheffield.ac.uk>

Is it really necessary to specify the number of tests that are to be
conducted in advance? It seems a bit annoying to have to count the
number of tests in the script, or to run the test just to see how many
tests were done. We could just use:
use Test::More 'no_plan';

And then it's up to Test::More to keep track of how many tests it has
run. The only thing then to worry about is how many tests are in a SKIP
block if the skip criteria are met. That is, unless there is a good
reason to specify a plan that I am unaware of.

Thanks
Nath


From bix at sendu.me.uk  Fri Oct 20 06:01:09 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 11:01:09 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45389677.1000709@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45389677.1000709@sheffield.ac.uk>
Message-ID: <45389E65.6080908@sendu.me.uk>

Nathan Haigh wrote:
> Is it really necessary to specify the number of tests that are to be
> conducted in advance? It seems a bit annoying to have to count the
> number of tests in the script or to run the test just to see how many
> tests were done, we could just use:
> use Test::More 'no_plan';

It's very important to have a plan. That way you know all the tests 
actually ran and weren't skipped (either due to an actual SKIP block or 
an if block that returned false due to a bug, or a for/foreach/while 
that didn't loop enough times due to a bug, or any number of other reasons).
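
A tiny sketch of the loop case (the input list is made up for illustration):

```perl
use strict;
use warnings;
use Test::More tests => 3;   # the plan: exactly three tests must run

my @inputs = (1, 2, 3);

# If a bug shortened @inputs, fewer than three tests would run and the
# harness would report the mismatch with the plan, even though every
# test that did run passed.
for my $n (@inputs) {
    ok($n > 0, "input $n is positive");
}
```

With 'no_plan' the shortened loop would simply report fewer passing tests and nobody would notice.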


From bix at sendu.me.uk  Fri Oct 20 06:04:48 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 11:04:48 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45388A37.7040505@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45388A37.7040505@sheffield.ac.uk>
Message-ID: <45389F40.5060601@sendu.me.uk>

Nathan S. Haigh wrote:
> Chris Fields wrote:
>
>> Note that if you plan on using Test::More with the bioperl-run test suite,
>> you should add it to the bioperl-run CVS distribution directory in 't/lib'.
>> Most people will have it installed, but you never know.
>
> What is the reason for including Test::More in 't/lib' rather than
> having it as a prereq?

Because we want to ensure that the test suite runs and tells you real 
problems (if any) about the code (Bioperl) that it is testing, not 
problems about actually running the tests (which are NOT required for 
using Bioperl, so cannot be considered 'pre-requisites').


From n.haigh at sheffield.ac.uk  Fri Oct 20 06:54:30 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 20 Oct 2006 11:54:30 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45389E65.6080908@sendu.me.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk>
Message-ID: <4538AAE6.5070600@sheffield.ac.uk>

If there are known bugs in a particular version of software, what is the
best approach for dealing with tests that would fail due to this bug?
Simply skip those tests that would be affected by the bug, or to fail if
the affected version is detected and report the reason so the user is
informed? Or simply bump the minimum version to one above the affected
versions?

For example, t/Clustalw has a test for at least version 1.8. It then has
some profile alignment tests that are only run if version > 1.82 is
installed. It states that versions 1.81 and 1.82 are affected by a
profile alignment bug - which I assume would make the tests fail.
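
For reference, the skip approach looks roughly like this. It is a sketch only: the version is hard-coded here, whereas a real test would ask the installed program, and the ok(1, ...) lines are placeholders for the real profile alignment tests.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical: in a real test this would come from the wrapper object
my $version = '1.81';

# Plain numeric comparison is only safe for simple x.yy version strings
ok($version >= 1.8, 'clustalw is at least version 1.8');

SKIP: {
    # Versions 1.81 and 1.82 have the profile alignment bug
    skip("profile alignment bug in clustalw $version", 2)
        unless $version > 1.82;
    ok(1, 'profile alignment test 1');   # placeholders for real tests
    ok(1, 'profile alignment test 2');
}
```

The skip message shows up in the test output, so the user is at least told why those tests did not run.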

Cheers
Nath


From bix at sendu.me.uk  Fri Oct 20 07:06:07 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 12:06:07 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <4538AAE6.5070600@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk>
	<4538AAE6.5070600@sheffield.ac.uk>
Message-ID: <4538AD9F.8040003@sendu.me.uk>

Nathan Haigh wrote:
> If there are known bugs in a particular version of software, what is the
> best approach for dealing with tests that would fail due to this bug?
> Simply skip those tests that would be affected by the bug, or to fail if
> the affected version is detected and report the reason so the user is
> informed? Or simply bump the minimum version to one above the affected
> versions?
> 
> For example, t/Clustalw has a test for at least version 1.8. It then has
> some profile alignment tests that are only run if version > 1.82 is
> installed. It states that versions 1.81 and 1.82 are affected by a
> profile alignment bug - which i assume would make the tests fail.

Specific cases like this, I'd discuss on the list/ with the author of
the module in question. Maybe there is some great need to allow usage
with <1.81?

My view, based purely on what you've said above: bump the prerequisite
to a version that works.


From cjfields at uiuc.edu  Fri Oct 20 08:36:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 07:36:37 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <45388A37.7040505@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45388A37.7040505@sheffield.ac.uk>
Message-ID: <80A2D210-B0DB-4CD2-9B56-A38097F4F63F@uiuc.edu>


>> ...
>>
> What is the reason for including Test::More in 't/lib' rather than
> having it as a prereq?

We could do that.  Many CPAN modules include it in 't/lib' b/c it is  
only needed for testing purposes.

Chris

>
> -- 
>> A: Yes.
>>> Q: Are you sure?
>>>
>>>> A: Because it reverses the logical flow of conversation.
>>>>
>>>>> Q: Why is top posting frowned upon?
>>>>>
> Get Thunderbird <http://www.mozilla.org/products/thunderbird/>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Fri Oct 20 10:44:29 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 15:44:29 +0100
Subject: [Bioperl-l] Updated Makefile.PL
Message-ID: <4538E0CD.1030908@sendu.me.uk>

Hi,
I've just committed an updated Makefile.PL to HEAD for bioperl-live. 
Could some people test it on multiple platforms and confirm it is ok 
(try out the different possible options as well)?

(NB. in the below, 'pre-reqs' are things the makefile considers optional 
dependencies)

Note that some pre-reqs have been removed:
# DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end 
up requiring it but only after the user makes an explicit choice by 
typing 'DBD::mysql' in their own code to supply as an option to Bioperl 
code)
# File::Temp (standard in 5.6.1)


This pre-req was wrong:
# Data::Stag::Writer
and has been replaced with:
Data::Stag::XMLWriter


Also, I note that very many Bioperl modules need IO::String, including 
Bio::SeqIO, so I'm not sure to what extent we can pretend it is an 
optional module. I didn't make any change though.


I don't know if these changes affect the Windows ppm Nathan, or anything 
else (Bundle?)?

The INSTALL docs need updating with these new and improved pre-reqs 
(note that some pre-reqs had wrong/not enough Bioperl modules listed as 
needing them); does someone want to correct the wiki (based on the new 
Makefile.PL) and then Chris can re-create the text version?


From hlapp at gmx.net  Fri Oct 20 11:03:34 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 20 Oct 2006 11:03:34 -0400
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <4538E0CD.1030908@sendu.me.uk>
References: <4538E0CD.1030908@sendu.me.uk>
Message-ID: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>


On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote:

> Also, I note that very many Bioperl modules need IO::String, including
> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
> optional module. I didn't make any change though.

I agree. There really aren't that many terribly useful things you can  
do with Bioperl w/o having IO::String installed, which is in stark  
contrast to many other dependencies.

I don't have a problem with making it (and a few others used all over  
the place) required, to better contrast them with the dependencies  
that are really optional (and not needed for 90% of users).
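
As a sketch of the distinction (not the actual Makefile.PL logic; have_module is a hypothetical helper, and which module goes in which list is exactly the decision being discussed):

```perl
use strict;
use warnings;

# Hypothetical split; which modules belong in which list is a project decision
my @required = ('IO::String');
my @optional = ('Data::Stag::XMLWriter');

# Return 1 if the named module can be loaded, 0 otherwise
sub have_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (@required) {
    warn "Required module $module is missing\n"
        unless have_module($module);
}
for my $module (@optional) {
    print "Optional module $module not installed; some features are unavailable\n"
        unless have_module($module);
}
```

A missing required module would normally stop the install, while a missing optional one only produces a note.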

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Oct 20 11:18:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 10:18:32 -0500
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <4538E0CD.1030908@sendu.me.uk>
Message-ID: <001501c6f45b$019103c0$15327e82@pyrimidine>

> Hi,
> I've just committed an updated Makefile.PL to HEAD for bioperl-live.
> Could some people test it on multiple platforms and confirm it is ok
> (try out the different possible options as well)?
> 
> (NB. in the below, 'pre-reqs' are things the makefile considers optional
> dependencies)
> 
> Note that some pre-reqs have been removed:
> # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end
> up requiring it but only after the user makes an explicit choice by
> typing 'DBD::mysql' in their own code to supply as an option to Bioperl
> code)
> # File::Temp (standard in 5.6.1)

I'll try it out on WinXP and Mac OS X.  BTW, do any of Lincoln's Bio::DB*
modules use DBD::mysql?  Bio::DB::GFF comes to mind.  I don't think it should
be an absolute requirement, though.

If we plan on removing those, then we should also remove them from
Bundle::Bioperl (if they are present).

> This pre-req was wrong:
> # Data::Stag::Writer
> and has been replaced with:
> Data::Stag::XMLWriter
> 
> 
> Also, I note that very many Bioperl modules need IO::String, including
> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
> optional module. I didn't make any change though.

Do they all require IO::String or is it an option?  There are a few
instances (WebDBSeqI-implementing, for instance) where this is presented as
an option for most OS's (along with the default, pipeline, and tempfile).
However, it is currently used by default with Windows due to lack of
pipe/fork support at the time.

BTW, the latter may now work with WinXP ActivePerl.  ActiveState has been
working on WinXP fork() emulation for a while, but I think it is still
somewhat experimental.  

> I don't know if these changes affect the Windows ppm Nathan, or anything
> else (Bundle?)?
> 
> The INSTALL docs need updating with these new and improved pre-reqs
> (note that some pre-reqs had wrong/not enough Bioperl modules listed as
> needing them); does someone want to correct the wiki (based on the new
> Makefile.PL) and then Chris can re-create the text version?

Easier to just modify the text version based on what is changed in the wiki,
at least for the time being.  The text dumping from elinks/lynx isn't
foolproof re: tables and such, which is one reason I think we should move
the prereqs to a separate file as it's easier to maintain long-term (this
seems to be where most changes occur anyway).  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Fri Oct 20 11:23:38 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 16:23:38 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk>
References: <45375615.1020603@sheffield.ac.uk>	<45379BBB.1040400@sheffield.ac.uk>
	<1161270180.453793a432e4f@webmail.shef.ac.uk>
Message-ID: <4538E9FA.60701@sendu.me.uk>

Nathan Haigh wrote:
> I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be
> consistent with other tests.
> 
> Failing that - Is there a good test writing style I should follow in one of the other test files?

I originally based mine on one of Chris's EUtilities tests, but now 
refer to t/ESEfinder.t since it is small and demonstrates all the major 
tricky things you might have to do - skip remote tests if no 
BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests 
under some condition, fall-back to t/lib for Test::More if necessary.

(Though I just spotted an oops in the latter...)
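
The skip logic listed above can be sketched roughly as follows. This is
not a copy of t/ESEfinder.t: remote_tests_enabled() and fetch_remote()
are hypothetical stand-ins, and the t/lib fallback location is assumed
from the description.

```perl
use strict;
use warnings;

BEGIN {
    # Fall back to a bundled copy (assumed to live in t/lib) when
    # Test::More is not installed system-wide.
    unless (eval { require Test::More; 1 }) {
        require lib;
        lib->import('t/lib');
    }
}
use Test::More tests => 3;

# Hypothetical helper: network tests run only when BIOPERLDEBUG is true.
sub remote_tests_enabled {
    my ($env) = @_;
    return $env->{BIOPERLDEBUG} ? 1 : 0;
}

# Hypothetical stand-in for a remote query. A real test would contact
# the server here and could die on a network failure.
sub fetch_remote {
    return 'expected';
}

ok(1, 'local test always runs');

SKIP: {
    skip 'set BIOPERLDEBUG=1 to run tests that need network access', 2
        unless remote_tests_enabled(\%ENV);

    my $result = eval { fetch_remote() };
    skip "remote server unavailable: $@", 2 if $@;

    ok(defined $result, 'got a response');
    is($result, 'expected', 'response has the expected content');
}
```

The count passed to skip() has to match the number of tests left in the
block, otherwise the plan no longer adds up.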


From cjfields at uiuc.edu  Fri Oct 20 11:38:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 10:38:02 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <4538E9FA.60701@sendu.me.uk>
Message-ID: <001601c6f45d$bb824350$15327e82@pyrimidine>

> Nathan Haigh wrote:
> > I thought I'd have my first proper try at writing some tests. I was
> wondering if there is a template test file that I should use/study in
> order to be
> > consistent with other tests.
> >
> > Failing that - Is there a good test writing style I should follow in one
> of the other test files?
> 
> I originally based mine on one of Chris's EUtilities tests, but now
> refer to t/ESEfinder.t since it is small and demonstrates all the major
> tricky things you might have to do - skip remote tests if no
> BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests
> under some condition, fall-back to t/lib for Test::More if necessary.
> 
> (Though I just spotted an oops in the latter...)

I agree.  The EUtilities tests are quite long.  I plan on eventually cutting
out some of them.  Making them somewhat less prone to changes in returned XML
data has also been a pain, as demonstrated by some of the tests from MAIN
now failing... d'oh!

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From bix at sendu.me.uk  Fri Oct 20 11:39:32 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 16:39:32 +0100
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <001501c6f45b$019103c0$15327e82@pyrimidine>
References: <001501c6f45b$019103c0$15327e82@pyrimidine>
Message-ID: <4538EDB4.3030500@sendu.me.uk>

Chris Fields wrote:
> BTW, do any of Lincoln's Bio::DB*
> use DBD::mysql?  Bio::DB::GFF comes to mind.

No, just a require on a user-passed variable as I described.


>> Also, I note that very many Bioperl modules need IO::String, including
>> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
>> optional module. I didn't make any change though.
> 
> Do they all require IO::String or is it an option?

Oops, I take that back. Bio::SeqIO doesn't use IO::String. That's what 
you get for relying on grep output...
Many modules do still use it, but I suppose you could do useful 
things without it. So actually, let's keep it optional.


From cjfields at uiuc.edu  Fri Oct 20 16:32:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 15:32:32 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
Message-ID: <000001c6f486$df508930$15327e82@pyrimidine>


Seth, 

Did you work out the problem here?  There was a recent CVS update to OBDA
tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
apparently left data from tests in the database, which caused problems with
repeated test runs.

Chris

> > -----Original Message-----
> > From: bioperl-l-bounces <at> lists.open-bio.org [mailto:bioperl-l-
> > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ...
>
> > FAILED tests 10-12
> >     Failed 3/12 tests, 75.00% okay
> > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> >
> --------------------------------------------------------------------------
> > -----
> > t\02species.t                 65    2   3.08%  63 65
> > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > t\16obda.t                    12    3  25.00%  10-12
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l <at> lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

-- 
Best Regards,

Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From olenka.m at gmail.com  Fri Oct 20 17:47:15 2006
From: olenka.m at gmail.com (Olena Morozova)
Date: Fri, 20 Oct 2006 14:47:15 -0700
Subject: [Bioperl-l] GO annotations
Message-ID: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com>

Dear all,

Does anyone know an easy way to get GO-BP annotations for ensembl genes?

Thank you very much for your help,
Olena


From sdavis2 at mail.nih.gov  Sat Oct 21 11:05:26 2006
From: sdavis2 at mail.nih.gov (Davis, Sean (NIH/NCI) [E])
Date: Sat, 21 Oct 2006 11:05:26 -0400
Subject: [Bioperl-l] GO annotations
References: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com>
Message-ID: <014DBF86B19310419F0DF8910FC56457240CE3@nihcesmlbx10.nih.gov>

You can use the ensembl perl API, or (more simply) use the Ensembl MART interface:

http://www.ensembl.org/Multi/martview

Sean


-----Original Message-----
From: Olena Morozova [mailto:olenka.m at gmail.com]
Sent: Fri 10/20/2006 5:47 PM
To: bioperl-l
Subject: [Bioperl-l] GO annotations
 
Dear all,

Does anyone know an easy way to get GO-BP annotations for ensembl genes?

Thank you very much for your help,
Olena


From n.haigh at sheffield.ac.uk  Sun Oct 22 06:34:51 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 22 Oct 2006 10:34:51 +0000
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
References: <4538E0CD.1030908@sendu.me.uk>
	<3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
Message-ID: <453B494B.7040702@sheffield.ac.uk>

Hilmar Lapp wrote:
> On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote:
>
>   
>> Also, I note that very many Bioperl modules need IO::String, including
>> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
>> optional module. I didn't make any change though.
>>     
>
> I agree. There's really not that many terribly useful things you can  
> do with Bioperl w/o having IO::String installed, which is in stark  
> contrast to many other dependencies.
>
> I don't have a problem with making it (and a few others used all over  
> the place) required, to better contrast them with the dependencies  
> that are really optional (and not needed for 90% of users).
>
> 	-hilmar
>
>   

Is it possible to make a distinction in Makefile.PL between those
modules that are an absolute must for Bioperl-core and those which are
optional and should go into Bundle::BioPerl?

Once I'm sure what should be "optional" I'll do the Bundle::BioPerl
package and PPDs.

Cheers
Nath


From vitacolonna at appliedgenomics.org  Sun Oct 22 09:04:48 2006
From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna)
Date: Sun, 22 Oct 2006 15:04:48 +0200
Subject: [Bioperl-l] Submission proposal: ABIF module
Message-ID: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>

Hi everybody,
I would like to submit to CPAN a module for reading and parsing the  
ABIF files (with .ab1 suffix) produced by Applied Biosystems  
sequencers. The need for such a module arose in our lab because the  
existing ABI module we found on CPAN had too limited functionality.  
As an example, our module allows us to easily produce analysis  
reports similar to the ones generated by the Sequencing Analysis  
software.

May I call the module Bio::ABIF? Or should I follow other conventions?

Nicola


From cjfields at uiuc.edu  Sun Oct 22 09:54:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 08:54:51 -0500
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
Message-ID: <F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>


On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:

> Hi everybody,
> I would like to submit to CPAN a module for reading and parsing the
> ABIF files (with .ab1 suffix) produced by Applied Biosystems
> sequencers. The need for such a module arose in our lab because the
> existing ABI module we found on CPAN had too limited functionality.
> As an example, our module allows us to easily produce analysis
> reports similar to the ones generated by the Sequencing Analysis
> software.
>
> May I call the module Bio::ABIF? Or should I follow other conventions?
>
> Nicola

It depends.  Does it interact with bioperl in any way?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Oct 22 09:57:18 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 08:57:18 -0500
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <453B494B.7040702@sheffield.ac.uk>
References: <4538E0CD.1030908@sendu.me.uk>
	<3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
	<453B494B.7040702@sheffield.ac.uk>
Message-ID: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>


On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote:

> Is it possible to  make a distinction in Makefile.PL between those
> modules that are an absolute must for Bioperl-core and those which are
> optional and should go into Bundle::BioPerl?
>
> Once I'm sure what should be "option" I'll do the Bundle::BioPerl
> package and PPD's.
>
> Cheers
> Nath

We probably should steer this way eventually.  Do you plan on placing  
prereqs required for bioperl core in the bioperl PPD and the  
'optional' ones in the bundle?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From vitacolonna at appliedgenomics.org  Sun Oct 22 10:16:26 2006
From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna)
Date: Sun, 22 Oct 2006 16:16:26 +0200
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
	<F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>
Message-ID: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>


On 22/ott/06, at 15:54, Chris Fields wrote:

>
> On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:
>
>> Hi everybody,
>> I would like to submit to CPAN a module for reading and parsing the
>> ABIF files (with .ab1 suffix) [...]
>> May I call the module Bio::ABIF? Or should I follow other  
>> conventions?
>
> It depends.  Does it interact with bioperl in any way?

No. Can you suggest a suitable pattern for the name?

Nicola


From cjfields at uiuc.edu  Sun Oct 22 10:55:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 09:55:46 -0500
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
	<F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>
	<8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>
Message-ID: <B4155C40-8E3D-4AA0-88F5-7A1FFBD3A134@uiuc.edu>

On Oct 22, 2006, at 9:16 AM, Nicola Vitacolonna wrote:

> On 22/ott/06, at 15:54, Chris Fields wrote:
>
>>
>> On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:
>>
>>> Hi everybody,
>>> I would like to submit to CPAN a module for reading and parsing the
>>> ABIF files (with .ab1 suffix) [...]
>>> May I call the module Bio::ABIF? Or should I follow other
>>> conventions?
>>
>> It depends.  Does it interact with bioperl in any way?
>
> No. Can you suggest a suitable pattern for the name?
>
> Nicola

I don't think it will be a problem to name it Bio::ABIF; there is  
already a Bio::ASN1::EntrezGene, and Rutger Vos's Bio::Phylo modules  
(the latter doesn't require BioPerl either).

Saying that, if you plan on contributing more CPAN modules with  
similar functionality (such as parsing other trace files), you might  
want to consider using a namespace that isn't limiting but doesn't  
conflict with Bioperl core (like Bio::Trace or similar, then name  
your module Bio::Trace::ABIF).  You can use search.cpan.org to check  
namespaces for conflicts.

Just as a note: we have bioperl-ext, which also parses ABI and other  
trace file formats.  It's a bit old now and needs updating, but is  
supposed to be quite fast (it uses the Staden io_lib C library via  
PerlXS).

-c

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From arareko at campus.iztacala.unam.mx  Sun Oct 22 13:26:37 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Sun, 22 Oct 2006 12:26:37 -0500
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <4538E0CD.1030908@sendu.me.uk>
References: <4538E0CD.1030908@sendu.me.uk>
Message-ID: <453BA9CD.4060107@campus.iztacala.unam.mx>

Works fine on FreeBSD.

Mauricio.

Sendu Bala wrote:
> Hi,
> I've just committed an updated Makefile.PL to HEAD for bioperl-live. 
> Could some people test it on multiple platforms and confirm it is ok 
> (try out the different possible options as well)?
> 
> (NB. in the below, 'pre-reqs' are things the makefile considers optional 
> dependencies)
> 
> Note that some pre-reqs have been removed:
> # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end 
> up requiring it but only after the user makes an explicit choice by 
> typing 'DBD::mysql' in their own code to supply as an option to Bioperl 
> code)
> # File::Temp (standard in 5.6.1)
> 
> 
> This pre-req was wrong:
> # Data::Stag::Writer
> and has been replaced with:
> Data::Stag::XMLWriter
> 
> 
> Also, I note that very many Bioperl modules need IO::String, including 
> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an 
> optional module. I didn't make any change though.
> 
> 
> I don't know if these changes affect the Windows ppm Nathan, or anything 
> else (Bundle?)?
> 
> The INSTALL docs need updating with these new and improved pre-reqs 
> (note that some pre-reqs had wrong/not enough Bioperl modules listed as 
> needing them); does someone want to correct the wiki (based on the new 
> Makefile.PL) and then Chris can re-create the text version?

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From n.haigh at sheffield.ac.uk  Sun Oct 22 15:37:07 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 22 Oct 2006 20:37:07 +0100
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>
References: <4538E0CD.1030908@sendu.me.uk>
	<3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
	<453B494B.7040702@sheffield.ac.uk>
	<7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>
Message-ID: <453BC863.4090803@sheffield.ac.uk>

Chris Fields wrote:
>
> On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote:
>
>> Is it possible to  make a distinction in Makefile.PL between those
>> modules that are an absolute must for Bioperl-core and those which are
>> optional and should go into Bundle::BioPerl?
>>
>> Once I'm sure what should be "option" I'll do the Bundle::BioPerl
>> package and PPD's.
>>
>> Cheers
>> Nath
>
> We probably should steer this way eventually.  Do you aim on placing 
> prereqs required for bioperl core in the bioperl PPD and the 
> 'optional' ones with the bundle?
>
That's correct. However, PPM will always try to update packages to the
latest available. Therefore, if at some point in the future a
dependency is removed, and thus removed from Bundle::BioPerl, a
situation may arise where an older version of BioPerl is running with
a recent version of Bundle::BioPerl and could have missing
dependencies - not ideal, but it is how things currently stand. The
process of making the Bundle::BioPerl PPD would be simplified if these
"optional" dependencies were separated from the "core" dependencies. If
one of the following solutions is possible (I'm not sure if they are),
it would be very useful:

1) Maintain 2 hashes in Makefile.PL that contain the "core" and
"optional" dependencies. I'm unsure of the way dependencies are ordered
during a "make ppd", but it may be possible to pass hash references of
both to PREREQ_PM in WriteMakefile() and have the "optional" dependencies
grouped separately from "core" dependencies in the ppd file - thus making
it easy to strip them out into a Bundle::BioPerl ppd.

2) Again, maintain 2 hashes in Makefile.PL that contain the "core" and 
"optional" dependencies. Have some Makefile setup that allows the 
generation of a Bundle::BioPerl ppd separately from the main Bioperl ppd.
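
A rough sketch of the two-hash idea, assuming nothing about how
"make ppd" orders its output. The assignment of modules to each group
below is hypothetical, and the WriteMakefile() call is left as a comment
so the snippet has no side effects.

```perl
use strict;
use warnings;

# Hypothetical split; which modules belong in which group is still
# undecided in this thread. A version of 0 means "any version".
my %core = (
    'IO::String' => 0,
);
my %optional = (
    'Data::Stag::XMLWriter'   => 0,
    'HTTP::Request::Common'   => 0,
    'Spreadsheet::ParseExcel' => 0,
);

# PREREQ_PM takes a single hash reference, so the two groups are merged.
sub prereq_pm {
    return +{ %core, %optional };
}

# The optional group alone would drive the Bundle::BioPerl contents.
sub bundle_contents {
    return join "\n", sort keys %optional;
}

# In a real Makefile.PL the merged hash would be handed on like this:
#   WriteMakefile( NAME => 'Bio', PREREQ_PM => prereq_pm() );
print bundle_contents(), "\n";
```

Keeping the groups apart in the source means the bundle list can be
generated rather than stripped out of the ppd file by hand.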

Like I said, these are just some thoughts and I'm not sure if they are 
even viable options.

Nath


From chhalling at alumni.ls.berkeley.edu  Sun Oct 22 19:45:33 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Sun, 22 Oct 2006 19:45:33 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
Message-ID: <453C029D.1070708@alumni.ls.berkeley.edu>

I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 
that prevent these modules from being installed:

Data::Stag::Writer (listed as Data::Stag::writer)
HTTP::Request::Common (listed as HTTP::Request::Common-)
Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel)

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From cjfields at uiuc.edu  Sun Oct 22 22:24:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 21:24:07 -0500
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
Message-ID: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>

Thanks for letting us know!  Did PPM4 throw errors or just silently  
pass them over?

Chris

On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote:

> I have found three misspellings in Bundle::BioPerl 2.1.6 of 17- 
> Oct-2006
> that prevent these modules from being installed:
>
> Data::Stag::Writer (listed as Data::Stag::writer)
> HTTP::Request::Common (listed as HTTP::Request::Common-)
> Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel)
>
> -- 
> Conrad Halling
> chhalling at alumni.ls.berkeley.edu
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Mon Oct 23 02:45:29 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 06:45:29 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
Message-ID: <453C6509.90005@sheffield.ac.uk>

Chris Fields wrote:
> Thanks for letting us know!  Did PPM4 throw errors or just silently  
> pass them over?
>
> Chris
>
> On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote:
>
>   
I believe he is talking about the bundle on cpan and not the ppd. I will
get this updated as soon as possible.

Sendu/Chris - can you confirm to me which Bioperl modules are essential
to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
reason for not putting *all* dependencies into the bundle?

Nath


From bix at sendu.me.uk  Mon Oct 23 02:43:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 07:43:36 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
Message-ID: <453C6498.5@sendu.me.uk>

Conrad Halling wrote:
> I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 
> that prevent these modules from being installed:
> 
> Data::Stag::Writer (listed as Data::Stag::writer)

This should be Data::Stag::XMLWriter

> HTTP::Request::Common (listed as HTTP::Request::Common-)
> Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel)


From bix at sendu.me.uk  Mon Oct 23 02:52:47 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 07:52:47 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C6509.90005@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk>
Message-ID: <453C66BF.1060008@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu/Chris - can you confirm to me which Bioperl modules are essential
> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
> reason for not putting *all* dependencies into the bundle?

AFAIK, there are no essential external dependencies. Everything in 
%packages in Makefile.PL, for example, is optional.

We had the discussion about making all the easy-to-install ones a forced 
requirement anyway (so that most things work out of the box), but 
perhaps we'll hold off on making such a change until after 1.5.2.


From jyotikshah at gmail.com  Mon Oct 23 03:10:43 2006
From: jyotikshah at gmail.com (Jyoti Shah)
Date: Mon, 23 Oct 2006 00:10:43 -0700
Subject: [Bioperl-l] short motif searches
Message-ID: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com>

Hi,

I am interested in searching motifs as small as 6 or 7 nucleotides in
genomic databases. I need exact matches. Is there any bioperl module
available which can help me do this? I tried WU BLAST with word size one,
but I am getting warning messages such as "WARNING: the maximum achievable
score of 7 in context 0 (frame +1) is less than the ungapped cutoff score S2
(=13). Exit code 0...". Any suggestions?

Thanks in advance,
Jyoti


From bix at sendu.me.uk  Mon Oct 23 03:55:40 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 08:55:40 +0100
Subject: [Bioperl-l] short motif searches
In-Reply-To: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com>
References: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com>
Message-ID: <453C757C.1010408@sendu.me.uk>

Jyoti Shah wrote:
> Hi,
> 
> I am interested in searching motifs as small as 6 or 7 nucleotides in
> genomic databases. I need exact matches. Is there any bioperl module
> available which can help me do this?

I should point out that a simple exact match with a motif only 6 or 7 bp 
long is going to give you very many hits; are you sure this is an 
appropriate thing to do for your purposes?

Assuming yes, you can use Bio::SeqIO, Bio::Index or Bio::DB::<something> 
to get your genomic sequences of interest, then simply use a normal perl 
regexp on the resulting $seq->seq strings.
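
A minimal sketch of the regexp step, working on a plain string such as
the one returned by $seq->seq. The function names find_motif() and
revcomp() are made up for this example, and only unambiguous A/C/G/T
bases are handled.

```perl
use strict;
use warnings;

# Return the 1-based start positions of every exact occurrence of
# $motif in $sequence, including overlapping ones.
sub find_motif {
    my ($sequence, $motif) = @_;
    my @starts;
    my $pattern = quotemeta $motif;
    # The lookahead matches without consuming any characters, so a hit
    # that overlaps the previous one is still found.
    while ($sequence =~ /(?=$pattern)/gi) {
        push @starts, pos($sequence) + 1;
    }
    return @starts;
}

# Reverse complement, for searching the opposite strand.
sub revcomp {
    my ($sequence) = @_;
    my $rc = reverse $sequence;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

my $genomic = 'ACGACGTACGT';
print join(',', find_motif($genomic, 'ACG')), "\n";
print join(',', find_motif(revcomp($genomic), 'ACG')), "\n";
```

Positions found on the reverse complement are relative to that strand
and would need converting back if forward-strand coordinates are wanted.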

If your motifs are anything like transcription factor binding sites, and 
you have more information than just a single sequence string for the 
motif, investigate Bio::Matrix::PSM.


From bix at sendu.me.uk  Mon Oct 23 04:29:52 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 09:29:52 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C7648.8030004@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk>
	<453C7648.8030004@sheffield.ac.uk>
Message-ID: <453C7D80.80207@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> Sendu/Chris - can you confirm to me which Bioperl modules are essential
>>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
>>> reason for not putting *all* dependencies into the bundle?
>> AFAIK, there are no essential external dependencies. Everything in
>> %packages in Makefile.PL, for example, is optional.
>>
>> We had the discussion about making all the easy-to-install ones a
>> forced requirement anyway (so that most things work out of the box),
>> but perhaps we'll hold off on making such a change until after 1.5.2.
 >
> How are they forced?

They're not. Right now they're optional. I'm suggesting we might change 
that in the future.

If you're asking how we /would/ force them, probably by adding 
PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs 
successfully (or should!) without its optional dependencies given in 
PREREQ_PM because make test succeeds (because tests skip ok when the 
optional dependency isn't there).
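
To make the difference concrete, here is a small sketch that mimics the
two policies without calling MakeMaker at all. missing_prereqs() and
check_prereqs() are hypothetical helpers, and version checks are left
out for brevity.

```perl
use strict;
use warnings;

# Return the names of modules in %$prereqs that cannot be loaded.
sub missing_prereqs {
    my ($prereqs) = @_;
    my @missing;
    for my $module (sort keys %$prereqs) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        push @missing, $module unless eval { require $file; 1 };
    }
    return @missing;
}

# With $fatal false, a missing module only produces a warning (the
# current behaviour); with $fatal true it aborts, which is roughly what
# PREREQ_FATAL would do at "perl Makefile.PL" time.
sub check_prereqs {
    my ($prereqs, $fatal) = @_;
    my @missing = missing_prereqs($prereqs);
    return 1 unless @missing;
    my $message = "missing prerequisites: @missing\n";
    die $message if $fatal;
    warn $message;
    return 0;
}

print "all prerequisites found\n"
    if check_prereqs({ 'File::Spec' => 0 }, 1);
```

In Makefile.PL itself the switch would just be an extra
PREREQ_FATAL => 1 pair in the WriteMakefile() arguments.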

I don't really know how CPAN discovers dependencies and auto-installs 
them before a dependent module though. Anyone care to explain?


From n.haigh at sheffield.ac.uk  Mon Oct 23 06:09:12 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 10:09:12 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C7D80.80207@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk>
	<453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk>
Message-ID: <453C94C8.5040900@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Nathan S. Haigh wrote:
>>>> Sendu/Chris - can you confirm to me which Bioperl modules are
>>>> essential
>>>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
>>>> reason for not putting *all* dependencies into the bundle?
>>> AFAIK, there are no essential external dependencies. Everything in
>>> %packages in Makefile.PL, for example, is optional.
>>>
>>> We had the discussion about making all the easy-to-install ones a
>>> forced requirement anyway (so that most things work out of the box),
>>> but perhaps we'll hold off on making such a change until after 1.5.2.
> >
>> How are they forced?
>
> They're not. Right now they're optional. I'm suggesting we might
> change that in the future.
> If you're asking how we /would/ force them, probably by adding
> PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs
> successfully (or should!) without its optional dependencies given in
> PREREQ_PM because make test succeeds (because tests skip ok when the
> optional dependency isn't there).
>
> I don't really know how CPAN discovers dependencies and auto-installs
> them before a dependent module though. Anyone care to explain?

I thought so! I misunderstood something earlier which confused me. Just
to clarify for my own sanity's sake:

1) Currently all dependencies are optional.
2) All dependencies are in %packages.
3) All of these are passed to PREREQ_PM.

As for how CPAN discovers dependencies, here is a snip from the CPAN FAQ:
--snip--

    I installed a Bundle and had a couple of fails. When I retried,
    everything resolved nicely. Can this be fixed to work on first try?

    The reason for this is that CPAN does not know the dependencies of
    all modules when it starts out. To decide about the additional items
    to install, it just uses data found in the META.yml file or the
    generated Makefile. An undetected missing piece breaks the process.
    But it may well be that your Bundle installs some prerequisite later
    than some depending item and thus your second try is able to resolve
    everything. Please note, CPAN.pm does not know the dependency tree
    in advance and cannot sort the queue of things to install in a
    topologically correct order. It resolves perfectly well IF all
    modules declare the prerequisites correctly with the PREREQ_PM
    attribute to MakeMaker or the |requires| stanza of Module::Build.
    For bundles which fail and you need to install often, it is
    recommended to sort the Bundle definition file manually.

--snip--
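
Sorting a bundle "in a topologically correct order" could be sketched
like this, assuming the dependencies are already known. install_order()
and the module names in the example are hypothetical; CPAN.pm itself
does not work this way, as the FAQ entry says.

```perl
use strict;
use warnings;

# Order modules so that each one comes after everything it depends on.
# %$deps maps a module name to an array ref of its prerequisites.
sub install_order {
    my ($deps) = @_;
    my (%state, @order);
    my $visit;
    $visit = sub {
        my ($module) = @_;
        my $seen = $state{$module} || '';
        return if $seen eq 'done';
        die "circular dependency involving $module\n"
            if $seen eq 'visiting';
        $state{$module} = 'visiting';
        $visit->($_) for sort @{ $deps->{$module} || [] };
        $state{$module} = 'done';
        push @order, $module;
    };
    $visit->($_) for sort keys %$deps;
    return @order;
}

# Made-up dependency table for illustration.
my %deps = (
    'App'    => [ 'Parser', 'Net' ],
    'Parser' => [ 'Util' ],
    'Net'    => [ 'Util' ],
    'Util'   => [],
);
print join(' ', install_order(\%deps)), "\n";
```

Listing a bundle in an order like this avoids the "fails first time,
works on retry" effect described in the FAQ entry.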

Therefore, recent modifications to Makefile.PL should result in a fully
operational Bioperl installation if installed via CPAN, although only
Bioperl 1.4 is currently available there. It is possible to upload a
developer release to CPAN, which can only be downloaded via CPAN if
specifically asked for - this would be good for 1.5.x:
--snip--

    How do I install a "DEVELOPER RELEASE" of a module?

    By default, CPAN will install the latest non-developer release of a
    module. If you want to install a dev release, you have to specify
    the partial path starting with the author id to the tarball you wish
    to install, like so:

        cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz

    Note that you can use the |ls| command to get this path listed.

--snip--

HTH
Nath


From bix at sendu.me.uk  Mon Oct 23 05:41:52 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 10:41:52 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C94C8.5040900@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
Message-ID: <453C8E60.7000105@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>
>> I don't really know how CPAN discovers dependencies and auto-installs
>> them before a dependent module though. Anyone care to explain?
> 
> I thought so! I misunderstood something earlier which confused me. Just
> to clarify for my own sanity's sake:
> 
> 1) Currently all dependencies are optional.
> 2) All dependencies are in %packages
> 3) all these are passed to PREREQ_PM

All correct.


> As far as CPAN discovering dependencies, here is a snip from the CPAN FAQ's:
> --snip--
> 
>     I installed a Bundle and had a couple of fails. When I retried,
>     everything resolved nicely. Can this be fixed to work on first try?
> 
>     The reason for this is that CPAN does not know the dependencies of
>     all modules when it starts out. To decide about the additional items
>     to install, it just uses data found in the META.yml file or the
>     generated Makefile. An undetected missing piece breaks the process.
>     But it may well be that your Bundle installs some prerequisite later
>     than some depending item and thus your second try is able to resolve
>     everything. Please note, CPAN.pm does not know the dependency tree
>     in advance and cannot sort the queue of things to install in a
>     topologically correct order. It resolves perfectly well IF all
>     modules declare the prerequisites correctly with the PREREQ_PM
>     attribute to MakeMaker or the |requires| stanza of Module::Build.
>     For bundles which fail and you need to install often, it is
>     recommended to sort the Bundle definition file manually.
> 
> --snip--
>
> Therefore, recent modifications to Makefile.PL should result in a fully
> operational Bioperl installation, if installed via CPAN.

Right, thanks for that.


> Although only Bioperl 1.4 is available via CPAN currently. It is possible
> to upload a developer release to CPAN which can only be downloaded via CPAN
> if specifically asked for - would be good for 1.5.x.:
> --snip--
> 
>     How do I install a "DEVELOPER RELEASE" of a module?
> 
>     By default, CPAN will install the latest non-developer release of a
>     module. If you want to install a dev release, you have to specify
>     the partial path starting with the author id to the tarball you wish
>     to install, like so:
> 
>         cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz
> 
>     Note that you can use the |ls| command to get this path listed.
> 
> --snip--

That's the user point of view - how does the developer actually tell 
CPAN that something is a developer release so that normal users don't 
automatically install it?


From bix at sendu.me.uk  Mon Oct 23 05:59:52 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 10:59:52 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C8E60.7000105@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
	<453C8E60.7000105@sendu.me.uk>
Message-ID: <453C9298.9000900@sendu.me.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> As far as CPAN discovering dependencies, here is a snip from the CPAN 
>> FAQ's:
>> --snip--
>>
>>     I installed a Bundle and had a couple of fails. When I retried,
>>     everything resolved nicely. Can this be fixed to work on first try?
>>
>>     The reason for this is that CPAN does not know the dependencies of
>>     all modules when it starts out. To decide about the additional items
>>     to install, it just uses data found in the META.yml file or the
>>     generated Makefile. An undetected missing piece breaks the process.
>>     But it may well be that your Bundle installs some prerequisite later
>>     than some depending item and thus your second try is able to resolve
>>     everything. Please note, CPAN.pm does not know the dependency tree
>>     in advance and cannot sort the queue of things to install in a
>>     topologically correct order. It resolves perfectly well IF all
>>     modules declare the prerequisites correctly with the PREREQ_PM
>>     attribute to MakeMaker or the |requires| stanza of Module::Build.
>>     For bundles which fail and you need to install often, it is
>>     recommended to sort the Bundle definition file manually.
>>
>> --snip--
>>
>> Therefore, recent modifications to Makefile.PL should result in a fully
>> operational Bioperl installation, if installed via CPAN.
> 
> Right, thanks for that.

Oh, so this effectively means that our 'optional' dependencies are 
installed for CPAN users, which matches up to my 'force the optional 
ones anyway' desire, leaving Bundle::BioPerl without any use.

Makefile.PL could be altered again to remove from PREREQ_PM those 
modules the user didn't already have installed, thus CPAN would only 
install Bioperl itself and nothing optional. The user could then install 
Bundle::BioPerl if they wanted a quick way of getting all the optional 
stuff to work.

I'm happy either way; what do other people think?
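
For illustration only, the second option might look roughly like the sketch
below. It is not what Makefile.PL currently does: the module names and
versions are placeholders, and installed_subset() is a made-up helper name.
The idea is simply to keep %packages as the full list but pass MakeMaker only
the entries that already load on the user's system.

```perl
use strict;
use warnings;

# Placeholder list; the real %packages in Makefile.PL is much longer.
my %packages = (
    'File::Spec'                     => '0',
    'Hypothetical::Optional::Module' => '0',
);

# Return only those entries whose module can already be loaded.
# (Hypothetical helper; it does not check the installed version.)
sub installed_subset {
    my (%wanted) = @_;
    my %have;
    for my $module (keys %wanted) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        $have{$module} = $wanted{$module} if eval { require $file; 1 };
    }
    return %have;
}

my %prereq_pm = installed_subset(%packages);

# WriteMakefile(..., PREREQ_PM => \%prereq_pm) would then list only these.
print join(', ', sort keys %prereq_pm), "\n";
```

Since %packages itself stays untouched, it would still be there as the
reference list when editing the Bundle by hand.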


From n.haigh at sheffield.ac.uk  Mon Oct 23 07:22:17 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 11:22:17 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C9298.9000900@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
	<453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk>
Message-ID: <453CA5E9.1060406@sheffield.ac.uk>

Sendu Bala wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> As far as CPAN discovering dependencies, here is a snip from the
>>> CPAN FAQ's:
>>> --snip--
>>>
>>>     I installed a Bundle and had a couple of fails. When I retried,
>>>     everything resolved nicely. Can this be fixed to work on first try?
>>>
>>>     The reason for this is that CPAN does not know the dependencies of
>>>     all modules when it starts out. To decide about the additional items
>>>     to install, it just uses data found in the META.yml file or the
>>>     generated Makefile. An undetected missing piece breaks the process.
>>>     But it may well be that your Bundle installs some prerequisite later
>>>     than some depending item and thus your second try is able to resolve
>>>     everything. Please note, CPAN.pm does not know the dependency tree
>>>     in advance and cannot sort the queue of things to install in a
>>>     topologically correct order. It resolves perfectly well IF all
>>>     modules declare the prerequisites correctly with the PREREQ_PM
>>>     attribute to MakeMaker or the |requires| stanza of Module::Build.
>>>     For bundles which fail and you need to install often, it is
>>>     recommended to sort the Bundle definition file manually.
>>>
>>> --snip--
>>>
>>> Therefore, recent modifications to Makefile.PL should result in a fully
>>> operational Bioperl installation, if installed via CPAN.
>>
>> Right, thanks for that.
>
> Oh, so this effectively means that our 'optional' dependencies are
> installed for CPAN users, which matches up to my 'force the optional
> ones anyway' desire, leaving Bundle::BioPerl without any use.
>
> Makefile.PL could be altered again to remove from PREREQ_PM those
> modules the user didn't already have installed, thus CPAN would only
> install Bioperl itself and nothing optional. The user could then
> install Bundle::BioPerl if they wanted a quick way of getting all the
> optional stuff to work.
>
> I'm happy either way; what do other people think?

From my point of view, removing them from PREREQ_PM makes building
Bundle::BioPerl a bit of a pain :o(

I prefer the way it is currently set up - most people have fast internet
connections and gigabytes of hard drive space. Other than the objection "why
install something I won't ever need", I don't see much point in maintaining
Bundle::BioPerl and having "optional" dependencies. If there are
any modules which are not going to be used by the majority of users,
that could be the rationale for moving them out of
bioperl-core into another package?

Nath


From n.haigh at sheffield.ac.uk  Mon Oct 23 07:38:05 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 11:38:05 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C8E60.7000105@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
	<453C8E60.7000105@sendu.me.uk>
Message-ID: <453CA99D.9060009@sheffield.ac.uk>


>> Although only Bioperl 1.4 is available via CPAN currently. It is
>> possible to upload a developer release to CPAN which can only be
>> downloaded via CPAN if specifically asked for - would be good for 1.5.x.:
>> --snip--
>>
>>     How do I install a "DEVELOPER RELEASE" of a module?
>>
>>     By default, CPAN will install the latest non-developer release of a
>>     module. If you want to install a dev release, you have to specify
>>     the partial path starting with the author id to the tarball you wish
>>     to install, like so:
>>
>>         cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz
>>
>>     Note that you can use the |ls| command to get this path listed.
>>
>> --snip--
>
> That's the user point of view - how does the developer actually tell
> CPAN that something is a developer release so that normal users don't
> automatically install it?

I found this:
http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt

It says that $VERSION should simply be changed from a naked number into
a single-quoted number and this should be recognized by the CPAN indexer.

Nath


From bix at sendu.me.uk  Mon Oct 23 06:47:38 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 11:47:38 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>	<4535EBF9.1090706@sendu.me.uk>
	<4536113D.1080307@sheffield.ac.uk>
	<E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>
Message-ID: <453C9DCA.4020802@sendu.me.uk>

Hilmar Lapp wrote:
> On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote:
> 
>> For example, I have made no effort to setup biosql-schema but I
>> thought that maybe there would be a test that would detect this
> 
> I'm afraid there isn't. Bioperl-db is meaningless without
> biosql-schema.

Can you suggest a way we might detect if biosql-schema has been 
installed prior to running the test suite, so we can give some 
meaningful error message?


From bix at sendu.me.uk  Mon Oct 23 08:43:30 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 13:43:30 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
Message-ID: <453CB8F2.7070703@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>
>> Makefile.PL could be altered again to remove from PREREQ_PM those
>> modules the user didn't already have installed, thus CPAN would only
>> install Bioperl itself and nothing optional. The user could then
>> install Bundle::BioPerl if they wanted a quick way of getting all the
>> optional stuff to work.
>>
>> I'm happy either way; what do other people think?
>
> From my point of view, removing them from PREREQ_PM means building the
> Bundle::BioPerl a bit of a pain :o(

Can I ask how you're generating Bundle::BioPerl? That is, how did the
typos get in there? Is there a reliable way to avoid typos in the future?


From n.haigh at sheffield.ac.uk  Mon Oct 23 09:46:17 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 13:46:17 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CB8F2.7070703@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
	<453CB8F2.7070703@sendu.me.uk>
Message-ID: <453CC7A9.6090609@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>
>>> Makefile.PL could be altered again to remove from PREREQ_PM those
>>> modules the user didn't already have installed, thus CPAN would only
>>> install Bioperl itself and nothing optional. The user could then
>>> install Bundle::BioPerl if they wanted a quick way of getting all the
>>> optional stuff to work.
>>>
>>> I'm happy either way; what do other people think?
>>
>> From my point of view, removing them from PREREQ_PM means building the
>> Bundle::BioPerl a bit of a pain :o(
>
> Can I ask how you're generating Bundle::BioPerl? That is, how did the
> typos get in there? Is there a way to certainly avoid typos in the
> future?

I just modified the list by hand a while back :o( - I'm sure there must
be a better way.


From bix at sendu.me.uk  Mon Oct 23 08:58:13 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 13:58:13 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CC7A9.6090609@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
	<453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk>
Message-ID: <453CBC65.2020202@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> Sendu Bala wrote:
>>>
>>>> Makefile.PL could be altered again to remove from PREREQ_PM those
>>>> modules the user didn't already have installed, thus CPAN would only
>>>> install Bioperl itself and nothing optional. The user could then
>>>> install Bundle::BioPerl if they wanted a quick way of getting all the
>>>> optional stuff to work.
>>>>
>>>> I'm happy either way; what do other people think?
>>>
>>> From my point of view, removing them from PREREQ_PM means building the
>>> Bundle::BioPerl a bit of a pain :o(
>>
>> Can I ask how you're generating Bundle::BioPerl? That is, how did the
>> typos get in there? Is there a way to certainly avoid typos in the
>> future?
> 
> I just modified the list by hand a while back :o( - I'm sure there must
> be a better way.

I'm not sure I understand why removing things from PREREQ_PM would be a
problem for you then; the %packages hash would remain unchanged (i.e. it
would still have everything), so you have something to refer to when
manually editing the Bundle.

http://www.cpan.org/misc/cpan-faq.html#How_make_bundle
might be helpful? I didn't really pay too much attention to the advice - 
does it offer a typo-avoiding solution?


From n.haigh at sheffield.ac.uk  Mon Oct 23 10:04:12 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 14:04:12 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CBC65.2020202@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
	<453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk>
	<453CBC65.2020202@sendu.me.uk>
Message-ID: <453CCBDC.6030904@sheffield.ac.uk>


> I'm not sure I understand why removing things from PREREQ_PM would be
> a problem for you then; the %packages hash would remain unchanged (ie.
> have everything) so you have something to refer to when manually
> editing the Bundle.
>
> http://www.cpan.org/misc/cpan-faq.html#How_make_bundle
> might be helpful? I didn't really pay too much attention to the advice
> - does it offer a typo-avoiding solution?

It's helpful in producing the Bundle PPD as all the XML tags are present
in the Bioperl PPD and they simply need to be copied over to a
Bundle-BioPerl PPD file.

Looks like manual editing of the relevant file is required for making a
CPAN bundle. Unfortunately - no typo-avoiding solution. :o(


From dhoworth at mrc-lmb.cam.ac.uk  Mon Oct 23 08:46:29 2006
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Mon, 23 Oct 2006 13:46:29 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CA99D.9060009@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453CA99D.9060009@sheffield.ac.uk>
Message-ID: <453CB9A5.2020409@mrc-lmb.cam.ac.uk>

>> That's the user point of view - how does the developer actually tell
>> CPAN that something is a developer release so that normal users don't
>> automatically install it?
> 
> I found this:
> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
> 
> It says that $VERSION should simply be changed from a naked number into
> a single quoted number and this should be recognized by the CPAN indexer.

<http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>

Cheers, Dave


From hlapp at gmx.net  Mon Oct 23 09:40:29 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 23 Oct 2006 09:40:29 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <453C9DCA.4020802@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>	<4535EBF9.1090706@sendu.me.uk>
	<4536113D.1080307@sheffield.ac.uk>
	<E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>
	<453C9DCA.4020802@sendu.me.uk>
Message-ID: <5C22B9C8-CEF0-457B-8565-793D56389A86@gmx.net>

You would need a lot of information to make that determination (host,  
port, db driver, db name, user, password; i.e., the entire connection  
information, and there is no 'standard').

You might just ask a simple question in Makefile.PL as to whether  
biosql is installed or not, similar to the DB::GFF tests.
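
Something along these lines, perhaps. This is only a sketch:
biosql_is_installed() is a hypothetical helper name and the wording of the
question is made up. It relies on prompt() from ExtUtils::MakeMaker, which
returns the default answer when the install is not interactive.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker;   # exports prompt()

# Hypothetical helper: ask the person installing whether biosql-schema is
# already set up. prompt() falls back to the default ('n') when it is not
# run interactively or when PERL_MM_USE_DEFAULT is set.
sub biosql_is_installed {
    my $answer = prompt('Is biosql-schema already installed and loaded? y/n', 'n');
    return $answer =~ /^y/i ? 1 : 0;
}

# In Makefile.PL one could then write, for example:
#   warn "bioperl-db needs biosql-schema; its tests will fail without it\n"
#       unless biosql_is_installed();
```

Defaulting to 'n' means an unattended CPAN install gets the warning rather
than a pile of unexplained test failures.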

	-hilmar

On Oct 23, 2006, at 6:47 AM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote:
>>
>>> For example, I have made no effort to setup biosql-schema but I
>>> thought that maybe there would be a test that would detect this
>>
>> I'm afraid there isn't. Bioperl-db is meaningless without
>> biosql-schema.
>
> Can you suggest a way we might detect if biosql-schema has been
> installed prior to running the test suite, so we can give some
> meaningful error message?
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Mon Oct 23 09:59:23 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 14:59:23 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CB9A5.2020409@mrc-lmb.cam.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>
	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>
Message-ID: <453CCABB.2060308@sendu.me.uk>

Dave Howorth wrote:
>>> That's the user point of view - how does the developer actually tell
>>> CPAN that something is a developer release so that normal users don't
>>> automatically install it?
>> I found this:
>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>
>> It says that $VERSION should simply be changed from a naked number into
>> a single quoted number and this should be recognized by the CPAN indexer.
> 
> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>

Thanks for that.

I guess from that the 1.5.2 version number should be:

$VERSION = 1.05_02;

And 1.6 would be

$VERSION = 1.06;

But will this cause a problem wrt 1.4? 1.4 has:

$VERSION = 1.4;

Is 1.4 lower than 1.06? Should we keep to a single digit version, so 
1.5_02 and 1.6? Does this really not work with CPAN? Should we call them 
version fifty and version sixty? 1.50_02, 1.60?
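
The numeric side of these questions can be checked quickly in Perl itself.
The sketch below only shows how perl treats the values; it does not say
anything about how the CPAN indexer reads them. The quote-then-eval step is
the idiom I believe perlmodstyle recommends for underscore versions.

```perl
use strict;
use warnings;

# As plain numbers, 1.4 is 1.40, which is greater than 1.06.
print "1.4 > 1.06\n" if 1.4 > 1.06;

# In a bare numeric literal the underscore is only a digit separator,
# so the "developer" marker is gone as soon as perl parses the number.
my $bare = 1.05_02;
print "bare literal: $bare\n";

# Quoting keeps the underscore for anything that reads the string;
# eval then gives a plain number for use inside perl.
my $quoted  = '1.05_02';
my $numeric = eval $quoted;
print "quoted: $quoted, numeric: $numeric\n";
```

So a 1.06 release would compare as older than the existing 1.4, which is the
problem with switching to two-digit minor numbers now.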


From cjfields at uiuc.edu  Mon Oct 23 10:12:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 09:12:16 -0500
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C9298.9000900@sendu.me.uk>
Message-ID: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>

...
> > Right, thanks for that.
> 
> Oh, so this effectively means that our 'optional' dependencies are
> installed for CPAN users, which matches up to my 'force the optional
> ones anyway' desire, leaving Bundle::BioPerl without any use.
> 
> Makefile.PL could be altered again to remove from PREREQ_PM those
> modules the user didn't already have installed, thus CPAN would only
> install Bioperl itself and nothing optional. The user could then install
> Bundle::BioPerl if they wanted a quick way of getting all the optional
> stuff to work.
> 
> I'm happy either way; what do other people think?

I think that we should have it so Bioperl installs as-is (no additional
reqs) and have Bundle::BioPerl used as a convenient way to install all
optional modules for full functionality.  The catch is to make sure that any
optional installations do not crash tests during a CPAN bioperl
installation, otherwise they aren't considered optional by CPAN, and the
install won't work without forcing it.

Frankly, most users will find themselves wanting to install the Bundle
anyway to get full functionality, so we could always 'strongly recommend'
preceding the bioperl installation with a Bundle::BioPerl CPAN installation
to avoid problems, at least for this release.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct 23 10:23:04 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 09:23:04 -0500
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk>
Message-ID: <002101c6f6ae$c14d7860$15327e82@pyrimidine>

...
> >> Right, thanks for that.
> >
> > Oh, so this effectively means that our 'optional' dependencies are
> > installed for CPAN users, which matches up to my 'force the optional
> > ones anyway' desire, leaving Bundle::BioPerl without any use.
> >
> > Makefile.PL could be altered again to remove from PREREQ_PM those
> > modules the user didn't already have installed, thus CPAN would only
> > install Bioperl itself and nothing optional. The user could then
> > install Bundle::BioPerl if they wanted a quick way of getting all the
> > optional stuff to work.
> >
> > I'm happy either way; what do other people think?
>
> From my point of view, removing them from PREREQ_PM means building the
> Bundle::BioPerl a bit of a pain :o(
> 
> I prefer the way it is currently set up - most people have fast internet
> connections and GB of harddrive space. Other than the reason "why
> install something I won't ever need" I don't see much point maintaining
> Bundle::BioPerl and having "optional" dependencies. I think if there are
> any modules which are not going to be used by the majority of users,
> then this could be used as the rationale for removing them from
> bioperl-core into another package?
> 
> Nath

I think you'll likely find it much easier to maintain a Bundle package
long-term and indicate that it should be installed along with bioperl, than
to have users complain about a particular Bioperl module failing because a
particular dependency wasn't installed.

If we have the Bundle around in CPAN and in PPM for Win32 users, and
indicate in the INSTALL docs and the wiki our preference that it be
installed prior to or along with a Bioperl installation for beginners, we
can mitigate most of those problems.  Nip it in the bud, to quote a Mr.
Barney Fife.

My 2c

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 23 10:29:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 09:29:33 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CCABB.2060308@sendu.me.uk>
Message-ID: <002201c6f6af$a91e4200$15327e82@pyrimidine>

> Dave Howorth wrote:
> >>> That's the user point of view - how does the developer actually tell
> >>> CPAN that something is a developer release so that normal users don't
> >>> automatically install it?
> >> I found this:
> >> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
> >>
> >> It says that $VERSION should simply be changed from a naked number into
> >> a single quoted number and this should be recognized by the CPAN indexer.
> >
> > <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
> 
> Thanks for that.
> 
> I guess from that the 1.5.2 version number should be:
> 
> $VERSION = 1.05_02
> 
> And 1.6 would be
> 
> $VERSION = 1.06
> 
> But will this cause a problem wrt 1.4? 1.4 has:
> 
> $VERSION = 1.4;
> 
> Is 1.4 lower than 1.06? Should we keep to a single digit version, so
> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them
> version fifty and version sixty? 1.50_02, 1.60?

Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax?  It would be
much simpler to use that. 

Simon Cozens wrote about this a while back:

http://www.perl.com/pub/a/2000/04/whatsnew.html

...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From bix at sendu.me.uk  Mon Oct 23 10:41:24 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 15:41:24 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <002201c6f6af$a91e4200$15327e82@pyrimidine>
References: <002201c6f6af$a91e4200$15327e82@pyrimidine>
Message-ID: <453CD494.8070905@sendu.me.uk>

Chris Fields wrote:
>> Dave Howorth wrote:
>>>>> That's the user point of view - how does the developer actually tell
>>>>> CPAN that something is a developer release so that normal users don't
>>>>> automatically install it?
>>>> I found this:
>>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>>>
>>>> It says that $VERSION should simply be changed from a naked number into
>>>> a single quoted number and this should be recognized by the CPAN indexer.
>>> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
>>
>> Thanks for that.
>>
>> I guess from that the 1.5.2 version number should be:
>>
>> $VERSION = 1.05_02
>>
>> And 1.6 would be
>>
>> $VERSION = 1.06
>>
>> But will this cause a problem wrt 1.4? 1.4 has:
>>
>> $VERSION = 1.4;
>>
>> Is 1.4 lower than 1.06? Should we keep to a single digit version, so
>> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them
>> version fifty and version sixty? 1.50_02, 1.60?
> 
> Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax?  It would be
> much simpler to use that. 

That does not present us with a way to have 1.5.2 marked as a developer 
release in CPAN.

Also, see the discussion here: 
http://perldoc.perl.org/functions/require.html

Since we require 5.6.1 the backwards-compatibility issues may not apply
to us, but do these ideas work with modules, or just Perl itself? Is
CPAN et al. happy with this form of versioning?

/Something/ needs to be done about Bioperl versioning, because the 
current 1.4 or 1.5 is completely inadequate.
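
One way to see how an x.y.z version would sit next to the existing 1.4 is
the version module. Using it here is an assumption on my part (nothing in
Bioperl depends on it), and the sketch only shows how that module compares
the two forms, not how CPAN.pm or the indexer would.

```perl
use strict;
use warnings;
use version;

my $old    = version->parse('1.4');     # decimal form, as in Bioperl 1.4
my $dotted = version->parse('1.5.2');   # dotted x.y.z form

# A decimal 1.4 is read as 1.400, so it normalises to v1.400.0.
print '1.4   -> ', $old->normal,    "\n";
print '1.5.2 -> ', $dotted->normal, "\n";

print "1.5.2 sorts below 1.4\n" if $dotted < $old;
```

If that comparison is representative, mixing the dotted form with the old
decimal 1.4 would make the newer release look older.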


From bix at sendu.me.uk  Mon Oct 23 10:51:25 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 15:51:25 +0100
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>
References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>
Message-ID: <453CD6ED.5050507@sendu.me.uk>

Chris Fields wrote:

[option 1]
>> Oh, so this effectively means that our 'optional' dependencies are 
>> installed for CPAN users, which matches up to my 'force the
>> optional ones anyway' desire, leaving Bundle::BioPerl without any
>> use.

[option 2]
>> Makefile.PL could be altered again to remove from PREREQ_PM those 
>> modules the user didn't already have installed, thus CPAN would
>> only install Bioperl itself and nothing optional. The user could
>> then install Bundle::BioPerl if they wanted a quick way of getting
>> all the optional stuff to work.
>> 
>> I'm happy either way; what do other people think?
> 
> I think that we should have it so Bioperl installs as-is (no
> additional reqs) and have Bundle::BioPerl used as a convenient way to
> install all optional modules for full functionality.

Note we're specifically considering a CPAN install here. If you download
the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::BioPerl is
still needed as a convenience if you want to install the optional
external dependencies.


> The catch is to make sure that any optional installations do not
> crash tests during a CPAN bioperl installation, otherwise they aren't
> considered optional by CPAN, and the install won't work without
> forcing it.

I'm pretty sure this isn't a problem, though it would be nice if someone 
could test it on a clean system: does 'make test' pass all ok with none 
of the optional modules installed?


Anyway, to reiterate the question: Do we care if CPAN users get all the 
optional external dependencies installed for them automatically, or do 
we want to force them to install Bundle?

The current situation is: CPAN users will get all optional external 
dependencies without using Bundle::BioPerl. Manual installers of bioperl 
(from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to 
get full functionality.


From n.haigh at sheffield.ac.uk  Mon Oct 23 12:30:34 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 16:30:34 +0000
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CCABB.2060308@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>
	<453CCABB.2060308@sendu.me.uk>
Message-ID: <453CEE2A.8000002@sheffield.ac.uk>

Sendu Bala wrote:
> Dave Howorth wrote:
>   
>>>> That's the user point of view - how does the developer actually tell
>>>> CPAN that something is a developer release so that normal users don't
>>>> automatically install it?
>>>>         
>>> I found this:
>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>>
>>> It says that $VERSION should simply be changed from a naked number into
>>> a single quoted number and this should be recognized by the CPAN indexer.
>>>       
>> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
>>     
>
> Thanks for that.
>
> I guess from that the 1.5.2 version number should be:
>
> $VERSION = 1.05_02
>
> And 1.6 would be
>
> $VERSION = 1.06
>
> But will this cause a problem wrt 1.4? 1.4 has:
>
> $VERSION = 1.4;
>
> Is 1.4 lower than 1.06? Should we keep to a single digit version, so 
> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them 
> version fifty and version sixty? 1.50_02, 1.60?
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>   
I believe the link to the documentation above describes a common CPAN
versioning scheme as follows:

1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32

Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would
be better as 1.52. Then to indicate that the 1.5 series is a developer
release, you append the underscore and at least 2 digits. Thus resulting
in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be
1.52_01. The only thing I'm unsure about would be when does the _01 get
incremented? I suspect we would probably not increment this number since
each release would be an increment of the minor release number e.g.
1.52_01, 1.53_01, 1.54_01 etc.

Although I'm still not sure how this versioning would affect bioperl 1.4
since 1.4 uses a non-standard versioning scheme :o(

As I understand it, the versioning of the Perl releases uses the x.y.z
scheme. But apparently CPAN modules should use the above versioning scheme.
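
To make the 1.4 question concrete, here is a minimal sketch, assuming
the CPAN tools compare decimal versions as ordinary numbers rather than
field by field (newer_than is a hypothetical helper written for this
example, not part of any module):

```perl
use strict;
use warnings;

# Decimal versions are compared as plain numbers, not field by field.
# (Assumption: this mirrors how CPAN treats versions like 1.4 and 1.06.)
sub newer_than {
    my ($candidate, $installed) = @_;
    return $candidate > $installed ? 1 : 0;
}

# Each pair is [candidate, installed].
for my $pair ([ '1.06', '1.4' ], [ '1.52', '1.4' ], [ '1.60', '1.4' ]) {
    my ($candidate, $installed) = @$pair;
    printf "%s %s %s\n", $candidate,
        (newer_than($candidate, $installed) ? 'is newer than' : 'is NOT newer than'),
        $installed;
}
```

Under that assumption 1.06 sorts below 1.4, while 1.52 and 1.60 both
sort above it, which is why the two-digit scheme looks safer.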

Nath


From cjfields at uiuc.edu  Mon Oct 23 11:36:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 10:36:37 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CD6ED.5050507@sendu.me.uk>
Message-ID: <000c01c6f6b9$0781af40$15327e82@pyrimidine>

...
> 
> Note we're specifically considering a CPAN install here. If you download
> the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is
> still needed as a convenience if you want to install the optional
> external dependencies.
> 

Agreed.  I don't think the Bundle is dispensable.  For instance, it's very
easy for us to just state to beginners to install Bundle::Bioperl before
installing bioperl itself,  as opposed to having them inundate the mail list
with requests on why x.pl script didn't work, which could be simply from
lack of the required module. 

> I'm pretty sure this isn't a problem, though it would be nice if someone
> could test it on a clean system: does 'make test' pass all ok with none
> of the optional modules installed?

So far on WinXP everything passes; I ran a clean perl installation a while
ago using nmake and tests passed.

> Anyway, to reiterate the question: Do we care if CPAN users get all the
> optional external dependencies installed for them automatically, or do
> we want to force them to install Bundle?
> 
> The current situation is: CPAN users will get all optional external
> dependencies without using Bundle::BioPerl. Manual installers of bioperl
> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
> get full functionality.

I don't think forcing is necessary, so a CPAN installation shouldn't force
someone to install optional modules.  Graph.pm, for instance has a few
optional modules, and the tests which use those get skipped and pass so the
installation proceeds w/o problems.  We could do the same (any tests using
those optional modules display the reason why they are skipped).  
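
A rough sketch of that skipping pattern, assuming the test script uses
Test::More (Optional::Module and have_module are placeholders invented
for this example, not real Bioperl names):

```perl
use strict;
use warnings;
use Test::More tests => 2;

# True if the named module can be loaded on this system.
# (have_module is a hypothetical helper for this sketch.)
sub have_module {
    my ($module) = @_;
    return eval "require $module; 1" ? 1 : 0;
}

# A core test that always runs.
ok(1, 'core functionality');

SKIP: {
    # Optional::Module is a placeholder name, not a real dependency.
    skip 'Optional::Module not installed', 1
        unless have_module('Optional::Module');
    ok(Optional::Module->can('new'), 'optional functionality');
}
```

When the optional module is missing, the second test is reported as
skipped along with the reason, and the script as a whole still passes.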

I would strongly state in the INSTALL and INSTALL.WIN docs that (new) users
should install Bundle::Bioperl before installing Bioperl core for full
functionality.  If you are an advanced user and know your way around
CPAN/Perl, then you can install just the individual modules that your
particular work requires.

Chris


From n.haigh at sheffield.ac.uk  Mon Oct 23 12:38:00 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 16:38:00 +0000
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CD6ED.5050507@sendu.me.uk>
References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>
	<453CD6ED.5050507@sendu.me.uk>
Message-ID: <453CEFE8.4000704@sheffield.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>
> [option 1]
>   
>>> Oh, so this effectively means that our 'optional' dependencies are 
>>> installed for CPAN users, which matches up to my 'force the
>>> optional ones anyway' desire, leaving Bundle::BioPerl without any
>>> use.
>>>       
>
> [option 2]
>   
>>> Makefile.PL could be altered again to remove from PREREQ_PM those 
>>> modules the user didn't already have installed, thus CPAN would
>>> only install Bioperl itself and nothing optional. The user could
>>> then install Bundle::BioPerl if they wanted a quick way of getting
>>> all the optional stuff to work.
>>>
>>> I'm happy either way; what do other people think?
>>>       
>> I think that we should have it so Bioperl installs as-is (no
>> additional reqs) and have Bundle::BioPerl used as a convenient way to
>> install all optional modules for full functionality.
>>     
>
> Note we're specifically considering a CPAN install here. If you download
> the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is
> still needed as a convenience if you want to install the optional
> external dependencies.
>
>
>   
>> The catch is to make sure that any optional installations do not
>> crash tests during a CPAN bioperl installation, otherwise they aren't
>> considered optional by CPAN, and the install won't work without
>> forcing it.
>>     
>
> I'm pretty sure this isn't a problem, though it would be nice if someone 
> could test it on a clean system: does 'make test' pass all ok with none 
> of the optional modules installed?
>
>   

I could definitely do this on WinXP and *possibly* on a Linux system.

> Anyway, to reiterate the question: Do we care if CPAN users get all the 
> optional external dependencies installed for them automatically, or do 
> we want to force them to install Bundle?
>
>   

I'd prefer that all dependencies, whether or not they are seen as vital to
the main functionality of Bioperl, actually be specified in PREREQ_PM (as
they currently are). A dependency is a dependency - is it not? If a
distinction is to be made based on whether the requiring module is
simply adding additional functionality to Bioperl-core, then shouldn't
it be moved out of core and into another package, as with the run modules,
if we are to have "optional" dependencies?

my 2p
Nath

> The current situation is: CPAN users will get all optional external 
> dependencies without using Bundle::BioPerl. Manual installers of bioperl 
> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to 
> get full functionality.


From cjfields at uiuc.edu  Mon Oct 23 11:39:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 10:39:09 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CD494.8070905@sendu.me.uk>
Message-ID: <000d01c6f6b9$62033d80$15327e82@pyrimidine>

...
> That does not present us with a way to have 1.5.2 marked as a developer
> release in CPAN.
> 
> Also, see the discussion here:
> http://perldoc.perl.org/functions/require.html
> 
> Since we require 5.6.1 the backwards-compatible issues maybe don't apply
> to us, but do these ideas work with modules, or just Perl itself? Is
> CPAN et al. happy with this form of versioning?
> 
> /Something/ needs to be done about Bioperl versioning, because the
> current 1.4 or 1.5 is completely inadequate.

I think using 'require Foo x.y.z' is applicable to modules as well.  There
is something in Programming Perl about this, just don't have it on hand...

Not sure about CPAN, so we need to look into it.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Mon Oct 23 11:42:15 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 16:42:15 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CEE2A.8000002@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>
	<453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk>
Message-ID: <453CE2D7.5080608@sendu.me.uk>

Nathan S. Haigh wrote:
> I believe the link to the documentation above describes a common CPAN
> versioning scheme as follows:
> 
> 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32
> 
> Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would
> be better as 1.52. Then to indicate that the 1.5 series is a developer
> release, you append the underscore and at least 2 digits. Thus resulting
> in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be
> 1.52_01. The only thing I'm unsure about would be when does the _01 get
> incremented? I suspect we would probably not increment this number since
> each release would be an increment of the minor release number e.g.
> 1.52_01, 1.53_01, 1.54_01 etc.
> 
> Although I'm still not sure how this versioning would affect bioperl 1.4
> since 1.4 uses a non-standard versioning scheme :o(

Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be 
treated higher than 1.4? Anyway, we can cross that bridge when we get 
there, but this seems appropriate now.


Cheers,
Sendu.


From bix at sendu.me.uk  Mon Oct 23 11:59:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 16:59:01 +0100
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
References: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
Message-ID: <453CE6C5.6000108@sendu.me.uk>

Chris Fields wrote:
> ...
>> The current situation is: CPAN users will get all optional external
>> dependencies without using Bundle::BioPerl. Manual installers of bioperl
>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
>> get full functionality.
> 
> I don't think forcing is necessary, so a CPAN installation shouldn't force
> someone to install optional modules.  Graph.pm, for instance has a few
> optional modules, and the tests which use those get skipped and pass so the
> installation proceeds w/o problems.  We could do the same (any tests using
> those optional modules display the reason why they are skipped).  

I should clarify and say that that's what happens in Bioperl as well. 
The 'forcing' that I talk about is simply what I assume will happen if 
the user has CPAN set to automatically install dependencies. The user 
could say 'no' to every question regarding the installation of 
dependencies that CPAN discovers and Bioperl would still install fine.

So really the difference between the current situation and, say, the 
situation when 1.5.1 was released, is that the CPAN user doesn't have to 
use Bundle::BioPerl for full functionality anymore, but can still choose 
not to install all the optional external modules.

The difference is the possible default behaviour. Those users that 
auto-install dependencies get all the optional ones, whereas in the past 
they would not have. I have to point out the benefit of this behaviour: 
those people that don't care and just want it to work are more likely to 
get an installation that does just work. People who know what they're 
doing can still do what they want.


Before we decide what to do I guess we need hard confirmation of how 
CPAN will actually behave with the current Makefile.PL. Any ideas how we 
can find out?

It would also be good to have more opinions to break the current tie 
(Nathan is for keeping PREREQ_PM populated, Chris is for having it 
empty, I can go either way)...
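
For reference, [option 2] could look roughly like the sketch below. It
is only an illustration under stated assumptions: the module lists are
placeholders, installed_only is a hypothetical helper, and the real
Makefile.PL may be organised quite differently.

```perl
use strict;
use warnings;

# Placeholder lists; the real ones would come from Makefile.PL.
my %required = ('Required::Placeholder' => 0);
my %optional = ('Some::Optional::Module' => 0, 'File::Spec' => 0);

# Keep only the optional modules that are already installed, so that
# CPAN is never asked to fetch an optional module the user lacks.
sub installed_only {
    my (%modules) = @_;
    my %kept;
    for my $module (keys %modules) {
        $kept{$module} = $modules{$module} if eval "require $module; 1";
    }
    return %kept;
}

my %prereq_pm = (%required, installed_only(%optional));
print "PREREQ_PM would contain: ", join(', ', sort keys %prereq_pm), "\n";
# The hash would then be handed to WriteMakefile(PREREQ_PM => \%prereq_pm, ...).
```

Keeping PREREQ_PM fully populated (the current situation) corresponds
to skipping the installed_only filter altogether.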


From dhoworth at mrc-lmb.cam.ac.uk  Mon Oct 23 11:55:42 2006
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Mon, 23 Oct 2006 16:55:42 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CD494.8070905@sendu.me.uk>
References: <002201c6f6af$a91e4200$15327e82@pyrimidine>
	<453CD494.8070905@sendu.me.uk>
Message-ID: <453CE5FE.9070001@mrc-lmb.cam.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>>> Dave Howorth wrote:
>>>>>> That's the user point of view - how does the developer actually tell
>>>>>> CPAN that something is a developer release so that normal users don't
>>>>>> automatically install it?
>>>>> I found this:
>>>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>>>>
>>>>> It says that $VERSION should simply be changed from a naked number into
>>>>> a single quoted number and this should be recognized by the CPAN
>>> indexer.
>>>> <http://search.cpan.org/~nwclark/perl-
>>> 5.8.8/pod/perlmodstyle.pod#Version_numbering>
>>>
>>> Thanks for that.
>>>
>>> I guess from that the 1.5.2 version number should be:
>>>
>>> $VERSION = 1.05_02

I believe so - the underscore is key. Look at your favourite CPAN
modules and see what they do.

>>> And 1.6 would be
>>>
>>> $VERSION = 1.06
>>>
>>> But will this cause a problem wrt 1.4? 1.4 has:

I think it will cause a problem, yes: 1.4 > 1.06. As a workaround, you
could remove 1.4 from CPAN and require everybody who installs from CPAN
to uninstall it before installing 1.06.

>>> $VERSION = 1.4;
>>>
>>> Is 1.4 lower than 1.06? Should we keep to a single digit version, so
>>> 1.5_02 and 1.6? Does this really not work with CPAN?

I think that would work but see at the end.

>> Should we call them
>>> version fifty and version sixty? 1.50_02, 1.60?

Then you can count 1.50_02, 1.50_03, 1.52, 1.53_01 ... if you wish.

>> Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax?  It would be
>> much simpler to use that. 
> 
> That does not present us with a way to have 1.5.2 marked as a developer 
> release in CPAN.
> 
> Also, see the discussion here: 
> http://perldoc.perl.org/functions/require.html
> 
> Since we require 5.6.1 the backwards-compatible issues maybe don't apply 
> to us, but do these ideas work with modules, or just Perl itself? Is 
> CPAN et al. happy with this form of versioning?

I'm not an expert :( It's my understanding that there is an awful lot of
flexibility in Perl module version numbering (as you might expect :)
However, I believe there are some gotchas. So I would recommend (a)
finding an expert and (b) trying an experiment!
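
As a starting point for such an experiment, here is a small sketch,
assuming the usual CPAN convention that an underscore in a quoted
$VERSION marks a developer release (is_developer_release and
numeric_version are hypothetical helpers written for this example):

```perl
use strict;
use warnings;

# Quoted, so the underscore survives in the string that indexers read.
our $VERSION = '1.52_01';

# Assumed convention: an underscore in the version marks a developer release.
sub is_developer_release {
    my ($version) = @_;
    return $version =~ /_/ ? 1 : 0;
}

# Strip the underscore to get a plain number for comparisons.
sub numeric_version {
    my ($version) = @_;
    (my $plain = $version) =~ tr/_//d;
    return $plain + 0;
}

print "$VERSION developer release? ", is_developer_release($VERSION), "\n";
print "numeric value: ", numeric_version($VERSION), "\n";
```

The quoting matters because an unquoted 1.52_01 is just a numeric
literal to Perl, and the underscore is lost before anything can see it.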

> /Something/ needs to be done about Bioperl versioning, because the 
> current 1.4 or 1.5 is completely inadequate.


From n.haigh at sheffield.ac.uk  Mon Oct 23 13:37:13 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 17:37:13 +0000
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CE6C5.6000108@sendu.me.uk>
References: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
	<453CE6C5.6000108@sendu.me.uk>
Message-ID: <453CFDC9.8030107@sheffield.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>   
>> ...
>>     
>>> The current situation is: CPAN users will get all optional external
>>> dependencies without using Bundle::BioPerl. Manual installers of bioperl
>>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
>>> get full functionality.
>>>       
>> I don't think forcing is necessary, so a CPAN installation shouldn't force
>> someone to install optional modules.  Graph.pm, for instance has a few
>> optional modules, and the tests which use those get skipped and pass so the
>> installation proceeds w/o problems.  We could do the same (any tests using
>> those optional modules display the reason why they are skipped).  
>>     
>
> I should clarify and say that that's what happens in Bioperl as well. 
> The 'forcing' that I talk about is simply what I assume will happen if 
> the user has CPAN set to automatically install dependencies. The user 
> could say 'no' to every question regarding the installation of 
> dependencies that CPAN discovers and Bioperl would still install fine.
>
> So really the difference between the current situation and, say, the 
> situation when 1.5.1 was released, is that the CPAN user doesn't have to 
> use Bundle::BioPerl for full functionality anymore, but can still choose 
> not to install all the optional external modules.
>
>   
--snip--

Obviously, we could maintain a Bundle::BioPerl which includes all
dependencies required for a fully functional Bioperl. I think the whole
idea for a Bundle is to provide a common environment for a particular
package. If for example, someone chooses not to install the dependencies
through CPAN (in the current setup), they can easily go back and install
Bundle::BioPerl and it would retrieve any missing dependencies for a
fully functional Bioperl-core.

Nath


From n.haigh at sheffield.ac.uk  Mon Oct 23 14:06:16 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 18:06:16 +0000
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CE2D7.5080608@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>
	<453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk>
Message-ID: <453D0498.8050206@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>   
>> I believe the link to the documentation above describes a common CPAN
>> versioning scheme as follows:
>>
>> 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32
>>
>> Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would
>> be better as 1.52. Then to indicate that the 1.5 series is a developer
>> release, you append the underscore and at least 2 digits. Thus resulting
>> in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be
>> 1.52_01. The only thing I'm unsure about would be when does the _01 get
>> incremented? I suspect we would probably not increment this number since
>> each release would be an increment of the minor release number e.g.
>> 1.52_01, 1.53_01, 1.54_01 etc.
>>
>> Although I'm still not sure how this versioning would affect bioperl 1.4
>> since 1.4 uses a non-standard versioning scheme :o(
>>     
>
> Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be 
> treated higher than 1.4? Anyway, we can cross that bridge when we get 
> there, but this seems appropriate now.
>
>
> Cheers,
> Sendu.
>   
Just tried the suggested:
perl -MExtUtils::MakeMaker -le 'print MM->parse_version(shift)' bioperl-1-5-2/Bio/Root/Version.pm

To see how it parses the various different version schemes - here are
the results:
1.5       -> 1.5
1.4       -> 1.4
1.60      -> 1.60
1.05_01   -> 1.0501
1.5_01    -> 1.501
1.50_01   -> 1.5001
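
To see how those results would rank against each other, here is a quick
sketch, assuming the parsed values are compared as plain numbers (the
hash just restates the table above, and by_parsed_value is a
hypothetical helper):

```perl
use strict;
use warnings;

# Version strings and the values parse_version reported for them above.
my %parsed = (
    '1.4'     => 1.4,
    '1.5'     => 1.5,
    '1.60'    => 1.60,
    '1.05_01' => 1.0501,
    '1.5_01'  => 1.501,
    '1.50_01' => 1.5001,
);

# Order the version strings by their parsed numeric value, lowest first.
sub by_parsed_value {
    my ($parsed) = @_;
    return sort { $parsed->{$a} <=> $parsed->{$b} } keys %$parsed;
}

print join(' < ', by_parsed_value(\%parsed)), "\n";
```

This prints 1.05_01 < 1.4 < 1.5 < 1.50_01 < 1.5_01 < 1.60, so under
that assumption 1.05_01 would rank below the existing 1.4 release,
while 1.50_01 and 1.60 rank above it.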

Nath


From cjfields at uiuc.edu  Mon Oct 23 13:15:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:15:44 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CE6C5.6000108@sendu.me.uk>
Message-ID: <002701c6f6c6$e2622c40$15327e82@pyrimidine>

...
> I should clarify and say that that's what happens in Bioperl as well.
> The 'forcing' that I talk about is simply what I assume will happen if
> the user has CPAN set to automatically install dependencies. The user
> could say 'no' to every question regarding the installation of
> dependencies that CPAN discovers and Bioperl would still install fine.
> 
> So really the difference between the current situation and, say, the
> situation when 1.5.1 was released, is that the CPAN user doesn't have to
> use Bundle::BioPerl for full functionality anymore, but can still choose
> not to install all the optional external modules.
> 
> The difference is the possible default behaviour. Those users that
> auto-install dependencies get all the optional ones, whereas in the past
> they would not have. I have to point out the benefit of this behaviour:
> those people that don't care and just want it to work are more likely to
> get an installation that does just work. People who know what they're
> doing can still do what they want.

OK with me.  Any way we go about it, we have to assume that anyone who set
CPAN to automatically install dependencies would want this behavior.

> Before we decide what to do I guess we need hard confirmation of how
> CPAN will actually behave with the current Makefile.PL. Any ideas how we
> can find out?
> 
> It would also be good to have more opinions to break the current tie
> (Nathan is for keeping PREREQ_PM populated, Chris is for having it
> empty, I can go either way)...

Frankly I'm for whatever is easiest for the end-user.  I think we should
continue maintaining Bundle::Bioperl b/c of its convenience (easier for us
to say 'install Bundle::Bioperl' as opposed to 'install modules a b c d e f
g...'  ).  I should note that Chris D. maintains Bundle::Bioperl via CPAN
and can easily add/remove modules as needed, so all that would be necessary
prior to a release is to make sure the various modules present in the Bundle
are up-to-date.

The only difficulty would be updating the bundle PPM version for Win32; I agree
with Nathan that it would be nice if it were easier to maintain.  The PPD
file generated using 'nmake ppd' needs modifications, probably b/c it is
still generated as PPM3-compatible rather than PPM4-compatible.

I also think the idea of having the developer releases available via CPAN is
a good one, as long as they are marked as such (which you are taking care of
with versioning changes).  It makes them a little more official, even if
they are interim developer releases.

Chris


From cjfields at uiuc.edu  Mon Oct 23 13:19:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:19:08 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CFDC9.8030107@sheffield.ac.uk>
Message-ID: <002801c6f6c7$5a58ed60$15327e82@pyrimidine>

...
> > So really the difference between the current situation and, say, the
> > situation when 1.5.1 was released, is that the CPAN user doesn't have to
> > use Bundle::BioPerl for full functionality anymore, but can still choose
> > not to install all the optional external modules.
> >
> >
> --snip--
> 
> Obviously, we could maintain a Bundle::BioPerl which includes all
> dependencies required for a fully functional Bioperl. I think the whole
> idea for a Bundle is to provide a common environment for a particular
> package. If for example, someone chooses not to install the dependencies
> through CPAN (in the current setup), they can easily go back and install
> Bundle::BioPerl and it would retrieve any missing dependencies for a
> fully functional Bioperl-core.
> 
> Nath

Succinctly put; I would've spent five paragraphs describing that!  Too much
coffee (from lab meetings...)

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 23 13:26:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:26:57 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880610230936t3024173s5d6f21526dac87a4@mail.gmail.com>
Message-ID: <002c01c6f6c8$7163dd20$15327e82@pyrimidine>

Seth, 

Did you try this with a clean, taxonomy-installed database?  There may be
some junk left over from the previous test runs.

I'm looking into it this week; it may not make the developer release but
we'll try to get it in.  BTW, the 03simpleseq.t test failures have to do
with a call to gzip.  I'll look into a workaround for that.  

Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but
introduces others.  One alternative which I found works is cygwin, but
there's a catch: DBD-mysql is hard to install.  If it isn't one thing it's
another...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 

From: Seth Johnson [mailto:johnson.biotech at gmail.com] 
Sent: Monday, October 23, 2006 11:37 AM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

 
Chris,

There's definite improvement:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed Test     Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/02species.t                 65    2   3.08%  63 65
t/03simpleseq.t    1   256    59  106 179.66%  7-59
t/04swiss.t                   52   14  26.92%  25 27-34 38-42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There's some weirdness going on during the 'swiss.t' test.  It almost seems
to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 &
41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): 
================================
not ok 25
# Test 25 got: '10097078' (t/04swiss.t at line 79)
#    Expected: '91309150'
ok 26
not ok 27
# Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t
at line 85) 
#    Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
not ok 28
# Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' (t/04swiss.t at line 86)
#    Expected: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' 
not ok 29
# Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
(t/04swiss.t at line 87)
#    Expected: 'Cell 66 (2), 383-394 (1991)'
not ok 30
# Test 30 got: <UNDEF> (t/04swiss.t at line 88) 
#    Expected: '91309150'
not ok 31
# Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t
at line 85 fail #2)
#    Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis,J.E. and Leffers,H.'
not ok 32
# Test 32 got: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' (t/04swiss.t at line 86 fail #2) 
#    Expected: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
not ok 33
# Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail
#2) 
#    Expected: 'Gene 134 (2), 283-287 (1993)'
not ok 34
# Test 34 got: <UNDEF> (t/04swiss.t at line 88 fail #2)
#    Expected: '94085792'
ok 35
ok 36
ok 37
not ok 38
# Test 38 got: <UNDEF> (t/04swiss.t at line 88 fail #3) 
#    Expected: '94253723'
not ok 39
# Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4)
#    Expected: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.'
not ok 40
# Test 40 got: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
(t/04swiss.t at line 86 fail #4)
#    Expected: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' 
not ok 41
# Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail
#4)
#    Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
not ok 42
# Test 42 got: <UNDEF> (t/04swiss.t at line 88 fail #4) 
#    Expected: '99199225'
==============================


On 10/20/06, Chris Fields <cjfields at uiuc.edu> wrote:


Seth,

Did you work out the problem here?  There was a recent CVS update to OBDA
tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
apparently left data from tests in the database, which caused problems with 
repeated test runs.

Chris

> > -----Original Message-----
> > From: bioperl-l-bounces <at> lists.open-bio.org [mailto: bioperl-l-
> > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM 
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ...
>
> > FAILED tests 10-12
> >     Failed 3/12 tests, 75.00% okay
> > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> >
> > -------------------------------------------------------------------------------
> > t\02species.t                 65    2   3.08%  63 65
> > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > t\04swiss.t                   52   14  26.92%  25 27-34 38-42 
> > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > t\16obda.t                    12    3  25.00%  10-12
>
>


From johnson.biotech at gmail.com  Mon Oct 23 12:36:36 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 23 Oct 2006 12:36:36 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <000001c6f486$df508930$15327e82@pyrimidine>
References: <000001c6f486$df508930$15327e82@pyrimidine>
Message-ID: <b99962880610230936t3024173s5d6f21526dac87a4@mail.gmail.com>

Chris,

There's definite improvement:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed Test     Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/02species.t                 65    2   3.08%  63 65
t/03simpleseq.t    1   256    59  106 179.66%  7-59
t/04swiss.t                   52   14  26.92%  25 27-34 38-42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There's some weirdness going on during the 'swiss.t' test.  It almost seems
to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 &
41, 31 & 27, 32 & 28, 33 & 29, 39 & 31):
================================
not ok 25
# Test 25 got: '10097078' (t/04swiss.t at line 79)
#    Expected: '91309150'
ok 26
not ok 27
# Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t
at line 85)
#    Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
not ok 28
# Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' (t/04swiss.t at line 86)
#    Expected: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators'
not ok 29
# Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
(t/04swiss.t at line 87)
#    Expected: 'Cell 66 (2), 383-394 (1991)'
not ok 30
# Test 30 got: <UNDEF> (t/04swiss.t at line 88)
#    Expected: '91309150'
not ok 31
# Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t
at line 85 fail #2)
#    Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis,J.E. and Leffers,H.'
not ok 32
# Test 32 got: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' (t/04swiss.t at line 86 fail #2)
#    Expected: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
not ok 33
# Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail
#2)
#    Expected: 'Gene 134 (2), 283-287 (1993)'
not ok 34
# Test 34 got: <UNDEF> (t/04swiss.t at line 88 fail #2)
#    Expected: '94085792'
ok 35
ok 36
ok 37
not ok 38
# Test 38 got: <UNDEF> (t/04swiss.t at line 88 fail #3)
#    Expected: '94253723'
not ok 39
# Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4)
#    Expected: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.'
not ok 40
# Test 40 got: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
(t/04swiss.t at line 86 fail #4)
#    Expected: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein'
not ok 41
# Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail
#4)
#    Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
not ok 42
# Test 42 got: <UNDEF> (t/04swiss.t at line 88 fail #4)
#    Expected: '99199225'
==============================


On 10/20/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
>
>
> Seth,
>
> Did you work out the problem here?  There was a recent CVS update to OBDA
> tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
> apparently left data from tests in the database, which caused problems
> with
> repeated test runs.
>
> Chris
>
> > > -----Original Message-----
> > > From: bioperl-l-bounces <at> lists.open-bio.org [mailto: bioperl-l-
> > > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > > Sent: Saturday, September 30, 2006 6:35 PM
> > > To: Hilmar Lapp
> > > Cc: Chris Fields; Bioperl List
> > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> > >
> > > Here're complete test details:
> > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > ...
> >
> > > FAILED tests 10-12
> > >     Failed 3/12 tests, 75.00% okay
> > > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> > >
> >
> --------------------------------------------------------------------------
> > > -----
> > > t\02species.t                 65    2   3.08%  63 65
> > > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > > t\16obda.t                    12    3  25.00%  10-12
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l <at> lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
>
>


From n.haigh at sheffield.ac.uk  Mon Oct 23 16:08:00 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 20:08:00 +0000
Subject: [Bioperl-l] CPAN testing Service
Message-ID: <453D2120.9010301@sheffield.ac.uk>

We should also check the CPAN testing service (CPANTS) to see how "good"
our package is for CPAN and try to increase the Kwalitee score. There
only appears to be details for bioperl-1.2.3 for some reason:
http://cpants.perl.org/dist/bioperl

Nath


From pabloivan at gmail.com  Sun Oct 22 15:54:35 2006
From: pabloivan at gmail.com (Pablo Ivan)
Date: Sun, 22 Oct 2006 16:54:35 -0300
Subject: [Bioperl-l] Bioperl installation under Windows
Message-ID: <a8acf25f0610221254v18c8d172kc3ce6fa4ea34f676@mail.gmail.com>

Hello,

I have been trying to install Bioperl 1.4 on a Windows XP system, but I
didn't get too far; my perl installation was made using ActiveState
5.8.8 build 816. I then tried the ppm method of searching for bioperl in the
repositories and installing the core package 1.4. It says that the
installation was made successfully, but the /Bio folder doesn't show up in
/lib, and it's like nothing new was installed at all. I was wondering if
using that version of ActiveState could be causing it, but the uninstall
option for it isn't showing in Add/Remove, and I'm afraid just deleting the
folders and installing version 5.6 of AS could somehow damage and make
things worse. Or should I just forget about it and try using Cygwin?

Thank you,

Pablo.


From cjfields at uiuc.edu  Mon Oct 23 17:34:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 16:34:47 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880610231422o24029a0cu229fccc2b5809b85@mail.gmail.com>
Message-ID: <000401c6f6eb$111df040$15327e82@pyrimidine>

Don't know what that particular error is, but it looks ActivePerl-related
(PPM generates HTML from the blib directory).  You may need to run 'nmake
clean' in between test cycles to get rid of old blib and other files.

 
The carryover issue from old test runs was a definite problem.  Brian fixed
that in the bioperl-db CVS recently.  Also,  I tried Sendu's fixes from CVS
head to Bio::Root::Root and they seem to fix the problems with
Bio::Root::Root.  The issue came down to a use of indirect syntax (a bad
perl practice).  There are other errors popping up related to Bio::Species,
but these seem fixable at least.

 
I committed a few changes to bioperl-db CVS to fix 03simpleseq.t test
failures due to a lack of gzip on WinXP (I didn't see them b/c I had a copy
of GNU gzip in my path).  These should pass w/o problems now on WinXP.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
  _____  

From: Seth Johnson [mailto:johnson.biotech at gmail.com] 
Sent: Monday, October 23, 2006 4:22 PM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

 
Chris,

I have not cleaned my test database yet.  I'll purge it and redo the tests. 

This error keeps popping up in unexpected places while running nmake during
installation: 
 "Undefined subroutine &main::UpdateHTML_blib called at -e line 1. 
NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code
'0xff'"

Is there a way around it??

Seth

On 10/23/06, Chris Fields <cjfields at uiuc.edu> wrote:

Seth, 

Did you try this with a clean, taxonomy-installed database?  There may be
some junk left over from the previous test runs.

I'm looking into it this week; it may not make the developer release but
we'll try to get it in.  BTW, the 03simpleseq.t test failures have to do
with a call to gzip.  I'll look into a workaround for that.  

Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but
introduces others.  One alternative which I found works is cygwin, but
there's a catch: DBD-mysql is hard to install.  If it isn't one thing it's
another...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
  _____  

From: Seth Johnson [mailto:johnson.biotech at gmail.com] 
Sent: Monday, October 23, 2006 11:37 AM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

 
Chris,

There's definite improvement:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed Test     Stat Wstat Total Fail  Failed  List of Failed
----------------------------------------------------------------------------
--- 
t/02species.t                 65    2   3.08%  63 65
t/03simpleseq.t    1   256    59  106 179.66%  7-59
t/04swiss.t                   52   14  26.92%  25 27-34 38-42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There's some weirdness going on during the 'swiss.t' test.  It almost seems
to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 &
41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): 
================================
not ok 25
# Test 25 got: '10097078' (t/04swiss.t at line 79)
#    Expected: '91309150'
ok 26
not ok 27
# Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t
at line 85) 
#    Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
not ok 28
# Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' (t/04swiss.t at line 86)
#    Expected: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' 
not ok 29
# Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
(t/04swiss.t at line 87)
#    Expected: 'Cell 66 (2), 383-394 (1991)'
not ok 30
# Test 30 got: <UNDEF> (t/04swiss.t at line 88) 
#    Expected: '91309150'
not ok 31
# Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t
at line 85 fail #2)
#    Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis, J.E. and Leffers,H.'
not ok 32
# Test 32 got: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' (t/04swiss.t at line 86 fail #2) 
#    Expected: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
not ok 33
# Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail
#2) 
#    Expected: 'Gene 134 (2), 283-287 (1993)'
not ok 34
# Test 34 got: <UNDEF> (t/04swiss.t at line 88 fail #2)
#    Expected: '94085792'
ok 35
ok 36
ok 37
not ok 38
# Test 38 got: <UNDEF> (t/04swiss.t at line 88 fail #3) 
#    Expected: '94253723'
not ok 39
# Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4)
#    Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.'
not ok 40
# Test 40 got: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
(t/04swiss.t at line 86 fail #4)
#    Expected: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' 
not ok 41
# Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail
#4)
#    Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
not ok 42
# Test 42 got: <UNDEF> (t/04swiss.t at line 88 fail #4) 
#    Expected: '99199225'
==============================

On 10/20/06, Chris Fields <cjfields at uiuc.edu> wrote:


Seth,

Did you work out the problem here?  There was a recent CVS update to OBDA
tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
apparently left data from tests in the database, which caused problems with 
repeated test runs.

Chris

> > -----Original Message-----
> > From: bioperl-l-bounces <at> lists.open-bio.org [mailto: bioperl-l-
> > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM 
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ...
>
> > FAILED tests 10-12
> >     Failed 3/12 tests, 75.00% okay
> > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> >
> --------------------------------------------------------------------------

> > -----
> > t\02species.t                 65    2   3.08%  63 65
> > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > t\04swiss.t                   52   14  26.92%  25 27-34 38-42 
> > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > t\16obda.t                    12    3  25.00%  10-12
> > _______________________________________________
> > Bioperl-l mailing list 
> > Bioperl-l <at> lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


-- 
Best Regards,


Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358 


From cjfields at uiuc.edu  Mon Oct 23 17:53:27 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 16:53:27 -0500
Subject: [Bioperl-l] Bioperl installation under Windows
In-Reply-To: <a8acf25f0610221254v18c8d172kc3ce6fa4ea34f676@mail.gmail.com>
References: <a8acf25f0610221254v18c8d172kc3ce6fa4ea34f676@mail.gmail.com>
Message-ID: <9994CFF6-FCA1-4C7F-9A33-31765C6AE255@uiuc.edu>

It won't install in Perl\lib, but in Perl\site\lib.  Check there.
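
One quick way to check where (or whether) the modules landed is to scan
Perl's module search path for a Bio directory.  This is only a minimal
sketch: the helper name find_bio_dirs is made up for illustration, and it
only tests for a directory called Bio, not for a complete BioPerl install.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return every directory from the given list that
# contains a Bio/ subdirectory. Code references in @INC are skipped.
sub find_bio_dirs {
    my @dirs = @_;
    return grep { !ref($_) && -d File::Spec->catdir($_, 'Bio') } @dirs;
}

my @found = find_bio_dirs(@INC);
if (@found) {
    print "Bio/ found under: $_\n" for @found;
}
else {
    print "No Bio/ directory found in \@INC\n";
}
```

On ActivePerl, a PPM install would be expected to show up under the
site\lib entry rather than under lib.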

We are working intently on the next developer release for BioPerl and  
plan on having several PPMs available, but we only are supporting  
ActivePerl 5.8.8.819.  I would suggest that you upgrade your  
ActivePerl installation to that if possible since PPM has undergone  
major changes (they use PPM4 now, which has a GUI by default).  Most  
repositories are now moving over to using PPM4 so you'll likely be  
seeing fewer PPM3-compatible packages being made.

Chris

On Oct 22, 2006, at 2:54 PM, Pablo Ivan wrote:

> Hello,
>
> I have been trying to install Bioperl 1.4 on a Windows XP system,  
> but I
> didn't get too far; my perl installation was made using ActiveState
> 5.8.8build 816. I then tried the ppm method of searching for bioperl
> in the
> repositories and installing the core package 1.4. It says that the
> installation was made successfully, but the /Bio folder doesn't  
> show up in
> /lib, and it's like nothing new was installed at all. I was  
> wondering if
> using that version of ActiveState could be causing it, but the  
> uninstall
> option for it isn't showing in Add/Remove, and I'm afraid just  
> deleting the
> folders and installing version 5.6 of AS could somehow damage and make
> things worse. Or should I just forget about it and try using Cygwin?
>
> Thank you,
>
> Pablo.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnson.biotech at gmail.com  Mon Oct 23 17:22:13 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 23 Oct 2006 17:22:13 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <002c01c6f6c8$7163dd20$15327e82@pyrimidine>
References: <b99962880610230936t3024173s5d6f21526dac87a4@mail.gmail.com>
	<002c01c6f6c8$7163dd20$15327e82@pyrimidine>
Message-ID: <b99962880610231422o24029a0cu229fccc2b5809b85@mail.gmail.com>

Chris,

I have not cleaned my test database yet.  I'll purge it and redo the tests.

This error keeps popping up in unexpected places while running nmake during
installation:
 "Undefined subroutine &main::UpdateHTML_blib called at -e line 1.
NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code
'0xff'"

Is there a way around it??

Seth

On 10/23/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
>  Seth,
>
> Did you try this with a clean, taxonomy-installed database?  There may be
> some junk left over from the previous test runs.
>
> I'm looking into it this week; it may not make the developer release but
> we'll try to get it in.  BTW, the 03simpleseq.t test failures have to do
> with a call to gzip.  I'll look into a workaround for that.
>
> Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but
> introduces others.  One alternative which I found works is cygwin, but
> there's a catch: DBD-mysql is hard to install.  If it isn't one thing it's
> another...
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>   ------------------------------
>
> *From:* Seth Johnson [mailto:johnson.biotech at gmail.com]
> *Sent:* Monday, October 23, 2006 11:37 AM
> *To:* Chris Fields
> *Cc:* bioperl-l
> *Subject:* Re: Error retrieving sequence from BioSQL
>
>
>
> Chris,
>
> There's definite improvement:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> -------------------------------------------------------------------------------
>
> t/02species.t                 65    2   3.08%  63 65
> t/03simpleseq.t    1   256    59  106 179.66%  7-59
> t/04swiss.t                   52   14  26.92%  25 27-34 38-42
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> There's some weirdness going on during the 'swiss.t' test.  It almost
> seems to me that expectations of some tests are swapped (27 & 39, 28 & 40,
> 29 & 41, 31 & 27, 32 & 28, 33 & 29, 39 & 31):
> ================================
> not ok 25
> # Test 25 got: '10097078' (t/04swiss.t at line 79)
> #    Expected: '91309150'
> ok 26
> not ok 27
> # Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t
> at line 85)
> #    Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
> not ok 28
> # Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic
> mitochondrial matrix protein' (t/04swiss.t at line 86)
> #    Expected: 'Functional expression of cloned human splicing factor SF2:
> homology to RNA-binding proteins, U1 70K, and Drosophila splicing
> regulators'
> not ok 29
> # Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
> (t/04swiss.t at line 87)
> #    Expected: 'Cell 66 (2), 383-394 (1991)'
> not ok 30
> # Test 30 got: <UNDEF> (t/04swiss.t at line 88)
> #    Expected: '91309150'
> not ok 31
> # Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
> (t/04swiss.t at line 85 fail #2)
> #    Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
> Celis, J.E. and Leffers,H.'
> not ok 32
> # Test 32 got: 'Functional expression of cloned human splicing factor SF2:
> homology to RNA-binding proteins, U1 70K, and Drosophila splicing
> regulators' (t/04swiss.t at line 86 fail #2)
> #    Expected: 'Cloning and expression of a cDNA covering the complete
> coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
> not ok 33
> # Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail
> #2)
> #    Expected: 'Gene 134 (2), 283-287 (1993)'
> not ok 34
> # Test 34 got: <UNDEF> (t/04swiss.t at line 88 fail #2)
> #    Expected: '94085792'
> ok 35
> ok 36
> ok 37
> not ok 38
> # Test 38 got: <UNDEF> (t/04swiss.t at line 88 fail #3)
> #    Expected: '94253723'
> not ok 39
> # Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
> Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4)
> #    Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.'
> not ok 40
> # Test 40 got: 'Cloning and expression of a cDNA covering the complete
> coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
> (t/04swiss.t at line 86 fail #4)
> #    Expected: 'Crystal structure of human p32, a doughnut-shaped acidic
> mitochondrial matrix protein'
> not ok 41
> # Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail
> #4)
> #    Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
> not ok 42
> # Test 42 got: <UNDEF> (t/04swiss.t at line 88 fail #4)
> #    Expected: '99199225'
> ==============================
>
>  On 10/20/06, *Chris Fields* < cjfields at uiuc.edu> wrote:
>
>
>
> Seth,
>
> Did you work out the problem here?  There was a recent CVS update to OBDA
> tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
> apparently left data from tests in the database, which caused problems
> with
> repeated test runs.
>
> Chris
>
> > > -----Original Message-----
> > > From: bioperl-l-bounces <at> lists.open-bio.org [mailto: bioperl-l-
> > > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > > Sent: Saturday, September 30, 2006 6:35 PM
> > > To: Hilmar Lapp
> > > Cc: Chris Fields; Bioperl List
> > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> > >
> > > Here're complete test details:
> > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > ...
> >
> > > FAILED tests 10-12
> > >     Failed 3/12 tests, 75.00% okay
> > > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> > >
> >
> --------------------------------------------------------------------------
> > > -----
> > > t\02species.t                 65    2   3.08%  63 65
> > > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > > t\16obda.t                    12    3  25.00%  10-12
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l <at> lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
>


-- 
Best Regards,


Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358


From chhalling at alumni.ls.berkeley.edu  Mon Oct 23 21:02:24 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Mon, 23 Oct 2006 21:02:24 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C6509.90005@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk>
Message-ID: <453D6620.5020401@alumni.ls.berkeley.edu>

Sorry, I should know better about giving all the details.

This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a 
fresh compile) with Mac OS X 10.4.8.

-- Conrad

Nathan S. Haigh wrote:
> Chris Fields wrote:
>   
>> Thanks for letting us know!  Did PPM4 throw errors or just silently  
>> pass them over?
>>
>> Chris
>>
>> On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote:
>>
>>   
>>     
> I believe he is talking about the bundle on cpan and not the ppd. I will
> get this updated as soon as possible.
>
> Sendu/Chris - can you confirm to me which Bioperl modules are essential
> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
> reason for not putting *all* dependencies into the bundle?
>
> Nath
>
>
>
>
>
>   


-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From n.haigh at sheffield.ac.uk  Tue Oct 24 03:05:53 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 24 Oct 2006 08:05:53 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453D6620.5020401@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453D6620.5020401@alumni.ls.berkeley.edu>
Message-ID: <453DBB51.6010505@sheffield.ac.uk>

Conrad Halling wrote:
> Sorry, I should know better about giving all the details.
>
> This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a 
> fresh compile) with Mac OS X 10.4.8.
>
> -- Conrad
>
>   
My apologies Conrad, this was my bad! Are you in need of the corrections 
being made swiftly or can you wait until the Bioperl 1.5.2 release when 
I'll ensure the Bundle is updated correctly for that release?

Cheers
Nath


From n.haigh at sheffield.ac.uk  Tue Oct 24 05:57:25 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 10:57:25 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CE2D7.5080608@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>
	<453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk>
Message-ID: <453DE385.8010700@sheffield.ac.uk>

--snip--
> Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be 
> treated higher than 1.4? Anyway, we can cross that bridge when we get 
> there, but this seems appropriate now.
>
>
> Cheers,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>   
Just been having a think about this versioning. Does this work well and
is it intuitive with versioning the official 1.5.2 developer release and
also the 1.6 stable release? I'd like to put forward the following
versioning scheme for consideration (most is the same as what it is now,
but with some clarification - hopefully):
major-version . minor-version sub-version _ developer-release-version
RC-version

The sub-version represents bug-fixes and possibly some minor feature
enhancements with no API changes.
The minor-version represents some significant feature enhancements/API
changes/bug fixes.
The major-version represents significant rewrites of Bioperl.

For an RC of a developer release the version would have _0x (where x=the
RC number)
For a non RC of a developer release the version would have _10
For an RC of a stable release the version would have _0x (where x=RC number)
For a non RC of a stable release the version would not have the
underscore suffix

Therefore I would see the following $VERSION being applied:
1.5.2 RC1            = 1.52_01
1.5.2 RC2            = 1.52_02
1.5.2 RC3            = 1.52_03
1.5.2                = 1.52_10
1.6 RC1              = 1.60_01
1.6 RC2              = 1.60_02
1.6                  = 1.60
1.6.1 RC1            = 1.61_01
1.6.1                = 1.61
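
The mapping in the table above can be sketched as a small function.  This
is only an illustration of the proposed rules, not anything that exists in
Bioperl: the name proposed_version is hypothetical, and the sketch assumes
single-digit minor and sub versions, as in all the examples listed.

```perl
use strict;
use warnings;

# Hypothetical helper: map a dotted release name plus an optional RC
# number to the $VERSION string proposed in the table.
sub proposed_version {
    my ($release, $rc) = @_;
    my ($major, $minor, $sub) = split /\./, $release;
    $sub = 0 unless defined $sub;          # "1.6" is treated as "1.6.0"
    my $version = "$major.$minor$sub";
    if (defined $rc) {
        $version .= sprintf '_%02d', $rc;  # any release candidate: _0x
    }
    elsif ($minor % 2) {
        $version .= '_10';                 # final developer release (odd minor)
    }
    return $version;                       # final stable release: no suffix
}

for my $case (['1.5.2', 1], ['1.5.2'], ['1.6', 2], ['1.6'], ['1.6.1']) {
    my ($release, $rc) = @$case;
    my $label = $release . (defined $rc ? " RC$rc" : '');
    printf "%-10s = %s\n", $label, proposed_version($release, $rc);
}
```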

This should satisfy the requirement of CPAN for having underscores in
versions to indicate a developer release, which here is a Bioperl
release with an odd minor version number or any RC whether it be of a
developer release or a stable release. This should mean that we could
have the RC's on CPAN, but by default, CPAN would only install the
latest "non developer release" (i.e. the last package without an
underscore in the version).

If we are going ahead with the new $VERSION scheme (as it currently is
in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
1.52 instead of Bioperl 1.5.2 and make an effort to sync the
documentation with regards to this.

Nath


From bix at sendu.me.uk  Tue Oct 24 06:19:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 11:19:05 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DE385.8010700@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>
	<453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk>
	<453DE385.8010700@sheffield.ac.uk>
Message-ID: <453DE899.4030603@sendu.me.uk>

Nathan Haigh wrote:
>
> Therefore I would see the following $VERSION being applied:
> 1.5.2 RC1            = 1.52_01
> 1.5.2 RC2            = 1.52_02
> 1.5.2 RC3            = 1.52_03
> 1.5.2                = 1.52_10
> 1.6 RC1              = 1.60_01
> 1.6 RC2              = 1.60_02
> 1.6                  = 1.60
> 1.6.1 RC1            = 1.61_01
> 1.6.1                = 1.61
> 
> This should satisfy the requirement of CPAN for having underscores in
> versions to indicate a developer release, which here is a Bioperl
> release with an odd minor version number or any RC whether it be of a
> developer release or a stable release. This should mean that we could
> have the RC's on CPAN, but by default, CPAN would only install the
> latest "non developer release" (i.e. the last package without an
> underscore in the version).

That all sounds good to me, except I worry about potential confusion if 
people look manually at the things available in CPAN, see 1.60_02 and 
think it is more recent than 1.60 and try to install it manually.

Since
$VERSION = 1.52_10;
is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, 
final release version should be
$VERSION = 1.6010.
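
To make the numeric behaviour concrete, here is a minimal sketch using
plain Perl only.  The variable names are made up for illustration; the
point is that an underscore inside a numeric literal is ignored, while a
quoted version string keeps its underscore until it is stripped.

```perl
use strict;
use warnings;

# Underscores in a numeric literal are ignored by the Perl parser,
# so this is the number 1.5210.
my $dev_release = 1.52_10;
print "dev release as number: $dev_release\n";   # prints 1.521

# A quoted version keeps its underscore, so strip it explicitly
# before comparing numerically.
my $rc_string = '1.60_02';
(my $rc_number = $rc_string) =~ tr/_//d;          # now "1.6002"

my $final = 1.60;
print "RC2 sorts ",
      ($rc_number > $final ? "after" : "before"),
      " the final release when compared as numbers\n";
```

Compared purely as numbers, 1.60_02 is greater than 1.60, which is the
ordering concern raised above.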


> If we are going ahead with the new $VERSION scheme (as it currently is
> in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
> documentation with regards to this.

I might disagree with this though. I think perl people, and perhaps unix 
people in general, should be used to version numbers like '1.5.2', but 
then getting '1.52' from the code since such a number allows simple 
numerical comparisons while the former does not. The former is easier to 
read and understand. This is just how Perl itself behaves.

Most users who wouldn't expect such a behaviour aren't going to be 
checking the version number programmatically anyway.


BTW. do we have someone with a CPAN account, or should I get one?


From n.haigh at sheffield.ac.uk  Tue Oct 24 07:37:12 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 12:37:12 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DE899.4030603@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk>
Message-ID: <453DFAE8.5050602@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>   
>> Therefore I would see the following $VERSION being applied:
>> 1.5.2 RC1            = 1.52_01
>> 1.5.2 RC2            = 1.52_02
>> 1.5.2 RC3            = 1.52_03
>> 1.5.2                = 1.52_10
>> 1.6 RC1              = 1.60_01
>> 1.6 RC2              = 1.60_02
>> 1.6                  = 1.60
>> 1.6.1 RC1            = 1.61_01
>> 1.6.1                = 1.61
>>
>> This should satisfy the requirement of CPAN for having underscores in
>> versions to indicate a developer release, which here is a Bioperl
>> release with an odd minor version number or any RC whether it be of a
>> developer release or a stable release. This should mean that we could
>> have the RC's on CPAN, but by default, CPAN would only install the
>> latest "non developer release" (i.e. the last package without an
>> underscore in the version).
>>     
>
> That all sounds good to me, except I worry about potential confusion if 
> people look manually at the things available in CPAN, see 1.60_02 and 
> think it is more recent than 1.60 and try to install it manually.
>
>   

I'm not sure this would be a problem. As far as I understand, CPAN
treats packages with underscores in $VERSION as something distinctly
different from the other releases (i.e. as developer releases).
If you look at such a page, it is clearly evident that it is a
developer release. For example, if you search on CPAN for the latest
version of the CPAN module, it shows 1.8802. If you go to that page:
http://search.cpan.org/~andk/CPAN-1.8802/
there is also a link to the latest developer release, released 1 day
after 1.8802 with a version of 1.88_57 (which would convert to 1.8857).
This too appears to be later than 1.8802, but since it is dealt with as
a developer release it doesn't seem to matter: CPAN will only install
the stable (non-developer) releases, while the underscore versions
remain available to anyone who wants a developer release. Although I'm
thinking CPAN uses some hocus pocus with release dates too.
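
As a rough illustration of the convention (and only the convention; this
is not how the CPAN indexer is actually implemented), picking the "latest"
release while ignoring anything with an underscore could look like the
sketch below.  The function name latest_stable is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: ignore any version containing an underscore
# (a developer release) and return the highest remaining version.
sub latest_stable {
    my @stable = grep { !/_/ } @_;
    my ($latest) = sort { $b <=> $a } @stable;
    return $latest;
}

# 1.88_57 is numerically higher, but it is skipped as a developer release.
print latest_stable('1.8802', '1.88_57', '1.87'), "\n";
```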

> Since
> $VERSION = 1.52_10;
> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, 
> final release version should be
> $VERSION = 1.6010.
>
>
>   

Because they are dealt with separately, I don't think this is an issue
(see above).

>> If we are going ahead with the new $VERSION scheme (as it currently is
>> in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
>> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
>> documentation with regards to this.
>>     
>
> I might disagree with this though. I think perl people, and perhaps unix 
> people in general, should be used to version numbers like '1.5.2', but 
> then getting '1.52' from the code since such a number allows simple 
> numerical comparisons while the former does not. The former is easier to 
> read and understand. This is just how Perl itself behaves.
>
> Most users who wouldn't expect such a behaviour aren't going to be 
> checking the version number programatically anyway.
>
>
> BTW. do we have someone with a CPAN account, or should I get one?
>   

It says Ewan Birney is the author of Bioperl - I assume it must be
possible to have multiple people have the permissions to update a single
package.

Nath


From chhalling at alumni.ls.berkeley.edu  Tue Oct 24 07:15:12 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Tue, 24 Oct 2006 07:15:12 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453DBB51.6010505@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk>
	<453D6620.5020401@alumni.ls.berkeley.edu>
	<453DBB51.6010505@sheffield.ac.uk>
Message-ID: <453DF5C0.3040104@alumni.ls.berkeley.edu>

Nathan S. Haigh wrote:
> Conrad Halling wrote:
>> Sorry, I should know better about giving all the details.
>>
>> This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 
>> (a fresh compile) with Mac OS X 10.4.8.
>>
>> -- Conrad  
> My apologies Conrad, this was my bad! Are you in need of the 
> corrections being made swiftly or can you wait until the Bioperl 1.5.2 
> release when I'll ensure the Bundle is updated correctly for that 
> release?
>
> Cheers
> Nath

No, I'm fine. I used the cpan utility to load the three modules manually.

-- Conrad

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From bix at sendu.me.uk  Tue Oct 24 08:16:54 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 13:16:54 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DFAE8.5050602@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
Message-ID: <453E0436.3050903@sendu.me.uk>

Nathan Haigh wrote:
> Sendu Bala wrote:
>
>> That all sounds good to me, except I worry about potential confusion if 
>> people look manually at the things available in CPAN, see 1.60_02 and 
>> think it is more recent than 1.60 and try to install it manually.
> 
> I not sure if this would be a problem. As far as I understand, CPAN
> treats these packages with underscores in $VERSION as something
> distinctly different to the others releases (i.e. developer releases).
> If you look at such a page, it is clearly evident that it is a
> developers release. For example, if you search on CPAN for the latest
> version of the CPAN module is shows 1.8802. if you go to that page:
> http://search.cpan.org/~andk/CPAN-1.8802/
> There is also a link for the latest developer release, released 1 day
> after 1.8802 with a version of 1.88_57 (which would convert to 1.8857).

[snip]

>> Since
>> $VERSION = 1.52_10;
>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, 
>> final release version should be
>> $VERSION = 1.6010.
>
> Because they are dealt with separately, I don't think this is an issue
> (see above).

If you don't notice the dates, or are doing numerical version number
comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
not be automatic, but you can still choose to download the developer
releases. That means if we say to someone 'use Bioperl 1.6 or better'
they may choose to get the latest version and think it is 1.6002 when
in fact 1.60 was the more recent version. 1.6010 solves the problem, is
consistent with your 1.50_10 suggestion, and doesn't cause any problems
as far as I can see.
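
To make the ordering concern concrete, here is a minimal sketch in plain
Perl. The helper name numify_version is made up for this illustration and
is not part of Bioperl or CPAN; it only strips the underscore and compares
the results as ordinary numbers, which is roughly what a naive numerical
comparison would do.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a version string such as '1.60_02' into a
# plain number by dropping the underscore.
sub numify_version {
    my ($string) = @_;
    (my $digits = $string) =~ tr/_//d;   # '1.60_02' -> '1.6002'
    return $digits + 0;
}

my $rc2     = numify_version('1.60_02');   # release candidate 2
my $release = numify_version('1.60');      # final release numbered plain 1.60
my $final   = numify_version('1.6010');    # final release numbered 1.6010

print "1.60_02 compares greater than 1.60\n"   if $rc2   > $release;
print "1.6010 compares greater than 1.60_02\n" if $final > $rc2;
```

Under these assumptions an RC numbered 1.60_02 outranks a final release
numbered 1.60, while a final release numbered 1.6010 outranks the RC.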


>>> If we are going ahead with the new $VERSION scheme (as it currently is
>>> in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
>>> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
>>> documentation with regards to this.
>>>     
>> I might disagree with this though. I think perl people, and perhaps unix 
>> people in general, should be used to version numbers like '1.5.2', but 
>> then getting '1.52' from the code since such a number allows simple 
>> numerical comparisons while the former does not. The former is easier to 
>> read and understand. This is just how Perl itself behaves.
>>
>> Most users who wouldn't expect such a behaviour aren't going to be 
>> checking the version number programatically anyway.
>>
>>
>> BTW. do we have someone with a CPAN account, or should I get one?
>>   
> 
> It says Ewan Birney is the author of Bioperl - I assume it must be
> possible to have multiple people have the permissions to update a single
> package.

How did you get Bundle::BioPerl updated? Did you just ask Chris 
Dagdigian to do it for you? Or do you have access to his account? I'll 
ask Ewan about it.


From n.haigh at sheffield.ac.uk  Tue Oct 24 08:21:56 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 13:21:56 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0436.3050903@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
	<453E0436.3050903@sendu.me.uk>
Message-ID: <453E0564.9030302@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> Sendu Bala wrote:
>>
>>> That all sounds good to me, except I worry about potential confusion
>>> if people look manually at the things available in CPAN, see 1.60_02
>>> and think it is more recent than 1.60 and try to install it manually.
>>
>> I not sure if this would be a problem. As far as I understand, CPAN
>> treats these packages with underscores in $VERSION as something
>> distinctly different to the others releases (i.e. developer releases).
>> If you look at such a page, it is clearly evident that it is a
>> developers release. For example, if you search on CPAN for the latest
>> version of the CPAN module is shows 1.8802. if you go to that page:
>> http://search.cpan.org/~andk/CPAN-1.8802/
>> There is also a link for the latest developer release, released 1 day
>> after 1.8802 with a version of 1.88_57 (which would convert to 1.8857).
>
> [snip]
>
>>> Since
>>> $VERSION = 1.52_10;
>>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before
>>> release, final release version should be
>>> $VERSION = 1.6010.
>>
>> Because they are dealt with separately, I don't think this is an issue
>> (see above).
>
> If you don't notice the dates, or are doing numerical version number
> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
> not be automatic, but you can still chose to download the developer
> releases. Which means if we say to someone 'use Bioperl 1.6 or better'
> they may choose to get the latest version and think it is 1.6002 when
> infact 1.60 was the more recent version. 1.6010 solves the problem, is
> consistent with your 1.50_10 suggestion, and doesn't cause any
> problems as far as I can see.
>
>

I see - you mean for a non-RC release append 10 to the version number
and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
the version.

--snip--
>
> How did you get Bundle::BioPerl updated? Did you just ask Chris
> Dagdigian to do it for you? Or do you have access to his account? I'll
> ask Ewan about it.
I just asked Chris D. to do it for me :o)

Nath


From bix at sendu.me.uk  Tue Oct 24 09:01:22 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 14:01:22 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0564.9030302@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
	<453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk>
Message-ID: <453E0EA2.6050306@sendu.me.uk>

Nathan Haigh wrote:
> I see - you mean for a non-RC release append 10 to the version number
> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
> the version.

Precisely.

1.5.2 RC3 will have in Bio::Root::Version :

$VERSION = 1.52_03;
$VERSION = eval $VERSION; # $VERSION is 1.5203

1.5.2 final release would have:

$VERSION = 1.52_10;
$VERSION = eval $VERSION; # $VERSION is 1.5210

1.6.0 RC1 would have:

$VERSION = 1.60_01;
$VERSION = eval $VERSION; # $VERSION is 1.6001

1.6.0 final release would have:

$VERSION = 1.6010;


Nice thing about putting RCs up on CPAN is that I suppose we'd see the 
test results from cpantesters. The more test results the better :)


From n.haigh at sheffield.ac.uk  Tue Oct 24 09:05:54 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 14:05:54 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0EA2.6050306@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
	<453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk>
	<453E0EA2.6050306@sendu.me.uk>
Message-ID: <453E0FB2.4080002@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> I see - you mean for a non-RC release append 10 to the version number
>> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
>> the version.
>
> Precisely.
>
> 1.5.2 RC3 will have in Bio::Root::Version :
>
> $VERSION = 1.52_03;
> $VERSION = eval $VERSION; # $VERSION is 1.5203
>
> 1.5.2 final release would have:
>
> $VERSION = 1.52_10;
> $VERSION = eval $VERSION; # $VERSION is 1.5210
>
> 1.6.0 RC1 would have:
>
> $VERSION = 1.60_01;
> $VERSION = eval $VERSION; # $VERSION is 1.6001
>
> 1.6.0 final release would have:
>
> $VERSION = 1.6010;
>
>
> Nice thing about putting RCs up on CPAN is that I suppose we'd see the
> test results from cpantesters. The more test results the better :)
Did you see the cpants site I sent earlier:
http://cpants.perl.org/dist/bioperl

But I'm not sure why 1.4 didn't make it in there instead of 1.2.3


From bix at sendu.me.uk  Tue Oct 24 09:14:08 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 14:14:08 +0100
Subject: [Bioperl-l] CPAN testing Service
In-Reply-To: <453D2120.9010301@sheffield.ac.uk>
References: <453D2120.9010301@sheffield.ac.uk>
Message-ID: <453E11A0.20304@sendu.me.uk>

Nathan S. Haigh wrote:
> We should also check the CPAN testing service (CPANTS) to see how "good"
> our package is for CPAN and try to increase the Kwalitee score. There
> only appears to be details for bioperl-1.2.3 for some reason:
> http://cpants.perl.org/dist/bioperl

Yes, but I think it will get a pretty similar score this time round. We'll
resolve the remaining issues for 1.6.


From cjfields at uiuc.edu  Tue Oct 24 10:24:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Oct 2006 09:24:44 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0436.3050903@sendu.me.uk>
Message-ID: <000501c6f778$279cee10$15327e82@pyrimidine>

...
> >> Since
> >> $VERSION = 1.52_10;
> >> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release,
> >> final release version should be
> >> $VERSION = 1.6010.
> >
> > Because they are dealt with separately, I don't think this is an issue
> > (see above).
> 
> If you don't notice the dates, or are doing numerical version number
> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
> not be automatic, but you can still chose to download the developer
> releases. Which means if we say to someone 'use Bioperl 1.6 or better'
> they may choose to get the latest version and think it is 1.6002 when
> infact 1.60 was the more recent version. 1.6010 solves the problem, is
> consistent with your 1.50_10 suggestion, and doesn't cause any problems
> as far as I can see.

CPAN looks like it can handle 'x.y.z', at least for Pugs:

http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/

From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':

our $VERSION = 6.002013;

That's also a very perlish way to do it.  And there are no developer
versions of Pugs, since it is always under active development.  We could try
something like:

our $VERSION = 1.005002_01;

just to tag it as a developer release or release candidate, if that's what
you want; I'm neutral to that point.  I don't think it's necessary to post
every RC to CPAN, though, unless you feel very strongly about it.  It just
seems like more hassle than it's worth, esp. since you've been releasing
about one per week leading up to a final 1.5.2 (due soon).  

> >> I might disagree with this though. I think perl people, and perhaps
> unix
> >> people in general, should be used to version numbers like '1.5.2', but
> >> then getting '1.52' from the code since such a number allows simple
> >> numerical comparisons while the former does not. The former is easier
> to
> >> read and understand. This is just how Perl itself behaves.
> >>
> >> Most users who wouldn't expect such a behaviour aren't going to be
> >> checking the version number programatically anyway.
> >>
> >>
> >> BTW. do we have someone with a CPAN account, or should I get one?
> >>
> >
> > It says Ewan Birney is the author of Bioperl - I assume it must be
> > possible to have multiple people have the permissions to update a single
> > package.

As a quick response to the above, I would read 'rel. 1.5.2' as the second
patched release of the second revision (here in a developer cycle) of the
first major release.  I would read 'rel 1.52' as the 52nd release of the
major release (just can't quite make it to version 2, I guess).  I don't
think we can use the latter as it is just too confusing, especially since
we've adopted the 'major.minor.patch' versioning quite early on.  

As for CPAN, I believe there is usually a person or group responsible for
maintaining each distribution.  As Ewan seems to be the point man, you'll
have to ask him.  I suppose it is possible to add more maintainers if needed.

> How did you get Bundle::BioPerl updated? Did you just ask Chris
> Dagdigian to do it for you? Or do you have access to his account? I'll
> ask Ewan about it.

When I inquired about XML::Simple, I emailed Chris D. via his contact
information from CPAN.  He let me know that adding it would be pretty easy,
so all you need to do is let him know about any errors/additions/deletions.
I think his wiki page also has some contact info.  

Which reminds me, if anyone contacts him, could you make sure that
XML::Simple is added?  I can't remember if it has been.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 24 10:29:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Oct 2006 09:29:11 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0FB2.4080002@sheffield.ac.uk>
Message-ID: <000601c6f778$c639f0e0$15327e82@pyrimidine>

> Sendu Bala wrote:
> > Nathan Haigh wrote:
> >> I see - you mean for a non-RC release append 10 to the version number
> >> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
> >> the version.
> >
> > Precisely.
> >
> > 1.5.2 RC3 will have in Bio::Root::Version :
> >
> > $VERSION = 1.52_03;
> > $VERSION = eval $VERSION; # $VERSION is 1.5203
> >
> > 1.5.2 final release would have:
> >
> > $VERSION = 1.52_10;
> > $VERSION = eval $VERSION; # $VERSION is 1.5210
> >
> > 1.6.0 RC1 would have:
> >
> > $VERSION = 1.60_01;
> > $VERSION = eval $VERSION; # $VERSION is 1.6001
> >
> > 1.6.0 final release would have:
> >
> > $VERSION = 1.6010;
> >
> >
> > Nice thing about putting RCs up on CPAN is that I suppose we'd see the
> > test results from cpantesters. The more test results the better :)
> Did you see the cpants site I sent earlier:
> http://cpants.perl.org/dist/bioperl
> 
> But I'm not sure why 1.4 didn't make it in there instead of 1.2.3

Yes, odd.  Another thing to note is that CPAN also lists two bugs related to
bioperl 1.4.  We may need to have some way of either redirecting users from
there to bugzilla, or routinely checking the CPAN site.  Otherwise we'll
miss those. 

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From JK at novozymes.com  Tue Oct 24 10:45:26 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 16:45:26 +0200
Subject: [Bioperl-l] Keeping references around in the objects?
Message-ID: <934F95E71B6C9347A873C42AE3C196191299E011@NZT0004E.dknz.nzcorp.net>

Hi All. 

When getting a Bio::Seq object back from a feature it would be really 
nice to have access to the old objects through the new object as:

$featseq->feature()->parent_seq();

Would it be possible to keep the references around so that (as an
example) the global information can be accessed through the particular
feature?

Most of the annotation in the general header of an EMBL/GenBank record
also applies to the specific features.

Jesper


From JK at novozymes.com  Tue Oct 24 10:28:22 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 16:28:22 +0200
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
Message-ID: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>

Hi. 

We're trying to "extend" bioperl in our own setup. We have some functions
that we'd like to "always" have available on a Bio::Seq object. As an
example, I'd like to have the sequence digest available on ->digest, which
just returns a hex-encoded message digest of the sequence in the object.
This is really convenient when trying to figure out whether we've got some
computations stored in the cache for this particular sequence.

Another example is that we have some fields we want to be mandatory in
the objects, so adding additional checks in the constructor is necessary.

Our approach has been to "subclass" Bio::Seq in a new object (Nz::Seq)
and add the functionality there. This generally works fine (->translate()
calls ->can_call_new() and instantiates the correct subclassed object).

But the logic fails when the ->seq of a feature just instantiates a
Bio::PrimarySeq without trying to get the subclassed object.

So the question basically is:
What is the preferred way of extending/subclassing Bioperl objects
with our own methods?

Jesper


From bix at sendu.me.uk  Tue Oct 24 11:26:19 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 16:26:19 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <000501c6f778$279cee10$15327e82@pyrimidine>
References: <000501c6f778$279cee10$15327e82@pyrimidine>
Message-ID: <453E309B.9090007@sendu.me.uk>

Chris Fields wrote:
> ...
>>>> Since
>>>> $VERSION = 1.52_10;
>>>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release,
>>>> final release version should be
>>>> $VERSION = 1.6010.
>>> Because they are dealt with separately, I don't think this is an issue
>>> (see above).
>> If you don't notice the dates, or are doing numerical version number
>> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
>> not be automatic, but you can still chose to download the developer
>> releases. Which means if we say to someone 'use Bioperl 1.6 or better'
>> they may choose to get the latest version and think it is 1.6002 when
>> infact 1.60 was the more recent version. 1.6010 solves the problem, is
>> consistent with your 1.50_10 suggestion, and doesn't cause any problems
>> as far as I can see.
> 
> CPAN looks like it can handle 'x.y.z', at least for Pugs:
> 
> http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/

'handle'? I think it shows up as '6.2.13' simply because it was uploaded 
with the filename Perl6-Pugs-6.2.13.tar.gz


As you point out, the code has the kind of $VERSION number we've been 
suggesting in this thread:

> From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':
> 
> our $VERSION = 6.002013;
> 
> That's also a very perlish-way to do it.  And there are no developer
> versions of Pugs, since it is always under active development.  We could try
> something like:
> 
> our $VERSION = 1.005002_01;

Yes, this was already like one of my suggestions (1.0502_01), but I 
brought up the concern that 1.05 might be < 1.4.

So then we have a question: do we try and fumble a 1.4 compatible number 
by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if 
it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no 
room for RC numbering, or 1.006000010 (1.6.0.10) - the first final 
release following some 1.006000_001 (1.6.0.01 == rc1) RCs?
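
To illustrate the trade-off, here is a rough sketch of the Pugs-style
conversion and of how its result compares with the existing 1.4. The
helper dotted_to_decimal is hypothetical (it is not taken from Pugs or
Bioperl) and simply pads each component after the first to three digits.

```perl
use strict;
use warnings;

# Hypothetical helper: 'x.y.z' -> 'x.yyyzzz', three digits per component.
sub dotted_to_decimal {
    my ($dotted) = @_;
    my ($major, $minor, $patch) = split /\./, $dotted;
    return sprintf '%d.%03d%03d', $major, $minor || 0, $patch || 0;
}

for my $release ('1.5.2', '1.6.0', '6.2.13') {
    printf "%-7s => %s\n", $release, dotted_to_decimal($release);
}

# The already released 1.4, compared as a plain number.
my $old = 1.4;
my $new = dotted_to_decimal('1.6.0');   # '1.006000'
print "1.6.0 would compare lower than 1.4\n" if $new < $old;
```

Compared as plain numbers, 1.006000 is smaller than 1.4, which is the
compatibility problem a clean break would have to accept.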


> just to tag it as a developer release or release candidate, if that's what
> you want; I'm neutral to that point.  I don't think it's necessary to post
> every RC to CPAN, though, unless you feel very strongly about it.  It just
> seems like more hassle than it's worth, esp. since you've been releasing
> about one per week leading up to a final 1.5.2 (due soon).  

I don't think it would be a hassle; on the contrary it would be very 
useful to know the CPAN distribution actually works. I'm very happy with 
the idea that a release candidate gets fully tested...


From bix at sendu.me.uk  Tue Oct 24 11:39:16 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 16:39:16 +0100
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
Message-ID: <453E33A4.5060004@sendu.me.uk>

JK (Jesper Agerbo Krogh) wrote:
> Hi. 
> 
> We're trying to "extend" bioperl in our own setup. We have some funtions
> that we'd like to "allways" have available on a Bio::Seq-object.
[snip]
> So the question basically is: 
> What is the preferred way of extending/subclassing Bio-perl -objects
> with our own methods? 

http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit


From hlapp at gmx.net  Tue Oct 24 12:24:09 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 24 Oct 2006 12:24:09 -0400
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
Message-ID: <F563FB85-D711-481A-A038-EDCA4B941C3C@gmx.net>

I think you've generally taken the right path, but see below.

First off, object factories are used extensively already but not yet  
in each and every place where Bioperl creates an object internally.  
Achieving your goal may entail fixes to Bioperl to use a factory  
instead of a hard-coded module name. Also be on the lookout for  
factory() or seq_factory() methods for classes whose work entails  
creating sequence objects and that already give you control over the  
type to be created.

The problem that hits you here though isn't one of determining the  
type of the object to be created, because the respective method  
doesn't create a sequence object. It only returns the sequence object  
that the feature has a reference to.

The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your
extension of the latter is that the Perl garbage collector can't deal
with circular references. The way we've circumvented the problem with
sequence objects (which hold references to their feature objects) and
feature objects (which need to hold a reference to their sequence
object) is to make Bio::Seq a wrapper around Bio::PrimarySeq (i.e.,
Bio::Seq implements Bio::PrimarySeqI by delegating all the
Bio::PrimarySeqI methods to an instance of Bio::PrimarySeq, and then
adds implementations of the Bio::SeqI methods), and then make feature
objects only hold a reference to the 'base' Bio::PrimarySeq instance.
This works because Bio::PrimarySeq doesn't hold features, only
Bio::SeqI objects do.
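
The wrapper arrangement can be sketched with toy classes. Everything
below is a simplified stand-in written for this illustration (the Toy::*
packages are not Bioperl code); it only shows the shape of the delegation
and why a feature ends up handing back the 'base' object.

```perl
use strict;
use warnings;

package Toy::PrimarySeq;   # stands in for Bio::PrimarySeq
sub new {
    my ($class, %args) = @_;
    return bless { seq => $args{seq} }, $class;
}
sub seq { return $_[0]{seq} }

package Toy::Feature;      # stands in for a sequence feature
sub new        { return bless {}, shift }
sub attach_seq { my ($self, $primary) = @_; $self->{seq} = $primary; return }
sub entire_seq { return $_[0]{seq} }

package Toy::Seq;          # stands in for Bio::Seq
sub new {
    my ($class, %args) = @_;
    my $self = { primary => Toy::PrimarySeq->new(%args), features => [] };
    return bless $self, $class;
}
sub seq { return $_[0]{primary}->seq }   # delegated to the wrapped object
sub add_feature {
    my ($self, $feature) = @_;
    # The feature gets the inner object, never the wrapper, so no cycle
    # wrapper -> feature -> wrapper is created.
    $feature->attach_seq($self->{primary});
    push @{ $self->{features} }, $feature;
    return;
}

package main;

my $seq     = Toy::Seq->new(seq => 'ATGC');
my $feature = Toy::Feature->new;
$seq->add_feature($feature);

print $seq->seq, "\n";                    # delegation in action
print ref($feature->entire_seq), "\n";    # class of what the feature holds
```

In this toy version the feature reports Toy::PrimarySeq, not Toy::Seq or
any subclass of it, which mirrors the behaviour described above.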

Having said all that, note that if all you want to do is define
computations on Bio::Seq objects, as opposed to storing values for
additional attributes, the best design approach is not to extend the
class but to create a class with those computations as static methods
(which would accept the seq object on which to compute as an argument;
e.g., print $seqComputations->message_digest($seq)).
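
A minimal sketch of that approach, assuming MD5 as the digest and a
made-up class name My::SeqComputations. The Toy::Seq package exists only
so the example runs without Bioperl installed; any object with a seq()
method returning the sequence string would do.

```perl
use strict;
use warnings;

package My::SeqComputations;   # hypothetical helper class
use Digest::MD5 qw(md5_hex);

# Static method: takes the sequence object as an argument and returns
# a hex-encoded digest of its sequence string.
sub message_digest {
    my ($class, $seq) = @_;
    return md5_hex($seq->seq);
}

package Toy::Seq;   # minimal stand-in for a sequence object
sub new { my ($class, $string) = @_; return bless { seq => $string }, $class }
sub seq { return $_[0]{seq} }

package main;

my $seq = Toy::Seq->new('ATGCATGC');
print My::SeqComputations->message_digest($seq), "\n";
```

Because the method only relies on seq(), it works the same whether it is
handed a Bio::Seq, a Bio::PrimarySeq or a local subclass.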

	-hilmar


On Oct 24, 2006, at 10:28 AM, JK ((Jesper Agerbo Krogh)) wrote:

> Hi.
>
> We're trying to "extend" bioperl in our own setup. We have some  
> funtions
>
> that we'd like to "allways" have available on a Bio::Seq-object. As an
> example,
> I'd like to have the sequence-digest available on ->digest that just
> returns
> A hex-encoded message-digest of the sequence in the object. This is
> really comfortable
> when trying to figure out wether we've got some computations stored in
> the cache
> for this particular sequence.
>
> Another example is that we have some fields we want to be mandatory in
> the objects,
> thus adding additional checks in the constructor is nessesary.
>
> Our approach has been to "subclass" Bio::Seq in a new object:  
> (Nz::Seq)
> and add
> the functionality there. This generally works fine (->translate()  
> calls
> ->can_call_new()
> and instantiates the correct subclassed object.
>
> But the logic fails when the ->seq of a feature just instantiates a
> Bio::PrimarySeq
> without trying to get the subclassed object.
>
> So the question basically is:
> What is the preferred way of extending/subclassing Bio-perl -objects
> with
> our own methods?
>
> Jesper
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct 24 12:45:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Oct 2006 11:45:25 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E309B.9090007@sendu.me.uk>
Message-ID: <000001c6f78b$d1c65a30$15327e82@pyrimidine>

...
> 
> 'handle'? I think it shows up as '6.2.13' simply because it was uploaded
> with the filename Perl6-Pugs-6.2.13.tar.gz

Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is
'6.002013'.  So maybe we should follow a similar convention.  Seems easier
and less confusing to me, at least.
 
> As you point out, the code has the kind of $VERSION number we've been
> suggesting in this thread:
> 
> > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':
> >
> > our $VERSION = 6.002013;
> >
> > That's also a very perlish-way to do it.  And there are no developer
> > versions of Pugs, since it is always under active development.  We could
> try
> > something like:
> >
> > our $VERSION = 1.005002_01;
> 
> Yes, this was already like one of my suggestions (1.0502_01), but I
> brought up the concern that 1.05 might be < 1.4.
> 
> So then we have a question: do we try and fumble a 1.4 compatible number
> by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if
> it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no
> room for RC numbering, or 1.006000010 (1.6.0.10) - the first final
> release following some 1.006000_001 (1.6.0.01 == rc1) RCs?

I would go for the clean break if it follows perl/CPAN convention.
'1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing.

If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6
RC1, 1.6 RC2 etc then that would be consistent and perl-compatible. 

BTW, the reason I looked at Pugs was to see what some of the Perl6
developers were using.  Who knows; they'll probably change it!

...

> I don't think it would be a hassle; on the contrary it would be very
> useful to know the CPAN distribution actually works. I'm very happy with
> the idea that a release candidate gets fully tested...

So you obviously feel strongly about it!  ;> 

I don't have a problem as long as we stick with doing this from now on (i.e.
have a consistent versioning scheme, release policy, CPAN release policy,
etc).  Would be nice for Jason/Brian/Hilmar to chime in as to the reasoning
behind the older versioning scheme.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From JK at novozymes.com  Tue Oct 24 13:59:10 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 19:59:10 +0200
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n
	et> <F563FB85-D711-481A-A038-EDCA4B941C3C@gmx.net>
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net>

>  
> I think you've generally taken the right path, but see below.
> 
> First off, object factories are used extensively already but not yet  
> in each and every place where Bioperl creates an object internally.  
> Achieving your goal may entail fixes to Bioperl to use a factory  
> instead of a hard-coded module name. Also be on the lookout for  
> factory() or seq_factory() methods for classes whose work entails  
> creating sequence objects and that already give you control over the  
> type to be created.

Can you elaborate/describe this a bit more? 

> The problem that hits you here though isn't one of determining the  
> type of the object to be created, because the respective method  
> doesn't create a sequence object. It only returns the sequence object  
> that the feature has a reference to.

That is what Data::Dumper told me. Another thing I would like to change
is to get a RichSeq object returned every time from Bio::Seq, adding
in the stuff that always seems appropriate.

> The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your  
> extension of the latter is that the Perl garbage collector can't deal  
> with circular references. 

Doesn't Scalar::Util::weaken solve that? 
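
For reference, a small sketch of what weaken does to such a cycle. The
Toy::* classes are invented for this example and say nothing about how
Bioperl itself is implemented; only Scalar::Util::weaken is a real call.

```perl
use strict;
use warnings;
use Scalar::Util ();

my $destroyed = 0;   # counts how many Toy::Seq objects have been freed

package Toy::Seq;
sub new { my ($class) = @_; return bless { features => [] }, $class }
sub add_feature {
    my ($self, $feature) = @_;
    push @{ $self->{features} }, $feature;   # seq -> feature (strong)
    $feature->{seq} = $self;                 # feature -> seq
    Scalar::Util::weaken($feature->{seq});   # make the back-reference weak
    return;
}
sub DESTROY { $destroyed++ }

package Toy::Feature;
sub new { return bless {}, shift }

package main;

{
    my $seq = Toy::Seq->new;
    $seq->add_feature(Toy::Feature->new);
}   # $seq goes out of scope here

print "sequence objects freed: $destroyed\n";
```

With the weaken line in place the sequence object is freed as soon as it
goes out of scope, so the count is 1. Without it the two objects keep each
other alive until the program exits.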

> Having said all that, note that if all what you want to do is  
> defining computations on Bio::Seq objects, as opposed to storing  
> values for additional attributes, the best design approach is not to  
> extend the class but to create a class with those computations as  
> static methods (which would accept the seq object on which to compute  
> as an argument; e.g., print $seqComputations->message_digest($seq)).

I could, but there is some functionality that, by design, I'd like to
have available on every sequence in the system. This way I would end up
coding the call for getting the message_digest in every place that
I need the value (which would be quite often in this application),
whereas by design it belongs in Bio::Seq itself.

Jesper


From JK at novozymes.com  Tue Oct 24 13:59:19 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 19:59:19 +0200
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n
	et> <453E33A4.5060004@sendu.me.uk>
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FD@NZT0004E.dknz.nzcorp.net>


> JK (Jesper Agerbo Krogh) wrote:
> > Hi. 
> > 
> > We're trying to "extend" bioperl in our own setup. We have some functions
> > that we'd like to "always" have available on a Bio::Seq-object.
> [snip]
> > So the question basically is: 
> > What is the preferred way of extending/subclassing Bio-perl -objects
> > with our own methods? 
> 
> http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit

That is definitely a way of extending Bio-perl, thanks.

Jesper


From hlapp at gmx.net  Tue Oct 24 14:57:02 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 24 Oct 2006 14:57:02 -0400
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n
	et> <F563FB85-D711-481A-A038-EDCA4B941C3C@gmx.net>
	<934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net>
Message-ID: <C8DB5DCD-E5BB-4AA0-9CDA-3C2EC7B88621@gmx.net>


On Oct 24, 2006, at 1:59 PM, JK (Jesper Agerbo Krogh) wrote:

>>
>> I think you've generally taken the right path, but see below.
>>
>> First off, object factories are used extensively already but not yet
>> in each and every place where Bioperl creates an object internally.
>> Achieving your goal may entail fixes to Bioperl to use a factory
>> instead of a hard-coded module name. Also be on the lookout for
>> factory() or seq_factory() methods for classes whose work entails
>> creating sequence objects and that already give you control over the
>> type to be created.
>
> Can you elaborate/describe this a bit more?

See for example the POD of Bio::SeqIO (sorry, the method is called  
sequence_factory()).

>
>> The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your
>> extension of the latter is that the Perl garbage collector can't deal
>> with circular references.
>
> Doesn't Scalar::Util::weaken solve that?

You're welcome to test and try. It should be a simple change in  
Bio::Seq::add_SeqFeature(). You will see that it is this method and  
not the feature object that makes sure the wrapped primarySeq gets  
passed as sequence reference. Just change that to creating a new  
reference to the sequence object and make it a weak reference before  
passing it to the feature object.

(The feature object has no requirement (or knowledge) that the  
referenced sequence object is a PrimarySeq.)
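
As a rough illustration of the weak-reference idea above, here is a
minimal sketch using only core Perl. The plain hashes are hypothetical
stand-ins for the sequence and feature objects; this is not the actual
Bio::Seq::add_SeqFeature() code, only the general pattern of weakening the
back-reference.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Hypothetical stand-ins for a sequence object and a feature object;
# plain hashes, not the real Bio::Seq / Bio::SeqFeature classes.
my $seq     = { id => 'seq1', features => [] };
my $feature = { start => 1, end => 10 };

# The sequence holds a strong reference to its feature ...
push @{ $seq->{features} }, $feature;

# ... and the feature points back at the sequence. Weakening this
# back-reference breaks the cycle, so the sequence can be freed once
# the last outside reference to it goes away.
$feature->{seq} = $seq;
weaken( $feature->{seq} );

print "back-reference is weak: ",
    ( isweak( $feature->{seq} ) ? "yes" : "no" ), "\n";

undef $seq;    # drop the only strong reference to the sequence
print "sequence still reachable from feature: ",
    ( defined $feature->{seq} ? "yes" : "no" ), "\n";
```

One consequence worth keeping in mind: once the back-reference is weak,
the feature alone no longer keeps the sequence alive, so code holding only
the feature may find the sequence reference undefined.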

>
>> Having said all that, note that if all what you want to do is
>> defining computations on Bio::Seq objects, as opposed to storing
>> values for additional attributes, the best design approach is not to
>> extend the class but to create a class with those computations as
>> static methods (which would accept the seq object on which to compute
>> as an argument; e.g., print $seqComputations->message_digest($seq)).
>
> I could but there are some functionality that I'd by design would like to
> have available on every sequence in the system. This way I would end up
> coding the functionality for getting the message_digest every place that
> I needed to get the value (which would be quite often in this
> application), whereas it by design belongs into the Bio::Seq-stuff.

I don't follow why this would make any difference (it would be
$seq->message_digest() compared to $seqCompute->message_digest($seq)),
unless what you are saying is that you would like to cache the result of
the computation.
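
To make the comparison concrete, here is a minimal sketch of the
helper-class approach. The names My::SeqComputations and My::MockSeq are
hypothetical, the mock object only stands in for a sequence object with a
seq() method so the example runs without BioPerl, and upper-casing before
hashing is an assumption rather than anything prescribed in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper class: the computations live here, not in Bio::Seq.
package My::SeqComputations;
use Digest::MD5 qw(md5_hex);

sub new { my ($class) = @_; return bless {}, $class }

# Accepts any object with a seq() method returning the residue string.
# Upper-casing first is an assumption, so case does not change the digest.
sub message_digest {
    my ( $self, $seq ) = @_;
    return md5_hex( uc $seq->seq );
}

# Minimal stand-in for a sequence object (not a real Bio::Seq).
package My::MockSeq;
sub new { my ( $class, %args ) = @_; return bless {%args}, $class }
sub seq { return $_[0]{seq} }

package main;

my $seq_compute = My::SeqComputations->new;
my $seq         = My::MockSeq->new( seq => 'atgc' );
print $seq_compute->message_digest($seq), "\n";
```

The computation is written once either way; only the call site differs.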

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Wed Oct 25 06:36:27 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 25 Oct 2006 11:36:27 +0100
Subject: [Bioperl-l] Lagan environment variable
Message-ID: <453F3E2B.2040309@sendu.me.uk>

This is to let you know that I'm changing the environment variable that
Bio::Tools::Run::Alignment::Lagan uses to locate the lagan executables
from LAGANDIR to LAGAN_DIR, since the latter is the default variable that
the lagan installation and scripts themselves look for.

I hope this isn't too much of a burden, but it seems like the sensible 
approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.


Thank you,
Sendu.


From n.haigh at sheffield.ac.uk  Wed Oct 25 09:07:47 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 25 Oct 2006 13:07:47 +0000
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F3E2B.2040309@sendu.me.uk>
References: <453F3E2B.2040309@sendu.me.uk>
Message-ID: <453F61A3.4090904@sheffield.ac.uk>

Sendu Bala wrote:
> Notification to say I'm changing the environmental variable that 
> Bio::Tools::Run::Alignment::Lagan expects to define the location of the 
> lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the 
> default variable that the lagan installation and scripts themselves look 
> for.
>
> I hope this isn't too much of a burden, but it seems like the sensible 
> approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.
>
>
> Thank you,
> Sendu.
Wouldn't it make more sense to change the test? That is what I've just
done for t/Genscan.t

It seemed to fit in with the ENV variable syntax that other modules in
Bioperl-run used.

Nath

-- 
> A: Yes.
>> Q: Are you sure?
>>     
>>> A: Because it reverses the logical flow of conversation.
>>>       
>>>> Q: Why is top posting frowned upon?
>>>>         


From bix at sendu.me.uk  Wed Oct 25 08:12:00 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 25 Oct 2006 13:12:00 +0100
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F61A3.4090904@sheffield.ac.uk>
References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk>
Message-ID: <453F5490.7060808@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Notification to say I'm changing the environmental variable that 
>> Bio::Tools::Run::Alignment::Lagan expects to define the location of the 
>> lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the 
>> default variable that the lagan installation and scripts themselves look 
>> for.
>>
>> I hope this isn't too much of a burden, but it seems like the sensible 
>> approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.
>
> Wouldn't it make more sense to change the test? That is what I've just
> done for t/Genscan.t

For Genscan.t, the test script looked at the wrong environment variable.

Here I'm talking about lagan itself (the thing you get from 
http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with 
Bioperl) needing the environment variable LAGAN_DIR to be set in order 
to work.

Since you need to set LAGAN_DIR to make lagan work, it makes sense that 
the Bioperl front-end to lagan also use the same variable.
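
For illustration only, a wrapper could check the variable along these
lines. The lagan_dir() helper and the example path are hypothetical; this
is not the actual Bio::Tools::Run::Alignment::Lagan code.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lagan directory from LAGAN_DIR, or die
# with a hint if the variable is unset or empty.
sub lagan_dir {
    my $dir = $ENV{LAGAN_DIR};
    die "LAGAN_DIR is not set; point it at your lagan installation\n"
        unless defined $dir && length $dir;
    return $dir;
}

# Example path only; use wherever lagan is installed on your system.
$ENV{LAGAN_DIR} = '/usr/local/lagan';
print "lagan executables expected in: ", lagan_dir(), "\n";
```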


From n.haigh at sheffield.ac.uk  Wed Oct 25 09:16:16 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 25 Oct 2006 13:16:16 +0000
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F5490.7060808@sendu.me.uk>
References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk>
	<453F5490.7060808@sendu.me.uk>
Message-ID: <453F63A0.7040609@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Notification to say I'm changing the environmental variable that
>>> Bio::Tools::Run::Alignment::Lagan expects to define the location of
>>> the lagan executables from LAGANDIR to LAGAN_DIR, since the latter
>>> is the default variable that the lagan installation and scripts
>>> themselves look for.
>>>
>>> I hope this isn't too much of a burden, but it seems like the
>>> sensible approach to getting Bio::Tools::Run::Alignment::Lagan to
>>> actually work.
>>
>> Wouldn't it make more sense to change the test? That is what I've just
>> done for t/Genscan.t
>
> For Genscan.t, the test script looked at the wrong environment variable.
>
> Here I'm talking about lagan itself (the thing you get from
> http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with
> Bioperl) needing the environment variable LAGAN_DIR to be set in order
> to work.
>
> Since you need to set LAGAN_DIR to make lagan work, it makes sense
> that the Bioperl front-end to lagan also use the same variable.
>
Ah, OK! :-[  That'll teach me to speak up about something I know nothing
about! :-)

FYI, I've been busy this morning installing as much Bioperl-run external
software as I could (those that have tests). Will be posting results shortly.

Nath


From massimo.ubaldi at gmail.com  Wed Oct 25 10:28:52 2006
From: massimo.ubaldi at gmail.com (Massimo Ubaldi)
Date: Wed, 25 Oct 2006 16:28:52 +0200
Subject: [Bioperl-l] blastxml format
Message-ID: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>

Hi,
I'm using the script below to parse blastn output for multiple query
sequences. I got the output from the BLAST web interface, asking for
XML-formatted output.
Everything works fine except that I cannot print the name of each input
sequence (see below).
That is, using $result->query_description (see the script below) I get
just the name of the first sequence. In fact this is defined by the
<BlastOutput_query-def> tag.
What I really want is to extract the name that is defined by the
<Iteration_query-def> tag.
I have dug through the bioperl mailing list and other sources but did not
find anything to solve this.
Can somebody help me?
Thanks a lot,
Massimo


This is an example of the output I get:

MRDNA_probe
46.1    PREDICTED: Danio rerio similar to mineralocorticoid receptor form B (LOC562171), mRNA    68354945    XM_685568
81.8    Danio rerio VDR-B mRNA, partial cds    68132043    DQ017633
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA    68420187    XM_684078

This is what I'd like to get:

MRDNA_probe
46.1    PREDICTED: Danio rerio similar to mineralocorticoid receptor form B (LOC562171), mRNA    68354945    XM_685568
VDRacterm_probe
81.8    Danio rerio VDR-B mRNA, partial cds    68132043    DQ017633
ARalpcterm_probe
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA    68420187    XM_684078

This is the script
#!/usr/bin/perl
use strict;
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast',
                            -file   => 'Blastn_danio.bls');
open OUTFILE, ">parsed_blastn_danio.txt" or die "Could not open file, stopped";
my $result = $in->next_result;
print OUTFILE $result->algorithm, "\n";
print OUTFILE $result->database_name, "\n";

print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers",
"\t", "GenBank Accession", "\n";

while($result = $in->next_result ) {
    print OUTFILE $result->query_description, "\n";
      while( my $hit = $result->next_hit ) {
           while( my $hsp = $hit->next_hsp ) {

                my $acc=$hit->name;
                my $description= $hit->description;

                $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/;

                print OUTFILE

                  $hit->raw_score, "\t", # Score
                  $hit->description, "\t", # Description

                $1, "\t", $2, "\n";
         }
      }
}


From cjfields at uiuc.edu  Wed Oct 25 11:04:14 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Oct 2006 10:04:14 -0500
Subject: [Bioperl-l] blastxml format
In-Reply-To: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>
Message-ID: <000301c6f846$d6227760$15327e82@pyrimidine>

Iterations (which are related to PSIBLAST) aren't currently handled in
blastxml, which is why the tag isn't being parsed.  I'll give it a look but
I don't think it will be properly fixed anytime soon, since we're gearing up
for a developer release and are sorting out various bugs in relation to
that.

In the meantime, you could always try changing the relevant tag in the
%MAPPING hash in your local copy of Bio::SearchIO::blastxml from
'BlastOutput_query-def' to 'Iteration_query-def', which may do the trick for
you.  I'm a bit reluctant to change this in CVS as it would be better to add
this in when iterations are handled properly by blastxml, and I'm not sure
all BLAST XML varieties have the <Iteration_query-def> tag.

If you want you can add this to the bioperl bugzilla as an enhancement
request to remind us:

http://bugzilla.open-bio.org/

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 



From massimo.ubaldi at gmail.com  Wed Oct 25 11:20:49 2006
From: massimo.ubaldi at gmail.com (Massimo Ubaldi)
Date: Wed, 25 Oct 2006 17:20:49 +0200
Subject: [Bioperl-l] blastxml format
In-Reply-To: <000301c6f846$d6227760$15327e82@pyrimidine>
References: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>
	<000301c6f846$d6227760$15327e82@pyrimidine>
Message-ID: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com>

Thanks for the reply. I've already tried this but I got exactly the same
results as before.
What else can I try?
Massimo



From cjfields at uiuc.edu  Wed Oct 25 12:56:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Oct 2006 11:56:46 -0500
Subject: [Bioperl-l] blastxml format
In-Reply-To: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com>
Message-ID: <000001c6f856$8ee44bc0$15327e82@pyrimidine>


> Thanks for the reply. I've already tried this but I got exactly the same
> results as before.
> What else can I try?
> Massimo

If you don't mind me asking, what version of perl and Bioperl are you using,
and what version of BLAST is used?  

I want to point out there are a number of problems with your script, now I
have had a chance to look at it.  

1) You have the SearchIO format set to 'blast'.  It should be 'blastxml' if
you are parsing XML format.  

2) Every call to next_result() advances to the next BLAST report, so the
call before your while loop consumes the first report and the loop starts
at the second. In effect, you're doing something like this:

  my $result = $in->next_result();
   ....# do something here (in first BLAST report)
 
  while ($result = $in->next_result()) { # change to second BLAST report
      # more stuff here (in second BLAST report, if there is one)
  }

I don't know if it's intentional though, but it's something to point out.

3) You also use raw_score(), which doesn't return a value for me (this may
be related to the bioperl version, which is why I asked above).  If you use
$hit->bits() or $hit->significance() you can get the bit score or the hit
E-value, respectively.

4) Also, I didn't see a difference with the two XML tags
<BlastOutput_query-def> and <Iteration_query-def> using BLAST 2.2.15 output
(WebBLAST at NCBI), which makes sense since they should originate from the
same query sequence anyway.  This could be related to the BLAST version.
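
Point 2 can be seen in isolation with a small sketch. The closure below is
a hypothetical stand-in for a SearchIO stream (it is not Bio::SearchIO),
and the report names are taken from the example output earlier in the
thread.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a SearchIO stream: each call hands out the
# next report name and returns undef when none are left.
my @reports     = qw(MRDNA_probe VDRacterm_probe ARalpcterm_probe);
my $next_result = sub { return shift @reports };

my $first = $next_result->();    # consumes the first report
my @seen_in_loop;
while ( my $result = $next_result->() ) {
    push @seen_in_loop, $result;    # the loop starts at the second report
}

print "before loop: $first\n";
print "inside loop: @seen_in_loop\n";
```

Anything done only inside the loop never sees the first report, which is
why moving all per-report work into a single while loop is the simpler
structure.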

Here's my version of your script, using WinXP and bioperl-live (CVS):

use Bio::SearchIO;
my $file = shift @ARGV;

my $in = new Bio::SearchIO(-format => 'blastxml',
                            -file   => $file);

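# Use 'or' rather than '||': '||' binds to the filename string, so die
# would never run when open() fails.
open OUTFILE, ">parsed_blastn_danio.txt"
    or die "Could not open file: $!";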

while(my $result = $in->next_result ) {
    print OUTFILE $result->algorithm, "\n";
    print OUTFILE $result->database_name, "\n";
    print OUTFILE "Score", "\t",
                  "Description", "\t",
                  "NCBI gi identifiers", "\t",
                  "GenBank Accession", "\n";
    print OUTFILE $result->query_description, "\n";
    while( my $hit = $result->next_hit ) {
        while( my $hsp = $hit->next_hsp ) {
            my $acc=$hit->name;
            my $description= $hit->description;
            if ($acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/) {
                print OUTFILE $hit->bits, "\t", # Score
                  $hit->description, "\t", # Description
                  $1, "\t", $2, "\n";
            }
        }
    }
}

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

...


From n.haigh at sheffield.ac.uk  Thu Oct 26 04:47:27 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 09:47:27 +0100
Subject: [Bioperl-l] More extensive Bioperl-run 1.5.2RC2 tests
Message-ID: <4540761F.6010904@sheffield.ac.uk>

Oops, I posted this to the Biojava list the other day by mistake!

I have recently installed some more software for which there are
bioperl-run tests, and run the test suite with every version of each
program I could find. I've added the results to
http://www.bioperl.org/wiki/Release_1.5.2#bioperl-run. Where any version I
tested had failures, I've noted them together with the versions that were
OK (if any).

There may be another 6 or so programs I'm trying to get hold of to run
further tests - I'll update when I get them.
Nath


From n.haigh at sheffield.ac.uk  Thu Oct 26 05:14:07 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 10:14:07 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
Message-ID: <45407C5F.40104@sheffield.ac.uk>

I'm thinking that it's not wise to test for things like
overall_percentage_identity etc in alignments that are generated by
external software like T-Coffee, Clustalw etc. Changes to software
algorithms/efficiency, bug fixes etc may well alter the quality of the
alignment produced in different versions and thus affect the value
returned by such methods. Therefore, I think these methods should only
be tested from alignments loaded directly from t/data.

Nath


From bix at sendu.me.uk  Thu Oct 26 05:48:37 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 26 Oct 2006 10:48:37 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45407C5F.40104@sheffield.ac.uk>
References: <45407C5F.40104@sheffield.ac.uk>
Message-ID: <45408475.30903@sendu.me.uk>

Nathan Haigh wrote:
> I'm thinking that it's not wise to test for things like
> overall_percentage_identity etc in alignments that are generated by
> external software like T-Coffee, Clustalw etc. Changes to software
> algorithms/efficiency, bug fixes etc may well alter the quality of the
> alignment produced in different versions and thus affect the value
> returned by such methods. Therefore, I think these methods should only
> be tested from alignments loaded directly from t/data.

Did you discover some specific problem cases?


From n.haigh at sheffield.ac.uk  Thu Oct 26 06:04:54 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 11:04:54 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408475.30903@sendu.me.uk>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
Message-ID: <45408846.1050001@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> I'm thinking that it's not wise to test for things like
>> overall_percentage_identity etc in alignments that are generated by
>> external software like T-Coffee, Clustalw etc. Changes to software
>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>> alignment produced in different versions and thus affect the value
>> returned by such methods. Therefore, I think these methods should only
>> be tested from alignments loaded directly from t/data.
>
> Did you discover some specific problem cases?
My messages seem to be taking a while to come through, but yes. It may
be due to the software changing default parameters, but it makes testing
the output for specific details pretty difficult and inconsistent. For
example, with T-Coffee the following command from t/TCoffee.t results in
a slightly different alignment depending on the version:
$aln = $factory->run('-type' => 'profile',
                     '-profile' => $aln1,
                     '-seq'  => Bio::Root::IO->catfile("t","data","cysprot1b.fa"));

Of particular note are the gaps on the last line of the sequences. In
4.45 there are two gap runs in CATH_RAT/1-333 ('gk-nm---cg'), whereas in
versions before 4.45 there is a single run ('gkn----mcg').

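One version-independent check, sketched here as an assumption rather than
as what t/TCoffee.t actually does, is to compare the residues with gaps
removed instead of the exact gap placement. The ungapped() helper is
hypothetical; the two fragments are the ones quoted above.

```perl
use strict;
use warnings;

# Remove gap characters from an aligned fragment.
sub ungapped {
    my ($aligned) = @_;
    ( my $residues = $aligned ) =~ tr/-//d;
    return $residues;
}

my $new = 'gk-nm---cg';    # fragment from T-Coffee 4.45
my $old = 'gkn----mcg';    # fragment from earlier versions

print "same residues: ",
    ( ungapped($new) eq ungapped($old) ? "yes" : "no" ), "\n";
print "same columns:  ", ( $new eq $old ? "yes" : "no" ), "\n";
```

A test along these lines still confirms that the wrapper ran and returned
the right sequences, without depending on where a given version of the
aligner chose to place the gaps.
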
T-Coffee v4.45 returns the following alignment:

>CATH_RAT/1-333
------mwtalpllcagawllsagat----------aeltvnaiek------------fh
ftswmkqhqktyss-reyshrlqvfannwrkiqahn----qrnhtfkmglnqfsdmsfae
ikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqgacgscwtfs
ttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqafeyilynk
gimgedsypyigkngqckfnpekavafvknvv-nitlndeaamveavalynpvsfafevt
-edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivknswgsnwgnn
gyfliergk-nm---cglaacasypipqv
>CATL_HUMAN/1-333
--------------------------------mnptlilaafclgiasatltfdhsleaq
wtkwkamhnrlygmnee-gwrravweknmkmielhnqeyregkhsftmamnafgdmtsee
frqvmngfqnrkpr----kgkvfqeplfyeaprsvdwrekg-yvtpvknqgqcgscwafs
atgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdyafqyvqdng
gldseesypyeateesckynpkysvandtgfv-dip-kqekalmkavatvgpisvaidag
hesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvknswgeewgmg
gyvkmakdrrnh---cgiasaasyptv--
>CATL_RAT/1-334
--------------------------------mtpllllavlclgtalatpkfdqtfnaq
whqwksthrrlygtnee-ewrravweknmrmiqlhngeysngkhgftmemnafgdmtnee
frqivngyrhqkhk----kgrlfqeplmlqipktvdwrekg-cvtpvknqgqcgscwafs
asgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfafqyikeng
gldseesypyeakdgsckyraeyavandtgfv-dip-qqekalmkavatvgpisvamdas
hpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvknswgkewgmd
gyikiakdrnnh---cglataasypivn-
>PAPA_CARPA/1-345
mamipsiskllfvaiclfvymglsfg-------------dfsivgysqndltsterliql
feswmlkhnkiyknidekiyrfeifkdnlkyidetn----kknnsywlglnvfadmsnde
fkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgscgscwafs
avvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsalqlvaqy-
gihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysian-qpvsvvleaa
gkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yiliknswgtgwgen
gyirikrgtgnsygvcglytssfypvkn-
>ALEU_HORVU/1-362
maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtrhalr
farfavrygksyesaaevrrrfrifsesleevrstn----rkglpyrlginrfsdmswee
fqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqahcgscwtfs
ttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqafeyikyng
gidteesypykgvngvchykaenaavqvldsv-nitlnaedelknavglvrpvsvafqvi
-dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywliknswgadwgdn
gyfkmemgk-nm---caiatcasypvvaa
>CATH_HUMAN/1-335
------mwatlpllcagawllg--------vpvcgaaelsvnslek------------fh
fkswmskhrktys-teeyhhrlqtfasnwrkinahn----ngnhtfkmalnqfsdmsfae
ikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqgacgscwtfs
ttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqafeyilynk
gimgedtypyqgkdgyckfqpgkaigfvkdva-nitiydeeamveavalynpvsfafevt
-qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivknswgpqwgmn
gyfliergk-nm---cglaacasypiplv
>CYS1_DICDI/1-343
-----mkvillfvlavftvfvs---------------srgippeeq------------sq
flefqdkfnkkys-heeylerfeifksnlgkieelnliainhkadtkfgvnkfadlssde
fknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgqcgscwsfs
ttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpnaynyiikng
giqtessypytaetgtqcnfnsanigakisnf-tmipknetvmagyivstgplaiaadav
-e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivknswgadwgeq
gyiylrrgk-nt---cgvsnfvstsii--

While T-Coffee <4.45 returned:
>CATH_RAT/1-333
----------mwtalpllcagawllsagat----------aeltvnaiek----------
--fhftswmkqhqktyss-reyshrlqvfannwrkiqahn----q----rnhtfkmglnq
fsdmsfaeikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqga
cgscwtfsttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqa
feyilynkgimgedsypyigkngqckfnpekavafvknvvn-itlndeaamveavalynp
vsfafevt-edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivkns
wgsnwgnngyfliergkn----mcglaacasypipqv
>PAPA_CARPA/1-345
mamipsiskllfvaiclfvymglsfgdfsivgysqndltsterliqlfeswml-------
-------------khnkiyknidekiyrf-----eifkdnlkyidetnkknnsywlglnv
fadmsndefkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgs
cgscwafsavvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsa
lq-lvaqygihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysia-nqp
vsvvleaagkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yilikns
wgtgwgengyirikrgtgnsygvcglytssfypvkn-
>CATL_HUMAN/1-333
-----------------------------------------mnptlilaafclgiasatl
tfdhsleaqwtkwkamhnrlygmneegwrravweknmkmielhnqeyregkhsftmamna
fgdmtseefrqvmngfqnrkprkgkvfqeplf----yeaprsvdwrekg-yvtpvknqgq
cgscwafsatgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdya
fqyvqdnggldseesypyeateesckynpkysvandtgfvd--ipkqekalmkavatvgp
isvaidaghesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvkns
wgeewgmggyvkmakdrrnh---cgiasaasyptv--
>CATL_RAT/1-334
-----------------------------------------mtpllllavlclgtalatp
kfdqtfnaqwhqwksthrrlygtneeewrravweknmrmiqlhngeysngkhgftmemna
fgdmtneefrqivngyrhqkhkkgrlfqeplm----lqipktvdwrekg-cvtpvknqgq
cgscwafsasgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfa
fqyikenggldseesypyeakdgsckyraeyavandtgfvd--ipqqekalmkavatvgp
isvamdashpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvkns
wgkewgmdgyikiakdrnnh---cglataasypivn-
>ALEU_HORVU/1-362
----maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtr
halrfarfavrygksyesaaevrrrfrifsesleevrstn----r----kglpyrlginr
fsdmsweefqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqah
cgscwtfsttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqa
feyikynggidteesypykgvngvchykaenaavqvldsvn-itlnaedelknavglvrp
vsvafqvi-dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywlikns
wgadwgdngyfkmemgkn----mcaiatcasypvvaa
>CATH_HUMAN/1-335
----------mwatlpllcagawllg--------vpvcgaaelsvnslek----------
--fhfkswmskhrktys-teeyhhrlqtfasnwrkinahn----n----gnhtfkmalnq
fsdmsfaeikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqga
cgscwtfsttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqa
feyilynkgimgedtypyqgkdgyckfqpgkaigfvkdvan-itiydeeamveavalynp
vsfafevt-qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivkns
wgpqwgmngyfliergkn----mcglaacasypiplv
>CYS1_DICDI/1-343
---------mkvillfvlavftvfvs---------------srgippeeq----------
--sqflefqdkfnkkys-heeylerfeifksnlgkieelnliain----hkadtkfgvnk
fadlssdefknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgq
cgscwsfsttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpna
ynyiiknggiqtessypytaetgtqcnfnsanigakisnft-mipknetvmagyivstgp
laiaadav-e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivkns
wgadwgeqgyiylrrgkn----tcgvsnfvstsii--


From sanges at biogem.it  Thu Oct 26 06:26:36 2006
From: sanges at biogem.it (Remo Sanges)
Date: Thu, 26 Oct 2006 11:26:36 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408846.1050001@sheffield.ac.uk>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
	<45408846.1050001@sheffield.ac.uk>
Message-ID: <45408D5C.1000305@biogem.it>

Nathan Haigh wrote:
> Sendu Bala wrote:
>   
>> Nathan Haigh wrote:
>>     
>>> I'm thinking that it's not wise to test for things like
>>> overall_percentage_identity etc in alignments that are generated by
>>> external software like T-Coffee, Clustalw etc. Changes to software
>>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>>> alignment produced in different versions and thus affect the value
>>> returned by such methods. Therefore, I think these methods should only
>>> be tested from alignments loaded directly from t/data.
>>>       
>> Did you discover some specific problem cases?
>>     
> My messages seem to be taking a while to come through, but, yes. It may
> be due to the software changing default parameters, but it makes testing
> the output for specific details pretty difficult and inconsistent. For
> example, running T-Coffee, the following command from t/TCoffee.t
> results in slightly different alignment:
> $aln = $factory->run('-type' => 'profile',
>                      '-profile' => $aln1,
>                      '-seq'  =>
> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>
> Of particular note, is the gaps on the last line of the sequences. In
> 4.45, there are two gaps in CATH_RAT/1-333 ('gk-nm---cg') whereas in
> <v4.45 this is ('gkn----mcg').
>   
I'm not a T-Coffee user, but you usually come across
these problems when you use different scoring parameters
when aligning sequences.

Could it be that they have simply changed the
default parameters for gap penalties and that kind of
thing? Is it possible to set them?

If so you can just run the test by defining
the scores in the param hash without using the default.

HTH

Remo


From n.haigh at sheffield.ac.uk  Thu Oct 26 06:33:55 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 11:33:55 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408D5C.1000305@biogem.it>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
	<45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it>
Message-ID: <45408F13.9020209@sheffield.ac.uk>

Remo Sanges wrote:
> Nathan Haigh wrote:
>> Sendu Bala wrote:
>>  
>>> Nathan Haigh wrote:
>>>    
>>>> I'm thinking that it's not wise to test for things like
>>>> overall_percentage_identity etc in alignments that are generated by
>>>> external software like T-Coffee, Clustalw etc. Changes to software
>>>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>>>> alignment produced in different versions and thus affect the value
>>>> returned by such methods. Therefore, I think these methods should only
>>>> be tested from alignments loaded directly from t/data.
>>>>       
>>> Did you discover some specific problem cases?
>>>     
>> My messages seem to be taking a while to come through, but, yes. It may
>> be due to the software changing default parameters, but it makes testing
>> the output for specific details pretty difficult and inconsistent. For
>> example, running T-Coffee, the following command from t/TCoffee.t
>> results in slightly different alignment:
>> $aln = $factory->run('-type' => 'profile',
>>                      '-profile' => $aln1,
>>                      '-seq'  =>
>> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>>
>> Of particular note, is the gaps on the last line of the sequences. In
>> 4.45, there are two gaps in CATH_RAT/1-333 ('gk-nm---cg') whereas in
>> <v4.45 this is ('gkn----mcg').
>>   
> I'm not a T-Coffee user, but you usually come across
> these problems when you use different scoring parameters
> when aligning sequences.
>
> Could it be that they have simply changed the
> default parameters for gap penalties and that kind of
> thing? Is it possible to set them?
>
> If so you can just run the test by defining
> the scores in the param hash without using the default.
>
> HTH
>
> Remo
That is true, but it depends on whether the wrapper is complete
enough to be able to set all the parameters provided by the software.

Nath


From n.haigh at sheffield.ac.uk  Thu Oct 26 12:13:03 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 17:13:03 +0100
Subject: [Bioperl-l] Bio::Restriction::Enzyme
Message-ID: <4540DE8F.7070501@sheffield.ac.uk>

I'm in the middle of writing some code that uses
Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
Bioperl from HEAD.

I seem to find that $enzyme->is_palindromic always seems to return true.
Can anyone verify this? If needs be, I can send some code.

Thanks
Nathan


From bosborne11 at verizon.net  Thu Oct 26 12:37:06 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 26 Oct 2006 12:37:06 -0400
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk>
Message-ID: <C1665C72.B068%bosborne11@verizon.net>

Nathan,

Perhaps because most restriction sites are palindromes. Anyway, I added
tests for palindromic() and is_palindromic() where the site is not a
palindrome, and these tests pass (t/RestrictionAnalysis.t).

Brian O.


On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:

> I'm in the middle of writing some code that uses
> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
> Bioperl from HEAD.
> 
> I seem to find that $enzyme->is_palindromic always seems to return true.
> Can anyone verify this? If needs be, I can send some code.
> 
> Thanks
> Nathan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From n.haigh at sheffield.ac.uk  Thu Oct 26 12:49:48 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 17:49:48 +0100
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <C1665C72.B068%bosborne11@verizon.net>
References: <C1665C72.B068%bosborne11@verizon.net>
Message-ID: <4540E72C.5020800@sheffield.ac.uk>

Brian Osborne wrote:
> Nathan,
>
> Perhaps because most restriction sites are palindromes. Anyway, I added
> tests for palindromic() and is_palindromic() where the site is not a
> palindrome, and these tests pass (t/RestrictionAnalysis.t).
>
> Brian O.
>
>
> On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:
>
>   
>> I'm in the middle of writing some code that uses
>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>> Bioperl from HEAD.
>>
>> I seem to find that $enzyme->is_palindromic always seems to return true.
>> Can anyone verify this? If needs be, I can send some code.
>>
>> Thanks
>> Nathan
>>     
>
>
>   
Ok, thanks - nice to know :-)


From cjfields at uiuc.edu  Thu Oct 26 12:58:34 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 26 Oct 2006 11:58:34 -0500
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk>
Message-ID: <001301c6f91f$f9611770$15327e82@pyrimidine>

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Nathan Haigh
> Sent: Thursday, October 26, 2006 11:13 AM
> To: Bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Bio::Restriction::Enzyme
> 
> I'm in the middle of writing some code that uses
> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
> Bioperl from HEAD.
> 
> I seem to find that $enzyme->is_palindromic always seems to return true.
> Can anyone verify this? If needs be, I can send some code.
> 
> Thanks
> Nathan

You should file a bug report if you have found a test case where this method
isn't working as it should, especially if Brian's tests pass and you're
still getting the wrong results.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From jason at bioperl.org  Thu Oct 26 12:57:32 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 26 Oct 2006 09:57:32 -0700
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408F13.9020209@sheffield.ac.uk>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
	<45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it>
	<45408F13.9020209@sheffield.ac.uk>
Message-ID: <C2AC4DE8-7E99-4744-9FA9-B11C51788BDE@bioperl.org>

Nathan -

I agree - the values tend to change with different versions of the  
applications unfortunately.  It would make sense to just test that  
you get out sequences that are in valid alignment format and perhaps  
have as many ending sequences as you started with.   The more  
restrictive tests probably aren't reliable with mixing and matching  
versions.
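
A rough sketch of what such a structural check could look like. It works
on plain hashes (id => sequence) rather than real Bio::SimpleAlign
objects, and the helper name check_alignment and the toy sequences are
made up for illustration only:

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): check only properties that
# should hold for any aligner version. Takes two hashrefs (id => sequence)
# and returns a list of problem descriptions, empty if all is well.
sub check_alignment {
    my ($input, $aligned) = @_;
    my @problems;

    # As many sequences out as went in
    push @problems, 'sequence count differs'
        unless scalar(keys %$aligned) == scalar(keys %$input);

    # All aligned rows have the same length
    my %lengths;
    $lengths{ length $_ } = 1 for values %$aligned;
    push @problems, 'rows have unequal lengths' if scalar(keys %lengths) > 1;

    # Stripping gap characters gives back the original residues
    for my $id (sort keys %$input) {
        my $row = $aligned->{$id};
        if (!defined $row) {
            push @problems, "missing sequence $id";
            next;
        }
        (my $ungapped = $row) =~ tr/-.//d;
        push @problems, "residues changed in $id"
            unless lc($ungapped) eq lc($input->{$id});
    }
    return @problems;
}

# Toy data: two "versions" place the gap differently, both are acceptable
my %input     = (seq1 => 'gknmcg', seq2 => 'gkmcg');
my %version_a = (seq1 => 'gknmcg', seq2 => 'gk-mcg');
my %version_b = (seq1 => 'gknmcg', seq2 => 'gkm-cg');
my %broken    = (seq1 => 'gknmcg', seq2 => 'gk-mc');

for my $case ([a => \%version_a], [b => \%version_b], [broken => \%broken]) {
    my ($name, $aln) = @$case;
    my @problems = check_alignment(\%input, $aln);
    print "$name: ", (@problems ? join('; ', @problems) : 'ok'), "\n";
}
```

The point is that both gap placements pass, while a truncated or
otherwise mangled row is still caught.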

One thing we do for PAML is condition tests on the version used - but  
of course when a new version comes out we have to add more stuff to  
the tests (or just have some code that skips those tests).
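
Roughly what I mean by conditioning on the version, as a sketch with
Test::More. The version string, the table of expected values and the
observed value are all made-up stand-ins here; a real test would ask the
installed program for its version and compute the value from its output:

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical: in a real test this would come from the installed program
my $version = '4.45';

# Expected values recorded per program version (made-up numbers)
my %expected_identity = (
    '3.27' => 41.2,
    '4.45' => 42.5,
);

# Stand-in for the value computed from the program's output
my $observed = 42.5;

ok(defined $version, 'program version detected');
cmp_ok($observed, '>', 0, 'identity is positive');    # version-independent

SKIP: {
    # Versions we have no recorded values for are skipped, not failed
    skip "no expected values recorded for version $version", 1
        unless exists $expected_identity{$version};
    is($observed, $expected_identity{$version},
       "identity matches value recorded for $version");
}
```

The downside is the one mentioned above: the table has to grow with every
new release, otherwise the strict test is silently skipped.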

-jason
On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote:

> Remo Sanges wrote:
>> Nathan Haigh wrote:
>>> Sendu Bala wrote:
>>>
>>>> Nathan Haigh wrote:
>>>>
>>>>> I'm thinking that it's not wise to test for things like
>>>>> overall_percentage_identity etc in alignments that are  
>>>>> generated by
>>>>> external software like T-Coffee, Clustalw etc. Changes to software
>>>>> algorithms/efficiency, bug fixes etc may well alter the quality  
>>>>> of the
>>>>> alignment produced in different versions and thus affect the value
>>>>> returned by such methods. Therefore, I think these methods  
>>>>> should only
>>>>> be tested from alignments loaded directly from t/data.
>>>>>
>>>> Did you discover some specific problem cases?
>>>>
>>> My messages seem to be taking a while to come through, but, yes.  
>>> It may
>>> be due to the software changing default parameters, but it makes  
>>> testing
>>> the output for specific details pretty difficult and  
>>> inconsistent. For
>>> example, running T-Coffee, the following command from t/TCoffee.t
>>> results in slightly different alignment:
>>> $aln = $factory->run('-type' => 'profile',
>>>                      '-profile' => $aln1,
>>>                      '-seq'  =>
>>> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>>>
>>> Of particular note, is the gaps on the last line of the  
>>> sequences. In
>>> 4.45, there are two gaps in CATH_RAT/1-333 ('gk-nm---cg') whereas in
>>> <v4.45 this is ('gkn----mcg').
>>>
>> I'm not a T-Coffee user, but you usually come across
>> these problems when you use different scoring parameters
>> when aligning sequences.
>>
>> Could it be that they have simply changed the
>> default parameters for gap penalties and that kind of
>> thing? Is it possible to set them?
>>
>> If so you can just run the test by defining
>> the scores in the param hash without using the default.
>>
>> HTH
>>
>> Remo
> That is true, but it depends on whether the wrapper is complete
> enough to be able to set all the parameters provided by the software.
>
> Nath

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From cjfields at uiuc.edu  Thu Oct 26 18:01:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 26 Oct 2006 17:01:08 -0500
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <C2AC4DE8-7E99-4744-9FA9-B11C51788BDE@bioperl.org>
Message-ID: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>

I have been running into similar issues with EUtilities tests.  Since the
data on the server is constantly updated I have to try to future-proof the
tests so they don't constantly fail.

I have been using Test::More and like/unlike or cmp_ok to get around some of
those 'fuzzy data' issues.  If some methods consistently return a particular
type of value, such as an integer, you could use:

like($foo->get_value, qr{^\d+$}, 'value test'); #integer

or similar.
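
A slightly longer sketch along the same lines. The values in %result are
stand-ins for whatever the remote service or external program returns,
not real EUtilities calls or real data:

```perl
use strict;
use warnings;
use Test::More tests => 5;

# Stand-in values; in a real test these would come from the object under test
my %result = (
    count    => '1532',       # changes whenever the database is updated
    identity => 41.87,        # differs between aligner versions
    db       => 'protein',    # this one should be stable
);

like(   $result{count},    qr{^\d+$}, 'count is an integer');
unlike( $result{count},    qr{^0\d},  'count has no leading zero');
cmp_ok( $result{identity}, '>=', 0,   'identity is not negative');
cmp_ok( $result{identity}, '<=', 100, 'identity is at most 100');
is(     $result{db},       'protein', 'stable value can still be tested exactly');
```

So the volatile values are only checked for type and range, and exact
comparisons are kept for things that should not change.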

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> Nathan -
> 
> I agree - the values tend to change with different versions of the
> applications unfortunately.  It would make sense to just test that
> you get out sequences that are in valid alignment format and perhaps
> have as many ending sequences as you started with.   The more
> restrictive tests probably aren't reliable with mixing and matching
> versions.
> 
> One thing we do for PAML is condition tests on the version used - but
> of course when a new version comes out we have to add more stuff to
> the tests (or just have some code that skips those tests).
> 
> -jason
> On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote:
> 
> > Remo Sanges wrote:
> >> Nathan Haigh wrote:
> >>> Sendu Bala wrote:
> >>>
> >>>> Nathan Haigh wrote:
> >>>>
> >>>>> I'm thinking that it's not wise to test for things like
> >>>>> overall_percentage_identity etc in alignments that are
> >>>>> generated by
> >>>>> external software like T-Coffee, Clustalw etc. Changes to software
> >>>>> algorithms/efficiency, bug fixes etc may well alter the quality
> >>>>> of the
> >>>>> alignment produced in different versions and thus affect the value
> >>>>> returned by such methods. Therefore, I think these methods
> >>>>> should only
> >>>>> be tested from alignments loaded directly from t/data.
> >>>>>
> >>>> Did you discover some specific problem cases?
> >>>>
> >>> My messages seem to be taking a while to come through, but, yes.
> >>> It may
> >>> be due to the software changing default parameters, but it makes
> >>> testing
> >>> the output for specific details pretty difficult and
> >>> inconsistent. For
> >>> example, running T-Coffee, the following command from t/TCoffee.t
> >>> results in slightly different alignment:
> >>> $aln = $factory->run('-type' => 'profile',
> >>>                      '-profile' => $aln1,
> >>>                      '-seq'  =>
> >>> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
> >>>
> >>> Of particular note, is the gaps on the last line of the
> >>> sequences. In
> >>> 4.45, there are two gaps in CATH_RAT/1-333 ('gk-nm---cg') whereas in
> >>> <v4.45 this is ('gkn----mcg').
> >>>
> >> I'm not a T-Coffee user, but you usually come across
> >> these problems when you use different scoring parameters
> >> when aligning sequences.
> >>
> >> Could it be that they have simply changed the
> >> default parameters for gap penalties and that kind of
> >> thing? Is it possible to set them?
> >>
> >> If so you can just run the test by defining
> >> the scores in the param hash without using the default.
> >>
> >> HTH
> >>
> >> Remo
> > That is true, but it depends on whether the wrapper is complete
> > enough to be able to set all the parameters provided by the software.
> >
> > Nath
> 
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
> 
> 


From gbazykin at Princeton.EDU  Thu Oct 26 18:49:56 2006
From: gbazykin at Princeton.EDU (Georgii A Bazykin)
Date: Thu, 26 Oct 2006 18:49:56 -0400
Subject: [Bioperl-l] about PAML running within bioperl
In-Reply-To: <001901c6dbcf$9af4de50$0915020a@zchou>
References: <001901c6dbcf$9af4de50$0915020a@zchou>
Message-ID: <185431468.20061026184956@princeton.edu>

I just had the exact same problem, which (as in Caleb Davis's
case) was also solved by switching from PAML 3.15 to PAML 3.14.


------------------------------
Tuesday, September 19, 2006, 5:40:07 AM, you wrote:

> Hello, every one,

> I use the code in the PAML HOWTO (running PAML from within Bioperl) on
> my Linux OS, and I set ENV as described in the instructions. At the
> beginning, ClustalW seems to run smoothly. However, when the
> program reaches the call to the method "get_MLmatrix", something goes wrong.
> The following output was printed (what is the reason, and how can I solve this?):
> ........
> Sequences (2:3) Aligned. Score:  87
> Sequences (2:4) Aligned. Score:  88
> Sequences (2:5) Aligned. Score:  87
> Sequences (2:6) Aligned. Score:  87
> Sequences (2:7) Aligned. Score:  87
> Sequences (2:8) Aligned. Score:  87
> Sequences (3:4) Aligned. Score:  93
> Sequences (3:5) Aligned. Score:  93
> Sequences (3:6) Aligned. Score:  93
> Sequences (3:7) Aligned. Score:  92
> Sequences (3:8) Aligned. Score:  92
> Sequences (4:5) Aligned. Score:  99
> Sequences (4:6) Aligned. Score:  99
> Sequences (4:7) Aligned. Score:  98
> Sequences (4:8) Aligned. Score:  98
> Sequences (5:6) Aligned. Score:  100
> Sequences (5:7) Aligned. Score:  99
> Sequences (5:8) Aligned. Score:  99
> Sequences (6:7) Aligned. Score:  99
> Sequences (6:8) Aligned. Score:  99
> Sequences (7:8) Aligned. Score:  100
> Guide tree        file created:  
> [/home/zchou/TMPDIR/8QEqLivAKY/JU833u8OTP.dnd]
> Start of Multiple Alignment
> There are 7 groups
> Aligning...
> Group 1: Sequences:   2      Score:5875
> Group 2: Sequences:   2      Score:5877
> Group 3: Sequences:   4      Score:5864
> Group 4: Sequences:   5      Score:5537
> Group 5: Sequences:   6      Score:5727
> Group 6: Sequences:   7      Score:5608
> Group 7: Sequences:   8      Score:5607
> Alignment Score 43650
> GCG-Alignment file created     
> [/home/zchou/TMPDIR/8QEqLivAKY/CussPD56rZ]
> aligned aa sequences were: Bio::SimpleAlign=HASH(0x87b93f4)
> Can't call method "get_MLmatrix" on an undefined value at
> originalpaml.pl line 57, <GEN2> line 332.


> Zhuocheng Hou
> Department of Animal Genetics and Breeding
> China Agricultural University


From himanshu.ardawatia at bccs.uib.no  Thu Oct 26 21:54:36 2006
From: himanshu.ardawatia at bccs.uib.no (Himanshu Ardawatia)
Date: Fri, 27 Oct 2006 03:54:36 +0200
Subject: [Bioperl-l] Query on tree bootstrap values
Message-ID: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>

Hi,

2 questions :

1. I have a phylogenetic tree and I wish to set (or modify or query)
bootstrap values for all internal nodes. How do I do that using BioPerl ?

2. I tried the example script attached below on the example newick tree
with bootstrap values (also attached below), and it gives strange results
even for branch lengths. It shows the parent ID as 0.71, which is actually
the bootstrap value for the last ancestral node of human and chimp, and it
shows the child node ID as 'Human'! Am I missing something in the tree
formatting? Results are also attached below. Also, how do I extract,
modify, or add bootstrap values in this tree?

Thanks
Himanshu

EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
#################################
(
  ('Chimp'  : 0.052,
   'Human'  : 0.042) 0.71 : 0.007,
  'Gorilla'  : 0.060,
  ('Gibbon'  : 0.124,
   'Orangutan'  : 0.0971) 1 : 0.038
);
#################################

EXAMPLE SCRIPT:

#################################
#!/usr/bin/perl -w

use Bio::Seq;
# use Bio::TreeIO;
use Bio::Tree::TreeI;

# get a Tree::NodeI somehow
    # like from a TreeIO
    use Bio::TreeIO;
    # read in a clustalw NJ in phylip/newick format
    my $treeio = Bio::TreeIO->new(-format => 'newick',
                                  -file   => 'example_newick_tree.newick');

    # we'll assume it worked for demo purposes;
    # you might want to test that it was defined
    my $tree = $treeio->next_tree;

    my $rootnode = $tree->get_root_node;

    # process just the next generation
    foreach my $node ( $rootnode->each_Descendent() ) {
        print "branch len is ", $node->branch_length, "\n";
    }

    # process all the children
    my $example_leaf_node;
    foreach my $node ( $rootnode->get_Descendents() ) {
        if( $node->is_Leaf ) {
            print "node is a leaf ... ";
            # for example use below
            $example_leaf_node = $node unless defined $example_leaf_node;
        }
        print "branch len is ", $node->branch_length, "\n";
    }

    # The ancestor() method points to the parent of a node
    # A node can only have one parent

    my $parent = $example_leaf_node->ancestor;

    # parent won't likely have a description because it is an internal node
    # but child will because it is a leaf

    print "Parent id: ", $parent->id," child id: ",
          $example_leaf_node->id, "\n";

##########################################

RESULTS:
branch len is  0.007
branch len is  0.060
branch len is  0.038
node is a leaf ... branch len is  0.042
node is a leaf ... branch len is  0.052
branch len is  0.007
node is a leaf ... branch len is  0.060
node is a leaf ... branch len is  0.0971
node is a leaf ... branch len is  0.124
branch len is  0.038
Parent id: _0.71_ child id: ___'Human'__


From n.haigh at sheffield.ac.uk  Fri Oct 27 04:42:23 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 08:42:23 +0000
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <C1665C72.B068%bosborne11@verizon.net>
References: <C1665C72.B068%bosborne11@verizon.net>
Message-ID: <4541C66F.1020404@sheffield.ac.uk>

Hi Brian,

I wonder if I'm using is_prototype() correctly, as it never seems to
return true for any enzyme:

my $enz_coll = Bio::Restriction::EnzymeCollection->new();
my $prototype = 0;
foreach my $enz ($enz_coll->each_enzyme) {
    $prototype++ if $enz->is_prototype;
}
print "$prototype have unique recognition sites\n";

prints:
0 have unique recognition sites

Thanks
Nath

Brian Osborne wrote:
> Nathan,
>
> Perhaps because most restriction sites are palindromes. Anyway, I added
> tests for palindromic() and is_palindromic() where the site is not a
> palindrome, and these tests pass (t/RestrictionAnalysis.t).
>
> Brian O.
>
>
> On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:
>
>   
>> I'm in the middle of writing some code that uses
>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>> Bioperl from HEAD.
>>
>> I seem to find that $enzyme->is_palindromic always seems to return true.
>> Can anyone verify this? If needs be, I can send some code.
>>
>> Thanks
>> Nathan
>>     
>
>
>   


-- 
> A: Yes.
>> Q: Are you sure?
>>     
>>> A: Because it reverses the logical flow of conversation.
>>>       
>>>> Q: Why is top posting frowned upon?
>>>>         
Get Thunderbird <http://www.mozilla.org/products/thunderbird/>


From n.haigh at sheffield.ac.uk  Fri Oct 27 04:47:21 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 08:47:21 +0000
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <001301c6f91f$f9611770$15327e82@pyrimidine>
References: <001301c6f91f$f9611770$15327e82@pyrimidine>
Message-ID: <4541C799.4090507@sheffield.ac.uk>

Chris Fields wrote:
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of Nathan Haigh
>> Sent: Thursday, October 26, 2006 11:13 AM
>> To: Bioperl-l at lists.open-bio.org
>> Subject: [Bioperl-l] Bio::Restriction::Enzyme
>>
>> I'm in the middle of writing some code that uses
>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>> Bioperl from HEAD.
>>
>> I seem to find that $enzyme->is_palindromic always seems to return true.
>> Can anyone verify this? If needs be, I can send some code.
>>
>> Thanks
>> Nathan
>>     
>
> You should file a bug report if you have found a test case where this method
> isn't working as it should, especially if Brian's tests pass and you're
> still getting the wrong results.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>
>   

I was doing some filtering of the default set of enzymes and happened to
remove the 2 that are not palindromic before I used is_palindromic().
Thus, I didn't see any that were not palindromic - if that makes sense!
Since I know very little about restriction enzymes, I'll trust that
these are correct :-)  and that I'm getting the correct results.

Thanks
Nath


From n.haigh at sheffield.ac.uk  Fri Oct 27 05:04:40 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 09:04:40 +0000
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>
References: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>
Message-ID: <4541CBA8.10006@sheffield.ac.uk>

Chris Fields wrote:
> I have been running into similar issues with EUtilities tests.  Since the
> data on the server is constantly updated I have to try to future-proof the
> tests so they don't constantly fail.  
>
> I have been using Test::More and like/unlike or cmp_ok to get around some of
> those 'fuzzy data' issues.  If some methods consistently return a particular
> type of value, such as an integer, you could use:
>
> like($foo->get_value, qr{^\d+$}, 'value test'); #integer
>
> or similar.
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>   
>> Nathan -
>>
>> I agree - the values tend to change with different versions of the
>> applications unfortunately.  It would make sense to just test that
>> you get out sequences that are in valid alignment format and perhaps
>> have as many ending sequences as you started with.   The more
>> restrictive tests probably aren't reliable with mixing and matching
>> versions.
>>
>> One thing we do for PAML is condition tests on the version used - but
>> of course when a new version comes out we have to add more stuff to
>> the tests (or just have some code that skips those tests).
>>
>> -jason
>> On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote:
>>
>>     
I think it makes sense to test that data of the expected type was
returned by the external resource but not to test the specifics of what
was returned. If specifics are tested we are then in the realm of testing
whether we believe the data returned by the external resource or not. We
should assume that the domain experts for these resources know what they
are doing - in some cases this might not be true :-)  but I think we
should stick to testing that the objects created hold the expected type
of data.

I like what Chris had to say (above) but wonder whether such checks
would/should be done in the module itself - i.e. testing that a
stored value is an integer and warn/throw if not?
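
Something like the following is what I have in mind, as a minimal sketch.
My::Result and its count() method are hypothetical and not part of
BioPerl; in a real BioPerl module I would expect the check to call
$self->throw() rather than die:

```perl
use strict;
use warnings;

package My::Result;    # hypothetical class, for illustration only

sub new { return bless { count => undef }, shift }

# Combined getter/setter that refuses anything but a non-negative integer
sub count {
    my ($self, $value) = @_;
    if (@_ > 1) {
        die "count must be a non-negative integer, got '"
            . (defined $value ? $value : 'undef') . "'\n"
            unless defined $value && $value =~ /^\d+\z/;
        $self->{count} = $value;
    }
    return $self->{count};
}

package main;

my $result = My::Result->new;
$result->count(42);                              # accepted

my $ok = eval { $result->count('4x2'); 1 };      # rejected, $@ has the reason
print "rejected: $@" unless $ok;
```

With the check in the module, the test script only needs to confirm that
bad values are refused and good ones are stored.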

Nath


From bix at sendu.me.uk  Fri Oct 27 05:08:18 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 10:08:18 +0100
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>
Message-ID: <4541CC82.2040705@sendu.me.uk>

Himanshu Ardawatia wrote:
> Hi,
> 
> 2 questions :
> 
> 1. I have a phylogenetic tree and I wish to set (or modify or query)
> bootstrap values for all internal nodes. How do I do that using BioPerl ?

Does bootstrap() not do what you need?
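
For example, a rough sketch, assuming the newick parser stores internal
node labels in id() (which is what your output suggests). The tree string
is your example with the quotes dropped, and it is read from an in-memory
filehandle only to keep the sketch self-contained:

```perl
use strict;
use warnings;
use Bio::TreeIO;

# Internal labels 0.71 and 1 are meant as bootstrap support values
my $newick = '((Chimp:0.052,Human:0.042)0.71:0.007,Gorilla:0.060,'
           . '(Gibbon:0.124,Orangutan:0.0971)1:0.038);';

open my $fh, '<', \$newick or die "cannot read from string: $!";
my $tree = Bio::TreeIO->new(-format => 'newick', -fh => $fh)->next_tree
    or die "no tree could be parsed";

# Move numeric labels on internal nodes from id() to bootstrap()
for my $node ($tree->get_nodes) {
    next if $node->is_Leaf;
    my $label = $node->id;
    next unless defined $label && $label =~ /^\d+(?:\.\d+)?\z/;
    $node->bootstrap($label);    # set the support value
    $node->id('');               # clear the misleading id
}

# Query the values back; the root has no support value
for my $node (grep { !$_->is_Leaf } $tree->get_nodes) {
    my $support = $node->bootstrap;
    print "internal node support: ",
          (defined $support ? $support : 'n/a'), "\n";
}
```

Calling bootstrap() with an argument sets the value and calling it
without one returns it, so the same loop can be used to modify values.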


> 2. I tried the example script attached below for general purpose for the
> example newick tree with bootstrap values (also attached below) and It gives
> strange results even for branch length. It shows Parent ID as 0.71 which
> actually is the bootstrap value for the last ancestral node for human and
> chimp and It shows the Child node ID as 'Human' ! Am I missing something in
> the tree formatting ? Results also attached below. Also how to extract /
> modify/ add bootstrap values in this tree ?
[snip]
> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
> #################################
> (
>   ('Chimp'  : 0.052,
>    'Human'  : 0.042) 0.71 : 0.007,
>   'Gorilla'  : 0.060,
>   ('Gibbon'  : 0.124,
>    'Orangutan'  : 0.0971) 1 : 0.038
> );
> #################################

Are you sure this is in the correct format?

For example, with the tree:
( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, 
'Gorilla':0.060, 
('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038);

and your script (with a print "--\n" between the two printing loops for 
clarity) I get...

> ##########################################
> 
> RESULTS:
> branch len is  0.007
> branch len is  0.060
> branch len is  0.038
> node is a leaf ... branch len is  0.042
> node is a leaf ... branch len is  0.052
> branch len is  0.007
> node is a leaf ... branch len is  0.060
> node is a leaf ... branch len is  0.0971
> node is a leaf ... branch len is  0.124
> branch len is  0.038
> Parent id: _0.71_ child id: ___'Human'__

...

branch len is 0.007
branch len is 0.060
branch len is 0.038
--
branch len is 0.007
node is a leaf ... branch len is 0.052
node is a leaf ... branch len is 0.042
node is a leaf ... branch len is 0.060
branch len is 0.038
node is a leaf ... branch len is 0.124
node is a leaf ... branch len is 0.0971
Parent id: 'Human_Chimp_Ancestor' child id: 'Chimp'

This seems reasonable to me. What were you expecting?


From n.haigh at sheffield.ac.uk  Fri Oct 27 07:36:10 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 11:36:10 +0000
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541CC82.2040705@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>
	<4541CC82.2040705@sendu.me.uk>
Message-ID: <4541EF2A.4050600@sheffield.ac.uk>

Sendu Bala wrote:
> Himanshu Ardawatia wrote:
>   
>> Hi,
>>
>> 2 questions :
>>
>> 1. I have a phylogenetic tree and I wish to set (or modify or query)
>> bootstrap values for all internal nodes. How do I do that using BioPerl ?
>>     
>
> Does bootstrap() not do what you need?
>
>
>   
>> 2. I tried the example script attached below for general purpose for the
>> example newick tree with bootstrap values (also attached below) and It gives
>> strange results even for branch length. It shows Parent ID as 0.71 which
>> actually is the bootstrap value for the last ancestral node for human and
>> chimp and It shows the Child node ID as 'Human' ! Am I missing something in
>> the tree formatting ? Results also attached below. Also how to extract /
>> modify/ add bootstrap values in this tree ?
>>     
> [snip]
>   
>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>> #################################
>> (
>>   ('Chimp'  : 0.052,
>>    'Human'  : 0.042) 0.71 : 0.007,
>>   'Gorilla'  : 0.060,
>>   ('Gibbon'  : 0.124,
>>    'Orangutan'  : 0.0971) 1 : 0.038
>> );
>> #################################
>>     
>
> Are you sure this is in the correct format?
>   

He/she may have a tree that already contains bootstrap values output
from another program. If this is so, which program did you use? Without
reminding myself of the formats, you should look up newick format and
whether it is possible to store bootstraps in it. In addition you should
also look up the nhx format.

> For example, with the tree:
> ( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, 
> 'Gorilla':0.060, 
> ('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038);
>
>   

This tree does not contain any bootstrap values - only branch lengths.

Sorry I can't be much more help at the moment - if I get a spare 10 mins
I'll have a closer look.
Nath


From bix at sendu.me.uk  Fri Oct 27 07:16:08 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 12:16:08 +0100
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EF2A.4050600@sheffield.ac.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>	<4541CC82.2040705@sendu.me.uk>
	<4541EF2A.4050600@sheffield.ac.uk>
Message-ID: <4541EA78.3050404@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Himanshu Ardawatia wrote:
>>>
>>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>>> #################################
>>> (
>>>   ('Chimp'  : 0.052,
>>>    'Human'  : 0.042) 0.71 : 0.007,
>>>   'Gorilla'  : 0.060,
>>>   ('Gibbon'  : 0.124,
>>>    'Orangutan'  : 0.0971) 1 : 0.038
>>> );
>>> #################################
>>>     
>> Are you sure this is in the correct format?
>>   
> 
> He/she may have a tree that already contains bootstrap values output
> from another program. If this is so, which program did you use? Without
> reminding myself of the formats, you should look up newick format and
> whether it is possible to store bootstraps in it. In addition you should
> also look up the nhx format.

Ah, well from a brief google it seemed like some software does store 
bootstrap values for internal nodes as the node ids when outputting in 
Newick format. I don't think Bioperl should be able to tell the 
difference between a normal id and a bootstrap value, so you'll have to 
detect that yourself and manually use bootstrap() when you get an id 
that looks like a number.

Or should Bioperl be making this assumption for you? Is that a safe 
thing to do? Maybe as an option only?
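
A minimal sketch of the "detect that yourself" step, assuming that an internal-node id counts as a bootstrap value when it is a plain non-negative number. The helper name id_looks_like_bootstrap is hypothetical and not part of Bioperl; it only decides whether an id is worth passing to bootstrap().

```perl
use strict;
use warnings;

# Hypothetical helper (not a Bioperl method): true when an internal-node
# id is a plain non-negative number such as "0.71" or "1", false for
# ordinary labels and for empty or undefined ids.
sub id_looks_like_bootstrap {
    my ($id) = @_;
    return 0 unless defined $id && length $id;
    return $id =~ /\A(?:\d+\.?\d*|\.\d+)\z/ ? 1 : 0;
}

for my $id ('0.71', '1', 'Human_Chimp_Ancestor', '') {
    printf "%-24s %s\n", "'$id'",
        id_looks_like_bootstrap($id) ? 'treat as bootstrap' : 'keep as id';
}
```

A numeric-looking label that really is a name (a sample number, say) would be misclassified, which is one reason to keep this an explicit, opt-in step.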


From n.haigh at sheffield.ac.uk  Fri Oct 27 08:24:49 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 12:24:49 +0000
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EA78.3050404@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>	<4541CC82.2040705@sendu.me.uk>
	<4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk>
Message-ID: <4541FA91.3040505@sheffield.ac.uk>

--snip--
>
> Ah, well from a brief google it seemed like some software does store
> bootstrap values for internal nodes as the node ids when outputting in
> Newick format. I don't think Bioperl should be able to tell the
> difference between a normal id and a bootstrap value, so you'll have
> to detect that yourself and manually use bootstrap() when you get an
> id that looks like a number.

If I remember rightly, in programs like Clustal you can specify where
bootstrap values are stored - node or branch. I can't remember which is
the default way, but TreeView can only see bootstraps if they are stored
using the "non-default" setting. This "could" be the same issue here.

>
> Or should Bioperl be making this assumption for you? Is that a safe
> thing to do? Maybe as an option only?
I don't know without a closer look - i'd also need to look at the newick
format definition as to whether this is an "extension" to the format or
if something is just flouting the newick rules.

Nath


From n.haigh at sheffield.ac.uk  Fri Oct 27 08:59:51 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 12:59:51 +0000
Subject: [Bioperl-l] Caching sequences
Message-ID: <454202C7.1040701@sheffield.ac.uk>

I have a script that is capable of downloading sequences from GenBank
based on GI numbers. I retrieve them in fasta format in order to save
bandwidth, but I'd like to take this one step further and cache the
sequences in case the user wants to rerun the script using some of the
GIs they used previously.

Does anyone have any guidance on how best to do this?

Cheers
Nath


From bix at sendu.me.uk  Fri Oct 27 08:35:13 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 13:35:13 +0100
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <454202C7.1040701@sheffield.ac.uk>
References: <454202C7.1040701@sheffield.ac.uk>
Message-ID: <4541FD01.6090803@sendu.me.uk>

Nathan S. Haigh wrote:
> I have a script that is capable of downloading sequences from GenBank
> based on GI numbers. I retrieve them in fasta format in order to save
> bandwidth, but I'd like to take this one step further and cache the
> sequences in case the user wants to rerun the script using some of the
> GIs they used previously.
> 
> Does anyone have any guidance on how best to do this?

You'd probably write the sequences out in some suitable format and 
access them via Bio::Index

Or, I'm sure bioperl-db excels at this kind of thing, but is a little 
more involved if this is only a simple situation.
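
For the simple case, a hand-rolled cache of one FASTA file per GI may be enough. The sketch below is an assumption-laden illustration, not Bio::Index or bioperl-db: cached_fasta, the cache directory layout and the stub fetcher are all hypothetical names, and the fetcher stands in for whatever code actually downloads from GenBank.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: return the FASTA text for $gi, reading it from
# $cache_dir when a cached copy exists, otherwise calling $fetch->($gi)
# once and saving the result for later runs.
sub cached_fasta {
    my ($gi, $cache_dir, $fetch) = @_;
    die "GI must be numeric: $gi\n" unless $gi =~ /\A\d+\z/;
    make_path($cache_dir) unless -d $cache_dir;
    my $path = File::Spec->catfile($cache_dir, "$gi.fasta");

    if (-e $path) {
        open my $in, '<', $path or die "Cannot read $path: $!";
        local $/;                      # slurp the whole file
        my $fasta = <$in>;
        close $in;
        return $fasta;
    }

    my $fasta = $fetch->($gi);         # the only "remote" access
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} $fasta;
    close $out or die "Cannot close $path: $!";
    return $fasta;
}

# Stub fetcher standing in for the real GenBank download.
my $calls = 0;
my $fetch = sub { $calls++; return ">gi|$_[0]\nACGT\n" };

my $dir = tempdir(CLEANUP => 1);
cached_fasta(7525012, $dir, $fetch) for 1 .. 3;
print "remote fetches: $calls\n";      # prints "remote fetches: 1"
```

In a real script the cache directory would be a persistent location rather than a temporary one, so that reruns benefit from earlier downloads.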


From bosborne11 at verizon.net  Fri Oct 27 09:09:30 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 27 Oct 2006 09:09:30 -0400
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <4541C66F.1020404@sheffield.ac.uk>
Message-ID: <C1677D4A.B0AF%bosborne11@verizon.net>

Nathan,

I don't know how this is supposed to work, there would be different ways to
make is_prototype true. One way would be to make the enzyme with the first
occurrence of a given restriction site the prototype (and the next enzymes
with the same site are isoschizomers). Or, one could wait until one site had
appeared twice, with 2 different enzymes, then make the first the prototype,
etc. I would have done it the first way myself but I took a quick look at
IO/withrefm.pm and it looks like it's doing it the second way. That means
one can read an enzyme file that has no duplicated restriction sites and
end up with no prototypes and no isoschizomers.
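
The two ways of assigning prototypes can be compared on toy data. This is only a sketch of the two rules as described above, not the code in IO/withrefm.pm; the function names are hypothetical, and the three enzymes are a hand-picked example in an arbitrary "file order".

```perl
use strict;
use warnings;

# Toy input in file order: [ enzyme name, recognition site ].
my @enzymes = (
    [ EcoRI => 'GAATTC' ],
    [ HpaII => 'CCGG'   ],
    [ MspI  => 'CCGG'   ],
);

# First way: the first enzyme seen for a site is always the prototype.
sub prototypes_first_seen {
    my %seen;
    return map { $_->[0] } grep { !$seen{ $_->[1] }++ } @_;
}

# Second way: the first enzyme for a site becomes the prototype only
# once a second enzyme with the same site has turned up.
sub prototypes_when_duplicated {
    my (%first, %count);
    for my $enz (@_) {
        my ($name, $site) = @$enz;
        $first{$site} = $name unless exists $first{$site};
        $count{$site}++;
    }
    return map  { $_->[0] }
           grep { $first{ $_->[1] } eq $_->[0] && $count{ $_->[1] } > 1 } @_;
}

my @first_seen = prototypes_first_seen(@enzymes);
my @when_dup   = prototypes_when_duplicated(@enzymes);
print "first way:  @first_seen\n";    # EcoRI HpaII
print "second way: @when_dup\n";      # HpaII
```

Under the second rule an input with no repeated sites yields an empty list, which would match a count of zero prototypes.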

Brian O.


On 10/27/06 4:42 AM, "Nathan S. Haigh" <n.haigh at sheffield.ac.uk> wrote:

> Hi Brian,
> 
> I wonder if i'm using is_prototype() correctly as I don't seem to get
> any returning true:
> 
> my $enz_coll = Bio::Restriction::EnzymeCollection->new();
> my $prototype = 0;
> foreach my $enz ($enz_coll->each_enzyme) {
>     $prototype++ if $enz->is_prototype;
> }
> print "$prototype have unique recognition sites\n";
> 
> prints:
> 0 have unique recognition sites
> 
> Thanks
> Nath
> 
> Brian Osborne wrote:
>> Nathan,
>> 
>> Perhaps because most restriction sites are palindromes. Anyway, I added
>> tests for palindromic() and is_palindromic() where the site is not a
>> palindrome, these tests pass (t/RestrictionAnalyis.t).
>> 
>> Brian O.
>> 
>> 
>> On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:
>> 
>>   
>>> I'm in the middle of writing some code that uses
>>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>>> Bioperl from HEAD.
>>> 
>>> I seem to find that $enzyme->is_palindromic always seems to return true.
>>> Can anyone verify this? If needs be, I can send some code.
>>> 
>>> Thanks
>>> Nathan
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>     
>> 
>> 
>>   
> 


From n.haigh at sheffield.ac.uk  Fri Oct 27 10:19:02 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 14:19:02 +0000
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <C1677D4A.B0AF%bosborne11@verizon.net>
References: <C1677D4A.B0AF%bosborne11@verizon.net>
Message-ID: <45421556.9060300@sheffield.ac.uk>

Brian Osborne wrote:
> Nathan,
>
> I don't know how this is supposed to work, there would be different ways to
> make is_prototype true. One way would be to make the enzyme with the first
> occurrence of a given restriction site the prototype (and the next enzymes
> with the same site are isoschizomers). Or, one could wait until one site had
> appeared twice, with 2 different enzymes, then make the first the prototype,
> etc. I would have done it the first way myself but I took a quick look at
> IO/withrefm.pm and it looks like it's doing it the second way. That means
> one can read an enzyme file and end up with no duplicated restriction sites,
> or prototypes and isoschizomers.
>
> Brian O.
>
>   
Hmm, I'd have done it the first way also. Doing it the second way would
mean you only ended up with something as a prototype if there were
multiple enzymes with the same restriction site - is that correct
biologically?

Nath


From n.haigh at sheffield.ac.uk  Fri Oct 27 10:23:20 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 14:23:20 +0000
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
Message-ID: <45421658.5000103@sheffield.ac.uk>

As you may be aware by now, I'm working with Bio::Restriction::Analysis
and friends.

I'm doing restriction analysis on large sequences - chromosomes. I need
to identify an appropriate enzyme based on the total length of fragments
that are of a certain size (e.g. 100 - 500 bp). However, the amount of
memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I
have the following code (bottom) which downloads 2 thaliana chromosomes
(mito and chloro - so pretty small) and runs an analysis and then loops
through the fragments for all enzymes in the default collection.

My memory usage just keeps on climbing and none seems to get freed up
even when a $ra goes out of scope (start dealing with the next
sequence). Is this a memory leak of some sort, is there a way to free up
memory as I go? I'd appreciate any help/advice on how to reduce the
amount of memory being consumed as I'd like to use all the thaliana
chromosomes (not just mito and chloro), which at the moment probably
won't work.

Cheers
Nath

use strict;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012,  26556996 );

my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
  print "Getting GI: $gi\n";
  push @seq_objs, $db->get_Seq_by_id($gi)
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
  my $tot_size = 0;
  print "Processing ", $seq->primary_id,"\n";
  my $ra = Bio::Restriction::Analysis->new(
                                         -seq=>$seq,
                                         -enzymes=>$enz_Coll,
  );
 
  my @all_enzymes = $ra->cutters->each_enzyme;
  print "  Calc total length of fragments in range: $min_fragment_size - $max_fragment_size\n";
  foreach my $enzyme ( @all_enzymes ) {
    $tot_size = 0; # reset so the total is per enzyme, not cumulative
    # fragments() is a real memory hog
    foreach my $frag ($ra->fragments($enzyme)) {
      next if $min_fragment_size && (length $frag < $min_fragment_size);
      next if $max_fragment_size && (length $frag > $max_fragment_size);
      $tot_size += length $frag;
    }
    # do something based on value of $tot_size
    #print "    ", $enzyme->name, " total = $tot_size\n";
  }
  print "DONE\n";
}
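
One way to avoid holding every fragment string in memory is to work from cut positions instead, since only fragment lengths are needed here. The sketch below is pure Perl and makes two assumptions: that cut positions can be obtained for each enzyme (whether and how Bio::Restriction::Analysis exposes them should be checked in its documentation), and that the sequence is treated as linear. The helper name in_range_total is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: sum the lengths of fragments whose size lies in
# [$min, $max], given the length of a linear sequence and the positions
# after which it is cut. No fragment strings are ever built.
sub in_range_total {
    my ($seq_len, $cuts, $min, $max) = @_;
    my ($total, $prev) = (0, 0);
    for my $pos ((sort { $a <=> $b } @$cuts), $seq_len) {
        my $size = $pos - $prev;
        $prev = $pos;
        next if $size < $min || $size > $max;
        $total += $size;
    }
    return $total;
}

# A 1000 bp toy sequence cut after positions 150, 400 and 950 gives
# fragments of 150, 250, 550 and 50 bp; only 150 and 250 are in range.
my $total = in_range_total(1000, [ 400, 150, 950 ], 100, 500);
print "$total\n";    # prints 400
```

For a circular molecule the first and last pieces would be joined into a single fragment, which this linear sketch does not do.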


From avilella at gmail.com  Fri Oct 27 09:39:41 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:39:41 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
In-Reply-To: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
References: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
Message-ID: <358f4d650610270639q14870a6erae2e3c4e9063105d@mail.gmail.com>

Replying to myself: I think I found the way:

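# $treeio is a Bio::TreeIO object created earlier
my $tree = $treeio->next_tree;

# Sum the defined branch lengths (the root usually has none).
my $total_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless defined $branch_length;
    $total_branch_length += $branch_length;
}
die "Total branch length is zero\n" unless $total_branch_length;

# Divide each branch length by the total.
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless defined $branch_length;
    $node->branch_length($branch_length / $total_branch_length);
}

# Sanity check: the scaled lengths should now sum to 1.
my $new_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    $new_branch_length += $node->branch_length || 0;
}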
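
The arithmetic can be checked without a tree object. This is a minimal sketch on a plain list of lengths; the helper name scale_to_unit_sum is hypothetical, and the numbers are the branch lengths from the example tree posted earlier on the list, plus an undefined entry standing in for a root without a branch.

```perl
use strict;
use warnings;

# Hypothetical helper: scale a list of branch lengths so that the defined
# ones sum to 1. Undefined entries are passed through untouched.
sub scale_to_unit_sum {
    my @lengths = @_;
    my $total = 0;
    $total += $_ for grep { defined } @lengths;
    die "total branch length is zero; nothing to scale\n" unless $total > 0;
    return map { defined $_ ? $_ / $total : undef } @lengths;
}

my @scaled = scale_to_unit_sum(0.052, 0.042, 0.007, 0.060,
                               0.124, 0.0971, 0.038, undef);
my $sum = 0;
$sum += $_ for grep { defined } @scaled;
printf "sum after scaling: %.6f\n", $sum;
```

Because of floating-point rounding the sum is 1 only to within a small tolerance, so "exactly 1" is best tested with a tolerance rather than with ==.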

On 10/27/06, Albert Vilella <avilella at gmail.com> wrote:
> Hi all,
>
> I am in need of a method that would scale the different branch lengths
> of a tree so that after the scaling they all sum up to exactly 1.
>
> Any pointers? Has anyone done that before?
>
> Thanks in advance,
>
>     Albert.
>


From cjfields at uiuc.edu  Fri Oct 27 10:35:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 09:35:35 -0500
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <4541CBA8.10006@sheffield.ac.uk>
Message-ID: <001501c6f9d5$2e33e120$15327e82@pyrimidine>

...
> I think it makes sense to test that data of the expected type was
> returned by the external resource but not to test the specifics of what
> was returned. If specifics are tested we are then in the realm of testing
> whether we believe the data returned by the external resource or not. We
> should assume that the domain experts for these resources know what they
> are doing - in some cases this might not be true :-)  but I think we
> should stick to testing that the objects created hold the expected type
> of data.
> 
> I like what Chris had to say (above) but wonder whether tests
> would/should be tested for in the module itself - i.e. testing that a
> stored value is an integer and warn/throw if not?
> 
> Nath

Yeah, sorry about the top post (stupid Outlook always sticks the sig at the
top of the page!).  

Testing in the module would be best but can be tricky for the very same
reasons that writing tests entail, even more so.  For instance, for NCBI
esummary data, I parse the data in a very generic way in order to have
access to as much data as possible.  

For tests, I have to assume that NCBI will always return a particular type
of value (string, integer, date).  I can test for each of those with a regex
in the module fairly simply and throw/warn, as you indicate.  However, if
they decide to add new data with a data tag other than the ones I test for
in the module (i.e. String, Integer, Date), I suddenly have warns/throws
showing up and cluttering/clobbering the code for perfectly valid data.  

However, if these are caught in tests and the tests fail, no big loss.  The
actual module still works, even if the tests are failing based on a new
unknown value being returned.  

For me, failed tests are sort of a warning light to let me know that
something has changed, but it doesn't necessarily mean a module doesn't
work.  I generally use throw/warn for something truly catastrophic, like no
response from the server or an error in the XML, which affects downstream
methods.  
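
As a rough illustration of the regex checks described above, the sketch below validates a value against its type tag and reports an unknown tag separately instead of dying. The patterns, the Date layout and the helper name value_matches_type are illustrative assumptions, not NCBI's definitions or the actual module code.

```perl
use strict;
use warnings;

# Patterns for the three type tags mentioned above. The exact regexes
# (especially the date layout) are guesses made for this sketch.
my %pattern_for = (
    String  => qr/\A.*\z/s,
    Integer => qr/\A-?\d+\z/,
    Date    => qr/\A\d{4}\/\d{2}\/\d{2}\z/,
);

# Hypothetical helper: returns 1 or 0 for a known type tag, and undef
# when the tag itself is unknown (new data type from the server).
sub value_matches_type {
    my ($type, $value) = @_;
    my $re = $pattern_for{$type};
    return unless defined $re;
    return $value =~ $re ? 1 : 0;
}

for my $item ([ Integer => '42' ], [ Integer => '4x2' ],
              [ Date => '2006/10/27' ], [ Float => '0.5' ]) {
    my ($type, $value) = @$item;
    my $ok = value_matches_type($type, $value);
    printf "%-8s %-12s %s\n", $type, $value,
        !defined($ok) ? 'unknown type tag' : $ok ? 'ok' : 'mismatch';
}
```

Keeping the "unknown tag" case distinct is what lets a test fail loudly while the module itself carries on with valid data.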

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Oct 27 11:09:36 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 10:09:36 -0500
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <454202C7.1040701@sheffield.ac.uk>
Message-ID: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>

> I have a script that is capable of downloading sequences from GenBank
> based on GI numbers. I retrieve them in fasta format in order to save
> bandwidth, but I'd like to take this one step further and cache the
> sequences in case the user wants to rerun the script using some of the
> GIs they used previously.
> 
> Does anyone have any guidance on how best to do this?
> 
> Cheers
> Nath

There is Bio::DB::InMemoryCache, which is really an interface but appears to
have several methods defined; you could look for modules which implement it.
Sendu's suggestion of the Bio::Index modules and bioperl-db are also good
starting points.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Fri Oct 27 11:21:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 10:21:49 -0500
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <45421556.9060300@sheffield.ac.uk>
Message-ID: <001701c6f9db$9f90d160$15327e82@pyrimidine>

> Brian Osborne wrote:
> > Nathan,
> >
> > I don't know how this is supposed to work, there would be different ways to
> > make is_prototype true. One way would be to make the enzyme with the first
> > occurrence of a given restriction site the prototype (and the next enzymes
> > with the same site are isoschizomers). Or, one could wait until one site had
> > appeared twice, with 2 different enzymes, then make the first the prototype,
> > etc. I would have done it the first way myself but I took a quick look at
> > IO/withrefm.pm and it looks like it's doing it the second way. That means
> > one can read an enzyme file and end up with no duplicated restriction sites,
> > or prototypes and isoschizomers.
> >
> > Brian O.
> >
> >
> Hmm, I'd have done it the first way also. Doing it the second way would
> mean you only ended up with something as a prototype if there were
> multiple enzymes with the same restriction site - is that correct
> biologically?
> 
> Nath

I had a look at all the Restriction::IO modules a while back; most need
serious updating!  It just hasn't been a top priority unfortunately.

I think the prototype issue may depend on the IO format and whether or not
one is defined explicitly in the file being parsed or is just chosen based
on what Brian said (order in the file, similar cutting site).

By the strictest definition (and cheating by looking at the Fermentas web
site), the prototype is supposed to be the first enzyme discovered which
cleaves a unique sequence, so it may not be the first enzyme found in the
file.  Isoschizomers are those discovered to cleave the same sequence
subsequent to the prototype.  Neoschizomers cleave the same sequence as a
prototype but at a different site.

So this calls into question whether the prototype should be defined at all
unless it is specifically indicated in the file.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Fri Oct 27 12:47:53 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 16:47:53 +0000
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com>
References: <454202C7.1040701@sheffield.ac.uk>	
	<001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>
	<8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com>
Message-ID: <45423839.9040503@sheffield.ac.uk>

Jason Stajich wrote:
> Bio::DB::FileCache does one better and lets you cache the data in a
> persistent file.  Not sure this index is shareable among users though
> - bioperl-db is a better soln when that is desired.
Thanks I'll have a look into it. No need for being sharable among users
- not unless the script becomes heavily used.

Thanks
Nath


From cjfields at uiuc.edu  Fri Oct 27 12:15:00 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 11:15:00 -0500
Subject: [Bioperl-l] StandAloneFasta.t bioperl-run tests
Message-ID: <000101c6f9e3$0e5e95d0$15327e82@pyrimidine>

Nathan,

The test failures you posted on the wiki seem to indicate that using the
wrapper works but the order of the returned hits is off.  Does the order of
the returned hits match the actual FASTA report order?  If it does then the
tests need to be fixed in a way that makes them more flexible, to account for
some data 'fuzziness' due to variations in output between different
versions.  
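
One possible form of that flexibility is to compare the set of hits rather than their order. The sketch below assumes hit names are what the test compares; same_hits_any_order and the hit names are hypothetical and not taken from StandAloneFasta.t.

```perl
use strict;
use warnings;

# Hypothetical helper: true when both lists hold the same hit names,
# ignoring order. Duplicates are counted, so (A, A, B) differs from
# (A, B, B).
sub same_hits_any_order {
    my ($got, $expected) = @_;
    return 0 unless @$got == @$expected;
    my %count;
    $count{$_}++ for @$got;
    $count{$_}-- for @$expected;
    my $mismatches = grep { $_ != 0 } values %count;
    return $mismatches ? 0 : 1;
}

my @expected = qw(hit_A hit_B hit_C);
my @got      = qw(hit_B hit_A hit_C);    # same hits, different order
print same_hits_any_order(\@got, \@expected) ? "same hits\n" : "different hits\n";
```

This trades strictness for robustness: a test written this way no longer notices if the ranking itself is wrong, so it suits cases where only the order is expected to vary between versions.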

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From jason at bioperl.org  Fri Oct 27 12:50:54 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 27 Oct 2006 09:50:54 -0700
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EA78.3050404@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>	<4541CC82.2040705@sendu.me.uk>
	<4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk>
Message-ID: <1230E110-01AB-4D4E-842F-20B939555299@bioperl.org>

I've answered to this effect multiple times in the past on the
mailing list.  Newick format does not distinguish between internal
ids and bootstrap values (or whatever else you want to attach
there).  Different programs have different conventions.  When both
values are present and encoded so that we can parse out the
bootstrap, like this: [BOOTSTRAP], the parser grabs it out.  If you
know all the internal ids are bootstraps you can just copy the values
over manually very simply:

# get all the internal nodes
for my $node ( grep { ! $_->is_Leaf } $tree->get_nodes ) {
  # copy id to bootstrap
  $node->bootstrap($node->id) if defined $node->id && length($node->id);
  $node->id(''); # set internal id to empty
}

If someone can make this clearer on a wiki page that would be great.

On Oct 27, 2006, at 4:16 AM, Sendu Bala wrote:

> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Himanshu Ardawatia wrote:
>>>>
>>>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>>>> #################################
>>>> (
>>>>   ('Chimp'  : 0.052,
>>>>    'Human'  : 0.042) 0.71 : 0.007,
>>>>   'Gorilla'  : 0.060,
>>>>   ('Gibbon'  : 0.124,
>>>>    'Orangutan'  : 0.0971) 1 : 0.038
>>>> );
>>>> #################################
>>>>
>>> Are you sure this is in the correct format?
>>>
>>
>> He/she may have a tree that already contains bootstrap values output
>> from another program. If this is so, which program did you use? Without
>> reminding myself of the formats, you should look up newick format and
>> whether it is possible to store bootstraps in it. In addition you should
>> also look up the nhx format.
>
> Ah, well from a brief google it seemed like some software does store
> bootstrap values for internal nodes as the node ids when outputting in
> Newick format. I don't think Bioperl should be able to tell the
> difference between a normal id and a bootstrap value, so you'll have to
> detect that yourself and manually use bootstrap() when you get an id
> that looks like a number.
>
> Or should Bioperl be making this assumption for you? Is that a safe
> thing to do? Maybe as an option only?

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From avilella at gmail.com  Fri Oct 27 09:23:07 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:23:07 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
Message-ID: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>

Hi all,

I am in need of a method that would scale the different branch lengths
of a tree so that after the scaling they all sum up to exactly 1.

Any pointers? Has anyone done that before?

Thanks in advance,

    Albert.


From cjfields at uiuc.edu  Fri Oct 27 14:34:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 13:34:57 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
Message-ID: <000001c6f9f6$9ab12710$15327e82@pyrimidine>

I am working on refactoring the AlignIO::stockholm parser to get it reading
and writing Pfam/Rfam alignments, and noticed that many alignments have
EMBL-like annotations attached, which pertain to the entire alignment:

# STOCKHOLM 1.0
#=GF ID    ykkC-yxkD
#=GF AC    RF00442
#=GF DE    ykkC-yxkD element
#=GF AU    Moxon SJ
#=GF GA    20.0
#=GF NC    0.1
#=GF TC    59.4
#=GF SE    Barrick JE, Breaker RR
#=GF SS    Predicted; Barrick JE, Breaker RR
#=GF TP    Cis-reg; riboswitch;
#=GF BM    cmbuild CM SEED
#=GF BM    cmsearch -W 175 CM SEQDB
#=GF RN    [1]
#=GF RM    15096624
#=GF RT    New RNA motifs suggest an expanded scope for riboswitches in
#=GF RT    bacterial genetic control.
#=GF RA    Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee
#=GF RA    M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR;
#=GF RL    Proc Natl Acad Sci U S A 2004;101:6421-6426.
#=GF CC    This family represents the bacterial ykkC/yxkD element. The function of
#=GF CC    this family is unclear although it has been suggested that it may function
#=GF CC    to switch on efflux pumps and detoxification systems in response to harmful
#=GF CC    environmental molecules [1]. The Thermoanaerobacter tengcongensis sequence
#=GF CC    EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that the two
#=GF CC    riboswitches may work in conjunction to regulate the the upstream gene
#=GF CC    which codes for Swiss:Q8RC62, a member of Pfam:PF00860 (Personal obs. Moxon
#=GF CC    SJ).
#=GF SQ    16

SimpleAlign, as implemented, seemingly doesn't have a way to store this
information.
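
To make the shape of this data concrete, here is a minimal sketch that collects the #=GF lines into a tag => list-of-values structure. It is not the AlignIO::stockholm parser, and parse_gf_lines is a hypothetical name; repeated tags such as BM or CC simply accumulate in order.

```perl
use strict;
use warnings;

# Hypothetical helper: collect "#=GF <tag> <text>" lines into a hash of
# tag => [ text, ... ], keeping repeated tags in their original order.
sub parse_gf_lines {
    my @lines = @_;
    my %annotation;
    for my $line (@lines) {
        next unless $line =~ /\A#=GF\s+(\S+)\s+(.*?)\s*\z/;
        push @{ $annotation{$1} }, $2;
    }
    return \%annotation;
}

my $gf = parse_gf_lines(
    '# STOCKHOLM 1.0',
    '#=GF ID    ykkC-yxkD',
    '#=GF AC    RF00442',
    '#=GF BM    cmbuild CM SEED',
    '#=GF BM    cmsearch -W 175 CM SEQDB',
);
print "$_: ", join(' | ', @{ $gf->{$_} }), "\n" for sort keys %$gf;
```

A structure like this maps naturally onto alignment-level annotation, with one annotation entry per value under each tag.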

I'll work on getting the core alignment IO working, but would there be any
interest in having a way to store annotations in Bio::SimpleAlign?  I'm
guessing the methods would be similar to the various Bio::Seq Annotation
methods.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From hlapp at gmx.net  Fri Oct 27 16:23:46 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 27 Oct 2006 16:23:46 -0400
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
Message-ID: <C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>

You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose  
this is what you meant by the 'various Bio::Seq Annotation methods'  
too.)

Just to make sure I'm not misunderstanding, I suppose the annotation  
pertains to the entire alignment?

	-hilmar

On Oct 27, 2006, at 2:34 PM, Chris Fields wrote:

> I am working on refactoring the AlignIO::stockholm parser to get it reading
> and writing Pfam/Rfam alignments, and noticed that many alignments have
> EMBL-like annotations attached, which pertain to the entire alignment:
>
> # STOCKHOLM 1.0
> #=GF ID    ykkC-yxkD
> #=GF AC    RF00442
> #=GF DE    ykkC-yxkD element
> #=GF AU    Moxon SJ
> #=GF GA    20.0
> #=GF NC    0.1
> #=GF TC    59.4
> #=GF SE    Barrick JE, Breaker RR
> #=GF SS    Predicted; Barrick JE, Breaker RR
> #=GF TP    Cis-reg; riboswitch;
> #=GF BM    cmbuild CM SEED
> #=GF BM    cmsearch -W 175 CM SEQDB
> #=GF RN    [1]
> #=GF RM    15096624
> #=GF RT    New RNA motifs suggest an expanded scope for riboswitches in
> #=GF RT    bacterial genetic control.
> #=GF RA    Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee
> #=GF RA    M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR;
> #=GF RL    Proc Natl Acad Sci U S A 2004;101:6421-6426.
> #=GF CC    This family represents the bacterial ykkC/yxkD element. The function of
> #=GF CC    this family is unclear although it has been suggested that it may function
> #=GF CC    to switch on efflux pumps and detoxification systems in response to harmful
> #=GF CC    environmental molecules [1]. The Thermoanaerobacter tengcongensis sequence
> #=GF CC    EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that the two
> #=GF CC    riboswitches may work in conjunction to regulate the the upstream gene
> #=GF CC    which codes for Swiss:Q8RC62, a member of Pfam:PF00860 (Personal obs. Moxon
> #=GF CC    SJ).
> #=GF SQ    16
>
> SimpleAlign, as implemented, seemingly doesn't have a way to store this
> information.
>
> I'll work on getting the core alignment IO working, but would there be any
> interest in having a way to store annotations in Bio::SimpleAlign?  I'm
> guessing the methods would be similar to the various Bio::Seq Annotation
> methods.
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Oct 27 16:38:17 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 15:38:17 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
Message-ID: <000001c6fa07$d8659990$15327e82@pyrimidine>

Hilmar Lapp wrote:
> You could make SimpleAlign be a Bio::AnnotationHolderI. (I
> suppose this is what you meant by the 'various Bio::Seq Annotation
> methods' too.)
> 
> Just to make sure I'm not misunderstanding, I suppose the
> annotation pertains to the entire alignment?
> 
> 	-hilmar
...

Yes, that's correct.  I would probably use Bio::Seq::Meta for the
sequence-specific markup lines.  I would have to add another new method to
deal with non-sequence-based consensus data (like sec. structure) for now.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Fri Oct 27 11:38:05 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 27 Oct 2006 08:38:05 -0700
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>
References: <454202C7.1040701@sheffield.ac.uk>
	<001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>
Message-ID: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com>

Bio::DB::FileCache does one better and lets you cache the data in a
persistent file.  I'm not sure this index is shareable among users though -
bioperl-db is a better solution when that is desired.

-jason

On 10/27/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> > I have a script that is capable of downloading sequences from GenBank
> > based on GI numbers. I retrieve them in FASTA format in order to save
> > bandwidth, but I'd like to take this one step further and cache the
> > sequences in case the user wants to rerun the script using some of the
> > GIs they used previously.
> >
> > Does anyone have any guidance on how best to do this?
> >
> > Cheers
> > Nath
>
> There is Bio::DB::InMemoryCache, which is really an interface but appears
> to have several methods defined; you could look for modules which
> implement it.
> Sendu's suggestion of the Bio::Index modules and bioperl-db are also good
> starting points.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>


-- 
Jason Stajich
jason at bioperl.org
http://www.duke.edu/~jes12/


From cjfields at uiuc.edu  Fri Oct 27 21:57:58 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 20:57:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
Message-ID: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>


On Oct 27, 2006, at 3:23 PM, Hilmar Lapp wrote:

> You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose
> this is what you meant by the 'various Bio::Seq Annotation methods'
> too.)
>
> Just to make sure I'm not misunderstanding, I suppose the annotation
> pertains to the entire alignment?
>
> 	-hilmar

BTW, was that supposed to be Bio::AnnotatableI, or  
Bio::AnnotationHolderI?  The latter isn't present in CVS HEAD.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From eric.ross at neuro.utah.edu  Sat Oct 28 17:24:30 2006
From: eric.ross at neuro.utah.edu (Eric Ross)
Date: Sat, 28 Oct 2006 15:24:30 -0600
Subject: [Bioperl-l] PAML
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>

I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object.

I am able to extract other data from the report, but there seems to be a conflict in the documentation.  One doc implies that there should be a get_posteriors method. (It's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object. 


I have been trying various methods, in the event I'm just "confused", but I've had no luck, thus far.  Anyone have suggestions?


code:

----begin code-------
#!/usr/bin/perl -w

use strict;


use Bio::Tools::Phylo::PAML;
my $parser = new Bio::Tools::Phylo::PAML
             (-file => "mlc");
my $result = $parser->next_result;
my @posteriors = $result->get_posteriors();

print "@posteriors";

exit(0);

---------end code-------------


---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab


From avilella at gmail.com  Sun Oct 29 05:52:04 2006
From: avilella at gmail.com (Albert Vilella)
Date: Sun, 29 Oct 2006 10:52:04 +0000
Subject: [Bioperl-l] PAML
In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>

I don't know if this method is implemented. I can't grep-find it.
Maybe it's simply not there yet, but was planned when the
documentation was written.

On 10/28/06, Eric Ross <eric.ross at neuro.utah.edu> wrote:
> I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>
> I am able to extract other data from the report, but there seems to be a conflict in the documentation.  One doc implies that there should be a get_posteriors method. (It's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object.
>
>
> I have been trying various methods, in the event I'm just "confused", but I've had no luck, thus far.  Anyone have suggestions?
>
>
> code:
>
> ----begin code-------
> #!/usr/bin/perl -w
>
> use strict;
>
>
> use Bio::Tools::Phylo::PAML;
> my $parser = new Bio::Tools::Phylo::PAML
>              (-file => "mlc");
> my $result = $parser->next_result;
> my @posteriors = $result->get_posteriors();
>
> print "@posteriors";
>
> exit(0);
>
> ---------end code-------------
>
>
>
> ---------------
> Eric Ross
> Computer Analyst II
> ejr at neuro.utah.edu
> Howard Hughes Medical Institute
> University of Utah
> Sánchez Lab
>
>
>
>
>


From cjfields at uiuc.edu  Sun Oct 29 09:23:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Oct 2006 08:23:45 -0600
Subject: [Bioperl-l] PAML
In-Reply-To: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
	<358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
Message-ID: <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>

Does the data show up in the object using Data::Dumper?

This should be filed as a bug since the docs imply the method  
exists.  This could be written up fairly quickly if one had test data  
and a script to work with (hint hint...)
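
A minimal sketch of that Data::Dumper check, for anyone following along.
The object below is a made-up stand-in (the class name My::Fake::Result and
the keys _version and _posteriors are invented for illustration and are
not the real internals of Bio::Tools::Phylo::PAML::Result); in practice
you would dump whatever $parser->next_result returned.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for a parsed result object. The class name and
# keys are invented for illustration only.
my $result = bless {
    _version    => 'hypothetical',
    _posteriors => [ 0.12, 0.95, 0.99 ],
}, 'My::Fake::Result';

# Sort keys so the dump is stable and easy to compare between runs.
local $Data::Dumper::Sortkeys = 1;
local $Data::Dumper::Indent   = 1;

my $dump = Dumper($result);
print $dump;

# List the top-level keys to see what the parser actually stored.
my @keys = sort keys %$result;
print "Top-level keys: @keys\n";
```

If the values you are after appear in the dump but no accessor returns
them, that is useful detail to attach to the bug report.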

Chris

On Oct 29, 2006, at 4:52 AM, Albert Vilella wrote:

> I don't know if this method is implemented. I can't grep-find it.
> Maybe it's simply not there yet, but was planned when the
> documentation was written.
>
> On 10/28/06, Eric Ross <eric.ross at neuro.utah.edu> wrote:
>> I am trying to extract the "Naive Empirical Bayes (NEB)  
>> probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>>
>> I am able to extract other data from the report, but there seems  
>> to be a conflict in the documentation.  One doc implies that there  
>> should be a get_posteriors method. (It's used as an example in the  
>> Bio::Tools::Phylo::PAML doc), but the method does not appear to  
>> exist in the Bio::Tools::Phylo::PAML::Result object.
>>
>>
>> I have been trying various methods, in the event I'm just  
>> "confused", but I've had no luck, thus far.  Anyone have suggestions?
>>
>>
>> code:
>>
>> ----begin code-------
>> #!/usr/bin/perl -w
>>
>> use strict;
>>
>>
>> use Bio::Tools::Phylo::PAML;
>> my $parser = new Bio::Tools::Phylo::PAML
>>              (-file => "mlc");
>> my $result = $parser->next_result;
>> my @posteriors = $result->get_posteriors();
>>
>> print "@posteriors";
>>
>> exit(0);
>>
>> ---------end code-------------
>>
>>
>>
>> ---------------
>> Eric Ross
>> Computer Analyst II
>> ejr at neuro.utah.edu
>> Howard Hughes Medical Institute
>> University of Utah
>> Sánchez Lab
>>
>>
>>
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From eric.ross at neuro.utah.edu  Sun Oct 29 12:06:54 2006
From: eric.ross at neuro.utah.edu (Eric Ross)
Date: Sun, 29 Oct 2006 10:06:54 -0700
Subject: [Bioperl-l] PAML
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
	<358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
	<9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>
Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu>

Thanks for all the help.

I've been looking at the code for the PAML rst parser.  It's a bit tricky. 

We have written a parser specific for our needs, but it looks to be a pretty complicated matter to make it generic.  

The output of PAML can vary a lot depending upon your options and this section can be repeated multiple times.  I'm sure someone with a good grasp of the potential output of PAML could come up with something, but I'll admit to being at a loss. 


---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab


-----Original Message-----
From: Chris Fields [mailto:cjfields at uiuc.edu]
Sent: Sun 2006-10-29 7:23 AM
To: Albert Vilella
Cc: Eric Ross; Bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] PAML
 
Does the data show up in the object using Data::Dumper?

This should be filed as a bug since the docs imply the method  
exists.  This could be written up fairly quickly if one had test data  
and a script to work with (hint hint...)

Chris

On Oct 29, 2006, at 4:52 AM, Albert Vilella wrote:

> I don't know if this method is implemented. I can't grep-find it.
> Maybe it's simply not there yet, but was planned when the
> documentation was written.
>
> On 10/28/06, Eric Ross <eric.ross at neuro.utah.edu> wrote:
>> I am trying to extract the "Naive Empirical Bayes (NEB)  
>> probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>>
>> I am able to extract other data from the report, but there seems  
>> to be a conflict in the documentation.  One doc implies that there  
>> should be a get_posteriors method. (It's used as an example in the  
>> Bio::Tools::Phylo::PAML doc), but the method does not appear to  
>> exist in the Bio::Tools::Phylo::PAML::Result object.
>>
>>
>> I have been trying various methods, in the event I'm just  
>> "confused", but I've had no luck, thus far.  Anyone have suggestions?
>>
>>
>> code:
>>
>> ----begin code-------
>> #!/usr/bin/perl -w
>>
>> use strict;
>>
>>
>> use Bio::Tools::Phylo::PAML;
>> my $parser = new Bio::Tools::Phylo::PAML
>>              (-file => "mlc");
>> my $result = $parser->next_result;
>> my @posteriors = $result->get_posteriors();
>>
>> print "@posteriors";
>>
>> exit(0);
>>
>> ---------end code-------------
>>
>>
>>
>> ---------------
>> Eric Ross
>> Computer Analyst II
>> ejr at neuro.utah.edu
>> Howard Hughes Medical Institute
>> University of Utah
>> Sánchez Lab
>>
>>
>>
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Sun Oct 29 12:43:20 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 29 Oct 2006 17:43:20 +0000
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
In-Reply-To: <45421658.5000103@sheffield.ac.uk>
References: <45421658.5000103@sheffield.ac.uk>
Message-ID: <4544E838.7090400@sheffield.ac.uk>

Sorry for the repeat post but I haven't had a response. Just wondered if 
anyone had any idea about this?

Thanks
Nath

Nathan S. Haigh wrote:
> As you may be aware by now, I'm working with Bio::Restriction::Analysis
> and friends.
>
> I'm doing restriction analysis on large sequences - chromosomes. I need
> to identify an appropriate enzyme based on the total length of fragments
> that are of a certain size (e.g. 100 - 500 bp). However, the amount of
> memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I
> have the following code (bottom) which downloads 2 thaliana chromosomes
> (mito and chloro - so pretty small) and runs an analysis and then loops
> through the fragments for all enzymes in the default collection.
>
> My memory usage just keeps on climbing and none seems to get freed up
> even when a $ra goes out of scope (start dealing with the next
> sequence). Is this a memory leak of some sort, is there a way to free up
> memory as I go? I'd appreciate any help/advice on how to reduce the
> amount of memory being consumed as I'd like to use all the thaliana
> chromosomes (not just mito and chloro), which at the moment probably
> won't work.
>
> Cheers
> Nath
>
> use strict;
> use Bio::DB::GenBank;
> use Bio::Restriction::Analysis;
> use Bio::Restriction::EnzymeCollection;
>
> my @seq_objs;
> my @gis = ( 7525012,  26556996 );
>
> my $db = Bio::DB::GenBank->new(-format => "fasta");
> foreach my $gi (@gis) {
>   print "Getting GI: $gi\n";
>   push @seq_objs, $db->get_Seq_by_id($gi)
> }
>
> my $min_fragment_size = 100;
> my $max_fragment_size = 500;
> my $enz_Coll = Bio::Restriction::EnzymeCollection->new();
>
> foreach my $seq (@seq_objs) {
>   my $tot_size = 0;
>   print "Processing ", $seq->primary_id,"\n";
>   my $ra = Bio::Restriction::Analysis->new(
>                                          -seq=>$seq,
>                                          -enzymes=>$enz_Coll,
>   );
>  
>   my @all_enzymes = $ra->cutters->each_enzyme;
>   print "  Calc total length of fragments in range: $min_fragment_size - $max_fragment_size\n";
>   foreach my $enzyme ( @all_enzymes ) {
>     # fragments() is a real memory hog
>     foreach my $frag ($ra->fragments($enzyme)) {
>       next if $min_fragment_size && (length $frag < $min_fragment_size);
>       next if $max_fragment_size && (length $frag > $max_fragment_size);
>       $tot_size += length $frag;
>     }
>     # do something based on value of $tot_size
>     #print "    ", $enzyme->name, " total = $tot_size\n";
>   }
>   print "DONE\n";
> }
>
>   


From cjfields at uiuc.edu  Sun Oct 29 13:09:54 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Oct 2006 12:09:54 -0600
Subject: [Bioperl-l] PAML
In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
	<358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
	<9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <C775A898-5D18-48F6-874F-3B359C1A10C5@uiuc.edu>

On Oct 29, 2006, at 11:06 AM, Eric Ross wrote:

> Thanks for all the help.
>
> I've been looking at the code for the PAML rst parser.  It's a bit  
> tricky.
>
> We have written a parser specific for our needs, but it looks to be  
> a pretty complicated matter to make it generic.
>
> The output of PAML can vary a lot depending upon your options and  
> this section can be repeated multiple times.  I'm sure someone with  
> a good grasp of the potential output of PAML could come up with  
> something, but I'll admit to being at a loss.

Eric,

I planned on looking at ways to integrate the protein-based PAML  
programs but I'm working on a different area at the moment.  I agree  
it may be hard to adequately genericize parsing/methods to accomplish  
this, but if you have any ideas feel free to post them.  Again, I  
would suggest adding any proposed enhancements or bugs to Bugzilla:

http://bugzilla.open-bio.org/

Suggestions or bug reports on the list sometimes get lost in the  
shuffle, esp. since we're planning on a new developer release soon.

Chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Oct 29 13:16:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Oct 2006 12:16:37 -0600
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
In-Reply-To: <4544E838.7090400@sheffield.ac.uk>
References: <45421658.5000103@sheffield.ac.uk>
	<4544E838.7090400@sheffield.ac.uk>
Message-ID: <6D9EAA04-199C-4BDD-AA60-4833BC1CE250@uiuc.edu>


On Oct 29, 2006, at 11:43 AM, Nathan S. Haigh wrote:

> Sorry for the repeat post but I haven't had a response. Just  
> wondered if
> anyone had any idea about this?
>
> Thanks
> Nath

...

I think Warnock's dilemma applies here.  Likely no one is really sure, hence  
they aren't answering.  It probably bears investigating by submitting  
and tracking as a bug.  My guess is something isn't garbage-collected  
properly (i.e. there are circular references present), leading to a  
memory leak.
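
To illustrate what I mean by circular references (this is only a guess
at the cause, not a diagnosis of Bio::Restriction::Analysis), here is a
small core-Perl sketch. My::Node is a hypothetical class invented for
the example. It shows that two objects pointing at each other are not
freed when they go out of scope, and that weakening one of the
references with Scalar::Util lets them be freed.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Counts how many My::Node objects have been destroyed so far.
my $destroyed = 0;

{
    # Hypothetical class, used only for this demonstration.
    package My::Node;
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub DESTROY { $destroyed++ }
}

# Strong cycle: parent and child point at each other, so neither
# reference count reaches zero when the lexicals go out of scope.
{
    my $parent = My::Node->new(name => 'parent');
    my $child  = My::Node->new(name => 'child', parent => $parent);
    $parent->{child} = $child;
}
my $after_strong = $destroyed;
print "after strong cycle:   $after_strong destroyed\n";

# Same structure, but the back-reference is weakened, so both objects
# are freed as soon as the block ends.
{
    my $parent = My::Node->new(name => 'parent');
    my $child  = My::Node->new(name => 'child', parent => $parent);
    $parent->{child} = $child;
    weaken($child->{parent});
}
my $after_weak = $destroyed;
print "after weakened cycle: $after_weak destroyed\n";
```

If a leak of this kind were present, memory would keep growing even
after the analysis object goes out of scope, which matches the symptom
described but does not prove it.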

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From chhalling at alumni.ls.berkeley.edu  Sun Oct 29 14:16:36 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Sun, 29 Oct 2006 14:16:36 -0500
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
In-Reply-To: <4544E838.7090400@sheffield.ac.uk>
References: <45421658.5000103@sheffield.ac.uk>
	<4544E838.7090400@sheffield.ac.uk>
Message-ID: <4544FE14.7030701@alumni.ls.berkeley.edu>

Nathan S. Haigh wrote:
> Sorry for the repeat post but I haven't had a response. Just wondered if 
> anyone had any idea about this?
>
> Thanks
> Nath
>
> Nathan S. Haigh wrote:
>   
>> As you may be aware by now, I'm working with Bio::Restriction::Analysis
>> and friends.
>>
>> I'm doing restriction analysis on large sequences - chromosomes. I need
>> to identify an appropriate enzyme based on the total length of fragments
>> that are of a certain size (e.g. 100 - 500 bp). However, the amount of
>> memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I
>> have the following code (bottom) which downloads 2 thaliana chromosomes
>> (mito and chloro - so pretty small) and runs an analysis and then loops
>> through the fragments for all enzymes in the default collection.
>>
>> My memory usage just keeps on climbing and none seems to get freed up
>> even when a $ra goes out of scope (start dealing with the next
>> sequence). Is this a memory leak of some sort, is there a way to free up
>> memory as I go? I'd appreciate any help/advice on how to reduce the
>> amount of memory being consumed as I'd like to use all the thaliana
>> chromosomes (not just mito and chloro), which at the moment probably
>> won't work.
>>
>> Cheers
>> Nath
>>
>> use strict;
>> use Bio::DB::GenBank;
>> use Bio::Restriction::Analysis;
>> use Bio::Restriction::EnzymeCollection;
>>
>> my @seq_objs;
>> my @gis = ( 7525012,  26556996 );
>>
>> my $db = Bio::DB::GenBank->new(-format => "fasta");
>> foreach my $gi (@gis) {
>>   print "Getting GI: $gi\n";
>>   push @seq_objs, $db->get_Seq_by_id($gi)
>> }
>>
>> my $min_fragment_size = 100;
>> my $max_fragment_size = 500;
>> my $enz_Coll = Bio::Restriction::EnzymeCollection->new();
>>
>> foreach my $seq (@seq_objs) {
>>   my $tot_size = 0;
>>   print "Processing ", $seq->primary_id,"\n";
>>   my $ra = Bio::Restriction::Analysis->new(
>>                                          -seq=>$seq,
>>                                          -enzymes=>$enz_Coll,
>>   );
>>  
>>   my @all_enzymes = $ra->cutters->each_enzyme;
>>   print "  Calc total length of fragments in range: $min_fragment_size - $max_fragment_size\n";
>>   foreach my $enzyme ( @all_enzymes ) {
>>     # fragments() is a real memory hog
>>     foreach my $frag ($ra->fragments($enzyme)) {
>>       next if $min_fragment_size && (length $frag < $min_fragment_size);
>>       next if $max_fragment_size && (length $frag > $max_fragment_size);
>>       $tot_size += length $frag;
>>     }
>>     # do something based on value of $tot_size
>>     #print "    ", $enzyme->name, " total = $tot_size\n";
>>   }
>>   print "DONE\n";
>> }
>>
>>     
Try this code, which creates a new Bio::Restriction::Analysis object for 
each digest. On my PowerBook, this doesn't use more than 13 MB of memory.

Reading the code for Bio::Restriction::Analysis reveals that the 
fragments() method calls the cut() method. The documentation for the cut 
method states:

Note: cut doesn't now re-initialize everything before figuring out
cuts. This is so that you can do multiple digests, or add more data or
whatever. You'll have to use new to reset everything.

This means there is no memory leak; it's just that the 
Bio::Restriction::Analysis object is retaining cut information for each 
enzyme, which takes a lot of memory.

use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012,  26556996 );

my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
  print "Getting GI: $gi\n";
  push @seq_objs, $db->get_Seq_by_id($gi)
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
  print "Processing ", $seq->primary_id, "\n";
  foreach my $enzyme ( $enz_Coll->each_enzyme() ) {
    my $ra = Bio::Restriction::Analysis->new(
      -seq => $seq,
      -enzymes => $enzyme );
    my $tot_size = 0;
 
    print "  Calc total length of fragments in range: $min_fragment_size -" .
      " $max_fragment_size\n";

    foreach my $frag ($ra->fragments($enzyme)) {
      next if $min_fragment_size && (length $frag < $min_fragment_size);
      next if $max_fragment_size && (length $frag > $max_fragment_size);
      $tot_size += length $frag;
    }
    # do something based on value of $tot_size
    print "    ", $enzyme->name, " total = $tot_size\n";
  }
  print "DONE\n";
}

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From n.haigh at sheffield.ac.uk  Mon Oct 30 03:51:49 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 30 Oct 2006 08:51:49 +0000
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
Message-ID: <4545BD25.3030107@sheffield.ac.uk>

In my script I retrieve sequences from GenBank in FASTA format by GI
numbers and optionally store the sequence in a cache using
Bio::DB::Fasta. On subsequent runs of the script, the cache is first
checked for the GI and returns the sequence if it is found or the
sequence is obtained from GenBank as above.

I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have
returned a Bio::Seq object but rather it returns a Bio::PrimarySeq
object which is defined within the Bio::DB::Fasta file. This is
annoying, since $seq_obj in my script would be either a Bio::Seq if it
was obtained from GenBank or a Bio::PrimarySeq if obtained from the
cache and calling primary_id() on it doesn't do the expected thing with
Bio::PrimarySeq:
ID:          Bio::PrimarySeq::Fasta=HASH(0x89b4508)

Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object?

Nath


From yuhki at ncifcrf.gov  Mon Oct 30 08:57:35 2006
From: yuhki at ncifcrf.gov (Naoya Yuhki)
Date: Mon, 30 Oct 2006 08:57:35 -0500
Subject: [Bioperl-l] bptutorial.pl 0
Message-ID: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov>

Hello,
I run

perl bptutorial.pl 0

and I got the following error.

-------------------- WARNING ---------------------
MSG: id (ROA1_HUMAN) does not exist
---------------------------------------------------
Can't call method "display_id" on an undefined value at bptutorial.pl  
line 3945.

other tests all worked.

I would appreciate any suggestions.

NAOYA YUHKI.


From cjfields at uiuc.edu  Mon Oct 30 12:42:21 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 30 Oct 2006 11:42:21 -0600
Subject: [Bioperl-l] bptutorial.pl 0
In-Reply-To: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov>
Message-ID: <000601c6fc4a$c3e43450$15327e82@pyrimidine>

> Hello,
> I run
> 
> perl bptutorial.pl 0
> 
> and I got the following error.
> 
> -------------------- WARNING ---------------------
> MSG: id (ROA1_HUMAN) does not exist
> ---------------------------------------------------
> Can't call method "display_id" on an undefined value at bptutorial.pl
> line 3945. 
> 
> other tests all worked.
> 
> I thank any suggestions from you.
> 
> NAOYA YUHKI.

What version of Bioperl are you running?  

As a warning, the bptutorial.pl script has been removed from CVS and will
not be included in future versions of Bioperl.  It can be found on the
bioperl wiki instead:

http://www.bioperl.org/wiki/Bptutorial

chris


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Mon Oct 30 13:08:15 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 30 Oct 2006 10:08:15 -0800
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <29F47393-D134-4093-8751-E948BF521843@bioperl.org>

Bio::PrimarySeq makes sense because Fasta databases only provide  
sequences without features.  But you are actually getting a  
Bio::PrimarySeq::Fasta object which is a proxy object since the  
module won't pull a whole sequence into memory unless seq() is  
requested.

The problem is really why you are getting something useless set for  
primary_id.

What do you want it to be - the GI number?  You'll need to explicitly
set it because DB::Fasta has no concept of GI numbers encoded in the
header line.
AFAIK you also cannot set the primary_id to a value of your liking
because this is a proxy object.  The best bet is to create a Bio::Seq
object out of one of these and set the primary_id and display_id to
values that you can compute from the display_id.

At least that has been my strategy when using this - maybe someone
wants to code something new into the object itself.
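
As a rough sketch of the "compute it from the display_id" step, assuming
the cached FASTA headers use the NCBI-style "gi|NUMBER|..." layout: the
helper name gi_from_display_id and the example header are made up for
illustration (the accession part is a placeholder, not a real record).

```perl
use strict;
use warnings;

# Hypothetical helper: pull the GI number out of an NCBI-style FASTA
# identifier such as "gi|7525012|ref|PLACEHOLDER.1|".
# Returns undef when the identifier carries no GI.
sub gi_from_display_id {
    my ($display_id) = @_;
    return undef unless defined $display_id;
    return $display_id =~ /(?:^|\|)gi\|(\d+)/ ? $1 : undef;
}

# The first identifier is an invented example header; the second has no GI.
for my $id ('gi|7525012|ref|PLACEHOLDER.1|', 'ROA1_HUMAN') {
    my $gi = gi_from_display_id($id);
    printf "%-32s => %s\n", $id, defined $gi ? $gi : '(no GI found)';
}
```

The value returned could then be passed as -primary_id (alongside -seq
and -display_id) when constructing a new Bio::Seq from the proxy
object's data, so that sequences from the cache and from GenBank behave
the same way downstream.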

-jason
On Oct 30, 2006, at 12:51 AM, Nathan S. Haigh wrote:

> In my script I retrieve sequences from GenBank in FASTA format by GI
> numbers and optionally store the sequence in a cache using
> Bio::DB::Fasta. On subsequent runs of the script, the cache is first
> checked for the GI and returns the sequence if it is found or the
> sequence is obtained from GenBank as above.
>
> I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have
> returned a Bio::Seq object but rather it returns a Bio::PrimarySeq
> object which is defined within the Bio::DB::Fasta file. This is
> annoying, since $seq_obj in my script would be either a Bio::Seq if it
> was obtained from GenBank or a Bio::PrimarySeq if obtained from the
> cache and calling primary_id() on it doesn't do the expected thing  
> with
> Bio::PrimarySeq:
> ID:          Bio::PrimarySeq::Fasta=HASH(0x89b4508)
>
> Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object?
>
> Nath

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From golharam at umdnj.edu  Mon Oct 30 15:11:51 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 15:11:51 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String?
Message-ID: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>

I'm trying to parse some blast output w/o actually creating the output
file.  Instead, I'm capturing the output in a variable and would like to
use IO::String to represent the file:

	$_ = `megablast -d somedatabase -i somesequence -D 2`;
	my $blast_file = new IO::String($_);
	my $searchio = new Bio::SearchIO(-format => 'blast',
	                                 -fh     => $blast_file);
	my $results = $searchio->next_result;
	my $hit = $results->next_hit;
	if (! defined($hit)) {
		warn "No BLAST hit for $accession on chr $chr for " .
		     "Seq/$orth_id/$organism\n\n";
		return;
	}

Now, when Bio::SearchIO tries to read the output line by line, instead
it reads the entire output as 1 line.

If I provide the output in a file and use:

	my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
'/tmp/somefile.blast');

This works...so is it possible to use IO::String to provide
Bio::SearchIO with BLAST output?  

Ryan


From golharam at umdnj.edu  Mon Oct 30 15:54:29 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 15:54:29 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com>
Message-ID: <00e801c6fc65$9849aee0$e6028a0a@GOLHARMOBILE1>

Thanks.  How are you getting the output?  system()?  BTW- I'm using
v1.5.1...


> -----Original Message-----
> From: Bernd Web [mailto:bernd.web at gmail.com] 
> Sent: Monday, October 30, 2006 3:45 PM
> To: golharam at umdnj.edu
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] Is it possible to parse BLAST output 
> using IO:String?
> 
> 
> Hi Ryan,
> 
> I parse blastn output using IO::String w/o problems:
> 
>  my $stringfh = new IO::String($input);
>  my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh);
> 
> however, this input does not come via backticks.
> 
> 
> bernd
> 
> On 10/30/06, Ryan Golhar <golharam at umdnj.edu> wrote:
> > I'm trying to parse some blast output w/o actually creating 
> the output 
> > file.  Instead, I'm capturing the output in a variable and 
> would like 
> > to use IO::String to represent the file:
> >
> >         $_ = `megablast -d somedatabase -i somesequence -D 2`;
> >         my $blast_file = new IO::String($_);
> >         my $searchio = new Bio::SearchIO(-format => 'blast', -fh => 
> > $blast_file);
> >         my $results = $searchio->next_result;
> >         my $hit = $results->next_hit;
> >         if (! defined($hit)) {
> >                 warn "No BLAST hit for $accession on chr $chr for 
> > Seq/$orth_id/$organism\n\n";
> >                 return;
> >         }
> >
> > Now, when Bio::SearchIO tries to read the output line by 
> line, instead 
> > it reads the entire output as 1 line.
> >
> > If I provide the output in a file and use:
> >
> >         my $searchio = new Bio::SearchIO(-format => 
> 'blast', -file => 
> > '/tmp/somefile.blast');
> >
> > This works...so is it possible to use IO::String to provide 
> > Bio::SearchIO with BLAST output?
> >
> > Ryan
> >
> >
> 


From bix at sendu.me.uk  Mon Oct 30 16:27:58 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 30 Oct 2006 21:27:58 +0000
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
Message-ID: <45466E5E.9000504@sendu.me.uk>

Ryan Golhar wrote:
> I'm trying to parse some blast output w/o actually creating the output
> file.  Instead, I'm capturing the output in a variable and would like to
> use IO::String to represent the file:
> 
> 	$_ = `megablast -d somedatabase -i somesequence -D 2`;
> 	my $blast_file = new IO::String($_);
> 	my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
> $blast_file);
> 	my $results = $searchio->next_result;
> 	my $hit = $results->next_hit;
> 	if (! defined($hit)) {
> 		warn "No BLAST hit for $accession on chr $chr for
> Seq/$orth_id/$organism\n\n";
> 		return;
> 	}
> 
> Now, when Bio::SearchIO tries to read the output line by line, instead
> it reads the entire output as 1 line.
> 
> If I provide the output in a file and use:
> 
> 	my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
> '/tmp/somefile.blast');
> 
> This works...so is it possible to use IO::String to provide
> Bio::SearchIO with BLAST output?

Why must it be IO::String? Why not just open() your megablast and 
provide $searchio the real filehandle? It would be faster that way as well.

Read the docs for `. Your usage above is inappropriate.
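
For reference, a minimal sketch of the context-dependent behaviour that
perlop documents for backticks. It assumes a Unix-like shell and uses
printf as a stand-in command rather than megablast; the variable names are
illustrative only, and this is not a diagnosis of the code quoted above.

```perl
use strict;
use warnings;

# Stand-in shell command that prints two lines (assumes a Unix-like shell).
my $cmd = q{printf 'one\ntwo\n'};

# Scalar context: the whole output comes back as a single string.
my $all = `$cmd`;

# List context: the output comes back as one element per line.
my @lines = `$cmd`;

printf "scalar context: %d characters in one string\n", length $all;
printf "list context:   %d separate lines\n", scalar @lines;
```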


From golharam at umdnj.edu  Mon Oct 30 16:54:45 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 16:54:45 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <C3209DC5-433B-4BAD-A184-AC9D2A2B4A90@bioperl.org>
Message-ID: <00f901c6fc6e$03916460$e6028a0a@GOLHARMOBILE1>

Hmmm.  Yes, I suppose I could.  
 
I did it with the backticks because I based my code on the "To and From
a String" section of the SeqIO HOWTO...
 

-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason
Stajich
Sent: Monday, October 30, 2006 4:44 PM
To: Sendu Bala
Cc: golharam at umdnj.edu; 'bioperl-l'
Subject: Re: [Bioperl-l] Is it possible to parse BLAST output using
IO:String?


right - can't you just do: 

my $fh;
open($fh, "megablast -d ... | ") || die $!;
my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh);

On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote:


Ryan Golhar wrote:

I'm trying to parse some blast output w/o actually creating the output
file.  Instead, I'm capturing the output in a variable and would like to
use IO::String to represent the file:

$_ = `megablast -d somedatabase -i somesequence -D 2`;
my $blast_file = new IO::String($_);
my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
$blast_file);
my $results = $searchio->next_result;
my $hit = $results->next_hit;
if (! defined($hit)) {
warn "No BLAST hit for $accession on chr $chr for
Seq/$orth_id/$organism\n\n";
return;
}

Now, when Bio::SearchIO tries to read the output line by line, instead
it reads the entire output as 1 line.

If I provide the output in a file and use:

my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
'/tmp/somefile.blast');

This works...so is it possible to use IO::String to provide
Bio::SearchIO with BLAST output?


Why must it be IO::String? Why not just open() your megablast and 
provide $searchio the real filehandle? It would be faster that way as
well.

Read the docs for `. Your usage above is inappropriate.




--
Jason Stajich, PhD 
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From bernd.web at gmail.com  Mon Oct 30 15:44:31 2006
From: bernd.web at gmail.com (Bernd Web)
Date: Mon, 30 Oct 2006 21:44:31 +0100
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
Message-ID: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com>

Hi Ryan,

I parse blastn output using IO::String w/o problems:

 my $stringfh = new IO::String($input);
 my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh);

however, this input does not come via backticks.


bernd

On 10/30/06, Ryan Golhar <golharam at umdnj.edu> wrote:
> I'm trying to parse some blast output w/o actually creating the output
> file.  Instead, I'm capturing the output in a variable and would like to
> use IO::String to represent the file:
>
>         $_ = `megablast -d somedatabase -i somesequence -D 2`;
>         my $blast_file = new IO::String($_);
>         my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
> $blast_file);
>         my $results = $searchio->next_result;
>         my $hit = $results->next_hit;
>         if (! defined($hit)) {
>                 warn "No BLAST hit for $accession on chr $chr for
> Seq/$orth_id/$organism\n\n";
>                 return;
>         }
>
> Now, when Bio::SearchIO tries to read the output line by line, instead
> it reads the entire output as 1 line.
>
> If I provide the output in a file and use:
>
>         my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
> '/tmp/somefile.blast');
>
> This works...so is it possible to use IO::String to provide
> Bio::SearchIO with BLAST output?
>
> Ryan
>
>


From jason at bioperl.org  Mon Oct 30 16:44:18 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 30 Oct 2006 13:44:18 -0800
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <45466E5E.9000504@sendu.me.uk>
References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
	<45466E5E.9000504@sendu.me.uk>
Message-ID: <C3209DC5-433B-4BAD-A184-AC9D2A2B4A90@bioperl.org>

right - can't you just do:

my $fh;
open($fh, "megablast -d ... | ") || die $!;
my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh);
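
A minimal sketch of the same idea using the list form of a piped open. The
running perl ($^X) stands in for megablast so the example runs without
BLAST installed; the stand-in command and the variable names are
assumptions, not part of the code above.

```perl
use strict;
use warnings;

# Stand-in command: the running perl prints three lines. In the thread's
# case this would be megablast and its arguments.
my @cmd = ($^X, '-e', 'print "line $_\n" for 1..3');

# List-form piped open: no shell is involved, and the output is read as
# a stream instead of being held in one scalar first.
open(my $fh, '-|', @cmd) or die "Cannot run $cmd[0]: $!";

my @lines = <$fh>;
close($fh) or die "Command failed: exit status $?";

printf "read %d lines\n", scalar @lines;
```

Instead of slurping the lines, the open handle could be passed as -fh to
Bio::SearchIO->new, as in the snippet above.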

On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote:

> Ryan Golhar wrote:
>> I'm trying to parse some blast output w/o actually creating the  
>> output
>> file.  Instead, I'm capturing the output in a variable and would  
>> like to
>> use IO::String to represent the file:
>>
>> 	$_ = `megablast -d somedatabase -i somesequence -D 2`;
>> 	my $blast_file = new IO::String($_);
>> 	my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
>> $blast_file);
>> 	my $results = $searchio->next_result;
>> 	my $hit = $results->next_hit;
>> 	if (! defined($hit)) {
>> 		warn "No BLAST hit for $accession on chr $chr for
>> Seq/$orth_id/$organism\n\n";
>> 		return;
>> 	}
>>
>> Now, when Bio::SearchIO tries to read the output line by line,  
>> instead
>> it reads the entire output as 1 line.
>>
>> If I provide the output in a file and use:
>>
>> 	my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
>> '/tmp/somefile.blast');
>>
>> This works...so is it possible to use IO::String to provide
>> Bio::SearchIO with BLAST output?
>
> Why must it be IO::String? Why not just open() your megablast and
> provide $searchio the real filehandle? It would be faster that way  
> as well.
>
> Read the docs for `. Your usage above is inappropriate.
>
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From lstein at cshl.edu  Mon Oct 30 13:59:29 2006
From: lstein at cshl.edu (Lincoln Stein)
Date: Mon, 30 Oct 2006 13:59:29 -0500
Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase
Message-ID: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>

Hi All,

I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not
to validate. I have committed a new version to live and to the release
candidate branch. I hope it isn't too late to get this into the release.

Lincoln

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From huangyi1 at hkusua.hku.hk  Tue Oct 31 00:46:20 2006
From: huangyi1 at hkusua.hku.hk (Huang Yi)
Date: Tue, 31 Oct 2006 13:46:20 +0800
Subject: [Bioperl-l] bioperl1.5 and GD2.35
Message-ID: <200610310546.k9V5kQGT010481@hkusua.hku.hk>

Hi,

 
I just installed bioperl 1.4 from CPAN on my Gentoo Linux computer, but the
installation failed. I had to force the install.

 
However, the GD module couldn't be installed for some unknown reasons.

 
I therefore used Gentoo's "emerge" tool to get bioperl and GD again. Both
installed fine. bioperl was upgraded to 1.5 and GD is at 2.35.

 
However, when I tested it by using the program in HOWTO wiki page
(http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me:

 
Can't locate object method "png" via package "GD::Image" at
/usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line 799, <> line 9.

 
On my other computer, bioperl 1.4 and GD 2.34 work fine. I therefore want to
remove the CPAN bioperl from the system and re-install it, but that seems to
be impossible.

 
Could you please give me some advice on how to get GD and bioperl working?

 
Thanks!

 
Huang Yi

 
From bix at sendu.me.uk  Tue Oct 31 03:20:21 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 31 Oct 2006 08:20:21 +0000
Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase
In-Reply-To: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>
References: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>
Message-ID: <45470745.1050605@sendu.me.uk>

Lincoln Stein wrote:
> Hi All,
> 
> I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not
> to validate. I have committed a new version to live and to the release
> candidate branch. I hope it isn't too late to get this into the release.

It isn't too late, thank you.


From avilella at gmail.com  Tue Oct 31 08:54:39 2006
From: avilella at gmail.com (Albert Vilella)
Date: Tue, 31 Oct 2006 13:54:39 +0000
Subject: [Bioperl-l] catfile and catdir
Message-ID: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>

Hi,

I was testing bioperl-run/t/PAML.t and stumbled upon this
catdir/catfile error:

Can't locate object method "catdir" via package "Bio::Root::IO" at
/home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
113.
BEGIN failed--compilation aborted at
/home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
143.
Compilation failed in require at t/PAML.t line 64.
BEGIN failed--compilation aborted at t/PAML.t line 64.

Should we be using File::Spec for catdir and catfile instead of Root::IO?

Cheers,

    Albert.


From Kevin.M.Brown at asu.edu  Tue Oct 31 10:34:34 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Tue, 31 Oct 2006 08:34:34 -0700
Subject: [Bioperl-l] bioperl1.5 and GD2.35
Message-ID: <1A4207F8295607498283FE9E93B775B4023B5F3C@EX02.asurite.ad.asu.edu>

Not really a Bioperl issue per se, but sounds like when you had Gentoo
emerge GD it didn't include libpng and so didn't build the needed parts
to create PNG type graphics. 
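
A quick way to check this might look like the sketch below. It assumes
that a GD build without PNG support simply lacks the png method on
GD::Image, which is what the error message suggests; the variable names
are illustrative only.

```perl
use strict;
use warnings;

# Try to load GD without dying if it is not installed at all.
my $has_gd = eval { require GD; 1 } ? 1 : 0;

# Ask whether GD::Image provides a png method (only if GD loaded).
my $has_png = ($has_gd && GD::Image->can('png')) ? 1 : 0;

print "GD loaded:      ", ($has_gd  ? 'yes' : 'no'), "\n";
print "GD::Image->png: ", ($has_png ? 'yes' : 'no'), "\n";
```

If the second line reports "no" while GD itself loads, rebuilding GD with
PNG support enabled would be the thing to try.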

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Huang Yi
> Sent: Monday, October 30, 2006 10:46 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] bioperl1.5 and GD2.35
> 
> Hi,
> 
>  
> 
> I just installed bioperl 1.4 from CPAN to my Gentoo linux 
> computer. But the
> installation was failed. I had to install by force.
> 
>  
> 
> However, the GD module couldn't be installed for some unknown reasons.
> 
>  
> 
> I therefore use "emerge" tool of Gentoo to get bioperl and GD 
> again. They
> are fine. The version of bioperl became upgrade to1.5 and GD was 2.35.
> 
>  
> 
> However, when I tested it by using the program in HOWTO wiki page
> (http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me:
> 
>  
> 
> Can't locate object method "png" via package "GD::Image" at
> /usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line 
> 799, <> line 9.
> 
>  
> 
> In my other computer, bioperl1.4 and GD2.34 work fine. I 
> therefore want to
> remove the CPAN bioperl from the system and re-install it, 
> but it seems to
> be impossible.
> 
>  
> 
> Would you please give me some advices on how to let my GD and 
> bioperl work. 
> 
>  
> 
> Thanks!
> 
>  
> 
> Huang Yi
> 
>  
> 
> 


From hlapp at gmx.net  Tue Oct 31 11:21:40 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 11:21:40 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
	<24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
Message-ID: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>


On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:

> BTW, was that supposed to be Bio::AnnotatableI, or  
> Bio::AnnotationHolderI?

Sorry, the former. I guess I got confused with FeatureHolders. Too  
bad Featureable isn't an English word.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Tue Oct 31 12:01:44 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:01:44 -0500
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net>

The only thing I would add to Jason's reply is that it is easy to do

	if (! $seq->isa("Bio::SeqI")) {
		my $bioseq = Bio::Seq->new();
		$bioseq->primary_seq($seq);
		$seq = $bioseq;
	}

and from that point on all your objects are Bio::SeqI compliant  
regardless of whether they were obtained that way or not.

Aside from that I wonder why there isn't a -primary_seq option in  
Bio::Seq::new - this would shorten the above into a (more perl'ish)  
single line:

	$seq = Bio::Seq->new(-primary_seq=>$seq) unless $seq->isa("Bio::SeqI");

Any takers to add that capability?

-hilmar

On Oct 30, 2006, at 3:51 AM, Nathan S. Haigh wrote:

> In my script I retrieve sequences from GenBank in FASTA format by GI
> numbers and optionally store the sequence in a cache using
> Bio::DB::Fasta. On subsequent runs of the script, the cache is first
> checked for the GI and returns the sequence if it is found or the
> sequence is obtained from GenBank as above.
>
> I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have
> returned a Bio::Seq object but rather it returns a Bio::PrimarySeq
> object which is defined within the Bio::DB::Fasta file. This is
> annoying, since $seq_obj in my script would be either a Bio::Seq if it
> was obtained from GenBank or a Bio::PrimarySeq if obtained from the
> cache and calling primary_id() on it doesn't do the expected thing  
> with
> Bio::PrimarySeq:
> ID:          Bio::PrimarySeq::Fasta=HASH(0x89b4508)
>
> Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object?
>
> Nath
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct 31 12:08:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 11:08:56 -0600
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
Message-ID: <001401c6fd0f$4239aa50$15327e82@pyrimidine>

>> BTW, was that supposed to be Bio::AnnotatableI, or
>> Bio::AnnotationHolderI?
> 
> Sorry, the former. I guess I got confused with
> FeatureHolders. Too bad Featureable isn't an English word.
> 
> 	-hilmar

Having SimpleAlign be AnnotatableI shouldn't be too much of a burden, since
the only additional implemented method is annotation().  So, I think all the
various Stockholm tags can be placed somewhere.

A bit OT: were we planning on getting rid of the various *_tag_* methods in
AnnotatableI at some point?  I'm a bit confused as to why they were added.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Tue Oct 31 12:09:26 2006
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 31 Oct 2006 09:09:26 -0800
Subject: [Bioperl-l] catfile and catdir
In-Reply-To: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>
References: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>
Message-ID: <1AD4DB38-E08D-4E47-8A59-6539068474CB@bioperl.org>

Yep.  Unless we want this to also exist in Root::IO and delegate to  
File::Spec.
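
For illustration, a minimal sketch of calling File::Spec directly. The
directory and file names are made up for the example and are not the
ones used in Yn00.pm.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical path components, for illustration only.
my $dir  = File::Spec->catdir('data', 'paml');
my $file = File::Spec->catfile($dir, 'yn00.ctl');

# Both calls join components with the separator of the current platform.
print "dir:  $dir\n";
print "file: $file\n";
```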

-jason
On Oct 31, 2006, at 5:54 AM, Albert Vilella wrote:

> Hi,
>
> I was testing the bioperl-run/t/PAML.t and stumbled upon this a
> catdir/catfile error:
>
> Can't locate object method "catdir" via package "Bio::Root::IO" at
> /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
> 113.
> BEGIN failed--compilation aborted at
> /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
> 143.
> Compilation failed in require at t/PAML.t line 64.
> BEGIN failed--compilation aborted at t/PAML.t line 64.
>
> Should be be using File::Spec for catdir and catfile instead of  
> Root::IO?
>
> Cheers,
>
>     Albert.

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From jason at bioperl.org  Tue Oct 31 12:10:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 31 Oct 2006 09:10:51 -0800
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
	<24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
	<8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
Message-ID: <65F92B54-33FD-4D8F-90B7-49E2697CDBA2@bioperl.org>

It just needs to have an annotation collection - so it would be  
Bio::AnnotatableI

On Oct 31, 2006, at 8:21 AM, Hilmar Lapp wrote:

>
> On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:
>
>> BTW, was that supposed to be Bio::AnnotatableI, or
>> Bio::AnnotationHolderI?
>
> Sorry, the former. I guess I got confused with FeatureHolders. Too
> bad Featureable isn't an English word.
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From hlapp at gmx.net  Tue Oct 31 12:44:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:44:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <C16CF3EE.B1A9%bosborne11@verizon.net>
References: <C16CF3EE.B1A9%bosborne11@verizon.net>
Message-ID: <ACF19E78-7FC3-42BE-8F41-86C45C710F4B@gmx.net>

Well isn't this a result of conflating some of the SeqFeatureI  
methods into the annotation collection?

If I'm not mistaken on this then those methods were introduced in  
1.5.0 and hence can go away without deprecation.

	-hilmar

On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote:

> Chris,
>
> I don't think the intent was to remove the methods, rather we'd  
> just call
> deprecated(). Example from AnnotatableI:
>
> sub remove_tag {
>   my ($self, @args) = @_;
>
>   #uncomment in 1.6
>   #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');
>
>   return $self->annotation->remove_Annotations(@args);
> }
>
> With regards to "why", I can't reconstruct the entire rationale  
> myself but I
> can say that the newer names make more sense. Take that example  
> above - it's
> function is to remove entire Annotations not just to remove tags, so
> remove_Annotations is a better name.
>
> Brian O.
>
>
> On 10/31/06 1:08 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:
>
>> A bit OT: were we planning on getting rid of the various *_tag_*  
>> methods in
>> AnnotatableI at some point?  I'm a bit confused as to why they  
>> were added.
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bosborne11 at verizon.net  Tue Oct 31 11:37:01 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 31 Oct 2006 12:37:01 -0400
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <001401c6fd0f$4239aa50$15327e82@pyrimidine>
Message-ID: <C16CF3EE.B1A9%bosborne11@verizon.net>

Chris,

I don't think the intent was to remove the methods, rather we'd just call
deprecated(). Example from AnnotatableI:

sub remove_tag {
  my ($self, @args) = @_;

  #uncomment in 1.6
  #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');

  return $self->annotation->remove_Annotations(@args);
}

With regard to "why", I can't reconstruct the entire rationale myself, but I
can say that the newer names make more sense. Take the example above: its
function is to remove entire Annotations, not just tags, so
remove_Annotations is a better name.
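
The general pattern of an old name that warns and delegates to the new one
can be sketched in plain Perl as follows. My::Annotatable is a hypothetical
stand-in class, and Carp's carp is used here in place of the
deprecated() call from the real code; none of this is the actual
AnnotatableI implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; not the real Bio::AnnotatableI.
package My::Annotatable;
use Carp qw(carp);

sub new { my ($class) = @_; return bless { removed => [] }, $class }

# Preferred, newer-style method name.
sub remove_Annotations {
    my ($self, @keys) = @_;
    push @{ $self->{removed} }, @keys;
    return scalar @keys;
}

# Old name kept as a thin wrapper: warn, then delegate.
sub remove_tag {
    my ($self, @args) = @_;
    carp 'remove_tag() is deprecated, use remove_Annotations()';
    return $self->remove_Annotations(@args);
}

package main;

my $obj     = My::Annotatable->new;
my $removed = $obj->remove_tag('comment', 'dblink');
print "removed $removed keys via the old name\n";
```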

Brian O.


On 10/31/06 1:08 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> A bit OT: were we planning on getting rid of the various *_tag_* methods in
> AnnotatableI at some point?  I'm a bit confused as to why they were added.


From cjfields at uiuc.edu  Tue Oct 31 13:44:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 12:44:02 -0600
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <ACF19E78-7FC3-42BE-8F41-86C45C710F4B@gmx.net>
Message-ID: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>

Hilmar Lapp wrote:
> Well isn't this a result of conflating some of the
> SeqFeatureI methods into the annotation collection?
> 
> If I'm not mistaken on this then those methods were
> introduced in 1.5.0 and hence can go away without deprecation.
> 
> 	-hilmar
> 
> On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote:
> 
>> Chris,
>> 
>> I don't think the intent was to remove the methods, rather we'd just
>> call deprecated(). Example from AnnotatableI:
>> 
>> sub remove_tag {
>>   my ($self, @args) = @_;
>> 
>>   #uncomment in 1.6
>>   #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');
>> 
>>   return $self->annotation->remove_Annotations(@args);
>> }
>> 
>> With regards to "why", I can't reconstruct the entire rationale
>> myself but I can say that the newer names make more sense. Take that
>> example above - it's function is to remove entire Annotations not
>> just to remove tags, so remove_Annotations is a better name.
>> 
>> Brian O.
>> 
>> 
>> On 10/31/06 1:08 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:
>> 
>>> A bit OT: were we planning on getting rid of the various *_tag_*
>>> methods in AnnotatableI at some point?  I'm a bit confused as to why
>>> they were added.

Sorry Brian, what I meant was, based on CVS history, the various *tag*
methods in AnnotatableI were added all at once, with deprecations already
present in the commit.  So the methods weren't there to begin with, then
added only to be deprecated later?  Hence the confusion...

I think Hilmar's right; the CVS history indicates these were added just
prior to rel. 1.5 by Allen and seem to be related to SeqFeatureI.  I'm sure
the intent was good, but they contradict methods in the Feature/Annotation
HOWTO on retrieving Annotation objects via the Annotation::Collection
object.  I think that agrees with your point about the various Annotation*
method names being the more appropriate ones.  

Does everybody agree we should just remove them?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 31 13:53:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 12:53:16 -0600
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net>
Message-ID: <000001c6fd1d$d4359c80$15327e82@pyrimidine>


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp
> Sent: Tuesday, October 31, 2006 11:02 AM
> To: n.haigh at sheffield.ac.uk
> Cc: Bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
> 
> The only thing I would add to Jason's reply is that it is easy to do
> 
> 	if (! $seq->isa("Bio::SeqI")) {
> 		my $bioseq = Bio::Seq->new();
> 		$bioseq->primary_seq($seq);
> 		$seq = $bioseq;
> 	}
> 
> and from that point on all your objects are Bio::SeqI 
> compliant regardless of whether they were obtained that way or not.
> 
> Aside from that I wonder why there isn't a -primary_seq 
> option in Bio::Seq::new - this would shorten the above into a 
> (more perl'ish) single line:
> 
> 	$seq = Bio::Seq->new(-primary_seq=>$seq) unless 
> $seq->isa("Bio::SeqI");
> 
> Anyone takers to add that capability?
> 
> -hilmar

Sounds good to me!

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
From nhansen at nhgri.nih.gov  Tue Oct 31 14:51:23 2006
From: nhansen at nhgri.nih.gov (Nancy Hansen)
Date: Tue, 31 Oct 2006 14:51:23 -0500 (EST)
Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling
Message-ID: <Pine.GSO.4.58.0610311438470.17750@stout.nhgri.nih.gov>


Hello,

	As sequencing centers begin to deposit trace data from "Medical
Sequencing" projects into the public archives, there is now the need to
"anonymize" sequence trace files by removing embedded information which
might be used to identify the individual who was the original source of
the DNA being sequenced.

	I was hoping I might be able to use Bio::SeqIO to manipulate the
comments contained in an SCF-formatted trace file, but I'm finding that
Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information.
Since SCF is a widely-accepted standard for trace files, would it be
reasonable to include fields like "scf_comments" and "scf_header" in a
Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them?
Likewise, it would be great if write_seq could pull these values right
from a SequenceTrace object rather than requiring them as arguments.

	I'd be happy to help in this effort if necessary.

	Thanks,
	--Nancy

*************************************
Nancy F. Hansen, PhD	nhansen at nhgri.nih.gov
Bioinformatics Group
NIH Intramural Sequencing Center (NISC)
5625 Fishers Lane
Rockville, MD 20852
Phone: (301) 435-1560	Fax: (301) 435-6170


From lincoln.stein at gmail.com  Tue Oct 31 15:24:17 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 31 Oct 2006 15:24:17 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <000001c6f78b$d1c65a30$15327e82@pyrimidine>
References: <453E309B.9090007@sendu.me.uk>
	<000001c6f78b$d1c65a30$15327e82@pyrimidine>
Message-ID: <6dce9a0b0610311224x79256b29sf102eb5c35865caf@mail.gmail.com>

Are you going to go ahead with 1.52_XX ? If so, I will code GBrowse to look
for 1.52 or higher.
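
A minimal sketch of why the choice of numbering matters for a check like
this, using two version strings quoted below. It only shows how Perl and
version.pm order the numbers; the variable names are illustrative and this
is not the actual GBrowse check.

```perl
use strict;
use warnings;
use version;

# Version strings quoted in this thread.
my $old_style = '1.4';        # existing CPAN release
my $new_style = '1.005002';   # proposed perl-style number for 1.5.2

# Compared as plain numbers, the perl-style number sorts below 1.4.
my $sorts_lower = ($new_style < $old_style) ? 1 : 0;

# version.pm treats both as decimal versions and orders them the same way.
my $vcmp = version->parse($new_style) <=> version->parse($old_style);

print "numeric: $new_style < $old_style is ",
      ($sorts_lower ? 'true' : 'false'), "\n";
print "version.pm comparison result: $vcmp\n";
```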

Lincoln

On 10/24/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> ..
> >
> > 'handle'? I think it shows up as '6.2.13' simply because it was uploaded
> > with the filename Perl6-Pugs-6.2.13.tar.gz
>
> Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is
> '6.002013'.  So maybe we should follow a similar convention.  Seems easier
> and less confusing to me, at least.
>
> > As you point out, the code has the kind of $VERSION number we've been
> > suggesting in this thread:
> >
> > > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':
> > >
> > > our $VERSION = 6.002013;
> > >
> > > That's also a very perlish-way to do it.  And there are no developer
> > > versions of Pugs, since it is always under active development.  We
> could
> > try
> > > something like:
> > >
> > > our $VERSION = 1.005002_01;
> >
> > Yes, this was already like one of my suggestions (1.0502_01), but I
> > brought up the concern that 1.05 might be < 1.4.
> >
> > So then we have a question: do we try and fumble a 1.4 compatible number
> > by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if
> > it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no
> > room for RC numbering, or 1.006000010 (1.6.0.10) - the first final
> > release following some 1.006000_001 (1.6.0.01 == rc1) RCs?
>
> I would go for the clean break if it follows perl/CPAN convention.
> '1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing.
>
> If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6
> RC1, 1.6 RC2 etc then that would be consistent and perl-compatible.
>
> BTW, the reason I looked at Pugs was to see what some of the Perl6
> developers were using.  Who knows; they'll probably change it!
>
> ..
>
> > I don't think it would be a hassle; on the contrary it would be very
> > useful to know the CPAN distribution actually works. I'm very happy with
> > the idea that a release candidate gets fully tested...
>
> So you obviously feel strongly about it!  ;>
>
> I don't have a problem as long as we stick with doing this from now on (
> i.e.
> have a consistent versioning scheme, release policy, CPAN release policy,
> etc).  Would be nice for Jason/Brian/Hilmar to chime in as to the
> reasoning
> behind the older versioning scheme.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From hlapp at gmx.net  Tue Oct 31 16:53:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 16:53:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>
References: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>
Message-ID: <F244DEC6-0ADE-437E-9AED-1F864A54F7AD@gmx.net>


On Oct 31, 2006, at 1:44 PM, Chris Fields wrote:

> Does everybody agree we should just remove them?

I wish you could but I'm afraid that would break stuff? Otherwise why  
were they added in the first place? I thought  
Bio::SeqFeature::Annotated needs them maybe?

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct 31 17:41:17 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 16:41:17 -0600
Subject: [Bioperl-l] AnnotatableI tag methods,
	was  Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <F244DEC6-0ADE-437E-9AED-1F864A54F7AD@gmx.net>
Message-ID: <000001c6fd3d$ae37c240$15327e82@pyrimidine>


> On Oct 31, 2006, at 1:44 PM, Chris Fields wrote:
> 
> > Does everybody agree we should just remove them?
> 
> I wish you could but I'm afraid that would break stuff? 
> Otherwise why were they added in the first place? I thought 
> Bio::SeqFeature::Annotated needs them maybe?
> 
> 	-hilmar
> 
> --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Yep, removing them clobbers a ton of tests, including anything that requires
SeqIO::FTHelper.  Looks like SeqFeature::Generic and a few others use them.


I could understand if these were meant to be permanent methods, but why add
these in if they were to be deprecated in 1.6?  Something that was meant to
be a transition but wasn't finished?  That seems to be indicated in the
commented out lines for all the *tag* methods:

  #uncomment in 1.6
  #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
From lincoln.stein at gmail.com  Tue Oct 31 18:18:07 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 31 Oct 2006 18:18:07 -0500
Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning
In-Reply-To: <loom.20061020T041338-193@post.gmane.org>
References: <loom.20061020T041338-193@post.gmane.org>
Message-ID: <6dce9a0b0610311518l3bec852q5d04a9b488621377@mail.gmail.com>

Hi Keith,

The current Bio/DB/GFF/Util/Binning.pm file just contains the hierarchical
binning system that I implemented some time ago. Where is the R-tree system
that you describe? How much of an improvement did the R-tree scheme give
over the hierarchical scheme?

FYI the GFF3 implementation uses a different binning scheme in which there
is a fixed-size bin. Every time a feature overlaps a bin, it creates a new
row in a table. So big features will have multiple rows and little features
that fit inside a bin will have only one row. The query for this is simpler
and seems to give the same relative speedup as the hierarchical binning
system. I'd really like to get these queries to go as fast as possible and
would love to work with you on this if you're interested.
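
As a rough illustration of the fixed-size bin scheme described above, here is
a small in-memory sketch in Python. It is not the actual GFF3 implementation:
the 100 kb bin width, the function names, and the sample features are all
assumptions made up for the example. Each feature gets one "row" per bin it
overlaps, and a query only looks at the bins the region of interest touches.

```python
from collections import defaultdict

BIN_SIZE = 100_000  # assumed bin width in bp; the real value is not given here


def bins_for(start, end, bin_size=BIN_SIZE):
    """Return the indices of every fixed-size bin that [start, end] touches."""
    return range(start // bin_size, end // bin_size + 1)


def build_index(features):
    """Map bin index -> list of (name, start, end), one entry per overlapped bin."""
    index = defaultdict(list)
    for name, start, end in features:
        for b in bins_for(start, end):
            index[b].append((name, start, end))
    return index


def query(index, start, end):
    """Return sorted names of features overlapping [start, end], de-duplicated."""
    hits = set()
    for b in bins_for(start, end):
        for name, f_start, f_end in index.get(b, ()):
            # Bins only narrow the candidates; confirm the true overlap here.
            if f_start <= end and f_end >= start:
                hits.add(name)
    return sorted(hits)


# Hypothetical sample data: one small, one large, and one distant feature.
features = [
    ("small", 150_000, 150_500),
    ("big", 90_000, 320_000),
    ("far", 900_000, 901_000),
]
index = build_index(features)
```

With this data the small feature occupies a single bin while the big one is
stored four times, which is the trade-off mentioned above: more rows for large
features in exchange for a simpler lookup.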

Lincoln

On 10/19/06, Keith Player <keithplayer at hotmail.com> wrote:
>
> I know that there may be some changes resulting from new GFF3
> implementations, but thought I would see if the following is useful anyway.
>
> I implemented the R-tree binning schema as used by
> Bio::DB::GFF::Util::Binning and as mentioned in this article:
>
> I tested the following query on a normal table (no binning), but it assumes
> that you know the longest range in the table.  So for example with a table
> of human genes, where the longest gene we know of is around 2.4Mb.
>
> SELECT COUNT(*) as count FROM groups g
> WHERE g.start > max(0,[start-2.4Mb]) AND g.start < [end]
> AND g.end > [start] AND g.chromosome = '1'
>
> so for 100Mb:101Mb
>
> SELECT COUNT(*) as count FROM groups g
> WHERE g.start > 97600000 AND g.start < 101000000
> AND g.end > 100000000 AND g.chromosome = '1'
>
>
> where [start] and [end] define the region of interest.  This query
> outperforms the R-Tree implementation on all tests that I have performed
> (for lengths of 200bp to 10Mb across a whole chromosome).  Could this be of
> some practical use?
>
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From bosborne11 at verizon.net  Tue Oct 31 21:31:49 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 31 Oct 2006 22:31:49 -0400
Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling
In-Reply-To: <Pine.GSO.4.58.0610311438470.17750@stout.nhgri.nih.gov>
Message-ID: <C16D7F55.B1D9%bosborne11@verizon.net>

Nancy,

It looks like a good place to start would be the get_header() and
_get_header methods in Bio::SeqIO::scf. If you read t/scf.t you can see that
the author, at some point, wanted get_header to return meaningful
information but stepping through the test shows it returning a lot of UNDEF.
Now I don't know if this is due to the method or the source SCF file, but
you might be able to get these methods to work yourself.

But to answer your questions, yes, it certainly sounds reasonable that these
values would be extracted by Bio::SeqIO::scf.

Brian O.


On 10/31/06 3:51 PM, "Nancy Hansen" <nhansen at nhgri.nih.gov> wrote:

> 
> Hello,
> 
> As sequencing centers begin to deposit trace data from "Medical
> Sequencing" projects into the public archives, there is now the need to
> "anonymize" sequence trace files by removing embedded information which
> might be used to identify the individual who was the original source of
> the DNA being sequenced.
> 
> I was hoping I might be able to use Bio::SeqIO to manipulate the
> comments contained in an SCF-formatted trace file, but I'm finding that
> Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information.
> Since SCF is a widely-accepted standard for trace files, would it be
> reasonable to include fields like "scf_comments" and "scf_header" in a
> Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them?
> Likewise, it would be great if write_seq could pull these values right
> from a SequenceTrace object rather than requiring them as arguments.
> 
> I'd be happy to help in this effort if necessary.
> 
> Thanks,
> --Nancy
> 
> *************************************
> Nancy F. Hansen, PhD nhansen at nhgri.nih.gov
> Bioinformatics Group
> NIH Intramural Sequencing Center (NISC)
> 5625 Fishers Lane
> Rockville, MD 20852
> Phone: (301) 435-1560 Fax: (301) 435-6170


From cjfields at uiuc.edu  Sun Oct  1 17:05:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 12:05:25 -0500
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
References: <000001c6e3e6$81630010$15327e82@pyrimidine>	<6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net>
	<79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu>
	<451E3707.4090400@sendu.me.uk>
	<0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu>
	<84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
Message-ID: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>


On Sep 30, 2006, at 4:43 PM, Hilmar Lapp wrote:

>
> On Sep 30, 2006, at 10:57 AM, Chris Fields wrote:
>
>> There should be a failed test to let us know of the problem.  As
>> currently set up, the XEMBL server failure doesn't show up in
>> Test::Harness test summaries.  Biblio_biofetch.t had similar
>> problems before Brian's fixes.
>
> Just keep in mind that you may not want somebody's CPAN installation
> to fail (or require a 'forced' install) just because some server
> happens to be down for maintenance.
>
> 	-hilmar

I don't think this would be a problem unless users specifically set  
BIOPERLDEBUG to 1, which is something most people don't bother with  
before installation (and probably not something we should promote for  
normal installation anyway).  So, for CPAN installation we would  
suggest that BIOPERLDEBUG be 0 or not set at all, and outline the  
reasons why.

The idea is to retain the current behavior (remote DB tests are not
run unless BIOPERLDEBUG is set to 1) and apply it to all tests
requiring such access.  When BIOPERLDEBUG is not set, only those tests
are skipped, rather than the rest of the tests as happens currently.
If BIOPERLDEBUG is set, the next test checks the URL and passes or
fails based on the specific value of $@; the remaining remote tests
then run or are skipped based on the mere presence of $@, which
indicates some URL issue.  You can do this with Test::More, but I'm
not sure it can be done with Test.pm or Test::Simple.

The current behavior just skips all tests based on a single failed
URL.  Then Test::Harness, as currently set, shows skipped tests as
passed.  For the run I posted previously, where the XEMBL_DB.t remote
DB tests failed, I also ran all tests (make test) and got this, which
doesn't tell us that the remote URL failed:

-----------------------------------------

...
t/WABA.......................ok
t/XEMBL_DB...................ok
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ztr.t tests
ok
All tests successful, 5 subtests skipped.

-----------------------------------------


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Oct  1 17:17:24 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 12:17:24 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880609301610w7b838543t5f8ba7313d285915@mail.gmail.com>
References: <b99962880609271039s75cc4af4nc109cd637b5b267@mail.gmail.com>
	<7A592EAB-A869-4A6C-BFA8-F73F3DFD8F5B@gmx.net>
	<09FB1EB0-2C1C-4FCF-8339-E78556EFEFF2@uiuc.edu>
	<b99962880609280842w47401efnd6d00ff2a6e7fd98@mail.gmail.com>
	<8D75FE6D-C02D-4A86-93FA-B7256050AF11@uiuc.edu>
	<b99962880609280910i68a649fw38a4a77d514eccf@mail.gmail.com>
	<40155903-555A-4662-BCCE-38E5E3784118@uiuc.edu>
	<54E79A5F-5446-4D8E-AD26-B70894048D60@gmx.net>
	<b99962880609301444h3e0a8bd2y5d3ecb2ca9e222e6@mail.gmail.com>
	<1D69005A-DF0E-4F37-93FE-7577A32CC625@gmx.net>
	<b99962880609301610w7b838543t5f8ba7313d285915@mail.gmail.com>
Message-ID: <CAD572AC-B108-4520-8335-6B2F138905C9@uiuc.edu>

The '-w' flag on the shebang line is the source of those errors.  I  
never set it anymore on Windows due to this; I just use the 'use  
warnings' pragma.

If you use 'perl -I. t/test.t' you can normally get around the '-w'  
assumed by using 'make test'.

I will try running tests on bioperl-db and bioperl tomorrow on WinXP  
to confirm these.

Chris

On Sep 30, 2006, at 6:10 PM, Seth Johnson wrote:

> How do I get rid of all of the warnings for "redefined subroutines"
> during the test??  It clutters the output and I can't see the errors.
>
> On 9/30/06, Hilmar Lapp <hlapp at gmx.net> wrote:
>>
>> It doesn't shed more light but it does raise an alert flag. All tests
>> are supposed to pass. The fact that they don't means the problems you
>> are seeing have nothing to do with your specific data or script.
>>
>> First off - can anyone else confirm those errors using the latest
>> Bioperl-db and Bioperl?
>>
>> Second - Seth could you run those tests individually, e.g., using
>>
>>         $ make test test_02species TEST_VERBOSE=1
>>
>> and similarly for the other tests that have failures and post the
>> output. Let's start with 02species and 03simpleseq.
>>
>>         -hilmar
>>
>> On Sep 30, 2006, at 5:44 PM, Seth Johnson wrote:
>>
>>> There are errors during the test. Here's their summary:
>>> ____________________________
>>> Failed Test     Stat Wstat Total Fail  Failed  List of Failed
>>> -------------------------------------------------------------
>>> t\02species.t                 65    2   3.08%  63 65
>>> t\03simpleseq.t    1   256    59  106 179.66%  7-59
>>> t\04swiss.t                   52   14  26.92%  25 27-34 38-42
>>> t\12ontology.t     2   512   738 1471 199.32%  3-738
>>> t\16obda.t                    12    3  25.00%  10-12
>>> ____________________________
>>>
>>> May be that can shed some light on the problem?!?!
>>>
>>> On 9/29/06, Hilmar Lapp <hlapp at gmx.net> wrote:
>>> This may in fact be a knock-on effect of the fixes? <sigh>
>>>
>>> Seth, did you run the test suite that comes with bioperl-db, and did
>>> you get any errors?
>>>
>>>         -hilmar
>>>
>>> On Sep 28, 2006, at 2:26 PM, Chris Fields wrote:
>>>
>>>> Seth,
>>>>
>>>> The organism issue is a bug and has been reported, though I thought
>>>> it was fixed.
>>>>
>>>> The lack of the date and the version is a bit odd, but there have
>>>> been a lot of changes lately to bioperl-live (core bioperl in CVS),
>>>> and a few to bioperl-db.  How old is your bioperl and bioperl-db
>>>> installation?  Hilmar, any additional thoughts?
>>>>
>>>> Chris
>>>>
>>>> On Sep 28, 2006, at 11:10 AM, Seth Johnson wrote:
>>>>
>>>>> Thank you.  That takes care of that, however, I do have another gripe.
>>>>> When running my script, quoted before, with "my $out =
>>>>> Bio::SeqIO->newFh('-format' => 'genbank');", I have several key pieces
>>>>> of information missing.  The most important one is the version number.
>>>>> There's also a date missing, and source organism name is corrupted.
>>>>> Here's what I get:
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>> LOCUS       NM_014580               2145 bp    dna     linear   UNK
>>>>> DEFINITION  Homo sapiens solute carrier family 2, (facilitated glucose
>>>>>             transporter) member 8 (SLC2A8), mRNA.
>>>>> ACCESSION   NM_014580
>>>>> SOURCE      sapiens.
>>>>>   ORGANISM  sapiens
>>>>>             Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; Bilateria;
>>>>>             Coelomata; Deuterostomia; Chordata; Craniata; Vertebrata;
>>>>>             Gnathostomata; Teleostomi; Euteleostomi; Sarcopterygii; Tetrapoda;
>>>>>             Amniota; Mammalia; Theria; Eutheria; Euarchontoglires; Primates;
>>>>>             Haplorrhini; Simiiformes; Catarrhini; Hominoidea; Hominidae;
>>>>>             Homo/Pan/Gorilla group; Homo.
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>> All of the missing information is stored in BioSQL and theoretically
>>>>> should be in the output. Here's how the NCBI GenBank file looks:
>>>>>
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>> LOCUS       NM_014580               2145 bp    mRNA    linear   PRI 17-OCT-2005
>>>>> DEFINITION  Homo sapiens solute carrier family 2, (facilitated glucose
>>>>>             transporter) member 8 (SLC2A8), mRNA.
>>>>> ACCESSION   NM_014580
>>>>> VERSION     NM_014580.3  GI:51870928
>>>>> KEYWORDS    .
>>>>> SOURCE      Homo sapiens (human)
>>>>>   ORGANISM  Homo sapiens
>>>>> <http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=9606>
>>>>>             Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
>>>>>             Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
>>>>>             Catarrhini; Hominidae; Homo.
>>>>>
>>>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>>
>>>>> On 9/28/06, Chris Fields <cjfields at uiuc.edu> wrote:
>>>>>>
>>>>>> Those are from the excessively paranoid '-w' flag on the shebang
>>>>>> line.  If you remove the flag but add the 'use warnings' pragma the
>>>>>> 'subroutine x redefined' warnings go away.  This, BTW, is one of the
>>>>>> quirks of the ActivePerl distribution; other OSs don't have the same
>>>>>> problem.
>>>>>>
>>>>>> The 'solution' described on that page is actually a workaround, not a
>>>>>> bugfix.  It causes problems with stack traces with error handling but
>>>>>> seems harmless beyond that.  I haven't been able to find a
>>>>>> satisfactory fix which works on all OS's.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>
>>>>>> On Sep 28, 2006, at 10:42 AM, Seth Johnson wrote:
>>>>>>
>>>>>>> This is under Windows, but using ActiveState Komodo 3.5 and  
>>>>>>> their
>>>>>>> latest Perl for Windows and latest BioPerl & BioPerl-db from  
>>>>>>> CVS.
>>>>>>>
>>>>>>> I actually just stumbled upon a solution.  It's described in the
>>>>>>> "Installing Bioperl on Windows" by adding a comma after
>>> $class: in
>>>>>>> Bio::Root::Root throw() subroutine.  Thanks for hinting me about
>>>>>>> what I run it on.
>>>>>>>
>>>>>>> The code works now, BUT it spews whole bunch of warnings about
>>>>>>> "Subroutine .... redefined":
>>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 88.
>>>>>>> Subroutine object_id redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 128.
>>>>>>> Subroutine version redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 150.
>>>>>>> Subroutine authority redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 171.
>>>>>>> Subroutine namespace redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 192.
>>>>>>> Subroutine display_name redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 217.
>>>>>>> Subroutine description redefined at c:/Perl/site/lib/Bio\BioEntry.pm line 241.
>>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 201.
>>>>>>> Subroutine verbose redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 234.
>>>>>>> Subroutine _register_for_cleanup redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 246.
>>>>>>> Subroutine _unregister_for_cleanup redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 256.
>>>>>>> Subroutine _cleanup_methods redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 263.
>>>>>>> Subroutine throw redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 316.
>>>>>>> Subroutine debug redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 379.
>>>>>>> Subroutine _load_module redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 398.
>>>>>>> Subroutine DESTROY redefined at c:/Perl/site/lib/Bio\Root\Root.pm line 426.
>>>>>>> Subroutine new redefined at c:/Perl/site/lib/Bio\Root\RootI.pm line 117.
>>>>>>> Subroutine _initialize redefined at c:/Perl/site/lib/Bio\Root\RootI.pm line 128.
>>>>>>> ...
>>>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>>
>>>>>>>
>>>>>>> On 9/28/06, Chris Fields <cjfields at uiuc.edu> wrote:
>>>>>>> I had problems with bioperl-db on native WinXP (not cygwin), but I
>>>>>>> did manage to get it running in cygwin with some effort.  The issue
>>>>>>> on native WinXP was related to Bio::Root::Root::throw(), though.
>>>>>>>
>>>>>>> There is a bug and workaround filed on Bugzilla, but I haven't worked
>>>>>>> on it in a while (and the workaround has some problems as well).  I
>>>>>>> may try running it again to see what happens.
>>>>>>>
>>>>>>> http://bugzilla.open-bio.org/show_bug.cgi?id=1938
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On Sep 28, 2006, at 9:04 AM, Hilmar Lapp wrote:
>>>>>>>
>>>>>>>> Very odd. This is under Windows, presumably using Cygwin?
>>>>>>>>
>>>>>>>> The method Bio::Root::Root::throw() clearly exists, and
>>>>>>>> PersistentObject inherits from it. The exception it was trying to
>>>>>>>> throw has nothing to do with failure or success to find the database
>>>>>>>> row (actually it did succeed since otherwise it wouldn't construct
>>>>>>>> the object) but with dynamically loading a class, presumably
>>>>>>>> Bio::DB::Persistent::Seq.
>>>>>>>>
>>>>>>>> Are you using the 1.5.x release of bioperl?
>>>>>>>>
>>>>>>>> Does anyone on the list have any experience with these sorts of
>>>>>>>> things on Windows?
>>>>>>>>
>>>>>>>> (Seth, I've moved this thread to the bioperl list, since this is what
>>>>>>>> the problem is about.)
>>>>>>>>
>>>>>>>>       -hilmar
>>>>>>>>
>>>>>>>> On Sep 27, 2006, at 1:39 PM, Seth Johnson wrote:
>>>>>>>>
>>>>>>>>> Hello guys,
>>>>>>>>>
>>>>>>>>> I successfully populated the biosql database, thanks to you.  Now,
>>>>>>>>> I'm trying to retrieve a sequence from it following the example from
>>>>>>>>> BOSC2003 slides and ran into uninformative error (at least to me it
>>>>>>>>> doesn't mean anything).  I suspect that I'm missing something and
>>>>>>>>> hope you can point me in the right direction.  Here's my source code:
>>>>>>>>>
>>>>>>>>> -----------------------------------------------------------------
>>>>>>>>> #!/usr/bin/perl -w
>>>>>>>>> use strict;
>>>>>>>>> use warnings;
>>>>>>>>>
>>>>>>>>> use Bio::Seq;
>>>>>>>>> use Bio::Seq::SeqFactory;
>>>>>>>>> use Bio::DB::SimpleDBContext;
>>>>>>>>> use Bio::DB::BioDB;
>>>>>>>>>
>>>>>>>>> my $dbc = Bio::DB::SimpleDBContext->new(
>>>>>>>>>     -driver => 'mysql',
>>>>>>>>>     -dbname => 'BioSQL_1',
>>>>>>>>>     -host => '192.168.1.3',
>>>>>>>>>     -user => 'xxxxx',
>>>>>>>>>     -pass => 'xxxxxx'
>>>>>>>>> );
>>>>>>>>>
>>>>>>>>> my $db = Bio::DB::BioDB->new(-database  => 'biosql',
>>>>>>>>>                             -dbcontext => $dbc);
>>>>>>>>>
>>>>>>>>> my $seq = Bio::Seq->new(-accession_number => 'NM_014580',
>>>>>>>>>                         -namespace => 'refseq_H_sapiens');
>>>>>>>>> my $seqfact = Bio::Seq::SeqFactory->new(-type => 'Bio::Seq');
>>>>>>>>> my $adp = $db->get_object_adaptor($seq);
>>>>>>>>> my $dbseq = $adp->find_by_unique_key($seq, -obj_factory => $seqfact);
>>>>>>>>>
>>>>>>>>> my $out = Bio::SeqIO->newFh('-format' => 'EMBL');
>>>>>>>>> print $out $dbseq;
>>>>>>>>>
>>>>>>>>> exit;
>>>>>>>>>
>>>>>>>>> -----------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> Just when the "find_by_unique_key" function is executed I get the
>>>>>>>>> following error:
>>>>>>>>>
>>>>>>>>> ================================
>>>>>>>>> Undefined subroutine &Bio::Root::Root::throw called at
>>>>>>>>> c:/Perl/site/lib/Bio/DB/Persistent/PersistentObject.pm line 199.
>>>>>>>>> ================================
>>>>>>>>>
>>>>>>>>> The sequence does exist in the database. I checked that.  Any
>>>>>>>>> ideas???
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Seth Johnson
>>>>>>>>> Senior Bioinformatics Associate
>>>>>>>>> _______________________________________________
>>>>>>>>> BioSQL-l mailing list
>>>>>>>>> BioSQL-l at lists.open-bio.org
>>>>>>>>> http://lists.open-bio.org/mailman/listinfo/biosql-l
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ===========================================================
>>>>>>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>>>>>>> ===========================================================
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> Christopher Fields
>>>>>>> Postdoctoral Researcher
>>>>>>> Lab of Dr. Robert Switzer
>>>>>>> Dept of Biochemistry
>>>>>>> University of Illinois Urbana-Champaign
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>>
>>>>>>>
>>>>>>> Seth Johnson
>>>>>>> Senior Bioinformatics Associate
>>>>>>>
>>>>>>> Ph: (202) 470-0900
>>>>>>> Fx: (775) 251-0358
>>>>>>
>>>>>> Christopher Fields
>>>>>> Postdoctoral Researcher
>>>>>> Lab of Dr. Robert Switzer
>>>>>> Dept of Biochemistry
>>>>>> University of Illinois Urbana-Champaign
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>>
>>>>>
>>>>> Seth Johnson
>>>>> Senior Bioinformatics Associate
>>>>>
>>>>> Ph: (202) 470-0900
>>>>> Fx: (775) 251-0358
>>>>
>>>> Christopher Fields
>>>> Postdoctoral Researcher
>>>> Lab of Dr. Robert Switzer
>>>> Dept of Biochemistry
>>>> University of Illinois Urbana-Champaign
>>>>
>>>>
>>>>
>>>
>>> --
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Best Regards,
>>>
>>>
>>> Seth Johnson
>>> Senior Bioinformatics Associate
>>>
>>> Ph: (202) 470-0900
>>> Fx: (775) 251-0358
>>
>> --
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>>
>
>
> -- 
> Best Regards,
>
>
> Seth Johnson
> Senior Bioinformatics Associate
>
> Ph: (202) 470-0900
> Fx: (775) 251-0358

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From osborne1 at optonline.net  Sun Oct  1 21:49:47 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Sun, 01 Oct 2006 17:49:47 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <20061001183214.GB12075@iucha.net>
Message-ID: <C145B03B.A8A5%osborne1@optonline.net>

Florin,

This is fixed in CVS now. What had happened is that the DIP file had some
minimal protein (node) entries where the only id available was DIP's
internal identifier. Not ideal to have to use these as accessions but
there's no other choice.

Thank you for the note, and in the future write to bioperl-l since there may
be others who are interested in hearing about what you've encountered.

Brian O.


On 10/1/06 2:32 PM, "Florin Iucha" <florin at iucha.net> wrote:

> Hello,
> 
> I have downloaded a CVS snapshot [1] of your module, bioperl-network, and
> I am using it to read the 20060402 edition release of the DIP [2] dataset.
> 
> Starting with the simple program you show in the man page:
> 
>    my $io = Bio::Network::IO->new(-format => 'psi',
>                                   -file   => $ARGV[0]);
> 
>    my $network = $io->next_network;
> 
> I get 772 instances of:
> 
>    Use of uninitialized value in string eq at
>    /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 326.
> 
> I don't know if it is just an annoyance or something bad, so you might
> want to take a look at it.
> 
> Thank you for your work,
> florin
> 
> [1] http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-network/
> [2] http://dip.doe-mbi.ucla.edu/


From osborne1 at optonline.net  Sun Oct  1 21:56:39 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Sun, 01 Oct 2006 17:56:39 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <20061001211844.GC12075@iucha.net>
Message-ID: <C145B1D7.A8A8%osborne1@optonline.net>

Florin,

I'm not seeing any segmentation fault using the same file you're using as
input (dip20060402.mif). I'm assuming you don't see this error when you use
smaller files as input, like those in the t/data directory.

When I watch the script in top I see Perl using about 135Mb (RSIZE) right
before the script exits. How much memory do you use?

Thank you for the note, and in the future write to bioperl-l since there may
be others who are interested in hearing about what you've encountered.

Brian O.


On 10/1/06 5:18 PM, "Florin Iucha" <florin at iucha.net> wrote:

> On Sun, Oct 01, 2006 at 01:32:14PM -0500, Florin Iucha wrote:
>> I have downloaded a CVS snapshot [1] of your module, bioperl-network, and
>> I am using it to read the 20060402 edition release of the DIP [2] dataset.
> 
> Using the attached script, I am getting a segmentation fault at the
> end, right after printing "That's all, Folks!"  Maybe some cleanup is
> going off in a wrong direction.
> 
> florin


From florin at iucha.net  Mon Oct  2 00:24:03 2006
From: florin at iucha.net (Florin Iucha)
Date: Sun, 1 Oct 2006 19:24:03 -0500
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <C145B1D7.A8A8%osborne1@optonline.net>
References: <20061001211844.GC12075@iucha.net>
	<C145B1D7.A8A8%osborne1@optonline.net>
Message-ID: <20061002002403.GD12075@iucha.net>

On Sun, Oct 01, 2006 at 05:56:39PM -0400, Brian Osborne wrote:
> I'm not seeing any segmentation fault using the same file you're using as
> input (dip20060402.mif). I'm assuming you don't see this error when you use
> smaller files as input, like those in the t/data directory.

The t/data files are fine.

Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the
MINT [1] database does not produce the crash.  It has a new warning, however:

   Can't call method "text" on an undefined value at
   /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290.

> When I watch the script in top I see Perl using about 135Mb (RSIZE) right
> before the script exits. How much memory do you use?

"ps ux" tells me VSZ = 272788 and RSZ = 254992. This is on x86-64 with
64 bit perl.  The box has 2 GB of physical memory so these numbers
don't seem to be a concern.

> Thank you for the note, and in the future write to bioperl-l since there may
> be others who are interested in hearing about what you've encountered.

D'oh! You have the list address loud and clear in three places, but I got
your contact info from the AUTHORS.  Will use the proper channel from now
on!

Thanks,
florin

[1] ftp://mint.bio.uniroma2.it/pub/release/psi1/

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra

From cjfields at uiuc.edu  Mon Oct  2 04:35:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 1 Oct 2006 23:35:22 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880609301635w421fae0er3497ba655679f0bc@mail.gmail.com>
Message-ID: <000001c6e5dc$2eceabe0$15327e82@pyrimidine>

Seth,

What version of MySQL and perl are you using?  I'm using MySQL 5.0.18 (but
am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819.

I ran into a few problems with bioperl-db tests which were unrelated to the
ones below, but I'm wondering if it is a difference in MySQL versions.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Seth Johnson
> Sent: Saturday, September 30, 2006 6:35 PM
> To: Hilmar Lapp
> Cc: Chris Fields; Bioperl List
> Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> 
> Here're complete test details:
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

...

> FAILED tests 10-12
>     Failed 3/12 tests, 75.00% okay
> Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> -------------------------------------------------------------------------------
> t\02species.t                 65    2   3.08%  63 65
> t\03simpleseq.t    1   256    59  106 179.66%  7-59
> t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> t\12ontology.t     2   512   738 1471 199.32%  3-738
> t\16obda.t                    12    3  25.00%  10-12
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From torsten.seemann at infotech.monash.edu.au  Mon Oct  2 06:06:50 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Mon, 02 Oct 2006 16:06:50 +1000
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
References: <451C8ED8.2060003@infotech.monash.edu.au>
	<451CC40D.2030401@sendu.me.uk>
	<2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
Message-ID: <4520AC7A.1050009@infotech.monash.edu.au>


 >>> I have removed all use/@ISA Bio::Root::Object references from
 >>> bioperl-live, except for those in Bio::Root::* itself:

 >> So I'd say they're both relics that can be removed. In fact I was
 >> planning on getting rid of all references to both of these modules
 >> before you did, so thanks! :)

> I think they can go. It's probably a pre-1.0 deprecation that somehow  
> was never followed through on.

Today I did a fresh CVS checkout of bioperl-live, and deleted the 
following modules and tests, and all tests passed with BIOPERLDEBUG=0

     * Bio::Root::Err
     * Bio::Root::Global
     * Bio::Root::IOManager
     * Bio::Root::Object
     * Bio::Root::Storable
     * Bio::Root::Utilities  # may be used by third parties?
     * Bio::Root::Vector
     * Bio::Root::Xref
     * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
     * t/RootStorable.t

Should we schedule for deprecation, or deprecate immediately as Hilmar 
suggested they were meant to be deprecated long ago ?

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From bix at sendu.me.uk  Mon Oct  2 09:40:02 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 10:40:02 +0100
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
References: <000001c6e3e6$81630010$15327e82@pyrimidine>	<6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net>	<79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu>	<451E3707.4090400@sendu.me.uk>	<0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu>	<84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
	<3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
Message-ID: <4520DE72.4000603@sendu.me.uk>

Chris Fields wrote:
>
> The idea is to retain current behavior (remote DB access will not be  
> run unless BIOPERLDEBUG is set to 1) and apply it to all tests  
> requiring such access.  Otherwise, just those tests are skipped (and  
> not the rest of the tests, which occurs currently).  If BIOPERLDEBUG  
> is set, the next tests would check the URL, which passes/fails (based  
> on the specific value of $@), and runs/skips tests based on the mere  
> presence of $@, which indicates some URL issue.  You can do this with  
> Test::More, but I'm not sure this can be done with Test.pm or  
> Test::Simple.

Firstly, BIOPERLDEBUG should not be abused; it should be used only when 
you want to see extra debugging messages. There should be another 
variable that you can set to choose if network-requiring tests are run, 
and it should also be a configurable choice when you run perl Makefile.PL.

(But changing this isn't going to happen for 1.5.2)

When the server problem is ambiguous we should not fail the test. Just 
make the skip message visible and pass all ok...


> The current behavior just skips all tests based on a single failed  
> URL.  Then, Test::Harness, as currently set, shows skipped tests as  
> passed.  The last run I posted previously where XEMBL_DB.t remote DB  
> tests failed, I also ran all tests (make test) and get this, which  
> doesn't tell us that the remote URL failed:
> 
> -----------------------------------------
> 
> ...
> t/WABA.......................ok
> t/XEMBL_DB...................ok
> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext  
> is not installed or is installed incorrectly - skipping ztr.t tests
> ok
> All tests successful, 5 subtests skipped.

All you have to do to make it visible is start the skip message with the
word 'Skip':

skip('Skip server may be down',1);

...
t/WABA.......................ok 

t/XEMBL_DB...................ok 

         1/9 skipped: server may be down
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is 
not installed or is installed incorrectly - skipping ztr.t tests
t/ztr........................ok


It's nicer when using Test::More.
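
A minimal sketch of the Test::More version, for illustration only. It assumes
a separate, hypothetical BIOPERL_NETWORK_TESTS environment variable (no such
variable exists yet) and uses a stand-in probe in place of a real remote
request; the helper skip_reason is likewise made up for this example:

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Decide whether the network-dependent tests should be skipped.
# Returns a reason string when they should be skipped, undef otherwise.
sub skip_reason {
    my ($enabled, $probe) = @_;
    return 'network tests not requested' unless $enabled;
    # The probe is expected to die if the server cannot be reached.
    return if eval { $probe->(); 1 };
    my $err = $@ || 'unknown error';
    chomp $err;
    return "server may be down: $err";
}

# BIOPERL_NETWORK_TESTS is a hypothetical name, not an existing variable.
my $enabled = $ENV{BIOPERL_NETWORK_TESTS};

# Stand-in for a real remote request; here it always fails.
my $probe = sub { die "connection refused\n" };

ok(1, 'local test runs regardless of network settings');

SKIP: {
    my $reason = skip_reason($enabled, $probe);
    # Only these two tests are skipped, not the whole script.
    skip $reason, 2 if defined $reason;
    ok(1, 'remote record retrieved');
    ok(1, 'remote record parsed');
}
```

Only the two remote tests inside the SKIP block are affected; the local test
still runs, and the skip reason is reported by the harness.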


From bix at sendu.me.uk  Mon Oct  2 09:55:27 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 10:55:27 +0100
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au>
References: <451C8ED8.2060003@infotech.monash.edu.au>	<451CC40D.2030401@sendu.me.uk>	<2A31B5F3-8D44-45C3-89C6-E42D802C5D14@gmx.net>
	<4520AC7A.1050009@infotech.monash.edu.au>
Message-ID: <4520E20F.6040406@sendu.me.uk>

Torsten Seemann wrote:
>  >>> I have removed all use/@ISA Bio::Root::Object references from
>  >>> bioperl-live, except for those in Bio::Root::* itself:
> 
>  >> So I'd say they're both relics that can be removed. In fact I was
>  >> planning on getting rid of all references to both of these modules
>  >> before you did, so thanks! :)
> 
>> I think they can go. It's probably a pre-1.0 deprecation that somehow  
>> was never followed through on.
> 
> Today I did a fresh CVS checkout of bioperl-live, and deleted the 
> following modules and tests, and all tests passed with BIOPERLDEBUG=0
> 
>      * Bio::Root::Err
>      * Bio::Root::Global
>      * Bio::Root::IOManager
>      * Bio::Root::Object
>      * Bio::Root::Storable
>      * Bio::Root::Utilities  # may be used by third parties?
>      * Bio::Root::Vector
>      * Bio::Root::Xref
>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
>      * t/RootStorable.t
> 
> Should we schedule for deprecation, or deprecate immediately as Hilmar 
> suggested they were meant to be deprecated long ago ?

I'm happy to get rid of them all straight away. Does anyone object?


From florin at iucha.net  Mon Oct  2 01:40:07 2006
From: florin at iucha.net (Florin Iucha)
Date: Sun, 1 Oct 2006 20:40:07 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext on
	AMD64
Message-ID: <20061002014007.GG12075@iucha.net>

Hello,

I am trying to install bioperl-network from CVS.  I found this to
require bioperl from CVS, which requires bioperl-ext from CVS.
I have compiled and installed io_lib 1.10.1.

After running "perl Makefile.PL; make test" in bioperl-ext I see a lot of
sources being compiled, then:

cc -c  -I./libs -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -O2   -DVERSION=\"1.5.1\" -DXS_VERSION=\"1.5.1\" -fPIC "-I/usr/lib/perl/5.8/CORE"  -DPOSIX -DNOERROR Align.c
Running Mkbootstrap for Bio::Ext::Align ()
chmod 644 Align.bs
rm -f ../blib/arch/auto/Bio/Ext/Align/Align.so
cc  -shared -L/usr/local/lib Align.o  -o ../blib/arch/auto/Bio/Ext/Align/Align.so libs/libsw.a  \
           -lm          \

/usr/bin/ld: libs/libsw.a(aln.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
libs/libsw.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[1]: *** [../blib/arch/auto/Bio/Ext/Align/Align.so] Error 1
make[1]: Leaving directory `/scratch/dmbio/tools/bioperl-ext/Bio/Ext/Align'
make: *** [subdirs] Error 2

This is on a Debian AMD64 box:

florin at zeus $ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release x86_64-linux-gnu
Thread model: posix
gcc version 4.1.2 20060901 (prerelease) (Debian 4.1.1-13)
florin at zeus $ perl -V
Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.6.16-1-vserver-amd64-k8, archname=x86_64-linux-gnu-thread-multi
    uname='linux excelsior 2.6.16-1-vserver-amd64-k8 #2 smp tue apr 4 03:40:49 utc 2006 x86_64 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.8 -Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=define uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include'
    ccversion='', gccversion='4.1.2 20060729 (prerelease) (Debian 4.1.1-10)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.3.6.so, so=so, useshrplib=true, libperl=libperl.so.5.8.8
    gnulibc_version='2.3.6'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'


Characteristics of this binary (from libperl):
  Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT
                        PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_64_BIT_ALL
                        USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES
                        USE_PERLIO USE_REENTRANT_API

The compiler command line for aln.o is lacking -fPIC:

cc -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN
-fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -DPOSIX -DNOERROR   -c -o aln.o aln.c

Adding -fPIC to the CCFLAGS variable in Bio/Ext/Align/Makefile and
Makefile seems to take the build further, but it fails with a similar
error in Bio/SeqIO/staden/_Inline/build/Bio/SeqIO/staden/read. That
Makefile seems to be regenerated every time I run 'make test' in the
top level directory.

The error in ../staden/read is:

rm -f blib/arch/auto/Bio/SeqIO/staden/read/read.so
cc  -shared -L/usr/local/lib read.o  -o blib/arch/auto/Bio/SeqIO/staden/read/read.so    \
           -L/usr/local/lib -lread -lz          \

/usr/bin/ld: /usr/local/lib/libread.a(libread_a-Read.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
/usr/local/lib/libread.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [blib/arch/auto/Bio/SeqIO/staden/read/read.so] Error 1

So, the questions appear to be:
   - should "-fPIC" be appended to CFLAGS in the generated Makefiles?
   - is there anything wrong with io_lib flags?
   - has anybody built bioperl-ext on AMD64?
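
On the first question, a rough sketch of how a Makefile.PL could pick up the
flag from perl's own configuration (cccdlflags, which is '-fPIC' in the
perl -V output above) instead of hard-coding it. The helper name
add_pic_flags is made up for illustration, and whether the result also
reaches the makefile that builds libsw.a is an assumption, not something
verified here:

```perl
use strict;
use warnings;
use Config;

# Append the position-independent-code flag(s) to a compiler flag string,
# unless they are already present. Defaults to perl's own cccdlflags.
sub add_pic_flags {
    my ($cflags, $pic) = @_;
    $pic = $Config{cccdlflags} unless defined $pic;
    $pic = '' unless defined $pic;
    $pic =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
    return $cflags if $pic eq '' || index(" $cflags ", " $pic ") >= 0;
    return "$cflags $pic";
}

# The flags from the aln.o command line above, minus the -D/-I noise.
print add_pic_flags('-O2 -DPOSIX -DNOERROR'), "\n";

# In a Makefile.PL this might be passed along as, for example:
#   WriteMakefile(..., CCFLAGS => add_pic_flags($Config{ccflags}));
```

This would not help with libread.a from io_lib, which is built outside of
bioperl-ext and would need its own CFLAGS set at configure time.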

I can help with debugging or testing if given a gentle nudge in the right
direction, but I have little experience with the interactions between perl
and static libraries on 64 bit.

Thanks,
florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra

From bix at sendu.me.uk  Mon Oct  2 10:52:47 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 11:52:47 +0100
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
 on	AMD64
In-Reply-To: <20061002014007.GG12075@iucha.net>
References: <20061002014007.GG12075@iucha.net>
Message-ID: <4520EF7F.40908@sendu.me.uk>

Florin Iucha wrote:
> Hello,
> 
> I am trying to install bioperl-network from CVS.  I found this to
> require bioperl from CVS, which requires bioperl-ext from CVS.

I can't help with the compile problems you encountered (other than to 
say I also have problems under AMD64), but from where did you get the 
idea that bioperl (live/core) requires bioperl-ext? It doesn't, though 
recent changes to Makefile.PL may give that impression...


From cjfields at uiuc.edu  Mon Oct  2 12:26:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 07:26:57 -0500
Subject: [Bioperl-l] Tests involving remote databases
In-Reply-To: <4520DE72.4000603@sendu.me.uk>
References: <000001c6e3e6$81630010$15327e82@pyrimidine>	<6761B0A9-DEF0-4F33-9C89-39AFE104B61F@gmx.net>	<79D77258-5A4E-4428-9322-00962D5C0419@uiuc.edu>	<451E3707.4090400@sendu.me.uk>	<0DA94AF7-DB0E-4DC5-8896-1048089D9AC5@uiuc.edu>	<84789BFC-E00D-44BC-9F1E-41C34A2B899B@gmx.net>
	<3D3091D9-CF24-49D3-B4B2-CDCC4145B541@uiuc.edu>
	<4520DE72.4000603@sendu.me.uk>
Message-ID: <DAAC7FDC-0C03-4345-9E09-DBF04D521628@uiuc.edu>


On Oct 2, 2006, at 4:40 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>> The idea is to retain current behavior (remote DB access will not be
>> run unless BIOPERLDEBUG is set to 1) and apply it to all tests
>> requiring such access.  Otherwise, just those tests are skipped (and
>> not the rest of the tests, which occurs currently).  If BIOPERLDEBUG
>> is set, the next tests would check the URL, which passes/fails (based
>> on the specific value of $@), and runs/skips tests based on the mere
>> presence of $@, which indicates some URL issue.  You can do this with
>> Test::More, but I'm not sure this can be done with Test.pm or
>> Test::Simple.
>
> Firstly, BIOPERLDEBUG should not be abused; it should be used only when
> you want to see extra debugging messages. There should be another
> variable that you can set to choose if network-requiring tests are run,
> and it should also be a configurable choice when you run perl Makefile.PL.
>
> (But changing this isn't going to happen for 1.5.2)
>
> When the server problem is ambiguous we should not fail the test. Just
> make the skip message visible and pass all ok...

I agree, as well as with your assessment of BIOPERLDEBUG (which I  
alluded to in a previous post).  Torsten suggested creating a new  
env. variable for network tests.

It's obvious this won't be done before 1.5.2, but we can make plans  
towards the next release.

>> The current behavior just skips all tests based on a single failed
>> URL.  Then, Test::Harness, as currently set, shows skipped tests as
>> passed.  The last run I posted previously where XEMBL_DB.t remote DB
>> tests failed, I also ran all tests (make test) and get this, which
>> doesn't tell us that the remote URL failed:
>>
>> -----------------------------------------
>>
>> ...
>> t/WABA.......................ok
>> t/XEMBL_DB...................ok
>> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext
>> is not installed or is installed incorrectly - skipping ztr.t tests
>> ok
>> All tests successful, 5 subtests skipped.
>
> All you have to do to make it visible is start the skip message with the
> word 'Skip':
>
> skip('Skip server may be down',1);
>
> ...
> t/WABA.......................ok
>
> t/XEMBL_DB...................ok
>
>          1/9 skipped: server may be down
> t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is
> not installed or is installed incorrectly - skipping ztr.t tests
> t/ztr........................ok
>
>
> It's nicer when using Test::More.

Okay, if Test::Harness picks that up it would be okay.  We could use  
skip blocks to skip subsets of tests that require remote access (like  
SeqFeature.t) as opposed to skipping all tests.

I think we want to avoid promoting running tests with BIOPERLDEBUG  
(or similar) set for everyday installation anyway (such as from CPAN,  
as Hilmar points out).  It's not something everybody installing a new  
BioPerl should be doing unless they run into problems.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From florin at iucha.net  Mon Oct  2 12:15:06 2006
From: florin at iucha.net (Florin Iucha)
Date: Mon, 2 Oct 2006 07:15:06 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
	on	AMD64
In-Reply-To: <4520EF7F.40908@sendu.me.uk>
References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk>
Message-ID: <20061002121506.GB14409@iucha.net>

On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
> Florin Iucha wrote:
> > I am trying to install bioperl-network from CVS.  I found this to
> > require bioperl from CVS, which requires bioperl-ext from CVS.
> 
> I can't help with the compile problems you encountered (other than to 
> say I also have problems under AMD64), but from where did you get the 
> idea that bioperl (live/core) requires bioperl-ext? It doesn't, though 
> recent changes to Makefile.PL may give that impression...

The tests for bioperl-live mention in some places that 'this
test has been skipped since $foo is not available' and I found the
'foos' in bioperl-ext.

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra

From bix at sendu.me.uk  Mon Oct  2 14:05:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 02 Oct 2006 15:05:11 +0100
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
 on	AMD64
In-Reply-To: <20061002121506.GB14409@iucha.net>
References: <20061002014007.GG12075@iucha.net> <4520EF7F.40908@sendu.me.uk>
	<20061002121506.GB14409@iucha.net>
Message-ID: <45211C97.2060800@sendu.me.uk>

Florin Iucha wrote:
> On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
>> Florin Iucha wrote:
>>> I am trying to install bioperl-network from CVS.  I found this to
>>> require bioperl from CVS, which requires bioperl-ext from CVS.
>> I can't help with the compile problems you encountered (other than to 
>> say I also have problems under AMD64), but from where did you get the 
>> idea that bioperl (live/core) requires bioperl-ext? It doesn't, though 
>> recent changes to Makefile.PL may give that impression...
> 
> Running the tests for bioperl-live mention in some places that 'this
> test has been skipped since $foo is not available' and I found the
> 'foos' in bioperl-ext.

Right, yes. The idea is, you'd only need to install bioperl-ext if you 
wanted to use the modules that the complaining tests test.
So if none of the things that were skipped matter to you, don't install ext.

I guess this needs to be clarified in documentation somewhere.


From cjfields at uiuc.edu  Mon Oct  2 14:13:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:13:56 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520AC7A.1050009@infotech.monash.edu.au>
Message-ID: <001801c6e62d$02c883d0$15327e82@pyrimidine>


>  >>> I have removed all use/@ISA Bio::Root::Object references from
>  >>> bioperl-live, except for those in Bio::Root::* itself:
> 
>  >> So I'd say they're both relics that can be removed. In fact I was
>  >> planning on getting rid of all references to both of these modules
>  >> before you did, so thanks! :)
> 
> > I think they can go. It's probably a pre-1.0 deprecation that somehow
> > was never followed through on.
> 
> Today I did a fresh CVS checkout of bioperl-live, and deleted the
> following modules and tests, and all tests passed with BIOPERLDEBUG=0
> 
>      * Bio::Root::Err
>      * Bio::Root::Global
>      * Bio::Root::IOManager
>      * Bio::Root::Object
>      * Bio::Root::Storable
>      * Bio::Root::Utilities  # may be used by third parties?
>      * Bio::Root::Vector
>      * Bio::Root::Xref
>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
>      * t/RootStorable.t
> 
> Should we schedule for deprecation, or deprecate immediately as Hilmar
> suggested they were meant to be deprecated long ago ?

I vote for quick deprecation; I had also noticed that these were superfluous
and added them as possible deprecations to the wiki page.  However, we need
to be careful about that 'third-party use' caveat you have for
Bio::Root::Utilities; there's another one with Bio::Root::Storable and
Ensembl:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2924/focus=2924

and it seems to have its users:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/8242/focus=8242

The others (including Bio::Root::Utilities) haven't had any major threads on
the mail lists in a very long time.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


> --
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Mon Oct  2 14:16:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:16:31 -0500
Subject: [Bioperl-l] Failure to compile the CVS snapshot of
	bioperl-ext on AMD64
In-Reply-To: <20061002121506.GB14409@iucha.net>
Message-ID: <001901c6e62d$5c4fac80$15327e82@pyrimidine>

They're not absolutely necessary; the tests are skipped w/o failure because
bioperl-ext is optional.  These are only necessary if you want the ability
to read sequence trace files.  

BTW, you might have a rough time trying to install bioperl-ext, depending
on your platform.  Note the following bug report:

http://bugzilla.open-bio.org/show_bug.cgi?id=2074

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Florin Iucha
> Sent: Monday, October 02, 2006 7:15 AM
> To: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Failure to compile the CVS snapshot of
> bioperl-ext on AMD64
> 
> On Mon, Oct 02, 2006 at 11:52:47AM +0100, Sendu Bala wrote:
> > Florin Iucha wrote:
> > > I am trying to install bioperl-network from CVS.  I found this to
> > > require bioperl from CVS, which requires bioperl-ext from CVS.
> >
> > I can't help with the compile problems you encountered (other than to
> > say I also have problems under AMD64), but from where did you get the
> > idea that bioperl (live/core) requires bioperl-ext? It doesn't, though
> > recent changes to Makefile.PL may give that impression...
> 
> Running the tests for bioperl-live mention in some places that 'this
> test has been skipped since $foo is not available' and I found the
> 'foos' in bioperl-ext.
> 
> florin
> 
> --
> If we wish to count lines of code, we should not regard them as lines
> produced but as lines spent.                       -- Edsger Dijkstra


From osborne1 at optonline.net  Mon Oct  2 14:14:13 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 10:14:13 -0400
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <4520E20F.6040406@sendu.me.uk>
Message-ID: <C14696F5.A903%osborne1@optonline.net>

Sendu,

No objection but someone should check the scripts in examples/root to make
sure that they are not used there.
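
A quick way to do that check might look like the sketch below. The module
list is the one from Torsten's mail; the helper name deprecated_refs and the
default directory are assumptions for illustration, and a plain text match
like this will also report mentions in comments or POD:

```perl
use strict;
use warnings;
use File::Find;

# Modules proposed for removal (from the list earlier in this thread).
my @deprecated = map { "Bio::Root::$_" }
    qw(Err Global IOManager Object Storable Utilities Vector Xref);

# Return the deprecated module names mentioned in a chunk of source text.
sub deprecated_refs {
    my ($text) = @_;
    return grep { $text =~ /\b\Q$_\E\b/ } @deprecated;
}

# Scan a directory tree (default: examples/root) and report any matches.
my $dir = @ARGV ? $ARGV[0] : 'examples/root';
if (-d $dir) {
    find(sub {
        return unless -f $_;
        open my $fh, '<', $_ or return;
        local $/;                      # slurp the whole file
        my $text = <$fh>;
        return unless defined $text;   # empty file
        my @hits = deprecated_refs($text);
        print "$File::Find::name: @hits\n" if @hits;
    }, $dir);
}
```

The same scan could be pointed at a bioperl-run or bioperl-db checkout by
passing the directory as the first argument.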

Brian O.


On 10/2/06 5:55 AM, "Sendu Bala" <bix at sendu.me.uk> wrote:

> Torsten Seemann wrote:
>>>>> I have removed all use/@ISA Bio::Root::Object references from
>>>>> bioperl-live, except for those in Bio::Root::* itself:
>> 
>>>> So I'd say they're both relics that can be removed. In fact I was
>>>> planning on getting rid of all references to both of these modules
>>>> before you did, so thanks! :)
>> 
>>> I think they can go. It's probably a pre-1.0 deprecation that somehow
>>> was never followed through on.
>> 
>> Today I did a fresh CVS checkout of bioperl-live, and deleted the
>> following modules and tests, and all tests passed with BIOPERLDEBUG=0
>> 
>>      * Bio::Root::Err
>>      * Bio::Root::Global
>>      * Bio::Root::IOManager
>>      * Bio::Root::Object
>>      * Bio::Root::Storable
>>      * Bio::Root::Utilities  # may be used by third parties?
>>      * Bio::Root::Vector
>>      * Bio::Root::Xref
>>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
>>      * t/RootStorable.t
>> 
>> Should we schedule for deprecation, or deprecate immediately as Hilmar
>> suggested they were meant to be deprecated long ago ?
> 
> I'm happy to get rid of them all straight away. Does anyone object?
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From johnson.biotech at gmail.com  Mon Oct  2 14:21:50 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 2 Oct 2006 10:21:50 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <000001c6e5dc$2eceabe0$15327e82@pyrimidine>
References: <b99962880609301635w421fae0er3497ba655679f0bc@mail.gmail.com>
	<000001c6e5dc$2eceabe0$15327e82@pyrimidine>
Message-ID: <b99962880610020721j776d3801m4f5b49cd1bdf66c6@mail.gmail.com>

I'm using MySQL 5.0.19 and Perl v5.8.7 [MSWin32-x86-multi-thread]

On 10/2/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> Seth,
>
> What version of MySQL and perl are you using?  I'm using MySQL 5.0.18 (but
> am upgrading to 5.0.24 tomorrow) and ActivePerl 5.8.819.
>
> I ran into a few problems with bioperl-db tests which were unrelated the
> ones below, but I'm wondering if it is a difference in MySQL versions.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
> > -----Original Message-----
> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> > bounces at lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ...
>
> > FAILED tests 10-12
> >     Failed 3/12 tests, 75.00% okay
> > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> >
> --------------------------------------------------------------------------
> > -----
> > t\02species.t                 65    2   3.08%  63 65
> > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > t\16obda.t                    12    3  25.00%  10-12
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


-- 
Best Regards,


Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358


From osborne1 at optonline.net  Mon Oct  2 14:08:50 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 10:08:50 -0400
Subject: [Bioperl-l] Failure to compile the CVS snapshot of bioperl-ext
 on AMD64
In-Reply-To: <20061002014007.GG12075@iucha.net>
Message-ID: <C14695B2.A900%osborne1@optonline.net>

Florin,

Minor correction here, the Bioperl package does not require bioperl-ext.
However we see there is a problem compiling bioperl-ext...

Brian O.


On 10/1/06 9:40 PM, "Florin Iucha" <florin at iucha.net> wrote:

> I am trying to install bioperl-network from CVS.  I found this to
> require bioperl from CVS, which requires bioperl-ext from CVS.


From JK at novozymes.com  Mon Oct  2 14:05:34 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Mon, 2 Oct 2006 16:05:34 +0200
Subject: [Bioperl-l] Blast parser.
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net>


Hi. 

I've tried to use the blast-parser but I cannot get the original alignment
out of the parser. Is it possible to get that out of the
Bio::Search::HSP::BlastHSP in some way? It seems quite odd to get a
clustalw alignment out when that isn't the type of alignment people are
used to getting from blast.

Thanks 

Jesper


From cjfields at uiuc.edu  Mon Oct  2 14:36:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 09:36:31 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <C14696F5.A903%osborne1@optonline.net>
Message-ID: <001d01c6e630$27792fb0$15327e82@pyrimidine>

> Sendu,
> 
> No objection but someone should check the scripts in examples/root to make
> sure that they are not used there.
> 
> Brian O.

I suppose it's also possible that the other bioperl distributions (like
bioperl-run) could use them as well.  

If they do we can take care of them as they pop up.  These are really old
and haven't been revised in a long time.  

The only one I worry about is Bio::Root::Storable b/c of Ensembl.  Does
anyone know where Will Spooner is?  He's the maintainer for
Bio::Root::Storable.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct  2 15:01:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 10:01:44 -0500
Subject: [Bioperl-l] Blast parser.
In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5D6@NZT0004E.dknz.nzcorp.net>
Message-ID: <000001c6e633$ad0a6ce0$15327e82@pyrimidine>

The alignment that you get should come from GenericHSP, not BlastHSP.
Either way, the HSP alignment that is retrieved using $hsp->get_aln() should
be a Bio::SimpleAlign object.  You can then output that to the proper
AlignIO format using an AlignIO stream object or use the Bio::SimpleAlign
methods for further analysis.  

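use Bio::AlignIO;

# $hsp is an HSP object obtained while iterating over a Bio::SearchIO result
my $aln = $hsp->get_aln();    # a Bio::SimpleAlign object
my $alnout = Bio::AlignIO->new(-format => 'msf',
                               -fh     => \*STDOUT);
$alnout->write_aln($aln);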

Quick note: not all AlignIO formats have write_aln() support at this time,
but most do.
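
If what you want is the BLAST-style pairwise layout rather than one of the
AlignIO formats, the three alignment strings are also available from the HSP
via query_string, homology_string and hit_string.  A rough sketch of laying
them out yourself follows; format_pairwise is a made-up helper, the sequences
are invented, and coordinates are left out to keep it short:

```perl
use strict;
use warnings;

# Format three equal-length alignment strings as BLAST-style blocks.
sub format_pairwise {
    my ($query, $midline, $hit, $width) = @_;
    $width ||= 60;
    my $out = '';
    for (my $pos = 0; $pos < length $query; $pos += $width) {
        $out .= sprintf("Query  %s\n",   substr($query,   $pos, $width));
        $out .= sprintf("       %s\n",   substr($midline, $pos, $width));
        $out .= sprintf("Sbjct  %s\n\n", substr($hit,     $pos, $width));
    }
    return $out;
}

# With a real HSP the arguments would be
#   $hsp->query_string, $hsp->homology_string, $hsp->hit_string
# Invented strings are used here so the example runs on its own.
print format_pairwise('MKVLAAGIVG', 'MKV+AAG+VG', 'MKVIAAGLVG', 5);
```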

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of JK (Jesper Agerbo Krogh)
> Sent: Monday, October 02, 2006 9:06 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Blast parser.
> 
> 
> Hi.
> 
> I've tried to use the blast-parser but I cannot get the original alignment
> out of the parser. Is it possible to get that out of the
> Bio::Search::HSP::BlastHSP in some way. It seems quite odd to get a
> clustalw alignment out when it isn't that type of alignment people are
> used to get from blast.
> 
> Thanks
> 
> Jesper
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From whs at ebi.ac.uk  Mon Oct  2 16:00:19 2006
From: whs at ebi.ac.uk (Will Spooner)
Date: Mon, 2 Oct 2006 17:00:19 +0100 (BST)
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <001d01c6e630$27792fb0$15327e82@pyrimidine>
References: <001d01c6e630$27792fb0$15327e82@pyrimidine>
Message-ID: <Pine.LNX.4.64.0610021651550.1560@parrot.ebi.ac.uk>

On Mon, 2 Oct 2006, Chris Fields wrote:

>> Sendu,
>>
>> No objection but someone should check the scripts in examples/root to make
>> sure that they are not used there.
>>
>> Brian O.
>
> I suppose it's also possible that the other bioperl distributions (like
> bioperl-run) could use them as well.
>
> If they do we can take care of them as they pop up.  These are really old
> and haven't been revised in a long time.
>
> The only one I worry about is Bio::Root::Storable b/c of Ensembl.  Does
> anyone know where Will Spooner is?  He's the maintainer for
> Bio::Root::Storable.
>

Hi Chris,

I'm still lurking...

If the tests for Bio::Root::Storable still pass (I assume that they do), 
then the module is working as advertised.

The idea behind Storable is very simple: object instances of any 
inheriting class can be serialised to and retrieved from disk. BioPerl 
objects will probably not want this functionality by default, but it is 
trivial to implement if needed.

Will


From cjfields at uiuc.edu  Mon Oct  2 17:58:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 12:58:15 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <Pine.LNX.4.64.0610021651550.1560@parrot.ebi.ac.uk>
Message-ID: <000601c6e64c$5746f990$15327e82@pyrimidine>

> On Mon, 2 Oct 2006, Chris Fields wrote:
> 
> >> Sendu,
> >>
> >> No objection but someone should check the scripts in examples/root to
> make
> >> sure that they are not used there.
> >>
> >> Brian O.
> >
> > I suppose it's also possible that the other bioperl distributions (like
> > bioperl-run) could use them as well.
> >
> > If they do we can take care of them as they pop up.  These are really
> old
> > and haven't been revised in a long time.
> >
> > The only one I worry about is Bio::Root::Storable b/c of Ensembl.  Does
> > anyone know where Will Spooner is?  He's the maintainer for
> > Bio::Root::Storable.
> >
> 
> Hi Chris,
> 
> I'm still lurking...
> 
> If the tests for Bio::Root::Storable still pass (I assume that they do),
> then the module is working as advertised.
> 
> The idea behind Storable is very simple; object instances of any
> inhereting class can be serialised/retrieved from disk. BioPerl objects
> will probably not want this functionality by default, but it is trival to
> implement if needed.
> 
> Will

Okay, nice to know you're listening in!  Based on that we should keep it in.
The rest that Torsten mentioned could probably be removed right away.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From osborne1 at optonline.net  Mon Oct  2 17:59:58 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Mon, 02 Oct 2006 13:59:58 -0400
Subject: [Bioperl-l] bioperl-network warnings when loading the DIP
	dataset
In-Reply-To: <20061002002403.GD12075@iucha.net>
Message-ID: <C146CBDE.A938%osborne1@optonline.net>

Florin,

OK, this is fixed in CVS now. The problem is that there's some variability
in how the PSI MI "standard" is used. In this case there was a species that
was not given a value for its scientific name ("fullName"), I had to use
common name in its place. Fortunately there's an NCBI taxon id behind all
this.

Thanks again,

Brian O.


On 10/1/06 8:24 PM, "Florin Iucha" <florin at iucha.net> wrote:

> Also the largest file (full_2_psi1.xml) from release 2006-09-19 of the
> MINT [1] database does not produce the crash.  It has a new warning, however:
> 
>    Can't call method "text" on an undefined value at
>    /usr/local/share/perl/5.8.8/Bio/Network/IO/psi.pm line 290.


From mmacho at gmail.com  Mon Oct  2 17:43:13 2006
From: mmacho at gmail.com (ende)
Date: Mon, 2 Oct 2006 19:43:13 +0200
Subject: [Bioperl-l] Variable scope
Message-ID: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>


	Hi

This may be a general Perl question and so off-topic for this list.
My apologies for any inconvenience.

It is an annoying problem that is making me waste a lot of time.

I have a package (with its new() method, etc.) that has constants in it, like:

#-----
use constant False => 0;
use constant True => 1;

our %CLRFG = (
               PLASMIDO      => RED,
               POLY_A        => GREEN,
               RESTR_SITES   => BLUE,
               CONECTORS     => MAGENTA,
               CONTAMINANTS  => CYAN,
           );

our %CLRBG = (
               PLASMIDO      => "",
               POLY_A        => "",
               RESTR_SITES   => "",
               CONECTORS     => "",
               CONTAMINANTS  => "",
           );
#------

These constants are included with require "h.pl" from the main package
file.

I load this module with "use" from the main command-line driver to test
it.  In the driver I can use the constants False and True directly
without any trouble, for example "return True", with no reference to
where the constant comes from.

But for the variables %CLRFG and %CLRBG (I would like them to be
constants too, but how?) I can't find a way to refer to them, either
inside the module or outside it.  In the end I gave up and _copied_ the
definitions to wherever I needed them (in the sub where I print ANSI
terminal colouring sequences).

I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Any help?


--
     Juan Falgueras
     Profesor del Depto. de Lenguajes y Ciencias de la Computación
     Universidad de Málaga


From cjfields at uiuc.edu  Mon Oct  2 20:52:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 15:52:11 -0500
Subject: [Bioperl-l] Do we need Bio::Root::Object anymore?
In-Reply-To: <C14696F5.A903%osborne1@optonline.net>
Message-ID: <000001c6e664$a25538d0$15327e82@pyrimidine>

I have updated the Deprecation page with the Bio::Root::* modules that we
plan on deprecating (note that I have them being removed for rel. 1.5.2).  I
have left out Bio::Root::Storable for now based on Will's response.  

http://www.bioperl.org/wiki/Deprecated_modules

I'll update the DEPRECATED doc in CVS as well.  There is a tentative
schedule for when warnings are added for modules before they are removed.  

In relation to the recent trend for house-cleaning, I noticed that the
Bio::Tools::BP* BLAST-related modules are all still present but haven't
been modified or had deprecation warnings added.  BPlite was marked for
deprecation around rel. 1.5, since its functionality (like that of the
others) is present in Bio::SearchIO.  Judging by the mail list, no one has
used these in quite a while, and everyone has been redirected to use
Bio::SearchIO instead.  Based on that I have added deprecation warnings in
CVS to BPlite and the related modules BPpsilite and BPbl2seq.
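
For anyone who has not seen what such a warning looks like from the user's side, here is a rough sketch.  It is not the code committed to CVS: My::OldParser is a made-up module name, and the sketch uses only the core Carp module rather than any BioPerl-specific mechanism.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical deprecated module (stand-in for something like BPlite).
package My::OldParser;
use Carp qw(carp);

sub new {
    my ($class, %args) = @_;
    # carp reports the warning at the caller's line, so users can see
    # which of their own scripts still uses the old module.
    carp "$class is deprecated; use Bio::SearchIO instead";
    return bless {%args}, $class;
}

package main;

# Collect warnings instead of printing them, to show what the caller gets.
my @seen;
{
    local $SIG{__WARN__} = sub { push @seen, @_ };
    My::OldParser->new;
}
print "warning issued: $seen[0]";
```

The object is still constructed as before, so existing scripts keep working for a release or two while their authors are nudged towards the replacement.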

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Brian Osborne
> Sent: Monday, October 02, 2006 9:14 AM
> To: Sendu Bala; bioperl-l
> Subject: Re: [Bioperl-l] Do we need Bio::Root::Object anymore?
> 
> Sendu,
> 
> No objection but someone should check the scripts in examples/root to make
> sure that they are not used there.
> 
> Brian O.
> 
> 
> On 10/2/06 5:55 AM, "Sendu Bala" <bix at sendu.me.uk> wrote:
> 
> > Torsten Seemann wrote:
> >>>>> I have removed all use/@ISA Bio::Root::Object references from
> >>>>> bioperl-live, except for those in Bio::Root::* itself:
> >>
> >>>> So I'd say they're both relics that can be removed. In fact I was
> >>>> planning on getting rid off all references to both of these modules
> >>>> before you did, so thanks! :)
> >>
> >>> I think they can go. It's probably a pre-1.0 deprecation that somehow
> >>> was never followed through on.
> >>
> >> Today I did a fresh CVS checkout of bioperl-live, and deleted the
> >> following modules and tests, and all tests passed with BIOPERLDEBUG=0
> >>
> >>      * Bio::Root::Err
> >>      * Bio::Root::Global
> >>      * Bio::Root::IOManager
> >>      * Bio::Root::Object
> >>      * Bio::Root::Storable
> >>      * Bio::Root::Utilities  # may be used by third parties?
> >>      * Bio::Root::Vector
> >>      * Bio::Root::Xref
> >>      * t/Root-Utilities.t    # need to keep if we keep Utilities.pm
> >>      * t/RootStorable.t
> >>
> >> Should we schedule for deprecation, or deprecate immediately as Hilmar
> >> suggested they were meant to be deprecated long ago ?
> >
> > I'm happy to get rid of them all straight away. Does anyone object?


From florin at iucha.net  Mon Oct  2 20:47:01 2006
From: florin at iucha.net (Florin Iucha)
Date: Mon, 2 Oct 2006 15:47:01 -0500
Subject: [Bioperl-l] Variable scope
In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
Message-ID: <20061002204701.GG14409@iucha.net>

On Mon, Oct 02, 2006 at 07:43:13PM +0200, ende wrote:
> It is a annoying problem that is making me waste lot of time.
> 
> I have a package with its new object, etc... and constants in it like:
> 
> #-----
> use constant False => 0;
> use constant True => 1;
> 
> our %CLRFG = (
>                PLASMIDO      => RED,
>                POLY_A        => GREEN,
>                RESTR_SITES   => BLUE,
>                CONECTORS     => MAGENTA,
>                CONTAMINANTS  => CYAN,
>            );
> 
> our %CLRBG = (
>                PLASMIDO      => "",
>                POLY_A        => "",
>                RESTR_SITES   => "",
>                CONECTORS     => "",
>                CONTAMINANTS  => "",
>            );
> #------
> 
> this constants are include with require "h.pl" from the main package  
> file.
> 
> I use this module from the mail command line driver to test it  
> "using" it.  In the command line driver I can use with no gripe the  
> constants False and True directly, for example "return True", etc  
> without any reference to the origin of that constant.

It is possible you get them from somewhere else.

> But, with respect to the variables (I would like they also were  
> constants.. but how?), %CLRFG and %CLRBG I can't find the way of  
> refering those int the module.  Finally I have desisted and _copy_  
> the definitions where  I have needed it (in the sub were I print Ansi  
> terminal colouring seqs...).  I don't find how to refer those  
> variables out of the module.
> 
> I have tried %modulename::CLRFG, for example, but Perl gives me errors.

Did you actually declare a package name in "h.pl" ?
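
To illustrate why the package line matters, here is a minimal sketch (this is not the attached file; "Colours" is a made-up package name).  A file loaded with require is compiled in package main unless it declares its own package, so without that line there is no %modulename::CLRFG to refer to.  That would probably also explain why True and False were visible in the driver: both would live in main.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the contents of h.pl. Without the package statement below,
# the "our" variables would be %main::CLRFG and so on.
package Colours;

our %CLRFG = (
    PLASMIDO => 'red',
    POLY_A   => 'green',
);

# Back in the driver script.
package main;

# A fully qualified name works from any package, even under "use strict".
print "POLY_A => $Colours::CLRFG{POLY_A}\n";

# The whole hash can be iterated the same way.
for my $key (sort keys %Colours::CLRFG) {
    print "$key => $Colours::CLRFG{$key}\n";
}
```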

Is there any reason you don't call the file ".pm" and load it with
"use"?  I have attached a small example of importing that works.

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra
-------------- next part --------------
A non-text attachment was scrubbed...
Name: one.pm
Type: text/x-perl
Size: 118 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061002/cf339845/attachment-0012.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: two.pl
Type: text/x-perl
Size: 69 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061002/cf339845/attachment-0013.bin>

From Kevin.M.Brown at asu.edu  Mon Oct  2 23:44:50 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 2 Oct 2006 16:44:50 -0700
Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module
Message-ID: <1A4207F8295607498283FE9E93B775B4021960CD@EX02.asurite.ad.asu.edu>

Well, for anyone that wants to know, I found a way to capture the output
of ClustalW to get at things like the score.

# Copy STDOUT to another handle
open(OUTCOPY, ">&STDOUT") or die "Couldn't dup STDOUT: $!";

# Change where STDOUT goes
open(STDOUT, ">log.test") or die "Couldn't open log.test: $!";

# Run the alignment; its output will be captured by the STDOUT
# redirection
$aln = $factory->align(\@seq);

# Restore STDOUT to its normal location for the rest of the script
close STDOUT;
open(STDOUT, ">&OUTCOPY") or die "Couldn't restore STDOUT: $!";

I guess I can understand why most of this is just dropped by the
ClustalW.pm module since there doesn't seem to be a way to hold it all
in a SimpleAlign object.
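
Here is a self-contained sketch of the same redirection wrapped in a helper, plus pulling the score out of the captured text.  Two things are assumptions on my part: the helper names (capture_stdout, clustalw_score) are invented, and the log line format "Sequences (1:2) Aligned. Score:  95" is what I believe ClustalW prints, so check it against your own log.test.  A child Perl process stands in for the align() call so the sketch runs without ClustalW installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run $code with STDOUT (including output from child processes such as
# clustalw) redirected to $logfile, then restore STDOUT and return the text.
sub capture_stdout {
    my ($code, $logfile) = @_;

    open(my $saved, '>&', \*STDOUT) or die "Couldn't dup STDOUT: $!";
    open(STDOUT, '>', $logfile)     or die "Couldn't open $logfile: $!";

    my $ok  = eval { $code->(); 1 };
    my $err = $@;

    # Always restore STDOUT, even if $code died.
    close(STDOUT);
    open(STDOUT, '>&', $saved) or die "Couldn't restore STDOUT: $!";
    die $err unless $ok;

    open(my $in, '<', $logfile) or die "Couldn't read $logfile: $!";
    local $/;                       # slurp mode
    my $text = <$in>;
    close($in);
    return $text;
}

# Assumed log format: "Sequences (1:2) Aligned. Score:  95"
sub clustalw_score {
    my ($text) = @_;
    return $text =~ /Score:\s*(\d+)/ ? $1 : undef;
}

# Stand-in for "$aln = $factory->align(\@seq)": a child process that
# prints one line in the assumed format.
my $log = capture_stdout(
    sub { system($^X, '-e', 'print "Sequences (1:2) Aligned. Score:  95\n"') },
    'log.test',
);
print "score: ", clustalw_score($log), "\n";
```

In a real script the anonymous sub would contain the $factory->align(\@seq) call and assign its result to a variable declared outside the sub.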

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Kevin Brown
> Sent: Thursday, September 28, 2006 2:48 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] ClustalW alignment and Bioperl::Run module
> 
> I've gotten a very simple script to run using bioperl that creates an
> alignment using clustalw of two sequences.  I see that clustal outputs
> to stdout information like the score, but I don't see any way to store
> that or retrieve that from the alignment object that is 
> returned (unless
> I'm just blind).  What follows is my very basic script which used code
> found in the Wiki.
> 
> print $aln->score() spits out an error about using an uninitialized
> value.
> 
> 
> #!/usr/bin/perl -w
> 
> use strict;
> use Bio::SeqIO;
> use Bio::Perl;
> use Bio::AlignIO;
> use Getopt::Long qw(:config no_ignore_case bundling pass_through);
> use POSIX;
> use Bio::Tools::Run::Alignment::Clustalw;
> 
> my $fileName   = "";         # filename(s) to be parsed for 
> information
> my $output_dir = "";
> my $format     = 'fasta';    # default format for SeqIO module
> 
> GetOptions(
>                    'file=s'   => \$fileName,
>                    'output=s' => \$output_dir,
>                   );
> 
> # Parse the input file for the needed information
> # SeqIO supports several normal formats including <tab>, <fasta> and
> <excel>
> 
> my @files = split(/\|/, $fileName);
> my @seq_array;
> 
> my $stream_out =
>   Bio::AlignIO->new(-file => '>test.msf', -format => 'msf', -flush =>
> 0);
> 
> foreach my $fileName (@files)
> {
>         my $file = Bio::SeqIO->new(-format => $format, -file =>
> $fileName);
>         my $seq;
>         while ($seq = $file->next_seq())
>         {
>                 push(@seq_array, $seq);
>         }
> }
> 
> my @params  = ('ktuple' => 2, 'matrix' => 'BLOSUM');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
> my $ktuple  = 3;
> $factory->ktuple($ktuple);    # change the parameter before executing
>     # where @seq_array is an array of {{PM|Bio::Seq}} objects
> 
> open my $out, ">seq.txt";
> 
> for (my $i = 1 ; $i <= $#seq_array ; $i++)
> {
>         my @seq = ($seq_array[0], $seq_array[$i]);
>         my $aln = $factory->align(\@seq);
>         $stream_out->write_aln($aln);
>         print $aln->score;
>         for my $seq ($aln->each_seq) {
>                 print $out $seq->display_id() ."\t". $seq->seq()."\n";
>         }
> }
> 


From bix at sendu.me.uk  Mon Oct  2 23:48:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 00:48:34 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
Message-ID: <4521A552.60301@sendu.me.uk>

Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
upload tar.gz files when I have access to the server, then reply here 
with links.

In the meantime, see http://www.bioperl.org/wiki/Release_1.5.2 for 
instructions on getting and testing this RC.

Developers:
   Make sure you're in the AUTHORS file in all 4 packages, as
   appropriate.

Users:
   Even though 1.5.2 is a 'developer' release, we consider it the most
   stable and capable version of Bioperl, and recommend that you use
   it in all but the most critical production environments. Please
   try it out and let us know of any problems or difficulties you run
   into.


Thank you,
Sendu.


From lincoln.stein at gmail.com  Mon Oct  2 21:53:38 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 2 Oct 2006 21:53:38 +0000
Subject: [Bioperl-l] Variable scope
In-Reply-To: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
References: <8C1100B4-5EE1-48C7-B2E8-E1392D6AFBCD@gmail.com>
Message-ID: <6dce9a0b0610021453va2132c7u73747b9253211a66@mail.gmail.com>

Hi,

Read the documentation for Exporter. It is much better to formally export
constants, variables and functions and to import them with "use" than to use
"require". Also be sure that you understand how namespaces and modules work.

This is not a BioPerl topic and should have been directed to a general Perl
discussion list, such as Perl Monks.
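
For the archive, the export/import approach looks roughly like the sketch below.  SeqColors is a made-up module name, and the module is inlined so the example runs as a single file, which is why the BEGIN blocks are there; in a separate SeqColors.pm ending in "1;" the assignments could sit at file scope and the script would simply say "use SeqColors qw(%CLRFG True);".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical module, inlined here for the sketch.
package SeqColors;
use Exporter 'import';              # borrow Exporter's import() method
use constant { False => 0, True => 1 };

our (@EXPORT_OK, %CLRFG);
BEGIN {
    # Nothing is exported unless the caller asks for it by name.
    @EXPORT_OK = qw(%CLRFG True False);
    %CLRFG = (
        PLASMIDO    => 'red',
        POLY_A      => 'green',
        RESTR_SITES => 'blue',
    );
}

package main;

# This is what "use SeqColors qw(%CLRFG True);" does behind the scenes.
BEGIN { SeqColors->import(qw(%CLRFG True)) }

print "POLY_A is drawn in $CLRFG{POLY_A}\n";
print "True is ", True, "\n";
```

Because the hash is imported by name, it can be used unqualified under "use strict", which removes the need to copy the definitions into every sub that needs them.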

Lincoln

On 10/2/06, ende <mmacho at gmail.com> wrote:
>
>
>         Hi
>
> this may be a typical perl topic and then out of this list center
> topic.  My apologize for any inconvenience.
>
> It is a annoying problem that is making me waste lot of time.
>
> I have a package with its new object, etc... and constants in it like:
>
> #-----
> use constant False => 0;
> use constant True => 1;
>
> our %CLRFG = (
>                PLASMIDO      => RED,
>                POLY_A        => GREEN,
>                RESTR_SITES   => BLUE,
>                CONECTORS     => MAGENTA,
>                CONTAMINANTS  => CYAN,
>            );
>
> our %CLRBG = (
>                PLASMIDO      => "",
>                POLY_A        => "",
>                RESTR_SITES   => "",
>                CONECTORS     => "",
>                CONTAMINANTS  => "",
>            );
> #------
>
> this constants are include with require "h.pl" from the main package
> file.
>
> I use this module from the mail command line driver to test it
> "using" it.  In the command line driver I can use with no gripe the
> constants False and True directly, for example "return True", etc
> without any reference to the origin of that constant.
>
> But, with respect to the variables (I would like they also were
> constants.. but how?), %CLRFG and %CLRBG I can't find the way of
> refering those int the module.  Finally I have desisted and _copy_
> the definitions where  I have needed it (in the sub were I print Ansi
> terminal colouring seqs...).  I don't find how to refer those
> variables out of the module.
>
> I have tried %modulename::CLRFG, for example, but Perl gives me errors.
>
> Any help?
>
>
>
>
> --
>      Juan Falgueras
>      Profesor del Depto. de Lenguajes y Ciencias de la Computaci?n
>      Universidad de M?laga
>
>
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From florin at iucha.net  Tue Oct  3 02:30:31 2006
From: florin at iucha.net (Florin Iucha)
Date: Mon, 2 Oct 2006 21:30:31 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <20061003023031.GI14409@iucha.net>

On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
> upload tar.gz files when I have access to the server, then reply here 
> with links.
> 
> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for 
> instructions on getting and testing this RC.

[I won't create a wiki account just to report this.]

Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
not set.  Lots of warnings about missing packages and all, but this
looks interesting:

   Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.

Otherwise:

   Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay.

The failed test is:

   t/ESEfinder..................dubious
      Test returned status 255 (wstat 65280, 0xff00)
   DIED. FAILED test 15

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra


From cjfields at uiuc.edu  Tue Oct  3 03:50:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 22:50:47 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu>

So far all tests pass on Mac OS X.  I'll add this to the release page.

This RC will throw warnings for four tests I didn't remove in time
(BPlite.t, BPpsilite.t, BPbl2seq.t, and RestrictionEnzyme.t), which
correspond to their namesake deprecated Bio::Tools modules.  These
are no longer in CVS HEAD so should be gone by the next RC, and the
relevant modules marked for deprecation.

I can verify the t/BioDBSeqFeature.t warning on Mac OS X that
Florin reported, but ESEfinder.t works fine:

t/BioDBSeqFeature............Argument "+" isn't numeric in numeric lt  
(<) at Bio/DB/SeqFeature/Segment.pm line 423.
ok
....

I'll report WinXP tests tomorrow on the wiki.

Chris


On Oct 2, 2006, at 6:48 PM, Sendu Bala wrote:

> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll
> upload tar.gz files when I have access to the server, then reply here
> with links.
>
> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for
> instructions on getting and testing this RC.
>
> Developers:
>    Make sure you're in the AUTHORS file in all 4 packages, as
>    appropriate.
>
> Users:
>    Even though 1.5.2 is a 'developer' release, we consider it the most
>    stable and capable version of Bioperl, and recommend that you use
>    it in all but the most critical production environments. Please
>    try it out and let us know of any problems or difficulties you run
>    into.
>
>
> Thank you,
> Sendu.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct  3 03:54:29 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 2 Oct 2006 22:54:29 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <20061003023031.GI14409@iucha.net>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
Message-ID: <DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>

> [I won't create a wiki account just to report this.]
>
> Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> not set.  Lots of warnings about missing packages and all, but this
> looks interesting:
>
>    Argument "+" isn't numeric in numeric lt (<) at Bio/DB/ 
> SeqFeature/Segment.pm line 423.

This is verified on Mac OS X.

> Otherwise:
>
>    Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,  
> 99.99% okay.
>
> The failed test is:
>
>    t/ESEfinder..................dubious
>       Test returned status 255 (wstat 65280, 0xff00)
>    DIED. FAILED test 15

What do you get when you run that set of tests using
'perl -I. -w t/ESEfinder.t'?  The bad status code is odd and could be a
remote server issue.

Chris


>
> florin
>
> -- 
> If we wish to count lines of code, we should not regard them as lines
> produced but as lines spent.                       -- Edsger Dijkstra

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From torsten.seemann at infotech.monash.edu.au  Tue Oct  3 04:30:06 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 03 Oct 2006 14:30:06 +1000
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
Message-ID: <4521E74E.1040404@infotech.monash.edu.au>

My understanding is that all Bioperl-compliant classes should inherit 
from Bio::Root::Root, not Bio::Root::RootI.

Additionally, if functions such as throw() or _rearrange() are to be 
used without a class instance reference, they are to be used as class 
methods via Bio::Root::Root, not Bio::Root::RootI.

Is this correct?

My naive audit of bioperl-live CVS brought up the following statistics:

# Root.pm
/cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l
26
/cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l
346

# RootI.pm
/cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l
9
/cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l
79

My guess would be that all RootI should be changed to plain Root ?

Any help appreciated,

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From jason at bioperl.org  Tue Oct  3 06:03:17 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 2 Oct 2006 23:03:17 -0700
Subject: [Bioperl-l] t/ESEFinder.t fixed on branch
Message-ID: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>

Looks like good work everyone.

All tests pass for me with RC1 on OS X (perl 5.8.6), with and without
BIOPERLDEBUG=1, except for the t/ESEfinder problem, which I've fixed.

It skipped too few tests when BIOPERLDEBUG=0.
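
For reference, the general shape of the problem, as a sketch rather than the actual ESEfinder.t code: with Test::More the number passed to skip() has to cover every test in the SKIP block, otherwise the plan and the number of tests run disagree.  fetch_remote_result is a hypothetical stand-in for a call to a remote server.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Test::More;

# The plan must equal the tests that run PLUS the tests that are skipped.
my $LOCAL_TESTS  = 2;   # always run
my $REMOTE_TESTS = 2;   # need a remote server
plan tests => $LOCAL_TESTS + $REMOTE_TESTS;

# Hypothetical stand-in for a call to a remote service.
sub fetch_remote_result { return { status => 'ok', hits => 3 } }

ok(1, 'module loads');
is(length('ACGT'), 4, 'local sequence check');

SKIP: {
    # The second argument is how many tests this block would have run.
    skip('Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test',
         $REMOTE_TESTS) unless $ENV{BIOPERLDEBUG};

    my $result = fetch_remote_result();
    is($result->{status}, 'ok', 'remote server answered');
    is($result->{hits},   3,    'remote hit count');
}
```

Keeping the skip count in the same variable as the plan arithmetic makes it harder for the two to drift apart when tests are added later.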

Don't forget to merge branch changes back to head for this test when
it is done.  I don't want to muddy the water, so I'm holding off
migrating the changes to the main trunk, as the file is substantially
different (I presume pre-Test::More adoption?).

-jason


From bix at sendu.me.uk  Tue Oct  3 07:28:48 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:28:48 +0100
Subject: [Bioperl-l] t/ESEFinder.t fixed on branch
In-Reply-To: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>
References: <8927E580-997E-402A-BA81-D09FF9AD5D9A@bioperl.org>
Message-ID: <45221130.2060405@sendu.me.uk>

Jason Stajich wrote:
> Looks like good work everyone.
> 
> All tests pass for me on OSX perl 5.8.6. w and w/o BIOPERLDEBUG=1  
> with RC1 except for the t/ESEFinder problem which I've fixed.
> 
> It skipped too few tests when BIOPERLDEBUG=0.
> 
> Don't forget to merge branch changes back to head for this test when  
> it is done.   I don't want to muddy water so I'm holding off  
> migrating the changes to main trunk as the files is substantially  
> different (I presume pre-Test::More adoption?).

Actually, it was the same until Torsten made his own (different) fixes 
to HEAD but not to branch. It was my mistake, and I've corrected it in 
yet a third way; branch and HEAD now match.

No harm done :)


From bix at sendu.me.uk  Tue Oct  3 07:31:10 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:31:10 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu>
References: <4521A552.60301@sendu.me.uk>
	<7F5CF5FF-5A1D-4DEA-A861-D4C47756F67D@uiuc.edu>
Message-ID: <452211BE.6080107@sendu.me.uk>

Chris Fields wrote:
> So far all tests pass on Mac OS X.  I'll add this to the release page.
> 
> This RC will throw warnings for four tests I didn't remove in time  
> (BPlite.t, BPpsilite, BPbl2seq.t, and RestrictionEnzyme.t), which  
> correspond to their namesake deprecated Bio::Tools modules.  These  
> are no longer in CVS HEAD so should be gone by the next RC, and the  
> relevant modules marked for deprecation.

Thanks Chris. Sorry I missed these.


From bix at sendu.me.uk  Tue Oct  3 07:32:08 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 08:32:08 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <20061003023031.GI14409@iucha.net>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
Message-ID: <452211F8.8040104@sendu.me.uk>

Florin Iucha wrote:
> On Tue, Oct 03, 2006 at 12:48:34AM +0100, Sendu Bala wrote:
>> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
>> upload tar.gz files when I have access to the server, then reply here 
>> with links.
>>
>> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for 
>> instructions on getting and testing this RC.
> 
> [I won't create a wiki account just to report this.]
> 
> Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> not set.  Lots of warnings about missing packages and all, but this
> looks interesting:
> 
>    Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.
> 
> Otherwise:
> 
>    Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed, 99.99% okay.
> 
> The failed test is:
> 
>    t/ESEfinder..................dubious
>       Test returned status 255 (wstat 65280, 0xff00)
>    DIED. FAILED test 15

Thanks for your feedback Florin. The ESEfinder fail will be fixed in the 
next RC.


From bix at sendu.me.uk  Tue Oct  3 08:29:37 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 03 Oct 2006 09:29:37 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <45221F71.40206@sendu.me.uk>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
> upload tar.gz files when I have access to the server, then reply here 
> with links.

Live/core:
http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-1.5.2-RC1.zip

Run:
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-run-1.5.2-RC1.zip

DB:
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-db-1.5.2-RC1.zip

Network:
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.gz
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.tar.bz2
http://bioperl.org/DIST/bioperl-network-1.5.2-RC1.zip

Md5 checksums are in:
http://bioperl.org/DIST/SIGNATURES.md5


From jason at bioperl.org  Tue Oct  3 06:11:30 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 2 Oct 2006 23:11:30 -0700
Subject: [Bioperl-l]  Use of Root.pm versus RootI.pm
Message-ID: <87F9B64E-8BDA-464B-814D-3F117AA646A1@bioperl.org>

I only briefly saw your question - but RootI is for interfaces,  
Root.pm is for instantiated objects.


From florin at iucha.net  Tue Oct  3 11:39:12 2006
From: florin at iucha.net (Florin Iucha)
Date: Tue, 3 Oct 2006 06:39:12 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
	<DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
Message-ID: <20061003113912.GJ14409@iucha.net>

On Mon, Oct 02, 2006 at 10:54:29PM -0500, Chris Fields wrote:
> >Otherwise:
> >
> >   Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,  
> >99.99% okay.
> >
> >The failed test is:
> >
> >   t/ESEfinder..................dubious
> >      Test returned status 255 (wstat 65280, 0xff00)
> >   DIED. FAILED test 15

$ perl -I. -w t/ESEfinder.t
1..15
ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder;
ok 2 - use Data::Dumper;
ok 3 - use Bio::PrimarySeq;
ok 4 - use Bio::Seq;
ok 5
ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
# Looks like you planned 15 tests but only ran 14.
$ grep Id t/ESEfinder.t
# $Id: ESEfinder.t,v 1.13.6.2 2006/10/02 23:10:39 sendu Exp $
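
The mismatch above (15 planned, 14 run) is the kind of message Test::More
gives when the count passed to skip() is smaller than the number of tests
in the skipped block. Below is a minimal sketch of a consistent layout,
assuming five local tests and ten remote ones. The real contents of
t/ESEfinder.t are not shown in this thread, so the counts and test names
are placeholders, and this is not necessarily how the script was fixed.

```perl
use strict;
use warnings;
use Test::More;

# Assumed split: tests that never touch the network vs. tests that do.
my $LOCAL_TESTS  = 5;
my $REMOTE_TESTS = 10;
plan tests => $LOCAL_TESTS + $REMOTE_TESTS;

# Placeholders for the module-loading and object-creation checks.
pass("local test $_") for 1 .. $LOCAL_TESTS;

SKIP: {
    # The count given to skip() must equal the number of tests in this
    # block, otherwise the plan and the number of tests run disagree.
    skip('Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test',
         $REMOTE_TESTS) unless $ENV{BIOPERLDEBUG};

    # Placeholders for the checks that would contact the remote server.
    pass("remote test $_") for 1 .. $REMOTE_TESTS;
}
```

Keeping both counts in named variables means the plan and the skip count
cannot drift apart when a test is added to either group.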

florin

-- 
If we wish to count lines of code, we should not regard them as lines
produced but as lines spent.                       -- Edsger Dijkstra


From hlapp at gmx.net  Tue Oct  3 12:27:46 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 3 Oct 2006 08:27:46 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au>
References: <4521E74E.1040404@infotech.monash.edu.au>
Message-ID: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net>

The interface classes (those ending in 'I') should actually inherit  
from RootI, not Root.

In practice I think this recommendation is more theoretical than
something that makes much of a difference. The motivation is that
interface classes should not determine the actual implementation of a
class (hash ref, array ref, whatever), and since Root.pm contains lots
of implementation using a hash ref, inheriting from it basically makes
that decision.

On the other hand, RootI contains implementation too, although  
I'm not sure it would prescribe the object implementation as opposed  
to merely implementing static methods (like throw(), warn(), etc).  
That would need to be checked.
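
To illustrate the motivation, here is a small sketch in plain Perl. The
package names (My::NamedI and friends) are made up for this example and
are not BioPerl classes. The interface only calls methods, so one
implementation can use a hash ref and another an array ref.

```perl
use strict;
use warnings;

{
    # Hypothetical interface class: declares methods, stores nothing.
    package My::NamedI;
    sub name { die ref(shift) . " must implement name()\n" }

    # Shared behaviour written only in terms of the interface.
    sub describe { my $self = shift; return 'object named ' . $self->name }
}

{
    # Implementation backed by a hash ref.
    package My::Named::Hash;
    our @ISA = ('My::NamedI');
    sub new  { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub name { $_[0]->{name} }
}

{
    # Implementation backed by an array ref.
    package My::Named::Array;
    our @ISA = ('My::NamedI');
    sub new  { my ($class, $name) = @_; return bless [ $name ], $class }
    sub name { $_[0]->[0] }
}

print My::Named::Hash->new('hash-based')->describe, "\n";
print My::Named::Array->new('array-based')->describe, "\n";
```

If the interface itself reached into $self->{...}, the array-based class
could not inherit from it, which is the concern with inheriting
interfaces from a hash-based root class.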

	-hilmar

On Oct 3, 2006, at 12:30 AM, Torsten Seemann wrote:

> My understanding is that all Bioperl-compliant classes should inherit
> from Bio::Root::Root, not Bio::Root::RootI.
>
> Additionally, if functions such as throw() or _rearrange() are to be
> used without a class instance reference, they are to be used as class
> methods via Bio::Root::Root, not Bio::Root::RootI.
>
> Is this correct?
>
> My naive audit of bioperl-live CVS brought up the following  
> statistics:
>
> # Root.pm
> /cvs/bioperl-live $ grep -r 'use Bio::Root::Root;' Bio | wc -l
> 26
> /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::Root' Bio | wc -l
> 346
>
> # RootI.pm
> /cvs/bioperl-live $ grep -r 'use Bio::Root::RootI;' Bio | wc -l
> 9
> /cvs/bioperl-live $ grep -r 'use base.*Bio::Root::RootI' Bio | wc -l
> 79
>
> My guess would be that all RootI should be changed to plain Root ?
>
> Any help appreciated,
>
> -- 
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct  3 12:33:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 3 Oct 2006 07:33:37 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <20061003113912.GJ14409@iucha.net>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
	<DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
	<20061003113912.GJ14409@iucha.net>
Message-ID: <44724E16-74CD-4778-B04F-529475B47E37@uiuc.edu>

Florin,

Looks like this is fixed and should be working in the next release.

Chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct  3 14:29:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 3 Oct 2006 09:29:51 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <338034D7-BCB0-4653-9C80-2E7EC9E09E9B@gmx.net>
Message-ID: <002101c6e6f8$67b4ae10$15327e82@pyrimidine>

> The interface classes (those ending in 'I') should actually inherit
> from RootI, not Root.
> 
> In reality this recommendation is more theoretical than it makes that
> much of a difference I think. The motivation is that interface
> classes should not determine the actual implementation of a class
> (hash ref, array ref, whatever), and since Root.pm contains lots of
> implementation using a hash ref that decision will basically have
> been made.
> 
> On the contrary though, RootI contains implementation too, although
> I'm not sure it would prescribe the object implementation as opposed
> to merely implementing static methods (like throw(), warn(), etc).
> That would need to be checked.
> 
> 	-hilmar

The constructor in Bio::Root::RootI lets one know that its use is
deprecated, so you shouldn't have any cases of 'our qw(Bio::Root::RootI)';
there should be some way of inheriting Root directly or indirectly.  I would
say that any direct use of RootI is not good practice, though.  For the
current implementation we should only inherit Bio::Root::Root, which
implements RootI.

Is there any reason to shut off the warning with BIOPERLDEBUG?  

From RootI:

sub new {
  my $class = shift;
  my @args = @_;
  unless ( $ENV{'BIOPERLDEBUG'} ) {
      carp("Use of new in Bio::Root::RootI is deprecated.  Please use Bio::Root::Root instead");
  }
  eval "require Bio::Root::Root";
  return Bio::Root::Root->new(@args);
}


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign




From slenk at emich.edu  Tue Oct  3 17:31:47 2006
From: slenk at emich.edu (Stephen Gordon Lenk)
Date: Tue, 03 Oct 2006 13:31:47 -0400
Subject: [Bioperl-l] Perl 6 has 'roles' - may be cleanly applicable to the
	Root/RootI issue
Message-ID: <5147da5514e402.514e4025147da5@emich.edu>

I looked at the Perl 6 site; there is an RFC on interfaces:
http://dev.perl.org/perl6/rfc/265.html

Roles seem to be the Perl 6 answer to the Root/RootI issue in Bioperl. 
Maybe it is too early to suggest this.

http://dev.perl.org/perl6/doc/design/apo/A12.html:
The primary role of a class is to manage instances, that is, objects. 
So a class must worry about object creation and destruction, and 
everything that happens in between. Classes have a secondary role as 
units of software reuse, in that they can be inherited from or 
delegated to. However, because this is a secondary role, and because 
of weaknesses in models of inheritance, composition, and delegation, 
Perl 6 will split out the notion of software reuse into a separate 
class-like entity called a "role". Roles are an abstraction mechanism 
for use by classes that don't care about the secondary aspects of 
software reuse, or that (looking at it the other way) care so much 
about it that they want to encapsulate any decisions about 
implementation, composition, delegation, and maybe even inheritance. 
Sounds fancy, but just think of them as includes of partial classes, 
with some safety checks. Roles don't manage objects. They manage 
interfaces and other abstract behavior (like default implementations), 
and they help classes manage objects. As such, a role may only be 
composed into a class or into another role, never inherited from or 
delegated to. That's what classes are for.


From slenk at emich.edu  Tue Oct  3 16:45:15 2006
From: slenk at emich.edu (Stephen Gordon Lenk)
Date: Tue, 03 Oct 2006 12:45:15 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
Message-ID: <5120d6a511f5a7.511f5a75120d6a@emich.edu>

The separation of interface and implementation is generally
regarded as a good idea. Right now the Bioperl community is
doing this as part of the implementation of Bioperl. I suggest
that this is an example of something which you might want to
have as part of the Perl implementation. If Perl 6 (or even
Perl 5) does not have this as a core part of the language or
as a standard package (reusable by all in a common fashion),
you may want to suggest to the Perl implementers that a way
for interface/implementation distinctions be made part of the
core language. My 2 cents, as you people are the experts on 
your own code.




From cjfields at uiuc.edu  Tue Oct  3 17:49:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 3 Oct 2006 12:49:35 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <5120d6a511f5a7.511f5a75120d6a@emich.edu>
Message-ID: <000001c6e714$4c2cbb80$15327e82@pyrimidine>

Perl6 already has added flexibility for separation of
implementation/interface (I believe they are called roles).  

http://dev.perl.org/perl6/doc/design/syn/S12.html

To tell the truth, I'm not sure about Perl 5, beyond the way the Bioperl
devs have set up the distinction between interface and implementation.  However,
I find the way we use interfaces is very simple (set up interface with
some/all methods as unimplemented, use the module as an abstract base class,
then override the unimplemented methods).  It works for me.
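
The pattern described above can be sketched roughly as follows. This is
plain Perl with made-up package names (My::SeqLikeI, My::SimpleSeq,
My::HalfDone) and a made-up _unimplemented() helper, not the actual
BioPerl classes.

```perl
use strict;
use warnings;

{
    # Hypothetical interface: every method is unimplemented by default.
    package My::SeqLikeI;
    sub display_id { $_[0]->_unimplemented('display_id') }
    sub seq        { $_[0]->_unimplemented('seq') }

    # Concrete helper written purely against the interface methods.
    sub to_fasta {
        my $self = shift;
        return '>' . $self->display_id . "\n" . $self->seq . "\n";
    }

    sub _unimplemented {
        my ($self, $method) = @_;
        die ref($self) . " does not implement $method()\n";
    }
}

{
    # Implementation that overrides every unimplemented method.
    package My::SimpleSeq;
    our @ISA = ('My::SeqLikeI');
    sub new        { my ($class, %args) = @_; return bless {%args}, $class }
    sub display_id { $_[0]{id} }
    sub seq        { $_[0]{seq} }
}

{
    # Incomplete implementation: display_id() is still missing.
    package My::HalfDone;
    our @ISA = ('My::SeqLikeI');
    sub new { return bless {}, shift }
    sub seq { 'ACGT' }
}

print My::SimpleSeq->new(id => 'demo', seq => 'ACGT')->to_fasta;

# Calling a method that was never overridden fails loudly at run time.
my $ok = eval { My::HalfDone->new->to_fasta; 1 };
print "caught: $@" unless $ok;
```

The cost of this approach is that a missing method is only noticed when
it is called, not when the class is loaded.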

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 



From cmlapid at up.edu.ph  Wed Oct  4 02:06:06 2006
From: cmlapid at up.edu.ph (Carlo Lapid)
Date: Wed, 4 Oct 2006 10:06:06 +0800
Subject: [Bioperl-l] genbank mirror
Message-ID: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>

Hi,

I'm trying to set up a local mirror of a large part of the Genbank database.
For users to access the local database, I need to create a web-based search
tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank
flat files I've downloaded based on a query entered by the user.

I'm trying to use Bioperl to create this from scratch, but I'm having a very
hard time, especially since I want the user to have reasonable flexibility
in customizing his search. The best that I've been able to accomplish is a
search function that retrieves genbank sequence objects based on their
primary IDs or accession numbers; by using the fetch method of the
Bio::Index::GenBank module. But this doesn't help users who don't know the
exact IDs for the sequences they want.

Can anybody suggest a way to use Bioperl to search for an ordinary word or
phrase, like "16S gene", which could be matched against the description
field, or the entire genbank entry? (Alternatively, is there some other
freely available tool or software that can do this?) I've been scouring the
Bioperl documentation, but I couldn't find anything. I just need to be
pointed in the right direction. What I thought was a relatively simple
problem has been driving me crazy for days; if anybody has any suggestions I
would really, really appreciate it.
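
For reference, a brute-force version of the description search asked
about above could look like the sketch below. It assumes the flat files
can be read with Bio::SeqIO in 'genbank' format. The helper names
matches_query() and search_genbank_file() are invented for this example.
It scans every record, so it will be slow on a large mirror.

```perl
use strict;
use warnings;

# True if every whitespace-separated word of $query occurs somewhere in
# $text (case-insensitive). Plain Perl, no BioPerl needed.
sub matches_query {
    my ($text, $query) = @_;
    return 0 unless defined $text && defined $query;
    my @words = split ' ', $query;
    return 0 unless @words;
    for my $word (@words) {
        return 0 unless $text =~ /\Q$word\E/i;
    }
    return 1;
}

# Scan one GenBank flat file and return [accession, description] pairs
# whose description matches the query. Bio::SeqIO is loaded on first use.
sub search_genbank_file {
    my ($path, $query) = @_;
    require Bio::SeqIO;
    my $in = Bio::SeqIO->new(-file => $path, -format => 'genbank');
    my @hits;
    while (my $seq = $in->next_seq) {
        my $desc = $seq->desc;
        push @hits, [ $seq->accession_number, $desc ]
            if matches_query($desc, $query);
    }
    return @hits;
}
```

A call such as search_genbank_file('some_division.seq', '16S gene'),
where the file name is a placeholder, would return the accession and
description of each matching record, which could then be passed to the
existing Bio::Index::GenBank fetch by accession.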


From torsten.seemann at infotech.monash.edu.au  Wed Oct  4 02:58:03 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 04 Oct 2006 12:58:03 +1000
Subject: [Bioperl-l] genbank mirror
In-Reply-To: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
References: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
Message-ID: <4523233B.7030505@infotech.monash.edu.au>

> I'm trying to set up a local mirror of a large part of the Genbank database.
> For users to access the local database, I need to create a web-based search
> tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank
> flat files I've downloaded based on a query entered by the user.

Have you considered bioperl-db / BioSQL ?

http://www.bioperl.org/wiki/BioPerl_db
http://lists.open-bio.org/pipermail/biosql-l/

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From osborne1 at optonline.net  Wed Oct  4 03:16:20 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Tue, 03 Oct 2006 23:16:20 -0400
Subject: [Bioperl-l] genbank mirror
In-Reply-To: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
Message-ID: <C1489FC4.AA43%osborne1@optonline.net>

Carlo,

You might want to look at the Bio::DB::Query::GenBank module:

http://www.bioperl.org/wiki/HOWTO:Beginners#Retrieving_a_sequence_from_a_database

However, this works through NCBI's own eutils API, so setting it up to query
a local mirror may be very difficult.


Brian O.




From osborne1 at optonline.net  Wed Oct  4 03:28:06 2006
From: osborne1 at optonline.net (Brian Osborne)
Date: Tue, 03 Oct 2006 23:28:06 -0400
Subject: [Bioperl-l] genbank mirror
In-Reply-To: <4523233B.7030505@infotech.monash.edu.au>
Message-ID: <C148A286.AA47%osborne1@optonline.net>

Torsten and Carlo,

Right. For some simple examples of using Bio::DB::Query::BioQuery to query a
BioSQL db take a look at Bio::DB::BioSQL::OBDA.

You may also want to take a look at NCBI's eutils API, it's quite powerful
but not local. Or the ENSEMBL API, people have set up their own local
ENSEMBL dbs. There's an example of this API here:

http://www.bioperl.org/wiki/Getting_Genomic_Sequences


Brian O.



From torsten.seemann at infotech.monash.edu.au  Wed Oct  4 05:21:24 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Wed, 04 Oct 2006 15:21:24 +1000
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
Message-ID: <452344D4.8070908@infotech.monash.edu.au>

Hi all,

Now that we have Perl 5.6.1 as a minimum, the following modules are 
standard: File::Spec, File::Temp, File::Path

Bio::Root::IO has functions catfile(), tempdir(), tempfile(), rmtree() 
which currently dispatch to the File:: version, or try to emulate it. We 
don't need to emulate anymore. Jason Stajich suggested in a previous 
post that they should be deprecated, and that users should use the 
File:: functions directly.

I have an uncommitted simplified version of Bio::Root::IO which does 
this, and "all tests pass". The functions currently (silently) dispatch 
directly to their native counterparts.

The only tricky function is tempfile() which is *mostly* like 
File::Temp::tempfile(), but does some voodoo of converting 
(TEMPLATE=>'xxx') to the non-hash first parameter of the File:: version, 
so I'm hesitant to commit. It may do other magic - Hilmar?
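
For illustration, here is roughly what direct use of the core modules
looks like, along with one possible shim for the TEMPLATE conversion
mentioned above. The name tempfile_compat() is invented for this
example, and the shim covers only the TEMPLATE argument, not any other
behaviour Bio::Root::IO's tempfile() may have.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempfile tempdir);
use File::Path qw(mkpath rmtree);

# Hypothetical shim: accept named arguments including TEMPLATE => 'xxx'
# and hand the template to File::Temp::tempfile() as its positional
# first argument. All other options are passed through unchanged.
sub tempfile_compat {
    my %args = @_;
    my $template = delete $args{TEMPLATE};
    return defined $template ? tempfile($template, %args) : tempfile(%args);
}

# Scratch directory that File::Temp removes at program exit.
my $dir = tempdir(CLEANUP => 1);

# Portable path construction and recursive directory creation.
my $deep = File::Spec->catdir($dir, 'a', 'b');
mkpath($deep);

# The template must end in at least four X characters.
my ($fh, $fname) = tempfile_compat(
    TEMPLATE => 'bioXXXXXX',
    DIR      => $deep,
    UNLINK   => 1,
);
print {$fh} "scratch data\n";
close $fh;

# Recursive removal of the subtree created above.
rmtree(File::Spec->catdir($dir, 'a'));
```
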

Comments?

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From gianluca.debellis at itb.cnr.it  Wed Oct  4 09:25:26 2006
From: gianluca.debellis at itb.cnr.it (Gianluca De Bellis)
Date: Wed, 04 Oct 2006 11:25:26 +0200
Subject: [Bioperl-l] Bioperl under WinXP
Message-ID: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>

I'm trying to use Bioperl under WinXP-SP2 (novice)

Bioperl was just downloaded (v 1.2.3)

Even the simplest program with a single command (use Bio::Perl;) ends up in
an error of the Perl interpreter with these details

AppName: perl.exe AppVer: 5.8.8.819  ModName: win32.dll

ModVer: 0.0.0.0      Offset: 00003294

Coming from the Windows reporting system

Where is the problem?

 
Thanks in advance


From epsteinj at mail.nih.gov  Wed Oct  4 11:25:57 2006
From: epsteinj at mail.nih.gov (Epstein, Jonathan A (NIH/NICHD) [E])
Date: Wed, 4 Oct 2006 07:25:57 -0400
Subject: [Bioperl-l] genbank mirror
References: <e7e749d0610031906k54069a3ci4ad06064df743638@mail.gmail.com>
Message-ID: <42504F69898FE546B3F0238C9BD03275532603@NIHCESMLBX7.nih.gov>

There's Seqhound:
  http://seqhound.blueprint.org/report.html

We set this up locally, and it's probably the most comprehensive free solution out there, but it's non-trivial to set up. Also, since Blueprint and BIND have lost most of their funding, I'm not sure how long you can count on SeqHound to remain operational (although for now it is being updated).

Jonathan


-----Original Message-----
From: Carlo Lapid [mailto:cmlapid at up.edu.ph]
Sent: Tue 10/3/2006 10:06 PM
To: bioperl-l at bioperl.org
Subject: [Bioperl-l] genbank mirror
 
Hi,

I'm trying to set up a local mirror of a large part of the Genbank database.
For users to access the local database, I need to create a web-based search
tool, much like Entrez of NCBI, or SRS of EBI; that can parse the Genbank
flat files I've downloaded based on a query entered by the user.

I'm trying to use Bioperl to create this from scratch, but I'm having a very
hard time, especially since I want users to have reasonable flexibility
in customizing their searches. The best that I've been able to accomplish is a
search function that retrieves genbank sequence objects based on their
primary IDs or accession numbers, using the fetch method of the
Bio::Index::GenBank module. But this doesn't help users who don't know the
exact IDs for the sequences they want.

Can anybody suggest a way to use Bioperl to search for an ordinary word or
phrase, like "16S gene", which could be matched against the description
field, or the entire genbank entry? (Alternatively, is there some other
freely available tool or software that can do this?) I've been scouring the
Bioperl documentation, but I couldn't find anything. I just need to be
pointed in the right direction. What I thought was a relatively simple
problem has been driving me crazy for days; if anybody has any suggestions I
would really, really appreciate it.
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l
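
For the keyword-search question above, one low-tech starting point is to
scan the flat files directly. The following is a minimal sketch in core
Perl, not a Bioperl API: the function name search_definitions is made up
for illustration, and it assumes records end with a "//" line and carry
ACCESSION and DEFINITION fields. It collects the accessions whose
description contains a phrase; those accessions could then be handed to
the fetch method of Bio::Index::GenBank mentioned in the question.

```perl
use strict;
use warnings;

# Scan GenBank flat-file records on an open filehandle and return the
# accessions whose DEFINITION text contains $phrase (case-insensitive).
# search_definitions is a hypothetical helper, not part of Bioperl.
sub search_definitions {
    my ($fh, $phrase) = @_;
    my @hits;
    local $/ = "\n//";    # each record ends with a line holding "//"
    while (my $record = <$fh>) {
        my ($accession)  = $record =~ /^ACCESSION\s+(\S+)/m          or next;
        my ($definition) = $record =~ /^DEFINITION\s+(.*?)(?=^\S)/ms or next;
        $definition =~ s/\s+/ /g;    # join wrapped continuation lines
        push @hits, $accession
            if index(lc $definition, lc $phrase) >= 0;
    }
    return @hits;
}
```

A linear scan like this is slow on a large mirror; it only shows the
matching logic. Building a word index once and querying that would be
the natural next step.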


From bix at sendu.me.uk  Wed Oct  4 13:19:45 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 04 Oct 2006 14:19:45 +0100
Subject: [Bioperl-l] Bioperl under WinXP
In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>
References: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>
Message-ID: <4523B4F1.3010305@sendu.me.uk>

Gianluca De Bellis wrote:
> I'm trying to use Bioperl under WinXP-SP2 (novice)
> 
> Bioperl has been just downloaded  (v 1.2.3)
> 
> Even the simplest program with a single command (use Bio::Perl;) ends up in
> an error of the Perl interpreter with these details
> 
> AppName: perl.exe AppVer: 5.8.8.819  ModName: win32.dll
> 
> ModVer: 0.0.0.0      Offset: 00003294
> 
> Coming from the  windos reporting system
> 
> Where is the problem?

Hard to say. Do non-bioperl scripts work?

Make sure to follow the Bioperl installation instructions carefully:
http://bioperl.org/wiki/Installing_Bioperl_on_Windows

And make sure to install at least version 1.4. 1.2.3 is ancient and 
effectively unsupported.
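
To confirm which version Perl actually picks up, a small check like the
following can help. It is a sketch in core Perl; installed_version is a
made-up helper name, and whether a given old Bioperl release exposes a
$VERSION through Bio::Perl is an assumption to verify locally.

```perl
use strict;
use warnings;

# Return the $VERSION of a module if it can be loaded, or undef if the
# module is missing, fails to load, or defines no version.
# installed_version is a hypothetical helper name.
sub installed_version {
    my ($module) = @_;
    # Accept only plain module names such as Foo::Bar.
    return undef unless $module =~ /\A\w+(?:::\w+)*\z/;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return undef unless eval { require $file; 1 };
    return $module->VERSION;
}

my $version = installed_version('File::Spec');
print "File::Spec version: ",
    (defined $version ? $version : 'not found'), "\n";
```

Calling installed_version('Bio::Perl') in the same way would report the
copy found first in @INC, which is useful when an old and a new install
coexist.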


From cjfields at uiuc.edu  Wed Oct  4 14:03:34 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 4 Oct 2006 09:03:34 -0500
Subject: [Bioperl-l] Bioperl under WinXP
In-Reply-To: <000301c6e797$06fb2c80$7959959f@PORTEGEGDB>
Message-ID: <000601c6e7bd$e22ad190$15327e82@pyrimidine>

If you're using PPM, you can install a (much) newer version of BioPerl from
here:

http://www.gmod.org/ggb/ppm/

Add that as one of your repositories in PPM4 (seeing that you are using
ActivePerl 5.8.8.819), then search for bioperl.  The version should be
1.512.

In a few weeks we'll be releasing a new developer release.  A WinXP PPM is
expected, as well as a bundled package to install all prerequisites.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Gianluca De Bellis
> Sent: Wednesday, October 04, 2006 4:25 AM
> To: bioperl-l at bioperl.org
> Subject: [Bioperl-l] Bioperl under WinXP
> 
> I'm trying to use Bioperl under WinXP-SP2 (novice)
> 
> Bioperl has been just downloaded  (v 1.2.3)
> 
> Even the simplest program with a single command (use Bio::Perl;) ends up
> in
> an error of the Perl interpreter with these details
> 
> AppName: perl.exe AppVer: 5.8.8.819  ModName: win32.dll
> 
> ModVer: 0.0.0.0      Offset: 00003294
> 
> Coming from the  windos reporting system
> 
> Where is the problem?
> 
> 
> 
> Thanks in advance
> 


From hlapp at gmx.net  Wed Oct  4 14:25:23 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 4 Oct 2006 10:25:23 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <002101c6e6f8$67b4ae10$15327e82@pyrimidine>
References: <002101c6e6f8$67b4ae10$15327e82@pyrimidine>
Message-ID: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net>


On Oct 3, 2006, at 10:29 AM, Chris Fields wrote:

> The constructor in Bio::Root::RootI lets one know that its use is
> deprecated, so you shouldn't have any cases of 'our qw 
> (Bio::Root::RootI)';

Don't confuse the constructor with the inheritance tree.

Interface classes should never be instantiated, hence the  
constructor, consistent with the documentation, should never get  
executed.

> there should be some way of inheriting Root directly or  
> indirectly.  I would
> say that any direct use of RootI is not good practice, though.

I don't know what you mean by 'directly' or 'indirectly' but  
inheritance from interfaces, and interfaces extending (inheriting  
from) other interfaces, is certainly standard practice. I'm not sure  
at all why it would be a bad one.

> For the current implementation we should only inherit  
> Bio::Root::Root, which
> implements RootI.

For the implementation classes, yes. For the interface classes, no.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Oct  4 14:43:54 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 4 Oct 2006 10:43:54 -0400
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <452344D4.8070908@infotech.monash.edu.au>
References: <452344D4.8070908@infotech.monash.edu.au>
Message-ID: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>


On Oct 4, 2006, at 1:21 AM, Torsten Seemann wrote:

> Bio::Root::IO has functions catfile(), tmpdir(), tmpfile(), rmtree()
> which currently dispatch to the File:: version, or try to emulate  
> it. We
> don't need to emulate anymore. Jason Stajich suggested in a previous
> post that they should be deprecated, and that users should use  
> directly
> the File:: functions themselves.

I don't think there's a need to deprecate - if the methods just plain  
delegate to whatever File:: module is appropriate their  
implementation (supposedly) will become very simple and hence won't  
pose a maintenance burden anymore.

One can still recommend that all new scripts, modules, or other code
use the File:: modules directly; I'm just not sure there's a need to
tell users that they should start changing their existing code.

>
> I have an uncommitted simplified version of Bio::Root::IO which does
> this, and "all tests pass". The functions currently (silently)  
> dispatch
> directly to their native counterparts.
>
> The only tricky function is tempfile() which is *mostly* like
> File::Temp::tempfile(), but does some voodoo of converting
> (TEMPLATE=>'xxx') to the non-hash first parameter of the File::  
> version,
> so I'm hesitant to commit. It may do other magic - Hilmar?

Not that I would know of. If the tests pass (without having to change  
them!) I'd give it a try.
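
The delegation discussed above could look roughly like this. It is a
sketch only: My::IO is a hypothetical stand-in, not the real
Bio::Root::IO, and the handling of TEMPLATE is inferred from the
description in this thread rather than taken from the uncommitted code.

```perl
use strict;
use warnings;

package My::IO;    # hypothetical stand-in, not the real Bio::Root::IO

use File::Spec ();
use File::Temp ();
use File::Path ();

# Plain delegation: each method forwards to its File:: counterpart.
sub catfile { my ($self, @parts) = @_; return File::Spec->catfile(@parts) }
sub tmpdir  { return File::Spec->tmpdir }
sub rmtree  { my ($self, @args) = @_; return File::Path::rmtree(@args) }

# tempfile() accepts TEMPLATE => '...' as a named argument and moves it
# to the leading positional slot that File::Temp::tempfile() expects.
sub tempfile {
    my ($self, %args) = @_;
    my $template = delete $args{TEMPLATE};
    return defined $template
        ? File::Temp::tempfile($template, %args)
        : File::Temp::tempfile(%args);
}

package main;

my ($fh, $name) = My::IO->tempfile(
    TEMPLATE => 'sketchXXXXXX',
    DIR      => My::IO->tmpdir,
    UNLINK   => 1,                # remove the file when the program exits
);
print {$fh} "scratch data\n";
close $fh;
```

With wrappers this thin, existing callers keep working while new code
can call the File:: modules directly.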

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Oct  4 15:35:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 4 Oct 2006 10:35:16 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <28FB0A46-24C9-4EDD-B89C-BF7A108A1EDC@gmx.net>
Message-ID: <001901c6e7ca$b12fd5b0$15327e82@pyrimidine>

...
> Don't confuse the constructor with the inheritance tree.
> 
> Interface classes should never be instantiated, hence the
> constructor, consistent with the documentation, should never get
> executed.

I know that interfaces shouldn't be instantiated.  I had noticed there are
cases of 'our qw (Bio::Root::RootI)' where it is completely acceptable to
inherit the interface.  Makes sense to me now.

> > there should be some way of inheriting Root directly or
> > indirectly.  I would
> > say that any direct use of RootI is not good practice, though.
> 
> I don't know what you mean by 'directly' or 'indirectly' but
> inheritance from interfaces, and interfaces extending (inheriting
> from) other interfaces, is certainly standard practice. I'm not sure
> at all why it would be a bad one.

I was talking specifically about inheriting RootI, and not about all Bioperl
interfaces in general.  I completely understand the use of
interface/implementation in Bioperl.  However, I missed one small fact until
yesterday (of course AFTER I posted my reply), which was that interfaces may
inherit RootI directly.  My oops.

I had understood that, in general, any Bioperl implementation should not
inherit the RootI interface directly (they should inherit Root, since that
implements RootI).  The 'constructor' present in RootI is essentially to
make sure that no one inherits from the wrong class.

Probably a bad use of the terms 'direct' and 'indirect', so maybe I didn't
get that across very well.  What I meant was that all classes inherit Root
in some way, either 'directly' (as the direct parent class) or 'indirectly'
(through the inheritance tree). Probably comes from being primarily a
molecular microbiologist and not a computer scientist.

OT, but it would be nice to have an updated class diagram to sort out the
inheritance hierarchy a bit easier.  In the meantime, the Deobfuscator does
help quite a bit.

> > For the current implementation we should only inherit
> > Bio::Root::Root, which
> > implements RootI.
> 
> For the implementation classes, yes. For the interface classes, no.

I agree (see above).  That's the one small bit about interfaces I missed
along the way.  Makes sense; they use throw_not_implemented(), which is a
RootI method.
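
The arrangement agreed on above can be sketched in plain Perl. All
package names here (My::RootI, My::Root, My::SeqI, My::Seq) are
hypothetical stand-ins and the method bodies are simplified assumptions,
not the Bioperl source. Interfaces inherit the root interface so they
can call throw_not_implemented(); implementations inherit the root
implementation.

```perl
use strict;
use warnings;

package My::RootI;    # hypothetical root interface
sub throw_not_implemented {
    my $self   = shift;
    my $method = (caller(1))[3];    # name of the abstract method called
    die "Abstract method '$method' is not implemented by "
        . (ref($self) || $self) . "\n";
}

package My::Root;     # hypothetical root implementation
our @ISA = ('My::RootI');
sub new { my ($class, %args) = @_; return bless {%args}, $class }

package My::SeqI;     # an interface inherits the root *interface*
our @ISA = ('My::RootI');
sub seq_length { $_[0]->throw_not_implemented }

package My::Seq;      # an implementation inherits the root implementation
our @ISA = ('My::Root', 'My::SeqI');
sub seq_length { my $self = shift; return length $self->{seq} }

package main;

my $seq = My::Seq->new(seq => 'ACGT');
print "length: ", $seq->seq_length, "\n";
```

Calling seq_length on My::SeqI itself dies with the "not implemented"
message, which is the behaviour an interface class wants.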

> 	-hilmar

Chris


From pmiguel at purdue.edu  Wed Oct  4 19:38:51 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Wed, 04 Oct 2006 15:38:51 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <4521A552.60301@sendu.me.uk>
References: <4521A552.60301@sendu.me.uk>
Message-ID: <45240DCB.2080204@purdue.edu>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 1 is ready and available in CVS. I'll 
> upload tar.gz files when I have access to the server, then reply here 
> with links.
>
> In the mean time, see http://www.bioperl.org/wiki/Release_1.5.2 for 
> instructions on getting and testing this RC.
>
> Developers:
>    Make sure you're in the AUTHORS file in all 4 packages, as
>    appropriate.
>
> Users:
>    Even though 1.5.2 is a 'developer' release, we consider it the most
>    stable and capable version of Bioperl, and recommend that you use
>    it in all but the most critical production environments. Please
>    try it out and let us know of any problems or difficulties you run
>    into.
>
>
> Thank you,
> Sendu.
>   
I didn't see any tests done under solaris, so I asked our sys admin to 
do the install on one of our machines.

Just another data point:

He installed this release candidate on a Sun E450 box running solaris. 
uname -a gives:

SunOS descartes 5.10 Generic_118833-18 sun4u sparc SUNW,Ultra-4

perl -v gives:

This is perl, v5.8.8 built for sun4-solaris
(etc.)


$ time make test
PERL_DL_NONLAZY=1 /usr/local/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/AAChange...................ok
t/AAReverseMutate............ok
t/abi........................Bio::SeqIO::staden::read from bioperl-ext is not installed or is installed incorrectly - skipping abi.t tests
t/abi........................ok
t/ace........................ok
t/AlignIO....................ok
t/AlignStats.................ok
t/AlignUtil..................ok
t/alignUtilities.............ok
t/Allele.....................ok
t/Alphabet...................ok
t/Annotation.................ok
t/AnnotationAdaptor..........ok
t/asciitree..................ok
t/Assembly...................ok
        1/19 skipped:
t/Biblio.....................ok
t/Biblio_biofetch............ok
t/Biblio_eutils..............ok
t/BiblioReferences...........ok
t/BioDBGFF...................ok
t/BioDBSeqFeature............ok 1/46Argument "+" isn't numeric in numeric lt (<) at Bio/DB/SeqFeature/Segment.pm line 423.
t/BioDBSeqFeature............ok
t/BioDBSeqFeature_BDB........ok
t/BioDBSeqFeature_mysql......ok 3/46prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT sequence,offset
   FROM sequence as s,locationlist as ll
   WHERE s.id=ll.id
     AND ll.seqname= ?
     AND offset >= ?
     AND offset <= ?
   ORDER BY offset
) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT max(offset) FROM sequence as s,locationlist as ll WHERE s.id=ll.id AND ll.seqname=? AND offset<=?) statement handle DBI::st=HASH(0x8c5258) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
prepare_cached(SELECT sequence,offset
   FROM sequence as s,locationlist as ll
   WHERE s.id=ll.id
     AND ll.seqname= ?
     AND offset >= ?
     AND offset <= ?
   ORDER BY offset
) statement handle DBI::st=HASH(0x8c5048) still Active at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 1422
t/BioDBSeqFeature_mysql......ok
t/BioFetch_DB................ok
t/BioGraphics................ok
t/BlastIndex.................ok 1/13
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BlastIndex.................ok
t/BPbl2seq...................
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPbl2seq...................ok 1/108
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPbl2seq is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPbl2seq...................ok
t/BPlite.....................ok 1/97
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPlite.....................ok 52/97
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPlite.....................ok 88/97
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
STACK Bio::Tools::BPlite::new /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/Tools/BPlite.pm:197
STACK toplevel t/BPlite.t:127

-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPlite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPlite.....................ok
t/BPpsilite..................
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPpsilite..................ok 4/11
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::BPpsilite is deprecatedUse Bio::SearchIO classes instead
---------------------------------------------------
t/BPpsilite..................ok
t/bsml_sax...................ok
t/Chain......................ok
t/chaosxml...................ok
t/cigarstring................ok
t/ClusterIO..................ok
t/Coalescent.................ok
t/CodonTable.................ok
t/Compatible.................ok
t/consed.....................ok
t/CoordinateGraph............ok
t/CoordinateMapper...........ok
t/Correlate..................ok
t/ctf........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ctf.t tests
t/ctf........................ok
t/CytoMap....................ok
t/DB.........................skipped
        all skipped: Skipping all tests since they require network access, set BIOPERLDEBUG=1 to test
t/DBCUTG.....................ok
        11/34 skipped: Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
t/DBFasta....................ok
t/DNAMutation................ok
t/Domcut.....................ok
t/ECnumber...................ok
t/ELM........................ok 1/13
-------------------- WARNING ---------------------
MSG: sleeping for 1 seconds

---------------------------------------------------
t/ELM........................ok
t/embl.......................ok
t/EMBL_DB....................ok
t/EMBOSS_Tools...............ok
t/EncodedSeq.................ok
t/entrezgene.................ok 491/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok 695/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok 723/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok 824/1003Pseudo-hashes are deprecated at /usr/local/src/bioperl-1.5.2-RC1/bioperl-1.5.2-RC1/blib/lib/Bio/SeqIO/entrezgene.pm line 467.
t/entrezgene.................ok
t/ePCR.......................ok
t/ESEfinder..................ok 1/15# Looks like you planned 15 tests but only ran 14.
t/ESEfinder..................dubious
        Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED test 15
        Failed 1/15 tests, 93.33% okay (less 9 skipped tests: 5 okay, 33.33%)
t/est2genome.................ok
t/EUtilities.................skipped
        all skipped: Set BIOPERLDEBUG=1 to run tests
t/Exception..................ok
t/Exonerate..................ok
t/exp........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping exp.t tests
t/exp........................ok
t/fasta......................ok
t/FeatureIO..................ok 7/33
-------------------- WARNING ---------------------
MSG: '##feature-ontology' directive handling not yet implemented
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: '##attribute-ontology' directive handling not yet implemented
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: '##source-ontology' directive handling not yet implemented
---------------------------------------------------
t/FeatureIO..................ok
t/flat.......................ok
t/FootPrinter................ok
t/game.......................ok
t/GbrowseGFF.................ok
t/gcg........................ok
t/GDB........................ok
t/Gel........................ok
t/genbank....................ok
t/GeneCoordinateMapper.......ok
t/Geneid.....................ok
t/Genewise...................ok
        2/51 skipped:
t/Genomewise.................ok
t/Genpred....................ok
t/GFF........................ok
t/GOR4.......................ok
t/GOterm.....................ok
t/GraphAdaptor...............ok
t/GuessSeqFormat.............ok
t/hmmer......................ok
t/hmmer_pull.................ok
t/HNN........................ok
t/HtSNP......................ok
t/Index......................ok
t/InstanceSite...............ok
t/interpro...................ok
t/InterProParser.............ok
t/IUPAC......................ok
t/kegg.......................ok
t/largefasta.................ok
t/LargeLocatableSeq..........ok
t/largepseq..................ok
t/lasergene..................ok
t/LinkageMap.................ok
t/LiveSeq....................ok
t/LocatableSeq...............ok
t/Location...................ok
t/LocationFactory............ok
t/LocusLink..................ok
t/lucy.......................ok
t/Map........................ok
t/MapIO......................ok
t/masta......................ok
t/Matrix.....................ok
t/Measure....................ok
t/MeSH.......................ok
t/metafasta..................ok
t/MetaSeq....................ok
t/MicrosatelliteMarker.......ok
t/MiniMIMentry...............ok
t/MitoProt...................ok
t/Molphy.....................ok
t/MultiFile..................ok
t/multiple_fasta.............ok
t/Mutation...................ok
t/Mutator....................ok
t/NetPhos....................ok
        10/14 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test
t/Node.......................ok
t/obo_parser.................ok
t/OddCodes...................ok
t/OMIMentry..................ok
t/OMIMentryAllelicVariant....ok
t/OMIMparser.................ok
t/Ontology...................ok
t/OntologyEngine.............ok
t/OntologyStore..............ok
t/PAML.......................ok
t/Perl.......................ok
t/phd........................ok
t/Phenotype..................ok
t/PhylipDist.................ok
t/PhysicalMap................ok
t/pICalculator...............ok
t/Pictogram..................ok
t/pir........................ok
t/pln........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping pln.t tests
t/pln........................ok
t/PopGen.....................ok
        2/89 skipped:
t/PopGenSims.................ok
t/primaryqual................ok
t/PrimarySeq.................ok
t/primedseq..................ok
t/Primer.....................ok
t/primer3....................ok
t/Promoterwise...............ok
t/ProtDist...................ok
t/protgraph..................ok
t/ProtMatrix.................ok
t/ProtPsm....................ok
t/Pseudowise.................ok
t/psm........................ok
t/QRNA.......................ok
t/qual.......................ok
t/RandDistFunctions..........ok
t/RandomTreeFactory..........ok
t/Range......................ok
t/RangeI.....................ok
t/raw........................ok
t/RefSeq.....................ok
t/Registry...................ok
t/Relationship...............ok
t/RelationshipType...........ok
t/RemoteBlast................ok
        11/13 skipped: to avoid timeout
t/RepeatMasker...............ok
t/RestrictionAnalysis........ok
t/RestrictionEnzyme..........ok 1/14
-------------------- WARNING ---------------------
MSG: Use of Bio::Tools::RestrictionEnzyme is deprecatedUse Bio::Restriction classes instead
---------------------------------------------------
t/RestrictionEnzyme..........ok
t/RestrictionIO..............ok
t/RNAChange..................ok
t/rnamotif...................ok
t/RootI......................ok
t/RootIO.....................ok
        2/27 skipped: various reasons
t/RootStorable...............ok
t/Scansite...................ok
t/scf........................ok
t/SearchDist.................ok
t/SearchIO...................ok
t/Seg........................ok
t/Seq........................ok
t/seq_quality................ok
t/SeqAnalysisParser..........ok
t/SeqBuilder.................ok
t/SeqDiff....................ok
t/SeqFeatCollection..........ok
t/SeqFeature.................ok
t/seqfeaturePrimer...........ok
t/SeqHound_DB................ok 4/14Writing into 'shoundlog' log file.
t/SeqHound_DB................ok
t/SeqIO......................ok
t/SeqPattern.................ok
t/seqread_fail...............ok
t/SeqStats...................ok
t/SequenceFamily.............ok
t/sequencetrace..............ok
t/SeqUtils...................ok
t/SeqVersion.................ok
t/seqwithquality.............ok
t/SeqWords...................ok
t/Sigcleave..................ok
t/Signalp....................ok
t/Sim4.......................ok
t/SimilarityPair.............ok
t/SimpleAlign................ok
t/simpleGOparser.............ok
t/singlet....................ok
t/sirna......................ok
t/SiteMatrix.................ok
t/SNP........................ok
t/Sopma......................ok
t/Species....................ok
        5/20 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test
t/Spidey.....................ok
t/splicedseq.................ok
t/StandAloneBlast............ok
t/StructIO...................ok
t/Structure..................ok
t/swiss......................ok
t/Symbol.....................ok
t/tab........................ok
t/table......................ok
t/TagHaplotype...............ok
t/Taxonomy...................ok
        44/98 skipped: Skipping tests which require network access, set BIOPERLDEBUG=1 to test
t/TaxonTree..................ok
t/Tempfile...................ok
t/Term.......................ok
t/tigrxml....................ok
t/tinyseq....................ok
t/Tmhmm......................ok
t/Tools......................ok
t/Tree.......................ok
t/TreeBuild..................ok
t/TreeIO.....................ok
t/trim.......................ok
t/tRNAscanSE.................ok
t/UCSCParsers................ok
t/Unflattener................ok
t/Unflattener2...............ok
t/UniGene....................ok
t/Variation_IO...............ok
t/WABA.......................ok
t/XEMBL_DB...................ok
        1/9 skipped: server may be down
t/ztr........................Bio::SeqIO::staden::read of bioperl-ext is not installed or is installed incorrectly - skipping ztr.t tests
t/ztr........................ok
Failed Test   Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/ESEfinder.t  255 65280    15    2  13.33%  15
2 tests and 98 subtests skipped.
Failed 1/240 test scripts, 99.58% okay. 1/11910 subtests failed, 99.99% okay.
*** Error code 29
make: Fatal error: Command failed for target `test_dynamic'

real    13m10.064s
user    11m14.891s
sys     0m45.417s

$ TEST_VERBOSE=1 perl t/ESEfinder.t
1..15
ok 1 - use Bio::Tools::Analysis::DNA::ESEfinder;
ok 2 - use Data::Dumper;
ok 3 - use Bio::PrimarySeq;
ok 4 - use Bio::Seq;
ok 5
ok 6 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 7 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 8 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 9 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 10 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 11 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 12 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 13 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
ok 14 # skip Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test
# Looks like you planned 15 tests but only ran 14.
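
The output above shows 15 planned tests but only 14 reported, nine of
them skipped. One plausible cause, not confirmed in this report, is a
skip count that is one short of the number of tests the skipped block
guards. The sketch below uses Test::More with placeholder assertions
(the test names are made up) to show how the plan, the skip count, and
BIOPERLDEBUG fit together.

```perl
use strict;
use warnings;
use Test::More tests => 4;    # the plan: exactly four tests must be reported

# Tests that need no network always run.
ok(1, 'module loads');                 # placeholder assertion
ok(1, 'object can be constructed');    # placeholder assertion

SKIP: {
    # The second argument to skip() must equal the number of tests in
    # this block; if it is too small, fewer tests are reported than
    # were planned and the script is marked as failed.
    skip 'Skipping tests which require remote servers, set BIOPERLDEBUG=1 to test', 2
        unless $ENV{BIOPERLDEBUG};

    ok(1, 'remote query returned');    # placeholder assertion
    ok(1, 'remote result parsed');     # placeholder assertion
}
```

Whether BIOPERLDEBUG is set or not, four results are reported, so the
plan is met either way.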


From bix at sendu.me.uk  Thu Oct  5 07:19:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 08:19:39 +0100
Subject: [Bioperl-l] EUtilities term handling
Message-ID: <4524B20B.5010703@sendu.me.uk>

This is actually a general question and not limited to EUtilities. As I 
see it, EUtilities lets you run in Bioperl the queries that you can run 
on a website. The question is: should a Bioperl module always work with 
the queries that the website it is a front-end to works with?

So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is 
essentially a frontend onto:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=

With a web-browser you can complete that url by supplying a term. For 
example, the term 'BRCA2+9606[taxid]' works and returns results:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid]

If you supply the exact same term to EUtilities::esearch like so:

my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => 
"gene", -term "BRCA2+9606[taxid]");

The search fails. From my 'user' perspective this is highly unexpected. 
Chris (the author) and I both understand /why/ it fails, but Chris 
doesn't think it is a bug, or at least something that can/should be 
changed. What do other people think? At the very least, if something 
unexpected happens, I'd suggest making a note of it in the POD 
somewhere. Eg. "Do not use + in term strings, even though they might 
work on the website".

Chris: what is the disadvantage of always submitting '+' as '+' to the 
server?
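
The thread does not state why the search fails. One reading of the
closing question, assumed here and not confirmed by either poster, is
that the term is percent-escaped before it is sent, so a '+' that a
browser URL would deliver as a space arrives as a literal plus sign.
The sketch below uses two made-up helper functions in core Perl to show
the difference; it does not reproduce the EUtilities code.

```perl
use strict;
use warnings;

# Percent-escape everything except unreserved characters, as a client
# might do before placing a value in a query string.
# escape_component and decode_form_value are hypothetical helper names.
sub escape_component {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Decode a query-string value the way a form handler typically does:
# '+' means a space, and %XX sequences become the matching character.
sub decode_form_value {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

my $term = 'BRCA2+9606[taxid]';
print "typed in a URL, decoded as: ", decode_form_value($term), "\n";
print "escaped first, sent as:     ", escape_component($term), "\n";
print "escaped, then decoded as:   ",
    decode_form_value(escape_component($term)), "\n";
```

Under that assumption, writing the term with a space instead of '+'
would give the server the same query the browser URL does.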


From bix at sendu.me.uk  Thu Oct  5 07:24:45 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 08:24:45 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4524B20B.5010703@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
Message-ID: <4524B33D.9070607@sendu.me.uk>

Sendu Bala wrote:
>
> With a web-browser you can complete that url by supplying a term. For 
> example, the term 'BRCA2+9606[taxid]' works and returns results:
> 
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid] 
> 
> 
> If you supply the exact same term to EUtilities::esearch like so:
> 
> my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db => 
> "gene", -term "BRCA2+9606[taxid]");

*cough*

my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
"gene", -term => "BRCA2+9606[taxid]");


> The search fails. 


From m.weimer at dkfz-heidelberg.de  Thu Oct  5 12:15:53 2006
From: m.weimer at dkfz-heidelberg.de (Marc Weimer)
Date: Thu, 05 Oct 2006 14:15:53 +0200
Subject: [Bioperl-l] Bio::DB::SwissProt Error
Message-ID: <1160050554.18691.11.camel@localhost>

When running


--------------------------------------------------------------

  #! /usr/bin/perl -w

  use strict;
  use Bio::DB::SwissProt;

  my $db_obj = new Bio::DB::SwissProt(-verbose=>1);

  my $seq_obj = $db_obj->get_Seq_by_acc('P43780');


-------------------------------------------------------------

using Bioperl 1.4-1 I get the error message

---------------------------------------------------------------------------------

  request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch
  Content-Length: 45
  Content-Type: application/x-www-form-urlencoded

  format=swissprot&db=swall&style=raw&id=P43780


  ------------- EXCEPTION: Bio::Root::Exception -------------
  MSG: swissprot stream with no ID. Not swissprot in my book
  STACK: Error::throw
  STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
  STACK: Bio::SeqIO::swiss::next_seq /usr/share/perl5/Bio/SeqIO/swiss.pm:179
  STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/share/perl5/Bio/DB/WebDBSeqI.pm:187
  STACK: ./putativeGele.pl:8
  -----------------------------------------------------------

--------------------------------------------------------------------------------

Any suggestions?

Thanks,

Marc


From bix at sendu.me.uk  Thu Oct  5 13:21:23 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 14:21:23 +0100
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <1160050554.18691.11.camel@localhost>
References: <1160050554.18691.11.camel@localhost>
Message-ID: <452506D3.5050501@sendu.me.uk>

Marc Weimer wrote:
[snip]
>   my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
> 
>   my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
[snip]
> using Bioperl 1.4-1 I get the error message
[snip]
>   ------------- EXCEPTION: Bio::Root::Exception -------------
>   MSG: swissprot stream with no ID. Not swissprot in my book
[snip]
> Any suggestions?

It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most 
recent official release), but 1.5.2 does 
(http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS 
(http://bioperl.org/wiki/Getting_BioPerl#CVS).


From m.weimer at dkfz-heidelberg.de  Thu Oct  5 13:35:06 2006
From: m.weimer at dkfz-heidelberg.de (Marc Weimer)
Date: Thu, 05 Oct 2006 15:35:06 +0200
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <452506D3.5050501@sendu.me.uk>
References: <1160050554.18691.11.camel@localhost>
	<452506D3.5050501@sendu.me.uk>
Message-ID: <1160055306.18691.14.camel@localhost>

Works fine with 1.5.2

Thanks,

Marc


> Marc Weimer wrote:
> [snip]
> >   my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
> > 
> >   my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
> [snip]
> > using Bioperl 1.4-1 I get the error message
> [snip]
> >   ------------- EXCEPTION: Bio::Root::Exception -------------
> >   MSG: swissprot stream with no ID. Not swissprot in my book
> [snip]
> > Any suggestions?
> 
> It works with the latest Bioperl. I'm not sure if 1.5.1 works (the most 
> recent official release), but 1.5.2 does 
> (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS 
> (http://bioperl.org/wiki/Getting_BioPerl#CVS).
-- 
########################################

Dr. Marc Weimer
German Cancer Research Center
Central Unit Biostatistics
Im Neuenheimer Feld 280
D-69120 Heidelberg
Phone: +49 (0) 6221/42-2387
Fax: +49 (0) 6221/42-2397

########################################


From hlapp at gmx.net  Thu Oct  5 13:55:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 09:55:58 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4524B20B.5010703@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
Message-ID: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>


On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote:

> This is actually a general question and not limited to EUtilities.  
> As I
> see it EUtilities lets you do queries in Bioperl that you can do on a
> website. The question is, should a Bioperl module always work with
> queries that the website it is a front-end to works with?

I think yes, but stick to this definition.

Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez  
website it will actually not work. Hence, it should be no surprise  
that it doesn't work either using Bio::DB::EUtilities.

The URL you are using to make your point is much more an example for  
using a web-service (SOAP, REST, or not) than it is for using a  
website. Using the web-service URL with a space in place of the '+'  
works, but yields a different result (just searches for BRCA2), so if  
tested for correct result the test fails.

I.e., you don't expect an input form on a website to accept URL- 
encoded input. Instead, you expect it to do any URL-encoding for you  
that needs to be done. Conversely, if you are using a URL to retrieve  
stuff using e.g. wget or curl, it is clear that you will need to do  
URL encoding yourself unless there is a command line option that lets  
you instruct the querying program to do so.

I would be careful with mangling the two definitions into one,  
resulting in a module that needs to serve two masters. You could  
consider providing an option though that lets you turn off the URL  
encoding on demand.

Aside from that, one of the advantages of having the service wrapped  
in Bioperl is in fact that you can have it accept a wider variety of  
parameters than the actual service would allow you to have, e.g.,
arrays, hashes, or whatever seems appropriate.

My $0.02.

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Thu Oct  5 14:08:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:08:01 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
Message-ID: <452511C1.5020709@sendu.me.uk>

Hilmar Lapp wrote:
> 
> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote:
> 
>> This is actually a general question and not limited to EUtilities. As I
>> see it EUtilities lets you do queries in Bioperl that you can do on a
>> website. The question is, should a Bioperl module always work with
>> queries that the website it is a front-end to works with?
> 
> I think yes, but stick to this definition.
> 
> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez 
> website it will actually not work. Hence, it should be no surprise that 
> it doesn't work either using Bio::DB::EUtilities.

On the contrary, I find it a surprise because EUtilities is an interface 
to NCBI's eutils, not the entrez website.

If I had previously read instructions on using eutils:
http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
I might (do) expect that I /should/ use + in my term.


> Aside from that, one of the advantages of having the service wrapped in 
> Bioperl is in fact that you can have it accept a wider variety of 
> parameters than the actual service would allow you to have, e.g., 
> arrays, hashes, or whatever seems appropriate.

I was going to suggest that terms be supplied as an array, leaving 
Bioperl code to decide how to 'AND' all the terms (elements in the 
array) together. It would also further force the user not to think of 
how eutils normally works, but to only consider the Bioperl instructions 
on how to form a query. But I'm not sure of the value of all that.


From cjfields at uiuc.edu  Thu Oct  5 14:06:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:06:50 -0500
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <452506D3.5050501@sendu.me.uk>
References: <1160050554.18691.11.camel@localhost>
	<452506D3.5050501@sendu.me.uk>
Message-ID: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>

On Oct 5, 2006, at 8:21 AM, Sendu Bala wrote:

> Marc Weimer wrote:
> [snip]
>>   my $db_obj = new Bio::DB::SwissProt(-verbose=>1);
>>
>>   my $seq_obj = $db_obj->get_Seq_by_acc('P43780');
> [snip]
>> using Bioperl 1.4-1 I get the error message
> [snip]
>>   ------------- EXCEPTION: Bio::Root::Exception -------------
>>   MSG: swissprot stream with no ID. Not swissprot in my book
> [snip]
>> Any suggestions?
>
> It works with the latest Bioperl. I'm not sure if 1.5.1 works (the  
> most
> recent official release), but 1.5.2 does
> (http://bioperl.org/wiki/Release_1.5.2), as does a checkout from CVS
> (http://bioperl.org/wiki/Getting_BioPerl#CVS).

Marc, you'll have to update to 1.5.2 or CVS, as Sendu suggested.
There were server changes for biofetch which were fixed about 4-6  
months ago (post rel. 1.5.1); I think several changes were made to  
Bio::SeqIO::swiss as well during this period.

I think the error here results from Bio::SeqIO::swiss trying to parse  
an empty byte stream.  Sendu, do you think that Bio::SeqIO::swiss  
(and other SeqIO parsers) should throw a more specific message for  
getting an empty byte stream?  Or is it more trouble than it's worth?
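
To make the question concrete, here is a minimal sketch of what such a check might look like, assuming the fetched content is available as a string before it is handed to the parser. `assert_nonempty_stream` is a hypothetical helper made up for illustration; it is not an existing Bioperl method, and the real fix would probably live inside the SeqIO or WebDBSeqI code instead.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::SeqIO: refuse to hand an empty
# response to a format parser, and say so explicitly.
sub assert_nonempty_stream {
    my ($content, $source) = @_;
    if (!defined $content || $content !~ /\S/) {
        die "No data returned from $source (empty stream); "
          . "the server or the query may be at fault, not the record format\n";
    }
    return $content;
}

# An empty body raises the specific message instead of a parser complaint.
my $ok = eval { assert_nonempty_stream('', 'dbfetch'); 1 };
print "caught: $@" unless $ok;
```

With something like this in front of the parser, the "swissprot stream with no ID" message would only appear when there really was a malformed record to complain about.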

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 14:14:40 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:14:40 +0100
Subject: [Bioperl-l] Bio::DB::SwissProt Error
In-Reply-To: <1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>
References: <1160050554.18691.11.camel@localhost>
	<452506D3.5050501@sendu.me.uk>
	<1AC863FF-3C44-4017-B20F-BD6DB413B318@uiuc.edu>
Message-ID: <45251350.5030608@sendu.me.uk>

Chris Fields wrote:
>
>>>   ------------- EXCEPTION: Bio::Root::Exception -------------
>>>   MSG: swissprot stream with no ID. Not swissprot in my book
[snip]
> I think the error here results from Bio::SeqIO::swiss trying to parse an 
> empty byte stream.  Sendu, do you think that Bio::SeqIO::swiss (and 
> other SeqIO parsers) should throw a more specific message for getting an 
> empty byte stream?  Or is it more trouble than it's worth?

Trouble wise, I've no idea without looking into it. Generally speaking 
though I can say that the error message is pretty useless and I'm always 
in favour of better error messages.


From hlapp at gmx.net  Thu Oct  5 14:21:49 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 10:21:49 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452511C1.5020709@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
Message-ID: <F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>


On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote:

>>
>> On Oct 5, 2006, at 3:19 AM, Sendu Bala wrote:
>>
>>> This is actually a general question and not limited to  
>>> EUtilities. As I
>>> see it EUtilities lets you do queries in Bioperl that you can do  
>>> on a
>>> website. The question is, should a Bioperl module always work with
>>> queries that the website it is a front-end to works with?
>>
>> I think yes, but stick to this definition.
>>
>> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez
>> website it will actually not work. Hence, it should be no surprise  
>> that
>> it doesn't work either using Bio::DB::EUtilities.
>
> On the contrary, I find it a surprise because EUtilities is an  
> interface
> to NCBI's eutils, not the entrez website.
>
> If I had previously read instructions on using eutils:
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
> I might (do) expect that I /should/ use + in my term.

This is my point - stick to your definitions. Are you wrapping a  
query form on a website or are you wrapping a web service (i.e., a URL)?

The examples you give are about wrapping a web-service. Your original  
question was about wrapping a website. Yet another question is what  
the author of Bio::DB::EUtilities intended to wrap.

The other thing to consider is user-friendliness. If you are wrapping  
a web-service, do you still make not URL-encoding the user input the  
default? What will 90% of the users probably want or expect to be  
able to do? URL-encode all input themselves or expect the module to  
do this for them unless they turn it off?

As far as I'm concerned, I'll happily count myself among those who  
are lazy and ignorant, don't read NCBI's documentation, don't want to  
know how to URL encode and why this needs to be done, but just want  
it to work.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Oct  5 14:31:06 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:31:06 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4524B20B.5010703@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
Message-ID: <A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>

On Oct 5, 2006, at 2:19 AM, Sendu Bala wrote:

> This is actually a general question and not limited to EUtilities.  
> As I
> see it EUtilities lets you do queries in Bioperl that you can do on a
> website. The question is, should a Bioperl module always work with
> queries that the website it is a front-end to works with?
>
> So for example, Bio::DB::EUtilities::esearch in -db mode 'gene' is
> essentially a frontend onto:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=
>
> With a web-browser you can complete that url by supplying a term. For
> example, the term 'BRCA2+9606[taxid]' works and returns results:
>
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?retmode=xml&db=gene&term=BRCA2+9606[taxid]
>
> If you supply the exact same term to EUtilities::esearch like so:
>
> my $esearch = Bio::DB::EUtilities->new(-eutil => "esearch", -db =>
> "gene", -term "BRCA2+9606[taxid]");
>
> The search fails. From my 'user' perspective this is highly  
> unexpected.
> Chris (the author) and I both understand /why/ it fails, but Chris
> doesn't think it is a bug, or at least something that can/should be
> changed. What do other people think? At the very least, if something
> unexpected happens, I'd suggest making a note of it in the POD
> somewhere. Eg. "Do not use + in term strings, even though they might
> work on the website".
>
> Chris: what is the disadvantage of always submitting '+' as '+' to the
> server?

A few reasons:

1)  According to NCBI, you can use '+' in queries, but not as a  
boolean.  Global changes of '+' to a space may change the meaning of  
the query in a few rare occasions.  So, if you really wanted to  
search for the string 'BRCA2+ATG', NCBI looks for that term literally.

2)  '+' is a reserved character in URIs; in a query string it stands in
for a space.  Therefore, any parameters containing '+' are URI-encoded
into %2B, which is decoded on NCBI's end back to '+' (this is
demonstrable with current EUtilities output and the returned XML data).

3)  Why not just use a space (implicit AND)?  Or an explicit  
boolean?  Or '&' (which apparently works but is not specified in the  
NCBI Entrez docs)?
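
Point 2 can be sketched from the receiving side.  This is only an illustration, assuming the server follows the conventional form-decoding rules ('+' to space first, then %XX escapes); `form_decode` is a made-up name and I have no view into what NCBI's CGI actually runs.

```perl
use strict;
use warnings;

# Decode one form-encoded query value the way a CGI endpoint conventionally
# does: '+' becomes a space first, then %XX escapes are expanded.
sub form_decode {
    my ($value) = @_;
    $value =~ tr/+/ /;
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $value;
}

# A raw '+' typed into a URL arrives as a space ...
print form_decode('BRCA2+9606[taxid]'), "\n";
# ... while an encoded %2B arrives as a literal plus sign.
print form_decode('BRCA2%2B9606%5Btaxid%5D'), "\n";
```

So the same characters in the term mean different things depending on whether they were typed straight into a URL or passed through an encoder first.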

The bug is in the query and not in the code, i.e. it is a
user-generated bug, not an EUtilities bug.  And it shouldn't be
unexpected, as NCBI has very specific rules for building queries for  
Entrez (just like any other database).  If I were to use nonstandard  
queries for MySQL, BioFetch, UCSC, or anything else, I would expect  
to get bad results.  As the old saying goes, garbage in, garbage out.

The following link has their updated rules:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.chapter.EntrezHelp

Here is their old one:

http://www.ncbi.nlm.nih.gov/entrez/query/static/help/helpdoc.html

We could, of course, put something in POD, but you never presented  
that option to me before.  I'll grant that the EUtilities API needs  
some cleaning up, which is not easy to do when the returned data varies
between utilities.  But it does get the URL encoding correct, at least in
this case.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 14:32:49 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:32:49 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>
Message-ID: <45251791.9040409@sendu.me.uk>

Hilmar Lapp wrote:
> 
> On Oct 5, 2006, at 10:08 AM, Sendu Bala wrote:
>
>> On the contrary, I find it a surprise because EUtilities is an interface
>> to NCBI's eutils, not the entrez website.
>>
>> If I had previously read instructions on using eutils:
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls 
>>
>> I might (do) expect that I /should/ use + in my term.
> 
> This is my point - stick to your definitions. Are you wrapping a query 
> form on a website or are you wrapping a web service (i.e., a URL)?
> 
> The examples you give are about wrapping a web-service. Your original 
> question was about wrapping a website.

Right... I don't see that that changes the answer to my question though 
does it?

"The question is, should a Bioperl module always work with
queries that the web-service it is a front-end to works with?"

For me, the answer is still yes.


> As far as I'm concerned, I'll happily count myself among those who are 
> lazy and ignorant, don't read NCBI's documentation, don't want to know 
> how to URL encode and why this needs to be done, but just want it to work.

That's a reasonable attitude to take. Which comes back to the question I 
asked of Chris - naively, if you send + as + you can please everyone, 
can't you? Both people who have read the docs on the web-service and 
those who haven't? Or are there real queries in which a user may want to 
search for a phrase with a literal + in it (and where such a search 
works via eutils)?


From bix at sendu.me.uk  Thu Oct  5 14:44:33 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 15:44:33 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
Message-ID: <45251A51.6020802@sendu.me.uk>

Chris Fields wrote:
> The bug is in the query and not in the code, i.e. it is a  
> user-generated bug, not an EUtilities bug.  And it shouldn't be 
> unexpected, as NCBI has very specific rules for building queries for 
> Entrez (just like any other database).

So I guess this comes down to something Hilmar mentioned and I never 
even considered before. You consider your EUtilities stuff as a frontend 
to entrez, and therefore consider valid queries as queries that are 
valid for entrez and not eutils?

If that's the case, fine. I understand why you don't think this is a 
bug. Again, something that might warrant a mention in the POD.
Currently the naming of the modules and the explicit references to 
eutils (and me knowing the implementation uses eutils) got me confused.


From cjfields at uiuc.edu  Thu Oct  5 14:51:28 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 09:51:28 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452511C1.5020709@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
Message-ID: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>


On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote:

>>> This is actually a general question and not limited to  
>>> EUtilities. As I
>>> see it EUtilities lets you do queries in Bioperl that you can do  
>>> on a
>>> website. The question is, should a Bioperl module always work with
>>> queries that the website it is a front-end to works with?
>>
>> I think yes, but stick to this definition.
>>
>> Using your example, if you input 'BRCA2+9606[taxid]' on the Entrez
>> website it will actually not work. Hence, it should be no surprise  
>> that
>> it doesn't work either using Bio::DB::EUtilities.
>
> On the contrary, I find it a surprise because EUtilities is an  
> interface
> to NCBI's eutils, not the entrez website.

It uses NCBI's CGI interface for eutils, not the SOAP interface.   
Very different.  I have considered using the NCBI SOAP-based  
interface, but the web services are still somewhat incomplete, unlike  
the CGI interface.

> If I had previously read instructions on using eutils:
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls
> I might (do) expect that I /should/ use + in my term.

You are looking at part of the naked URL on that page.  Here's what  
that page says:

"When constructing URLs for the eUtils, please use lowercase  
characters for all parameters except &WebEnv. There is no required  
order for the URL parameters in an eUtils URL, and null values or  
inappropriate parameters are ignored. Avoid placing spaces in the  
URLs, particularly in queries. If a space is required, use a plus  
sign (+) instead of a space:

     * Incorrect: &id=352, 25125, 234, ...
     * Correct: &id=352,25125,234,...
     * Incorrect: &term=biomol mrna[properties] AND mouse[organism]
     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]

Other special characters, such as the # symbol used in referring to a  
query key on the History server, should be represented by their URL  
encodings (%23 for #)."

I use URI for building the URL with the parameters.  URI specifically  
encodes all of this for you, so spaces convert to '+' and '+'  
converts to %2B.
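
For anyone who wants to see the effect without installing anything, here is a rough stand-in for that encoding step.  It is a hand-rolled sketch, not the URI module's code: `form_encode` is a hypothetical name, it assumes plain ASCII input, and it escapes everything except unreserved characters before writing spaces as '+'.

```perl
use strict;
use warnings;

# Hand-rolled stand-in for form-style query encoding (illustration only).
# Everything except unreserved characters and the space is percent-escaped,
# then spaces are written as '+'.
sub form_encode {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord($1))/ge;
    $value =~ tr/ /+/;
    return $value;
}

# Compare a term written with a space against one written with a '+'.
for my $term ('BRCA2 9606[taxid]', 'BRCA2+9606[taxid]') {
    printf "%-20s -> term=%s\n", $term, form_encode($term);
}
```

The term with a space ends up with '+' on the wire, which is the form the NCBI page calls correct; the term with a '+' ends up with %2B, which the server reads as a literal plus sign.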

>> Aside from that, one of the advantages of having the service  
>> wrapped in
>> Bioperl is in fact that you can have it accept a wider variety of
>> parameters than the actual service would allow you to have, e.g.,
>> arrays, hashes, or whatever seems appropriate.
>
> I was going to suggest that terms be supplied as an array, leaving
> Bioperl code to decide how to 'AND' all the terms (elements in the
> array) together. It would also further force the user not to think of
> how eutils normally works, but to only consider the Bioperl  
> instructions
> on how to form a query. But I'm not sure of the value of all that.

Why do we need to intuit what the user is thinking at any particular
time?  How would I know that someone actually wanted to search using  
the literal string 'abc+123' as opposed to 'abc 123'?

I see value in your last suggestion but I think a class or set of  
classes would be best suited for that:

MySQL Query     |  in                      out   | MySQL Query
Entrez Query    |-----> Generic Query class----->| Entrez Query
SRS Query       |                                | SRS Query
ad infinitum...

The generic query object could then be used in DB searches as an  
option besides using a raw string.  Though it would get tricky with  
SQL's complexity...
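
A very rough sketch of the "out" half of that diagram, just to show the shape.  Every name here (`My::GenericQuery`, `as_entrez`, `as_sql_where`) is hypothetical and nothing like it exists in Bioperl as far as this thread goes; it only handles AND-ed terms and does no parsing of backend-specific query strings.

```perl
use strict;
use warnings;

# Hypothetical generic query object: holds value/field pairs joined by AND
# and renders them for a given backend. All names here are illustrative.
package My::GenericQuery;

sub new {
    my ($class, @terms) = @_;          # each term: [ value, optional field ]
    return bless { terms => [@terms] }, $class;
}

# Entrez style: value[field] pieces joined with an explicit AND.
sub as_entrez {
    my ($self) = @_;
    return join ' AND ',
        map { defined $_->[1] ? $_->[0] . '[' . $_->[1] . ']' : $_->[0] }
        @{ $self->{terms} };
}

# SQL style: a WHERE clause with placeholders plus the bind values.
# Column names are interpolated, so they must come from trusted code.
sub as_sql_where {
    my ($self, $default_column) = @_;
    my (@clauses, @binds);
    for my $term (@{ $self->{terms} }) {
        my $column = defined $term->[1] ? $term->[1] : $default_column;
        push @clauses, "$column = ?";
        push @binds, $term->[0];
    }
    return (join(' AND ', @clauses), @binds);
}

package main;

my $query = My::GenericQuery->new([ 'BRCA2' ], [ '9606', 'taxid' ]);
print $query->as_entrez, "\n";
my ($where, @binds) = $query->as_sql_where('symbol');
print "$where  (@binds)\n";
```

The user never types a '+' or a space-separated string at all in this scheme, so the encoding question goes away; the cost is one more layer to document and maintain.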

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Thu Oct  5 14:54:04 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 10:54:04 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251791.9040409@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<F40B4537-5CFA-4956-82D9-4A7DC989416C@gmx.net>
	<45251791.9040409@sendu.me.uk>
Message-ID: <9916EDEE-EA3C-4C55-A004-A46F37B559BF@gmx.net>


On Oct 5, 2006, at 10:32 AM, Sendu Bala wrote:

>> The examples you give are about wrapping a web-service. Your  
>> original question was about wrapping a website.
>
> Right... I don't see that that changes the answer to my question  
> though does it?
>
> "The question is, should a Bioperl module always work with
> queries that the web-service it is a front-end to works with?"
>
> For me, the answer is still yes.

The answer is still yes. My point was the query that works with a  
website is not necessarily the query that works with a web-service,  
even if that web-service also powers the website.

>
>> As far as I'm concerned, I'll happily count myself among those who  
>> are lazy and ignorant, don't read NCBI's documentation, don't want  
>> to know how to URL encode and why this needs to be done, but just  
>> want it to work.
>
> That's a reasonable attitude to take. Which comes back to the  
> question I asked of Chris - naively, if you send + as + you can  
> please everyone, can't you? Both people who have read the docs on  
> the web-service and those who haven't? Or are there real queries in  
> which a user may want to search for a phrase with a literal + in it  
> (and where such a search works via eutils)?

So are you suggesting to URL-encode some characters but not others?  
This would move you into muddy waters and I'm wondering what the gain  
is from that, and for whom it is a gain.

It sounds like it will mostly benefit those who have studied the NCBI  
documentation and know exactly the URL they want to send and want to  
ignore the EUtilities POD.

My humble guess is the vast majority of people will either not read
any documentation, or read the module's POD.

Maybe a better way to serve both types of people is to accept a  
parameter -querystring that is expected to include everything from  
'term=' onwards (including 'term=' itself) which gives you complete  
control and freedom if you know what you are doing, and otherwise  
implement what you suggested before:

> I was going to suggest that terms be supplied as an array, leaving
> Bioperl code to decide how to 'AND' all the terms (elements in the
> array) together. It would also further force the user not to think of
> how eutils normally works, but to only consider the Bioperl  
> instructions
> on how to form a query.
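
Roughly what I have in mind, as a sketch only.  `build_query_string` and the `-querystring` argument are hypothetical; the current EUtilities code does not accept them, and the encoding step here is a simple ASCII-only stand-in for whatever the module really uses.

```perl
use strict;
use warnings;

# Hypothetical argument handling: -term may be a string or an array ref of
# terms (joined with AND), while -querystring is passed through untouched.
sub build_query_string {
    my (%args) = @_;
    return $args{-querystring} if defined $args{-querystring};

    my $term = $args{-term};
    $term = join ' AND ', @$term if ref($term) eq 'ARRAY';

    # Encode for the URL: escape everything but unreserved characters,
    # then write spaces as '+'.
    $term =~ s/([^A-Za-z0-9\-_.~ ])/sprintf('%%%02X', ord($1))/ge;
    $term =~ tr/ /+/;
    return "term=$term";
}

# Array form: the module decides how to AND the terms and how to encode.
print build_query_string(-term => [ 'BRCA2', '9606[taxid]' ]), "\n";
# Raw form: the caller takes full responsibility for the encoding.
print build_query_string(-querystring => 'term=BRCA2+9606[taxid]'), "\n";
```

People who have studied the NCBI documentation get the raw form, and everybody else gets terms that just work.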


	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Thu Oct  5 15:02:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 16:02:01 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
Message-ID: <45251E69.7040507@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 9:08 AM, Sendu Bala wrote:
>
>> On the contrary, I find it a surprise because EUtilities is an interface
>> to NCBI's eutils, not the entrez website.
> 
> It uses NCBI's CGI interface for eutils, not the SOAP interface.  Very 
> different.  I have considered using the NCBI SOAP-based interface, but 
> the web services are still somewhat incomplete, unlike the CGI interface.

I don't know anything about the SOAP interface. I'm talking about the 
CGI interface that you use.


>> If I had previously read instructions on using eutils:
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=coursework.section.constructing-urls 
>>
>> I might (do) expect that I /should/ use + in my term.
> 
> You are looking at part of the naked URL on that page.  Here's what that 
> page says:

I know what it says...

>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]

The correct query is the one that has +s in it.


> I use URI for building the URL with the parameters.  URI specifically 
> encodes all of this for you, so spaces convert to '+' and '+' converts 
> to %2B.

Well, yes. This causes what I thought of as a bug. It prevents me from 
submitting a /correct/ eutils term. However it isn't a bug if you 
explain to users they shouldn't be submitting valid eutils terms, but 
only valid /entrez/ terms.


From cjfields at uiuc.edu  Thu Oct  5 15:15:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:15:49 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251A51.6020802@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
	<45251A51.6020802@sendu.me.uk>
Message-ID: <B0BCFFEE-9200-4CC7-9D53-7C2266950691@uiuc.edu>


On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> The bug is in the query and not in the code, i.e. it is a
>> user-generated bug, not an EUtilities bug.  And it shouldn't be
>> unexpected, as NCBI has very specific rules for building queries  
>> for Entrez (just like any other database).
>
> So I guess this comes down to something Hilmar mentioned and I  
> never even considered before. You consider your EUtilities stuff as  
> a frontend to entrez, and therefore consider valid queries as  
> queries that are valid for entrez and not eutils?

The eutils tools access the same databases as the web page, in the  
same way, using the same search terms.  From the EUtilities docs:

"The eUtils access the core search and retrieval engine of the Entrez  
system and, therefore, are only capable of retrieving data that are  
already in Entrez."

> If that's the case, fine. I understand why you don't think this is  
> a bug. Again, something that might warrant a mention in the POD.
> Currently the naming of the modules and the explicit references to  
> eutils (and me knowing the implementation uses eutils) got me  
> confused.

I'll note in POD that there is URI encoding, but that should be a
no-brainer.  I don't think every Bio::DB* class specifies this,  
mainly because it is taken for granted.  Pretty much anything that  
builds URL strings needs to encode based on the URI standard, and any  
server that accepts URLs is expected to decode using the same standard.

So, again, why does that have to be specifically outlined in POD?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 15:24:39 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:24:39 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251E69.7040507@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
Message-ID: <BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>

>> I use URI for building the URL with the parameters.  URI  
>> specifically encodes all of this for you, so spaces convert to '+'  
>> and '+' converts to %2B.
>
> Well, yes. This causes what I thought of as a bug. It prevents me  
> from submitting a /correct/ eutils term. However it isn't a bug if  
> you explain to users they shouldn't be submitting valid eutils  
> terms, but only valid /entrez/ terms.

I can specify in POD that URI encoding is in effect if that placates  
you, and maybe add a bit about how terms are to be built (based on  
the website).  I also noticed that the esearch POD doesn't have a  
demo in the SYNOPSIS yet (my fault).

However, I think this is all a bit silly.  This is something most  
people already realize and take for granted (it's standard for any  
CGI interface to use URI encoding).

Also, most Entrez users do not use a term like 'BRCA2+Human[ORGANISM]'.
They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human[ORGANISM]', the
latter of which is an implicit AND.  All of this is on the Entrez website.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From MEC at stowers-institute.org  Thu Oct  5 15:12:02 2006
From: MEC at stowers-institute.org (Cook, Malcolm)
Date: Thu, 5 Oct 2006 10:12:02 -0500
Subject: [Bioperl-l] using nfreeze instead of freeze in
	Bio::SeqFeature::Store
Message-ID: <CED81D34E37D5043A1211565277A51E5065E9879@exchkc02.stowers-institute.org>

Lincoln,

I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
freeze which should allow SeqFeature objects to survive database
freeze/thaw cycles across architectures.
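
For anyone unfamiliar with the difference, here is a small sketch using
Storable directly.  The hash is a hypothetical stand-in for a feature
object, not the actual Bio::SeqFeature::Store code.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Hypothetical stand-in for a feature object.
my $feature = { seq_id => 'chr1', start => 100, end => 200, strand => 1 };

# nfreeze() serializes in network byte order, so the frozen string can be
# thawed on a machine with a different byte order. thaw() reads both the
# native (freeze) and the network (nfreeze) form.
my $frozen = nfreeze($feature);
my $copy   = thaw($frozen);

print "$copy->{seq_id}:$copy->{start}..$copy->{end}\n";
```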

I hope I was not presumptuous or in error in doing this....

Regards,

Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri
 

From bix at sendu.me.uk  Thu Oct  5 15:28:55 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 16:28:55 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <B0BCFFEE-9200-4CC7-9D53-7C2266950691@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<A19D72BE-DDF1-4296-B0CB-F75D50BC2843@uiuc.edu>
	<45251A51.6020802@sendu.me.uk>
	<B0BCFFEE-9200-4CC7-9D53-7C2266950691@uiuc.edu>
Message-ID: <452524B7.5080003@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 9:44 AM, Sendu Bala wrote:
> 
>> Chris Fields wrote:
>>> The bug is in the query and not in the code, i.e. it is a  
>>> user-generated bug, not an EUtilities bug.  And it shouldn't be 
>>> unexpected, as NCBI has very specific rules for building queries for 
>>> Entrez (just like any other database).
>>
>> So I guess this comes down to something Hilmar mentioned and I never 
>> even considered before. You consider your EUtilities stuff as a 
>> frontend to entrez, and therefore consider valid queries as queries 
>> that are valid for entrez and not eutils?
> 
> The eutils tools access the same databases as the web page, in the same 
> way, using the same search terms.

They don't. The eutils interface behaves differently with +s than does 
the entrez website interface. In eutils + means space, whilst in entrez, 
+ means the plus symbol.


>> If that's the case, fine. I understand why you don't think this is a 
>> bug. Again, something that might warrant a mention in the POD.
>> Currently the naming of the modules and the explicit references to 
>> eutils (and me knowing the implementation uses eutils) got me confused.
> 
> I'll note in the POD that there is URI encoding, but that should be a 
> no-brainer.

Just that it is URI encoded isn't the problem. The problem is the 
difference in behaviour outlined above.


> I don't think every Bio::DB* class specifies this, mainly 
> because it is taken for granted.  Pretty much anything that builds URL 
> strings needs to encode based on the URI standard, and any server that 
> accepts URLs is expected to decode using the same standard.
> 
> So, again, why does that have to be specifically outlined in POD?

Because they're different. If I construct a valid eutils query it might 
not work. You ought to explain why.

"EUtilities takes any valid entrez query and transforms it into a valid 
eutils query for submission. Do not try and provide a valid eutils query 
of your own, or the extra transformation will result in no results"
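
The "extra transformation" can be sketched as an encode/decode round
trip.  Both helpers are hypothetical stand-ins (form-style escaping is
assumed), not code from EUtilities or URI; form_decode() approximates
what a server would do with the submitted value.

```perl
use strict;
use warnings;

# Hypothetical helper: space -> '+', other reserved characters -> %XX.
sub form_encode {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9\-._~ ])/sprintf('%%%02X', ord($1))/ge;
    $value =~ s/ /+/g;
    return $value;
}

# Hypothetical helper: the reverse step, roughly what a server applies.
sub form_decode {
    my ($value) = @_;
    $value =~ tr/+/ /;
    $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $value;
}

my $entrez_style = 'biomol mrna[properties] AND mouse[organism]';
my $eutils_style = 'biomol+mrna[properties]+AND+mouse[organism]';

# What the server ends up searching for in each case.
my $seen_ok  = form_decode(form_encode($entrez_style));
my $seen_bad = form_decode(form_encode($eutils_style));

print "$seen_ok\n";    # biomol mrna[properties] AND mouse[organism]
print "$seen_bad\n";   # biomol+mrna[properties]+AND+mouse[organism]
```

In the second case the +s survive as literal plus signs instead of
spaces, which is the difference in behaviour described above.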


From bix at sendu.me.uk  Thu Oct  5 15:30:44 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 16:30:44 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
Message-ID: <45252524.7030006@sendu.me.uk>

Chris Fields wrote:
>>> I use URI for building the URL with the parameters.  URI specifically 
>>> encodes all of this for you, so spaces convert to '+' and '+' 
>>> converts to %2B.
>>
>> Well, yes. This causes what I thought of as a bug. It prevents me from 
>> submitting a /correct/ eutils term. However it isn't a bug if you 
>> explain to users they shouldn't be submitting valid eutils terms, but 
>> only valid /entrez/ terms.
> 
> I can specify in POD that URI encoding is in effect if that placates 
> you, and maybe add a bit about how terms are to be built (based on the 
> website).  I also noticed that the esearch POD doesn't have a demo in 
> the SYNOPSIS yet (my fault).
> 
> However, I think this is all a bit silly.  This is something most people 
> already realize and take for granted (it's standard for any CGI 
> interface to use URI encoding).
> 
> Also, most Entrez users do not use a term like 'BRCA2+Human[ORGANISM]'.  
> They use 'BRCA2 AND Human[ORGANISM]' or 'BRCA2 Human[ORGANISM]', the 
> latter which is implicit.  All of this is on the Entrez website.

Exactly. You're assuming an entrez user and expecting an entrez query. I 
don't think it's silly given the name of the modules for the user to 
assume the code needs an eutils query, which is a different thing with 
different behaviour /independent/ of URI encoding.


From cjfields at uiuc.edu  Thu Oct  5 15:50:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:50:51 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45251E69.7040507@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
Message-ID: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>

> I know what it says...

Ah, that's the Sendu I know and love.

>
>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>
> The correct query is the one that has +s in it.

Yes, that's because it's a URL, not a raw search term string (it has  
been URI-encoded so spaces are converted to '+').  If you use that as  
a direct query in Entrez you will not get the same response.  You do  
get something if you use the new NCBI global query form on the main  
page, but clicking on the nucleotide or PMC hits reveals that the URL  
is malformed and no term is present.  That is exactly the same  
response in EUtilities:

<?xml version="1.0"?>
<!DOCTYPE eSearchResult PUBLIC "-//NLM//DTD eSearchResult, 11 May 2002//EN"
 "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eSearch_020511.dtd">
<eSearchResult>
         <Count>0</Count>
         <RetMax>0</RetMax>
         <RetStart>0</RetStart>
         <IdList>
         </IdList>
         <TranslationSet>
         </TranslationSet>
         <QueryTranslation></QueryTranslation>
</eSearchResult>

Note the QueryTranslation tag is empty.
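
A quick way to spot this kind of response in a script, as a sketch
only: the regexes below are a shortcut for illustration and not a real
XML parser, and the XML is trimmed from the response shown above.

```perl
use strict;
use warnings;

# Trimmed copy of the esearch response shown above.
my $xml = <<'XML';
<?xml version="1.0"?>
<eSearchResult>
         <Count>0</Count>
         <RetMax>0</RetMax>
         <RetStart>0</RetStart>
         <IdList>
         </IdList>
         <QueryTranslation></QueryTranslation>
</eSearchResult>
XML

# Pull out the hit count and the translated query.
my ($count)       = $xml =~ m{<Count>(\d+)</Count>};
my ($translation) = $xml =~ m{<QueryTranslation>(.*?)</QueryTranslation>}s;

if ($count == 0 && $translation eq '') {
    print "No hits and no query translation: check how the term was built\n";
}
```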

The only noticeable difference is using egquery (which I just fixed  
in CVS yesterday).  The returned XML gives no hits for any database,  
which is true based on individual esearch queries for those databases,  
and is actually more consistent than the website version.

>> I use URI for building the URL with the parameters.  URI specifically
>> encodes all of this for you, so spaces convert to '+' and '+'  
>> converts
>> to %2B.
>
> Well, yes. This causes what I thought of as a bug. It prevents me from
> submitting a /correct/ eutils term. However it isn't a bug if you
> explain to users they shouldn't be submitting valid eutils terms, but
> only valid /entrez/ terms.

If you mean that most users will actually use a URL-like search term,  
then I would say you have a point.  But that simply isn't the case.

If clarifying the docs makes it better, then so be it.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 15:59:53 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 10:59:53 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45252524.7030006@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
Message-ID: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>


On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>>> I use URI for building the URL with the parameters.  URI  
>>>> specifically encodes all of this for you, so spaces convert to  
>>>> '+' and '+' converts to %2B.
>>>
>>> Well, yes. This causes what I thought of as a bug. It prevents me  
>>> from submitting a /correct/ eutils term. However it isn't a bug  
>>> if you explain to users they shouldn't be submitting valid eutils  
>>> terms, but only valid /entrez/ terms.
>> I can specify in POD that URI encoding is in effect if that  
>> placates you, and maybe add a bit about how terms are to be built  
>> (based on the website).  I also noticed that the esearch POD  
>> doesn't have a demo in the SYNOPSIS yet (my fault).
>> However, I think this is all a bit silly.  This is something most  
>> people already realize and take for granted (it's standard for any  
>> CGI interface to use URI encoding).
>> Also, most Entrez users do not use a term like
>> 'BRCA2+Human[ORGANISM]'.  They use 'BRCA2 AND Human[ORGANISM]' or
>> 'BRCA2 Human[ORGANISM]', the latter which is implicit.  All of this
>> is on the Entrez website.
>
> Exactly. You're assuming an entrez user and expecting an entrez  
> query. I don't think it's silly given the name of the modules for  
> the user to assume the code needs an eutils query, which is a  
> different thing with different behaviour /independent/ of URI  
> encoding.

It's a silly distinction.  The POD for Bio::DB::EUtilities states:

Bio::DB::EUtilities - interface for handling web queries and data  
retrieval from NCBI's Entrez Utilities.

My question is this : why would anyone (particularly the everyday  
bioperl user) want to use URL-encoded parameters for a query?  That  
seems to be your main argument here.  If so, wouldn't I just paste  
them together then send them off to NCBI eutils?  Would I devote ~ 10  
classes to that?  I could do that in a short program using an array,  
join, and LWP::Simple.

The purpose is quite clearly stated, but if the aim of badgering me  
was to get something I consider common sense added to the POD, then  
you've succeeded.  Bravo.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 16:02:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 17:02:05 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
Message-ID: <45252C7D.3050009@sendu.me.uk>

Chris Fields wrote:
>
>>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>
>> The correct query is the one that has +s in it.
> 
> Yes, that's because it's a URL, not a raw search term string (it has 
> been URI-encoded so spaces are converted to '+').  If you use that as a 
> direct query in Entrez you will not get the same response.

But we're not doing Entrez queries. We're using a module called 
EUtilities to do an eutils query, which involves forming a url in which 
spaces should be converted to +. That's the source of confusion. Is 
the user supposed to do this, or is EUtilities?

All you had to do 8 emails ago is tell me that EUtilities is supposed to 
do that. You /still/ haven't told me that. I give up.


From cjfields at uiuc.edu  Thu Oct  5 16:12:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 11:12:11 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45252C7D.3050009@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
	<45252C7D.3050009@sendu.me.uk>
Message-ID: <A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>


On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote:

> Chris Fields wrote:
>>
>>>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>>
>>> The correct query is the one that has +s in it.
>> Yes, that's because it's a URL, not a raw search term string (it  
>> has been URI-encoded so spaces are converted to '+').  If you use  
>> that as a direct query in Entrez you will not get the same response.
>
> But we're not doing Entrez queries. We're using a module called  
> EUtilities to do an eutils query, which involves forming a url in  
> which spaces should be converted to +. That's the source of  
> confusion. Is the user supposed to do this, or is EUtilities?
>
> All you had to do 8 emails ago is tell me that EUtilities is  
> supposed to do that. You /still/ haven't told me that. I give up.

It should be apparent from the documentation and the URLs posted in  
debugging output the first few times you used it.  Again, why would I  
dedicate ~ 10 classes to pasting together URI-encoded strings?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Thu Oct  5 16:22:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 17:22:36 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
Message-ID: <4525314C.7020205@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 10:30 AM, Sendu Bala wrote:
>
>> Exactly. You're assuming an entrez user and expecting an entrez query. 
>> I don't think it's silly given the name of the modules for the user to 
>> assume the code needs an eutils query, which is a different thing with 
>> different behaviour /independent/ of URI encoding.
> 
> It's a silly distinction.  The POD for Bio::DB::EUtilities states:
> 
> Bio::DB::EUtilities - interface for handling web queries and data 
> retrieval from NCBI's Entrez Utilities.
> 
> My question is this : why would anyone (particularly the everyday 
> bioperl user) want to use URL-encoded parameters for a query?

Well I'll tell you why I was trying to use URL-encoded parameters, if 
that helps you any.

I read the pod for EUtilities but all the examples have very simple 
-term s defined with just a single word. So I wonder how I'm supposed to 
make an 'AND' term. I also have no idea what utilities I'm supposed to 
use, or what databases etc. I need to get the answer I want.

The POD points me here:
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
Combined with the EUtilities synopsis I know I'm supposed to start with 
esearch so I look at:
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esearch_help.html
And figure out what my terms are supposed to be.

Then I test some example terms in my web browser using the esearch base 
url (http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?) to see 
if they work, and copy/paste the terms into my EUtilities-using perl 
script, replacing variable terms with perl variables.

Then I find that my terms don't work, ask you about it, and you fail to 
tell me I should be testing my terms at 
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=gene.

If you think I'm stupid, fine, but I'm probably not the only stupid 
person on the planet. Which is why I suggested a POD addition. You don't 
have to make any POD change if you don't want to. I simply thought it 
might help avoid anyone 'badgering' you in the future with a similar 
problem.


From bix at sendu.me.uk  Thu Oct  5 16:28:51 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 17:28:51 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
	<45252C7D.3050009@sendu.me.uk>
	<A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>
Message-ID: <452532C3.9030804@sendu.me.uk>

Chris Fields wrote:
> 
> On Oct 5, 2006, at 11:02 AM, Sendu Bala wrote:
> 
>> Chris Fields wrote:
>>>
>>>>>     * Correct: &term=biomol+mrna[properties]+AND+mouse[organism]
>>>>
>>>> The correct query is the one that has +s in it.
>>> Yes, that's because it's a URL, not a raw search term string (it has 
>>> been URI-encoded so spaces are converted to '+').  If you use that as 
>>> a direct query in Entrez you will not get the same response.
>>
>> But we're not doing Entrez queries. We're using a module called 
>> EUtilities to do an eutils query, which involves forming a url in 
>> which spaces should be converted to +. That's the source of 
>> confusion. Is the user supposed to do this, or is EUtilities?
>>
>> All you had to do 8 emails ago is tell me that EUtilities is supposed 
>> to do that. You /still/ haven't told me that. I give up.
> 
> It should be apparent from the documentation and the URLs posted in 
> debugging output the first few times you used it.  Again, why would I 
> dedicate ~ 10 classes to pasting together URI-encoded strings?

I'm not sure how not doing URI-encoding would suddenly make your classes 
worthless. I find them to be very useful (even when I didn't know there 
was any URI-encoding, was incorrectly using +s and it happened to work 
anyway).


From bernd.web at gmail.com  Thu Oct  5 14:09:38 2006
From: bernd.web at gmail.com (Bernd Web)
Date: Thu, 5 Oct 2006 16:09:38 +0200
Subject: [Bioperl-l] Eutilities Batch
Message-ID: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>

Hi,

I am using the new EUtilities. It looks great.
I was trying to use epost followed by elink but I get an error. The
same error is actually given with the example on
http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
Can't call method "get_databases" on an undefined value at EU.pl line 25.

For completeness, the code is shown below too.

Any suggestions what is going wrong?

Regards,
Bernd

# chain EUtilities for complex queries

  use Bio::DB::EUtilities;

  my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                         -db         => 'pubmed',
                                         -term       => 'hutP',
                                         -usehistory => 'y');

  $esearch->get_response; # parse the response, fetch a cookie

  my $elink = Bio::DB::EUtilities->new(-eutil        => 'elink',
                                       -db           => 'protein,taxonomy',
                                       -dbfrom       => 'pubmed',
                                       -cookie       => $esearch->next_cookie,
                                       -cmd          => 'neighbor');

  # this retrieves the Bio::DB::EUtilities::ElinkData object

  my ($linkset) = $elink->next_linkset;
  my @ids;

  # step through IDs for each linked database in the ElinkData object

  for my $db ($linkset->get_databases) {
    @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
    # do something here
  }


From cjfields at uiuc.edu  Thu Oct  5 17:31:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:31:33 -0500
Subject: [Bioperl-l] Eutilities Batch
In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
Message-ID: <F53B83B9-E188-4715-8229-0B6D9C0C982A@uiuc.edu>

I'll look into it.  I'm busy updating the EUtilities tools now.

Chris

On Oct 5, 2006, at 9:09 AM, Bernd Web wrote:

> Hi,
>
> I am using the new EUtilities. It looks great.
> I was trying to use epost followed by elink but i get an error. The
> same error is actually given with the example on
> http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
> Can't call method "get_databases" on an undefined value at EU.pl  
> line 25.
>
> For completeness, the code is shown below too.
>
> Any suggestions what is going wrong?
>
> Regards,
> Bernd
>
> # chain EUtilities for complex queries
>
>   use Bio::DB::EUtilities;
>
>   my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                          -db         => 'pubmed',
>                                          -term       => 'hutP',
>                                          -usehistory => 'y');
>
>   $esearch->get_response; # parse the response, fetch a cookie
>
>   my $elink = Bio::DB::EUtilities->new(-eutil        => 'elink',
>                                        -db           => 'protein,taxonomy',
>                                        -dbfrom       => 'pubmed',
>                                        -cookie       => $esearch->next_cookie,
>                                        -cmd          => 'neighbor');
>
>   # this retrieves the Bio::DB::EUtilities::ElinkData object
>
>   my ($linkset) = $elink->next_linkset;
>   my @ids;
>
>   # step through IDs for each linked database in the ElinkData object
>
>   for my $db ($linkset->get_databases) {
>     @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
>     # do something here
>   }
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From daniel.lang at biologie.uni-freiburg.de  Thu Oct  5 17:12:02 2006
From: daniel.lang at biologie.uni-freiburg.de (Daniel Lang)
Date: Thu, 05 Oct 2006 19:12:02 +0200
Subject: [Bioperl-l] Bio::DB::SeqFeature
Message-ID: <45253CE2.1070208@biologie.uni-freiburg.de>

Hi,

we are storing Bio::SeqFeature::Gene::GeneStructure objects (with
multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db
(latest bioperl-live checkout).

The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch
out of a database.

The first observation is that it seems to work (fetched objects behave
like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we
get these warnings:

Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into
lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into
lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
        (in cleanup) Not a CODE reference at
/home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.
prepare_cached(SELECT f.id,f.object
  FROM feature as f
  WHERE (   f.seqid=?
   AND   f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?)
         OR (f.tier=? AND f.bin between ? AND ?))
)

) statement handle DBI::st=HASH(0x1c317cf0) still Active at
/home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm
line 1422
        (in cleanup) Not a CODE reference at
/home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.

Is this something serious? Does this mean that the stored object doesn't
have everything it had before freezing? Or are we using
Bio::DB::SeqFeature inappropriately?
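
One possible reading of these warnings, sketched with plain Storable.
This assumes $Storable::forgive_me is in effect somewhere along the
way, which is an assumption and not something confirmed here; the
object below is a hypothetical stand-in for a feature holding a code
reference.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical object that carries a code reference.
my $object = { name => 'gene1', callback => sub { return 42 } };

# By default Storable refuses to serialize a code reference and dies.
my $frozen = eval { freeze($object) };
my $default_error = $@;

# With forgive_me set, Storable warns and stores a placeholder instead.
my @warnings;
{
    local $Storable::forgive_me = 1;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    $frozen = freeze($object);
}

my $copy = thaw($frozen);

print "default error: $default_error";
print "warning: $_" for @warnings;
print "callback is still code: ", (ref($copy->{callback}) eq 'CODE' ? 'yes' : 'no'), "\n";
```

Under that assumption the plain data survives the freeze/thaw cycle
but the code reference does not, so anything that later tries to call
it would fail.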

The other question would be, if we can visualize these stored feature
objects easily using gbrowse? I didn't find a hint mentioning
Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages...
Is it working already? Will it?

Thanks in advance,
Daniel

-- 

Daniel Lang
University of Freiburg, Plant Biotechnology
Schaenzlestr. 1, D-79104 Freiburg
fax: +49 761 203 6945
phone: +49 761 203 6974
homepage:  http://www.plant-biotech.net/
e-mail: daniel.lang at biologie.uni-freiburg.de

#################################################
My software never has bugs.
It just develops random features.
#################################################


From cjfields at uiuc.edu  Thu Oct  5 17:45:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 12:45:40 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <452532C3.9030804@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<1B69C64A-017D-4A38-A59B-1719E97C6FBB@uiuc.edu>
	<45252C7D.3050009@sendu.me.uk>
	<A382A4BF-EFF3-4673-9A9A-5AEFF30CED16@uiuc.edu>
	<452532C3.9030804@sendu.me.uk>
Message-ID: <003DD8C4-6E59-44C2-9A1C-117E036D93BC@uiuc.edu>


On Oct 5, 2006, at 11:28 AM, Sendu Bala wrote:

> I'm not sure how not doing URI-encoding would suddenly make your  
> classes worthless. I find them to be very useful (even when I  
> didn't know there was any URI-encoding, was incorrectly using +s  
> and it happened to work anyway).

That's not my point (and sincerest apologies for the 'badgering'  
bit).  If you made the assumption that all the parameters had to be  
URI-encoded, why couldn't I do something like:

my %param = (#make up your list of parameters here#);
my $eutil = 'esearch';
my $url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/$eutil.fcgi";
# join the key value pairs with '=', then join all those with &
# add to end of url
# post and retrieve via LWP::Simple

It's more user-friendly to set up the parameters so that you wouldn't  
have to encode everything yourself, esp. when the most reliable way  
to encode URI strings is to 'use URI'.
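
Spelled out, the do-it-yourself version would look roughly like this.
The parameter values are hypothetical and assumed to be URI-encoded by
hand already, and the fetch itself is left out so nothing here touches
the network.

```perl
use strict;
use warnings;

# Hypothetical parameters; the values are assumed to be encoded already.
my %param = (
    db   => 'pubmed',
    term => 'BRCA2+AND+Human%5BORGANISM%5D',
);
my $eutil = 'esearch';
my $url   = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/$eutil.fcgi";

# Join each key/value pair with '=', then join all of the pairs with '&'.
my $query    = join '&', map { "$_=$param{$_}" } sort keys %param;
my $full_url = "$url?$query";

print "$full_url\n";

# Retrieval would then be a single call such as LWP::Simple::get($full_url).
```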

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 18:11:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 13:11:25 -0500
Subject: [Bioperl-l] Eutilities Batch
In-Reply-To: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
References: <716af09c0610050709j24107fecl2c9e420def0028b3@mail.gmail.com>
Message-ID: <4A340977-C6AD-4728-8947-BF5A8A782807@uiuc.edu>


On Oct 5, 2006, at 9:09 AM, Bernd Web wrote:

> Hi,
>
> I am using the new EUtilities. It looks great.
> I was trying to use epost followed by elink but i get an error. The
> same error is actually given with the example on
> http://doc.bioperl.org/bioperl-live/Bio/DB/EUtilities/elink.html:
> Can't call method "get_databases" on an undefined value at EU.pl  
> line 25.
>
> For completeness, the code is shown below too.
>
> Any suggestions what is going wrong?
>
> Regards,
> Bernd

Grr...that's my error, sorry Bernd.  The POD wasn't updated to match  
the change I made and has a few errors.  The elink object, for  
starters, doesn't fetch the response using get_response().  Also, the  
ElinkData method has changed slightly but accomplishes the same  
thing.  Odd, since I copied and pasted that from working code...

Just a note: these are considered highly experimental at the moment,  
though they should be ready for general use and toying around.  I  
would like any suggestions on methods and so on you may have (Sendu  
has made some very helpful ones off-list which I plan on implementing).

Feel free to let me know if something doesn't work.  Note that,  
because of their experimental nature, you will want to take note of  
any method changes in particular as I try to solidify the API and  
clean up the POD, so expect some momentary 'outages'.  I plan on  
setting up a remedial interface for all the container objects (like  
ElinkData) which will help clarify things and solidify the API in the  
next few weeks, at least to a point where the class methods have a  
consistent naming scheme.  I plan on using this as a backend web  
agent for a general Entrez interface at some point to get data into  
Bio* objects.

In the meantime, try this:

use Bio::DB::EUtilities;

my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                        -db         => 'pubmed',
                                        -term       => 'hutP',
                                        -usehistory => 'y');

$esearch->get_response; # parse the response, fetch a cookie

my $elink = Bio::DB::EUtilities->new(-eutil        => 'elink',
                                      -db           => 'protein,taxonomy',
                                      -dbfrom       => 'pubmed',
                                      -cookie       => $esearch->next_cookie,
                                      -cmd          => 'neighbor');

$elink->get_response;

# this retrieves the Bio::DB::EUtilities::ElinkData object

my $linkset = $elink->next_linkset;
my @ids;

# step through IDs for each linked database in the ElinkData object

for my $db ($linkset->get_all_linkdbs) {
   @ids = $linkset->get_LinkIds_by_db($db); #returns primary ID's
   print join q(,), @ids;
   # do something here
}


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From dmessina at wustl.edu  Thu Oct  5 18:07:56 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 5 Oct 2006 13:07:56 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
Message-ID: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>

I'm pleased to announce a revised version of the BioPerl Deobfuscator  
is now available. Many thanks to Mauricio Cuadra for updating  
bioperl.org's installation:

http://bioperl.org/cgi-bin/deob_interface.cgi

I've incorporated many of the suggestions you all sent in after the  
first release, and many of the modules that had non-standard  
documentation have been updated in the meantime, too, so hopefully  
you'll find it much improved. There are still some issues with a few  
modules; please report any problems you see. Also, it's now indexing  
bioperl-live instead of 1.4, which should make it a little more  
useful, too. A complete list of changes is below.

I welcome your bug reports and suggestions for improvements, via  
email, this list, Bugzilla, or the Wiki page.


Thanks,
Dave


Changes

0.0.3  Mon Oct  2 20:01:45 CDT 2006
        FIX: change default $deob_detail_path to be a relative URL instead
             of having localhost hardcoded. Thanks to Jason Stajich for
             pointing this out.
        FIX: Bio::Ontology modules are no longer missing their prefix in
             the class list, and their methods are now shown in the lower
             pane as expected. Thanks to Hilmar Lapp for reporting this bug.
        FIX: can now handle (and ignore) VERSION POD section.
        FIX: missing SYNOPSIS section now handled properly. In fact, the
             SYNOPSIS and DESCRIPTION sections can be in reverse order now,
             although for consistency this is not recommended.
        FIX: Bug #2114: "Obfuscator doesn't show 'Bio:Matrix:Generic'" has
             been fixed. This bug turned out to afflict multiple modules,
             which weren't getting parsed correctly by deob_index.pl.
        NEW: Table cells have been padded out to get rid of that
             "scrunched" look. Thanks to Sendu Bala for this great
             suggestion.
        NEW: If the 'Returns' subsection of a method's documentation
             contains a POD L<> link, the Deobfuscator assumes this to be
             a package name, and wraps it in an href for display. This
             feature is not robust, but seems to work well enough for now.
        NEW: the list of classes is now sorted alphabetically depth-first,
             so that subclasses appear just after their parent class.
             Thanks to Amir Karger for noticing the strange sorting
             behavior.
        NEW: HTML page title now 'BioPerl Deobfuscator' to distinguish it
             from other Deobfuscators out there. Thanks to Amir Karger for
             suggesting this.
        NEW: 'No match' search string now more prominent. Yep, kudos to
             Amir Karger again -- another great idea!
        NEW: Search box caption now explicitly states that only package
             names can be searched. Big ups to Amir Karger for this
             suggestion. The ability to search method names is planned for
             a future version.
        NEW: added -x option to deob_index.pl. This allows the use of an
             'excluded modules' file. This feature was added to resolve an
             issue with four modules which rely on external modules to
             compile. Class::Inspector, used by the Deobfuscator needs to
             load a module to traverse its inheritance tree, and modules
             must compile before they can be loaded.
     CHANGE: using short name now when traversing with File::Find to help
             identify excluded modules (deob_index.pl).
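
As a rough illustration of the last two entries, here is a minimal sketch of skipping excluded modules by short file name during a File::Find traversal. The helper name find_modules and the way the exclusion list is passed in are assumptions made for the example; this is not the code in deob_index.pl.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: collect .pm files under $dir, skipping any whose
# short file name (e.g. "Foo.pm") appears in the exclusion list.
sub find_modules {
    my ($dir, @excluded) = @_;
    my %skip = map { $_ => 1 } @excluded;
    my @found;
    find(
        sub {
            # Inside the callback, $_ is the short name and
            # $File::Find::name is the full path.
            return unless -f $_ && /\.pm\z/;
            return if $skip{$_};
            push @found, $File::Find::name;
        },
        $dir
    );
    my @sorted = sort @found;
    return @sorted;
}
```

Matching on the short name keeps the exclusion file simple, at the cost of also skipping any same-named file in another directory.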


From lincoln.stein at gmail.com  Thu Oct  5 18:41:08 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:41:08 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC1
In-Reply-To: <DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
References: <4521A552.60301@sendu.me.uk> <20061003023031.GI14409@iucha.net>
	<DCB8C5F7-34AD-491E-A554-F16543D83C90@uiuc.edu>
Message-ID: <6dce9a0b0610051141x6b61407ar1c0a13cf7616b35f@mail.gmail.com>

The non-numeric comparison bug in Bio::DB::SeqFeature is fixed in the
latest CVS. Do I need to do anything special to get the CVS fixes into
the release candidate?

Lincoln

On 10/2/06, Chris Fields <cjfields at uiuc.edu> wrote:
> > [I won't create a wiki account just to report this.]
> >
> > Running on Debian testing/unstable, using Perl 5.8.8 and BIOPERLDEBUG
> > not set.  Lots of warnings about missing packages and all, but this
> > looks interesting:
> >
> >    Argument "+" isn't numeric in numeric lt (<) at Bio/DB/
> > SeqFeature/Segment.pm line 423.
>
> This is verified on Mac OS X.
>
> > Otherwise:
> >
> >    Failed 1/239 test scripts, 99.58% okay. 1/11864 subtests failed,
> > 99.99% okay.
> >
> > The failed test is:
> >
> >    t/ESEfinder..................dubious
> >       Test returned status 255 (wstat 65280, 0xff00)
> >    DIED. FAILED test 15
>
> What do you get when you run that set of tests using 'perl -I. -w t/
> ESEFinder.t'?  The bad status code is odd and could be a remote
> server issue.
>
> Chris
>
>
> >
> > florin
> >
> > --
> > If we wish to count lines of code, we should not regard them as lines
> > produced but as lines spent.                       -- Edsger Dijkstra
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From MEC at stowers-institute.org  Thu Oct  5 19:18:08 2006
From: MEC at stowers-institute.org (Cook, Malcolm)
Date: Thu, 5 Oct 2006 14:18:08 -0500
Subject: [Bioperl-l] using nfreeze instead of freeze in
	Bio::SeqFeature::Store
Message-ID: <CED81D34E37D5043A1211565277A51E5065E9897@exchkc02.stowers-institute.org>


Yes, there is overhead (cf. perldoc Storable):

    "When writing in network order, all fields are written
    out as standard lengths, which allows full interworking, but takes
    longer to read and write)"

And I suppose there is also a risk of losing precision when using network
order:

    You can also store data in network order to allow easy sharing across
    multiple platforms, or when storing on a socket known to be remotely
    connected. The routines to call have an initial "n" prefix for
    *network*, as in "nstore" and "nstore_fd". At retrieval time, your
    data will be correctly restored so you don't have to know whether
    you're restoring from native or network ordered data. Double values
    are stored stringified to ensure portability as well, at the slight
    risk of loosing some precision in the last decimals.

So, I agree, it should be a configuration option, perhaps defaulting to
using network order.

However, given the factoring of ../Bio/DB/SeqFeature/Store.pm I'm not
sure how to best make it a configuration option since the two provided
serializers don't share a common interface.  Possibly something like:

=head1 Methods for Connecting and Initializing a Database

=head2 new

 Title   : new
 Usage   : $db = Bio::DB::SeqFeature::Store->new(@options)
 Function: connect to a database
 Returns : A descendent of Bio::DB::SeqFeature::Store
 Args    : several - see below
 Status  : public

This class method creates a new database connection. The following
-name=E<gt>$value arguments are accepted:

 Name               Value
 ----               -----

 -adaptor           The name of the Adaptor class (default DBI::mysql)

 -serializer        The name of the serializer class (default Storable)

 -network_order     Strive to 'preserve network order' (if the serializer
                    implements it). Currently, only Storable.pm does, and
                    this will cause it to use nfreeze instead of freeze.
                    (default 1)

 -index_subfeatures Whether or not to make subfeatures searchable
                    (default true)

 -cache             Activate LRU caching feature -- size of cache

 -compress          Compresses features before storing them in database
                    using Compress::Zlib
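
To make the proposed -network_order switch concrete, here is a minimal sketch of choosing between Storable's nfreeze and freeze from a boolean flag. The helper name serialize is hypothetical and nothing here is the Store.pm implementation; only freeze, nfreeze and thaw are real Storable routines.

```perl
use strict;
use warnings;
use Storable qw(freeze nfreeze thaw);

# Hypothetical helper: pick the Storable routine from a boolean flag,
# the way a -network_order option might.
sub serialize {
    my ($data, $network_order) = @_;
    return $network_order ? nfreeze($data) : freeze($data);
}

my $feature = { seqid => 'chr1', start => 100, end => 200 };

# thaw() reads either format, so the reader does not need the flag.
my $copy = thaw(serialize($feature, 1));
print "$copy->{seqid}:$copy->{start}..$copy->{end}\n";
```

Because thaw() accepts both byte orders, only the writing side would need to know about the option.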


Malcolm Cook
Database Applications Manager - Bioinformatics
Stowers Institute for Medical Research - Kansas City, Missouri
  

> -----Original Message-----
> From: Lincoln Stein [mailto:lincoln.stein at gmail.com] 
> Sent: Thursday, October 05, 2006 1:43 PM
> To: Cook, Malcolm
> Cc: lstein at cshl.org; bioperl-l
> Subject: Re: using nfreeze instead of freeze in Bio::SeqFeature::Store
> 
> I think it's fine unless there is a significant performance hit, in
> which case the change should be made into a configuration option. Do
> you know if there is any overhead on doing this?
> 
> Lincoln
> 
> On 10/5/06, Cook, Malcolm <MEC at stowers-institute.org> wrote:
> > Lincoln,
> >
> > I committed a change to Bio::SeqFeature::Store to use 
> nfreeze instead of
> > freeze which should allow SeqFeature objects to survive database
> > freeze/thaw cycles across architectures.
> >
> > I hope I was not presumptuous or in error in doing this....
> >
> > Regards,
> >
> > Malcolm Cook
> > Database Applications Manager - Bioinformatics
> > Stowers Institute for Medical Research - Kansas City, Missouri
> >
> >
> 
> 
> -- 
> Lincoln D. Stein
> Cold Spring Harbor Laboratory
> 1 Bungtown Road
> Cold Spring Harbor, NY 11724
> (516) 367-8380 (voice)
> (516) 367-8389 (fax)
> FOR URGENT MESSAGES & SCHEDULING,
> PLEASE CONTACT MY ASSISTANT,
> SANDRA MICHELSEN, AT michelse at cshl.edu
> 


From lincoln.stein at gmail.com  Thu Oct  5 18:32:40 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:32:40 -0400
Subject: [Bioperl-l] Bio::DB::SeqFeature
In-Reply-To: <45253CE2.1070208@biologie.uni-freiburg.de>
References: <45253CE2.1070208@biologie.uni-freiburg.de>
Message-ID: <6dce9a0b0610051132p7d7fcf84g27578731f9727f3f@mail.gmail.com>

Hi Daniel,

The warnings you are seeing are occurring because
Bio::SeqFeature::Gene::GeneStructure contains a CODE reference. I
think it must be registering a cleanup method via its Bio::Root::Root
ancestor. When Storable serializes the object, it complains that it
can't serialize the CODE reference and instead converts it into the
string "CODE(0xXXXXX)". Then, after you thaw the object,
Bio::Root::Root is complaining that the CODE reference is invalid
because it is a string, not a reference.

Yuck. I think, however, that I can fix this by setting some magic
variables in Storable version 2.05 that will decompile and compile the
CODE references. I will try this and send you a note when the code is
in CVS.
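
For reference, a minimal sketch of that idea, assuming the "magic variables" are Storable's $Storable::Deparse and $Storable::Eval flags. The roundtrip helper and the sample hash are made up for the example; this is not the fix that went into CVS.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Serialize a structure that may hold CODE refs and bring it back.
sub roundtrip {
    my ($obj) = @_;
    no warnings 'once';
    local $Storable::Deparse = 1;   # freeze CODE refs as source text (B::Deparse)
    local $Storable::Eval    = 1;   # eval that source back into CODE refs on thaw
    return thaw(freeze($obj));
}

my $gene = { name => 'gene1', cleanup => sub { return "cleaned up $_[0]" } };
my $copy = roundtrip($gene);
print $copy->{cleanup}->($copy->{name}), "\n";
```

Two caveats apply: the deparsed source does not carry the values of closed-over variables, and evaluating code from stored data is only reasonable when the data is trusted.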

GBrowse does run off Bio::DB::SeqFeature::Store and is noticeably
faster than the original Bio::DB::GFF adaptor. Nothing really changes
except that you set the db_adaptor option to
Bio::DB::SeqFeature::Store. I haven't tried it using
Bio::SeqFeature::Gene::GeneStructure, so no guarantees, but I am
hopeful that it will work.

Lincoln


On 10/5/06, Daniel Lang <daniel.lang at biologie.uni-freiburg.de> wrote:
> Hi,
>
> we are storing Bio::SeqFeature::Gene::GeneStructure objects (with
> multiple transcripts) using Bio::DB::SeqFeature::Store in a mysql db
> (latest bioperl-live checkout).
>
> The Bio::SeqFeature::Gene::GeneStructure's are generated from scratch
> out of a database.
>
> The first observation is that it seems to work (fetched objects behave
> like Bio::SeqFeature::Gene::GeneStructure's) despite the fact that we
> get these warnings:
>
> Can't store item CODE(0x113db10) at lib/Storable.pm (autosplit into
> lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
> Can't store item CODE(0x11786f0) at lib/Storable.pm (autosplit into
> lib/auto/Storable/_freeze.al) line 287, <STDIN> line 1.
>         (in cleanup) Not a CODE reference at
> /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.
> prepare_cached(SELECT f.id,f.object
>   FROM feature as f
>   WHERE (   f.seqid=?
>    AND   f.end>=? AND f.start<=? AND ((f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?)
>          OR (f.tier=? AND f.bin between ? AND ?))
> )
>
> ) statement handle DBI::st=HASH(0x1c317cf0) still Active at
> /home/lang/bioperl/bioperl-live//Bio/DB/SeqFeature/Store/DBI/mysql.pm
> line 1422
>         (in cleanup) Not a CODE reference at
> /home/lang/bioperl/bioperl-live//Bio/Root/Root.pm line 438, <STDIN> line 1.
>
> Is this something serious? Does this mean that the stored object doesn't
> have everything it had before freezing? Or are we using
> Bio::DB::SeqFeature inappropriately?
>
> The other question would be, if we can visualize these stored feature
> objects easily using gbrowse? I didn't find a hint mentioning
> Bio::DB::SeqFeature as being supported by gbrowse on the gmod pages...
> Is it working already? Will it?
>
> Thanks in advance,
> Daniel
>
> --
>
> Daniel Lang
> University of Freiburg, Plant Biotechnology
> Schaenzlestr. 1, D-79104 Freiburg
> fax: +49 761 203 6945
> phone: +49 761 203 6974
> homepage:  http://www.plant-biotech.net/
> e-mail: daniel.lang at biologie.uni-freiburg.de
>
> #################################################
> My software never has bugs.
> It just develops random features.
> #################################################
>
>
>
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From hlapp at gmx.net  Thu Oct  5 20:34:49 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 5 Oct 2006 16:34:49 -0400
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <4525314C.7020205@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
Message-ID: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>


On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote:

> If you think I'm stupid, fine, but I'm probably not the only stupid
> person on the planet.

That's a great suggestion that I hope we can all agree on? I'll  
happily count myself among the stupid ones too so you're not alone,  
and stupid people and even more so those who are lucky enough not to  
be stupid have an obligation to document stuff so that even the  
stupid can understand, no matter how silly the documentation might get.

Is that agreeable without causing yet more progressive hair loss?

Actually - I'm having second thoughts. Isn't it a distinguishing  
feature of stupid people that - among other things - they are stupid  
enough to believe they don't need to read documentation? You admitted  
publicly that you read documentation - are you just faking the stupid?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Oct  5 21:11:06 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:11:06 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
Message-ID: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>


On Oct 5, 2006, at 3:34 PM, Hilmar Lapp wrote:

>
> On Oct 5, 2006, at 12:22 PM, Sendu Bala wrote:
>
>> If you think I'm stupid, fine, but I'm probably not the only stupid
>> person on the planet.
>
> That's a great suggestion that I hope we can all agree on? I'll  
> happily count myself among the stupid ones too so you're not alone,  
> and stupid people and even more so those who are lucky enough not  
> to be stupid have an obligation to document stuff so that even the  
> stupid can understand, no matter how silly the documentation might  
> get.
>
> Is that agreeable without causing yet more progressive hair loss?
>
> Actually - I'm having second thoughts. Isn't it a distinguishing  
> feature of stupid people that - among other things - they are  
> stupid enough to believe they don't need to read documentation? You  
> admitted publicly that you read documentation - are you just faking  
> the stupid?
>
> 	-hilmar

If lack of good documentation == stupid, I know of a few other  
modules in trouble besides mine.  Based on that we're in for a whole  
lot of stupid!  And I feel stupid for my earlier remarks, Sendu, so  
apologies.

And Hilmar, you're too late on the hair loss, at least on my end.

I have corrected the EUtilities POD to reflect that all text input  
needs to be raw as URI encoding is done in the module, which should  
work (I think).  I plan on committing it tonight.  It also indicates  
that EUtilities search queries need to be made as if they are regular  
Entrez queries.  Would that be sufficient?
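
To show why the input has to be raw, here is a small sketch of what happens when an already-encoded term is encoded again. The encode_term helper is a hypothetical stand-in for the encoding step inside the module, not the module's own code, and it only handles ASCII input.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the module's encoding step: percent-encode
# everything except the RFC 3986 unreserved characters.
sub encode_term {
    my ($term) = @_;
    $term =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $term;
}

my $raw = 'hutP AND xyz';
print encode_term($raw), "\n";                # hutP%20AND%20xyz
print encode_term(encode_term($raw)), "\n";   # hutP%2520AND%2520xyz
```

The second line is the double-encoded form a server would see if the caller pre-encoded the query.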

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From pmiguel at purdue.edu  Thu Oct  5 20:42:00 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Thu, 05 Oct 2006 16:42:00 -0400
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
Message-ID: <45256E18.3080103@purdue.edu>

David Messina wrote:
> I'm pleased to announce a revised version of the BioPerl Deobfuscator  
> is now available. Many thanks to Mauricio Cuadra for updating  
> bioperl.org's installation:
>
> http://bioperl.org/cgi-bin/deob_interface.cgi
>
> I've incorporated many of the suggestions you all sent in after the  
> first release, and many of the modules that had non-standard  
> documentation have been updated in the meantime, too, so hopefully  
> you'll find it much improved. There are still some issues with a few  
> modules; please report any problems you see. Also, it's now indexing  
> bioperl-live instead of 1.4, which should make it a little more  
> useful, too. A complete list of changes is below.
>
> I welcome your bug reports and suggestions for improvements, via  
> email, this list, Bugzilla, or the Wiki page.
>
>
> Thanks,
> Dave
>
>   
Here are some comments:
Would be good to have the column headings for the methods table in the 
fixed part of the page, rather than the scroll box. That way you could 
always see the column headings from anywhere in the list.

Second, I've noticed that there are a fair number of methods that have 
"not documented" for "Returns" and "Usage". But in every case I've 
checked both of these were documented. For example, consider methods for 
Bio::Seq::SeqWithQuality. The method "accession_number" is listed as 
"not documented". But if you click on the Bio::Seq::SeqWithQuality link 
to the documentation, usage is defined as: "$unique_biological_key = 
$obj->accession_number;" and returns is defined as "A string".

Finally, it would be good to have the version of bioperl being 
deobfuscated on the deob_interface.cgi page. Just as a quick 
sanity-checking measure. After poking around a bit I found in the wiki 
that bioperl-live is being indexed. But I can tell it is just the sort 
of thing I'm going to forget and look for every time I come back to the 
page after a few months...

Overall very nice, though. Just what is needed when I'm trying to 
remember "which was the method that returns subseq string and which one 
returns an object?"


Phillip SanMiguel
Purdue University


From bix at sendu.me.uk  Thu Oct  5 21:24:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 05 Oct 2006 22:24:34 +0100
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
	<2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
Message-ID: <45257812.5050008@sendu.me.uk>

Chris Fields wrote:
> 
> I have corrected the EUtilities POD to reflect that all text input needs 
> to be raw as URI encoding is done in the module, which should work (I 
> think).  I plan on committing it tonight.  It also indicates that 
> EUtilities search queries need to be made as if they are regular Entrez 
> queries.  Would that be sufficient?

You may not even need to mention anything about URI encoding, which 
might frighten some people. Something as simple as:

=head1 SYNOPSIS

use Bio::DB::EUtilities;

   my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                          -db         => 'pubmed',
                                          -term       => 'hutP AND xyz',
...

and/or some POD for the new() method:

=head2 new

  Title   : new
...
  Args    : -eutil => ...
            -db    => ...
            -term  => string, an entrez-style query

=cut

would get the point across, I think.

BTW, can the term string be supplied anywhere else other than new()? It 
doesn't matter at all if it can't, I'm just idly wondering if I missed 
anything.


From dmessina at wustl.edu  Thu Oct  5 21:42:49 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 5 Oct 2006 16:42:49 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45256E18.3080103@purdue.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
	<45256E18.3080103@purdue.edu>
Message-ID: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu>

Thanks so much, Phillip, for taking the time to check out the new  
version and send your comments. I really appreciate it! I've added  
them to the wiki page so I can track them.

Best,
Dave


From cjfields at uiuc.edu  Thu Oct  5 21:50:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:50:11 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45257812.5050008@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
	<2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
	<45257812.5050008@sendu.me.uk>
Message-ID: <A0B37F41-7C33-49F6-A039-A35AB5696947@uiuc.edu>

Sendu,

I have the parameters all set up as get/sets at this point, but I'm  
open to suggestions on that.  Note in the BEGIN block the heredoc eval 
{} block.  Yes, nasty I know, but I hate AUTOLOAD.  It works as a  
quick way of getting parameter get/sets up-and-running.  I plan on  
making those explicit get/sets as soon as I can then sorting out  
particular ones to the various eutil modules where they are primarily  
used.

Long story short, every parameter is a get/set at this time  
(including term()).  The common ones needed for most EUtilities are  
initialized in the parent EUtilities::_initialize(), and eutil- 
specific parameters are initialized in the individual eutil plugins.   
Each eutil plugin only sets whatever parameters may be needed for  
operation (though you could circumvent that, since all of them are  
inherited via EUtilities).

We could always simplify it to accept simple key-value pairs, but get/ 
sets (at least to me) allow more flexibility as long as you remember  
which parameters are set and to what.
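
For anyone who has not looked at the BEGIN block, here is a minimal sketch of the technique: generating get/set methods from a heredoc passed to a string eval. The package name and the parameter list are hypothetical, not the ones in EUtilities.

```perl
use strict;
use warnings;

package My::EUtilParams;    # hypothetical class name

BEGIN {
    # Hypothetical parameter list; the module's own list differs.
    for my $name (qw(db term retmax)) {
        my $code = <<"END_SUB";
sub $name {
    my \$self = shift;
    \$self->{'_$name'} = shift if \@_;
    return \$self->{'_$name'};
}
END_SUB
        eval $code;
        die "could not build $name(): $@" if $@;
    }
}

sub new { return bless {}, shift }

package main;

my $params = My::EUtilParams->new;
$params->term('hutP AND xyz');
print $params->term, "\n";
```

Unlike AUTOLOAD, the generated methods exist in the symbol table from compile time on, so can() and inheritance behave as usual.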

Chris

On Oct 5, 2006, at 4:24 PM, Sendu Bala wrote:

> Chris Fields wrote:
>> I have corrected the EUtilities POD to reflect that all text input  
>> needs to be raw as URI encoding is done in the module, which  
>> should work (I think).  I plan on committing it tonight.  It also  
>> indicates that EUtilities search queries need to be made as if  
>> they are regular Entrez queries.  Would that be sufficient?
>
> You may not even need to mention anything about URI encoding, which  
> might frighten some people. Something as simple as:
>
> =head1 SYNOPSIS
>
> use Bio::DB::EUtilities;
>
>   my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                          -db         => 'pubmed',
>                                          -term       => 'hutP AND  
> xyz',
> ...
>
> and/or some POD for the new() method:
>
> =head2 new
>
>  Title   : new
> ...
>  Args    : -eutil => ...
>            -db    => ...
>            -term  => string, an entrez-style query
>
> =cut
>
> would get the point across, I think.
>
> BTW, can the term string be supplied anywhere else other than new 
> ()? It doesn't matter at all if it can't, I'm just idly wondering  
> if I missed anything.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Oct  5 21:51:06 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 5 Oct 2006 16:51:06 -0500
Subject: [Bioperl-l] EUtilities term handling
In-Reply-To: <45257812.5050008@sendu.me.uk>
References: <4524B20B.5010703@sendu.me.uk>
	<47566740-0F5F-4EA8-A8BB-B2CC78211460@gmx.net>
	<452511C1.5020709@sendu.me.uk>
	<7BBC8E98-8626-46DF-887D-32075FD7BBBE@uiuc.edu>
	<45251E69.7040507@sendu.me.uk>
	<BDF43562-5342-4BAD-8FD3-8728B282A55E@uiuc.edu>
	<45252524.7030006@sendu.me.uk>
	<202F5DEA-3AFA-4917-8C73-DA932F9B29F0@uiuc.edu>
	<4525314C.7020205@sendu.me.uk>
	<45AF94CC-9AA2-4CBE-BCB8-CFB271393F80@gmx.net>
	<2FEEC82A-8CAD-40A2-90EA-A7F60FE64456@uiuc.edu>
	<45257812.5050008@sendu.me.uk>
Message-ID: <5B2E844F-7B8B-4F69-9005-138826B835FB@uiuc.edu>

> You may not even need to mention anything about URI encoding, which
> might frighten some people. Something as simple as:
>
> =head1 SYNOPSIS
>
> use Bio::DB::EUtilities;
>
>    my $esearch = Bio::DB::EUtilities->new(-eutil      => 'esearch',
>                                           -db         => 'pubmed',
>                                           -term       => 'hutP AND  
> xyz',
> ...
>
> and/or some POD for the new() method:
>
> =head2 new
>
>   Title   : new
> ...
>   Args    : -eutil => ...
>             -db    => ...
>             -term  => string, an entrez-style query
>
> =cut
>
> would get the point across, I think.

Oops, forgot.  I'll add this in and update new() when I can.  Thanks!

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From arareko at campus.iztacala.unam.mx  Thu Oct  5 22:12:49 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Thu, 05 Oct 2006 17:12:49 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45256E18.3080103@purdue.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
	<45256E18.3080103@purdue.edu>
Message-ID: <45258361.8080803@campus.iztacala.unam.mx>

Phillip San Miguel wrote:
> Finally, it would be good to have the version of bioperl being 
> deobfuscated on the deob_interface.cgi page. Just as a quick 
> sanity-checking measure. After poking around a bit I found that 
> bioperl-live is being indexed in the wiki. But, I can tell, it is just 
> the sort of thing I'm going to forget and look for every time come  back 
> to the page after a few months...

Dave,

I think this value can be stored in one of the index files and passed as 
an argument to the deob_index.pl script. What do you think?

Mauricio.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From lincoln.stein at gmail.com  Thu Oct  5 18:42:41 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 5 Oct 2006 14:42:41 -0400
Subject: [Bioperl-l] using nfreeze instead of freeze in
	Bio::SeqFeature::Store
In-Reply-To: <CED81D34E37D5043A1211565277A51E5065E9879@exchkc02.stowers-institute.org>
References: <CED81D34E37D5043A1211565277A51E5065E9879@exchkc02.stowers-institute.org>
Message-ID: <6dce9a0b0610051142h56479843ofc5429d959cb6e3@mail.gmail.com>

I think it's fine unless there is a significant performance hit, in
which case the change should be made into a configuration option. Do
you know if there is any overhead on doing this?

Lincoln

On 10/5/06, Cook, Malcolm <MEC at stowers-institute.org> wrote:
> Lincoln,
>
> I committed a change to Bio::SeqFeature::Store to use nfreeze instead of
> freeze which should allow SeqFeature objects to survive database
> freeze/thaw cycles across architectures.
>
> I hope I was not presumptuous or in error in doing this....
>
> Regards,
>
> Malcolm Cook
> Database Applications Manager - Bioinformatics
> Stowers Institute for Medical Research - Kansas City, Missouri
>
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From torsten.seemann at infotech.monash.edu.au  Fri Oct  6 05:26:10 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Fri, 06 Oct 2006 15:26:10 +1000
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
References: <452344D4.8070908@infotech.monash.edu.au>
	<22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
Message-ID: <4525E8F2.1000704@infotech.monash.edu.au>

Hilmar,

> I don't think there's a need to deprecate - if the methods just plain  
> delegate to whatever File:: module is appropriate their  
> implementation (supposedly) will become very simple and hence won't  
> pose a maintenance burden anymore.

>> I have an uncommitted simplified version of Bio::Root::IO which does
>> this, and "all tests pass". The functions currently (silently)  
>> dispatch
>> directly to their native counterparts.
>>
>> The only tricky function is tempfile() which is *mostly* like
>> File::Temp::tempfile(), but does some voodoo of converting
>> (TEMPLATE=>'xxx') to the non-hash first parameter of the File::  
>> version,
>> so I'm hesitant to commit. It may do other magic - Hilmar?
> 
> Not that I would know of. If the tests pass (without having to change  
> them!) I'd give it a try.

Tempfile.t had two tests that failed. It seems that Bio::Root::IO had 
some magic whereby it would keep a list of all tempfilenames created 
with UNLINK != 0 and when the Bio::Root::IO object was destroyed (eg. 
undef $obj) it would MANUALLY unlink each of them. This would occur 
before File::Temp got to unlink them. Not sure why it was written like 
this (as File::Temp will delete them at the end of the script anyway) 
but maybe it was legacy for when File::Temp::tempfile WASN'T available.
Anyway, I've kept backward compatibility there, although I think 
eventually it should be removed and Tempfile.t adjusted.
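
For the TEMPLATE conversion mentioned earlier, here is a minimal sketch of a wrapper that takes TEMPLATE as a named option and passes it to File::Temp::tempfile() as the positional first argument. The name my_tempfile is hypothetical and this is not the Bio::Root::IO code; it leaves out the manual unlink bookkeeping described above.

```perl
use strict;
use warnings;
use File::Temp ();

# Hypothetical wrapper: accept TEMPLATE as a named option and hand it to
# File::Temp::tempfile() as its positional first argument.
sub my_tempfile {
    my (%opts) = @_;
    my $template = delete $opts{TEMPLATE};
    my @args = defined $template ? ($template, %opts) : (%opts);
    return File::Temp::tempfile(@args);
}

my ($fh, $path) = my_tempfile(TEMPLATE => 'demoXXXXX', TMPDIR => 1, UNLINK => 1);
print {$fh} "scratch data\n";
close $fh;
print "wrote $path\n";
```

With UNLINK set, File::Temp removes the file itself when the program exits.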

Although all tests pass with my new trim Bio/Root/IO.pm I am still 
concerned about committing as the assumption is that the BioPerl test 
suite is good enough to handle such a change to an important module, but 
the reality may be different :-)

Let me know if you think I should commit anyway,

Your advice is appreciated.

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From dmessina at wustl.edu  Fri Oct  6 05:25:56 2006
From: dmessina at wustl.edu (David Messina)
Date: Fri, 6 Oct 2006 00:25:56 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <45258361.8080803@campus.iztacala.unam.mx>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>
	<45256E18.3080103@purdue.edu>
	<45258361.8080803@campus.iztacala.unam.mx>
Message-ID: <CC2947DA-2F65-4318-B85A-99A8DC7835B0@wustl.edu>


On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote:
> I think this value can be stored in one of the index files and  
> passed as an argument to the deob_index.pl script. What do you think?

Yep, I think that works nicely. I added this feature and committed it  
to CVS. Here's what the new header looks like if you do deob_index.pl  
-s "bioperl-live":

[image: deob_header.jpg, attached below]
Thanks for the suggestions, guys.

Dave

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061006/1c5819f9/attachment-0004.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deob_header.jpg
Type: image/jpeg
Size: 25739 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061006/1c5819f9/attachment-0004.jpg>

From deep_ans at yahoo.com  Fri Oct  6 13:22:49 2006
From: deep_ans at yahoo.com (deepak shingan)
Date: Fri, 6 Oct 2006 06:22:49 -0700 (PDT)
Subject: [Bioperl-l] Sort blast file result according to evalues
Message-ID: <20061006132249.49450.qmail@web51711.mail.yahoo.com>

Hi ,
  Is there any way to parse a BLAST file and sort the hits by evalue? I want the output sorted according to hit evalue. I am using SearchIO and have already tried sorting the hits according to bits and gaps, but I am not able to sort the hits by evalue.
  The difficulty is that evalues are mainly associated with HSPs, and each hit may have multiple HSPs.
   
  waiting for help.
   
  Thanks,
  Dun Dansi
   
   


From hlapp at gmx.net  Fri Oct  6 14:03:04 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 6 Oct 2006 10:03:04 -0400
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <4525E8F2.1000704@infotech.monash.edu.au>
References: <452344D4.8070908@infotech.monash.edu.au>
	<22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
	<4525E8F2.1000704@infotech.monash.edu.au>
Message-ID: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net>

This is a 1.5, i.e. developers release that's in the works, and also  
you'd be doing this on the main trunk. If you get the tests to pass  
there's no reason to hold back.

You may be right and in reality it has repercussions somewhere, but  
those will be the opportunities to improve our test suite.

	-hilmar

On Oct 6, 2006, at 1:26 AM, Torsten Seemann wrote:

> Although all tests pass with my new trim Bio/Root/IO.pm I am still  
> concerned about committing as the assumption is that the BioPerl  
> test suite is good enough to handle such a change to an important  
> module, but the reality may be different :-)
>
> Let me know if you think I should commit anyway,
>
> Your advice is appreciated.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Oct  6 14:58:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 6 Oct 2006 09:58:09 -0500
Subject: [Bioperl-l] Sort blast file result according to evalues
In-Reply-To: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
References: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
Message-ID: <F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>

The evalue for the hit is retrieved by the BlastHit::significance()  
method, if I remember correctly.  So if $hit is a  
Bio::Search::Hit::BlastHit object, you use $hit->significance.  If  
you want individual HSP evalues, you would use $hsp->evalue for the  
individual HSP objects.

The output is normally sorted by the order they appear in the  
alignments and table, which is typically by increasing evalue or  
decreasing bits (score).  So they are already sorted.  If you wanted  
to run a sort yourself you could use a sort block using
'{$a->significance() <=> $b->significance()} @hits', but as pointed out
on the wiki it may be safer to run a Schwartzian transform instead:

http://www.bioperl.org/wiki/Bioperl_Best_Practices#Sorting
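
A minimal sketch of the Schwartzian transform mentioned above, assuming only that each element of the list has a significance() method (as BlastHit objects do). The sub name sort_by_significance is made up for illustration and is not part of BioPerl. The transform calls significance() once per hit instead of once per comparison.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): sort objects by the value
# of their significance() method, lowest (most significant) first.
sub sort_by_significance {
    my @hits = @_;
    return map  { $_->[1] }                      # 3. unwrap the original object
           sort { $a->[0] <=> $b->[0] }          # 2. compare the cached values
           map  { [ $_->significance, $_ ] }     # 1. cache one value per hit
           @hits;
}
```

In a real script the list would presumably come from the result object, e.g. my @sorted = sort_by_significance($result->hits); that call is an assumption about your own code, not something shown in this thread.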

Chris

On Oct 6, 2006, at 8:22 AM, deepak shingan wrote:

> Hi ,
>   Is  there any way to parse the blast file according to evalue for  
> each hit. I want the output sorted according to hit evalue. I am  
> using SearchIO algorithm and already tried sorting the hits  
> according to bits, gaps, but I am not able to sort the hits by evalue.
>   As evalues are mainly associated with hsp and each hit may have  
> multiple hsps.
>
>   waiting for help.
>
>   Thanks,
>   Dun Dansi
>
>
>
>
>  		
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Oct  6 15:03:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 6 Oct 2006 10:03:45 -0500
Subject: [Bioperl-l] Clean-up of Bio::Root::IO
In-Reply-To: <074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net>
References: <452344D4.8070908@infotech.monash.edu.au>
	<22F22AA4-1E44-42A2-BB00-C5BF83941709@gmx.net>
	<4525E8F2.1000704@infotech.monash.edu.au>
	<074183F9-004A-4693-A7BC-D5E5524FCC92@gmx.net>
Message-ID: <265AD609-F74E-4545-B3DD-FF94290BE0B4@uiuc.edu>

On Oct 6, 2006, at 9:03 AM, Hilmar Lapp wrote:

> This is a 1.5, i.e. developers release that's in the works, and also
> you'd be doing this on the main trunk. If you get the tests to pass
> there's no reason to hold back.
>
> You may be right and in reality it has repercussions somewhere, but
> those will be the opportunities to improve our test suite.
>
> 	-hilmar

Agreed, though I think Sendu only wants bug fixes for 1.5.2.  You  
could always commit to CVS HEAD and it could be in 1.5.3.

Let me rethink that.  There were some subtle tempfile/tempdir issues  
that were popping up on WinXP where some tempfiles were not being  
deleted b/c of permissions issues; I had planned on adding that to  
Bugzilla today or tomorrow.  Maybe changing to File::Temp would fix  
that, so in essence it would be a bug fix!

I'll go ahead and post the bug.

Chris

>> Although all tests pass with my new trim Bio/Root/IO.pm I am still
>> concerned about committing as the assumption is that the BioPerl
>> test suite is good enough to handle such a change to an important
>> module, but the reality may be different :-)
>>
>> Let me know if you think I should commit anyway,
>>
>> Your advice is appreciated.
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From pmiguel at purdue.edu  Fri Oct  6 15:06:56 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Fri, 06 Oct 2006 11:06:56 -0400
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>	<45256E18.3080103@purdue.edu>
	<5861A24A-E0F9-4B06-9AD1-68857E6561F9@wustl.edu>
Message-ID: <45267110.7030905@purdue.edu>

David Messina wrote:
> Thanks so much, Phillip, for taking the time to check out the new  
> version and send your comments. I really appreciate it! I've added  
> them to the wiki page so I can track them.
>
> Best,
> Dave
>   
Dave,
    No problem.
    I've just added a "keyword" to search BioPerl Deobfuscator to my 
Firefox browser. That way I can just type "deob qual" in my URL bar in 
firefox and the browser jumps directly to BioPerl Deobfuscator (like a 
bookmark) but it pre-submits the search item "qual".
    I heard about the Firefox "keywords" in a TWiT/FLOSS episode on 
mozilla. You just go to any search page and right-click in the search 
box of interest and one of the choices is "Add a Keyword for this 
Search". Then you just have to fill out "Name" and "Keyword" fields and 
drop the keyword into whatever folder you like. The "Keyword" then 
becomes the word to invoke that search with parameters that follow it 
when it is typed into the URL bar.
Phillip


From arareko at campus.iztacala.unam.mx  Fri Oct  6 15:18:02 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Fri, 06 Oct 2006 10:18:02 -0500
Subject: [Bioperl-l] BioPerl Deobfuscator updated
In-Reply-To: <CC2947DA-2F65-4318-B85A-99A8DC7835B0@wustl.edu>
References: <76EB791C-0A8A-4B86-A968-F436C6DFEE26@wustl.edu>	<45256E18.3080103@purdue.edu>	<45258361.8080803@campus.iztacala.unam.mx>
	<CC2947DA-2F65-4318-B85A-99A8DC7835B0@wustl.edu>
Message-ID: <452673AA.7070305@campus.iztacala.unam.mx>

Looks great! I'll update it during the weekend.

Mauricio.

David Messina wrote:
> 
> On Oct 5, 2006, at 5:12 PM, Mauricio Herrera Cuadra wrote:
>> I think this value can be stored in one of the index files and passed 
>> as an argument to the deob_index.pl script. What do you think?
> 
> Yep, I think that works nicely. I added this feature and committed it to 
> CVS. Here's what the new header looks like if you do deob_index.pl -s 
> "bioperl-live":
> 
> 
> Thanks for the suggestions, guys.
> 
> Dave
> 
> 
> ------------------------------------------------------------------------
> 

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From bix at sendu.me.uk  Fri Oct  6 15:27:14 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 06 Oct 2006 16:27:14 +0100
Subject: [Bioperl-l] Sort blast file result according to evalues
In-Reply-To: <F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>
References: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
	<F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>
Message-ID: <452675D2.9090803@sendu.me.uk>

Chris Fields wrote:
> The evalue for the hit is retrieved by the BlastHit::significance()  
> method, if I remember correctly.  So if $hit is a  
> Bio::Search::Hit::BlastHit object, you use $hit->significance.  If  
> you want individual HSP evalues, you would use $hsp->evalue for the  
> individual HSP objects.
> 
> The output is normally sorted by the order they appear in the  
> alignments and table, which is typically by increasing evalue or  
> decreasing bits (score).  So they are already sorted.

Concur.


> If you wanted to run a sort yourself you could use a sort block using
> '{$a->significance() <=> $b->significance()} @hits'

Actually, it is best to use the sort_hits() method of the result object 
prior to asking for any hits. (As this allows for potential optimization 
in the parser.)

->significance is still the thing you need to sort on though.


From cjfields at uiuc.edu  Fri Oct  6 15:52:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 6 Oct 2006 10:52:57 -0500
Subject: [Bioperl-l] Sort blast file result according to evalues
In-Reply-To: <452675D2.9090803@sendu.me.uk>
References: <20061006132249.49450.qmail@web51711.mail.yahoo.com>
	<F883E976-804C-4BDC-A19B-D9EFFFB3720E@uiuc.edu>
	<452675D2.9090803@sendu.me.uk>
Message-ID: <31A6FC3A-8BEB-42B8-B51D-66E659EF7495@uiuc.edu>


On Oct 6, 2006, at 10:27 AM, Sendu Bala wrote:

>> If you wanted to run a sort yourself you could use a sort block using
>> '{$a->significance() <=> $b->significance()} @hits'
>
> Actually, it is best to use the sort_hits() method of the result  
> object
> prior to asking for any hits. (As this allows for potential  
> optimization
> in the parser.)

Ah, forgot about that one!

Chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Fri Oct  6 18:36:49 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 6 Oct 2006 11:36:49 -0700
Subject: [Bioperl-l] tempfile cleanup
In-Reply-To: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu>
References: <437E2D49-F5F6-424A-BC61-EC4BFEB9F3C4@berkeley.edu>
Message-ID: <0FCEC6B2-E190-4800-AAB1-89559C552FA6@bioperl.org>

I think the magic trickery in there for cleanup exists because File::Temp  
only cleans up tempfiles when Perl exits, not when the Root::IO object  
goes out of scope -- so this can be a problem for people running CGI  
scripts that stay resident in memory and never have their tempfiles  
cleaned up.  Managing the list ourselves allows us to call _cleanup  
periodically (perhaps before the start of every Blast run) to ensure  
that tempfiles are removed.  Perhaps newer File::Temp versions can  
solve this better now, but I believe that was the behavior we were  
trying to deal with by having the Root::IO object manage the list of  
to-be-deleted files.

This is some hackery that also had to do with not expecting  
File::Temp to be installed I believe.
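
A rough sketch of the list-managing idea described above, using only the core File::Temp module. The TempTracker class and its method names are hypothetical and are not the actual Bio::Root::IO code. It records every tempfile it hands out so they can be removed on demand, or when the object is destroyed, rather than only when Perl exits.

```perl
use strict;
use warnings;
use File::Temp ();

{
    package TempTracker;    # hypothetical name, for illustration only

    sub new {
        my $class = shift;
        return bless { files => [] }, $class;
    }

    # Create a tempfile and remember its path for later cleanup.
    sub tempfile {
        my $self = shift;
        my ($fh, $path) = File::Temp::tempfile(UNLINK => 0);
        push @{ $self->{files} }, $path;
        return ($fh, $path);
    }

    # Remove every tracked file that still exists; returns how many were
    # deleted. Callers should close their filehandles first (on some
    # platforms an open file cannot be unlinked).
    sub cleanup {
        my $self    = shift;
        my $removed = 0;
        for my $path (@{ $self->{files} }) {
            $removed += unlink($path) if -e $path;
        }
        $self->{files} = [];
        return $removed;
    }

    # Also clean up when the tracker itself goes out of scope.
    sub DESTROY {
        my $self = shift;
        $self->cleanup;
    }
}
```

A long-running process could then call cleanup() between jobs, which is roughly the role _cleanup plays in the description above.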

-jason


From torsten.seemann at infotech.monash.edu.au  Mon Oct  9 04:52:29 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Mon, 09 Oct 2006 14:52:29 +1000
Subject: [Bioperl-l] Multiple packages in the one .pm file
Message-ID: <4529D58D.1080004@infotech.monash.edu.au>

Hi all,

The following modules have more than one "package xxxx;" declaration in 
them. For small, internal classes I guess this is fine, but for others,
they should be split up into the filesystem - otherwise they are 
troublesome to locate and the online documentation doesn't list them!

eg.
bioperl-run/Bio/Tools/Run/Analysis/Job.pm
is in
bioperl-run/Bio/Tools/Run/Analysis.pm

Here are the culprits:

% for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio | 
sed 's/:.*$//' | sort | uniq -d ; done

bioperl-live/Bio/AnalysisI.pm
bioperl-live/Bio/DB/Fasta.pm
bioperl-live/Bio/DB/GFF.pm
bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
bioperl-live/Bio/SeqIO/interpro.pm

bioperl-run/Bio/Tools/Run/Analysis.pm
bioperl-run/Bio/Tools/Run/Analysis/soap.pm
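
For anyone without grep/sed/uniq to hand, the same check can be sketched in Perl with the core File::Find module. The sub name multi_package_files is made up for illustration. Unlike the shell one-liner above, it only looks at files ending in .pm.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return the .pm files under the given directories
# that contain more than one line starting with "package ".
sub multi_package_files {
    my @roots = @_;
    my %count;
    find(
        {
            no_chdir => 1,    # keep $_ as the full path
            wanted   => sub {
                return unless -f $_ && /\.pm$/;
                my $file = $_;
                open my $in, '<', $file or return;
                while (my $line = <$in>) {
                    $count{$file}++ if $line =~ /^package\s/;
                }
                close $in;
            },
        },
        @roots
    );
    my @multi = grep { $count{$_} > 1 } keys %count;
    return sort @multi;
}
```

Usage would be along the lines of print "$_\n" for multi_package_files('bioperl-live/Bio', 'bioperl-run/Bio'); with the paths adjusted to your checkout.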

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From pmiguel at purdue.edu  Mon Oct  9 19:57:12 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Mon, 09 Oct 2006 15:57:12 -0400
Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC?
Message-ID: <452AA998.5010104@purdue.edu>

I found a bug in Bio::SeqIO::phd and am wondering if the fix will 
propagate into the next release candidate?

The bug is here:

http://bugzilla.open-bio.org/show_bug.cgi?id=2120

I also created a patch that fixes it (on my machine, anyway).  It is a 
fairly minor change, so it seems like it would be worth propagating it 
into the next release candidate.

-- 
Phillip SanMiguel


From bix at sendu.me.uk  Mon Oct  9 20:57:28 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 09 Oct 2006 21:57:28 +0100
Subject: [Bioperl-l] Will 1.5.2 bugfixes propagate into next RC?
In-Reply-To: <452AA998.5010104@purdue.edu>
References: <452AA998.5010104@purdue.edu>
Message-ID: <452AB7B8.4040404@sendu.me.uk>

Phillip San Miguel wrote:
> I found a bug in Bio::SeqIO::phd and am wondering if the fix will 
> propagate into the next release candidate?
> 
> The bug is here:
> 
> http://bugzilla.open-bio.org/show_bug.cgi?id=2120
> 
> I also created a patch that fixes it (on my machine, anyway).  It is a 
> fairly minor change, so it seems like it would be worth propagating it 
> into the next release candidate.

If it gets committed to HEAD before I make the next candidate, then yes.
I'll do that if no one beats me to it (and if someone does, please add a 
new test for this).

BTW Phillip, thank you for the bug report but in future use the 
attachment capabilities for files, please don't paste them into the 
comments box.


From bix at sendu.me.uk  Mon Oct  9 21:01:56 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 09 Oct 2006 22:01:56 +0100
Subject: [Bioperl-l] Analysis soap problem
Message-ID: <452AB8C4.1010704@sendu.me.uk>

I thought I'd 'advertise' this bug on the list so more people see it:
http://bugzilla.open-bio.org/show_bug.cgi?id=2117

I don't want to make the next 1.5.2 release candidate until it's fixed. 
Does anyone have any idea about it? Even if you can't fix it, just 
explaining what's (supposed) to be going on would help a lot.

Thank you,
Sendu.


From Kevin.M.Brown at asu.edu  Mon Oct  9 22:40:54 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 9 Oct 2006 15:40:54 -0700
Subject: [Bioperl-l] Analysis soap problem
Message-ID: <1A4207F8295607498283FE9E93B775B40219690B@EX02.asurite.ad.asu.edu>

If I had to guess from looking at the snippet provided, the variable
$seq holds no data, so when you try to set up the regex /^$seq$/ you end
up with /^$/ (blank line) and the warning.
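
If that guess is right, one defensive pattern is to check the variable before interpolating it. This is only a sketch: matches_sequence is a made-up helper, not code from the test script or the module.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if $text consists of exactly $seq
# (a trailing newline is tolerated, as with any /...$/ match).
# An undefined or empty $seq never matches, instead of silently
# turning the pattern into /^$/.
sub matches_sequence {
    my ($text, $seq) = @_;
    return 0 unless defined $text && defined $seq && length $seq;
    return $text =~ /^\Q$seq\E$/ ? 1 : 0;    # \Q..\E quotes metacharacters
}
```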

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Monday, October 09, 2006 2:02 PM
> To: bioperl-l List
> Subject: [Bioperl-l] Analysis soap problem
> 
> I thought I'd 'advertise' this bug on the list so more people see it:
> http://bugzilla.open-bio.org/show_bug.cgi?id=2117
> 
> I don't want to make the next 1.5.2 release candidate until 
> its fixed. 
> Does anyone have any idea about it? Even if you can't fix it, just 
> explaining what's (supposed) to be going on would help a lot.
> 
> Thank you,
> Sendu.
> 


From cjfields at uiuc.edu  Tue Oct 10 02:34:23 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 9 Oct 2006 21:34:23 -0500
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <452AB8C4.1010704@sendu.me.uk>
References: <452AB8C4.1010704@sendu.me.uk>
Message-ID: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>

I have 'fixed' this in CVS.  Note the quotes; it depends on what you  
might consider fixed.  Multiple calls to results() were returning  
empty hash refs, so no data was being returned.   For now, I stored  
the hash reference in a variable then tested each one.  All tests now  
pass, including the 'outseq' one.
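
The workaround amounts to something like the sketch below. OneShotJob is a stand-in written purely to mimic the behaviour described (data on the first call, an empty hash ref afterwards); it is not the real Bio::Tools::Run::Analysis code, and check_results is likewise a made-up name.

```perl
use strict;
use warnings;

{
    package OneShotJob;    # hypothetical stand-in, for illustration only

    sub new {
        my ($class, %data) = @_;
        return bless { data => { %data } }, $class;
    }

    # Returns the stored results once; later calls get an empty hash ref.
    sub results {
        my $self = shift;
        my $out  = $self->{data};
        $self->{data} = {};
        return $out;
    }
}

# Call results() exactly once, keep the reference, and test each
# expected name against that stored copy.
sub check_results {
    my ($job, @names) = @_;
    my $results = $job->results;
    return map { ( $_ => exists $results->{$_} ? 1 : 0 ) } @names;
}
```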

Maybe it's just me, but shouldn't results() either consistently  
return the same information, or contain documentation that it doesn't  
do so?  Anyway, I have left the bugzilla report open for now.

Chris

On Oct 9, 2006, at 4:01 PM, Sendu Bala wrote:

> I thought I'd 'advertise' this bug on the list so more people see it:
> http://bugzilla.open-bio.org/show_bug.cgi?id=2117
>
> I don't want to make the next 1.5.2 release candidate until its fixed.
> Does anyone have any idea about it? Even if you can't fix it, just
> explaining what's (supposed) to be going on would help a lot.
>
> Thank you,
> Sendu.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bosborne11 at verizon.net  Tue Oct 10 02:09:45 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 09 Oct 2006 22:09:45 -0400
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au>
Message-ID: <C1507929.AB8F%bosborne11@verizon.net>

Torsten,

Fixed interpro.pm, it could have been written more simply (or more like
other SeqIO modules). Can't really address the others.

Brian O.


On 10/9/06 12:52 AM, "Torsten Seemann"
<torsten.seemann at infotech.monash.edu.au> wrote:

> Hi all,
> 
> The following modules have more than one "package xxxx;" declaration in
> them. For small, internal classes I guess this is fine, but for others,
> they should be split up into the filesystem - otherwise they are
> troublesome to locate and the online documentation doesn't list them!
> 
> eg.
> bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> is in
> bioperl-run/Bio/Tools/Run/Analysis.pm
> 
> Here's the culprits:
> 
> % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio |
> sed 's/:.*$//' | sort | uniq -d ; done
> 
> bioperl-live/Bio/AnalysisI.pm
> bioperl-live/Bio/DB/Fasta.pm
> bioperl-live/Bio/DB/GFF.pm
> bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> bioperl-live/Bio/SeqIO/interpro.pm
> 
> bioperl-run/Bio/Tools/Run/Analysis.pm
> bioperl-run/Bio/Tools/Run/Analysis/soap.pm


From bix at sendu.me.uk  Tue Oct 10 07:03:20 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 08:03:20 +0100
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
References: <452AB8C4.1010704@sendu.me.uk>
	<86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
Message-ID: <452B45B8.8010401@sendu.me.uk>

Chris Fields wrote:
> I have 'fixed' this in CVS.  Note the quotes; it depends on what you  
> might consider fixed.  Multiple calls to results() were returning  
> empty hash refs, so no data was being returned.   For now, I stored  
> the hash reference in a variable then tested each one.  All tests now  
> pass, including the 'outseq' one.
> 
> Maybe it's just me, but shouldn't results() either consistently  
> return the same information, or contain documentation that it doesn't  
> do so?  Anyway, I have left the bugzilla report open for now.

Judging by the tests there seems a clear expectation that multiple calls 
to results() should work, and certainly that makes sense and seems 
natural. So I'd say that results() should be fixed and the test script 
reverted.


From cjfields at uiuc.edu  Tue Oct 10 11:42:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 06:42:33 -0500
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <452B45B8.8010401@sendu.me.uk>
References: <452AB8C4.1010704@sendu.me.uk>
	<86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
	<452B45B8.8010401@sendu.me.uk>
Message-ID: <A161CB8B-3997-4E09-A6E3-4A44B70BE4A0@uiuc.edu>

I agree, though I think Martin Senger should be contacted, at least  
to get his thoughts.  Has anyone tried yet?

Chris

On Oct 10, 2006, at 2:03 AM, Sendu Bala wrote:

> Chris Fields wrote:
>> I have 'fixed' this in CVS.  Note the quotes; it depends on what you
>> might consider fixed.  Multiple calls to results() were returning
>> empty hash refs, so no data was being returned.   For now, I stored
>> the hash reference in a variable then tested each one.  All tests now
>> pass, including the 'outseq' one.
>>
>> Maybe it's just me, but shouldn't results() either consistently
>> return the same information, or contain documentation that it doesn't
>> do so?  Anyway, I have left the bugzilla report open for now.
>
> Judging by the tests there seems a clear expectation that multiple  
> calls
> to results() should work, and certainly that makes sense and seems
> natural. So I'd say that results() should be fixed and the test script
> reverted.
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Oct 10 12:14:31 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 13:14:31 +0100
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <A161CB8B-3997-4E09-A6E3-4A44B70BE4A0@uiuc.edu>
References: <452AB8C4.1010704@sendu.me.uk>
	<86DEE3DC-38D9-4F63-B804-9BC5BA59109B@uiuc.edu>
	<452B45B8.8010401@sendu.me.uk>
	<A161CB8B-3997-4E09-A6E3-4A44B70BE4A0@uiuc.edu>
Message-ID: <452B8EA7.1080800@sendu.me.uk>

Chris Fields wrote:
> I agree, though I think Martin Senger should be contacted, at least to 
> get his thoughts.  Has anyone tried yet?

He's CCd on the bug report, but I haven't tried directly, no. Do you 
want to tackle this (contacting him and/or fixing the bug)?

Cheers,
Sendu.


From cjfields at uiuc.edu  Tue Oct 10 13:20:03 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 08:20:03 -0500
Subject: [Bioperl-l] Analysis soap problem
In-Reply-To: <452B8EA7.1080800@sendu.me.uk>
Message-ID: <001801c6ec6e$cc016900$15327e82@pyrimidine>

I'll try giving it a closer look, just didn't have much time yesterday.
I'll also try contacting Martin.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Tuesday, October 10, 2006 7:15 AM
> To: bioperl-l
> Subject: Re: [Bioperl-l] Analysis soap problem
> 
> Chris Fields wrote:
> > I agree, though I think Martin Senger should be contacted, at least to
> > get his thoughts.  Has anyone tried yet?
> 
> He's CCd on the bug report, but I haven't tried directly, no. Do you
> want to tackle this (contacting him and/or fixing the bug)?
> 
> Cheers,
> Sendu.


From pmiguel at purdue.edu  Tue Oct 10 14:26:35 2006
From: pmiguel at purdue.edu (Phillip San Miguel)
Date: Tue, 10 Oct 2006 10:26:35 -0400
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452AB7B8.4040404@sendu.me.uk>
References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk>
Message-ID: <452BAD9B.5010903@purdue.edu>

Sendu Bala wrote:
>
> BTW Phillip, thank you for the bug report but in future use the 
> attachment capabilities for files, please don't paste them into the 
> comments box.
>   
Sendu,
    Sounds reasonable to me. I should note, however, that when I entered the 
bug, I was looking for some method to attach files. There is none on the 
"Enter Bug: Bioperl" page:

http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl

Also, "bug writing guidelines" makes no mention of it. I vaguely 
remembered there being some method to do it--but given the "bug writing 
guidelines" exhortations to be specific and detailed, I thought I must 
put the information somewhere. So I put them in the only place offered 
(on that page)--"Description:"
    I see that, once submitted, attachments can be added to a bug 
report. Is that normally how it is done? Doesn't each attachment result 
in a separate email to the bioperl guts email list?
    Anyway,  I've just added the files to the bug report as attachments, 
in case someone needs them to construct a test.
   
-- 
Phillip


From bix at sendu.me.uk  Tue Oct 10 15:10:25 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 16:10:25 +0100
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452BAD9B.5010903@purdue.edu>
References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk>
	<452BAD9B.5010903@purdue.edu>
Message-ID: <452BB7E1.5020200@sendu.me.uk>

Phillip San Miguel wrote:
> Sendu Bala wrote:
>> BTW Phillip, thank you for the bug report but in future use the 
>> attachment capabilities for files, please don't paste them into the
>>  comments box.
>> 
> Sendu, Sounds reasonable to me. I should note, however; when I
> entered the bug, I was looking for some method to attach files. There
> is none on the "Enter Bug: Bioperl" page:
> 
> http://bugzilla.open-bio.org/enter_bug.cgi?product=Bioperl
> 
> Also, "bug writing guidelines" makes no mention of it. I vaguely 
> remembered there being some method to do it--but given the "bug
> writing guidelines" exhortations to be specific and detailed, I
> thought I must put the information somewhere. So I put them in the
> only place offered (on that page)--"Description:"

I agree that things could be better here. Who looks after bugzilla, and
is this an alterable feature?


> I see that, once submitted, attachments can be added to a bug report.
> Is that normally how it is done?

Yes, AFAIK.


> Doesn't each attachment result in a separate email to the bioperl
> guts email list?

Yes, but that's not a problem. In fact, doing it this way means you
don't email everyone subscribed to guts your big files in plain text,
but instead they get a small email with a link to the download.


> Anyway,  I've just added the files to the bug report as attachments,
>  in case someone needs them to construct a test.

Thank you.


From arareko at campus.iztacala.unam.mx  Tue Oct 10 15:14:00 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Tue, 10 Oct 2006 10:14:00 -0500
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452BAD9B.5010903@purdue.edu>
References: <452AA998.5010104@purdue.edu> <452AB7B8.4040404@sendu.me.uk>
	<452BAD9B.5010903@purdue.edu>
Message-ID: <452BB8B8.40409@campus.iztacala.unam.mx>

Phillip San Miguel wrote:
> I see that, once submitted, attachments can be added to a bug report.
>  Is that normally how it is done?

Yes, it's the normal method: create the bug report, then attach files.

> Doesn't each attachment result in a separate email to the bioperl 
> guts email list?

Adding a file will generate an informative email per bug change 
(attaching the file in this case) but won't send the attachment to the list.

Regards,
Mauricio.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From cjfields at uiuc.edu  Tue Oct 10 15:20:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 10:20:55 -0500
Subject: [Bioperl-l] Bug reports and attachments
In-Reply-To: <452BAD9B.5010903@purdue.edu>
Message-ID: <002801c6ec7f$ae8d85f0$15327e82@pyrimidine>

> Also, "bug writing guidelines" makes no mention of it. I vaguely
> remembered there being some method to do it--but given the "bug writing
> guidelines" exhortations to be specific and detailed, I thought I must
> put the information somewhere. So I put them in the only place offered
> (on that page)--"Description:"
>     I see that, once submitted, attachments can be added to a bug
> report. Is that normally how it is done? Doesn't each attachment result
> in a separate email to the bioperl guts email list?
>     Anyway,  I've just added the files to the bug report as attachments,
> in case someone needs them to construct a test.

Phillip,

Initial bug reports only require the general description, OS used, bioperl
version, etc.  That's quite normal.  Any relevant attachments are added
afterward.  We should probably make that clearer upfront on the wiki page; I
don't know if anyone can make similar changes to bugzilla.

Any bug changes, CVS commits, etc are mailed to bioperl-guts, yes.  That
isn't an issue though; it keeps the developers updated on the various
bugs/commits that are going on and is a pretty common practice.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 10 16:48:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 11:48:22 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <4529D58D.1080004@infotech.monash.edu.au>
References: <4529D58D.1080004@infotech.monash.edu.au>
Message-ID: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>

There are a number of other bioperl-run examples (the  
Bio::Tools::Run::Analysis::soap issue I looked into revealed such).

I agree with both points, 1) that it depends on the size of the  
classes, and 2) from a maintainability standpoint, it can be very  
frustrating when looking for documentation.  Is there really any  
advantage to doing this?

Chris

On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote:

> Hi all,
>
> The following modules have more than one "package xxxx;"  
> declaration in
> them. For small, internal classes I guess this is fine, but for  
> others,
> they should be split up into the filesystem - otherwise they are
> troublesome to locate and the online documentation doesn't list them!
>
> eg.
> bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> is in
> bioperl-run/Bio/Tools/Run/Analysis.pm
>
> Here's the culprits:
>
> % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio |
> sed 's/:.*$//' | sort | uniq -d ; done
>
> bioperl-live/Bio/AnalysisI.pm
> bioperl-live/Bio/DB/Fasta.pm
> bioperl-live/Bio/DB/GFF.pm
> bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> bioperl-live/Bio/SeqIO/interpro.pm
>
> bioperl-run/Bio/Tools/Run/Analysis.pm
> bioperl-run/Bio/Tools/Run/Analysis/soap.pm
>
> -- 
> Dr Torsten Seemann               http://www.vicbioinformatics.com
> Victorian Bioinformatics Consortium, Monash University, Australia
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
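
The shell pipeline quoted above can also be written in Perl. The
following is a minimal sketch, not part of the original thread: it walks
the directories given on the command line and prints every .pm file that
has more than one line beginning with "package ". The sub names are made
up for illustration, and unlike the grep it only looks at .pm files.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Count the lines in one file that start with "package ".
sub count_package_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $count = grep { /^package\s/ } <$fh>;
    close $fh;
    return $count;
}

# Return the sorted list of .pm files under the given directories that
# contain more than one package declaration.
sub multi_package_files {
    my @dirs = grep { -d $_ } @_;
    return () unless @dirs;
    my @found;
    find(
        sub {
            # File::Find changes into each directory, so $_ is the
            # bare file name and can be opened directly.
            return unless -f $_ && /\.pm\z/;
            push @found, $File::Find::name if count_package_lines($_) > 1;
        },
        @dirs
    );
    return sort @found;
}

# Directories come from the command line, for example:
#   perl multi_package.pl bioperl-live/Bio bioperl-run/Bio
print "$_\n" for multi_package_files(@ARGV);
```

Counting per file instead of piping through sort and uniq -d gives the
same list, and the threshold is easy to change if only files with three
or more packages are of interest.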


From lzhtom at hotmail.com  Tue Oct 10 19:42:48 2006
From: lzhtom at hotmail.com (zhihua li)
Date: Tue, 10 Oct 2006 19:42:48 +0000
Subject: [Bioperl-l] No "new" method in Bio::Tool::Run::AnalysisFactor::Pise?
Message-ID: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>

Hi netters.

I've installed Bioperl 1.5.1, both core and run modules.  But when I tried 
to use the Pise module, an error occurred saying that there's no "new" 
method in this package.

My script is:

use strict;
use warnings;
use Bio::Tools::Run::AnalysisFactory::Pise;
my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
my $program=$factory->program('mfold');
$program->seq('my_input_file');
my $job = $program->run();
print STDERR $job->contect('mfold.out');

The error message I got is:

Can't locate object method "new" via package 
"Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load 
"Bio::Tools::Run::AnalysisFactor::Pise"?)

I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm and 
it DOES contain a sub new.

So what's going on? Could anyone give me a hint?

Thanks a lot!


From cjfields at uiuc.edu  Tue Oct 10 20:27:27 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 15:27:27 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
	<6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
Message-ID: <E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>

Makes sense to me.  I think, as long as they're documented, it  
shouldn't be a problem.

I think the main point is that the class methods for these don't show  
up using perldoc (something I ran into with Bio::DB::Fasta's  
inclusion of Bio::PrimarySeq::Fasta), but they do show up when using  
other documentation.  So 'perldoc Bio::DB::Fasta' works, but 'perldoc  
Bio::PrimarySeq::Fasta' doesn't.  So these can be problematic when  
looking for specific methods.

However, I think pod2html handles multiple package declarations in  
one module, and the online Pdoc documentation does as well.  Does the
Deobfuscator?

Chris

On Oct 10, 2006, at 3:11 PM, Lincoln Stein wrote:

> Hi,
>
> These ones are all mine:
>
> > bioperl-live/Bio/DB/Fasta.pm
> > bioperl-live/Bio/DB/GFF.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
>
> In each case, the second modules are teeny tiny ones that implement  
> iterators which are at most two methods long (typically a new() and  
> a next()). I prefer not to split them out because they will just  
> clutter up the file tree with stuff that is already well documented  
> in the "parent ship" modules.
>
> Lincoln
>
>
> On 10/10/06, Chris Fields <cjfields at uiuc.edu> wrote: There are a  
> number of other bioperl-run examples (the
> Bio::Tools::Run::Analysis::soap issue I looked into revealed such).
>
> I agree with both points, 1) that it depends on the size of the
> classes, and 2) from a maintainability standpoint, it can be very
> frustrating when looking for documentation.  Is there really any
> advantage to doing this?
>
> Chris
>
> On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote:
>
> > Hi all,
> >
> > The following modules have more than one "package xxxx;"
> > declaration in
> > them. For small, internal classes I guess this is fine, but for
> > others,
> > they should be split up into the filesystem - otherwise they are
> > troublesome to locate and the online documentation doesn't list  
> them!
> >
> > eg.
> > bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> > is in
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> >
> > Here's the culprits:
> >
> > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/ 
> Bio |
> > sed 's/:.*$//' | sort | uniq -d ; done
> >
> > bioperl-live/Bio/AnalysisI.pm
> > bioperl-live/Bio/DB/Fasta.pm
> > bioperl-live/Bio/DB/GFF.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> > bioperl-live/Bio/SeqIO/interpro.pm
> >
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> > bioperl-run/Bio/Tools/Run/Analysis/soap.pm
> >
> > --
> > Dr Torsten Seemann               http://www.vicbioinformatics.com
> > Victorian Bioinformatics Consortium, Monash University, Australia
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> -- 
> Lincoln D. Stein
> Cold Spring Harbor Laboratory
> 1 Bungtown Road
> Cold Spring Harbor, NY 11724
> (516) 367-8380 (voice)
> (516) 367-8389 (fax)
> FOR URGENT MESSAGES & SCHEDULING,
> PLEASE CONTACT MY ASSISTANT,
> SANDRA MICHELSEN, AT michelse at cshl.edu

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
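
The perldoc behaviour described above follows from how a module name is
turned into a file name. The sketch below is an illustration only, not
code from BioPerl: it maps a package name to a relative path and looks
for that file along @INC, which is roughly what a lookup by module name
does (perldoc also accepts .pod files, which this sketch ignores). The
sub names are hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Turn Some::Package::Name into the path components Some/Package/Name.pm.
sub package_to_parts {
    my ($package) = @_;
    my @parts = split /::/, $package;
    $parts[-1] .= '.pm';
    return @parts;
}

# Return the first file along @INC that matches the package name,
# or undef if no file of that name exists.
sub find_module_file {
    my ($package) = @_;
    my @parts = package_to_parts($package);
    for my $dir (grep { !ref $_ } @INC) {
        my $file = File::Spec->catfile($dir, @parts);
        return $file if -f $file;
    }
    return;
}

# A package declared inside another module's file has no file of its
# own, so a lookup by name comes back empty even though the class exists
# once the parent module is loaded.
for my $package ('Bio::DB::Fasta', 'Bio::PrimarySeq::Fasta') {
    my $file = find_module_file($package);
    printf "%-24s %s\n", $package,
        defined $file ? $file : '(no file of that name in @INC)';
}
```

What the loop prints depends on whether BioPerl is installed on the
machine running it, so no expected output is shown.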


From cjfields at uiuc.edu  Tue Oct 10 20:30:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 15:30:16 -0500
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
In-Reply-To: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <870B7500-AA83-42D7-965B-865B91AA8E7F@uiuc.edu>


On Oct 10, 2006, at 2:42 PM, zhihua li wrote:

> Hi netters.
>
> I've installed Bioperl 1.5.1, both core and run modules.  But when  
> I tried to use the Pise module, an error occured saying that  
> there's no "new" method in this package.
>
> My script is:
>
> use strict;
> use warnings;
> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
> my $program=$factory->program('mfold');
> $program->seq('my_input_file');
> my $job = $program->run();
> print STDERR $job->contect('mfold.out');
>
> The error message I got is:
>
> Can't locate object method "new" via package  
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load  
> "Bio::Tools::Run::AnalysisFactor::Pise"?)
>
> I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/ 
> Pise.pm and it DOES contain a sub new.
>
> So what's going on? Anyone could give me a hint?
>
> Thanks a lot!

Well, according to your error output you have AnalysisFactory  
misspelled ('AnalysisFactor'), which should tell you what the problem  
is.  Look for the same thing in your script.

Chris


>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Oct 10 20:43:06 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 10 Oct 2006 21:43:06 +0100
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
In-Reply-To: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <452C05DA.5050803@sendu.me.uk>

zhihua li wrote:
> Hi netters.
> 
> I've installed Bioperl 1.5.1, both core and run modules.  But when I 
> tried to use the Pise module, an error occured saying that there's no 
> "new" method in this package.
> 
> My script is:
> 
> use strict;
> use warnings;
> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
> my $program=$factory->program('mfold');
> $program->seq('my_input_file');
> my $job = $program->run();
> print STDERR $job->contect('mfold.out');
> 
> The error message I got is:
> 
> Can't locate object method "new" via package 
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load 
> "Bio::Tools::Run::AnalysisFactor::Pise"?)
> 
> I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm 
> and it DOES contain a sub new.
> 
> So what's going on? Anyone could give me a hint?

You have a typo.

Bio::Tools::Run::AnalysisFactory::Pise, not
Bio::Tools::Run::AnalysisFactor::Pise
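
Why 'use strict' did not catch the typo can be shown with a short
sketch. It uses a stand-in class with a hypothetical name
(My::AnalysisFactory::Pise) so that it runs without bioperl-run
installed. The class name in a method call is only resolved at run time,
so a misspelling surfaces as the "Can't locate object method" error
rather than as a compile error. The contect() call in the posted script
looks like a similar slip, presumably for content().

```perl
use strict;
use warnings;

# Stand-in for the real module; the name is hypothetical.
package My::AnalysisFactory::Pise;
sub new { return bless {}, shift }

package main;

# Correct spelling: the class exists, so new() is found.
my $factory = My::AnalysisFactory::Pise->new();
print ref($factory), "\n";

# Misspelled class name ("Factor" instead of "Factory"). This compiles
# cleanly and only fails when the call is actually made.
my $typo = eval { My::AnalysisFactor::Pise->new() };
print "Error: $@" unless defined $typo;

# A cheap sanity check before calling a constructor.
for my $class ('My::AnalysisFactory::Pise', 'My::AnalysisFactor::Pise') {
    printf "%s %s new()\n", $class, $class->can('new') ? 'has' : 'lacks';
}
```

Comparing the package name in the error message with the name on the
'use' line, as the replies in this thread do, is usually the quickest
way to spot this kind of mistake.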


From lincoln.stein at gmail.com  Tue Oct 10 20:11:00 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 10 Oct 2006 16:11:00 -0400
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
Message-ID: <6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>

Hi,

These ones are all mine:

> bioperl-live/Bio/DB/Fasta.pm
> bioperl-live/Bio/DB/GFF.pm
> bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> bioperl-live/Bio/DB/SeqFeature/Store/memory.pm

In each case, the second modules are teeny tiny ones that implement
iterators which are at most two methods long (typically a new() and a
next()). I prefer not to split them out because they will just clutter up
the file tree with stuff that is already well documented in the "parent
ship" modules.

Lincoln


On 10/10/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> There are a number of other bioperl-run examples (the
> Bio::Tools::Run::Analysis::soap issue I looked into revealed such).
>
> I agree with both points, 1) that it depends on the size of the
> classes, and 2) from a maintainability standpoint, it can be very
> frustrating when looking for documentation.  Is there really any
> advantage to doing this?
>
> Chris
>
> On Oct 8, 2006, at 11:52 PM, Torsten Seemann wrote:
>
> > Hi all,
> >
> > The following modules have more than one "package xxxx;"
> > declaration in
> > them. For small, internal classes I guess this is fine, but for
> > others,
> > they should be split up into the filesystem - otherwise they are
> > troublesome to locate and the online documentation doesn't list them!
> >
> > eg.
> > bioperl-run/Bio/Tools/Run/Analysis/Job.pm
> > is in
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> >
> > Here's the culprits:
> >
> > % for NN in bioperl-live bioperl-run; do grep '^package ' -r $NN/Bio |
> > sed 's/:.*$//' | sort | uniq -d ; done
> >
> > bioperl-live/Bio/AnalysisI.pm
> > bioperl-live/Bio/DB/Fasta.pm
> > bioperl-live/Bio/DB/GFF.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/berkeleydb.pm
> > bioperl-live/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/berkeleydb.pm
> > bioperl-live/Bio/DB/SeqFeature/Store/memory.pm
> > bioperl-live/Bio/SeqIO/interpro.pm
> >
> > bioperl-run/Bio/Tools/Run/Analysis.pm
> > bioperl-run/Bio/Tools/Run/Analysis/soap.pm
> >
> > --
> > Dr Torsten Seemann               http://www.vicbioinformatics.com
> > Victorian Bioinformatics Consortium, Monash University, Australia
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu
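
The layout described above, a main class plus a tiny iterator class with
only new() and next() kept in the same file, looks roughly like the
following sketch. The class names and the get_iterator method are
invented for illustration and are not taken from the Bio::DB modules.

```perl
use strict;
use warnings;

# Hypothetical container class.
package My::FeatureStore;

sub new {
    my ($class, @features) = @_;
    return bless { features => [@features] }, $class;
}

# Hand back an iterator over a copy of the stored features.
sub get_iterator {
    my ($self) = @_;
    return My::FeatureStore::Iterator->new(@{ $self->{features} });
}

# The small helper class lives in the same file as its parent.
package My::FeatureStore::Iterator;

sub new {
    my ($class, @items) = @_;
    return bless { items => [@items] }, $class;
}

# Return the next item, or undef once the list is exhausted.
sub next {
    my ($self) = @_;
    return shift @{ $self->{items} };
}

package main;

my $store = My::FeatureStore->new(qw(gene1 gene2 gene3));
my $iter  = $store->get_iterator;
while (defined(my $feature = $iter->next)) {
    print "$feature\n";
}
```

Keeping the iterator next to its only caller means one file to read, at
the cost that the iterator has no file of its own to look up by name.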


From asjo at koldfront.dk  Tue Oct 10 20:04:35 2006
From: asjo at koldfront.dk (Adam Sjøgren)
Date: Tue, 10 Oct 2006 22:04:35 +0200
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <871wpglyy4.fsf@topper.koldfront.dk>

On Tue, 10 Oct 2006 19:42:48 +0000, zhihua wrote:

> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
                                               ^
                                               y
[...]

> Can't locate object method "new" via package
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load
> "Bio::Tools::Run::AnalysisFactor::Pise"?)

You missed a 'y' in "Factory".


  Best wishes,

-- 
 "We've reached a special place... Spiritually...             Adam Sjøgren
  ecumenically... grammatically."                        asjo at koldfront.dk


From dmessina at wustl.edu  Tue Oct 10 21:08:45 2006
From: dmessina at wustl.edu (David Messina)
Date: Tue, 10 Oct 2006 16:08:45 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
	<6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
	<E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>
Message-ID: <A38D8092-7DCE-4C1E-A00A-2F7E78276ED9@wustl.edu>

> However, I think pod2html handles multiple package declarations in
> one module, and the PDOC online do as well.  Does the Deobfuscator?

Nope. From my cursory examination at the time they mostly were, as  
Lincoln said, short and sweet, so I didn't consider it a big deal.

I do think the Deobfuscator should theoretically handle such cases  
anyway, though. I'll add it as a feature request on the wiki page. Or  
if you're chomping at the bit for it, I could certainly be  
beer-suaded to do it sooner rather than later... :)

Dave


From cjfields at uiuc.edu  Tue Oct 10 21:33:39 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 10 Oct 2006 16:33:39 -0500
Subject: [Bioperl-l] Multiple packages in the one .pm file
In-Reply-To: <A38D8092-7DCE-4C1E-A00A-2F7E78276ED9@wustl.edu>
References: <4529D58D.1080004@infotech.monash.edu.au>
	<2832EEF6-3D6F-4F6F-B2C8-914FA83884FA@uiuc.edu>
	<6dce9a0b0610101311ude636dcv54397e357a555160@mail.gmail.com>
	<E13D9102-3D98-4D91-ACFD-E30BDDAC9A1D@uiuc.edu>
	<A38D8092-7DCE-4C1E-A00A-2F7E78276ED9@wustl.edu>
Message-ID: <7F35F565-7D28-4B06-A501-4D4083652C5C@uiuc.edu>

Me?  I'm a lowly postdoc.  Lincoln's got the cash!

Chris

On Oct 10, 2006, at 4:08 PM, David Messina wrote:

>> However, I think pod2html handles multiple package declarations in
>> one module, and the PDOC online do as well.  Does the Deobfuscator?
>
> Nope. From my cursory examination at the time they mostly were, as  
> Lincoln said, short and sweet, so I didn't consider it a big deal.
>
> I do think the Deobfuscator should theoretically handle such cases  
> anyway, though. I'll add it as a feature request on the wiki page.  
> Or if you're chomping at the bit for it, I could certainly be beer- 
> suaded to do it sooner rather than later... :)
>
> Dave
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From sdavis2 at mail.nih.gov  Wed Oct 11 09:43:35 2006
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 11 Oct 2006 05:43:35 -0400
Subject: [Bioperl-l] No "new" method in
	Bio::Tool::Run::AnalysisFactor::Pise?
In-Reply-To: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
References: <BAY110-F167178A92AB9860C81C081C7170@phx.gbl>
Message-ID: <452CBCC7.30108@mail.nih.gov>

zhihua li wrote:
> Hi netters.
>
> I've installed Bioperl 1.5.1, both core and run modules. But when I
> tried to use the Pise module, an error occured saying that there's no
> "new" method in this package.
>
> My script is:
>
> use strict;
> use warnings;
> use Bio::Tools::Run::AnalysisFactory::Pise;
> my $factory = Bio::Tools::Run::AnalysisFactor::Pise->new();
> my $program=$factory->program('mfold');
> $program->seq('my_input_file');
> my $job = $program->run();
> print STDERR $job->contect('mfold.out');
>
> The error message I got is:
>
> Can't locate object method "new" via package
> "Bio::Tools::Run::AnalysisFactor::Pise" (perhaps you forgot to load
> "Bio::Tools::Run::AnalysisFactor::Pise"?)
>
> I checked the module file at ..../Bio/Tools/Run/AnalysisFactor/Pise.pm
> and it DOES contain a sub new.
>
> So what's going on? Anyone could give me a hint?
>
> Thanks a lot!

The module name is Bio::Tools::Run::AnalysisFactory::Pise. Note that it
is not "factor" but "factory". That should probably fix your problem.

Sean


From jay at jays.net  Sat Oct  7 22:34:23 2006
From: jay at jays.net (Jay Hannah)
Date: Sat, 07 Oct 2006 17:34:23 -0500
Subject: [Bioperl-l] Documentation typo: Bio::Search::Result::GenericResult
Message-ID: <45282B6F.1030308@jays.net>

I just updated my bioperl-live this morning, so I think I'm current. :)

perldoc Bio::Search::Result::GenericResult
------------
SYNOPSIS
           # typically one gets Results from a SearchIO stream
           use Bio::SearchIO;
           my $io = new Bio::SearchIO(-format => 'blast',
                                       -file   => 't/data/HUMBETGLOA.tblastx');
           while( my $result = $io->next_result) {
               # process all search results within the input stream
               while( my $hit = $result->next_hits()) {
-------------

Except that "next_hits()" does not exist. Should be "next_hit()".

(Should I have posted a patch instead?)

Thanks,

j
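
With the method name corrected, the loop from the SYNOPSIS can be
sketched as below. The hit_names sub is a hypothetical helper written
for this note. It only relies on next_result(), next_hit() and name(),
and the demonstration part runs only if BioPerl and the report file
named in the SYNOPSIS are both available.

```perl
use strict;
use warnings;

# Walk every result and every hit in a Bio::SearchIO-style stream and
# return the hit names in the order they were seen.
sub hit_names {
    my ($io) = @_;
    my @names;
    while (my $result = $io->next_result) {
        # next_hit(), not next_hits()
        while (my $hit = $result->next_hit) {
            push @names, $hit->name;
        }
    }
    return @names;
}

# Demonstration, skipped when BioPerl or the report file is missing.
my $report = 't/data/HUMBETGLOA.tblastx';
if (-e $report && eval { require Bio::SearchIO; 1 }) {
    my $io = Bio::SearchIO->new(-format => 'blast', -file => $report);
    print "$_\n" for hit_names($io);
}
```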


From bosborne11 at verizon.net  Tue Oct 10 22:42:25 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 10 Oct 2006 18:42:25 -0400
Subject: [Bioperl-l] Documentation typo:
	Bio::Search::Result::GenericResult
In-Reply-To: <45282B6F.1030308@jays.net>
Message-ID: <C1519A11.ABD1%bosborne11@verizon.net>

j,

No need, not for something so simple.

Brian O.


On 10/7/06 6:34 PM, "Jay Hannah" <jay at jays.net> wrote:

> Except that "next_hits()" does not exist. Should be "next_hit()".
> 
> (Should I have posted a patch instead?)


From zchou at cau.edu.cn  Wed Oct 11 06:34:24 2006
From: zchou at cau.edu.cn (zhuocheng Hou)
Date: Wed, 11 Oct 2006 14:34:24 +0800
Subject: [Bioperl-l] about retreive alinged sequence
Message-ID: <000a01c6ecff$4ea4b2f0$0915020a@zchou>

Hello,everyone,

I am a new user of Bioperl. I want to align multiple DNA sequences based on their translated proteins using clustalw. However, I don't know how to retrieve the aligned sequences.

The code is as follows (from the PAML HOWTO tutorial):

         #
         # This code runs, and I can see clustalw's output on the screen
         .......
         my $aa_aln = $aln_factory->align(\@prots, @params);
         # project the protein alignment back to CDS coordinates
         my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs);  
         my @each = $dna_aln->each_seq();         
         
         # The following code was written by me, but it doesn't work. I want to get the $dna_aln contents and write them to a file. 


         my $in  = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta');
         my $aln=$dna_aln;
         my $out = Bio::AlignIO->new(-file => ">out.msf" ,
                                   -format => 'msf');
         #print $out $_ while <$in>; 
         while ($aln = $in->next_aln() ) {
               my $out->write_aln($aln);
         }
         

Best regards,

Zhuocheng
CAU


From n.haigh at sheffield.ac.uk  Wed Oct 11 14:00:33 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 11 Oct 2006 15:00:33 +0100
Subject: [Bioperl-l] about retreive alinged sequence
In-Reply-To: <000a01c6ecff$4ea4b2f0$0915020a@zchou>
References: <000a01c6ecff$4ea4b2f0$0915020a@zchou>
Message-ID: <452CF901.6020409@sheffield.ac.uk>

Dear Zhuocheng

I'm not familiar with the aa_to_dna_aln method, but it appears from 
your code that it returns an alignment object. Please find comments 
inserted below - hope they help!

Nathan

zhuocheng Hou wrote:
> Hello,everyone,
>
> I am a new user of Bioperl. I want to align mutiple DNA sequences based on translated proteins by using clustalw. However, I don't know how to retreive aligned sequences out.
>
> The codes as follows (from the tutorials of HOWTOPAML):
>
>          #
>          # These codes run  and can find the screen print out of clustalw
>          .......
>          my $aa_aln = $aln_factory->align(\@prots, @params);
>          # project the protein alignment back to CDS coordinates
>          my $dna_aln = aa_to_dna_aln($aa_aln, \%seqs);  
>   
$dna_aln should be an alignment object (a Bio::SimpleAlign), not a 
stream, so all you need to do is set up the output stream to write it, 
similar to what you wrote below, i.e.

my $out = Bio::AlignIO->new(-file => ">out.msf" ,
                                   -format => 'msf');

Then simply write the alignment ($dna_aln) to the output stream 
with this (no 'my' here, since $out is already declared):

$out->write_aln($dna_aln);


>          my @each = $dna_aln->each_seq();         
>          
>          # The following codes were writed by me. However, it doesn't work. I want to get teh $dna_aln contents and write it to files. 
>
>
>          my $in  = Bio::AlignIO->newFh(-Fh=>\$dna_aln,-format => 'fasta');
>          my $aln=$dna_aln;
>          my $out = Bio::AlignIO->new(-file => ">out.msf" ,
>                                    -format => 'msf');
>          #print $out $_ while <$in>; 
>          while ($aln = $in->next_aln() ) {
>                my $out->write_aln($aln);
>          }
>          
>
> Best regards,
>
> Zhuocheng
> CAU
>
>   
> ------------------------------------------------------------------------
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
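
Putting the advice above together, writing an alignment object to a file
can be sketched as follows. This is a minimal illustration and not the
code from the HOWTO: write_alignment is a hypothetical helper, and the
two toy sequences stand in for the $dna_aln returned by aa_to_dna_aln.
BioPerl is loaded lazily so the file still compiles where it is not
installed.

```perl
use strict;
use warnings;

# Write an alignment object to a file in the given format.
# $aln is assumed to be a Bio::SimpleAlign (a Bio::Align::AlignI object).
sub write_alignment {
    my ($aln, $path, $format) = @_;
    require Bio::AlignIO;    # loaded only when the sub is called
    my $out = Bio::AlignIO->new(-file => ">$path", -format => $format);
    $out->write_aln($aln);   # no input stream or next_aln() loop needed
    return $path;
}

# Demonstration with a two-sequence toy alignment, run only when
# BioPerl is installed.
if (eval { require Bio::SimpleAlign; require Bio::LocatableSeq; 1 }) {
    my $aln = Bio::SimpleAlign->new();
    $aln->add_seq(Bio::LocatableSeq->new(
        -id => 'seq1', -seq => 'ATGGCCAAA', -start => 1, -end => 9));
    $aln->add_seq(Bio::LocatableSeq->new(
        -id => 'seq2', -seq => 'ATGGCTAAA', -start => 1, -end => 9));
    write_alignment($aln, 'out.msf', 'msf');
    print "Wrote out.msf\n";
}
else {
    print "BioPerl not found; skipping the demonstration\n";
}
```

The original attempt failed because it treated the alignment object as
if it were a file to read from. Since $dna_aln is already in memory,
only an output stream is required.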


From melcher at rescomp.berkeley.edu  Wed Oct 11 21:09:17 2006
From: melcher at rescomp.berkeley.edu (Graham Melcher)
Date: Wed, 11 Oct 2006 14:09:17 -0700
Subject: [Bioperl-l] Accessing GO through MYSQL?
Message-ID: <20061011210917.GA783@rescomp.berkeley.edu>

Hey all,

Preface:: This is my first post to this list, please redirect if my
questions belong elsewhere.  

I need to look up GO ontology information given GO accessions, and I have
a local MySQL db that mirrors the GO db from the GO website.  I am not
sure if the Bio::Ontology::* libraries were designed to be used in a
dynamic, load-as-you-need sort of way, and am wondering how other people
have gone about solving this problem.  Details follow...

Right now I'm using Class::DBI to access the MySQL database, and I have
made a new set of Bio::Ontology::TermI and Bio::Ontology::RelationshipI
subclasses which use these Class::DBI objects to access the relevant
information in the database on the fly.  Unfortunately, I got stuck on
the implementation of some of the other Bio::Ontology::*I interfaces,
especially Ontology.   Writing all of these subclasses seems infeasible,
or at least enough work that it might already be available somewhere.
Are MySQL accessors out there, and I just haven't found them, or is
Bio::Ontology possibly not the way to go?  

Alternatively, if I end up having to write this sort of Bio::Ontology -
Class::DBI interface, would anyone be interested in it being made
generally usable and available?

Finally, I just found go-perl, but although I haven't had a lot of time
to look into it, it doesn't seem to use mysql either.

Thanks!

Graham

-- 
Graham Melcher


From sdavis2 at mail.nih.gov  Thu Oct 12 11:51:14 2006
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 12 Oct 2006 07:51:14 -0400
Subject: [Bioperl-l] Accessing GO through MYSQL?
In-Reply-To: <20061011210917.GA783@rescomp.berkeley.edu>
References: <20061011210917.GA783@rescomp.berkeley.edu>
Message-ID: <452E2C32.7070502@mail.nih.gov>

Graham Melcher wrote:
> Finally, I just found go-perl, but although I haven't had a lot of time
> to look into it, it doesn't seem to use mysql either.
>   
Yep.  Keep going.  Go-perl and Go-db-perl:

http://www.godatabase.org/dev/go-db-perl/doc/go-db-perl-doc.html

Sean


From hlapp at gmx.net  Thu Oct 12 04:44:49 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 12 Oct 2006 00:44:49 -0400
Subject: [Bioperl-l] NESCent Phyloinformatics Hackathon
Message-ID: <939B253E-2F87-450A-A277-78B5645D3494@gmx.net>

(apologies in advance to those who receive this multiple times)

The National Evolutionary Synthesis Center (NESCent) in collaboration  
with Arlin Stoltzfus (U. Maryland, NIST), Aaron Mackey (GSK), Rutger  
Vos (UBC), and Mark Holder (FSU) sponsors a Phyloinformatics  
Hackathon to take place Dec 11-15 in Durham, NC.

The (wiki) website with more information and a formal proposal is at

	https://www.nescent.org/wg_phyloinformatics/

In short, the goal is to leverage the Bio* toolkits to provide the  
"glue" for evolutionary analyses of various types that depend on  
automation, interoperability, and data integration.

CALL FOR INPUT:

The specific objectives are driven by "use cases", that is, specific  
target problems of interest to evolutionary biologists (click 'Use  
Cases' at the above website). We invite community input in order to  
focus efforts on the most urgent or pervasive problems. The wiki for  
the hackathon allows direct editing of the use cases after  
registration. You may also upload data files, or add comments to the  
"Forum" page. Alternatively, send email to hlapp at nescent.org. You  
may also contact any of the organizers with questions or comments.

ATTENDANCE:

The hackathon is scheduled for Dec 11-15, 2006 in Durham NC. Space is  
limited, and attendance is by invitation. If you have not been  
contacted but desire to attend, please contact Hilmar Lapp (hlapp at  
nescent.org).

ORGANIZERS:

Hilmar Lapp (NESCent; hlapp at nescent.org)
Aaron Mackey (GSK; aaron.j.mackey at gsk.com)
Mark Holder (FSU; mholder at scs.fsu.edu)
Arlin Stoltzfus (CARB, NIST; arlin.stoltzfus at nist.gov)
Todd Vision (NESCent; tjv at bio.unc.edu)
Rutger Vos (UBC; rvosa at sfu.ca)


From neetisomaiya at gmail.com  Thu Oct 12 06:03:20 2006
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 12 Oct 2006 11:33:20 +0530
Subject: [Bioperl-l] need help urgently
Message-ID: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>

We are using BioPerl to parse a BLAST output file, and we then want to load
the full alignments into a CLOB column in one of our database tables. We are
trying to use SQL*Loader for this. Does anyone have an idea how we can go
about it?
We have tried loading sequences into CLOB columns using SQL*Loader, and that
works fine, but the same syntax does not work when used for loading
alignments.

-- 
-Neeti
Even my blood says, B positive


From sayali_salodkar at persistent.co.in  Thu Oct 12 10:16:34 2006
From: sayali_salodkar at persistent.co.in (Sayali)
Date: Thu, 12 Oct 2006 15:46:34 +0530
Subject: [Bioperl-l] regarding polyphred output
Message-ID: <006e01c6ede7$7f10bef0$00b1580a@persistent.co.in>

Hi, 

I want to parse the output of PolyPhred
(http://droog.gs.washington.edu/PolyPhred.html). Is there a parser already
available in BioPerl that would help me do this?

Thanks,

Sayali

 


From sdavis2 at mail.nih.gov  Thu Oct 12 10:40:12 2006
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 12 Oct 2006 06:40:12 -0400
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
Message-ID: <200610120640.12250.sdavis2@mail.nih.gov>

On Thursday 12 October 2006 02:03, neeti somaiya wrote:
> We are using BioPerl to parse a BLAST output file, and then we want to load
> full alignments into a CLOB column in one of our database tables. We are
> trying to use sql loader for the same. Anyone has an idea how we can go
> about it?
> We have tried loading sequences into CLOB columns using sql loader, and
> that works fine, but the same syntax when used for loading alignments, is
> not working.

Neeti,

You'll need to be a bit more specific about what you are doing.  Can you post 
the code you are using and error messages?  Also, what is "sql loader"?  And 
what database are you trying to use?

Sean


From crabtree at tigr.ORG  Thu Oct 12 11:28:06 2006
From: crabtree at tigr.ORG (Jonathan Crabtree)
Date: Thu, 12 Oct 2006 07:28:06 -0400
Subject: [Bioperl-l] need help urgently
In-Reply-To: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
References: <764978cf0610112303l5a6b1222o40e327ba24d164e8@mail.gmail.com>
Message-ID: <452E26C6.6040800@tigr.org>


Hi Neeti-

neeti somaiya wrote:
> We are using BioPerl to parse a BLAST output file, and then we want to load
> full alignments into a CLOB column in one of our database tables. We are
> trying to use sql loader for the same. Anyone has an idea how we can go
> about it?
>   

This doesn't sound like a BioPerl issue per se, so this list might not
be the best venue for your question.  Since SQL*Loader is an Oracle
utility you may have better luck in a forum frequented by Oracle DBAs
and/or general bioinformatics people.  (Not that this isn't such a
forum, but unless your difficulty is actually being caused by BioPerl,
or there's some kind of SQL*Loader wrapper in BioPerl--which I don't
think is the case--you run the risk of having people complain that your
question doesn't have enough to do with BioPerl.)

> We have tried loading sequences into CLOB columns using sql loader, and that
> works fine, but the same syntax when used for loading alignments, is not
> working.
>   

It's been a while since I've done any work with SQL*Loader, but I'd
guess that the reason it works with sequences and not alignments is
because there are characters in the alignments (newlines, perhaps?) that
SQL*Loader is incorrectly interpreting as either column (field) or row
(record) delimiters.  You may need to change your flat file encoding to
use delimiters other than the defaults (and alter the SQL*Loader control
file accordingly.)  As Sean pointed out, however, it's difficult to be
much help without seeing an example of a failed input and the
corresponding error(s)!  One other thing I remember about SQL*Loader (as
of Oracle 8-9 or so) is that all the CLOB values had to appear *last* in
the SQL*Loader record, at least if you were using variable-length
fields.  But since you've loaded sequences successfully, I doubt this is
the issue.  One final thought is that I believe SQL*Loader has an option
whereby you can place your LOB values in files external to the main
SQL*Loader input file, which sidesteps the field/row delimiter issue
completely; you may want to look into this if you're not already loading
your Oracle database this way.

Jonathan
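
The external-file idea in the last paragraph can be sketched on the Perl side as follows. This is only a minimal sketch under stated assumptions: the `%alignments` hash, the `write_lob_files` helper, the `.aln` suffix and the comma-separated index layout are all made up for illustration. The matching SQL*Loader control file is not shown; as far as I recall it would reference the file-name column through a LOBFILE clause, which should be checked against the Oracle documentation.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical input: hit name => multi-line alignment text, e.g. as
# collected while walking a BLAST report.
my %alignments = (
    hit1 => "Query  1  ACGT\n          ||||\nSbjct  1  ACGT\n",
    hit2 => "Query  5  TTGA\n          || |\nSbjct  9  TTCA\n",
);

# Write each alignment to its own file and return index lines of the
# form "name,filename" for the main loader input file.
sub write_lob_files {
    my ($dir, $aln) = @_;
    my @index;
    for my $name (sort keys %$aln) {
        my $path = File::Spec->catfile($dir, "$name.aln");
        open my $fh, '>', $path or die "Cannot write $path: $!";
        print {$fh} $aln->{$name};
        close $fh or die "Cannot close $path: $!";
        push @index, "$name,$path";
    }
    return @index;
}

my $dir = tempdir(CLEANUP => 1);
print "$_\n" for write_lob_files($dir, \%alignments);
```

Because the newlines now live in separate files, the main input file contains one short record per alignment and the delimiter question does not come up there.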


From bix at sendu.me.uk  Fri Oct 13 08:56:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 13 Oct 2006 09:56:01 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <4521E74E.1040404@infotech.monash.edu.au>
References: <4521E74E.1040404@infotech.monash.edu.au>
Message-ID: <452F54A1.7010908@sendu.me.uk>

Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's 
certainly interface-like, but doesn't follow the normal interface naming 
convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed 
WrapperBaseI? Left alone?


From cjfields at uiuc.edu  Fri Oct 13 12:20:58 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 13 Oct 2006 07:20:58 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <452F54A1.7010908@sendu.me.uk>
References: <4521E74E.1040404@infotech.monash.edu.au>
	<452F54A1.7010908@sendu.me.uk>
Message-ID: <43CC4E80-8F15-4C83-929D-DDC719360C8F@uiuc.edu>

I would say, according to BioPerl convention, it should be renamed  
WrapperBaseI.  It has a few interface-like methods and (importantly)  
lacks a constructor.  Unless someone else out there has other reasoning?

Note that this will require lots of bioperl-run changes as well, at  
least I think it will.

Chris

On Oct 13, 2006, at 3:56 AM, Sendu Bala wrote:

> Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's
> certainly interface-like, but doesn't follow the normal interface  
> naming
> convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed
> WrapperBaseI? Left alone?
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From avilella at gmail.com  Fri Oct 13 15:26:47 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 13 Oct 2006 16:26:47 +0100
Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method
Message-ID: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com>

Hi all,

While using the remove_gaps method in Bio::SimpleAlign I found that
if the alignment is bad enough to have no gap-free columns at all,
the method gives a warning:

Use of uninitialized value in split

at this line in add_seq:

map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq);

So my idea was to tweak this line to something like:

map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || '');

But I am unsure about any other side effects this may have.

Anyone?

    Albert.
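
To see what the proposed fallback does in isolation, here is a minimal core-Perl sketch. The `record_symbols` helper is a hypothetical stand-in for the `_symbols` bookkeeping quoted above, not the actual Bio::SimpleAlign code, and it uses `defined` rather than `||` so that only undef is replaced.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the '_symbols' bookkeeping in add_seq.
sub record_symbols {
    my ($symbols, $seq_string) = @_;
    # Fall back to an empty string when the sequence is undefined, so
    # split() receives a defined value and returns an empty list.
    $seq_string = '' unless defined $seq_string;
    $symbols->{$_} = 1 for split //, $seq_string;
    return $symbols;
}

my %symbols;
record_symbols(\%symbols, 'AC-GT');
record_symbols(\%symbols, undef);    # no warning, nothing recorded
print join(',', sort keys %symbols), "\n";    # prints "-,A,C,G,T"
```

One visible side effect of the fallback is that an undefined sequence silently contributes no symbols, so any code that relies on the warning to spot empty sequences would no longer see it.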


From cjfields at uiuc.edu  Fri Oct 13 15:51:38 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 13 Oct 2006 10:51:38 -0500
Subject: [Bioperl-l] hickup in SimpleAlign remove_gaps method
In-Reply-To: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com>
References: <358f4d650610130826k1a126983t73839d665bbcb502@mail.gmail.com>
Message-ID: <EE9FE57F-EE17-44FE-B298-CD4084675085@uiuc.edu>

You can check to see if it passes all tests.  I'm guessing  
SimpleAlign.t tests this method out in some way (though it's always  
safer to check).

Chris

On Oct 13, 2006, at 10:26 AM, Albert Vilella wrote:

> Hi all,
>
> While using the remove_gaps method in Bio::SimpleAlign I found out
> that if the alignment is (bad enough for) having no columns without
> any gap at all, the method will give a:
>
> Use of uninitialized value in split at this line in add_seq:
>
> map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq);
>
> So my idea was to tweak this line to something like:
>
> map { $self->{'_symbols'}->{$_} = 1; } split(//,$seq->seq || '');
>
> But I am unsure about any other side effects this may have.
>
> Anyone?
>
>     Albert.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jay at jays.net  Fri Oct 13 16:09:16 2006
From: jay at jays.net (Jay Hannah)
Date: Fri, 13 Oct 2006 11:09:16 -0500
Subject: [Bioperl-l] Documentation typo:
	Bio::Search::Result::GenericResult
In-Reply-To: <C1519A11.ABD1%bosborne11@verizon.net>
References: <C1519A11.ABD1%bosborne11@verizon.net>
Message-ID: <452FBA2C.7070003@jays.net>

Thanks Brian! 

My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :)

/home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v
----------------------------
revision 1.27
date: 2006/10/10 22:41:46;  author: bosborne;  state: Exp;  lines: +4 -4
next_hit, not next_hits
----------------------------

I'm a simple man who takes great satisfaction in the simple things. :)

j


Brian Osborne wrote:
> j,
> 
> No need, not for something so simple.
> 
> Brian O.
> 
> 
> On 10/7/06 6:34 PM, "Jay Hannah" <jay at jays.net> wrote:
>> Except that "next_hits()" does not exist. Should be "next_hit()".
>>
>> (Should I have posted a patch instead?)
> 


From jay at jays.net  Fri Oct 13 16:24:48 2006
From: jay at jays.net (Jay Hannah)
Date: Fri, 13 Oct 2006 11:24:48 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
Message-ID: <452FBDD0.2070008@jays.net>

So I'm doing the following:

1) Using Bio::SeqIO to read in a genbank file and kick out fasta.
2) Reading that fasta file w/ command line formatdb.
3) Using that output for command line blastall.
4) Using Bio::SearchIO to read the blast results.

(If there's a better way, do tell. -grin-)

This sequence is working great for nucleotide BLASTing, but I'm stuck on step 1 when trying protein BLAST. 

my $seq_in  = Bio::SeqIO->new(
   -file => "<Organism1.genbank", 
   -format => "genbank", 
   -alphabet => "protein"
);
my $seq_out_protein = Bio::SeqIO->new(
   -file => ">out",
   -format => 'fasta',
   -alphabet => 'protein'
);
while (my $inseq = $seq_in->next_seq) {
   $inseq->molecule("protein");
   $seq_out_protein->write_seq($inseq);
}

This creates a nucleotide file "out". Setting -alphabet doesn't seem to do anything. Setting molecule("protein") doesn't seem to do anything either.

I was expecting that it would just pull all the CDS strings out of the genbank file and dump those into fasta format?

Am I missing something obvious?

Thanks,

j


From bosborne11 at verizon.net  Fri Oct 13 16:54:02 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 13 Oct 2006 12:54:02 -0400
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <452FBDD0.2070008@jays.net>
Message-ID: <C1553CEA.AC2E%bosborne11@verizon.net>

Jay,

You're looking for the "translation" string in the CDS section, yes? You
need to delve a bit into features: the CDS is considered to be a feature of
the main or parent nucleotide sequence, and the translation is part of the
CDS feature:

http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank


Brian O.


On 10/13/06 12:24 PM, "Jay Hannah" <jay at jays.net> wrote:

> Am I missing something 


From bix at sendu.me.uk  Fri Oct 13 16:59:46 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 13 Oct 2006 17:59:46 +0100
Subject: [Bioperl-l] Documentation
	typo:	Bio::Search::Result::GenericResult
In-Reply-To: <452FBA2C.7070003@jays.net>
References: <C1519A11.ABD1%bosborne11@verizon.net> <452FBA2C.7070003@jays.net>
Message-ID: <452FC602.3080302@sendu.me.uk>

Jay Hannah wrote:
> Thanks Brian! 
> 
> My first BioPerl patch has been applied! A one byte typo correction! I'm so excited! :)
> 
> /home/repository/bioperl/bioperl-live/Bio/Search/Result/GenericResult.pm,v
> ----------------------------
> revision 1.27
> date: 2006/10/10 22:41:46;  author: bosborne;  state: Exp;  lines: +4 -4
> next_hit, not next_hits
> ----------------------------

Congratulations! :D

Next it will be two byte corrections and from there, the sky's the limit! :)


From hlapp at gmx.net  Fri Oct 13 17:28:50 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 13 Oct 2006 13:28:50 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <452F54A1.7010908@sendu.me.uk>
References: <4521E74E.1040404@infotech.monash.edu.au>
	<452F54A1.7010908@sendu.me.uk>
Message-ID: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>

What does the POD (and the code) say about instantiating it?

	-hilmar

On Oct 13, 2006, at 4:56 AM, Sendu Bala wrote:

> Bio::Tools::Run::WrapperBase is currently a Bio::Root::RootI. It's
> certainly interface-like, but doesn't follow the normal interface  
> naming
> convention (WrapperBaseI). Should it be a Bio::Root::Root? Renamed
> WrapperBaseI? Left alone?
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From jay at jays.net  Fri Oct 13 18:56:38 2006
From: jay at jays.net (Jay Hannah)
Date: Fri, 13 Oct 2006 13:56:38 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <C1553CEA.AC2E%bosborne11@verizon.net>
References: <C1553CEA.AC2E%bosborne11@verizon.net>
Message-ID: <452FE166.5080405@jays.net>

Brian Osborne wrote:
> You're looking for the "translation" string in the CDS section, yes? You
> need to delve a bit into features, the CDS is considered to be a feature of
> the main or parent nucleotide sequence and the translation is part of CDS
> feature:
> 
> http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Features_from_Genbank

Yes. Thanks. I "rolled my own" -- I'm now doing this:

while (my $inseq = $seq_in->next_seq) {
   my @features = $inseq->get_SeqFeatures();
   foreach my $feat ( @features ) {
      next unless ($feat->primary_tag eq "CDS");
      my @db_xrefs = $feat->annotation->get_Annotations("db_xref");
      @db_xrefs = grep { /^GI:/ } @db_xrefs;
      die "Panic! More than one GI: db_xref?"     if (@db_xrefs > 1);
      die "Panic! No GI: db_xref?"            unless (@db_xrefs == 1);
      my $gi = $db_xrefs[0];
      $gi =~ s/^GI://;
      my @translations = $feat->annotation->get_Annotations("translation");
      die "Panic! More than one translation?" if (@translations > 1);
      my @protein_ids = $feat->annotation->get_Annotations("protein_id");
      die "Panic! More than one protein_id?"  if (@protein_ids > 1);
      my @product = $feat->annotation->get_Annotations("product");
      die "Panic! More than one product?"  if (@product > 1);
      print ">gi|$gi|gb|$protein_ids[0]|";
      print $inseq->id . " $product[0]\n";
      print "$translations[0]\n";
   }
}

To generate a homebrew fasta file for a protein BLAST.

I just thought that -alphabet and molecule() would do that stuff for me? What else would "protein" mean in those? Does anyone use -alphabet and/or molecule()? For what? How? Again, here's what I'm talking about:

==========
my $seq_out_protein = Bio::SeqIO->new(
   -file => ">out",
   -format => 'fasta',
   -alphabet => 'protein'    # No effect?
);
while (my $inseq = $seq_in->next_seq) {
   $inseq->molecule("protein");    # No effect?
==========

Thanks,

j


From bosborne11 at verizon.net  Fri Oct 13 21:20:40 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 13 Oct 2006 17:20:40 -0400
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <452FE166.5080405@jays.net>
Message-ID: <C1557B68.AC3E%bosborne11@verizon.net>

Jay,

Yes, people use the -alphabet parameter. If you set it to something then
Bioperl will not try to determine whether the sequence is protein, rna, or
dna and this is particularly useful when the sequence contains characters
that Bioperl would object to (sequences with distasteful characters can be
created by various applications, for example, or you might introduce some
weird character for some reason). Setting the -alphabet would also speed up
Bioperl a bit, for the same reason.

Brian O.


On 10/13/06 2:56 PM, "Jay Hannah" <jay at jays.net> wrote:

> 
> I just thought that -alphabet and molecule() would do that stuff for me? What
> else would "protein" mean in those? 


From jay at jays.net  Sat Oct 14 15:25:05 2006
From: jay at jays.net (Jay Hannah)
Date: Sat, 14 Oct 2006 10:25:05 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <C1557B68.AC3E%bosborne11@verizon.net>
References: <C1557B68.AC3E%bosborne11@verizon.net>
Message-ID: <45310151.5050901@jays.net>

Brian Osborne wrote:
> Yes, people use the -alphabet parameter. If you set it to something then
> Bioperl will not try to determine whether the sequence is protein, rna, or
> dna and this is particularly useful when the sequence contains characters
> that Bioperl would object to (sequences with distasteful characters can be
> created by various applications, for example, or you might introduce some
> weird character for some reason). Setting the -alphabet would also speed up
> Bioperl a bit, for the same reason.

Huh. That's what I assumed when I stumbled into the -alphabet parameter. So I thought this would read the protein sequences out of my genbank file and write a fasta file for me:

my $seq_in  = Bio::SeqIO->new(
   -file     => "<$file",  
   -format   => "genbank",
   -alphabet => "protein"  # No effect?
);
my $seq_out = Bio::SeqIO->new(
   -file     => ">$outfile",
   -format   => "fasta",
   -alphabet => "protein"  # No effect?
);
while (my $inseq = $seq_in->next_seq) {
   $inseq->molecule("protein");    # No effect?
   $seq_out->write_seq($inseq);
}

It didn't. Would it be a Good Thing if it did what I was expecting? (Like I said I rolled my own, but I'm always looking for ways to enhance BioPerl that other people might find useful... Someday I will contribute something useful, by golly. -grin-)

(Background: I'm doing protein BLASTs from genbank files. To make formatdb happy I have to have fasta files full of the protein sequences.)

j


From bosborne11 at verizon.net  Sat Oct 14 18:40:21 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Sat, 14 Oct 2006 14:40:21 -0400
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <45310151.5050901@jays.net>
Message-ID: <C156A755.AC52%bosborne11@verizon.net>

Jay,

What you expected was that setting the -alphabet to "protein" would make
Bioperl translate the input nucleotide sequence to output protein. In
Bioperl this is accomplished by using the translate() method, no surprise
there. If you take a look at the documentation on translate() in the online
Bioperl Tutorial you'll see that this is a fairly sophisticated method, you
can do all sorts of different things with it. So using -alphabet for this
purpose won't really work, there are too many different ways to translate.

Brian O.


On 10/14/06 11:25 AM, "Jay Hannah" <jay at jays.net> wrote:

> Would it be a Good Thing if it did what I was expecting?
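
The difference between labelling a sequence and translating it can be shown with a short sketch. It assumes BioPerl is installed; the identifier and the 12-base sequence are made up for illustration.

```perl
use strict;
use warnings;
use Bio::Seq;

my $dna = Bio::Seq->new(
    -display_id => 'example',         # made-up identifier
    -seq        => 'ATGGCCAAATGA',    # made-up 12-base sequence
    -alphabet   => 'dna',
);

# -alphabet only labels the sequence; the residues are untouched.
print $dna->alphabet, "\t", $dna->seq, "\n";

# translate() returns a new sequence object holding the protein.
my $protein = $dna->translate;
print $protein->alphabet, "\t", $protein->seq, "\n";
```

translate() is called here with its defaults only; the options mentioned above (frame, codon table and so on) are described in the tutorial.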


From cjfields at uiuc.edu  Sun Oct 15 00:44:04 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 14 Oct 2006 19:44:04 -0500
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
In-Reply-To: <45310151.5050901@jays.net>
Message-ID: <000601c6eff3$084663c0$15327e82@pyrimidine>

...
> Huh. That's what I assumed when I stumbled into the -alphabet parameter.
> So I thought this would read the protein sequences out of my genbank file
> and write a fasta file for me:

You have to think about it this way: the GenBank record you are using is for
the nucleotide sequence only, and all other information in that record
describes the sequence.  Similarly, if you used a 'GenPept' sequence, the
focus would be the protein sequence.  Both normally contain annotations
which describe the sequence globally, such as references, organism info,
etc.  Both also may contain features (or SeqFeatures), which describe a
feature bound to a particular location on the sequence.  However, features
are not an absolute requirement for a sequence; they're sort of 'window
dressing', albeit almost always essential for describing the main sequence.

I would do exactly as Brian suggests.  See the Feature/Annotation HOWTO for
ideas on how to screen out the particular features you want and either grab
the 'translation' tag data or get the sequence object from the feature and
translate it directly.  You should get the same result either way, though
getting the tag may be faster.

...

> It didn't. Would it be a Good Thing if it did what I was expecting? (Like
> I said I rolled my own, but I'm always looking for ways to enhance BioPerl
> that other people might find useful... Someday I will contribute something
> useful, by golly. -grin-)
> 
> (Background: I'm doing protein BLASTs from genbank files. To make formatdb
> happy I have to have fasta files full of the protein sequences.)
> 
> j

You could, theoretically, write up a method to only retrieve features which
correspond to coding regions only (CDS).  You may want to optionally screen
out pseudogenes but that's up to you.

Chris
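
A minimal sketch of the tag-based approach described above, assuming BioPerl is installed. The file names are placeholders, and the script uses the tag accessors of the feature interface (has_tag, get_tag_values) rather than the annotation calls in Jay's version; which of the two works may depend on the BioPerl version in use.

```perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::Seq;

# File names are placeholders.
my $in  = Bio::SeqIO->new(-file => '<Organism1.genbank', -format => 'genbank');
my $out = Bio::SeqIO->new(-file => '>proteins.fasta',    -format => 'fasta');

while (my $seq = $in->next_seq) {
    for my $feat ($seq->get_SeqFeatures) {
        next unless $feat->primary_tag eq 'CDS';
        next if $feat->has_tag('pseudo');             # optional: skip pseudogenes
        next unless $feat->has_tag('translation');    # nothing to write otherwise

        my ($translation) = $feat->get_tag_values('translation');
        my ($protein_id)  = $feat->has_tag('protein_id')
                          ? $feat->get_tag_values('protein_id')
                          : ($seq->display_id);
        my ($product)     = $feat->has_tag('product')
                          ? $feat->get_tag_values('product')
                          : ('');

        # Wrap the stored translation in a protein sequence object and
        # let the FASTA writer format it.
        $out->write_seq(Bio::Seq->new(
            -display_id => $protein_id,
            -desc       => $product,
            -seq        => $translation,
            -alphabet   => 'protein',
        ));
    }
}
```

Letting Bio::SeqIO write the records avoids hand-formatting the FASTA header, at the cost of a simpler header than the gi|...|gb|... style used in the homebrew version.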


From avilella at gmail.com  Sun Oct 15 11:08:23 2006
From: avilella at gmail.com (Albert Vilella)
Date: Sun, 15 Oct 2006 12:08:23 +0100
Subject: [Bioperl-l] no_residues test in SimpleAlign.t
Message-ID: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com>

Hi all,

Can somebody check the SimpleAlign.t test?

perl t/SimpleAlign.t

I get a few errors, I am looking at one that deals with no_residues. I
don't understand if this is supposed to work:

sub no_residues {
    my $self = shift;
    my $count = 0;

    foreach my $seq ($self->each_seq) {
        my $str = $seq->seq();

        $count += ($str =~ s/[^A-Za-z]//g);
        #is this the same as:
        # $str =~ s/[^A-Za-z]//g;
        # $count += length($str);
    }

    return $count;
}

Cheers,

    Albert.
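
The question in the comment can be checked with core Perl alone. The sketch below uses a made-up gapped string, not data from the test suite: s///g returns the number of substitutions it performed, so the two variants count different things.

```perl
use strict;
use warnings;

my $aligned = 'AC-GT--A';    # made-up gapped sequence

# s///g returns the number of substitutions it made, so this counts
# the characters that were removed (here, the gaps).
my $copy    = $aligned;
my $removed = ($copy =~ s/[^A-Za-z]//g);

# What is left after the substitution is the residues.
my $left = length $copy;

# tr/// counts matching characters without modifying the string.
my $letters = ($aligned =~ tr/A-Za-z//);

print "removed=$removed left=$left letters=$letters\n";
# prints "removed=3 left=5 letters=5"
```

So with the negated character class, the one-line form adds up the non-letters, while the commented-out two-line form adds up the letters.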


From cjfields at uiuc.edu  Sun Oct 15 17:53:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 15 Oct 2006 12:53:50 -0500
Subject: [Bioperl-l] no_residues test in SimpleAlign.t
In-Reply-To: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com>
References: <358f4d650610150408r23cdf693t8e3b6b797649ff58@mail.gmail.com>
Message-ID: <FE798536-21DA-4377-96E2-0BF98C235970@uiuc.edu>

Albert,

I get all 75 tests passing.  SimpleAlign.t was recently switched over  
to Test::More, so you should be seeing more explicit test  
descriptions.  It looks like test 27 is no_residues().  Were there  
any more that failed?

I usually run 'perl -I. t/test.t' from the main bioperl directory to  
check individual tests from the local directory.  Otherwise you are  
checking your installed version which may be older (and may not match  
tests and recent bug fixes).  Could that be the problem?

Chris

On Oct 15, 2006, at 6:08 AM, Albert Vilella wrote:

> Hi all,
>
> Can somebody check the SimpleAlign.t test?
>
> perl t/SimpleAlign.t
>
> I get a few errors, I am looking at one that deals with no_residues. I
> don't understand if this is supposed to work:
>
> sub no_residues {
>     my $self = shift;
>     my $count = 0;
>
>     foreach my $seq ($self->each_seq) {
>         my $str = $seq->seq();
>
>         $count += ($str =~ s/[^A-Za-z]//g);
>         #is this the same as:
>         # $str =~ s/[^A-Za-z]//g;
>         # $count += length($str);
>     }
>
>     return $count;
> }
>
> Cheers,
>
>     Albert.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
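
One way to check which copy of a module a test run is actually picking up is to look at %INC after loading it. The sketch below is an illustration only: `module_path` is a hypothetical helper, and `use lib '.'` stands in for the -I. switch mentioned above.

```perl
use strict;
use warnings;
use lib '.';    # same effect as running "perl -I. ..."

# Load a module by name and report the file it was loaded from,
# or undef if it could not be loaded.
sub module_path {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? $INC{$file} : undef;
}

for my $module ('Bio::SimpleAlign', 'Bio::Root::Root') {
    my $path = module_path($module);
    print "$module: ", (defined $path ? $path : 'not found'), "\n";
}
```

Run from the top of a checkout, a path starting with the current directory indicates the local copy; a path under the site library indicates the installed one.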


From DGroskreutz at twt.com  Mon Oct 16 06:00:39 2006
From: DGroskreutz at twt.com (DGroskreutz at twt.com)
Date: Mon, 16 Oct 2006 01:00:39 -0500
Subject: [Bioperl-l] CN=Deb Groskreutz/OU=MSN/O=TWT is out of the office.
Message-ID: <OF66FF39D7.C58855EB-ON86257209.002104F9-86257209.002104F9@twt.com>


I will be out of the office starting  10/13/2006 and will not return until
10/30/2006.

I will be out of the office until October 30, 2006.
I will reply to your message at that time.

Thanks,
Deb




From bix at sendu.me.uk  Mon Oct 16 08:08:34 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 09:08:34 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>
References: <4521E74E.1040404@infotech.monash.edu.au>	<452F54A1.7010908@sendu.me.uk>
	<5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>
Message-ID: <45333E02.9070808@sendu.me.uk>

Hilmar Lapp wrote:
> What does the POD (and the code) say about instantiating it?

=head1 SYNOPSIS

   # do not use this object directly, it provides the following methods
   # for its subclasses

...


=head1 DESCRIPTION

This is a basic module from which to build executable wrapper modules.
It has some basic methods to help when implementing new modules.


There is no new() method.
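
The pattern described in that POD, a base module with shared methods but no constructor of its own, looks roughly like this in plain Perl. The package names and methods are hypothetical and only illustrate the shape; they are not taken from Bio::Tools::Run::WrapperBase.

```perl
use strict;
use warnings;

package My::WrapperBaseI;    # hypothetical interface-style base: no new()

# Subclasses must say which program they wrap.
sub program_name {
    my $self = shift;
    die ref($self) . " must implement program_name()\n";
}

# Shared helper built on top of the required method.
sub describe {
    my $self = shift;
    return 'wrapper for ' . $self->program_name;
}

package My::ClustalWrapper;  # hypothetical concrete wrapper
our @ISA = ('My::WrapperBaseI');

sub new          { my $class = shift; return bless {}, $class }
sub program_name { return 'clustalw' }

package main;

print My::ClustalWrapper->new->describe, "\n";    # prints "wrapper for clustalw"
```

Only the concrete class can be instantiated, which is the property the naming question above turns on.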


From bix at sendu.me.uk  Mon Oct 16 13:23:41 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 14:23:41 +0100
Subject: [Bioperl-l] Bio::WebAgent sleep warning
Message-ID: <453387DD.3040105@sendu.me.uk>

Hi,

Does anyone think it's appropriate for Bio::WebAgent to issue warnings 
every time it sleeps? I'd consider the sleeping part of its normal, 
expected and desired behaviour so I don't need to be warned about it. 
Perhaps change the $self->warn to a $self->debug?
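
As a rough sketch of the proposed behaviour, assuming nothing about the real Bio::WebAgent internals: the class below is a hypothetical stand-in in which the sleep message goes through a debug method that prints only when verbose is switched on.

```perl
use strict;
use warnings;

package My::ThrottledAgent;    # hypothetical class, not Bio::WebAgent

sub new {
    my ($class, %args) = @_;
    return bless {
        verbose => $args{verbose} || 0,
        delay   => $args{delay}   || 0,
        last    => 0,
    }, $class;
}

# Print only when verbose is switched on.
sub debug {
    my ($self, @msg) = @_;
    print STDERR @msg if $self->{verbose} > 0;
    return;
}

# Wait until at least 'delay' seconds have passed since the last call.
# Returns the number of seconds slept.
sub sleep_if_needed {
    my $self = shift;
    my $wait = $self->{last} + $self->{delay} - time;
    $wait = 0 if $wait < 0;
    if ($wait > 0) {
        $self->debug("sleeping for $wait seconds\n");    # debug, not warn
        sleep $wait;
    }
    $self->{last} = time;
    return $wait;
}

package main;

my $agent = My::ThrottledAgent->new(delay => 1, verbose => 0);
$agent->sleep_if_needed for 1 .. 2;    # any sleep on the second call is silent
```

With verbose set to 1 the same run reports the sleep on STDERR, which covers the case of wanting to confirm that throttling is working.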


From cjfields at uiuc.edu  Mon Oct 16 14:12:10 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 09:12:10 -0500
Subject: [Bioperl-l] Bio::WebAgent sleep warning
In-Reply-To: <453387DD.3040105@sendu.me.uk>
Message-ID: <000c01c6f12d$121b5000$15327e82@pyrimidine>

> Hi,
> 
> Does anyone think it's appropriate for Bio::WebAgent to issue warnings
> every time it sleeps? I'd consider the sleeping part of its normal,
> expected and desired behaviour so I don't need to be warned about it.
> Perhaps change the $self->warn to a $self->debug?

That sounds fine.  Using debugging output for sleep would be similar
behavior to Bio::DB::NCBIHelper and Bio::DB::GenericWebDBI.  You may want to
run it by Heikki (I think that's his module).

The only reason I would want to see sleep output, personally, is to make
sure it is working properly.

Almost looks like that class has the same intent that GenericWebDBI has
(even down to using LWP::UserAgent as a superclass).  I may look into it to
see if I can use this as a superclass for GenericWebDBI.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Mon Oct 16 14:26:21 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Mon, 16 Oct 2006 15:26:21 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
Message-ID: <4533968D.6040009@sheffield.ac.uk>

Did anyone reconfigure the bioperl web server (whichever server hosts
http://bioperl.org/DIST) by adding the following line to the httpd.conf
file:

RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*) http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1

This will be required as a workaround to a bug in ActivePerl 5.8.8.819
which will result in a failed install of Bioperl via PPM.

Cheers
Nath


From n.haigh at sheffield.ac.uk  Mon Oct 16 15:30:16 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Mon, 16 Oct 2006 16:30:16 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A257.2000207@campus.iztacala.unam.mx>
References: <4533968D.6040009@sheffield.ac.uk>
	<4533A257.2000207@campus.iztacala.unam.mx>
Message-ID: <4533A588.9020505@sheffield.ac.uk>

Mauricio Herrera Cuadra wrote:
> Done. Could you please check if it works as it should?
>
> Cheers,
> Mauricio.
Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
someone to pop it in http://bioperl/DIST

Volunteers?

BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
the PPD? I seem to remember that there was talk about having to maintain
a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
this front?

Nath


From arareko at campus.iztacala.unam.mx  Mon Oct 16 15:16:39 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 16 Oct 2006 10:16:39 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533968D.6040009@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>
Message-ID: <4533A257.2000207@campus.iztacala.unam.mx>

Done. Could you please check if it works as it should?

Cheers,
Mauricio.

Nathan Haigh wrote:
> Did anyone reconfigure the bioperl web server (whichever server hosts
> http://bioperl.org/DIST) by adding the following lines to the httpd.conf
> file:
> 
> RedirectMatch /DIST/MSWin32-x86-multi-thread-5.8/(.*)
> http://ppm.activestate.com/PPMPackages/5.8-windows/MSWin32-x86-multi-thread-5.8/$1
> 
> This will be required as a workaround to a bug in ActivePerl 5.8.8.819
> which will result in a failed install of Bioperl via PPM.
> 
> Cheers
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From arareko at campus.iztacala.unam.mx  Mon Oct 16 15:33:33 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 16 Oct 2006 10:33:33 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A588.9020505@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>
	<4533A257.2000207@campus.iztacala.unam.mx>
	<4533A588.9020505@sheffield.ac.uk>
Message-ID: <4533A64D.6040203@campus.iztacala.unam.mx>

Nathan Haigh wrote:
> Mauricio Herrera Cuadra wrote:
>> Done. Could you please check if it works as it should?
>>
>> Cheers,
>> Mauricio.
> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
> someone to pop it in http://bioperl/DIST
> 
> Volunteers?

You can send it to me.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From akarger at CGR.Harvard.edu  Mon Oct 16 15:54:33 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 16 Oct 2006 11:54:33 -0400
Subject: [Bioperl-l] Bio::Location::Split
Message-ID: <B9182BFF5B004245BABC12956EA6322E016B1ED7@huls5.nucleus.harvard.edu>

I recently came across bug 2101, where Bio::Location::Split::to_FTstring
gives the incorrect order for multi-sublocation locations on the minus
strand. That is, I found it by getting incorrect results, and then found
it in Bugzilla and in the September archives.

I'm converting CDS files from one format to another. E.g., I read an
EMBL file with a chromosome and CDS features, and want to output the
location in a FASTA header. If I do something like:

while (my $seq = $in->next_seq) {    # $in is a Bio::SeqIO object
    foreach my $feat ($seq->get_SeqFeatures) {
        print $feat->location->to_FTstring(), "\n";
    }
}

I get the wrong results for multi-exon CDSs on the -1 strand, as
described in the bug report.

Is there a relatively easy way around this? I assume I can't get at the
original string of the location, which in this case is all I need. Can I
just flip the order of the exons in certain cases? Chris F, can you tell
me the preliminary solution you mentioned?

I must say I'm sort of surprised this wasn't found before. It seems like
a not-that-rare occurrence. Oh well.

Thanks,

- Amir Karger
Research Computing
Life Sciences Division
Harvard University


From bix at sendu.me.uk  Mon Oct 16 16:14:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 17:14:39 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A588.9020505@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>	<4533A257.2000207@campus.iztacala.unam.mx>
	<4533A588.9020505@sheffield.ac.uk>
Message-ID: <4533AFEF.8080103@sendu.me.uk>

Nathan Haigh wrote:
> Mauricio Herrera Cuadra wrote:
>> Done. Could you please check if it works as it should?
>>
>> Cheers,
>> Mauricio.
> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
> someone to pop it in http://bioperl/DIST
> 
> Volunteers?

I'm sure Mauricio would be happy to do it, but so am I. You may want to 
hold off a little while until I release rc2, which may be a few hours away.


> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
> the PPD? I seem to remember that there was talk about having to maintain
> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
> this front?

It depends on what is in the PPD and what kind of auto-dependency 
features the ActiveState installer has. Given Perl 5.8 and your current 
PPD, does Bioperl install with the same or fewer number of skips if you 
also install Bundle::BioPerl first? That is, does Bundle::BioPerl even 
do anything useful anymore? If not, obviously don't bother making it a 
pre-req. If it does, my opinion is that you make it a pre-req. If people 
really don't want to install the optional stuff they can download the 
.zip file and install manually without even a make.


From Kevin.M.Brown at asu.edu  Mon Oct 16 16:14:51 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Mon, 16 Oct 2006 09:14:51 -0700
Subject: [Bioperl-l] Bio::SeqIO, genbank -> fasta, protein only?
Message-ID: <1A4207F8295607498283FE9E93B775B402196FAA@EX02.asurite.ad.asu.edu>

> > Yes, people use the -alphabet parameter. If you set it to 
> something then
> > Bioperl will not try to determine whether the sequence is 
> protein, rna, or
> > dna and this is particularly useful when the sequence 
> contains characters
> > that Bioperl would object to (sequences with distasteful 
> characters can be
> > created by various applications, for example, or you might 
> introduce some
> > weird character for some reason). Setting the -alphabet 
> would also speed up
> > Bioperl a bit, for the same reason.
> 
> Huh. That's what I assumed when I stumbled into the -alphabet 
> parameter. So I thought this would read the protein sequences 
> out of my genbank file and write a fasta file for me:
> 
> my $seq_in  = Bio::SeqIO->new(
>    -file     => "<$file",  
>    -format   => "genbank",
>    -alphabet => "protein"  # No effect?
> );
> my $seq_out = Bio::SeqIO->new(
>    -file     => ">$outfile",
>    -format   => "fasta",
>    -alphabet => "protein"  # No effect?
> );
> while (my $inseq = $seq_in->next_seq) {
>    $inseq->molecule("protein");    # No effect?
>    $seq_out->write_seq($inseq);
> }
> 
> It didn't. Would it be a Good Thing if it did what I was 
> expecting? (Like I said I rolled my own, but I'm always 
> looking for ways to enhance BioPerl that other people might 
> find useful... Someday I will contribute something useful, by 
> golly. -grin-)
> 
> (Background: I'm doing protein BLASTs from genbank files. To 
> make formatdb happy I have to have fasta files full of the 
> protein sequences.)

This might work for your needs (CDS to protein FASTA).

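use Bio::SeqIO;

my $seq_in  = Bio::SeqIO->new(
   -file     => "<$file",
   -format   => "genbank",
);

open my $seq_out, '>', $outfile or die "Cannot write $outfile: $!";

while (my $inseq = $seq_in->next_seq) {
   # translate() returns a sequence object, so ask it for the string
   print $seq_out ">". $inseq->display_id(). "\n";
   print $seq_out $inseq->translate()->seq() ."\n";
}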
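
If the GenBank records are whole genomes or contigs rather than single CDS entries, translating the entire sequence will not give the proteins. A possible alternative, sketched below, is to pull the /translation qualifier from each CDS feature. It assumes the records carry CDS features with that qualifier; the helper name cds_translations and the "<display_id>_<n>" naming scheme are made up for this example, and the demonstration builds a tiny sequence in memory instead of reading a file.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;

# Return [id, protein] pairs for every CDS feature that carries a
# /translation qualifier. The ID scheme is an arbitrary choice here.
sub cds_translations {
    my ($seq) = @_;
    my @out;
    my $n = 0;
    for my $feat ($seq->get_SeqFeatures) {
        next unless $feat->primary_tag eq 'CDS';
        next unless $feat->has_tag('translation');
        my ($protein) = $feat->get_tag_values('translation');
        push @out, [ $seq->display_id . '_' . ++$n, $protein ];
    }
    return @out;
}

# In-memory demonstration; with a real file, $seq would come from
# Bio::SeqIO->new(-file => ..., -format => 'genbank')->next_seq.
my $seq = Bio::Seq->new(-display_id => 'demo', -seq => 'ATGAAAGTTTAA');
$seq->add_SeqFeature(Bio::SeqFeature::Generic->new(
    -primary_tag => 'CDS',
    -start       => 1,
    -end         => 12,
    -strand      => 1,
    -tag         => { translation => 'MKV' },
));
$seq->add_SeqFeature(Bio::SeqFeature::Generic->new(
    -primary_tag => 'gene',
    -start       => 1,
    -end         => 12,
));

# Write FASTA-style records for the CDS features only
print ">$_->[0]\n$_->[1]\n" for cds_translations($seq);
```

Features without a /translation qualifier (pseudogenes, for instance) are skipped here rather than translated on the fly.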


From bix at sendu.me.uk  Mon Oct 16 15:44:19 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 16:44:19 +0100
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
Message-ID: <4533A8D3.90709@sendu.me.uk>

I think Chris recently deprecated this, but should it be? For me, its 
POD description justifies its existence, and perhaps more importantly, 
Bio::Index::Blast relies on it.

I took a quick peek at the latter and it didn't seem trivial to move it 
over to Bio::SearchIO instead.

Should it be undeprecated?


From n.haigh at sheffield.ac.uk  Mon Oct 16 16:39:02 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Mon, 16 Oct 2006 17:39:02 +0100
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533AFEF.8080103@sendu.me.uk>
References: <4533968D.6040009@sheffield.ac.uk>	<4533A257.2000207@campus.iztacala.unam.mx>
	<4533A588.9020505@sheffield.ac.uk> <4533AFEF.8080103@sendu.me.uk>
Message-ID: <4533B5A6.1070701@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> Mauricio Herrera Cuadra wrote:
>>> Done. Could you please check if it works as it should?
>>>
>>> Cheers,
>>> Mauricio.
>> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
>> someone to pop it in http://bioperl/DIST
>>
>> Volunteers?
>
> I'm sure Mauricio would be happy to do it, but so am I. You may want
> to hold off a little while until I release rc2, which may be a few
> hours away.

Just e-mailed Mauricio links to the files off list. It's not a big job
for me to remake the bioperl PPD, so Mauricio, it's up to you if you want
to wait 18hrs for me to make the PPDs for 1.5.2-rc2.
>
>
>> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
>> the PPD? I seem to remember that there was talk about having to maintain
>> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
>> this front?
>
> It depends on what is in the PPD and what kind of auto-dependency
> features the ActiveState installer has. Given Perl 5.8 and your
> current PPD, does Bioperl install with the same or fewer number of
> skips if you also install Bundle::BioPerl first? That is, does
> Bundle::BioPerl even do anything useful anymore? If not, obviously
> don't bother making it a pre-req. If it does, my opinion is that you
> make it a pre-req. If people really don't want to install the optional
> stuff they can download the .zip file and install manually without
> even a make.
As far as the PPDs are concerned, no tests are run during installation.
PPM more or less just copies files into the correct place for Perl to
find, so both approaches result in the same thing. However, I've not
tried making a CPAN distribution file for either Bioperl or
Bundle::BioPerl - I wouldn't know where to start!

Makefile.PL now documents the prereqs in only one place (%packages), and
this is used to add the prereqs to the bioperl PPD when issuing "nmake
ppd". This way, each release of BioPerl should be up-to-date with prereqs
as long as developers add their modules' prereqs to %packages. If we have
Bundle::BioPerl, most of those prereqs need to be moved from the Bioperl
PPD to the Bundle::BioPerl PPD - a bit of a pain because there are no
guidelines as to what should/should not go in Bundle::BioPerl.
Therefore, as far as the PPDs are concerned, it is far easier to do away
with Bundle::BioPerl.

Nath


From hlapp at gmx.net  Mon Oct 16 17:04:24 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:04:24 -0400
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <45333E02.9070808@sendu.me.uk>
References: <4521E74E.1040404@infotech.monash.edu.au>	<452F54A1.7010908@sendu.me.uk>
	<5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>
	<45333E02.9070808@sendu.me.uk>
Message-ID: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>

So it looks like an abstract base class, not an interface that  
defines a contract or API? Should use Root.pm then, would be my vote.

	-hilmar

On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> What does the POD (and the code) say about instantiating it?
>
> =head1 SYNOPSIS
>
>    # do not use this object directly, it provides the following  
> methods
>    # for its subclasses
>
> ...
>
>
> =head1 DESCRIPTION
>
> This is a basic module from which to build executable wrapper modules.
> It has some basic methods to help when implementing new modules.
>
>
> There is no new() method.

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Mon Oct 16 17:08:28 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:08:28 -0400
Subject: [Bioperl-l] Bio::WebAgent sleep warning
In-Reply-To: <453387DD.3040105@sendu.me.uk>
References: <453387DD.3040105@sendu.me.uk>
Message-ID: <B6C9C0EE-00DD-4B16-98D9-2BBBE6CD835E@gmx.net>

It depends. What triggers the sleeping? If it's part of every request  
that it processes then I'd agree. If it is triggered by failure to  
precede the next try then the failure is probably not expected  
(though possible), and hence should be reported by warn().

If it is just part of the polling cycle then there should probably be  
a limit up to which the time waited is considered 'normal' and after  
which it is considered 'excessive' and hence should be reported  
through warn().

My $0.02.

	-hilmar

On Oct 16, 2006, at 9:23 AM, Sendu Bala wrote:

> Hi,
>
> Does anyone think it's appropriate for Bio::WebAgent to issue warnings
> every time it sleeps? I'd consider the sleeping part of its normal,
> expected and desired behaviour so I don't need to be warned about it.
> Perhaps change the $self->warn to a $self->debug?
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Mon Oct 16 17:13:53 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 18:13:53 +0100
Subject: [Bioperl-l] Bio::WebAgent sleep warning
In-Reply-To: <B6C9C0EE-00DD-4B16-98D9-2BBBE6CD835E@gmx.net>
References: <453387DD.3040105@sendu.me.uk>
	<B6C9C0EE-00DD-4B16-98D9-2BBBE6CD835E@gmx.net>
Message-ID: <4533BDD1.8060204@sendu.me.uk>

Hilmar Lapp wrote:
> It depends. What triggers the sleeping? If it's part of every request 
> that it processes then I'd agree. If it is triggered by failure to 
> precede the next try then the failure is probably not expected (though 
> possible), and hence should be reported by warn().
> 
> If it is just part of the polling cycle then there should probably be a 
> limit up to which the time waited is considered 'normal' and after which 
> it is considered 'excessive' and hence should be reported through warn().

=head2 sleep

  Title   : sleep
  Usage   : $self->sleep
  Function: sleep for a number of seconds indicated by the delay policy
  Returns : none
  Args    : none

NOTE: This method keeps track of the last time it was called and only
imposes a sleep if it was called more recently than the delay_policy()
allows.

=cut

It issues a warning every time it actually sleeps. I find it 
inappropriate that a method warns me that it did what I asked it to do.
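
To make the options concrete, here is a minimal sketch of a delay policy that reports routine sleeps through debug() and only uses warn() once the wait exceeds a threshold, along the lines Hilmar suggested. This is not the Bio::WebAgent source: the package name Sketch::Agent, the warn_after setting, and the injectable clock and sleeper are all assumptions made so the example runs on its own without really sleeping.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an agent with a delay policy.
package Sketch::Agent;

sub new {
    my ($class, %args) = @_;
    return bless {
        delay      => 3,                           # seconds required between calls
        warn_after => 10,                          # longer waits count as "excessive"
        verbose    => 0,                           # true enables debug messages
        clock      => sub { time },                # replaceable for testing
        sleeper    => sub { CORE::sleep($_[0]) },  # replaceable for testing
        messages   => [],                          # messages are collected, not printed
        %args,
    }, $class;
}

# Debug messages are dropped unless verbose is set; warnings are always kept.
sub debug {
    my ($self, $msg) = @_;
    push @{ $self->{messages} }, "DEBUG: $msg" if $self->{verbose};
    return;
}

sub warn {
    my ($self, $msg) = @_;
    push @{ $self->{messages} }, "WARN: $msg";
    return;
}

# Sleep only if called again sooner than the delay allows.
# Returns the number of seconds slept (0 if no sleep was needed).
sub sleep {
    my ($self) = @_;
    my $now  = $self->{clock}->();
    my $last = $self->{last_call};
    $self->{last_call} = $now;
    return 0 unless defined $last;

    my $wait = $self->{delay} - ($now - $last);
    return 0 if $wait <= 0;

    my $msg = "sleeping for $wait seconds";
    if ($wait > $self->{warn_after}) { $self->warn($msg) }
    else                             { $self->debug($msg) }

    $self->{sleeper}->($wait);
    $self->{last_call} = $now + $wait;   # the moment the sleep ends
    return $wait;
}

package main;

# Fake clock and no-op sleeper, so the demonstration finishes instantly
my $t     = 100;
my $agent = Sketch::Agent->new(verbose => 1, clock => sub { $t }, sleeper => sub { });
$agent->sleep;    # first call: nothing to wait for
$t = 101;
$agent->sleep;    # one second later: two seconds short of the delay
print "$_\n" for @{ $agent->{messages} };
```

With verbose left at 0 the routine sleep produces no output at all, which is the behaviour I am arguing for.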


From arareko at campus.iztacala.unam.mx  Mon Oct 16 17:14:06 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Mon, 16 Oct 2006 12:14:06 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533B5A6.1070701@sheffield.ac.uk>
References: <4533968D.6040009@sheffield.ac.uk>	<4533A257.2000207@campus.iztacala.unam.mx>	<4533A588.9020505@sheffield.ac.uk>
	<4533AFEF.8080103@sendu.me.uk> <4533B5A6.1070701@sheffield.ac.uk>
Message-ID: <4533BDDE.2040801@campus.iztacala.unam.mx>

Nathan Haigh wrote:
> Sendu Bala wrote:
>> Nathan Haigh wrote:
>>> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
>>> someone to pop it in http://bioperl/DIST
>>>
>>> Volunteers?
>> I'm sure Mauricio would be happy to do it, but so am I. You may want
>> to hold off a little while until I release rc2, which may be a few
>> hours away.
> 
> Just e-mailed Mauricio links to the files off list, It's not a big job
> for me to remake the bioperl PPD, so Mauricio it's up to you if you want
> to wait 18hrs for me to make the PPDs for 1.5.2-rc2.

Too late, I've already placed 1.5.2-rc1 in DIST. hehe :)

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From bix at sendu.me.uk  Mon Oct 16 16:32:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 17:32:11 +0100
Subject: [Bioperl-l] Swissprot problems
Message-ID: <4533B40B.2030908@sendu.me.uk>

t/Biofetch.t and t/DB.t are skipping their swissprot database fetches.
Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for 
maintenance but is now back up. However I'm guessing the databases must 
have changed. I've manually looked for the test case 'YNB3_YEAST' in 
database 'UniProtKB' and it came back with no result, even though I can 
find the test case manually at the expasy website.

Is this an EBI bug or deliberate change that makes sense to someone?


From m.weimer at dkfz-heidelberg.de  Mon Oct 16 16:43:38 2006
From: m.weimer at dkfz-heidelberg.de (Marc Weimer)
Date: Mon, 16 Oct 2006 18:43:38 +0200
Subject: [Bioperl-l] Bio::DB::SwissProt Problem
Message-ID: <1161017019.5203.6.camel@localhost>

Dear list members,

when running 

######################################################################
#! /usr/bin/perl -w

use strict;
use Bio::DB::SwissProt;

my $db_obj = new Bio::DB::SwissProt(-verbose => 1);

my $seq_obj = $db_obj->get_Seq_by_acc("O02938");
######################################################################

using Bioperl 1.5.2 I get the following message:

##########################################################################################

request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch
Content-Length: 49
Content-Type: application/x-www-form-urlencoded

format=swissprot&db=UniProtKB&style=raw&id=O02938


------------- EXCEPTION: Bio::Root::Exception -------------
MSG: acc O02938 does not exist
STACK: Error::throw
STACK:
Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350
STACK:
Bio::DB::WebDBSeqI::get_Seq_by_acc /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181
STACK: ./get.test.pl:8
-----------------------------------------------------------

##########################################################################################

But the accession number does exist. Surprisingly, everything worked
fine a few days ago. Any ideas of what might have happened?

Thanks and best regards,

Marc

 
From hlapp at gmx.net  Mon Oct 16 17:15:50 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 16 Oct 2006 13:15:50 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533A8D3.90709@sendu.me.uk>
References: <4533A8D3.90709@sendu.me.uk>
Message-ID: <C53CD6F5-6E0D-49B7-B280-A51C3A2B991F@gmx.net>

The problem is it is not maintained, and there are outstanding bug
reports.

If you un-deprecate it, then we need a response to people who come
across problems with it when using it. Either you change the POD to
say exactly who should use it and when (or rather not) and point
to the fact that it is unsupported for all other cases.

Or what would you suggest?

	-hilmar

On Oct 16, 2006, at 11:44 AM, Sendu Bala wrote:

> I think Chris recently deprecated this, but should it be? For me, its
> POD description justifies its existence, and perhaps more importantly,
> Bio::Index::Blast relies on it.
>
> I took a quick peek at the latter and it didn't seem trivial to  
> move it
> over to Bio::SearchIO instead.
>
> Should it be undeprecated?

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Oct 16 17:21:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:21:46 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533A8D3.90709@sendu.me.uk>
Message-ID: <000001c6f147$8efdfd60$15327e82@pyrimidine>

Bio::Tools::BPlite was placed on the deprecation list a while back (~ rel
1.5); the other related Bio::Tools::BP* modules were also supposed to be on
that list as well.  

If we want to undeprecate (de-deprecate? reprecate?) BPlite we also would
need to do the same for the others.  They must be updated to parse current
BLAST/PSI-BLAST/bl2seq text output, something that Bio::SearchIO::blast is
currently capable of (so the functionality is redundant).  And someone needs
to take them over.

In my opinion it may be more trouble than it's worth as they haven't been
touched in a while.    Seems if we 'revive' BPlite we're not really moving
forward esp. since you have added the PullParser recently and made
substantial improvements to SearchIO.  

Maybe Bio::Index::Blast just needs to be deprecated or rewritten to use
SearchIO?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Monday, October 16, 2006 10:44 AM
> To: bioperl-l
> Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
> 
> I think Chris recently deprecated this, but should it be? For me, its
> POD description justifies its existence, and perhaps more importantly,
> Bio::Index::Blast relies on it.
> 
> I took a quick peek at the latter and it didn't seem trivial to move it
> over to Bio::SearchIO instead.
> 
> Should it be undeprecated?


From bix at sendu.me.uk  Mon Oct 16 17:21:58 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 18:21:58 +0100
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C53CD6F5-6E0D-49B7-B280-A51C3A2B991F@gmx.net>
References: <4533A8D3.90709@sendu.me.uk>
	<C53CD6F5-6E0D-49B7-B280-A51C3A2B991F@gmx.net>
Message-ID: <4533BFB6.5070504@sendu.me.uk>

Hilmar Lapp wrote:
> The problem is it is not maintained, and there are outstanding bug
> reports.
> 
> If you un-deprecate it, then we need a response to people who come 
> across problems with it when using it. Either you change the POD to say 
> exactly who and when one should use it (or rather not) and point to the 
> fact that it is unsupported for all other cases.
> 
> Or what would you suggest?

I'm not sure.

Does Bio::Index::Blast even work correctly? Does it suffer from whatever 
bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should 
that be deprecated as well?

Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO 
and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't 
seem trivial (or even appropriate).

Ultimately I just wanted to solve the warnings in the test suite. 
Thoughts, Chris?


From cjfields at uiuc.edu  Mon Oct 16 17:30:05 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:30:05 -0500
Subject: [Bioperl-l] Bioperl Server Reconfig
In-Reply-To: <4533A588.9020505@sheffield.ac.uk>
Message-ID: <000101c6f148$b8538b20$15327e82@pyrimidine>

> Mauricio Herrera Cuadra wrote:
> > Done. Could you please check if it works as it should?
> >
> > Cheers,
> > Mauricio.
> Will do as soon as I have created a Bioperl 1.5.2-rc1 PPD and got
> someone to pop it in http://bioperl/DIST
> 
> Volunteers?
> 
> BTW, was it agreed to have Bundle::BioPerl as a prereq of Bioperl for
> the PPD? I seem to remember that there was talk about having to maintain
> a separate Bundle::BioPerl for each release of Bioperl. Any ideas on
> this front?
> 
> Nath

Nathan,

I think Chris Dagdigian still maintains Bundle::Bioperl on CPAN.  That
version should be the common basis for prereqs for any Bioperl core
installation.  

It's relatively easy to add/remove modules to the Bundle::Bioperl.  Contact
Chris D. and let him know if anything needs to be changed.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 16 17:33:50 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:33:50 -0500
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
Message-ID: <000201c6f149$3ed63490$15327e82@pyrimidine>

> So it looks like an abstract base class, not an interface that
> defines a contract or API? Should use Root.pm then, would be my vote.
> 
> 	-hilmar

Makes sense to me.  Maybe another audit is needed to catch similar
instances, or has this been done already?
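
For anyone doing such an audit, the distinction can be sketched in plain Perl. The package names below (Sketch::RunnableI, Sketch::WrapperBase, Sketch::Wrapper::Foo) are hypothetical and do not correspond to any BioPerl module; the point is only that an interface names methods without implementing them, while an abstract base class ships working helpers but is never instantiated itself.

```perl
use strict;
use warnings;

# Hypothetical interface: declares what must exist, implements nothing.
package Sketch::RunnableI;

sub run {
    my $self = shift;
    die ref($self) . " must implement run()\n";
}

# Hypothetical abstract base class: no new(), but real shared helpers.
package Sketch::WrapperBase;

sub program_name {
    my $self = shift;
    die ref($self) . " must implement program_name()\n";
}

# Shared implementation that every subclass inherits as-is
sub executable {
    my $self = shift;
    return join '/', $self->{dir}, $self->program_name;
}

# Concrete class: supplies the constructor and the missing pieces.
package Sketch::Wrapper::Foo;
our @ISA = ('Sketch::WrapperBase', 'Sketch::RunnableI');

sub new {
    my ($class, %args) = @_;
    return bless { dir => $args{dir} }, $class;
}

sub program_name { return 'foo' }

sub run {
    my $self = shift;
    return 'would run ' . $self->executable;
}

package main;

my $wrapper = Sketch::Wrapper::Foo->new(dir => '/usr/local/bin');
print $wrapper->run, "\n";
```

A module shaped like Sketch::WrapperBase carries behaviour, which is the argument for having it inherit from Root.pm and not from RootI.pm.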

Chris

> On Oct 16, 2006, at 4:08 AM, Sendu Bala wrote:
> 
> > Hilmar Lapp wrote:
> >> What does the POD (and the code) say about instantiating it?
> >
> > =head1 SYNOPSIS
> >
> >    # do not use this object directly, it provides the following
> > methods
> >    # for its subclasses
> >
> > ...
> >
> >
> > =head1 DESCRIPTION
> >
> > This is a basic module from which to build executable wrapper modules.
> > It has some basic methods to help when implementing new modules.
> >
> >
> > There is no new() method.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct 16 17:57:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 12:57:35 -0500
Subject: [Bioperl-l] Bio::Location::Split
In-Reply-To: <B9182BFF5B004245BABC12956EA6322E016B1ED7@huls5.nucleus.harvard.edu>
Message-ID: <000301c6f14c$8fb0e060$15327e82@pyrimidine>

> I recently came across bug 2101, where Bio::Location::Split::to_FTstring
> gives the incorrect order for multi-sublocation locations on the minus
> strand. That is, I found it by getting incorrect results, and then found
> it in Bugzilla and in the September archives.
>
> I'm converting CDS files from one format to another. E.g., I read an
> EMBL file with a chromosome and CDS features, and want to output the
> location in a FASTA header. If I do something like:
> 
> while (my $seq = $in->next_seq) {    # $in is a Bio::SeqIO object
>     foreach my $feat ($seq->get_SeqFeatures) {
>         print $feat->location->to_FTstring(), "\n";
>     }
> }
> 
> I get the wrong results for multi-exon CDSs on the -1 strand, as
> described in the bug report.
> 
> Is there a relatively easy way around this? I assume I can't get at the
> original string of the location, which in this case is all I need. Can I
> just flip the order of the exons in certain cases? Chris F, can you tell
> me the preliminary solution you mentioned?
> 
> I must say I'm sort of surprised this wasn't found before. It seems like
> a not-that-rare occurrence. Oh well.
> 
> Thanks,
> 
> - Amir Karger
> Research Computing
> Life Sciences Division
> Harvard University

Could you let me know specifically which EMBL file contains the odd
locations?  The bug report uses theoretical locations, not actual ones, so
it would be nice to have a real-world example to test against.  

As for the lack of catching this, the particular types of locations that
cause the issue are quite rare.  Note that there are two bugs for that bug
report.  The first (and more serious) is still unresolved.  The second
(where remote locations are treated differently in Location::Split, which
caused more problems than it was worth) had a fix committed about a month
ago.  

Any fixes I have made for the first bug invariably break several other
methods, which use the current Location::Split object logic for retrieving
sequences, building feature strings, etc.  Since a new RC is imminent and
the bug only affects a small number of locations, I have held off until
after a final release is made (the last thing I want to do is fix something
that breaks ~6-8 other methods), but I'll try looking at it again this week.
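
In the meantime, one possible stopgap for simple cases is to build the string by hand from the exon coordinates. The sketch below does not touch Bio::Location::Split and is not the eventual fix; minus_strand_ftstring is a made-up helper, and it assumes every sublocation is an exact start..end range on the minus strand of the same sequence (no fuzzy or remote locations). In a real script the coordinates would presumably be collected from the sublocations, for example via each_Location, which is not shown here. It relies on the feature table convention that complement(join(a..b,c..d)) with ascending ranges is equivalent to join(complement(c..d),complement(a..b)).

```perl
use strict;
use warnings;

# Build a feature-table location string for exons on the minus strand.
# Each argument is an array ref [start, end]; input order does not matter.
# Hypothetical helper for illustration only.
sub minus_strand_ftstring {
    my @exons = sort { $a->[0] <=> $b->[0] } @_;    # ascending by start

    # A single range needs no join()
    return sprintf('complement(%d..%d)', @{ $exons[0] }) if @exons == 1;

    my $ranges = join ',', map { join '..', @$_ } @exons;
    return "complement(join($ranges))";
}

# Exons given in transcription order (high coordinates first)
print minus_strand_ftstring([4918, 5163], [2691, 4571]), "\n";
```

Sorting inside the helper means the caller does not need to know in which order the sublocations were stored.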


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct 16 18:29:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 13:29:02 -0500
Subject: [Bioperl-l] Swissprot problems
In-Reply-To: <4533B40B.2030908@sendu.me.uk>
Message-ID: <000401c6f150$f57dfc30$15327e82@pyrimidine>


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Sendu Bala
> Sent: Monday, October 16, 2006 11:32 AM
> To: bioperl-l
> Subject: [Bioperl-l] Swissprot problems
> 
> t/Biofetch.t and t/DB.t are skipping their swissprot database fetches.
> Earlier today http://www.ebi.ac.uk/cgi-bin/dbfetch was down for
> maintenance but is now back up. However I'm guessing the databases must
> have changed. I've manually looked for the test case 'YNB3_YEAST' in
> database 'UniProtKB' and it came back with no result, even though I can
> find the test case manually at the expasy website.
> 
> Is this an EBI bug or deliberate change that makes sense to someone?

I can confirm that.  It's not our end, though.  Entering the same data on
the DBFetch web page also gets no data.  I have emailed EBI about the
problem and will let you know if I hear anything; I think it's related to
the maintenance issue.

Notably, nothing on the web page indicates any database name changes yet.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 16 18:29:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 13:29:52 -0500
Subject: [Bioperl-l] Bio::DB::SwissProt Problem
In-Reply-To: <1161017019.5203.6.camel@localhost>
Message-ID: <000501c6f151$12918710$15327e82@pyrimidine>

We think there is a problem on the SwissProt (DBFetch) server.  I have
contacted them about the problem and will post something when I hear
something back.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Marc Weimer
> Sent: Monday, October 16, 2006 11:44 AM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Bio::DB::SwissProt Problem
> 
> Dear list members,
> 
> when running
> 
> ######################################################################
> #! /usr/bin/perl -w
> 
> use strict;
> use Bio::DB::SwissProt;
> 
> my $db_obj = new Bio::DB::SwissProt(-verbose => 1);
> 
> my $seq_obj = $db_obj->get_Seq_by_acc("O02938");
> ######################################################################
> 
> using Bioperl 1.5.2 I get the following message:
> 
> ##########################################################################
> ################
> 
> request is POST http://www.ebi.ac.uk/cgi-bin/dbfetch
> Content-Length: 49
> Content-Type: application/x-www-form-urlencoded
> 
> format=swissprot&db=UniProtKB&style=raw&id=O02938
> 
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: acc O02938 does not exist
> STACK: Error::throw
> STACK:
> Bio::Root::Root::throw /usr/local/share/perl/5.8.7/Bio/Root/Root.pm:350
> STACK:
> Bio::DB::WebDBSeqI::get_Seq_by_acc
> /usr/local/share/perl/5.8.7/Bio/DB/WebDBSeqI.pm:181
> STACK: ./get.test.pl:8
> -----------------------------------------------------------
> 
> ##########################################################################
> ################
> 
> But the accession number does exist. Surprisingly, everything worked
> fine a few days ago. Any ideas of what might have happened?
> 
> Thanks and best regards,
> 
> Marc
> 
> 
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Mon Oct 16 18:39:28 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 13:39:28 -0500
Subject: [Bioperl-l] SwissProt Down
Message-ID: <000601c6f152$6997dbd0$15327e82@pyrimidine>

Looks like the swissprot problem stems from maintenance at EBI.  From the
EBI page http://www.ebi.ac.uk/Information/ (not on the DBFetch page, BTW):

Please Note: Monday October 16th 12:00-15:00 -  Due to general maintenance,
some services from the EBI may be temporarily unavailable. We apologise for
any inconvenience.

At least we know that Test::More skips are working!

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From bix at sendu.me.uk  Mon Oct 16 18:51:31 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 16 Oct 2006 19:51:31 +0100
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C15946CA.ACA9%bosborne11@verizon.net>
References: <C15946CA.ACA9%bosborne11@verizon.net>
Message-ID: <4533D4B3.2000809@sendu.me.uk>

Brian Osborne wrote:
> Sendu,
> 
> I just made a commit that makes Bio::Index::Blast use SearchIO instead of
> BPlite.

I was concerned about the whole id_parser thing. Did you determine that 
your change still allows for id_parser to be used and have the intended 
effect, or that id_parser is in some way meaningless and should be 
removed as a method?


From cjfields at uiuc.edu  Mon Oct 16 19:03:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 14:03:33 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533BFB6.5070504@sendu.me.uk>
Message-ID: <000301c6f155$c7029ff0$15327e82@pyrimidine>

> Hilmar Lapp wrote:
> > The problem is it is not maintained, and there are outstanding bug
> > reports.
> >
> > If you un-deprecate it, then we need a response to people who come
> > across problems with it when using it. Either you change the POD to say
> > exactly who and when one should use it (or rather not) and point to the
> > fact that it is unsupported for all other cases.
> >
> > Or what would you suggest?
> 
> I'm not sure.
> 
> Does Bio::Index::Blast even work correctly? Does it suffer from whatever
> bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should
> that be deprecated as well?
> 
> Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO
> and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't
> seem trivial (or even appropriate).
> 
> Ultimately I just wanted to solve the warnings in the test suite.
> Thoughts, Chris?

My opinion is we either have to completely support BPlite (and the others)
or drop it altogether.  I don't think we can state "use BPlite only with
Bio::Index::Blast, use SearchIO everywhere else."  That's too inconsistent.


It seems simpler to deprecate the various Bio::Tools::BP* classes and either
fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working
on) or deprecate Bio::Index::Blast as well.  

The warnings in the test suite belong to BlastIndex.t, correct?  I updated
using Brian's Bio::Index::Blast fix and it passes now w/o warnings.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From akarger at CGR.Harvard.edu  Mon Oct 16 19:00:28 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Mon, 16 Oct 2006 15:00:28 -0400
Subject: [Bioperl-l] Bio::Location::Split
Message-ID: <B9182BFF5B004245BABC12956EA6322E016B1F89@huls5.nucleus.harvard.edu>

 
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu] 
> >
> > I'm converting CDS files from one format to another. E.g., I read an
> > EMBL file with a chromosome and CDS features, and want to output the
> > location in a FASTA header.
> > 
> > I get the wrong results for multi-exon CDSs on the -1 strand, as
> > described in the bug report.
> > 
>
> Could you let me know specifically which EMBL file contains the odd
> locations?  The bug report uses theoretical locations, not 
> actual ones, so
> it would be nice to have a real-world example to test against. 

I downloaded Candida glabrata chromosome B from EBI:
http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948

testportal>perl location.pl new_glabrata_B.embl > bio
testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/' new_glabrata_B.embl > nonbio
testportal>wc bio nonbio
 217  217 4537 bio
 217  217 4549 nonbio
 434  434 9086 total
testportal>diff bio nonbio
4c4
< complement(join(10632..11157,10347..10372))
---
> join(complement(10632..11157),complement(10347..10372))

Just one example here, but see below.
 
> As for the lack of catching this, the particular types of 
> locations that
> cause the issue are quite rare.  

Really? I guess our definitions of rare depend on which sequences we're
working with. I'm doing fungal genomes, and here's a grep for a few
species' entire genomes:

testportal>foreach i ( *.embl )
foreach? echo $i
foreach? grep CDS $i | grep join | grep -c complement
foreach? end
glabrata_orf.embl
29
hansenii_orf.embl
151
lactis_orf.embl
70
lipolytica_orf.embl
337
pombe_orf.embl
1137

You might like to use pombe as a test case, as it has lots of these
complement joins, including ones with multiple introns.
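
As a rough illustration of the count above, the same filter can be written
in plain Perl. This is only a sketch: the function name
count_complement_joins is made up here, and the sample lines are invented in
EMBL feature-table style. Like the grep pipeline, it only looks for lines
that contain CDS, join and complement; it does not parse the feature table.

```perl
use strict;
use warnings;

# Hypothetical helper: count lines that mention CDS, join and complement,
# mirroring `grep CDS | grep join | grep -c complement`.
sub count_complement_joins {
    my @lines = @_;
    return scalar grep { /CDS/ && /join/ && /complement/ } @lines;
}

# Sample lines invented for illustration only.
my @lines = (
    "FT   CDS             join(complement(10632..11157),complement(10347..10372))",
    "FT   CDS             complement(2691..4571)",
    "FT   CDS             join(100..200,300..400)",
);
print count_complement_joins(@lines), "\n";
```

Only the first sample line contains all three words, so only it is counted.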

Anyway, I'd question the "rare" designation. It seems to me like any
species that has introns will have situations like this in their CDSs.
Not to mention any other sequence that uses Bio::Location::Split. (Since
I'm not a Real Biologist, I can't think up more examples here, but I'm
sure they exist.)

Or are you saying it's rare to use join(complement(C..D),
complement(A..B)) instead of complement(join(A..B, C..D))? In that case,
I guess I just got really unlucky in that five fungal genomes I was
using decided to use the "rare" syntax. 

> Note that there are two bugs 
> for that bug
> report.  The first (and more serious) is still unresolved.  The second
> (where remote locations are treated differently in 
> Location::Split, which
> caused more problems than it was worth) had a fix committed 
> about a month
> ago.  

Sadly, it's the first (and in my case, more common (I have no remote
locations.)) bug for me.

> Any fixes I have made for the first bug invariably break several other
> methods, which use the current Location::Split object logic 
> for retrieving
> sequences, building feature strings, etc.  Since a new RC is 
> imminent and
> the bug only affects a small number of locations, I have held 
> off until
> after a final release is made (the last thing I want to do is 
> fix something
> that breaks ~6-8 other methods), but I'll try looking at it 
> again this week.

IMO this is a pretty serious bug (if these kinds of sequences aren't
that rare as I've shown above), because you're outputting sequence
descriptions that are just plain wrong. Anyone who uses
FTLocationFactory to read these output descriptions will have incorrect
sequence, incorrect translated proteins, etc. And it's even more serious
if other methods are depending on it.

I know I can't dictate your time, and should be volunteering to work on
fixing it. But if it affects other modules, then I will no doubt break
things even more than you have in your attempts.  

-Amir

> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 


From bosborne11 at verizon.net  Mon Oct 16 18:25:14 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 14:25:14 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533A8D3.90709@sendu.me.uk>
Message-ID: <C15946CA.ACA9%bosborne11@verizon.net>

Sendu,

I just made a commit that makes Bio::Index::Blast use SearchIO instead of
BPlite. The BlastIndex.t test is giving a few warnings so I need to take a
look at that but all tests are passing.

An awful lot of work has gone into the SearchIO system; for more on why its
approach is deemed to be superior in the context of Bioperl, see the SearchIO
HOWTO. One key feature of this upcoming release is an emphasis on removing
extraneous modules, and I think it's safe to say that BPlite has been considered
extraneous for a number of years now.

Brian O.


On 10/16/06 11:44 AM, "Sendu Bala" <bix at sendu.me.uk> wrote:

> I think Chris recently deprecated this, but should it be? For me, its
> POD description justifies its existence, and perhaps more importantly,
> Bio::Index::Blast relies on it.
> 
> I took a quick peek at the latter and it didn't seem trivial to move it
> over to Bio::SearchIO instead.
> 
> Should it be undeprecated?


From bosborne11 at verizon.net  Mon Oct 16 18:59:38 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 14:59:38 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <4533D4B3.2000809@sendu.me.uk>
Message-ID: <C1594EDA.ACB9%bosborne11@verizon.net>

Sendu,

OK. I _think_ this change shouldn't affect id_parser() but I will test this
in BlastIndex.t. The id_parser() method is relevant to all these Index*
modules - don't know how much it's used but it certainly is nice to have it
available.

Brian O.


On 10/16/06 2:51 PM, "Sendu Bala" <bix at sendu.me.uk> wrote:

> Brian Osborne wrote:
>> Sendu,
>> 
>> I just made a commit that makes Bio::Index::Blast use SearchIO instead of
>> BPlite.
> 
> I was concerned about the whole id_parser thing. Did you determine that
> your change still allows for id_parser to be used and have the intended
> effect, or that id_parser is in some way meaningless and should be
> removed as a method?


From cjfields at uiuc.edu  Mon Oct 16 20:51:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 15:51:08 -0500
Subject: [Bioperl-l] Bio::Location::Split
In-Reply-To: <B9182BFF5B004245BABC12956EA6322E016B1F89@huls5.nucleus.harvard.edu>
Message-ID: <000001c6f164$d1380190$15327e82@pyrimidine>

...
> I downloaded candida glabrata chromosome B from EBI:
> http://www.ebi.ac.uk/genomes/eukaryota.html, CR380948
> 
> testportal>perl location.pl new_glabrata_B.embl > bio
> testportal>perl -wlne 'print $1 if /^FT\s+CDS\s+(.*)/'
> new_glabrata_B.embl > nonbio
> testportal>wc bio nonbio
>  217  217 4537 bio
>  217  217 4549 nonbio
>  434  434 9086 total
> testportal>diff bio nonbio
> 4c4
> < complement(join(10632..11157,10347..10372))
> ---
> > join(complement(10632..11157),complement(10347..10372))
> 
> Just one example here, but see below.
> 
> > As for the lack of catching this, the particular types of
> > locations that
> > cause the issue are quite rare.
> 
> Really? I guess our definitions of rare depend on which sequences we're
> working with. I'm doing fungal genomes, and here's a grep for a few
> species' entire genomes:
> 
> testportal>foreach i ( *.embl )
> foreach? echo $i
> foreach? grep CDS $i | grep join | grep -c complement
> foreach? end
> glabrata_orf.embl
> 29
> hansenii_orf.embl
> 151
> lactis_orf.embl
> 70
> lipolytica_orf.embl
> 337
> pombe_orf.embl
> 1137
> 
> You might like to use pombe as a test case, as it has lots of these
> complement joins, including ones with multiple introns.

I'll use those.  I'll see if an analogous GenBank file exists as well.  

I can probably make a preliminary fix for to_FTstring() so that it arranges
the sublocations correctly, but I think the best way to go is to have
FTLocationFactory not modify the various sublocations to start with, which
it currently does when it sets strand() (strand() propagates the strand info
to sublocations). 

> Anyway, I'd question the "rare" designation. It seems to me like any
> species that has introns will have situations like this in their CDSs.
> Not to mention any other sequence that uses Bio::Location::Split. (Since
> I'm not a Real Biologist, I can't think up more examples here, but I'm
> sure they exist.)

I think that additional tests are definitely needed for pulling out
sequences.  

What I mean by 'rare' is that the majority of sequences do not have
problems.  Also, this seems to be a 'silent' bug since the error shows up in
to_FTstring() but the object sublocations seem to be processed correctly when
using the location object directly (such as via SeqFeatureI).  

Round-tripping the sequence should pick it up though, since
complement(join(10632..11157,10347..10372)) is not the same as
join(complement(10632..11157),complement(10347..10372)).  

That is essentially what you are doing, correct? i.e. getting the sequences
using Bioperl, saving them (which passes them through SeqIO), reading them
again (back through SeqIO with the malformed location string).

> Or are you saying it's rare to use join (complement(C..D),
> complement(A..B)) instead of complement(join(A..B, C..D)). In that case,
> I guess I just got really unlucky in that five fungal genomes I was
> using decided to use the "rare" syntax.

Location::Split is supposed to handle all variations, but apparently it
isn't.  

> > Note that there are two bugs
> > for that bug
> > report.  The first (and more serious) is still unresolved.  The second
> > (where remote locations are treated differently in
> > Location::Split, which
> > caused more problems than it was worth) had a fix committed
> > about a month
> > ago.
> 
> Sadly, it's the first (and in my case, more common (I have no remote
> locations.)) bug for me.
> 
> > Any fixes I have made for the first bug invariably break several other
> > methods, which use the current Location::Split object logic
> > for retrieving
> > sequences, building feature strings, etc.  Since a new RC is
> > imminent and
> > the bug only affects a small number of locations, I have held
> > off until
> > after a final release is made (the last thing I want to do is
> > fix something
> > that breaks ~6-8 other methods), but I'll try looking at it
> > again this week.
> 
> IMO this is a pretty serious bug (if these kinds of sequences aren't
> that rare as I've shown above), because you're outputting sequence
> descriptions that are just plain wrong. Anyone who uses
> FTLocationFactory to read these output description will have incorrect
> sequence, incorrect translated proteins, etc. And it's even more serious
> if other methods are depending on it.
> 
> I know I can't dictate your time, and should be volunteering to work on
> fixing it. But if it affects other modules, then I will no doubt break
> things even more than you have in your attempts.
> 
> -Amir

I'll give it a look over the next week.  Like I mentioned above, I may be
able to fix it in Split::to_FTstring() w/o breaking other tests (in which
case I'll commit it for the 1.5.2 release), but it would be a temporary hack
until I can work out why other tests are failing.

Chris


From jason at bioperl.org  Mon Oct 16 22:45:21 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Oct 2006 15:45:21 -0700
Subject: [Bioperl-l] split location problems
Message-ID: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>

The whole point of split locations is to represent genes with introns  
so that is not the "rare" case.

I'm confused where the problem is.  The locations that I get out with  
to_FTstring on the location object are exactly the same as those input.

I have processed the genbank fungal genomes into GFF3 and have had no  
problems so I'm confused where you are breaking down.  If I write  
them out as embl I also get the correct thing.  This is using the CVS  
version of bioperl from the HEAD.

I've added code to test this to bug 2101 including a C.glabrata  
chromosome downloaded from genbank.  Perhaps the problem is on the  
EMBL parsing side, I didn't test that.

On the technical side, I still am not sure I fully know where the  
strand information should be stored - the top level container or the  
sub-features.  I'll try and stay up on the discussion if anything has  
been decided that I should know about.

-jason


From torsten.seemann at infotech.monash.edu.au  Mon Oct 16 22:23:23 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Tue, 17 Oct 2006 08:23:23 +1000
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <000201c6f149$3ed63490$15327e82@pyrimidine>
References: <000201c6f149$3ed63490$15327e82@pyrimidine>
Message-ID: <4534065B.9020309@infotech.monash.edu.au>

Chris Fields wrote:
>> So it looks like an abstract base class, not an interface that
>> defines a contract or API? Should use Root.pm then, would be my vote.
>> 	-hilmar
> 
> Makes sense to me.  Maybe another audit is needed to catch similar
> instances, or has this been done already?

The purpose of my original (poorly phrased) question was to try and sort 
out where Root and RootI were being used the wrong way around.

I'm currently "all-audited out" so I leave this task to another volunteer.

-- 
Dr Torsten Seemann               http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia


From cjfields at uiuc.edu  Tue Oct 17 01:07:55 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 20:07:55 -0500
Subject: [Bioperl-l] split location problems
In-Reply-To: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
Message-ID: <BE15BACB-6076-4F27-82FF-3DFE10FFD1C0@uiuc.edu>


On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote:

> The whole point of split locations is to represent genes with  
> introns so that is not the "rare" case.
>
> I'm confused where the problem is.  The locations that I get out  
> with to_FTstring on the location object are exactly the same as  
> those input.

The problem is with a subset of split locations described in the  
bug report.  The following works:

complement(join(2691..4571,4918..5163))

whereas this:

join(complement(4918..5163),complement(2691..4571))

gives this:

complement(join(4918..5163,2691..4571))

which is not syntactically the same.  It should be:

complement(join(2691..4571,4918..5163))

since 'join' implies that the order of the segments to be joined is  
important ('order' and 'bond' do not, I guess).
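
To make the expected behaviour concrete, here is a minimal sketch in plain
Perl of the rewrite described above. It does not use Bio::Location::Split or
any other Bioperl API. The function name consolidate_complement_join is
hypothetical, and it only handles the simple case where every member of the
join is a complemented start..end range; anything else is returned unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): rewrite
#   join(complement(a..b),complement(c..d),...)
# as complement(join(...)), reversing the segment order so that both
# strings describe the same joined sequence.
sub consolidate_complement_join {
    my ($loc) = @_;

    # Only a top-level join(...) is considered.
    return $loc unless $loc =~ /^join\((.*)\)$/;

    my @ranges;
    for my $part (split /,/, $1) {
        # Give up unless every member is complement(start..end).
        return $loc unless $part =~ /^complement\((\d+\.\.\d+)\)$/;
        push @ranges, $1;
    }

    return 'complement(join(' . join(',', reverse @ranges) . '))';
}

print consolidate_complement_join(
    'join(complement(4918..5163),complement(2691..4571))'
), "\n";
```

For the location above this prints complement(join(2691..4571,4918..5163)),
i.e. the segments end up in ascending order inside the complement.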

> I have processed the genbank fungal genomes into GFF3 and have had  
> no problems so I'm confused where you are breaking down.  If I  
> write them out as embl I also get the correct thing.  This is using  
> the CVS version of bioperl from the HEAD.
>
> I've added code to test this to bug 2101 including a C.glabrata  
> chromosome downloaded from genbank.  Perhaps the problem is on the  
> EMBL parsing side, I didn't test that.
>
> On the technical side, I still am not sure I fully know where the  
> strand information should be stored - the top level container or  
> the sub-features.  I'll try and stay up on the discussion if  
> anything has been decided that I should know about.
>
> -jason

Split::strand() sets the sublocations as well, which seems to confuse  
the situation more but it is consistent with LocationI, as Hilmar  
points out.  I'm looking into a few solutions now, including a fix in  
Split::to_FTstring().

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Tue Oct 17 02:48:14 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 16 Oct 2006 19:48:14 -0700
Subject: [Bioperl-l] split location problems
In-Reply-To: <BE15BACB-6076-4F27-82FF-3DFE10FFD1C0@uiuc.edu>
References: <0B6ADEF3-D6C8-4348-BFE2-A9769F2DF29D@bioperl.org>
	<BE15BACB-6076-4F27-82FF-3DFE10FFD1C0@uiuc.edu>
Message-ID: <8273f6c20610161948w201537a5v2fcfa189eb809283@mail.gmail.com>

This probably was exposed by the fact that the Split object used to
explicitly sort the features by start*strand always.  But with remote
locations and needing to be able to explicitly set the order (for features
that are not required to be 5' -> 3') that code must have been removed.   I
think there is just one place that must be missing a 'reverse' on the list
of sub-locations when the top-level feature is a complement.  I'll wait for
your fix before wading in - we might want to figure out a
'consolidate' method to shrink redundant and equivalent representations to
the shortest possible form. Ugh this really starts to resemble trying to
write a boolean logic toolkit....
-jason
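
A rough sketch of the idea, under stated assumptions: plain hashes with
start, end and strand keys stand in for sub-locations (they are not Bioperl
objects), and both function names are invented for illustration. Sorting by
start*strand puts minus-strand segments in descending order, so a 'reverse'
is needed before wrapping them in complement(join(...)).

```perl
use strict;
use warnings;

# Sort sub-locations by start*strand: ascending start on the plus strand,
# descending start on the minus strand. Call in list context.
sub sort_by_start_strand {
    my @subs = @_;
    return sort { $a->{start} * $a->{strand} <=> $b->{start} * $b->{strand} } @subs;
}

# Render minus-strand segments as complement(join(...)). The segments are
# expected in descending start order, so they are reversed to ascending
# order inside the complement() wrapper.
sub minus_strand_ft_string {
    my @subs = @_;
    my @ranges = map { "$_->{start}..$_->{end}" } reverse @subs;
    return 'complement(join(' . join(',', @ranges) . '))';
}

my @exons = (
    { start => 2691, end => 4571, strand => -1 },
    { start => 4918, end => 5163, strand => -1 },
);
my @ordered = sort_by_start_strand(@exons);
print minus_strand_ft_string(@ordered), "\n";
```

Leaving out the 'reverse' in this sketch would give
complement(join(4918..5163,2691..4571)), which is the malformed string under
discussion.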

On 10/16/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
>
> On Oct 16, 2006, at 5:45 PM, Jason Stajich wrote:
>
> > The whole point of split locations is to represent genes with
> > introns so that is not the "rare" case.
> >
> > I'm confused where the problem is.  The locations that I get out
> > with to_FTstring on the location object are exactly the same as
> > those input.
>
> The problem is with a subset of split locations described in the
> bug report.  The following works:
>
> complement(join(2691..4571,4918..5163))
>
> whereas this:
>
> join(complement(4918..5163),complement(2691..4571))
>
> gives this:
>
> complement(join(4918..5163,2691..4571))
>
> which is not syntactically the same.  It should be:
>
> complement(join(2691..4571,4918..5163))
>
> since 'join' implies that the order of the segments to be joined is
> important ('order' and 'bond' do not, I guess).
>
> > I have processed the genbank fungal genomes into GFF3 and have had
> > no problems so I'm confused where you are breaking down.  If I
> > write them out as embl I also get the correct thing.  This is using
> > the CVS version of bioperl from the HEAD.
> >
> > I've added code to test this to bug 2101 including a C.glabrata
> > chromosome downloaded from genbank.  Perhaps the problem is on the
> > EMBL parsing side, I didn't test that.
> >
> > On the technical side, I still am not sure I fully know where the
> > strand information should be stored - the top level container or
> > the sub-features.  I'll try and stay up on the discussion if
> > anything has been decided that I should know about.
> >
> > -jason
>
> Split::strand() sets the sublocations as well, which seems to confuse
> the situation more but it is consistent with LocationI, as Hilmar
> points out.  I'm looking into a few solutions now, including a fix in
> Split::to_FTstring().
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


-- 
Jason Stajich
jason at bioperl.org
http://www.duke.edu/~jes12/


From cjfields at uiuc.edu  Tue Oct 17 03:34:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 16 Oct 2006 22:34:25 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C159C54B.ACD5%bosborne11@verizon.net>
References: <C159C54B.ACD5%bosborne11@verizon.net>
Message-ID: <AE334107-1639-468E-ABA8-2F992693809A@uiuc.edu>


On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote:

> Chris and Sendu,
>
> Sendu was correct in wondering whether id_parser() in Blast.pm  
> would work
> after the module was altered to use SearchIO but what I've found  
> out from my
> local tests is that id_parser() didn't work when BPlite was being used
> either. I can continue to work on this but it's safe to say that  
> removing
> BPlite doesn't cause a problem with id_parser, it was already there.
>
> Brian O.

....

It may be one reason (the main reason?) the method wasn't tested.   
Maybe it should be removed if it can't be easily fixed; I don't think  
it makes sense keeping it otherwise.

Chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bosborne11 at verizon.net  Tue Oct 17 03:24:59 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 23:24:59 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <000301c6f155$c7029ff0$15327e82@pyrimidine>
Message-ID: <C159C54B.ACD5%bosborne11@verizon.net>

Chris and Sendu,

Sendu was correct in wondering whether id_parser() in Blast.pm would work
after the module was altered to use SearchIO but what I've found out from my
local tests is that id_parser() didn't work when BPlite was being used
either. I can continue to work on this but it's safe to say that removing
BPlite doesn't cause a problem with id_parser; the problem was already there.

Brian O.


On 10/16/06 3:03 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:

>> Hilmar Lapp wrote:
>>> The problem is it is not maintained, and there are outstanding bug
>>> reports.
>>> 
>>> If you un-deprecate it, then we need a response to people who come
>>> across problems with it when using it. Either you change the POD to say
>>> exactly who and when one should use it (or rather not) and point to the
>>> fact that it is unsupported for all other cases.
>>> 
>>> Or what would you suggest?
>> 
>> I'm not sure.
>> 
>> Does Bio::Index::Blast even work correctly? Does it suffer from whatever
>> bugs Bio::Tools::BPlite has? Is Bio::Index::Blast maintained? Should
>> that be deprecated as well?
>> 
>> Ideally I'd like to see Bio::Index::Blast updated to use Bio::SearchIO
>> and then Bio::Tools::BPlite can be deprecated. But like I say, it didn't
>> seem trivial (or even appropriate).
>> 
>> Ultimately I just wanted to solve the warnings in the test suite.
>> Thoughts, Chris?
> 
> My opinion is we either have to completely support BPlite (and the others)
> or drop it altogether.  I don't think we can state "use BPLite only with
> Bio::Index::Blast, use SearchIO everywhere else."  That's too inconsistent.
> 
> 
> It seems simpler to deprecate the various Bio::Tools::BP* classes and either
> fix Bio::Index::Blast to use Bio::SearchIO (which I think Brian is working
> on) or deprecate Bio::Index::Blast as well.
> 
> The warnings in the test suite belong to BlastIndex.t, correct?  I updated
> using Brian's Bio::Index::blast fix and it passes now w/o warnings.
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 


From bosborne11 at verizon.net  Tue Oct 17 03:48:56 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Mon, 16 Oct 2006 23:48:56 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <AE334107-1639-468E-ABA8-2F992693809A@uiuc.edu>
Message-ID: <C159CAE8.ACD9%bosborne11@verizon.net>

Chris,

OK. In fact there's no written guarantee that all Bio::Index* modules have
an id_parser() method. It happens that most do, and it's useful. I'll fix
the documentation in Bio::Index::Blast and add an enhancement request to
Bugzilla; I may be able to get around to it before the 1.5.2 release, but no promises.

Brian O.


On 10/16/06 11:34 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> 
> On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote:
> 
>> Chris and Sendu,
>> 
>> Sendu was correct in wondering whether id_parser() in Blast.pm
>> would work
>> after the module was altered to use SearchIO but what I've found
>> out from my
>> local tests is that id_parser() didn't work when BPlite was being used
>> either. I can continue to work on this but it's safe to say that
>> removing
>> BPlite doesn't cause a problem with id_parser, it was already there.
>> 
>> Brian O.
> 
> ....
> 
> It may be one reason (the main reason?) the method wasn't tested.
> Maybe it should be removed if it can't be easily fixed; I don't think
> it makes sense keeping it otherwise.
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 


From n.haigh at sheffield.ac.uk  Tue Oct 17 06:35:43 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 07:35:43 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
Message-ID: <453479BF.90408@sheffield.ac.uk>

I'm a bit unclear as to what is happening with these files.

Are these files now superseded by the wikified versions? If so, should 
these files now just simply contain a link to the wikified versions - 
otherwise things could get in a mess since I updated the wiki version of 
INSTALL.WIN and I see Sendu updated the cvs versions a couple of weeks 
ago - hopefully these differences aren't that big.

Nath


From faruque at ebi.ac.uk  Tue Oct 17 08:19:44 2006
From: faruque at ebi.ac.uk (Nadeem Faruque)
Date: Tue, 17 Oct 2006 09:19:44 +0100
Subject: [Bioperl-l] split location problems
Message-ID: <F2A2DB48-8EDF-43AA-AFCF-45B48AF43B1C@ebi.ac.uk>

EMBL currently outputs join-complements in the format
join(complement(30..40),complement(10..20))
instead of the Genbank preferred
complement(join(10..20,30..40))

EMBL's may reflect what happens in the cell a little more than  
Genbank's, but it is less readable and less concise.
NB I've also seen a couple of people construct these incorrectly
eg join(complement(10..20),complement(30..40))

I believe we are moving to the complement-join format but I can't  
give a date for the transition.

Having said that, trans-splicing will still give us the joys of  
complex locations,
eg
join(1..5,complement(join(10..20,30..40)))
complement(join(30..40,10..20)) <- looks wrong (unless it is a very  
small circle) but mis-ordered exons are resolved by the trans- 
splicing machinery.

Nadeem


--
S.M. Nadeem N. Faruque
EMBL Nucleotide Database Curation Team
EMBL Outstation
Tel: +44 1223 494611                     Fax: +44 1223 494472
The European Bioinformatics Institute    URL: http://www.ebi.ac.uk/
Email for data submissions: datasubs at ebi.ac.uk
Email for updates: update at ebi.ac.uk
========================================================


From bix at sendu.me.uk  Tue Oct 17 08:59:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 09:59:36 +0100
Subject: [Bioperl-l] Use of Root.pm versus RootI.pm
In-Reply-To: <1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
References: <4521E74E.1040404@infotech.monash.edu.au>	<452F54A1.7010908@sendu.me.uk>	<5F9A3E79-9BB4-4ED9-BCAA-4E083ACCF74E@gmx.net>	<45333E02.9070808@sendu.me.uk>
	<1B34E96B-EAA8-4B72-9FDC-E7DA5B663BD8@gmx.net>
Message-ID: <45349B78.8090905@sendu.me.uk>

Hilmar Lapp wrote:
> So it looks like an abstract base class, not an interface that  
> defines a contract or API? Should use Root.pm then, would be my vote.

Agreed, that was actually what I did in my local copy when I made a new 
inheriting class (so discovering the problem). This change is harmless 
to other modules, but does mean they'll have redundant use of 
Bio::Root::Root which will want cleaning up at some stage.


From bix at sendu.me.uk  Tue Oct 17 10:32:54 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 11:32:54 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
Message-ID: <4534B156.4090501@sendu.me.uk>

Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
See http://www.bioperl.org/wiki/Release_1.5.2 for
instructions on getting and testing this RC.

Developers:
   This should be the last RC before release ~next Monday. Now would
   be a good time for last minute documentation updates and additions.

Users:
   Even though 1.5.2 is a 'developer' release, we consider it the most
   stable and capable version of Bioperl, and recommend that you use
   it in all but the most critical production environments. Please
   try it out and let us know of any problems or difficulties you run
   into.


Thank you,
Sendu.


From cjfields at uiuc.edu  Tue Oct 17 11:16:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 06:16:47 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <453479BF.90408@sheffield.ac.uk>
References: <453479BF.90408@sheffield.ac.uk>
Message-ID: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>

The general consensus was to keep text versions available; we could  
add URL links to the wiki pages for the most up-to-date version.  BTW,  
I have modified INSTALL already.  INSTALL.WIN is next in line (I was  
waiting for your changes).

Chris

On Oct 17, 2006, at 1:35 AM, Nathan S. Haigh wrote:

> I'm a bit unclear as to what is happening with these files.
>
> Are these files now superseded by the wikified versions? If so, should
> these files now just simply contain a link to the wikified versions -
> otherwise things could get in a mess since I updated the wiki  
> version of
> INSTALL.WIN and I see Sendu updated the cvs versions a couple of weeks
> ago - hopefully these differences aren't that big.
>
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Tue Oct 17 11:45:45 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 12:45:45 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>
References: <453479BF.90408@sheffield.ac.uk>
	<72C437E7-8F69-416D-996B-FF3DD1498E78@uiuc.edu>
Message-ID: <4534C269.5050704@sheffield.ac.uk>

Chris Fields wrote:
> The general consensus was to keep text versions available; we could 
> add URL links to the wiki pages for the most up-to-date version.  BTW, 
> I have modified INSTALL already.  INSTALL.WIN is next in line (I was 
> waiting for your changes).
>
Is it possible to generate these files from the wiki whenever there is a 
release? I know edits shouldn't be too severe or too frequent - but I can 
see things getting a little messy/annoying if edits have to be made in 2 
places.

Nath


From cjfields at uiuc.edu  Tue Oct 17 14:04:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:04:32 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534C269.5050704@sheffield.ac.uk>
Message-ID: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>

There isn't a very easy way since so many links have to be removed/modified.
I have found a few CPAN modules that could help, but for now I just dump the
text output from a text browser (elinks) using the 'printable version' page
and hand-edit, which works very quickly.  That works for the time being
until I can find another more automated solution.
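
As a rough sketch of how part of that hand-editing might be automated,
the snippet below cleans a text-browser dump. It assumes the dump marks
links inline as [n] and ends with a "References" section listing the
URLs; that layout is an assumption about the browser output, and
clean_dump() is a hypothetical helper, not an existing tool.

```perl
use strict;
use warnings;

# Clean a plain-text browser dump: drop the trailing "References"
# link list and the inline [n] link markers.
sub clean_dump {
    my ($text) = @_;
    $text =~ s/^References\s*\n.*\z//ms;   # cut from the References heading to the end
    $text =~ s/\[\d+\]//g;                 # remove inline link markers such as [12]
    $text =~ s/[ \t]+$//mg;                # trim trailing whitespace on every line
    return $text;
}

# Made-up sample in the assumed dump layout.
my $dump = <<'END';
   [1]INSTALL

   See the [2]BioPerl wiki for details.

References

   Visible links
   1. http://www.bioperl.org/wiki/Installing_BioPerl
   2. http://www.bioperl.org/
END

print clean_dump($dump);
```

The first substitution cuts at the first line that starts with
"References", so a page with a section of that name in its body would
still need checking by hand.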

Fortunately there have been very few edits to either INSTALL wiki page so
they should remain relatively stable.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
> Sent: Tuesday, October 17, 2006 6:46 AM
> To: Chris Fields
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
> 
> Chris Fields wrote:
> > The general consensus was to keep text versions available; we could
> > add URL links to the wiki pages for the most up-to-date version.  BTW,
> > I have modified INSTALL already.  INSTALL.WIN is next in line (I was
> > waiting for your changes).
> >
> Is it possible to generate these files from the wiki whenever there is a
> release? I know edits shouldn't be too severe or too often - but I can
> see things getting a little messy/annoying if edits have to be made in 2
> places.
> 
> Nath


From cjfields at uiuc.edu  Tue Oct 17 14:12:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:12:09 -0500
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <C159CAE8.ACD9%bosborne11@verizon.net>
Message-ID: <000401c6f1f6$424b5580$15327e82@pyrimidine>


> Chris,
> 
> OK. In fact there's no written guarantee that all Bio::Index* modules have
> an id_parser() method. It happens that most do, and it's useful. I'll fix
> the documentation in Bio::Index::Blast and add an enhancement request to
> Bugzilla, may be able to get around to before 1.5.2 release but no
> promises.
> 
> Brian O.

Do the various Bio::Index* modules share a common interface?  

I wouldn't worry too much about it for this release, unless you really have
time.  It is still, after all, a developer's release, and you've noted it in
Bugzilla.  We could try for another dev release in winter (rel 1.5.3, I
guess) to get any bug fixes or new modules added.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


> On 10/16/06 11:34 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:
> 
> >
> > On Oct 16, 2006, at 10:24 PM, Brian Osborne wrote:
> >
> >> Chris and Sendu,
> >>
> >> Sendu was correct in wondering whether id_parser() in Blast.pm
> >> would work
> >> after the module was altered to use SearchIO but what I've found
> >> out from my
> >> local tests is that id_parser() didn't work when BPlite was being used
> >> either. I can continue to work on this but it's safe to say that
> >> removing
> >> BPlite doesn't cause a problem with id_parser, it was already there.
> >>
> >> Brian O.
> >
> > ....
> >
> > It may be one reason (the main reason?) the method wasn't tested.
> > Maybe it should be removed if it can't be easily fixed; I don't think
> > it makes sense keeping it otherwise.
> >
> > Chris
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From n.haigh at sheffield.ac.uk  Tue Oct 17 14:15:17 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 15:15:17 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
Message-ID: <4534E575.5050308@sheffield.ac.uk>

Chris Fields wrote:
> There isn't a very easy way since so many links have to be removed/modified.
> I have found a few CPAN modules that could help, but for now I just dump the
> text output from a text browser (elinks) using the 'printable version' page
> and hand-edit, which works very quickly.  That works for the time being
> until I can find another more automated solution.
>
> Fortunately there have been very few edits to either INSTALL wiki page so
> they should remain relatively stable.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>   
So am I correct in saying that the best way is to make all updates to 
the wikified versions of these files, and then at regular 
intervals/major releases you (or someone else) will update the CVS 
version of the files in the way described above?

Cheers
Nath


From bix at sendu.me.uk  Tue Oct 17 14:00:39 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 15:00:39 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534E09C.9030707@genomics.dk>
References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk>
Message-ID: <4534E207.8030508@sendu.me.uk>

Niels Larsen wrote:
> Greetings,
> 
> I am no perl beginner, but I am a BioPerl beginner. Today I looked
> for remote similarity services that can be used from Perl. I found
> the EBI SOAP interface where their example script returns
> 
> Can't find method element in the message at 
> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

What script exactly? There was a problem with the SOAP server that was 
fixed earlier today.


> and the DDBJ service which (from Denmark) returns
> 
> undef

What returned undef? Specifics please.


> and then the NCBI server accessed through BioPerls RemoteBlast which
> seems to spin in a loop that fills TMPDIR with many tempfiles. Will
> release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall
> is working towards that).

What version of Bioperl were you testing with? What did you do to get it 
to 'spin in a loop'? I can tell you that remote blasting certainly works 
in Bioperl 1.5.2, but you'll have to give more details on the things you 
tried and the problems you encountered.

You can also answer the questions yourself by trying the release candidate.


From B.Beckert at ibmc.u-strasbg.fr  Tue Oct 17 13:59:30 2006
From: B.Beckert at ibmc.u-strasbg.fr (Bertrand Beckert)
Date: Tue, 17 Oct 2006 15:59:30 +0200
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
Message-ID: <CE7A1962-05A0-46CD-8832-BC9C5A8C7656@ibmc.u-strasbg.fr>


hi,

I am running a large number of BLAST searches via a connection to the
NCBI BLAST page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi').
I am trying to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I
have some problems. I made a simple example with only one sequence in
order to understand how this module works. This is my simple input
file, a DNA sequence in FASTA format:

> test
>
TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA
I have made some modifications to the example available in the Bioperl
documentation.
It gives me a RID which contains the results of my blast, but I have a
problem with "$result=$factory->retrieve_blast($rid)" in my script.
The documentation says that $result=$factory->retrieve_blast($rid)
returns, when it works, a Bio::Tools::Bplite or Bio::Tools::Blast
object. In my case it returns a Bio::SearchIO::blast... I don't
understand why I don't get the right type of object back (see PART I).

I also tried to resolve the problem by replacing the foreach loop in my
script with a new one in order to explore the blast result page, but it
doesn't work either (see PART II).

Could you help me please? Thank you.

Bertrand Beckert.

PART I:

Here is my script with a few annotations, followed by the shell window
output:
------------------------------------------------------------------------

----------------------------
#!/usr/bin/perl -w
use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;
sub blast {
    my $prog='blastn';
    my $db='refseq_genomic';
    my $e_val='1e-10';
    my $Input='Seq.fasta';
    my @params = ('-prog' => $prog, '-data' => $db, '-expect' => $e_val,
                  '-readmethod' => 'SearchIO');
    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
    #changes parameters
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
    $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';
    $factory->submit_blast($Input);
    print STDERR "waiting...\n";
    while (my @rids=$factory->each_rid) {
        print "my rid: ", @rids, "\n";
        #returns the ID of the submitted blast, i.e.
        #RID: 1161079157-766-185099855365.BLASTQ2
        #this page contains the result of my blast...
        foreach my $rid (@rids) {
            $result=$factory->retrieve_blast($rid);
            #line to show what type of object is returned by retrieve_blast
            print "rc:", $result,"\n";
        }
    }
}

&blast;
------------------------------------------------------------------------

----------------------------

here you can see the shell window:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...

PARTII:

I also tried to resolve the problem by replacing the foreach loop in my
script with:
------------------------------------------------------------------------

----------------------------
foreach my $rid (@rids) {
    while(1) {
        $result=$factory->retrieve_blast($rid)->next_result();
        print "rc:", $result,"\n";
        if ($result) {
            print $result->num_hits(),"\n";
        }
    }
}
------------------------------------------------------------------------

----------------------------
With this loop I can explore the Blast result page. This is what I
obtain in the shell window:
		
bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834)


----
-- 
Bertrand BECKERT
PhD student
IBMC - UPR 9002 du CNRS - ARN
15, rue Rene Descartes
F-67084 STRASBOURG Cedex
b.beckert at ibmc.u-strasbg.fr


From niels at genomics.dk  Tue Oct 17 13:54:36 2006
From: niels at genomics.dk (Niels Larsen)
Date: Tue, 17 Oct 2006 15:54:36 +0200
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4534E09C.9030707@genomics.dk>

Greetings,

I am no perl beginner, but I am a BioPerl beginner. Today I looked
for remote similarity services that can be used from Perl. I found
the EBI SOAP interface where their example script returns

Can't find method element in the message at 
/ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

and the DDBJ service which (from Denmark) returns

undef

and then the NCBI server accessed through BioPerls RemoteBlast which
seems to spin in a loop that fills TMPDIR with many tempfiles. Will
release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall
is working towards that).

Niels L


------------------------------------------------------------------------

Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark

Telephone: +45-8942-5268
Telefax: +45-8620-1222

------------------------------------------------------------------------


From cjfields at uiuc.edu  Tue Oct 17 14:28:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:28:40 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534E575.5050308@sheffield.ac.uk>
Message-ID: <000501c6f1f8$8b78efe0$15327e82@pyrimidine>

...
> So am I correct in saying that the best way is to make all updates to
> the wikified versions of these files, and then at regular
> intervals/major releases you (or someone else) will update the CVS
> version of the files in the way described above?
> 
> Cheers
> Nath

Yes.  I think the online docs will stay relatively stable.  A week or so ago
Mauricio and I were discussing moving the dependencies list to its own CVS
document (since they pertain to all Bioperl installations, not just UNIX'y
flavors).  I haven't done that yet since I was waiting on the INSTALL.WIN
changes before I made any more changes.  Well, that and I've been really
busy doing other things.

One way we could make sure that changes to the online docs would match the
CVS docs would be to only allow certain wiki users (such as sysadmins) to make
modifications to those pages.  That way any changes would have to go through
someone who also has CVS access and could make similar changes to the
distribution docs.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From n.haigh at sheffield.ac.uk  Tue Oct 17 14:37:38 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 15:37:38 +0100
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <000501c6f1f8$8b78efe0$15327e82@pyrimidine>
References: <000501c6f1f8$8b78efe0$15327e82@pyrimidine>
Message-ID: <4534EAB2.50609@sheffield.ac.uk>

Chris Fields wrote:
> ...
>   
>> So am I correct in saying that the best way is to make all updates to
>> the wikified versions of these files, and then at regular
>> intervals/major releases you (or someone else) will update the CVS
>> version of the files in the way described above?
>>
>> Cheers
>> Nath
>>     
>
> Yes.  I think the online docs will stay relatively stable.  A week or so ago
> Mauricio and I were discussing moving the dependencies list to its own CVS
> document (since they pertain to all Bioperl installations, not just UNIX'y
> flavors).  I haven't done that yet since I was waiting on the INSTALL.WIN
> changes before I made any more changes.  Well, that and I've been really
> busy doing other things.
>   
Sounds good.
> One way we could make sure that changes to the online docs would match the
> CVS docs would be to only allow certain wiki users (such as sysadmins) make
> modifications to those pages.  That way any changes would have to go through
> someone who also has CVS access and could make similar changes to the
> distribution docs.
>   
Ugh, not sure I like the sound of maintaining 2 copies of any files - 
sounds like a future headache even if they are pretty stable. It also 
makes it unclear which of the two files should be considered first (i.e. 
is the most up-to-date) on pages such as:
http://www.bioperl.org/wiki/Installing_BioPerl

It suggests that INSTALL and INSTALL.WIN should be looked at first, but 
there are online copies of those files available - this should now be 
the other way around - shouldn't it? I might just be making a mountain 
out of a molehill, so I'll shut up on this topic and make any future 
edits to the wiki pages instead.
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>
>   


From bosborne11 at verizon.net  Tue Oct 17 14:48:54 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 17 Oct 2006 10:48:54 -0400
Subject: [Bioperl-l] Should Bio::Tools::BPlite be deprecated?
In-Reply-To: <000401c6f1f6$424b5580$15327e82@pyrimidine>
Message-ID: <C15A6596.AD0B%bosborne11@verizon.net>

Chris,

The Bio::Index modules either 'use base qw(Bio::Index::Abstract)' or 'use
base qw(Bio::Index::AbstractSeq)'. Neither of these modules has an
id_parser() method.
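
Since the method is not guaranteed by the base classes, calling code may
want to check for it first. The sketch below uses two hypothetical
stand-in classes (My::Index::WithParser and My::Index::WithoutParser)
and a hypothetical helper set_id_parser(); none of these are Bioperl
names, and the only real mechanism shown is Perl's built-in can().

```perl
use strict;
use warnings;

# Hypothetical stand-ins for two index classes: only one of them
# provides id_parser().
package My::Index::WithParser;
sub new { my $class = shift; return bless {}, $class }
sub id_parser {
    my ($self, $code) = @_;
    $self->{parser} = $code if $code;
    return $self->{parser};
}

package My::Index::WithoutParser;
sub new { my $class = shift; return bless {}, $class }

package main;

# Install a custom ID parser only if the index object supports one.
sub set_id_parser {
    my ($index, $code) = @_;
    return 0 unless $index->can('id_parser');
    $index->id_parser($code);
    return 1;
}

# Example parser: take the first word of a header line as the ID.
my $first_word = sub { my ($line) = @_; return $line =~ /^>?(\S+)/ ? $1 : undef };

for my $class ('My::Index::WithParser', 'My::Index::WithoutParser') {
    my $ok = set_id_parser($class->new, $first_word);
    print "$class: ", ($ok ? "id_parser set" : "no id_parser method"), "\n";
}
```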

Brian O.


On 10/17/06 10:12 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> Do the various Bio::Index* modules share a common interface?  


From cjfields at uiuc.edu  Tue Oct 17 14:45:53 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 09:45:53 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <4534EAB2.50609@sheffield.ac.uk>
Message-ID: <000601c6f1fa$f260b560$15327e82@pyrimidine>

...
> > One way we could make sure that changes to the online docs would match
> the
> > CVS docs would be to only allow certain wiki users (such as sysadmins)
> make
> > modifications to those pages.  That way any changes would have to go
> through
> > someone who also has CVS access and could make similar changes to the
> > distribution docs.
> >
> Ugh, not sure I like the sound of maintaining 2 copies of any files -
> sounds like a future headache even if they are pretty stable. It also
> makes it unclear which of the two files should be considered first (i.e.
> is the most up-to-date) on pages such as:
> http://www.bioperl.org/wiki/Installing_BioPerl
> 
> It suggests that INSTALL and INSTALL.WIN should be looked at first, but
> there are online copies of those files available - this should now be
> the other way around - shouldn't it? I might just be making a mountain
> out of a molehill, so I'll shut up on this topic and make any future
> edits to the wiki pages instead.

Yes that should be the other way around (the wiki would be the most
up-to-date), so the CVS docs should point to the wiki, not vice-versa.

Getting the docs right is as important as getting the code to work.  So I
don't consider it a 'mountain-out-of-a-molehill' problem.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Tue Oct 17 15:07:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 10:07:49 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534E207.8030508@sendu.me.uk>
Message-ID: <001001c6f1fe$02fd4de0$15327e82@pyrimidine>

> Niels Larsen wrote:
> > Greetings,
> >
> > I am no perl beginner, but I am a BioPerl beginner. Today I looked
> > for remote similarity services that can be used from Perl. I found
> > the EBI SOAP interface where their example script returns
> >
> > Can't find method element in the message at
> > /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.
> 
> What script exactly? There was a problem with the SOAP server that was
> fixed earlier today.
> 
> 
> > and the DDBJ service which (from Denmark) returns
> >
> > undef
> 
> What returned undef? Specifics please.
> 

The first problem, like Sendu mentions, was fixed on the remote server (I
get them to pass now).  Those were from bioperl-run, though, not the bioperl
core distribution.

As for DDBJ, do you mean EBI or SwissProt?  I ask b/c you mention Denmark.
EBI were having server maintenance outages yesterday, which was announced
here.

As Sendu mentions, please be more specific.

> > and then the NCBI server accessed through BioPerls RemoteBlast which
> > seems to spin in a loop that fills TMPDIR with many tempfiles. Will
> > release 1.5.2 include improvements to RemoteBlast? (I see Roger Hall
> > is working towards that).
> 
> What version of Bioperl were you testing with? What did you do to get it
> to 'spin in a loop'? I can tell you that remote blasting certainly works
> in Bioperl 1.5.2, but you'll have to give more details on the things you
> tried and the problems you encountered.
> 
> You can also answer the questions yourself by trying the release
> candidate.

The tempfiles showing up are from the repeated RID requests and are deleted
after the BLAST run (at least they should be); this is quite normal.  They
don't 'spin in a loop' unless the BLAST query is taking a particularly long
time, which can happen depending on how the BLAST query is set up, i.e. what
type of BLAST program is requested, if comp-based stats are requested,
length of query, database requested, etc.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Oct 17 15:14:07 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 16:14:07 +0100
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
In-Reply-To: <CE7A1962-05A0-46CD-8832-BC9C5A8C7656@ibmc.u-strasbg.fr>
References: <CE7A1962-05A0-46CD-8832-BC9C5A8C7656@ibmc.u-strasbg.fr>
Message-ID: <4534F33F.3070809@sendu.me.uk>

Bertrand Beckert wrote:
> hi,
> 
> I am running a large number of blasts via a connexion to ncbi blast
> page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi').
> I try to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have
> some problems.
[snip]
> In the documentation it wrote that $result=$factory->retrieve_blast
> ($rid) return when it work a Bio::Tools::Bplite or Bio::Tools::Blast
> object. In my case it returns a Bio::SearchIO::blast... I don't
> understand why I don't have the good type of object return (see PART I).

I take it you're using some old version of Bioperl where unfortunately 
the documentation was incorrect. In fact you're supposed to get a 
Bio::SearchIO object, so it is a good thing that you are. The latest 
version of Bioperl has (as far as I can see) correct documentation and 
behaviour.

Bio::Tools::Bplite and Bio::Tools::Blast are deprecated. You want 
Bio::SearchIO::blast. All is well.


> I also try to resolve the problem by replace the foreach loop in my
> script by a new one in order to explore the blast page result but it
> also don't work (see part II).

I'm not really sure what problem you might be facing there, but take a 
look at some up-to-date documentation, using the new example code:

http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html
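
For orientation, the polling pattern used in that documentation can be
sketched as below. Neither loop in the original script calls
remove_rid(), so each_rid() keeps returning the same RID. To keep the
sketch runnable offline, Mock::Factory is a hypothetical stand-in for
the remote factory and collect_reports() is a made-up helper name; only
the method names each_rid, retrieve_blast and remove_rid are taken from
RemoteBlast, and the mock's behaviour (a non-reference return while a
job is pending, an object once it is done) is an assumption modelled on
the documented example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the remote factory: each RID reports
# "still running" (0) twice, then yields a report object.
package Mock::Factory;
sub new {
    my ($class, @rids) = @_;
    my %polls = map { ($_ => 0) } @rids;
    return bless { polls => \%polls }, $class;
}
sub each_rid   { my $self = shift; return sort keys %{ $self->{polls} } }
sub remove_rid { my ($self, $rid) = @_; delete $self->{polls}{$rid}; return }
sub retrieve_blast {
    my ($self, $rid) = @_;
    return 0 if $self->{polls}{$rid}++ < 2;    # not ready yet
    return bless { rid => $rid }, 'Mock::Report';
}

package main;

# Poll until every RID has either produced a report or failed.
sub collect_reports {
    my ($factory, $delay) = @_;
    my @reports;
    while ( my @rids = $factory->each_rid ) {
        foreach my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( !ref $rc ) {
                # 0 means "still running"; a negative value means failure
                $factory->remove_rid($rid) if $rc < 0;
                sleep $delay if $delay;
            }
            else {
                push @reports, $rc;
                $factory->remove_rid($rid);    # without this the while loop never ends
            }
        }
    }
    return @reports;
}

my @reports = collect_reports( Mock::Factory->new('RID-A', 'RID-B'), 0 );
print scalar(@reports), " reports retrieved\n";
```

With a real factory a delay of a few seconds between polls would be
appropriate, and next_result() would be called on each returned report
only after the ref() check has passed.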


From n.haigh at sheffield.ac.uk  Tue Oct 17 16:10:15 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 17 Oct 2006 17:10:15 +0100
Subject: [Bioperl-l] [Fwd: Re: Bundle::BioPerl]
Message-ID: <45350067.6070604@sheffield.ac.uk>

FYI on Bundle::BioPerl

Nathan

-------- Original Message --------
Subject: 	Re: Bundle::BioPerl
Date: 	Tue, 17 Oct 2006 11:52:00 -0400
From: 	Chris Dagdigian <dag at sonsorol.org>
To: 	Nathan S. Haigh <n.haigh at sheffield.ac.uk>
References: 	<45348FB8.4050009 at sheffield.ac.uk>


Hi Nathan,

I've updated the Bundle and uploaded it to CPAN.

I *think* the rationale for keeping it still exists but I'm removed  
enough from Bioperl now that I'll defer to others on the decision.

The basic idea was that BioPerl has a heck of a lot of dependencies
(other perl modules) that it requires in order to provide all of its
functionality. Many of these dependencies may not be present in default
Perl installations.  Tracking down all of the dependencies and
installing them (along with all of the
dependencies-of-the-dependencies) by hand is a massive pain.

The nice thing about the Bundle is that it lists the core module  
dependencies and it works great with the CPAN.pm module to automate  
the downloading and installation of everything that BioPerl requires.  
The CPAN module is smart enough that when processing *our* bundle it  
will also track down and install anything that our bundle entries  
themselves list as a dependency.

So for Unix/Linux systems the Bundle is a great one-liner way
("perl -MCPAN -e 'install Bundle::BioPerl'") to auto-install or update
the many perl modules that BioPerl makes use of.

On the windows side, not sure if it is of any help though.

Regards,
Chris


On Oct 17, 2006, at 4:09 AM, Nathan S. Haigh wrote:

> Hi Chris
>
> I've been working on making a PPD for the upcoming Bioperl 1.5.2  
> release. During this time I also updated Bundle::BioPerl to include  
> up-to-date prereqs. I was wondering if you could update the CPAN  
> package? The updated BioPerl.pm file is attached.
>
> There is some talk about why and if we need Bundle::BioPerl  
> anymore. What was the rationale for having it in the first place,  
> and does it still hold true now?
>
> Cheers
> Nath
>


From plu5even at gmail.com  Tue Oct 17 16:26:34 2006
From: plu5even at gmail.com (Peter H. Baenziger)
Date: Tue, 17 Oct 2006 12:26:34 -0400
Subject: [Bioperl-l] LocatableSeq object vs Sequence Object
Message-ID: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com>

All,
This is my first bioperl script (but not my first Perl script) so
please forgive my naivety.  I've read through documentation and looked
through cookbooks and the like but to no avail.  Any advice is
appreciated.
 So...I am working with an alignment object of several sequences.  My
intention is to loop through all the sequences of the alignment to
find what amino acid they have at a known position in the alignment
(not the position in the sequence).  I was thinking I could use:
foreach $seq ($alignment->each_seq())
to loop through the sequences and call:
$seq->location_from_column($pos)
on each of the sequences.  However, I don't think I have
"LocatableSeq" objects (the type of object that has the method
"location_from_column") being returned by $alignment->each_seq().
So, how do I bridge this gap here?  Or is there a better way?
My appreciation in advance!
Peter

 code:
my $swissObj = $swissdb->get_Seq_by_acc($query);  # put several of these in @sequenceObjects
...
my $alignFactory = Bio::Tools::Run::Alignment::Clustalw->new();
    my $alignment = $alignFactory->align(\@sequenceObjects);
    #print $alignment->overall_percentage_identity(); #works

    # now we find the "alignment position" of the mutation we have on the
    # human version and get the amino acid at that "alignment position"
    # for all seq
    my $humanSequence = $prefix."HUMAN";
    # this is the "alignment position" equivalent to the mutation position
    my $pos = $alignment->column_from_residue_number($humanSequence, $aa_seqpos);

    # we'll keep track of what amino acid each species has at the
    # "alignment equivalent" location listed as being a mutation on the
    # human version
    foreach $seq ($alignment->each_seq())
    {
        # print $seq->species() . "\n"; # won't work because
        # $alignment->each_seq() actually returns a LocatableSeq object,
        # not a normal sequence object
        $speciesAA{$species} = $seq->location_from_column($pos);
    }
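
The underlying idea (map a residue number to an alignment column, then
read that column in every row) can be sketched in plain Perl on gapped
strings. The row IDs, the sequences, and the helpers
column_of_residue() and char_at_column() are hypothetical; the sketch
assumes the gapped string of each aligned row is available, for example
from the objects returned by each_seq().

```perl
use strict;
use warnings;

# Column (1-based) that holds the Nth residue of a gapped alignment row.
sub column_of_residue {
    my ($row, $resnum) = @_;
    my $seen = 0;
    for my $col (1 .. length $row) {
        next if substr($row, $col - 1, 1) =~ /[-.]/;    # skip gap characters
        return $col if ++$seen == $resnum;
    }
    return;    # residue number is beyond the end of this row
}

# Character (residue or gap) found at a given alignment column.
sub char_at_column {
    my ($row, $col) = @_;
    return substr($row, $col - 1, 1);
}

# Hypothetical aligned rows keyed by display ID.
my %aligned = (
    PROT_HUMAN => 'MK-TAYIAK',
    PROT_MOUSE => 'MKQTA-IAR',
    PROT_YEAST => 'M--SAYLAK',
);

# Column of the 5th residue of the human row, then that column in every row.
my $col = column_of_residue($aligned{PROT_HUMAN}, 5);
my %residue_at = map { ($_ => char_at_column($aligned{$_}, $col)) } keys %aligned;
print "column $col: $_ => $residue_at{$_}\n" for sort keys %residue_at;
```

A row may have a gap at the chosen column, as the mouse row does here,
so the calling code has to decide how to record that case.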


-- 
<<->>
Peter H. Baenziger


From akarger at CGR.Harvard.edu  Tue Oct 17 16:53:19 2006
From: akarger at CGR.Harvard.edu (Amir Karger)
Date: Tue, 17 Oct 2006 12:53:19 -0400
Subject: [Bioperl-l] split location problems
Message-ID: <B9182BFF5B004245BABC12956EA6322E018E6735@huls5.nucleus.harvard.edu>

> From: Jason Stajich [mailto:jason.stajich at gmail.com]
> 
> The whole point of split locations is to represent genes with 
> introns  
> so that is not the "rare" case.

Absolutely.

> I have processed the genbank fungal genomes into GFF3 and 
> have had no  
> problems so I'm confused where you are breaking down.  If I write  
> them out as embl I also get the correct thing.  This is using 
> the CVS  
> version of bioperl from the HEAD.
> 
> I've added code to test this to bug 2101 including a C.glabrata  
> chromsome downloaded from genbank.  Perhaps the problem is on the  
> EMBL parsing side, I didn't test that.

Well, I don't know whether it's EMBL parsing, or a bit further down the
pipe, but I downloaded C.glabrata chromosome B from GenBank (NC_005968),
and it describes the complement/joins in the way that Bioperl is
handling correctly.

GenBank:
     CDS             complement(join(10347..10372,10632..11157))
                     /locus_tag="CAGL0B00242g"

EMBL:
FT   CDS             join(complement(10632..11157),complement(10347..10372))
FT                   /locus_tag="CAGL0B00242g"

Here's the diff when I run the location-printing script I posted
yesterday:

diff biogb bio
1c1,5
< complement(join(10347..10372,10632..11157))
---
> complement(1701..2651)
> complement(2635..3345)
> complement(3980..4408)
> complement(join(10632..11157,10347..10372))
> 10379..10615
209a214,217
> 498198..498890
> 499712..500062
> 499851..500702
> 500579..501364

As you can see, the complement/join CDS is written out in a different
order, which is Bad.
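
For reference, the GenBank and EMBL spellings above describe the same
feature: as far as I understand the INSDC feature table convention, a join
of complemented segments is equivalent to the complement of the join with
the segments listed in reverse order. A minimal sketch of that rewrite in
plain Perl (string handling only; normalize_location is a hypothetical
helper name, not a Bioperl method):

```perl
use strict;
use warnings;

# Rewrite join(complement(a..b),complement(c..d),...) as
# complement(join(...)) with the segments in reversed order.
# Any string that does not have exactly this shape is returned unchanged.
sub normalize_location {
    my ($loc) = @_;
    my ($inner) = $loc =~ /^join\((.+)\)$/;
    return $loc unless defined $inner;

    my @ranges;
    for my $part (split /,/, $inner) {
        my ($range) = $part =~ /^complement\((\d+\.\.\d+)\)$/;
        return $loc unless defined $range;    # mixed strands: leave alone
        push @ranges, $range;
    }
    return 'complement(join(' . join(',', reverse @ranges) . '))';
}

print normalize_location(
    'join(complement(10632..11157),complement(10347..10372))'), "\n";
```

Run on the EMBL location above, this should give the same string as the
GenBank entry, with 10347..10372 listed first.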

(I looked at at least one of the other differences: the GB file says
it's a "misc feature" and EMBL says it's a CDS. But they don't seem to
be relevant here.)

-Amir

> 
> On the technical side, I still am not sure I fully know where the  
> strand information should be stored - the top level container or the  
> sub-features.  I'll try and stay up on the discussion if 
> anything has  
> been decided that I should know about.
> 
> -jason


From paul.boutros at utoronto.ca  Tue Oct 17 16:57:19 2006
From: paul.boutros at utoronto.ca (Paul Boutros)
Date: Tue, 17 Oct 2006 12:57:19 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
Message-ID: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>

Hi,
Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed  
tests, the first seems to be just a result of me not having DBD::mysql  
installed.
Paul

Test Summary
============

Failed Test               Stat Wstat Total Fail  List of Failed
-------------------------------------------------------------------------------
t/BioDBSeqFeature_mysql.t               46   46  1-46
t/SearchIO.t                22  5632  1337 2671  2-1337
2 tests and 106 subtests skipped.
Failed 2/236 test scripts. 1382/11688 subtests failed.
Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =  
159.61 CPU)

BioDBSeqFeature_mysql
=====================
pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
1..46
install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC  
contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t  
/db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8  
/db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi  
/db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at  
(eval 37) line 3.
Perhaps the DBD::mysql perl module hasn't been fully installed,
or perhaps the capitalisation of 'mysql' isn't right.
Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
  at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
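
A failure like this one comes from a missing optional module rather than
from a bug, so a test script could check for the module first and skip
instead of failing. The sketch below is an assumption about how that
might look, not the code in t/BioDBSeqFeature_mysql.t; have_module is a
hypothetical helper.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded, 0 otherwise.
sub have_module {
    my ($module) = @_;
    # Only accept things that look like module names before the string eval.
    return 0 unless $module =~ /^\w+(?:::\w+)*$/;
    return eval "require $module; 1" ? 1 : 0;
}

# In a Test::More based script one might then write:
#   plan skip_all => 'DBD::mysql not installed'
#       unless have_module('DBD::mysql');
#   plan tests => 46;

print "DBD::mysql is ", ( have_module('DBD::mysql') ? '' : 'not ' ),
    "available\n";
```

With a skip like this, the harness would report the script as skipped
rather than as 46 failed subtests.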

SearchIO
========
pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
1..1337
ok 1

-------------------- WARNING ---------------------
MSG: XML::SAX::Expat not currently supported; must have local copies  
of NCBI DTD docs!
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: error in parsing a report:

404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'  
does not exist  
file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
Handler couldn't resolve external entity at line 2, column 82, byte 104
error in processing external entity reference at line 2, column 82,  
byte 104 at  
/db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line  
187

---------------------------------------------------
not ok 2
# Failed test 2 in t/SearchIO.t at line 68
Can't call method "database_name" on an undefined value at  
t/SearchIO.t line 69.

------------------------------

Message: 10
Date: Tue, 17 Oct 2006 11:32:54 +0100
From: Sendu Bala <bix at sendu.me.uk>
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
To: bioperl-l at bioperl.org
Message-ID: <4534B156.4090501 at sendu.me.uk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
See http://www.bioperl.org/wiki/Release_1.5.2 for
instructions on getting and testing this RC.

Developers:
    This should be the last RC before release ~next Monday. Now would
    be a good time for last minute documentation updates and additions.

Users:
    Even though 1.5.2 is a 'developer' release, we consider it the most
    stable and capable version of Bioperl, and recommend that you use
    it in all but the most critical production environments. Please
    try it out and let us know of any problems or difficulties you run
    into.


Thank you,
Sendu.


From barry.moore at genetics.utah.edu  Tue Oct 17 16:57:48 2006
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 17 Oct 2006 10:57:48 -0600
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
References: <000301c6f1f5$2d3509d0$15327e82@pyrimidine>
Message-ID: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>

lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix

does a reasonable job of textifying HTML.  You get the links as
numbered references at the bottom, or use:

lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |  
perl -ane 's/\[?\[\d+\](edit\])?//g;print'

to remove the links altogether.
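
The same substitution as a small function, in case it helps to see what
the one-liner does to a line of lynx output. This is a sketch only:
strip_link_markers is a hypothetical name and the sample line is made up
to resemble a dump of a wiki page.

```perl
use strict;
use warnings;

# Remove lynx link markers such as "[12]" and whole "[[3]edit]" links.
sub strip_link_markers {
    my ($text) = @_;
    $text =~ s/\[?\[\d+\](?:edit\])?//g;
    return $text;
}

# Made-up sample resembling one line of 'lynx -dump' output.
my $line = '[1]Installing Bioperl [[2]edit]';
print strip_link_markers($line), "\n";
```

The optional leading bracket and the optional "edit]" tail are what make
a section edit link disappear completely instead of leaving "[edit]"
behind.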

Barry

P.S.  Looks like this:

    #Creative Commons copyright

Installing Bioperl for Unix

 From BioPerl

    Jump to: navigation, search

Contents

      * 1 BIOPERL INSTALLATION
      * 2 SYSTEM REQUIREMENTS
      * 3 OPTIONAL
      * 4 ADDITIONAL INSTALLATION INFORMATION
      * 5 THE BIOPERL BUNDLE
      * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN
      * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make'
      * 8 WHERE ARE THE MAN PAGES?
      * 9 EXTERNAL PROGRAMS
           + 9.1 Environment Variables
      * 10 INSTALLING BIOPERL SCRIPTS
      * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA
      * 12 INSTALLING BIOPERL MODULES THE HARD WAY
      * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION
      * 14 THE TEST SYSTEM
      * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE
           + 15.1 CONFIGURING for BSD and Solaris boxes
           + 15.2 INSTALLATION
      * 16 DEPENDENCIES AND Bundle::BioPerl


BIOPERL INSTALLATION

    Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP,
    and on Mac OS X (see the PLATFORMS file for more details). Following
    are instructions for installing Bioperl for Unix/Linux/Mac OS X;
    Windows installation instructions can be found here. For installing
    Bioperl for Mac OS X using Fink, see Getting BioPerl.


SYSTEM REQUIREMENTS

      * Perl 5.005 or later; version 5.6 and greater are recommended.
        Note that most modules will work with earlier versions of Perl.
        The only ones that will not are Bio::SimpleAlign and the
        Bio::Index::* modules. If you don't need these modules and you
        want to install Bioperl using an earlier version of Perl, edit
        the "require 5.005;" line in Makefile.PL as necessary.

      * External modules: Bioperl uses functionality provided in other
        Perl modules. Some of these are included in the standard perl
        package but some need to be obtained from the CPAN site. The
        list of external modules is included at the bottom of this
        document.

    The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of
    these external modules easy. Simply install the bundle using your
    CPAN shell and all necessary modules will be installed. See THE
    BIOPERL BUNDLE, below.


OPTIONAL

      * ANSI C or GNU C compiler (gcc) for XS extensions (the
        bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext
        PACKAGE, below).


ADDITIONAL INSTALLATION INFORMATION

      * Additional information on Bioperl and MAC OS:
           + OS 9 - http://bioperl.org/Core/mac-bioperl.html
           + OS X - http://www.tc.umn.edu/~cann0010/Bioperl_OSX_install.html
           + OS X - Installing using Fink (in Getting BioPerl)


THE BIOPERL BUNDLE

    You typically need root privileges to install using CPAN. If you
    don't have these privileges please see INSTALLING BIOPERL IN A
    PERSONAL MODULE AREA for additional information.

    Install Bundle::BioPerl using CPAN. One way:
 >perl -MCPAN -e "install Bundle::BioPerl"

    Another way:
 >perl -MCPAN -e shell
cpan>install Bundle::BioPerl


On Oct 17, 2006, at 8:04 AM, Chris Fields wrote:

> There isn't a very easy way since so many links have to be
> removed/modified. I have found a few CPAN modules that could help, but
> for now I just dump the text output from a text browser (elinks) using
> the 'printable version' page and hand-edit, which works very quickly.
> That works for the time being until I can find another more automated
> solution.
>
> Fortunately there have been very few edits to either INSTALL wiki page
> so they should remain relatively stable.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>> -----Original Message-----
>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
>> Sent: Tuesday, October 17, 2006 6:46 AM
>> To: Chris Fields
>> Cc: bioperl-l
>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>>
>> Chris Fields wrote:
>>> The general consensus was to keep text versions available; we could
>>> add URL links to the wiki pages for the most up-to-date version.
>>> BTW, I have modified INSTALL already.  INSTALL.WIN is next in line
>>> (I was waiting for your changes).
>>>
>> Is it possible to generate these files from the wiki whenever there
>> is a release? I know edits shouldn't be too severe or too often - but
>> I can see things getting a little messy/annoying if edits have to be
>> made in 2 places.
>>
>> Nath
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From niels at genomics.dk  Tue Oct 17 16:58:14 2006
From: niels at genomics.dk (Niels Larsen)
Date: Tue, 17 Oct 2006 18:58:14 +0200
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534E207.8030508@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk> <4534E09C.9030707@genomics.dk>
	<4534E207.8030508@sendu.me.uk>
Message-ID: <45350BA6.3040102@genomics.dk>

Ok, here are ways to reproduce; I sure apologize if I made the
test scripts wrong. And I suppose EBI/DDBJ's interfaces are not
a bioperl issue really.

Niels

------------ EBI

I invoked the EBI script

http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip

like this

WSWUBlastClient.pl -p blastn -D embl test.fasta

where the content of test.fasta is below, and got

Can't find method element in the message at 
/ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

 >Planctomyces sp. 282; Genbank Taxonomy ID: 79927
AATGAACGTTGGCGGCATGGATTAGGCATGCAAGTCGAGGGAGAACCCGCAAGGGGACACCGGCG
AACGGGGTAGGAATACATAGGTAACGTACCCTCAGGACGGGGATAGCCAAGGGAAACTTTGGGTA
ATACCCGATGTGATGGCAAGATGTGAATGCTTGTCATCAAAGGTGAGATTCCACCTGAGGAGCGG
CTTATGCATCATTAGCTTGTTGGCGGGGTAACGGCCCACCAAGGCTGCGATGATTAGGGGGTGTG
AGAGCATGGCCCCCACCACTGGCACTGAGACACTGGCCAGACACCTACGGGTGGCTGCAGTCGAG

I tried with this test sequence in fasta format and with just the
sequence.

------------ DDBJ

Inspired by this page,

http://xml.nig.ac.jp/doc/Blast.txt

I made this test script

------ cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

my ( $service, $seqstr, $result );

use SOAP::Lite;
use Data::Dumper;

$service = SOAP::Lite->service('http://xml.nig.ac.jp/wsdl/Blast.wsdl');

$seqstr = "MSSRIARALALVVTLLHLTRLALSTCPAACHCPLEAPKCAPGVGLVRDGCGCCKVCAKQL";

$result = $service->searchSimple( "blastp", "SWISS", $seqstr );

print Dumper( $result );
------ cut --

which for me prints undef.

------------- NCBI/Bioperl

I installed 1.5.2-RC2, looked at the RemoteBlast example in

http://www.bioperl.org/wiki/Bptutorial.pl

and then put that into this test code, more or less cut/paste,

--- cut --
#!/usr/bin/env perl

use strict;
use warnings FATAL => qw ( all );

use Bio::Tools::Run::RemoteBlast;
use Data::Dumper;

my ( $remote_blast, $r, $rc, $rid, @rids );

$remote_blast = Bio::Tools::Run::RemoteBlast->new (
                 -prog => 'blastn', -data => 'ecoli', -expect => '1e-10' );

$r = $remote_blast->submit_blast("ecoli.fasta");

while ( @rids = $remote_blast->each_rid )
{
#    print Dumper( \@rids );

     for $rid ( @rids ) {
         $rc = $remote_blast->retrieve_blast($rid);
#        print Dumper( $rc );
     }

     sleep 10;
}
--- cut --

which saves the same blast report to TMPDIR every 10 seconds.
The "ecoli.fasta" file contains this

 >test
gggggctctgttggttctcccgcaacgctactctgtttaccaggtcaggtccggaaggaa
gcagccaaggcagatgacgcgtgtgccgggatgtagctggcagggcccccaccc

Maybe I am supposed to add a check for content in $rc and then stop
the inner loop? I could figure that out maybe, but I wish there was a
function which simply takes a single sequence + arguments and only
returns a list of matches when done, and does not return until then
(or until a specified timeout).
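
A minimal sketch of such a wrapper, assuming a factory on which
submit_blast has already been called and which provides each_rid,
retrieve_blast and remove_rid as Bio::Tools::Run::RemoteBlast does. As far
as I can tell, each_rid keeps returning a request ID until remove_rid is
called for it, which would explain the repeated saves. wait_for_blast and
its timeout/interval options are hypothetical names, not part of Bioperl.

```perl
use strict;
use warnings;

# Poll a RemoteBlast-like factory until every request ID has been
# retrieved or has failed, or until the timeout expires.
# Returns the list of report objects collected so far.
sub wait_for_blast {
    my ($factory, %opt) = @_;
    my $timeout  = defined $opt{timeout}  ? $opt{timeout}  : 600;  # seconds
    my $interval = defined $opt{interval} ? $opt{interval} : 10;   # seconds
    my $deadline = time + $timeout;
    my @reports;

    while ( my @rids = $factory->each_rid ) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( ref $rc ) {              # a parser object: report is ready
                push @reports, $rc;
                $factory->remove_rid($rid);
            }
            elsif ( $rc < 0 ) {           # negative value: request failed
                $factory->remove_rid($rid);
            }
            # otherwise the search is still running; ask again later
        }
        my @left = $factory->each_rid;
        last unless @left;
        if ( time >= $deadline ) {
            warn "Timed out with ", scalar(@left), " request(s) pending\n";
            last;
        }
        sleep $interval;
    }
    return @reports;
}

# Possible use with the script above (requires Bioperl and network access):
#   $remote_blast->submit_blast("ecoli.fasta");
#   my @reports = wait_for_blast( $remote_blast, timeout => 300 );
```

The function only returns once nothing is pending or the deadline has
passed, so the caller never sees the same report twice.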


------------------------------------------------------------------------

Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark

Telephone: +45-8942-5268
Telefax: +45-8620-1222

------------------------------------------------------------------------


From bertrand.beckert at gmail.com  Tue Oct 17 14:52:36 2006
From: bertrand.beckert at gmail.com (bertrand beckert)
Date: Tue, 17 Oct 2006 16:52:36 +0200
Subject: [Bioperl-l] problems with: Bio::Tools::Run::RemoteBlast
Message-ID: <500217090610170752q565cfc08t5208e3b64f99ef7f@mail.gmail.com>

hi,

I am running a large number of BLAST searches through the NCBI BLAST
page ('http://www.ncbi.nlm.nih.gov/blast/Blast.cgi').
I am trying to use 'Bio::Tools::Run::RemoteBlast' but unfortunately I have
some problems. I made a simple example with only one sequence in
order to understand how this module works. This is my simple input
file, a DNA sequence in FASTA format:

>test
TTTTGATGAGGCGCATCAATCATGAGTAAAGTTTAGATTACTGTCTGCTAACAGCTGAAT
TTGAAAGGGTGCGATGCCGAAGCGATTATAATAGCAGTTATAATTTGTTGGACTTTTTGG
TTAAGAGCTGAGAGTTTGTCATTATTTAAAAATAATGGAGTGCATCACTTGTA

I made some modifications to the example available in the Bioperl docs.
It gives me a RID which contains the results of my blast, but I have a
problem with "$result=$factory->retrieve_blast($rid)" in my script.
The documentation says that, when it works,
$result=$factory->retrieve_blast($rid) returns a Bio::Tools::BPlite or
Bio::Tools::Blast object. In my case it returns a Bio::SearchIO::blast...
I don't understand why I don't get the expected type of object (see
PART I).

I also tried to solve the problem by replacing the foreach loop in my
script with a new one in order to explore the blast result page, but
that doesn't work either (see PART II).

Could you help me, please? Thank you.

Bertrand Beckert.

PART I:

Here is my script with a few annotations, followed by the shell
output:
------------------------------------------------------------------------
#!/usr/bin/perl -w
use Bio::Tools::Run::RemoteBlast;
use Bio::SearchIO;
sub blast {
    my $prog='blastn';
    my $db='refseq_genomic';
    my $e_val='1e-10';
    my $Input='Seq.fasta';
    my @params = ('-prog' => $prog, '-data' => $db, '-expect' => $e_val,
                  '-readmethod' => 'SearchIO');
    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
    #changes parameters
    $Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'}='Bacteria [ORGN]';
    $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'}='BLOSUM25';
    $factory->submit_blast($Input);
    print STDERR "waiting...\n";
    while (my @rids=$factory->each_rid) {
        print "my rid: ",@rids,"\n";
        # returns the ID of the submitted blast, i.e.
        # RID: 1161079157-766-185099855365.BLASTQ2
        # this page contains the result of my blast...
        foreach my $rid (@rids) {
            $result=$factory->retrieve_blast($rid);
            # line to show what type of object is returned by retrieve_blast
            print "rc:", $result,"\n";
        }
    }
}

&blast;
------------------------------------------------------------------------

here you can see the shell window:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc54)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x890bc30)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x89eb7f4)
my rid: 1161079157-766-185099855365.BLASTQ2
rc:Bio::SearchIO::blast=HASH(0x8a2cc74)
my rid: 1161079157-766-185099855365.BLASTQ2
...
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x886bbac)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x89eb5f0)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x8a2d2d4)
my rid: 1161079157-766-185099855365.BLASTQ2
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::SearchIO::blast=HASH(0x84fa054)
...

PART II:

I also tried to solve the problem by replacing the foreach loop in my
script with:
------------------------------------------------------------------------

foreach my $rid (@rids) {
                 while(1) {
                 $result=$factory->retrieve_blast($rid)->next_result();
                 print "rc:", $result,"\n";
                 if ($result) {
                 print  $result->num_hits(),"\n";
                 }
------------------------------------------------------------------------

With this loop I could explore the Blast result page. This is what I
get in the shell window:

bbeckert at tatooine:~/Script_perl$ ./test.pl
waiting...
my rid: 1161088606-9905-123050755601.BLASTQ4
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Use of uninitialized value in print at ./retrieve_blast.pl line 30.
rc:
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb8b8)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fba8c)
0
Parsing of undecoded UTF-8 will give garbage when decoding entities
at /usr/share/perl5/LWP/Protocol.pm line 137.
rc:Bio::Search::Result::BlastResult=HASH(0x84fb834)
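
For context, the script passes '-readmethod' => 'SearchIO', so getting a
Bio::SearchIO::blast parser back from retrieve_blast is probably the
expected behaviour rather than an error. A sketch of how such a parser
could be walked once it is available; collect_hits is a hypothetical
helper, and it relies only on next_result, query_name, next_hit, name and
significance, which the SearchIO result and hit objects provide:

```perl
use strict;
use warnings;

# Collect [query name, hit name, e-value] rows from any object that
# behaves like a Bio::SearchIO parser.
sub collect_hits {
    my ($searchio) = @_;
    my @rows;
    while ( my $result = $searchio->next_result ) {
        while ( my $hit = $result->next_hit ) {
            push @rows,
                [ $result->query_name, $hit->name, $hit->significance ];
        }
    }
    return @rows;
}

# Possible use inside the polling loop (requires Bioperl and network access):
#   my $rc = $factory->retrieve_blast($rid);
#   if ( ref $rc ) {                    # 0 or -1 means "not ready" or "failed"
#       print join( "\t", @$_ ), "\n" for collect_hits($rc);
#       $factory->remove_rid($rid);
#   }
```

Checking ref($rc) before calling any method avoids calling next_result on
a plain number while the search is still running.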


----
-- 
Bertrand BECKERT
PhD student
IBMC - UPR 9002 du CNRS - ARN
15, rue Rene Descartes
F-67084 STRASBOURG Cedex
b.beckert at ibmc.u-strasbg.fr
bertrand.beckert at gmail.com


From cjfields at uiuc.edu  Tue Oct 17 17:50:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 12:50:49 -0500
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>
Message-ID: <001201c6f214$c8934440$15327e82@pyrimidine>

(Apologies for the top post, but I thought my response might get lost below)

I use elinks in a similar fashion.  It tends to format the tables a bit
better than lynx.

Chris

> -----Original Message-----
> From: Barry Moore [mailto:barry.moore at genetics.utah.edu]
> Sent: Tuesday, October 17, 2006 11:58 AM
> To: Chris Fields
> Cc: 'Nathan S. Haigh'; 'bioperl-l'
> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
> 
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix
> 
> does a reasonable job of textifying html.  You get the links as
> numbered references at the bottom or:
> 
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |
> perl -ane 's/\[?\[\d+\](edit\])?//g;print'
> 
> to remove the links all together.
> 
> Barry


From cjfields at uiuc.edu  Tue Oct 17 17:52:36 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 12:52:36 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
Message-ID: <001301c6f215$07a9a070$15327e82@pyrimidine>

What do you get when you run the SearchIO.t test by itself using 'perl -I.
t/SearchIO.t'?  It looks like something pretty catastrophic happened.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
> Sent: Tuesday, October 17, 2006 11:57 AM
> To: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
> 
> Hi,
> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
> tests, the first seems to be just a result of me not having DBD::mysql
> installed.
> Paul


From paul.boutros at utoronto.ca  Tue Oct 17 17:59:33 2006
From: paul.boutros at utoronto.ca (Paul Boutros)
Date: Tue, 17 Oct 2006 13:59:33 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
Message-ID: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca>

Hi Chris,

Here it is:
pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
1..1337
ok 1

-------------------- WARNING ---------------------
MSG: XML::SAX::Expat not currently supported; must have local copies  
of NCBI DTD docs!
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: error in parsing a report:

404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'  
does not exist  
file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
Handler couldn't resolve external entity at line 2, column 82, byte 104
error in processing external entity reference at line 2, column 82,  
byte 104 at  
/db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line  
187

---------------------------------------------------
not ok 2
# Failed test 2 in t/SearchIO.t at line 68
Can't call method "database_name" on an undefined value at  
t/SearchIO.t line 69.


Quoting Chris Fields <cjfields at uiuc.edu>:

> What do you get when you run the SearchIO.t test by itself using 'perl -I.
> t/SearchIO.t'?  It looks like something pretty catastrophic happened.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
>> Sent: Tuesday, October 17, 2006 11:57 AM
>> To: bioperl-l at lists.open-bio.org
>> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
>>
>> Hi,
>> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
>> tests, the first seems to be just a result of me not having DBD::mysql
>> installed.
>> Paul
>>
>> Test Summary
>> ============
>>
>> Failed Test               Stat Wstat Total Fail  List of Failed
>> --------------------------------------------------------------------------
>> -----
>> t/BioDBSeqFeature_mysql.t               46   46  1-46
>> t/SearchIO.t                22  5632  1337 2671  2-1337
>> 2 tests and 106 subtests skipped.
>> Failed 2/236 test scripts. 1382/11688 subtests failed.
>> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =
>> 159.61 CPU)
>>
>> BioDBSeqFeature_mysql
>> =====================
>> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
>> 1..46
>> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC
>> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t
>> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8
>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi
>> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at
>> (eval 37) line 3.
>> Perhaps the DBD::mysql perl module hasn't been fully installed,
>> or perhaps the capitalisation of 'mysql' isn't right.
>> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
>>   at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
>>
>> SearchIO
>> ========
>> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
>> 1..1337
>> ok 1
>>
>> -------------------- WARNING ---------------------
>> MSG: XML::SAX::Expat not currently supported; must have local copies
>> of NCBI DTD docs!
>> ---------------------------------------------------
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>
>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
>> does not exist
>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>> Handler couldn't resolve external entity at line 2, column 82, byte 104
>> error in processing external entity reference at line 2, column 82,
>> byte 104 at
>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
>> 187
>>
>> ---------------------------------------------------
>> not ok 2
>> # Failed test 2 in t/SearchIO.t at line 68
>> Can't call method "database_name" on an undefined value at
>> t/SearchIO.t line 69.
>>
>> ------------------------------
>>
>> Message: 10
>> Date: Tue, 17 Oct 2006 11:32:54 +0100
>> From: Sendu Bala <bix at sendu.me.uk>
>> Subject: [Bioperl-l] Bioperl 1.5.2 RC2
>> To: bioperl-l at bioperl.org
>> Message-ID: <4534B156.4090501 at sendu.me.uk>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
>> See http://www.bioperl.org/wiki/Release_1.5.2 for
>> instructions on getting and testing this RC.
>>
>> Developers:
>>     This should be the last RC before release ~next Monday. Now would
>>     be a good time for last minute documentation updates and additions.
>>
>> Users:
>>     Even though 1.5.2 is a 'developer' release, we consider it the most
>>     stable and capable version of Bioperl, and recommend that you use
>>     it in all but the most critical production environments. Please
>>     try it out and let us know of any problems or difficulties you run
>>     into.
>>
>>
>> Thank you,
>> Sendu.
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From barry.moore at genetics.utah.edu  Tue Oct 17 18:07:12 2006
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue, 17 Oct 2006 12:07:12 -0600
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <C15A8DE6.AD40%bosborne11@verizon.net>
References: <C15A8DE6.AD40%bosborne11@verizon.net>
Message-ID: <588DE26B-8F18-4540-BAEE-2B479CBDE8B3@genetics.utah.edu>

In fact, I think it was you who taught me that trick in the first place.

B

On Oct 17, 2006, at 11:40 AM, Brian Osborne wrote:

> Barry,
>
> I second that. lynx does the best job of converting HTML to text  
> I've seen.
>
> Brian O.
>
>


From bix at sendu.me.uk  Tue Oct 17 18:07:04 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 17 Oct 2006 19:07:04 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
References: <20061017125719.t1d5r3a2o1s0k08o@webmail.utoronto.ca>
Message-ID: <45351BC8.9080507@sendu.me.uk>

Paul Boutros wrote:
> Hi,
> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed  
> tests, the first seems to be just a result of me not having DBD::mysql  
> installed.
[snip]

Thanks for those, very useful. Not something that's come up before 
afaik; I'll look into them.


From cjfields at uiuc.edu  Tue Oct 17 18:31:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 13:31:51 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017135933.cu3li7pz8fzk8cko@webmail.utoronto.ca>
Message-ID: <001401c6f21a$836f9fc0$15327e82@pyrimidine>

Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
backend parser.  For some reason BLAST XML parsing doesn't work with that
parser (it tries to verify the XML before parsing, hence the DTD error).
I may try getting this to work again, but so far I haven't found an easy
way to prevent XML verification via XML::SAX::Expat.

There are two options: 1) install XML::SAX::ExpatXS (the better option),
which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
parser in your local ParserDetails.ini file to XML::SAX::PurePerl.

BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
hasn't officially happened yet); the latter hasn't had significant
development in about three years.
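
For option 2, a per-script alternative to editing ParserDetails.ini might
look like the sketch below.  It assumes the BLAST XML parser obtains its
backend through XML::SAX::ParserFactory, which I have not verified here;
pick_sax_backend is a hypothetical helper, not part of BioPerl or XML::SAX.

```perl
use strict;
use warnings;

# Hypothetical helper: pick a SAX backend, never XML::SAX::Expat.
# $available is an array ref of installed parser package names.
sub pick_sax_backend {
    my ($available) = @_;
    my %have = map { $_ => 1 } @$available;
    for my $preferred ('XML::SAX::ExpatXS', 'XML::SAX::PurePerl') {
        return $preferred if $have{$preferred};
    }
    return;    # nothing suitable found
}

# If XML::SAX is installed, ask it which parsers it knows about;
# otherwise the eval yields an empty list and the script still runs.
my @installed = eval {
    require XML::SAX;
    map { $_->{Name} } @{ XML::SAX->parsers };
};

my $backend = pick_sax_backend(\@installed);
if ($backend) {
    # This package variable overrides the ParserDetails.ini default
    # for XML::SAX::ParserFactory within the current process.
    no warnings 'once';
    $XML::SAX::ParserPackage = $backend;
    print "Using $backend\n";
}
else {
    print "No supported SAX backend found\n";
}
```

Setting the variable before any parser is created leaves the system-wide
ParserDetails.ini untouched, which may matter on shared installations.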

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: Paul Boutros [mailto:paul.boutros at utoronto.ca]
> Sent: Tuesday, October 17, 2006 1:00 PM
> To: Chris Fields
> Cc: bioperl-l at lists.open-bio.org
> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2
> 
> Hi Chris,
> 
> Here it is:
> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
> 1..1337
> ok 1
> 
> -------------------- WARNING ---------------------
> MSG: XML::SAX::Expat not currently supported; must have local copies
> of NCBI DTD docs!
> ---------------------------------------------------
> 
> -------------------- WARNING ---------------------
> MSG: error in parsing a report:
> 
> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
> does not exist
> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
> Handler couldn't resolve external entity at line 2, column 82, byte 104
> error in processing external entity reference at line 2, column 82,
> byte 104 at
> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
> 187
> 
> ---------------------------------------------------
> not ok 2
> # Failed test 2 in t/SearchIO.t at line 68
> Can't call method "database_name" on an undefined value at
> t/SearchIO.t line 69.
> 
> 


From cjfields at uiuc.edu  Tue Oct 17 19:05:59 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 14:05:59 -0500
Subject: [Bioperl-l] split location problems
In-Reply-To: <B9182BFF5B004245BABC12956EA6322E018E6735@huls5.nucleus.harvard.edu>
Message-ID: <001b01c6f21f$48640420$15327e82@pyrimidine>

> > From: Jason Stajich [mailto:jason.stajich at gmail.com]
> >
> > The whole point of split locations is to represent genes with
> > introns
> > so that is not the "rare" case.
> 
> Absolutely.

Right, but that specific kind of join statement is not commonly used in
GenBank files, which seems to be the predominant format (no offense
to EBI).  This may explain why we haven't seen this pop up more often.

I believe what we're seeing is a difference in the way these locations are
described at NCBI vs EBI, which Nadeem Faruque seems to corroborate.  He
indicated that EBI may move to using similar GenBank-like location strings.
Regardless, FTLocationFactory and Bio::Location::Split should handle both if
they are present, but they only seem to like the GenBank version.

> > I've added code to test this to bug 2101 including a C.glabrata
> > chromsome downloaded from genbank.  Perhaps the problem is on the
> > EMBL parsing side, I didn't test that.
> 
> Well, I don't know whether it's EMBL parsing, or a bit further down the
> pipe, but I downloaded C.glabrata chromosome B for GenBank (NC_005968),
> and it describes the complement/joins in the way that Bioperl is
> handling correctly.
> 
> GenBank:
>      CDS             complement(join(10347..10372,10632..11157))
>                      /locus_tag="CAGL0B00242g"
> 
> EMBL:
> FT   CDS
> join(complement(10632..11157),complement(10347..10372))
> FT                   /locus_tag="CAGL0B00242g"

Yes, something that I found out independently (and corroborated by Nadeem).

> Here's the diff when I run the location-printing script I posted
> yesterday:
> 
> diff biogb bio
> 1c1,5
> < complement(join(10347..10372,10632..11157))
> ---
> > complement(1701..2651)
> > complement(2635..3345)
> > complement(3980..4408)
> > complement(join(10632..11157,10347..10372))
> > 10379..10615
> 209a214,217
> > 498198..498890
> > 499712..500062
> > 499851..500702
> > 500579..501364
> 
> As you can see, the complement/join CDS is written out in a different
> order, which is Bad.

I think this can be handled directly in to_FTstring().  I'll have to add a
method to get the strand info from the Split object w/o going through
strand().  
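
To make the equivalence of the two location styles concrete, here is a
minimal sketch that rewrites the EMBL form into the GenBank form.  It is
plain string handling under the assumption that every sub-location is a
simple start..end range; embl_to_genbank_style is a hypothetical name and
this is not how to_FTstring() or Bio::Location::Split work internally.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite join(complement(A),complement(B),...)
# as complement(join(...,B,A)).  Only simple start..end ranges are
# handled; any other location string is returned unchanged.
sub embl_to_genbank_style {
    my ($loc) = @_;
    my $range = qr/\d+\.\.\d+/;
    return $loc
        unless $loc =~ /^join\(((?:complement\($range\),?)+)\)$/;
    my $inner  = $1;
    my @ranges = $inner =~ /complement\(($range)\)/g;
    # Complementing the whole join reverses the order of the pieces.
    return 'complement(join(' . join(',', reverse @ranges) . '))';
}

print embl_to_genbank_style(
    'join(complement(10632..11157),complement(10347..10372))'), "\n";
# prints complement(join(10347..10372,10632..11157))
```

The reversal is the important part: the sub-locations in the EMBL string
are listed in reading order on the minus strand, while the GenBank string
lists them in ascending coordinate order inside the complement.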

However, I'm thinking about trying a different tack which is a bit simpler
and, if it proves fruitful, may simplify Split locations somewhat.  It won't
be ready for 1.5.2 but maybe the next release.

> (I looked at at least one of the other differences: the GB file says
> it's a "misc feature" and EMBL says it's a CDS. But they don't seem to
> be relevant here.)
> -Amir

Probably not but something to keep in mind.
 
-c

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From er at xs4all.nl  Tue Oct 17 19:01:48 2006
From: er at xs4all.nl (Erikjan)
Date: Tue, 17 Oct 2006 21:01:48 +0200 (CEST)
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <001301c6f215$07a9a070$15327e82@pyrimidine>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
Message-ID: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>

Hello,

I noticed a little problem with the Annotation "DBLink" from GenBank entries

When I run:

perl -MBio::DB::GenBank -e 'my $gi =
56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
$db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
$ac=$seq->annotation(); my @annotations = $ac->get_Annotations("dblink");
for(@annotations) { print $_, "\n";} print $INC{
"Bio/Annotation/DBLink.pm" }, "\n"; '

This yields:

   GenBank:AL591065.17.17

and the place where the used Bio/Annotation/DBLink.pm resides.

Can others repeat this?

I have dug into the source a little and Bio::Annotation::DBLink seems to
be the place where this happens: it has a concatenation which leads to
that repeated version number.

Is this something that I should fix "client-side", so to speak, or is it
worthwhile to add some logic to that concatenation to prevent this?
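
The kind of guard meant here could be sketched as follows.  This is not
the actual Bio::Annotation::DBLink code; dblink_string is a hypothetical
stand-in, and it assumes the problem is that the primary id already
carries the version suffix when the version is appended.

```perl
use strict;
use warnings;

# Hypothetical helper: build "database:primary_id.version" and append
# the version only if the id does not already end with it.
sub dblink_string {
    my ($database, $primary_id, $version) = @_;
    my $id = $primary_id;
    if (defined $version && length $version
        && $id !~ /\.\Q$version\E$/) {
        $id .= ".$version";
    }
    return "$database:$id";
}

print dblink_string('GenBank', 'AL591065.17', 17), "\n";
print dblink_string('GenBank', 'AL591065',    17), "\n";
# both lines print GenBank:AL591065.17
```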


Thanks,

Eric


From bosborne11 at verizon.net  Tue Oct 17 17:40:54 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 17 Oct 2006 13:40:54 -0400
Subject: [Bioperl-l] INSTALL and INSTALL.WIN
In-Reply-To: <3CACE1AB-D7F2-4F54-A1C3-35E1AE444CD1@genetics.utah.edu>
Message-ID: <C15A8DE6.AD40%bosborne11@verizon.net>

Barry,

I second that. lynx does the best job of converting HTML to text I've seen.

Brian O.


On 10/17/06 12:57 PM, "Barry Moore" <barry.moore at genetics.utah.edu> wrote:

> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix
> 
> does a reasonable job of textifying html.  You get the links as
> numbered references at the bottom or:
> 
> lynx -dump http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix |
> perl -ane 's/\[?\[\d+\](edit\])?//g;print'
> 
> to remove the links altogether.
> 
> Barry
> 
> P.S.  Looks like this:
> 
>     #Creative Commons copyright
> 
> Installing Bioperl for Unix
> 
>  From BioPerl
> 
>     Jump to: navigation, search
> 
> Contents
> 
>       * 1 BIOPERL INSTALLATION
>       * 2 SYSTEM REQUIREMENTS
>       * 3 OPTIONAL
>       * 4 ADDITIONAL INSTALLATION INFORMATION
>       * 5 THE BIOPERL BUNDLE
>       * 6 INSTALLING BIOPERL THE EASY WAY USING CPAN
>       * 7 INSTALLING BIOPERL THE EASY WAY USING GNU 'make'
>       * 8 WHERE ARE THE MAN PAGES?
>       * 9 EXTERNAL PROGRAMS
>            + 9.1 Environment Variables
>       * 10 INSTALLING BIOPERL SCRIPTS
>       * 11 INSTALLING BIOPERL IN A PERSONAL MODULE AREA
>       * 12 INSTALLING BIOPERL MODULES THE HARD WAY
>       * 13 USING MODULES NOT INSTALLED IN THE STANDARD LOCATION
>       * 14 THE TEST SYSTEM
>       * 15 BUILDING THE OPTIONAL bioperl-ext PACKAGE
>            + 15.1 CONFIGURING for BSD and Solaris boxes
>            + 15.2 INSTALLATION
>          * 16 DEPENDENCIES AND Bundle::BioPerl
> 
> 
> BIOPERL INSTALLATION
> 
>     Bioperl has been installed on many forms of Unix, Win9X/NT/2000/XP,
>     and on Mac OS X (see the PLATFORMS file for more details). Following
>     are instructions for installing Bioperl for Unix/Linux/Mac OS X;
>     Windows installation instructions can be found here. For installing
>     Bioperl for Mac OS X using Fink, see Getting BioPerl.
> 
> 
> SYSTEM REQUIREMENTS
> 
>       * Perl 5.005 or later; version 5.6 and greater are recommended.
>         Note that most modules will work with earlier versions of Perl.
>         The only ones that will not are Bio::SimpleAlign and the
>         Bio::Index::* modules. If you don't need these modules and you
>         want to install Bioperl using an earlier version of Perl, edit
>         the "require 5.005;" line in Makefile.PL as necessary.
> 
>       * External modules: Bioperl uses functionality provided in other
>         Perl modules. Some of these are included in the standard perl
>         package but some need to be obtained from the CPAN site. The
>         list of external modules is included at the bottom of this
>         document.
> 
>     The CPAN Bioperl Bundle (Bundle::BioPerl) makes installation of
>     these external modules easy. Simply install the bundle using your
>     CPAN shell and all necessary modules will be installed. See THE
>     BIOPERL BUNDLE, below.
> 
> 
> OPTIONAL
> 
>       * ANSI C or GNU C compiler (gcc) for XS extensions (the
>         bioperl-ext package; see BUILDING THE OPTIONAL bioperl-ext
>         PACKAGE, below).
> 
> 
> 
> ADDITIONAL INSTALLATION INFORMATION
> 
>       * Additional information on Bioperl and Mac OS:
>            + OS 9 - http://bioperl.org/Core/mac-bioperl.html
>            + OS X - http://www.tc.umn.edu/~cann0010/Bioperl_OSX_install.html
>            + OS X - Installing using Fink (in Getting BioPerl)
> 
> 
> 
> THE BIOPERL BUNDLE
> 
>     You typically need root privileges to install using CPAN. If you
>     don't have these privileges please see INSTALLING BIOPERL IN A
>     PERSONAL MODULE AREA for additional information.
> 
>     Install Bundle::BioPerl using CPAN. One way:
>     >perl -MCPAN -e "install Bundle::BioPerl"
> 
>     Another way:
>     >perl -MCPAN -e shell
>     cpan>install Bundle::BioPerl
> 
> 
> 
> On Oct 17, 2006, at 8:04 AM, Chris Fields wrote:
> 
>> There isn't a very easy way since so many links have to be
>> removed/modified.  I have found a few CPAN modules that could help, but
>> for now I just dump the text output from a text browser (elinks) using
>> the 'printable version' page and hand-edit, which works very quickly.
>> That works for the time being until I can find another more automated
>> solution.
>> 
>> Fortunately there have been very few edits to either INSTALL wiki page
>> so they should remain relatively stable.
>> 
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>> 
>>> -----Original Message-----
>>> From: Nathan S. Haigh [mailto:n.haigh at sheffield.ac.uk]
>>> Sent: Tuesday, October 17, 2006 6:46 AM
>>> To: Chris Fields
>>> Cc: bioperl-l
>>> Subject: Re: [Bioperl-l] INSTALL and INSTALL.WIN
>>> 
>>> Chris Fields wrote:
>>>> The general consensus was to keep text versions available; we could
>>>> add URL links to the wiki pages for the most up-to-date version.
>>>> BTW, I have modified INSTALL already.  INSTALL.WIN is next in line
>>>> (I was waiting for your changes).
>>>> 
>>> Is it possible to generate these files from the wiki whenever there
>>> is a release? I know edits shouldn't be too severe or too often - but
>>> I can see things getting a little messy/annoying if edits have to be
>>> made in 2 places.
>>> 
>>> Nath
>> 
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Tue Oct 17 20:30:15 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 15:30:15 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
Message-ID: <0FB91820-B2A1-4F7F-866C-8D4791DD8306@uiuc.edu>

I can confirm this using bioperl-live:

GenBank:AL591065.17.17
/Users/cjfields/src/bioperl-live/Bio/Annotation/DBLink.pm

Could you file a bug report via bugzilla?

Chris

On Oct 17, 2006, at 2:01 PM, Erikjan wrote:

> Hello,
>
> I noticed a little problem with the Annotation "DBLink" from  
> GenBank entries
>
> When I run:
>
> perl -MBio::DB::GenBank -e 'my $gi =
> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations 
> ("dblink");
> for(@annotations) { print $_, "\n";} print $INC{
> "Bio/Annotation/DBLink.pm" }, "\n"; '
>
> This yields:
>
>    GenBank:AL591065.17.17
>
> and the place where the used Bio/Annotation/DBLink.pm resides.
>
> Can others repeat this?
>
> I have dug into the source a little and Bio::Annotation::DBLink  
> seems to
> be the place where this happens: it has a concatenation which leads to
> that repeated version number.
>
> Is this something that I should fix "client-side", so to speak, or is it
> worthwhile to add some logic to that concatenation to prevent this?
>
>
> Thanks,
>
> Eric
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From paul.boutros at utoronto.ca  Tue Oct 17 23:49:52 2006
From: paul.boutros at utoronto.ca (Paul Boutros)
Date: Tue, 17 Oct 2006 19:49:52 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <001401c6f21a$836f9fc0$15327e82@pyrimidine>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine>
Message-ID: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>

Hi Chris,

Yup, that's it.  I installed XML::SAX::ExpatXS (make test output  
below).  Should there be a note somewhere in the INSTALL docs saying  
basically what you just wrote?  Or maybe it's already there somewhere  
and I missed it.

Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks  
if DBD::mysql can be loaded and, if not, skips the test.  Since  
the file is only one line long, here's the modified file rather than a  
patch:
################################################################
BEGIN {
         # DBD::mysql is required
         eval {
                 require DBD::mysql;
                 };
         if ( $@ ) {
                 # 'use' would run at compile time even inside this 'if',
                 # so load Test::More at run time and plan the skip here.
                 # plan(skip_all => ...) exits with status 0 by itself.
                 require Test::More;
                 Test::More::plan( skip_all => "DBD::mysql is not installed "
                     . "or is installed incorrectly - skipping "
                     . "BioDBSeqFeature_mysql.t" );
                 }
         }

system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 -dsn test";
################################################################
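
The run-time check at the heart of the file above could also be factored
into a small helper, sketched here.  have_module is a hypothetical name,
not an existing BioPerl or Test::More function; the idea is only that
'require' inside an eval is decided at run time, so it can safely drive a
skip decision.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded at
# run time, 0 otherwise.  The module name is turned into a file path
# because 'require' with a string expects one.
sub have_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ('DBD::mysql', 'Test::More') {
    printf "%-12s %s\n", $module,
        have_module($module) ? 'available' : 'missing';
}
```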

And when I run it I get:
t/BioDBSeqFeature_mysql......skipped
         all skipped: DBD::mysql is not installed or is installed  
incorrectly - skipping BioDBSeqFeature_mysql.t

And for the overall make test:
All tests successful, 3 tests and 106 subtests skipped.
Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys =  
164.24 CPU)

Hope this helps,
Paul


Quoting Chris Fields <cjfields at uiuc.edu>:

> Your local copy of XML::SAX has XML::SAX::Expat set as the default SAX
> backend parser.  For some reason BLAST XML parsing doesn't work with that
> parser (it tries to verify the XML first before parsing, hence the DTD
> error).  I may try getting this to work again, but so far I haven't found an
> easy way to prevent XML verification via XML::SAX::Expat.
>
> There are two options: 1) install XML::SAX::ExpatXS (the better option),
> which works AND is 4x faster than XML::SAX::Expat, or 2) set the default
> parser in your local ParserDetails.ini file to XML::SAX::PurePerl.
>
> BTW, XML::SAX::ExpatXS is intended to replace XML::SAX::Expat (it just
> hasn't officially happened yet); the latter hasn't had significant
> development in about three years.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>> -----Original Message-----
>> From: Paul Boutros [mailto:paul.boutros at utoronto.ca]
>> Sent: Tuesday, October 17, 2006 1:00 PM
>> To: Chris Fields
>> Cc: bioperl-l at lists.open-bio.org
>> Subject: RE: [Bioperl-l] Bioperl 1.5.2 RC2
>>
>> Hi Chris,
>>
>> Here it is:
>> pcboutro at ccb690[643] >> perl -I. t/SearchIO.t
>> 1..1337
>> ok 1
>>
>> -------------------- WARNING ---------------------
>> MSG: XML::SAX::Expat not currently supported; must have local copies
>> of NCBI DTD docs!
>> ---------------------------------------------------
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>
>> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
>> does not exist
>> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>> Handler couldn't resolve external entity at line 2, column 82, byte 104
>> error in processing external entity reference at line 2, column 82,
>> byte 104 at
>> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
>> 187
>>
>> ---------------------------------------------------
>> not ok 2
>> # Failed test 2 in t/SearchIO.t at line 68
>> Can't call method "database_name" on an undefined value at
>> t/SearchIO.t line 69.
>>
>>
>> Quoting Chris Fields <cjfields at uiuc.edu>:
>>
>> > What do you get when you run the SearchIO.t test by itself using
>> > 'perl -I. t/SearchIO.t'?  It looks like something pretty catastrophic
>> > happened.
>> >
>> > Christopher Fields
>> > Postdoctoral Researcher - Switzer Lab
>> > Dept. of Biochemistry
>> > University of Illinois Urbana-Champaign
>> >
>> >
>> >> -----Original Message-----
>> >> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> >> bounces at lists.open-bio.org] On Behalf Of Paul Boutros
>> >> Sent: Tuesday, October 17, 2006 11:57 AM
>> >> To: bioperl-l at lists.open-bio.org
>> >> Subject: Re: [Bioperl-l] Bioperl 1.5.2 RC2
>> >>
>> >> Hi,
>> >> Here's a quick make test on AIX 5.2 with Perl 5.8.8.  I get two failed
>> >> tests, the first seems to be just a result of me not having DBD::mysql
>> >> installed.
>> >> Paul
>> >>
>> >> Test Summary
>> >> ============
>> >>
>> >> Failed Test               Stat Wstat Total Fail  List of Failed
>> >> ----------------------------------------------------------------------------
>> >> t/BioDBSeqFeature_mysql.t               46   46  1-46
>> >> t/SearchIO.t                22  5632  1337 2671  2-1337
>> >> 2 tests and 106 subtests skipped.
>> >> Failed 2/236 test scripts. 1382/11688 subtests failed.
>> >> Files=236, Tests=11688, 259 wallclock secs (139.47 cusr + 20.14 csys =
>> >> 159.61 CPU)
>> >>
>> >> BioDBSeqFeature_mysql
>> >> =====================
>> >> pcboutro at ccb690[674] >> perl -w t/BioDBSeqFeature_mysql.t
>> >> 1..46
>> >> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC
>> >> contains: /home/pcboutro/cvswork/bioperl-live/ . .. ./blib/lib t
>> >> /db2blast/perl/lib/5.8.8/aix-thread-multi /db2blast/perl/lib/5.8.8
>> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi
>> >> /db2blast/perl/lib/site_perl/5.8.8 /db2blast/perl/lib/site_perl) at
>> >> (eval 37) line 3.
>> >> Perhaps the DBD::mysql perl module hasn't been fully installed,
>> >> or perhaps the capitalisation of 'mysql' isn't right.
>> >> Available drivers: DB2, DBM, ExampleP, File, Proxy, Sponge.
>> >>   at Bio/DB/SeqFeature/Store/DBI/mysql.pm line 208
>> >>
>> >> SearchIO
>> >> ========
>> >> pcboutro at ccb690[670] >> perl -w t/SearchIO.t | more
>> >> 1..1337
>> >> ok 1
>> >>
>> >> -------------------- WARNING ---------------------
>> >> MSG: XML::SAX::Expat not currently supported; must have local copies
>> >> of NCBI DTD docs!
>> >> ---------------------------------------------------
>> >>
>> >> -------------------- WARNING ---------------------
>> >> MSG: error in parsing a report:
>> >>
>> >> 404 File `/home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd'
>> >> does not exist
>> >> file:///home/pcboutro/tmp/bioperl-1.5.2-RC2/NCBI_BlastOutput.dtd
>> >> Handler couldn't resolve external entity at line 2, column 82, byte 104
>> >> error in processing external entity reference at line 2, column 82,
>> >> byte 104 at
>> >> /db2blast/perl/lib/site_perl/5.8.8/aix-thread-multi/XML/Parser.pm line
>> >> 187
>> >>
>> >> ---------------------------------------------------
>> >> not ok 2
>> >> # Failed test 2 in t/SearchIO.t at line 68
>> >> Can't call method "database_name" on an undefined value at
>> >> t/SearchIO.t line 69.
>> >>
>> >> ------------------------------
>> >>
>> >> Message: 10
>> >> Date: Tue, 17 Oct 2006 11:32:54 +0100
>> >> From: Sendu Bala <bix at sendu.me.uk>
>> >> Subject: [Bioperl-l] Bioperl 1.5.2 RC2
>> >> To: bioperl-l at bioperl.org
>> >> Message-ID: <4534B156.4090501 at sendu.me.uk>
>> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>> >>
>> >> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
>> >> See http://www.bioperl.org/wiki/Release_1.5.2 for
>> >> instructions on getting and testing this RC.
>> >>
>> >> Developers:
>> >>     This should be the last RC before release ~next Monday. Now would
>> >>     be a good time for last-minute documentation updates and additions.
>> >>
>> >> Users:
>> >>     Even though 1.5.2 is a 'developer' release, we consider it the most
>> >>     stable and capable version of Bioperl, and recommend that you use
>> >>     it in all but the most critical production environments. Please
>> >>     try it out and let us know of any problems or difficulties you run
>> >>     into.
>> >>
>> >>
>> >> Thank you,
>> >> Sendu.


From cjfields at uiuc.edu  Wed Oct 18 00:51:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 17 Oct 2006 19:51:35 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine>
	<20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
Message-ID: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>

On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote:

> Hi Chris,
>
> Yup, that's it.  I installed XML::SAX::ExpatXS (make test output
> below).  Should there be a note somewhere in the INSTALL docs saying
> basically what you just wrote?  Or maybe it's already there somewhere
> and I missed it.

The INSTALL docs should have this, yes.  I'll double-check though.

Pretty much anything that plugs into XML::SAX except XML::SAX::Expat  
works (XML::LibXML also works, I found).
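
For anyone who cannot edit ParserDetails.ini, here is a minimal sketch of a
per-process override. It assumes the documented $XML::SAX::ParserPackage
variable of XML::SAX; sax_backend() is a hypothetical helper name, and
whether Bio::SearchIO::blastxml honours the override has not been checked
here.

```perl
use strict;
use warnings;

# Hypothetical helper: return the class of the SAX parser that
# XML::SAX::ParserFactory hands out, optionally forcing a backend.
sub sax_backend {
    my ($forced) = @_;

    # Load XML::SAX at run time so the script still runs without it.
    eval {
        require XML::SAX;
        require XML::SAX::ParserFactory;
        require XML::SAX::Base;
        1;
    } or return;

    # With no override the factory falls back to ParserDetails.ini.
    local $XML::SAX::ParserPackage =
        defined $forced ? $forced : $XML::SAX::ParserPackage;

    my $parser = XML::SAX::ParserFactory->parser(
        Handler => XML::SAX::Base->new );
    return ref $parser;
}

my $backend = sax_backend('XML::SAX::PurePerl');
print defined $backend
    ? "backend: $backend\n"
    : "XML::SAX is not installed\n";
```

Calling sax_backend() with no argument reports whichever parser the local
ParserDetails.ini selects, which is a quick way to confirm whether
XML::SAX::Expat is the default on a given machine.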

> Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
> if DBD::mysql can be loaded, and if not doesn't run the test.  Since
> the file is only one-line long, here's the modified file rather than a
> patch:
> ################################################################
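> BEGIN {
>          # DBD::mysql is required.  'use' runs at compile time even inside
>          # an 'if' block, so Test::More is loaded with 'require' and the
>          # skip is planned only after the eval has actually failed.
>          eval {
>                  require DBD::mysql;
>                  };
>          if ( $@ ) {
>                  require Test::More;
>                  # plan(skip_all => ...) prints the reason and exits with 0
>                  Test::More::plan( skip_all =>
>                          "DBD::mysql is not installed or is installed "
>                          . "incorrectly - skipping BioDBSeqFeature_mysql.t" );
>                  }
>          }
>
> system "perl t/BioDBSeqFeature.t -adaptor DBI::mysql -create 1 -temp 1 -dsn test";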
> ################################################################
>
> And when I run it I get:
> t/BioDBSeqFeature_mysql......skipped
>          all skipped: DBD::mysql is not installed or is installed incorrectly - skipping BioDBSeqFeature_mysql.t
>
> And for the overall make test:
> All tests successful, 3 tests and 106 subtests skipped.
> Files=236, Tests=11642, 247 wallclock secs (143.85 cusr + 20.39 csys = 164.24 CPU)

It should check this when using 'perl Makefile.PL', since the tests  
are only set up if MySQL is present (so you would assume that it  
checks for DBD::mysql).  I'll look into it.

Chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Wed Oct 18 06:52:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 07:52:05 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4534B156.4090501@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>
Message-ID: <4535CF15.4090502@sendu.me.uk>

Sendu Bala wrote:
> Bioperl 1.5.2 Release Candidate 2 is ready and available for testing.
> See http://www.bioperl.org/wiki/Release_1.5.2 for
> instructions on getting and testing this RC.
> 
> Developers:
>    This should be the last RC before release ~next Monday. Now would
>    be a good time for last-minute documentation updates and additions.

Given the few issues that have come up, it would be prudent to have 
another RC, so expect one around the time the 'Needs investigation' 
issues on the release page have been solved.

If you think there are more things that need investigation, please add 
them, but note the bias toward things that affect the successful 
completion of the test suite as opposed to general bugs which should go 
to Bugzilla as normal.


From bix at sendu.me.uk  Wed Oct 18 08:55:21 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 09:55:21 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45350BA6.3040102@genomics.dk>
References: <4534B156.4090501@sendu.me.uk>
	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>
	<45350BA6.3040102@genomics.dk>
Message-ID: <4535EBF9.1090706@sendu.me.uk>

Niels Larsen wrote:

> ------------ EBI
> 
> I invoked the EBI script
> 
> http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip
> 
> like this
> 
> WSWUBlastClient.pl -p blastn -D embl test.fasta
> 
> where the content of test.fasta is below, and got
> 
> Can't find method element in the message at 
> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.

As you admit, this is not a Bioperl issue. I would suggest you contact 
EBI support.

In the mean time/alternatively I'd suggest investigating the Bioperl 
interface to the SOAP server, which is part of the Bioperl-run package.

http://doc.bioperl.org/releases/bioperl-current/bioperl-run/Bio/Tools/Run/Analysis.html


> ------------ DDBJ
> 
> Inspired by this page,
> 
> http://xml.nig.ac.jp/doc/Blast.txt
> 
> I made this test script
[snip]
> which for me prints undef.

Again, not something I can really help you with. You'll need to 
triple-check your code and then seek support from the providers of that 
SOAP service.


> ------------- NCBI/Bioperl
> 
> I installed 1.5.2-RC2, looked at the RemoteBlast example in
> 
> http://www.bioperl.org/wiki/Bptutorial.pl
> 
> and then put that into this test code, more or less cut/paste,
[snip]
> Maybe I am supposed to add a check for content in $rc and then stop
> the inner loop?

Yes, the wiki page example isn't really adequate. I'll update it. For a 
better code example see the RemoteBlast documentation:

http://doc.bioperl.org/bioperl-live/Bio/Tools/Run/RemoteBlast.html


> I could figure that out maybe, but I wish there was a
> function which simply takes a single sequence + arguments and only
> returns a list of matches when done, and does not return until then
> (or until a specified timeout).

Yes, I hardly find dealing with RIDs that pleasant. You might like to 
add a feature request to Bugzilla.
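
Until such a feature exists, the blocking behaviour asked for can be
approximated with a small polling wrapper. This is only a sketch:
wait_for() is a hypothetical helper, not part of Bioperl, and the
commented RemoteBlast usage assumes (as the RemoteBlast documentation
describes) that retrieve_blast() returns a report object when the job is
done, 0 while it is pending and a negative value on error.

```perl
use strict;
use warnings;

# Hypothetical helper: call $poll every $interval seconds until it
# returns a defined value or $timeout seconds have elapsed.
# Returns the value, or undef on timeout.
sub wait_for {
    my ($poll, $timeout, $interval) = @_;
    my $deadline = time + $timeout;
    while (1) {
        my $value = $poll->();
        return $value if defined $value;
        return if time >= $deadline;
        sleep $interval;
    }
}

# Stand-in for a remote job that is ready on the third poll.
my $polls  = 0;
my $result = wait_for(
    sub { ++$polls < 3 ? undef : "report after $polls polls" }, 10, 0 );
print "$result\n";

# Assumed RemoteBlast usage (not run here):
#   my $report = wait_for( sub {
#       my $rc = $factory->retrieve_blast($rid);
#       die "BLAST job $rid failed\n" if !ref $rc && $rc < 0;
#       return ref $rc ? $rc : undef;
#   }, 600, 5 );
```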


From n.haigh at sheffield.ac.uk  Wed Oct 18 09:58:00 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 10:58:00 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
Message-ID: <4535FAA8.2050506@sheffield.ac.uk>

I get all tests passing except for BioDBSeqFeature_mysql which fails all
tests (1-46).

During perl Makefile.PL I get:
"I see you have Berkeleydb installed. I will create the DBD tests for
Bio::DB::SeqFeature::Store..."

I notice under the "needs investigation" there is mention about tests
being generated even if DBD::mysql isn't installed. I assume this is the
problem? If this is the problem should DBD::mysql be added to the
dependencies in Makefile.PL?

Is there an easy way to find out what tests are being skipped due to
absent modules?

Cheers
Nath


From n.haigh at sheffield.ac.uk  Wed Oct 18 11:34:21 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 12:34:21 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4535EBF9.1090706@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>
	<4535EBF9.1090706@sendu.me.uk>
Message-ID: <4536113D.1080307@sheffield.ac.uk>

I've just added test results for 1.5.2 RC2 to the wiki.

There are lots of fails for packages other than bioperl-live. I'm not
sure exactly how the test fails/skips are/should be handled since my
setups are as follows.

Clean WinXP Pro:
This is a clean install of WinXP Pro SP2 with no major software
installed, other than ActivePerl 5.8.8.819 and a few tools for archive
extracting, anti virus etc. Therefore, I'm unsure how tests in
bioperl-network and bioperl-db should return. For example, I have made
no effort to setup biosql-schema but I thought that maybe there would be
a test that would detect this, and fail, then skip over other tests
gracefully - like the bioperl-run tests when a piece of software is not
installed???

Debian Linux:
This is a Bio-Linux machine with quite a lot of bioinformatics software
installed in the Path. So most of the tests in bioperl-run should
probably have passed. The same goes for bioperl-network and bioperl-db
as with my Windows setup.

If my thoughts are totally wrong - let me know!
Nath


From bix at sendu.me.uk  Wed Oct 18 12:03:11 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 13:03:11 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <4535FAA8.2050506@sheffield.ac.uk>
References: <4535FAA8.2050506@sheffield.ac.uk>
Message-ID: <453617FF.9080508@sendu.me.uk>

Nathan Haigh wrote:
> I get all tests passing except for BioDBSeqFeature_mysql which fails all
> tests (1-46).
> 
> During perl Makefile.PL I get:
> "I see you have Berkeleydb installed. I will create the DBD tests for
> Bio::DB::SeqFeature::Store..."
> 
> I notice under the "needs investigation" there is mention about tests
> being generated even if DBD::mysql isn't installed. I assume this is the
> problem? 

Probably. I'm looking into it. Not sure why it wasn't causing a problem 
before now.

 > If this is the problem should DBD::mysql be added to the
 > dependencies in Makefile.PL?

No. You can use the modules in question without mysql (presumably; ie. 
you have a different sql setup), so it makes no sense to warn people 
they don't have a module they absolutely do not need.


> Is there an easy way to find out what tests are being skipped due to
> absent modules?

Ideally, when the skip occurs the test script will issue a message. I 
think that happens in most, if not all cases.
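
Since 'make test' hides per-test output, one way to see those messages is
to run a script directly (e.g. 'perl -I. t/SearchIO.t') and scan its TAP
output for skip directives. A minimal sketch follows; skipped_tests() is a
hypothetical helper and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: pull skip directives out of raw TAP lines.
sub skipped_tests {
    my @found;
    for my $line (@_) {
        if ( $line =~ /^1\.\.0\s*#\s*skip\S*\s*(.*)$/i ) {
            # whole script skipped, e.g. "1..0 # Skip no DBD::mysql"
            push @found, { test => 'all', reason => $1 };
        }
        elsif ( $line =~ /^ok\s+(\d+)\b.*?#\s*skip\S*\s*(.*)$/i ) {
            # single test skipped, e.g. "ok 12 # skip no network"
            push @found, { test => $1, reason => $2 };
        }
    }
    return @found;
}

# Made-up sample output from a test script.
my @tap = (
    '1..3',
    'ok 1',
    'ok 2 # skip DBD::mysql not installed',
    'not ok 3',
);
printf "test %s skipped: %s\n", $_->{test}, $_->{reason}
    for skipped_tests(@tap);
```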


From bix at sendu.me.uk  Wed Oct 18 13:02:50 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 14:02:50 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <453617FF.9080508@sendu.me.uk>
References: <4535FAA8.2050506@sheffield.ac.uk> <453617FF.9080508@sendu.me.uk>
Message-ID: <453625FA.6090907@sendu.me.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
...
>> I notice under the "needs investigation" there is mention about tests
>> being generated even if DBD::mysql isn't installed. I assume this is the
>> problem? 
> 
> Probably. I'm looking into it. Not sure why it wasn't causing a problem 
> before now.
> 
>  > If this is the problem should DBD::mysql be added to the
>  > dependencies in Makefile.PL?
> 
> No. You can use the modules in question without mysql (presumably; ie. 
> you have a different sql setup), so it makes no sense to warn people 
> they don't have a module they absolutely do not need.

Oops. It /is/ in the pre-reqs in Makefile.PL. Maybe DBD::mysql is the 
only supported driver?


From bix at sendu.me.uk  Wed Oct 18 13:16:24 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 18 Oct 2006 14:16:24 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>
References: <001401c6f21a$836f9fc0$15327e82@pyrimidine>	<20061017194952.dd4a1guc1iww4sgk@webmail.utoronto.ca>
	<67CA244C-5A2C-46A0-B474-47C5B67D199B@uiuc.edu>
Message-ID: <45362928.8070104@sendu.me.uk>

Chris Fields wrote:
> On Oct 17, 2006, at 6:49 PM, Paul Boutros wrote:
> 
>> Hi Chris,
>>
>> Yup, that's it.  I installed XML::SAX::ExpatXS (make test output
>> below).  Should there be a note somewhere in the INSTALL docs saying
>> basically what you just wrote?  Or maybe it's already there somewhere
>> and I missed it.
> 
> The INSTALL docs should have this, yes.  I'll double-check though.
> 
> Pretty much anything that plugs into XML::SAX except XML::SAX::Expat  
> works (XML::LibXML also works, I found).
> 
>> Also, I'd propose changing BioDBSeqFeature_mysql.t so that it checks
>> if DBD::mysql can be loaded,
[snip]
> It should check this when using 'perl Makefile.PL', since the tests  
> are only set up if MySQL is present (so you would assume that it  
> checks for DBD::mysql).  I'll look into it.

This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in 
my t directory when I packed it up for release.

I'm tweaking Makefile.PL right now in any case; there are a few errors 
and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean.


From cjfields at uiuc.edu  Wed Oct 18 13:55:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 08:55:37 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
Message-ID: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>

Ding dong the witch is dead!  As announced previously, from the latest
GenBank release (156.0):

-----------------------------------------------

1.3.8 Feature location syntax X.Y no longer supported

  The Feature Table has supported feature locations of the form 'X.Y', to
represent a base position which is greater or equal to X, and less than or
equal to Y. For example:

	misc_feature    1.10..20
	misc_feature    join(100..150,200.210..250)

  In the first example, the misc_feature starts somewhere between bases 1
and 10 (inclusive), and ends at basepair 20. In the second, the 51 bases
from 100..150 are joined together with a second basepair interval, which
could be anywhere from 200..250 to 210..250 .

  Although this syntax seems like a reasonable way to capture an uncertain
interval, it is used for features on a vanishingly small number of sequence
records, most database submission mechanisms don't support it, and the
meaning of its use in a join() context is not entirely clear.

  As of October 2006, this type of location is no longer supported.
Those records with features which utilize X.Y locations will be reviewed and
converted to a non-uncertain format.

-----------------------------------------------
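
To make the retired syntax concrete, here is a minimal sketch that turns a
simple 'start..end' location, where either end may use the X.Y form, into
explicit minimum and maximum positions. parse_fuzzy_range() is a
hypothetical helper written for this note, not Bioperl code, and it does
not handle join() or other operators.

```perl
use strict;
use warnings;

# Hypothetical helper: split one simple "start..end" location whose
# start and/or end may use the retired X.Y form.
sub parse_fuzzy_range {
    my ($loc) = @_;
    my ($start, $end) = $loc =~ /^(\d+(?:\.\d+)?)\.\.(\d+(?:\.\d+)?)$/
        or die "unsupported location: $loc\n";

    my %r;
    for my $pair ( [ start => $start ], [ end => $end ] ) {
        my ($name, $pos) = @$pair;
        # "X.Y" means anywhere from X to Y; a plain number is exact.
        my ($min, $max) = split /\./, $pos;
        $max = $min unless defined $max;
        @r{ "min_$name", "max_$name" } = ($min, $max);
    }
    return \%r;
}

for my $loc ( '1.10..20', '200.210..250' ) {
    my $r = parse_fuzzy_range($loc);
    print "$loc: start in [$r->{min_start}, $r->{max_start}], ",
          "end in [$r->{min_end}, $r->{max_end}]\n";
}
```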

EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
Not sure about UniProt/SwissProt.

I guess we're keeping this in for backwards compatibility only, but how do
we handle any bugs that pop up related to this?  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Wed Oct 18 14:10:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 09:10:07 -0500
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <453617FF.9080508@sendu.me.uk>
Message-ID: <001f01c6f2bf$20737270$15327e82@pyrimidine>

> Nathan Haigh wrote:
> > I get all tests passing except for BioDBSeqFeature_mysql which fails all
> > tests (1-46).
> >
> > During perl Makefile.PL I get:
> > "I see you have Berkeleydb installed. I will create the DBD tests for
> > Bio::DB::SeqFeature::Store..."
> >
> > I notice under the "needs investigation" there is mention about tests
> > being generated even if DBD::mysql isn't installed. I assume this is the
> > problem?
> 
> Probably. I'm looking into it. Not sure why it wasn't causing a problem
> before now.

Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP
because 'perl Makefile.PL' doesn't detect my MySQL installation, so the
MySQL-based tests don't run even though I have DBD::mysql installed.  I
thought this might just be a WinXP issue, but apparently not.  If I can get
to it I'll run a few checks.

>  > If this is the problem should DBD::mysql be added to the
>  > dependencies in Makefile.PL?
> 
> No. You can use the modules in question without mysql (presumably; ie.
> you have a different sql setup), so it makes no sense to warn people
> they don't have a module they absolutely do not need.

Agreed, though I don't know if other relational DB's are supported like
PostgreSQL.

> > Is there an easy way to find out what tests are being skipped due to
> > absent modules?
> 
> Ideally, when the skip occurs the test script will issue a message. I
> think that happens in most, if not all cases.

Yes, though we may run into the same issue we had with XEMBL tests not
reporting the reasons it skipped.  Each test suite should run an eval{} to
check the required modules, then only skip blocks of tests that rely on
those modules.  I think we have caught most of those, but who knows w/o
doing a complete test suite audit?

Our eventual complete switchover to Test::More should hopefully clean these
up.  I don't consider it a pressing issue for this release, though Sendu may
feel differently.
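
The eval{}-then-skip pattern described above can be sketched with
Test::More as follows. This is a minimal illustration, not code from the
Bioperl test suite; the two checks inside the SKIP block are placeholders
for whatever tests really depend on the optional module.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Probe for the optional module once, at run time; eval{} traps the
# failure so the rest of the script keeps going.
my $has_mysql = eval { require DBD::mysql; 1 } ? 1 : 0;

ok( 1, 'test that needs no optional module' );

SKIP: {
    # Skip only the two tests that rely on DBD::mysql, and say why.
    skip 'DBD::mysql not installed', 2 unless $has_mysql;

    ok( DBD::mysql->can('driver'),    'driver class is loaded' );
    ok( defined $DBD::mysql::VERSION, 'version is defined' );
}
```

The skip reason ends up in the script's own output, so the summary still
counts three tests while the reason for each skipped one stays visible
when the script is run by itself.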

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Wed Oct 18 14:12:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 09:12:52 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45362928.8070104@sendu.me.uk>
Message-ID: <002001c6f2bf$807849c0$15327e82@pyrimidine>

...
> This looks like a booboo on my part. I left t/BioDBSeqFeature_mysql.t in
> my t directory when I packed it up for release.
> 
> I'm tweaking Makefile.PL right now in any case; there are a few errors
> and t/BioDBSeqFeature_mysql.t et al. don't get deleted in a make clean.

Okay, makes sense now.  No big deal, it's still an RC (a developer's RC at
that!).

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From n.haigh at sheffield.ac.uk  Wed Oct 18 14:17:35 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 15:17:35 +0100
Subject: [Bioperl-l] RC2 test results on WinXP
In-Reply-To: <001f01c6f2bf$20737270$15327e82@pyrimidine>
References: <001f01c6f2bf$20737270$15327e82@pyrimidine>
Message-ID: <4536377F.6000408@sheffield.ac.uk>

Chris Fields wrote:
>> Nathan Haigh wrote:
>>     
>>> I get all tests passing except for BioDBSeqFeature_mysql which fails all
>>> tests (1-46).
>>>
>>> During perl Makefile.PL I get:
>>> "I see you have Berkeleydb installed. I will create the DBD tests for
>>> Bio::DB::SeqFeature::Store..."
>>>
>>> I notice under the "needs investigation" there is mention about tests
>>> being generated even if DBD::mysql isn't installed. I assume this is the
>>> problem?
>>>       
>> Probably. I'm looking into it. Not sure why it wasn't causing a problem
>> before now.
>>     
>
> Funny, I don't have problems with the BioDBSeqFeature_mysql tests on WinXP
> because 'perl Makefile.PL' doesn't detect my MySQL installation, so the
> MySQL-based tests don't run even though I have DBD::mysql installed.  I
> thought this might just be a WinXP issue, but apparently not.  If I can get
> to it I'll run a few checks.
>
>   
This was on WinXP.
>>  > If this is the problem should DBD::mysql be added to the
>>  > dependencies in Makefile.PL?
>>
>> No. You can use the modules in question without mysql (presumably; ie.
>> you have a different sql setup), so it makes no sense to warn people
>> they don't have a module they absolutely do not need.
>>     
>
> Agreed, though I don't know if other relational DB's are supported like
> PostgreSQL.
>
>   
>>> Is there an easy way to find out what tests are being skipped due to
>>> absent modules?
>>>       
>> Ideally, when the skip occurs the test script will issue a message. I
>> think that happens in most, if not all cases.
>>     
>
> Yes, though we may run into the same issue we had with XEMBL tests not
> reporting the reasons it skipped.  Each test suite should run an eval{} to
> check the required modules, then only skip blocks of tests that rely on
> those modules.  I think we have caught most of those, but who knows w/o
> doing a complete test suite audit?
>
> Our eventual complete switchover to Test::More should hopefully clean these
> up.  I don't consider it a pressing issue for this release, though Sendu may
> feel differently.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>
>   


From hlapp at gmx.net  Wed Oct 18 14:36:31 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 18 Oct 2006 10:36:31 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>
References: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>
Message-ID: <B8036AB0-741F-427A-9EB1-7E80A28EC79F@gmx.net>


On Oct 18, 2006, at 9:55 AM, Chris Fields wrote:

> how do we handle any bugs that pop up related to this?

By an evil grin, followed by deflecting the blame to NCBI, followed  
by another evil grin.
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Oct 18 14:43:31 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 09:43:31 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
In-Reply-To: <B8036AB0-741F-427A-9EB1-7E80A28EC79F@gmx.net>
Message-ID: <002401c6f2c3$c83c7e30$15327e82@pyrimidine>

> On Oct 18, 2006, at 9:55 AM, Chris Fields wrote:
> 
> > how do we handle any bugs that pop up related to this?
> 
> By an evil grin, followed by deflecting the blame to NCBI, followed
> by another evil grin.
> --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Sounds good to me!  One less thing to worry about.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From n.haigh at sheffield.ac.uk  Wed Oct 18 14:45:57 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Wed, 18 Oct 2006 15:45:57 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4536113D.1080307@sheffield.ac.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>	<4535EBF9.1090706@sendu.me.uk>
	<4536113D.1080307@sheffield.ac.uk>
Message-ID: <45363E25.8010806@sheffield.ac.uk>

Nathan Haigh wrote:
> I've just added test results for 1.5.2 RC2 to the wiki.
>
> There are lots of fails for packages other than bioperl-live. I'm not
> sure exactly how the test fails/skips are/should be handled since my
> setups are as follows.
>
> Clean WinXP Pro:
> This is a clean install of WinXP Pro SP2 with no major software
> installed, other than ActivePerl 5.8.8.819 and a few tools for archive
> extracting, anti virus etc. Therefore, I'm unsure how tests in
> bioperl-network and bioperl-db should return. For example, I have made
> no effort to setup biosql-schema but I thought that maybe there would be
> a test that would detect this, and fail, then skip over other tests
> gracefully - like the bioperl-run tests when a piece of software is not
> installed???
>
> Debian Linux:
> This is a Bio-Linux machine with quite a lot of bioinformatics software
> installed in the Path. So most of the tests in bioperl-run should
> probably have passed. The same goes for bioperl-network and bioperl-db
> as with my Windows setup.
>
> If my thoughts are totally wrong - let me know!
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>   
Just looking into the failed Linux tests.

Several of the tests result in errors like:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Alignment::Exonerate::AUTOLOAD
/home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:126
STACK: Bio::Tools::Run::Alignment::Exonerate::new
/home/bo1nsh/cvswc/bioperl-run-live/blib/lib/Bio/Tools/Run/Alignment/Exonerate.pm:154
STACK: t/Exonerate.t:32
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: 'arguments' !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Hmmer::AUTOLOAD Bio/Tools/Run/Hmmer.pm:172
STACK: Bio::Tools::Run::Hmmer::_run Bio/Tools/Run/Hmmer.pm:253
STACK: Bio::Tools::Run::Hmmer::run Bio/Tools/Run/Hmmer.pm:228
STACK: t/Hmmer.t:54
-----------------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unallowed parameter: ARGUMENTS !
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Phrap::AUTOLOAD Bio/Tools/Run/Phrap.pm:137
STACK: Bio::Tools::Run::Phrap::new Bio/Tools/Run/Phrap.pm:165
STACK: t/Phrap.t:34
-----------------------------------------------------------

Any ideas??

Nath


From hlapp at gmx.net  Wed Oct 18 14:51:36 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 18 Oct 2006 10:51:36 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4536113D.1080307@sheffield.ac.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>
	<4535EBF9.1090706@sendu.me.uk> <4536113D.1080307@sheffield.ac.uk>
Message-ID: <E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>


On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote:

>  For example, I have made
> no effort to setup biosql-schema but I thought that maybe there  
> would be
> a test that would detect this

I'm afraid there isn't. Bioperl-db is meaningless without biosql-schema.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bosborne11 at verizon.net  Wed Oct 18 14:43:06 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 10:43:06 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
 GenBank/EMBL/DDBJ
In-Reply-To: <001e01c6f2bd$1769aca0$15327e82@pyrimidine>
Message-ID: <C15BB5BA.ADAA%bosborne11@verizon.net>

Chris,

I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all
of the more recent examples in t/LocationFactory.t come from there.

Brian O.


On 10/18/06 9:55 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
> Not sure about UniProt/SwissProt.


From cjfields at uiuc.edu  Wed Oct 18 15:00:30 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 10:00:30 -0500
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
	GenBank/EMBL/DDBJ
In-Reply-To: <C15BB5BA.ADAA%bosborne11@verizon.net>
Message-ID: <002501c6f2c6$27625540$15327e82@pyrimidine>

Do they still use the X.Y notations?  Those are the most troublesome.  I
guess we still don't support the ones containing '?'.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: Brian Osborne [mailto:bosborne11 at verizon.net]
> Sent: Wednesday, October 18, 2006 9:43 AM
> To: Chris Fields; bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in
> GenBank/EMBL/DDBJ
> 
> Chris,
> 
> I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all
> of the more recent examples in t/LocationFactory.t come from there.
> 
> Brian O.
> 
> 
> On 10/18/06 9:55 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:
> 
> > EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
> > Not sure about UniProt/SwissProt.


From Kevin.M.Brown at asu.edu  Wed Oct 18 15:16:50 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 18 Oct 2006 08:16:50 -0700
Subject: [Bioperl-l] Blast information
Message-ID: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>

I just recently upgraded to 1.5.1 on WinXP to bring this version closer
to live to parse some locally created blast files.  I'm trying to find
the method that returns the values that are underneath the Identities
and Positives information as I'm trying to replicate the output of an
old blast parser we have here written in RealBasic which is showing its
age.  Once I have it replicating the old output I then intend to add
more features in terms of filtering returned hits (like not returning
self->self hits or a->b so don't show b->a).
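
For the filtering step, something like the following might be a starting
point. It is only a sketch in plain Perl: filter_pairs() is a made-up
name, and the pairs are assumed to be simple [query, hit] name pairs
pulled out of the parsed report beforehand.

```perl
use strict;
use warnings;

# Hypothetical helper: takes [query, hit] name pairs and drops
# self hits and reciprocal duplicates (keeps a->b, drops a later b->a).
sub filter_pairs {
    my @pairs = @_;
    my (%seen, @kept);
    for my $pair (@pairs) {
        my ($query, $hit) = @$pair;
        next if $query eq $hit;                      # self->self hit
        my $key = join("\0", sort($query, $hit));    # order-independent key
        next if $seen{$key}++;                       # reciprocal already kept
        push @kept, $pair;
    }
    return @kept;
}

my @kept = filter_pairs(['a', 'a'], ['a', 'b'], ['b', 'a'], ['b', 'c']);
print join(', ', map { sprintf('%s->%s', @$_) } @kept), "\n";
```

The first pair seen in either direction wins, so the order of the input
decides whether a->b or b->a is reported.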

Example:
I'm looking for the methods that will return 117 from identities and 117
from positives.  I can't just use num_identical/percent_identity as that
isn't 100% accurate.

>BurkM_2016
          Length = 241

 Score = 43.2 bits (88), Expect = 7e-005
 Identities = 26/117 (22%), Positives = 51/117 (43%)

Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL 357
           Q   F  F  + A+    ++ +         + + L +R   GL   + P   E + A+L
Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL 170

Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
              A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227

Thanks,
Kevin


From cjfields at uiuc.edu  Wed Oct 18 15:25:59 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 10:25:59 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <4536113D.1080307@sheffield.ac.uk>
Message-ID: <002601c6f2c9$b6d04a90$15327e82@pyrimidine>

> I've just added test results for 1.5.2 RC2 to the wiki.
> 
> There are lots of fails for packages other than bioperl-live. I'm not
> sure excatly how the test fails/skipps are/should be handled since my
> setups are as follows.
> 
> Clean WinXP Pro:
> This is a clean install of WinXP Pro SP2 with no major software
> installed, other than ActivePerl 5.8.8.819 and a few tools for archive
> extracting, anti virus etc. Therefore, I'm unsure how tests in
> bioperl-network and bioperl-db should return. For example, I have made
> no effort to setup biosql-schema but I thought that maybe there would be
> a test that would detect this, and fail, then skip over other tests
> gracefully - like the bioperl-run tests when a piece of software is not
> installed???
> 
> Debian Linux:
> This is a Bio-Linux machine with quite a lot of bioinformatics software
> installed in the Path. So most of the tests in bioperl-run should
> probably have passed. The same goes for bioperl-network and bioperl-db
> as with my Windows setup.
> 
> If my thoughts are totally wrong - let me know!
> Nath

The bioperl-db tests rely on a local BioSQL database and on having a
properly set up configuration file (these are detailed in the bioperl-db
INSTALL doc).  Furthermore, there are serious problems with bioperl-db and
WinXP (see Bug 1938 in bugzilla).  There is a workaround, but it isn't
perfect by any means.  

http://bugzilla.open-bio.org/show_bug.cgi?id=1938

Many of the bioperl-run tests rely on env. variables being set properly, so
maybe that's why they failed.  These should all be detailed in the INSTALL
file (but maybe they aren't?).

I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac OS
X yet but intend to do this within the week.  The INSTALL file details
the requirements for the packages (Graph 0.80 is the only one for
bioperl-network, for instance, and there isn't a PPM for that version
available yet).  

It would be nice to skip the tests based on absence of the particular
modules or installed programs, and I think the final goal is to possibly
attempt to do this.  However, all of the bioperl-related distributions have
their own documentation which outline their installation, requirements, and
use.  At least we can point to that, which works for now.  We could always
start up a wiki page for the various bioperl distributions to monitor
problems or issues with each based on OS, proposed enhancements/ideas, etc.
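
A rough sketch of what skipping on a missing module could look like,
assuming Test::More.  module_available() is a hypothetical helper, and
DBD::mysql only stands in for whatever optional dependency a given test
script needs; this is not how any existing test is written.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper: returns 1 if the named module can be loaded.
sub module_available {
    my ($module) = @_;
    return 0 unless $module =~ /^\w+(?:::\w+)*$/;   # plain module names only
    return eval("require $module; 1") ? 1 : 0;
}

# DBD::mysql is only an example of an optional dependency.
my $have_mysql = module_available('DBD::mysql');

ok(1, 'tests that need no optional module always run');

SKIP: {
    # Skip just this block, and say why, instead of the whole script.
    skip('DBD::mysql not installed', 2) unless $have_mysql;
    ok(DBD::mysql->can('driver'), 'driver entry point is present');
    ok(defined $DBD::mysql::VERSION, 'module reports a version');
}

done_testing();
```

The skip reason shows up in the test output, so it is clear which
blocks were skipped and why.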


Also, most (if not all, including core) have been primarily tested on some
*nix-related system, which means that they may not work on Win32 systems.
Though the Windows support is light-years ahead of what it used to be circa
rel 0.7, I don't think it is foolproof yet, as witnessed by the bioperl-db
bug.  Frankly, we need more WinXP users for those packages willing to test
them out and offer suggestions.  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bosborne11 at verizon.net  Wed Oct 18 15:13:51 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 11:13:51 -0400
Subject: [Bioperl-l] X.Y Fuzzy locations no longer supported in
 GenBank/EMBL/DDBJ
In-Reply-To: <002501c6f2c6$27625540$15327e82@pyrimidine>
Message-ID: <C15BBCEF.ADB8%bosborne11@verizon.net>

Chris,

No, I don't think they use the form X.Y. See below, from
t/LocationFactory.t, we do support most of the forms using ?. Supposedly
these tests accommodate all of the possible fuzzy locations encountered in
Swissprot, I wrote these a year or so ago.

Brian O.


        # UNCERTAIN locations and positions (Swissprot)
   "?2465..2774" => [$fuzzy_impl,
       2465, 2465, "UNCERTAIN", 2774, 2774, "EXACT", "EXACT", 1, 1],
   "22..?64" => [$fuzzy_impl,
       22, 22, "EXACT", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
   "?22..?64" => [$fuzzy_impl,
       22, 22, "UNCERTAIN", 64, 64, "UNCERTAIN", "EXACT", 1, 1],
   "?..>393" => [$fuzzy_impl,
       undef, undef, "UNCERTAIN", 393, undef, "AFTER", "UNCERTAIN", 1, 1],
   "<1..?" => [$fuzzy_impl,
       undef, 1, "BEFORE", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
   "?..536" => [$fuzzy_impl,
       undef, undef, "UNCERTAIN", 536, 536, "EXACT", "UNCERTAIN", 1, 1],
   "1..?" => [$fuzzy_impl,
       1, 1, "EXACT", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
   "?..?" => [$fuzzy_impl,
       undef, undef, "UNCERTAIN", undef, undef, "UNCERTAIN", "UNCERTAIN", 1, 1],
   # Not working yet:
   #"12..?1" => [$fuzzy_impl,
   #    1, 1, "UNCERTAIN", 12, 12, "EXACT", "EXACT", 1, 1]
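
To make the notation concrete, here is a standalone sketch of how these
strings break down into a position and a type.  It is not the BioPerl
parser: parse_position() and parse_fuzzy() are made-up names, each side
is reduced to a single value rather than the min/max pairs in the table
above, and the "12..?1" case is not handled specially.

```perl
use strict;
use warnings;

# Hypothetical helper: split one side of a location into (value, type).
sub parse_position {
    my ($pos) = @_;
    return (undef, 'UNCERTAIN') if $pos eq '?';
    return ($1, 'UNCERTAIN')    if $pos =~ /^\?(\d+)$/;
    return ($1, 'BEFORE')       if $pos =~ /^<(\d+)$/;
    return ($1, 'AFTER')        if $pos =~ /^>(\d+)$/;
    return ($1, 'EXACT')        if $pos =~ /^(\d+)$/;
    die "unrecognized position '$pos'\n";
}

# Hypothetical helper: parse "START..END" into a hash reference.
sub parse_fuzzy {
    my ($loc) = @_;
    my ($start, $end) = split /\.\./, $loc, 2;
    die "expected START..END, got '$loc'\n" unless defined $end;
    my ($s, $s_type) = parse_position($start);
    my ($e, $e_type) = parse_position($end);
    return { start => $s, start_type => $s_type,
             end   => $e, end_type   => $e_type };
}

for my $loc ('?2465..2774', '22..?64', '<1..?', '?..>393') {
    my $p = parse_fuzzy($loc);
    printf "%-12s start=%s (%s) end=%s (%s)\n", $loc,
        defined $p->{start} ? $p->{start} : 'undef', $p->{start_type},
        defined $p->{end}   ? $p->{end}   : 'undef', $p->{end_type};
}
```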


On 10/18/06 11:00 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> Do they still use the X.Y notations?  Those are the most troublesome.  I
> guess we still don't support the ones containing '?'.
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
>> -----Original Message-----
>> From: Brian Osborne [mailto:bosborne11 at verizon.net]
>> Sent: Wednesday, October 18, 2006 9:43 AM
>> To: Chris Fields; bioperl-l at lists.open-bio.org
>> Subject: Re: [Bioperl-l] X.Y Fuzzy locations no longer supported in
>> GenBank/EMBL/DDBJ
>> 
>> Chris,
>> 
>> I'm fairly sure that Swissprot/Uniprot still supports fuzzy locations, all
>> of the more recent examples in t/LocationFactory.t come from there.
>> 
>> Brian O.
>> 
>> 
>> On 10/18/06 9:55 AM, "Chris Fields" <cjfields at uiuc.edu> wrote:
>> 
>>> EMBL/DDBJ formats are also dropping support for these 'fuzzy' locations.
>>> Not sure about UniProt/SwissProt.
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Wed Oct 18 16:56:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 11:56:07 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <002601c6f2c9$b6d04a90$15327e82@pyrimidine>
Message-ID: <000401c6f2d6$5144e2f0$15327e82@pyrimidine>

All,

...
> I haven't tried installing bioperl-network or bioperl-run on WinXP or Mac
> OS X yet but intended on doing this within the week.  The INSTALL file
> details the requirements for the packages (Graph 0.80 is the only one for
> bioperl-network, for instance, and there isn't a PPM for that version
> available yet).
...

As a follow-up to this, I tried bioperl-network and had similar failed tests
with Graph 0.79 (the only PPM available from ActiveState).  However, the
INSTALL docs state that Graph 0.80 is needed, and the test run gave several
warnings about not having Graph 0.80 installed. 

I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and
everything passed.  Maybe we need to have a Graph PPM available for those
who want bioperl-network?

As for bioperl-run, all tests passed from a new CVS checkout even though I
have none of the programs installed, so they seem to skip properly.  The
test run also printed warnings when a program wasn't available or installed.


Chris


From bosborne11 at verizon.net  Wed Oct 18 17:10:34 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 13:10:34 -0400
Subject: [Bioperl-l] Blast information
In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
Message-ID: <C15BD84A.ADCC%bosborne11@verizon.net>

Kevin,

Are you looking for hsp_length()? See the SearchIO HOWTO for a list of
methods:

http://www.bioperl.org/wiki/HOWTO:SearchIO


Brian O.


On 10/18/06 11:16 AM, "Kevin Brown" <Kevin.M.Brown at asu.edu> wrote:

> I just recently upgraded to 1.5.1 on WinXP to bring this version closer
> to live to parse some locally created blast files.  I'm trying to find
> the method that returns the values that are underneath the Identities
> and Positives information as I'm trying to replicate the output of an
> old blast parser we have here written in RealBasic which is showing its
> age.  Once I have it replicating the old output I then intend to add
> more features in terms of filtering returned hits (like not returning
> self->self hits or a->b so don't show b->a).
> 
> Example:
> I'm looking for the methods that will return 117 from identities and 117
> from positives.  I can't just use num_identical/percent_identity as that
> isn't 100% accurate.
> 
>> BurkM_2016
>           Length = 241
> 
>  Score = 43.2 bits (88), Expect = 7e-005
>  Identities = 26/117 (22%), Positives = 51/117 (43%)
> 
> Query: 298 QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL
> 357
>            Q   F  F  + A+    ++ +         + + L +R   GL   + P   E + A+L
> Sbjct: 111 QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL
> 170
> 
> Query: 358 MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
>               A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
> Sbjct: 171 KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227
> 
> Thanks,
> Kevin
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From Kevin.M.Brown at asu.edu  Wed Oct 18 21:25:48 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 18 Oct 2006 14:25:48 -0700
Subject: [Bioperl-l] Blast information
Message-ID: <1A4207F8295607498283FE9E93B775B4022A71C3@EX02.asurite.ad.asu.edu>

Yes, that does indeed look like what I was after. 

> -----Original Message-----
> From: Brian Osborne [mailto:bosborne11 at verizon.net] 
> Sent: Wednesday, October 18, 2006 10:11 AM
> To: Kevin Brown; bioperl-l
> Subject: Re: [Bioperl-l] Blast information
> 
> Kevin,
> 
> Are you looking for hsp_length()? See the SearchIO HOWTO for a list of
> methods:
> 
> http://www.bioperl.org/wiki/HOWTO:SearchIO
> 
> 
> Brian O.
> 
> 
> On 10/18/06 11:16 AM, "Kevin Brown" <Kevin.M.Brown at asu.edu> wrote:
> 
> > I just recently upgraded to 1.5.1 on WinXP to bring this 
> version closer
> > to live to parse some locally created blast files.  I'm 
> trying to find
> > the method that returns the values that are underneath the 
> Identities
> > and Positives information as I'm trying to replicate the 
> output of an
> > old blast parser we have here written in RealBasic which is 
> showing its
> > age.  Once I have it replicating the old output I then intend to add
> > more features in terms of filtering returned hits (like not 
> returning
> > self->self hits or a->b so don't show b->a).
> > 
> > Example:
> > I'm looking for the methods that will return 117 from 
> identities and 117
> > from positives.  I can't just use 
> num_identical/percent_identity as that
> > isn't 100% accurate.
> > 
> >> BurkM_2016
> >           Length = 241
> > 
> >  Score = 43.2 bits (88), Expect = 7e-005
> >  Identities = 26/117 (22%), Positives = 51/117 (43%)
> > 
> > Query: 298 
> QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL
> > 357
> >            Q   F  F  + A+    ++ +         + + L +R   GL   + 
> P   E + A+L
> > Sbjct: 111 
> QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL
> > 170
> > 
> > Query: 358 
> MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
> >               A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
> > Sbjct: 171 
> KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227
> > 
> > Thanks,
> > Kevin
> > 
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 
> 


From n.appleby at uq.edu.au  Wed Oct 18 21:58:06 2006
From: n.appleby at uq.edu.au (Nikki Appleby)
Date: Thu, 19 Oct 2006 07:58:06 +1000
Subject: [Bioperl-l] CONTIG dealing
Message-ID: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>


I have just entered the wonderful new world of BioPerl, so the answer to my
question may be obvious to any of the gurus reading this.

I need to collect sequence features and ontology annotations. Here goes.

I am retrieving sequences from SwissProt via Bio::DB::SwissProt and
get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into an RDBMS
format that I am happy with I can get at the xref ids. In this case, they
are 

AP003451; BAB86144.1; -; Genomic_DNA. 
AP008207; BAF07116.1; -; Genomic_DNA. 
AB103395; BAC81207.1; -; mRNA. 

I can happily go off and fetch those from Bio::DB::GenBank (first column),
and Bio::DB::GenPept (second). All good, except...

AP008207 is a contig. I don't want to get all of the features for the entire
thing, just the single contig that actually matches the original sequence.
It takes a couple of hours to get at it and then it gives me way too much.

I will come across this problem with other sequences. How do I (a) find out
if it is a contig without downloading it in its entirety and (b) extract
the list of sequences that are about to be contigged together?

I have searched the web for answers, including this list, but see nothing.
Help!
 
Nikki Appleby.


From bosborne11 at verizon.net  Thu Oct 19 00:54:04 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Wed, 18 Oct 2006 20:54:04 -0400
Subject: [Bioperl-l] LocatableSeq object vs Sequence Object
In-Reply-To: <830b42e70610170926h5bdaabd6hc1352b8d4f3c8c4c@mail.gmail.com>
Message-ID: <C15C44EC.ADF8%bosborne11@verizon.net>

Peter,

I'm not understanding your question, partly because your letter and your
code are saying different things. You say you want to call
location_from_column() but your code shows you calling species(). What
happens when you call location_from_column? Do you see errors?

Brian O.


On 10/17/06 12:26 PM, "Peter H. Baenziger" <plu5even at gmail.com> wrote:

> I was thinking I could use:
> foreach $seq ($alignment->each_seq())
> to loop through the sequences and call:
> $seq->location_from_column($pos)
> on each of the sequences.  


From cjfields at uiuc.edu  Thu Oct 19 02:46:14 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 18 Oct 2006 21:46:14 -0500
Subject: [Bioperl-l] CONTIG dealing
In-Reply-To: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>
References: <000f01c6f300$7e2f3390$1b776682@IMBPC.AD>
Message-ID: <FAEAE9E1-EF95-4B79-AD75-B54D3E24E827@uiuc.edu>

On Oct 18, 2006, at 4:58 PM, Nikki Appleby wrote:

>
> I have just entered the wonderful new world of BioPerl, so the  
> answer to my
> question may be obvious to any of the gurus reading this.
>
> I need to collect sequence features and ontology annotations. Here  
> goes.
>
> I am retrieving sequences from SwissProt via Bio::DB::SwissProt and
> get_Seq_by_id, for this example Q8RZV7. Once I have parsed it into  
> an RDBMS
> format that I am happy with I can get at the xref ids. In this  
> case, they
> are
>
> AP003451; BAB86144.1; -; Genomic_DNA.
> AP008207; BAF07116.1; -; Genomic_DNA.
> AB103395; BAC81207.1; -; mRNA.
>
> I can happily go off and fetch those from Bio::DB::GenBank (first  
> column),
> and Bio::DB::GenPept (second). All good, except...
>
> AP008207 is a contig. I don't want to get all of the features for  
> the entire
> thing, just the single contig that actually matches the original  
> sequence.
> It takes a couple of hours to get at it and then it gives me way  
> too much.
>
> I will come across this problem with other sequences. How do I (a)  
> find out
> if it is a contig without downloading it in it's entirety and (b)  
> extract
> the list of sequences that are about to be contigged together.
>
> I have searched the web for answers, including this list, but see  
> nothing.
> Help!
>
> Nikki Appleby.

The default setting for the retrieval format for GenBank is  
'gbwithparts' (which gets the full sequence at all times).  You can  
set this to 'gb' using request_format() to retrieve the sequence file  
with the contig information instead of the sequence, if it contains  
such (otherwise it just retrieves the sequence anyway).

However, I have noticed this particular file does not represent a  
true contig record but is the entire chromosome sequence.  The contig  
information is in the comments section, probably b/c the record is  
converted over.  You could just download the sequence record and run  
regexp to grab the comments section, then parse out the contigs (a  
pain) if you really want that.  Or you could try to find the  
equivalent GenBank record, such as the ones derived from the WGS  
records.

I did notice the list of dbxrefs in your swissprot record indicate  
three EMBL sequences.  If the order is consistent for the SwissProt  
entries you want, they probably represent:

The contig (what you want): AP003451; BAB86144.1; -; Genomic_DNA.
The supercontig (chromosome) : AP008207; BAF07116.1; -; Genomic_DNA.
The cDNA : AB103395; BAC81207.1; -; mRNA.

I checked the first one (AP003451), which seems to confirm this.

Since the chromosome supercontig is built from the smaller sequence  
contigs you could just grab the first EMBL dbxref instead of all of  
them.  It parses much faster than the chromosome file.
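
As a rough sketch of that selection step, assuming the xref lines are
already available as plain strings in the form shown above:
first_genomic_accession() is a made-up helper, and taking the first
Genomic_DNA entry relies on the ordering guess above holding for your
entries.

```perl
use strict;
use warnings;

# Hypothetical helper: given cross-reference lines of the form
# "ACC; PROTEIN_ID; -; TYPE." return the first accession whose type
# is Genomic_DNA (assumed here to be the smaller contig).
sub first_genomic_accession {
    my @lines = @_;
    for my $line (@lines) {
        my ($acc, $protein_id, $status, $type) = split /\s*;\s*/, $line;
        next unless defined $type;
        $type =~ s/\.\s*$//;          # drop the trailing period
        return $acc if $type eq 'Genomic_DNA';
    }
    return;                           # nothing suitable found
}

my @xrefs = (
    'AP003451; BAB86144.1; -; Genomic_DNA.',
    'AP008207; BAF07116.1; -; Genomic_DNA.',
    'AB103395; BAC81207.1; -; mRNA.',
);
print first_genomic_accession(@xrefs), "\n";
```

The accession it returns is what you would then pass to the GenBank
fetch, instead of looping over every xref.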

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Wed Oct 18 15:47:14 2006
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Oct 2006 08:47:14 -0700
Subject: [Bioperl-l] Blast information
In-Reply-To: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
References: <1A4207F8295607498283FE9E93B775B4022A7096@EX02.asurite.ad.asu.edu>
Message-ID: <6B7D24F3-69F1-498D-AB53-B4CEB14E4F3D@bioperl.org>

I think this will work for you.

The seq_inds method parses the middle homology sequence and  
classifies each alignment column and returns a list of the columns  
meeting the criteria.  You can interrogate query or hit in this case  
since you are requiring it to be identical

my $identicalbases = scalar $hsp->seq_inds('query', 'identical');
my $conservedbases =  scalar $hsp->seq_inds('query','conserved');

Conserved returns those identical or conserved, if you want just  
those with conservative replacements use 'conserved-not-identical'

See http://bioperl.org/wiki/HOWTO:SearchIO#Table_of_Methods for more  
info.
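
To show what classifying the middle line amounts to, here is a
BioPerl-free sketch for a protein midline.  count_columns() is a made-up
helper, not what seq_inds does internally, and the midline string is
invented rather than taken from your report.

```perl
use strict;
use warnings;

# Hypothetical helper: classify each column of a protein BLAST midline.
# A letter marks an identical column, '+' a conservative replacement,
# and anything else (normally a space) a mismatch or gap.
sub count_columns {
    my ($midline) = @_;
    my $identical    = () = $midline =~ /[A-Za-z]/g;
    my $conservative = () = $midline =~ /\+/g;
    return {
        identical => $identical,
        conserved => $identical + $conservative,   # BLAST "Positives"
        length    => length($midline),             # alignment columns
    };
}

my $c = count_columns('AC+ G+');   # made-up midline, not from a real report
printf "Identities = %d/%d, Positives = %d/%d\n",
    $c->{identical}, $c->{length}, $c->{conserved}, $c->{length};
```

The denominator is just the number of alignment columns, which is the
117 in your example.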

-jason
On Oct 18, 2006, at 8:16 AM, Kevin Brown wrote:

> I just recently upgraded to 1.5.1 on WinXP to bring this version  
> closer
> to live to parse some locally created blast files.  I'm trying to find
> the method that returns the values that are underneath the Identities
> and Positives information as I'm trying to replicate the output of an
> old blast parser we have here written in RealBasic which is showing  
> its
> age.  Once I have it replicating the old output I then intend to add
> more features in terms of filtering returned hits (like not returning
> self->self hits or a->b so don't show b->a).
>
> Example:
> I'm looking for the methods that will return 117 from identities  
> and 117
> from positives.  I can't just use num_identical/percent_identity as  
> that
> isn't 100% accurate.
>
>> BurkM_2016
>           Length = 241
>
>  Score = 43.2 bits (88), Expect = 7e-005
>  Identities = 26/117 (22%), Positives = 51/117 (43%)
>
> Query: 298  
> QEEFFYAFEALVANKAQVIITSDTYPKEISGIDDRLISRFDSGLTVAIEPPELEMRVAIL
> 357
>            Q   F  F  + A+    ++ +         + + L +R   GL   + P   E +  
> A+L
> Sbjct: 111  
> QIALFNLFNEVRAHPMTALVVAGPAAPLALDVREDLRTRLGWGLVFHLAPLTDEGKAAVL
> 170
>
> Query: 358  
> MRKAQSEGVSLSEDVAFFVAKHLRSNVRELEGALRKILAYSKFHGREITIELTKEAL 414
>               A+  G++L++DV  ++  H R ++  L   L  +  +S    R +T+ L +  L
> Sbjct: 171  
> KHAAKERGIALADDVPSYLLTHFRRDMPSLMSLLDALDRFSLEQKRAVTLPLLRAML 227
>
> Thanks,
> Kevin
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From jason at bioperl.org  Thu Oct 19 05:00:28 2006
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 18 Oct 2006 22:00:28 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
Message-ID: <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>

So I'm unsure what we should do here.

We can certainly fix the problem which you report which is relying on  
the "" method -- if you were to do instead:
print $_->database, ":", $_->primary_id, "\n";

you'll get the right answer.  At a minimum, we can just fix the
auto-string-converting method to do The Right Thing.
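
Roughly what I have in mind for that minimal fix, as a standalone
sketch.  dblink_string() is a made-up function, not the code currently
in Bio::Annotation::DBLink, and it assumes the version is appended as
".N" to the primary id.

```perl
use strict;
use warnings;

# Hypothetical helper, not the actual Bio::Annotation::DBLink code:
# build "database:id.version" without repeating a version that the
# primary id already carries.
sub dblink_string {
    my ($database, $primary_id, $version) = @_;
    my $id = $primary_id;
    if (defined($version) && length($version)
        && $id !~ /\.\Q$version\E$/) {
        $id .= '.' . $version;        # append only when it is missing
    }
    return join(':', $database, $id);
}

print dblink_string('GenBank', 'AL591065.17', 17), "\n";
print dblink_string('GenBank', 'AL591065',    17), "\n";
```

Both calls give the same string, so it would not matter whether the
parser stored the version inside primary_id or not.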

But I am not sure if we should keep the version out of the primary_id  
field.  This will require some rejiggering in several modules when it  
comes to printing DBlinks and I don't want to do this before the  
release. I also am not sure if there was an explicit reason why  
someone did put the version information in the primary_id. (I hope it  
wasn't me because I don't think I'm going to remember why).

Does anyone else have a strong feeling?

-jason
On Oct 17, 2006, at 12:01 PM, Erikjan wrote:

> Hello,
>
> I noticed a little problem with the Annotation "DBLink" from  
> GenBank entries
>
> When I run:
>
> perl -MBio::DB::GenBank -e 'my $gi =
> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations 
> ("dblink");
> for(@annotations) { print $_, "\n";} print $INC{
> "Bio/Annotation/DBLink.pm" }, "\n"; '
>
> This yields:
>
>    GenBank:AL591065.17.17
>
> and the place where the used Bio/Annotation/DBLink.pm resides.
>
> Can others repeat this?
>
> I have dug into the source a little and Bio::Annotation::DBLink  
> seems to
> be the place where this happens: it has a concatenation which leads to
> that repeated version number.
>
> Is this something that I should fix "client-side", so to speak, or is it
> worthwhile to add some logic to that concatenation to prevent this?
>
>
> Thanks,
>
> Eric
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From n.haigh at sheffield.ac.uk  Thu Oct 19 06:41:02 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 19 Oct 2006 07:41:02 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <000401c6f2d6$5144e2f0$15327e82@pyrimidine>
References: <000401c6f2d6$5144e2f0$15327e82@pyrimidine>
Message-ID: <45371DFE.6050306@sheffield.ac.uk>


> As a followup in this, I tried bioperl-network and had similar failed tests
> with Graph 0.79 (the only PPM available from ActiveState).  However, the
> INSTALL docs state that Graph 0.80 is needed, and the test run gave several
> warnings about not having Graph 0.80 installed. 
>
> I made a PPM of Graph 0.80, installed, retried bioperl-network tests, and
> everything passed.  Maybe we need to have a Graph PPM available for those
> who want bioperl-network?
>
> As for bioperl-run, all tests passed from a new CVS checkout even though I
> have none of the programs installed, so they seem to skip properly.  The
> test run also printed warnings when a program wasn't available or installed.
>
>
> Chris
>
>   
If you can pop the Graph 0.80 ppd files in DIST/RC/, I'll make
modifications to integrate them into the package.xml file for PPM4 clients.

Nath


From n.haigh at sheffield.ac.uk  Thu Oct 19 10:40:21 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 19 Oct 2006 11:40:21 +0100
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
Message-ID: <45375615.1020603@sheffield.ac.uk>

Should line 25 read:
require Bio::Factory::EMBOSS

instead of:
require Bio::EMBOSS::Factory;

Nath


From hlapp at gmx.net  Thu Oct 19 13:56:05 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 19 Oct 2006 09:56:05 -0400
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
Message-ID: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>

Here is the overload code:

use overload '""' => sub {
	(($_[0]->database ? $_[0]->database . ':' : '' )
	. ($_[0]->primary_id ? $_[0]->primary_id : '')
	. ($_[0]->version ? '.' . $_[0]->version : ''))
	|| '' };

Except that the last '||' is redundant and unnecessary (it either  
does nothing or replaces an empty string with an empty string), I  
don't see the potential for duplicating the version number here -  
unless primary_id() did that, which I don't see it doing.

So, to me this seems to come from a parsing error in the beginning,  
rather than an erroneous mangling of version into primary_id later.

Is someone in a position to confirm this?

	-hilmar

On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:

> So I'm unsure what we should do here.
>
> We can certainly fix the problem which you report which is relying on
> the "" method -- if you were to do instead:
> print $_->database, ":", $_->primary_id, "\n";
>
> you'll get the right answer.  We at a minimum just fix the auto-
> string converting method to do The Right Thing.
>
> But I am not sure if we should keep the version out of the primary_id
> field.  This will require some rejiggering in several modules when it
> comes to printing DBlinks and I don't want to do this before the
> release. I also am not sure if there was an explicit reason why
> someone did put the version information in the primary_id. (I hope it
> wasn't me because I don't think I'm going to remember why).
>
> Does anyone else have a strong feeling?
>
> -jason
> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>
>> Hello,
>>
>> I noticed a little problem with the Annotation "DBLink" from
>> GenBank entries
>>
>> When I run:
>>
>> perl -MBio::DB::GenBank -e 'my $gi =
>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my $seqio =
>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>> ("dblink");
>> for(@annotations) { print $_, "\n";} print $INC{
>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>
>> This yields:
>>
>>    GenBank:AL591065.17.17
>>
>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>
>> Can others repeat this?
>>
>> I have dug into the source a little and Bio::Annotation::DBLink
>> seems to
>> be the place where this happens: it has a concatenation which  
>> leads to
>> that repeated version number.
>>
>> Is this something that I should fix "client-side", so to speak, or is it
>> worthwhile to add some logic to that concatenation to prevent this?
>>
>>
>> Thanks,
>>
>> Eric
>>
>>
>>
>
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From dmessina at wustl.edu  Thu Oct 19 13:55:31 2006
From: dmessina at wustl.edu (David Messina)
Date: Thu, 19 Oct 2006 08:55:31 -0500
Subject: [Bioperl-l] missing documentation (request for help)
Message-ID: <69453D5F-7794-4DC7-BAE1-A8B2191752E6@wustl.edu>

Hi all,

There are a few modules missing a one-line description, and by one- 
line description, I'm referring to the part that comes after the  
module name in the POD.

e.g. in

=head1 NAME

Bio::SearchIO - Driver for parsing Sequence Database Searches
(BLAST, FASTA, ...)

=head1 SYNOPSIS

[etc...]

"Driver for parsing Sequence Database Searches (BLAST, FASTA, ...)"  
is the one-line description (even though it falls onto two lines) :).
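
For anyone who wants to look for other affected modules, here is a rough
sketch of how the check could be automated. It is only an illustration:
name_description() is a made-up helper, the second POD sample is invented,
and the regexes assume the usual "Module::Name - description" layout
directly under =head1 NAME.

```perl
use strict;
use warnings;

# Return the one-line description from a module's POD NAME section,
# or undef if the section or the description is missing.
# (Hypothetical helper, not part of BioPerl.)
sub name_description {
    my ($pod) = @_;

    # Everything between "=head1 NAME" and the next POD command (or EOF).
    my ($name_block) = $pod =~ /^=head1\s+NAME\s*\n(.*?)(?=^=\w|\z)/ms
        or return;

    # Join wrapped lines so a description spilling onto two lines still counts.
    $name_block =~ s/\s+/ /g;

    # Expect "Module::Name - some description".
    my ($desc) = $name_block =~ /^\s*\S+\s+-+\s+(\S.*?)\s*$/
        or return;
    return $desc;
}

# Sample POD, built as plain strings so perldoc does not mistake it for real POD.
my %sample_pod = (
    'Bio::SearchIO' =>
        "=head1 NAME\n\nBio::SearchIO - Driver for parsing Sequence Database Searches\n"
      . "(BLAST, FASTA, ...)\n\n=head1 SYNOPSIS\n",
    'My::Undocumented' =>    # invented example of a module lacking a description
        "=head1 NAME\n\nMy::Undocumented\n\n=head1 SYNOPSIS\n",
);

for my $module ( sort keys %sample_pod ) {
    my $desc = name_description( $sample_pod{$module} );
    printf "%-18s %s\n", $module,
        defined $desc ? $desc : 'MISSING one-line description';
}
```

Run over the real .pm files (reading each file into a string first), this
should flag the same kind of modules as the list below, though I have not
checked it against the whole tree.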

I fixed the modules that I knew something about, but there are some I  
haven't used. Perhaps the author, or someone else familiar with these  
modules, could fill in an appropriate short description?

Here is the list of affected modules:
Bio::DB::Expression
Bio::Expression::Contact
Bio::Expression::DataSet
Bio::Expression::Platform
Bio::Expression::Sample
Bio::Search::Processor
Bio::DB::EUtilities::ElinkData
Bio::DB::GFF::Adaptor::memory::feature_serializer
Bio::DB::SeqFeature::Store::DBI::Iterator
Bio::Expression::FeatureGroup::FeatureGroupMas50
Bio::Expression::FeatureSet::FeatureSetMas50
Bio::Matrix::PSM::PsmHeaderI
Bio::OntologyIO::Handlers::BaseSAXHandler

Some of these are missing other POD parts as well -- please add those  
too if you can.


Thanks,
Dave


From mckays at cshl.edu  Thu Oct 19 13:51:18 2006
From: mckays at cshl.edu (Sheldon McKay)
Date: Thu, 19 Oct 2006 09:51:18 -0400
Subject: [Bioperl-l] chromosome ideograms
Message-ID: <6b0de00426b3c04b0d0d7641bc8e14e3@cshl.edu>

Hi,

Sorry for the late reply.  I have been working on a karyotype drawing 
tool as part of the Generic Genome Browser that may be useful.  In 
addition to drawing features next to chromosome ideograms, it also 
supports making chromosome 'bands' from any kind of scored features to 
create a sort of heat map on the chromosome itself.

I have a demo running at

http://mckay.cshl.edu/cgi-bin/gbrowse_karyotype

and the source is available from the GMOD CVS HEAD 
http://www.gmod.org/cvs

Sheldon

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Sheldon McKay, PhD
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724


From n.haigh at sheffield.ac.uk  Thu Oct 19 15:37:31 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 19 Oct 2006 15:37:31 +0000
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
In-Reply-To: <45375615.1020603@sheffield.ac.uk>
References: <45375615.1020603@sheffield.ac.uk>
Message-ID: <45379BBB.1040400@sheffield.ac.uk>

Thanks for committing that change, Brian. Now that the tests proceed past this
point, I get the following error:

------------- EXCEPTION: Bio::Root::NotImplemented -------------
MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not
implemented by package Bio::Tools::Run::EMBOSSApplication.
This is not your fault - author of Bio::Tools::Run::EMBOSSApplication
should be blamed!

STACK: Error::throw
STACK: Bio::Root::Root::throw
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350
STACK: Bio::Root::RootI::throw_not_implemented
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522
STACK: Bio::Tools::Run::WrapperBase::program_dir
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346
STACK: Bio::Tools::Run::WrapperBase::program_path
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327
STACK: Bio::Tools::Run::WrapperBase::executable
/home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297
STACK: t/EMBOSS.t:58
----------------------------------------------------------------


From N.Haigh at sheffield.ac.uk  Thu Oct 19 15:03:00 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 16:03:00 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45379BBB.1040400@sheffield.ac.uk>
References: <45375615.1020603@sheffield.ac.uk>
	<45379BBB.1040400@sheffield.ac.uk>
Message-ID: <1161270180.453793a432e4f@webmail.shef.ac.uk>

I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be
consistent with other tests.

Failing that - Is there a good test writing style I should follow in one of the other test files?

Thanks
Nathan


From bosborne11 at verizon.net  Thu Oct 19 15:06:08 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 19 Oct 2006 11:06:08 -0400
Subject: [Bioperl-l] bioperl-run t/EMBOSS.t
In-Reply-To: <45379BBB.1040400@sheffield.ac.uk>
Message-ID: <C15D0CA0.AE2C%bosborne11@verizon.net>

Nathan,

Yes, I see. Those EMBOSS programs work a bit differently from the typical
app run by bioperl-run, there's no need for WrapperBase methods like
program_dir(), executable(), it seems. Well, I can try and take a look at
this tonight but there's probably someone better suited to this than me,
I've spent very little time with bioperl-run. Volunteer?
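
For whoever picks this up, the failure mode can be reproduced in miniature.
This is only a sketch with made-up packages (My::WrapperBase, My::BrokenApp,
My::WorkingApp) and a made-up install path, not the real WrapperBase code:
a base-class method that calls an "abstract" method blows up as soon as a
subclass that never overrides it is used.

```perl
use strict;
use warnings;

package My::WrapperBase;    # hypothetical stand-in, not the real WrapperBase

sub new { my $class = shift; return bless {}, $class }

# "Abstract" method: the base class only knows that subclasses must supply it.
sub program_dir {
    my $self = shift;
    die sprintf qq{Abstract method "%s::program_dir" is not implemented by package %s\n},
        __PACKAGE__, ref $self;
}

# Concrete method that depends on the abstract one.
sub executable {
    my $self = shift;
    return $self->program_dir . '/' . $self->program_name;
}

package My::BrokenApp;      # forgets to override program_dir()
our @ISA = ('My::WrapperBase');
sub program_name { 'water' }

package My::WorkingApp;     # supplies the missing method
our @ISA = ('My::WrapperBase');
sub program_name { 'water' }
sub program_dir  { '/usr/local/bin' }    # illustrative path only

package main;

print My::WorkingApp->new->executable, "\n";    # /usr/local/bin/water

my $path = eval { My::BrokenApp->new->executable };
print "BrokenApp failed: $@" unless defined $path;
```

So the two obvious ways out would be to give the EMBOSS class its own
program_dir(), or to keep the test from calling executable() on it at all;
which one fits bioperl-run better I can't say without reading more of it.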

Brian O.


On 10/19/06 11:37 AM, "Nathan S. Haigh" <n.haigh at sheffield.ac.uk> wrote:

> Thanks for committing that change Brian. Now the tests proceed from this
> point, I get the following error:
> 
> ------------- EXCEPTION: Bio::Root::NotImplemented -------------
> MSG: Abstract method "Bio::Tools::Run::WrapperBase::program_dir" is not
> implemented by package Bio::Tools::Run::EMBOSSApplication.
> This is not your fault - author of Bio::Tools::Run::EMBOSSApplication
> should be blamed!
> 
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/Root.pm:350
> STACK: Bio::Root::RootI::throw_not_implemented
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Root/RootI.pm:522
> STACK: Bio::Tools::Run::WrapperBase::program_dir
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:346
> STACK: Bio::Tools::Run::WrapperBase::program_path
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:327
> STACK: Bio::Tools::Run::WrapperBase::executable
> /home/bo1nsh/cvswc/bioperl-1-5-2/Bio/Tools/Run/WrapperBase.pm:297
> STACK: t/EMBOSS.t:58
> ----------------------------------------------------------------


From niels at genomics.dk  Thu Oct 19 15:16:37 2006
From: niels at genomics.dk (Niels Larsen)
Date: Thu, 19 Oct 2006 17:16:37 +0200
Subject: [Bioperl-l] From EBI support re WU-Blast SOAP service
In-Reply-To: <4535EBF9.1090706@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>
	<4535EBF9.1090706@sendu.me.uk>
Message-ID: <453796D5.2070808@genomics.dk>

Sendu Bala wrote:
>> I invoked the EBI script
>>
>> http://www.ebi.ac.uk/Tools/webservices/downloads/WSWUBlastSOAPLite.zip
>>
>> like this
>>
>> WSWUBlastClient.pl -p blastn -D embl test.fasta
>>
>> where the content of test.fasta is below, and got
>>
>> Can't find method element in the message at 
>> /ebi/extserv/bin/perl-5.8.3/lib/site_perl/5.8.3/SOAP/Lite.pm line 2311.
> 
> As you admit, this is not a Bioperl issue. I would suggest you contact 
> EBI support.
> 

To use EBI's WU-blast SOAP interface from perl, EBI support
says one must use SOAP::Lite v 0.60 (no later version)
and include '--email you.example.com' on the command line.
This is evident neither from their web pages nor from the script
usage statement, but they promised to fix it.

------------------------------------------------------------------------

Niels Larsen
Danish Genome Institute
Gustav Wieds vej 10 C
DK-8000 Aarhus C
Denmark

Electronic mail: niels at genomics.dk
Skype: niels_larsen_denmark

Telephone: +45-8942-5268
Telefax: +45-8620-1222

------------------------------------------------------------------------


From cjfields at uiuc.edu  Thu Oct 19 15:31:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 10:31:45 -0500
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <45371DFE.6050306@sheffield.ac.uk>
Message-ID: <001501c6f393$b66bd4a0$15327e82@pyrimidine>

> > As a followup in this, I tried bioperl-network and had similar failed
> tests
> > with Graph 0.79 (the only PPM available from ActiveState).  However, the
> > INSTALL docs state that Graph 0.80 is needed, and the test run gave
> several
> > warnings about not having Graph 0.80 installed.
> >
> > I made a PPM of Graph 0.80, installed, retried bioperl-network tests,
> and
> > everything passed.  Maybe we need to have a Graph PPM available for
> those
> > who want bioperl-network?
> >
> > As for bioperl-run, all tests passed from a new CVS checkout even though
> I
> > have none of the programs installed, so they seem to skip properly.  The
> > test run also printed warnings when a program wasn't available or
> installed.
> >
> >
> > Chris
> >
> >
> If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make
> modifications to integrate them into the package.xml file for PPM4
> clients.
> 
> Nath

Will do.  Should these be forwarded to Mauricio?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From N.Haigh at sheffield.ac.uk  Thu Oct 19 15:38:05 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 16:38:05 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <001501c6f393$b66bd4a0$15327e82@pyrimidine>
References: <001501c6f393$b66bd4a0$15327e82@pyrimidine>
Message-ID: <1161272285.45379bdd1aea4@webmail.shef.ac.uk>


> > If you can pop the Graph 0.80 ppd files in DIST/RC/ i'll make
> > modifications to integrate them into the package.xml file for PPM4
> > clients.
> > 
> > Nath
> 
> Will do.  Should these be forwarded to Mauricio?
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 


If you don't have access to the web, you can send them to me - I now have an account on that server.

Cheers
Nath


From cjfields at uiuc.edu  Thu Oct 19 15:45:00 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 10:45:00 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk>
Message-ID: <001601c6f395$8a752ed0$15327e82@pyrimidine>

> I thought I'd have my first proper try at writing some tests. I was
> wondering if there is a template test file that I should use/study in
> order to be
> consistent with other tests.
> 
> Failing that - Is there a good test writing style I should follow in one
> of the other test files?
> 
> Thanks
> Nathan

I would start with the Test::Simple and Test::More perldoc; they're pretty
self-explanatory.  You can look at the various test suites using Test::More
as well for pointers.  By far, most tests will use is().  You can use SKIP
blocks to skip tests that have a requirement, or skip all tests if they all
require something.  Pretty flexible.
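
As a rough starting point, a test file along these lines shows is() and a
SKIP block together. Treat it as a sketch: revcom() is a made-up function
standing in for the code under test, the two "remote" tests are
placeholders, and gating them on BIOPERLDEBUG is just my assumption about
how remote tests would be switched on.

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical function standing in for whatever the module under test does.
sub revcom {
    my ($in) = @_;
    my $seq = reverse $in;             # scalar context: reverses the string
    $seq =~ tr/ACGTacgt/TGCAtgca/;     # complement each base
    return $seq;
}

# is() compares what you got against what you expected and names the test.
is( revcom('ATGC'), 'GCAT', 'revcom() reverse-complements a short sequence' );
is( revcom(''),     '',     'revcom() of an empty string is empty' );

SKIP: {
    # Skip the two network-dependent tests unless the developer asked for them.
    skip 'set BIOPERLDEBUG=1 to run remote tests', 2
        unless $ENV{BIOPERLDEBUG};

    # Placeholders; a real test would query the remote database here.
    ok( 1, 'remote service reachable (placeholder)' );
    ok( 1, 'remote record parsed (placeholder)' );
}
```

If every test in a file needs the same thing, "plan skip_all => 'reason'"
in place of the numbered plan skips the whole file instead.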

We should probably get a wiki page for the developers underway, maybe a
HOWTO on writing tests.  At least have these focus on BioPerl, OOP, remote
DB tests, etc.

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Thu Oct 19 16:23:40 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 11:23:40 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
Message-ID: <001b01c6f39a$f0288ba0$15327e82@pyrimidine>

> Here is the overload code:
> 
> use overload '""' => sub {
> 	(($_[0]->database ? $_[0]->database . ':' : '' )
> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
> 	|| '' };
> 
> Except that the last '||' is redundant and unnecessary (it either
> does nothing or replaces an empty string with an empty string), I
> don't see the potential for duplicating the version number here -
> unless primary_id() did that, which I don't see it doing.
> 
> So, to me this seems to come from a parsing error in the beginning,
> rather than an erroneous mangling of version into primary_id later.
> 
> Is someone in the position to confirm this?
> 
> 	-hilmar

I have attached a script to the bug report on bugzilla, as well as the test
output sequence and the actual GenBank record.  There are a number of
problems:

1)  primary_id() is assigned both the id and version.
2)  version() is still assigned the version.

The above explains the output when printing the object directly using the
overload (it concatenates them).
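
To make that concrete, here is a stripped-down sketch. My::DBLink is a
made-up stand-in with just the three accessors, not the real
Bio::Annotation::DBLink, and the second object shows only one possible fix
(keeping the version out of primary_id), not necessarily what we should
commit.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Annotation::DBLink; only the three
# accessors used by the overload are modelled here.
package My::DBLink;

use overload '""' => sub {
    my $self = shift;
    return ( $self->database   ? $self->database . ':' : '' )
         . ( $self->primary_id ? $self->primary_id     : '' )
         . ( $self->version    ? '.' . $self->version  : '' );
}, fallback => 1;

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}
sub database   { $_[0]{database} }
sub primary_id { $_[0]{primary_id} }
sub version    { $_[0]{version} }

package main;

my ( $id, $version ) = ( 'AL591065', 17 );

# Problems 1 and 2 together: the version ends up in both fields.
my $as_parsed = My::DBLink->new(
    database   => 'GenBank',
    primary_id => $id . '.' . $version,
    version    => $version,
);

# One possible fix: keep primary_id free of the version.
my $separate = My::DBLink->new(
    database   => 'GenBank',
    primary_id => $id,
    version    => $version,
);

print "$as_parsed\n";    # GenBank:AL591065.17.17
print "$separate\n";     # GenBank:AL591065.17
```

The other way round (leave primary_id alone and drop the version from the
overload) would give the same printed result, so the choice depends on what
the writers expect primary_id to contain.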

However, there are a few more issues.  The ID is printed normally
(accession.version), but the source DB is not present when SeqIO handles the
sequence.  I have attached the output and the original GenBank record to the
bug report.  

I can look into it but it won't be today; got my hands full with enzyme
assays. 

Chris


From N.Haigh at sheffield.ac.uk  Thu Oct 19 16:50:57 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 17:50:57 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine>
References: <001601c6f395$8a752ed0$15327e82@pyrimidine>
Message-ID: <1161276657.4537acf1edc80@webmail.shef.ac.uk>

Quoting Chris Fields <cjfields at uiuc.edu>:

> > I thought I'd have my first proper try at writing some tests. I was
> > wondering if there is a template test file that I should use/study in
> > order to be
> > consistent with other tests.
> > 
> > Failing that - Is there a good test writing style I should follow in one
> > of the other test files?
> > 
> > Thanks
> > Nathan
> 
> I would start with the Test::Simple and Test::More perldoc; they're pretty
> self-explanatory.  You can look at the various test suites using Test::More
> as well for pointers.  By far, most tests will use is().  You can use SKIP
> blocks to skip tests that have a requirement, or skip all tests if they all
> require something.  Pretty flexible.
> 
> We should probably get a wiki page for the developers underway, maybe a
> HOWTO on writing tests.  At least have these focus on BioPerl, OOP, remote
> DB tests, etc.
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 
> 
> 


Just working through some test things now. I thought I'd start on the bioperl-run stuff as I thought it might be a bit more straightforward; I'm
familiar with some of them and they seem to get neglected.

I'm heavily commenting my tests with the thought of starting a wiki guide to testing Bioperl modules. See how far I get!

Nath


From hlapp at gmx.net  Thu Oct 19 17:11:27 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 19 Oct 2006 13:11:27 -0400
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
	<0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
	<8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
Message-ID: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>

Actually you did that Jason: http://tinyurl.com/ye2edk

Apparently the motivation was to "parse swissprot fields in genpept  
file (dbsource)"?

It clearly looks wrong to add the version. You've probably had a  
reason why you did this at the time but if we (you :) can't recover  
that I guess it's best to just fix it to do the right thing (in both  
places obviously).

	-hilmar

On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote:

> Well there is explicit addition of the version to the primary id so  
> it isn't so much a parsing error as a deliberate decision to append  
> it.
> see Bio::SeqIO::genbank
>
> to make the dblink
>                                               $annotation- 
> >add_Annotation
>                                                     ('dblink',
>                                                       
> Bio::Annotation::DBLink->new
>                                                      (-primary_id  
> => $id . "." . $version,
>                                                       -version =>  
> $version,
>                                                       -database =>  
> $db,
>                                                       -tagname =>  
> 'dblink'));
>
> and the code to print the dblink back out in the writer already  
> assumes the version number is appended...
>
>         foreach my $ref ( $seq->annotation->get_Annotations 
> ('dblink') ) {
>             # if ($ref->comment eq 'DBSOURCE') {
>             $self->_print('DBSOURCE    accession ',
>                           $ref->primary_id, "\n");
>             # }
>         }
>
> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:
>
>> Here is the overload code:
>>
>> use overload '""' => sub {
>> 	(($_[0]->database ? $_[0]->database . ':' : '' )
>> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
>> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
>> 	|| '' };
>>
>> Except that the last '||' is redundant and unnecessary (it either  
>> does nothing or replaces an empty string with an empty string), I  
>> don't see the potential for duplicating the version number here -  
>> unless primary_id() did that, which I don't see it doing.
>>
>> So, to me this seems to come from a parsing error in the  
>> beginning, rather than an erroneous mangling of version into  
>> primary_id later.
>>
>> Is someone in the position to confirm this?
>>
>> 	-hilmar
>>
>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>>
>>> So I'm unsure what we should do here.
>>>
>>> We can certainly fix the problem which you report which is  
>>> relying on
>>> the "" method -- if you were to do instead:
>>> print $_->database, ":", $_->primary_id, "\n";
>>>
>>> you'll get the right answer.  We at a minimum just fix the auto-
>>> string converting method to do The Right Thing.
>>>
>>> But I am not sure if we should keep the version out of the  
>>> primary_id
>>> field.  This will require some rejiggering in several modules  
>>> when it
>>> comes to printing DBlinks and I don't want to do this before the
>>> release. I also am not sure if there was an explicit reason why
>>> someone did put the version information in the primary_id. (I  
>>> hope it
>>> wasn't me because I don't think I'm going to remember why).
>>>
>>> Does anyone else have a strong feeling?
>>>
>>> -jason
>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>>
>>>> Hello,
>>>>
>>>> I noticed a little problem with the Annotation "DBLink" from
>>>> GenBank entries
>>>>
>>>> When I run:
>>>>
>>>> perl -MBio::DB::GenBank -e 'my $gi =
>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my  
>>>> $seqio =
>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>>>> ("dblink");
>>>> for(@annotations) { print $_, "\n";} print $INC{
>>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>>>
>>>> This yields:
>>>>
>>>>    GenBank:AL591065.17.17
>>>>
>>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>>
>>>> Can others repeat this?
>>>>
>>>> I have dug into the source a little and Bio::Annotation::DBLink
>>>> seems to
>>>> be the place where this happens: it has a concatenation which  
>>>> leads to
>>>> that repeated version number.
>>>>
>>>> Is this something that I should fix "client-side", so to speak, or is it
>>>> worthwhile to add some logic to that concatenation to prevent this?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Eric
>>>>
>>>>
>>>>
>>>
>>> --
>>> Jason Stajich, PhD
>>> Miller Research Fellow
>>> University of California
>>> Dept of Plant and Microbial Biology
>>> 321 Koshland Hall #3102
>>> Berkeley, CA 94720-3102
>>> lab: 510.642.8441
>>> http://pmb.berkeley.edu/~taylor/people/js.html
>>>
>>>
>>>
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From N.Haigh at sheffield.ac.uk  Thu Oct 19 17:17:33 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 18:17:33 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <001601c6f395$8a752ed0$15327e82@pyrimidine>
References: <001601c6f395$8a752ed0$15327e82@pyrimidine>
Message-ID: <1161278253.4537b32dd3d15@webmail.shef.ac.uk>

Quoting Chris Fields <cjfields at uiuc.edu>:

> > I thought I'd have my first proper try at writing some tests. I was
> > wondering if there is a template test file that I should use/study in
> > order to be
> > consistent with other tests.
> > 
> > Failing that - Is there a good test writing style I should follow in one
> > of the other test files?
> > 
> > Thanks
> > Nathan
> 
> I would start with the Test::Simple and Test::More perldoc; they're pretty
> self-explanatory.  You can look at the various test suites using Test::More
> as well for pointers.  By far, most tests will use is().  You can use SKIP
> blocks to skip tests that have a requirement, or skip all tests if they all
> require something.  Pretty flexible.
> 
> We should probably get a wiki page for the developers underway, maybe a
> HOWTO on writing tests.  At least have these focus on BioPerl, OOP, remote
> DB tests, etc.
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 

Just wrote a partial and small test script for t/Amap.t in bioperl-run. When I run "perl -I. t/Amap.t" I get the following output:
1..10
ok 1 - use Bio::Tools::Run::Alignment::Amap;
ok 2 - use Bio::AlignIO;
ok 3 - use Bio::SeqIO;
ok 4 - use Bio::Root::IO;
ok 5 - All the required modules are present
ok 6 - new() returned something
ok 7 -   and its the right class
not ok 8 - executable() got the correct filename
#   Failed test 'executable() got the correct filename'
#   in t/Amap.t at line 90.
#          got: undef
#     expected: 'filename'
ok 9 # skip Got incorrect filename for executable
ok 10 # skip Got incorrect filename for executable
# Looks like you failed 1 test of 10.


So far this looks good (well, in that it's passing and failing the tests I expect it to). However, when I run "make test" the output is unexpected and I don't know
why. It seems to die and produce the results of the testing before the rest of the test suite is run:
t/Amap....................NOK 8
#   Failed test 'executable() got the correct filename'
#   in t/Amap.t at line 90.
#          got: undef
#     expected: 'filename'
# Looks like you failed 1 test of 10.
t/Amap....................dubious
        Test returned status 1 (wstat 256, 0x100)
DIED. FAILED test 8
        Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay, 70.00%)
t/Analysis_soap...........ok 7/17make: *** wait: No child processes.  Stop.


Is there something I'm missing?? If it's something less obvious, let me know and I'll post the whole test file.
Nath


From cjfields at uiuc.edu  Thu Oct 19 17:26:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 12:26:45 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161278253.4537b32dd3d15@webmail.shef.ac.uk>
Message-ID: <002001c6f3a3$c00b9080$15327e82@pyrimidine>

...
> Just wrote a partial and small test script for t/Amap.t in bioperl-run.
> When I run "perl -I. t/Amap.t" I get the following output:
> 1..10
> ok 1 - use Bio::Tools::Run::Alignment::Amap;
> ok 2 - use Bio::AlignIO;
> ok 3 - use Bio::SeqIO;
> ok 4 - use Bio::Root::IO;
> ok 5 - All the required modules are present
> ok 6 - new() returned something
> ok 7 -   and its the right class
> not ok 8 - executable() got the correct filename
> #   Failed test 'executable() got the correct filename'
> #   in t/Amap.t at line 90.
> #          got: undef
> #     expected: 'filename'
> ok 9 # skip Got incorrect filename for executable
> ok 10 # skip Got incorrect filename for executable
> # Looks like you failed 1 test of 10.
> 
> 
> So far this looks good (well, that it's failing passing expected tests).
> However, when i run "make test" the output is unexpected and I don't know
> why. It seems to die and produce the results of the testing before the
> rest of the test suit is run:
> t/Amap....................NOK 8
> #   Failed test 'executable() got the correct filename'
> #   in t/Amap.t at line 90.
> #          got: undef
> #     expected: 'filename'
> # Looks like you failed 1 test of 10.
> t/Amap....................dubious
>         Test returned status 1 (wstat 256, 0x100)
> DIED. FAILED test 8
>         Failed 1/10 tests, 90.00% okay (less 2 skipped tests: 7 okay,
> 70.00%)
> t/Analysis_soap...........ok 7/17make: *** wait: No child processes.
> Stop.
> 
> 
> 
> Is there something I'm missing?? If it's something less obvious, let me
> know and i'll post whole test file.
> Nath

Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
the problem.  The only issue I can think of is that Test::More TODO blocks
require a newer version of Test::Harness (which most users have anyway).
Are you using a TODO block?
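
In case it helps to compare, this is roughly what I mean by a TODO block.
It is only a sketch: executable() here is a made-up stub that returns
undef, loosely modelled on the failure in your output, not your actual
test.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical stub for a method that does not work yet.
sub executable { return undef }

ok( 1, 'a test that is expected to pass' );

TODO: {
    # While $TODO is set, a failing test is reported as "not ok ... # TODO"
    # and does not count as a failure for the script as a whole.
    local $TODO = 'executable() does not locate the program yet';

    my $got = executable();
    is( $got, 'filename', 'executable() got the correct filename' );
}
```

With a block like that, a known failure is still reported but the script
itself should not be counted as failed.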

You can send me Amap.t and I'll give it a try, but I can't promise I'll get
to it immediately (busy day).

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From N.Haigh at sheffield.ac.uk  Thu Oct 19 17:38:25 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 18:38:25 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161279505.4537b811e143f@webmail.shef.ac.uk>


> Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
> the problem.  The only issue I can think of is that Test::More TODO blocks
> require a newer version of Test::Harness (which most users have anyway).
> Are you using a TODO block?
> 
> You can send me Amap.t and I'll give it a try, but I can't promise I'll get
> to it immediately (busy day).
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 
> 

No TODO blocks.

I must have done something wrong - it's the first time I've seen this - but then again, I don't look that closely at the output of "make test" unless
something shows as a fail. Anyway, below is the short bit of code.

Thanks
Nath

use strict;
use Bio::Root::IO;  # can't test for this, might be needed to get Test::More

BEGIN {
  # Things to do ASAP once the script is run
  # even before anything else in the file is parsed
  use vars qw($NUMTESTS $DEBUG $error);
  $DEBUG = $ENV{'BIOPERLDEBUG'} || 0;

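  # Use installed Test module, otherwise fall back
  # to copy of Test.pm located in the t dir.
  # Note: 'use lib' takes effect at compile time, so t/lib is
  # added to @INC whether or not the eval below fails.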
  eval { require Test::More; };
  if ( $@ ) {
    use lib Bio::Root::IO->catfile('t','lib');
  }

  # Currently no errors
  $error = 0;

  # Setup the number of tests to be run
  # what about using:
  # use Test::More 'no_plan';
  use Test::More;
  $NUMTESTS = 10;
  plan tests => $NUMTESTS;

  # Use modules that are needed in this test that are from
  # any of the Bioperl packages: Bioperl-core, Bioperl-run ... etc
  # use_ok('<module::to::use>');
  use_ok('Bio::Tools::Run::Alignment::Amap');
  use_ok('Bio::AlignIO');
  use_ok('Bio::SeqIO');
  use_ok('Bio::Root::IO');
}

# Multiple END blocks are run in reverse order of their definition
# Last In, First Out (LIFO)
END {
  # Things to do right at the very end, just
  # when the  interpreter finishes/exits
  # E.g. deleting intermediate files produced during the test

  foreach my $file ( qw(cysprot.dnd cysprot1a.dnd) ) {
    unlink $file;
    # check it was deleted

  }
  #unlink qw(cysprot.dnd cysprot1a.dnd)
}

END {
  # Not sure what this is doing?
  #for ( $Test::ntest..$NUMTESTS ) {
  #  skip("Amap program not found. Skipping.\n",1);
  #}
}

# if we got to here, that's OK!
# is this really needed?
ok( 1, 'All the required modules are present');

# setup input files etc
my $inputfilename = Bio::Root::IO->catfile("t","data","cysprot.fa");

# setup output files etc
# none in this test

# setup global objects that are to be used in more than one test
# Also test they were initialised correctly
my @params = ();
my $aln;
my $factory = Bio::Tools::Run::Alignment::Amap->new(@params);
ok( defined $factory,                                  'new() returned something' );
ok( $factory->isa('Bio::Tools::Run::Alignment::Amap'), '  and its the right class' );

# Now onto the nitty gritty tests of the module's methods
my $executable_file = $factory->executable();
#is( $factory->executable(), 'filename',                'executable() got the correct filename' );

# block of tests to skip if you know the tests will fail
# under some condition. E.g.:
#   Need network access,
#   Won't work on particular OS,
#   Can't find the executable
# Do not just skip tests that seem to fail for an unknown reason
SKIP: {
  # condition used to skip this block of tests
  #skip($why, $how_many_in_block);
  skip("Got incorrect filename for executable", 2)
    unless is($factory->executable(), 'filename',       'executable() got the correct filename');

  ok( -e $executable_file,                              'Found executable' );
  ok( $factory->version >= 2.0,                         'Code tested on Amap versions >= 2.0' );

}
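
One thing the output quoted earlier suggests: the is() call inside the skip
condition is itself a counted test, so a missing executable gives one failure
(test 8) plus two skips.  Below is a minimal sketch of keeping the skip
decision separate from the tests.  find_executable() is a hypothetical
stand-in for $factory->executable(), and the program name is invented so that
the lookup fails.

```perl
use strict;
use warnings;
use File::Spec;
use Test::More tests => 3;

# Hypothetical stand-in for $factory->executable(): returns the full
# path of a program on PATH, or undef when it cannot be found.
sub find_executable {
    my ($name) = @_;
    for my $dir ( File::Spec->path ) {
        my $path = File::Spec->catfile( $dir, $name );
        return $path if -f $path && -x _;
    }
    return;
}

# Invented name, chosen so the lookup is expected to fail.
my $exe = find_executable('amap-surely-not-installed');

ok( 1, 'module-level tests still run' );

SKIP: {
    # A plain condition decides the skip, so a missing program is
    # reported as skipped tests and not as a failed test.
    skip( 'executable not found', 2 ) unless defined $exe;

    ok( -e $exe, 'Found executable' );
    ok( -x $exe, 'Executable can be run' );
}
```

With this arrangement the plan still adds up (one test run, two skipped) and
the harness should report the file as passing when the program is absent.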


From jason at bioperl.org  Thu Oct 19 17:44:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 10:44:51 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
	<0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
	<8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>
	<7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID: <B9A24BF0-CD5A-40CA-B8AA-1A449CA9D7AA@bioperl.org>

Yikes - I was worried that it might have been me.....

Okay I'll look into fixing it -- ChrisF - check in with me before  
diving in, in case I've gotten it done and I expect your enzyme  
assays might take up the time.

-jason
On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote:

> Actually you did that Jason: http://tinyurl.com/ye2edk
>
> Apparently the motivation was to "parse swissprot fields in genpept  
> file (dbsource)"?
>
> It clearly looks wrong to add the version. You've probably had a  
> reason why you did this at the time but if we (you :) can't recover  
> that I guess it's best to just fix it to do the right thing (in  
> both places obviously).
>
> 	-hilmar
>
> On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote:
>
>> Well there is explicit addition of the version to the primary id  
>> so it isn't so much a parsing error as a deliberate decision to  
>> append it.
>> see Bio::SeqIO::genbank
>>
>> to make the dblink
>>                                               $annotation- 
>> >add_Annotation
>>                                                     ('dblink',
>>                                                       
>> Bio::Annotation::DBLink->new
>>                                                      (-primary_id  
>> => $id . "." . $version,
>>                                                       -version =>  
>> $version,
>>                                                       -database =>  
>> $db,
>>                                                       -tagname =>  
>> 'dblink'));
>>
>> and the code to print the dblink back out in the writer already  
>> assumes the version number is appended...
>>
>>         foreach my $ref ( $seq->annotation->get_Annotations 
>> ('dblink') ) {
>>             # if ($ref->comment eq 'DBSOURCE') {
>>             $self->_print('DBSOURCE    accession ',
>>                           $ref->primary_id, "\n");
>>             # }
>>         }
>>
>> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:
>>
>>> Here is the overload code:
>>>
>>> use overload '""' => sub {
>>> 	(($_[0]->database ? $_[0]->database . ':' : '' )
>>> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
>>> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
>>> 	|| '' };
>>>
>>> Except that the last '||' is redundant and unnecessary (it either  
>>> does nothing or replaces an empty string with an empty string), I  
>>> don't see the potential for duplicating the version number here -  
>>> unless primary_id() did that, which I don't see it doing.
>>>
>>> So, to me this seems to come from a parsing error in the  
>>> beginning, rather than an erroneous mangling of version into  
>>> primary_id later.
>>>
>>> Is someone in the position to confirm this?
>>>
>>> 	-hilmar
>>>
>>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>>>
>>>> So I'm unsure what we should do here.
>>>>
>>>> We can certainly fix the problem which you report which is  
>>>> relying on
>>>> the "" method -- if you were to do instead:
>>>> print $_->database, ":", $_->primary_id, "\n";
>>>>
>>>> you'll get the right answer.  We at a minimum just fix the auto-
>>>> string converting method to do The Right Thing.
>>>>
>>>> But I am not sure if we should keep the version out of the  
>>>> primary_id
>>>> field.  This will require some rejiggering in several modules  
>>>> when it
>>>> comes to printing DBlinks and I don't want to do this before the
>>>> release. I also am not sure if there was an explicit reason why
>>>> someone did put the version information in the primary_id. (I  
>>>> hope it
>>>> wasn't me because I don't think I'm going to remember why).
>>>>
>>>> Does anyone else have a strong feeling?
>>>>
>>>> -jason
>>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I noticed a little problem with the Annotation "DBLink" from
>>>>> GenBank entries
>>>>>
>>>>> When I run:
>>>>>
>>>>> perl -MBio::DB::GenBank -e 'my $gi =
>>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my  
>>>>> $seqio =
>>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>>>>> ("dblink");
>>>>> for(@annotations) { print $_, "\n";} print $INC{
>>>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>>>>
>>>>> This yields:
>>>>>
>>>>>    GenBank:AL591065.17.17
>>>>>
>>>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>>>
>>>>> Can others repeat this?
>>>>>
>>>>> I have dug into the source a little and Bio::Annotation::DBLink
>>>>> seems to
>>>>> be the place where this happens: it has a concatenation which  
>>>>> leads to
>>>>> that repeated version number.
>>>>>
>>>>> It this something that I should fix "client-side", so to speak, or
>>>>> is it
>>>>> worthwhile to add some logic to that concatenation to prevent  
>>>>> this?
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Eric
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>> --
>>>> Jason Stajich, PhD
>>>> Miller Research Fellow
>>>> University of California
>>>> Dept of Plant and Microbial Biology
>>>> 321 Koshland Hall #3102
>>>> Berkeley, CA 94720-3102
>>>> lab: 510.642.8441
>>>> http://pmb.berkeley.edu/~taylor/people/js.html
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> Jason Stajich, PhD
>> Miller Research Fellow
>> University of California
>> Dept of Plant and Microbial Biology
>> 321 Koshland Hall #3102
>> Berkeley, CA 94720-3102
>> lab: 510.642.8441
>> http://pmb.berkeley.edu/~taylor/people/js.html
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From cjfields at uiuc.edu  Thu Oct 19 18:03:52 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:03:52 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <7A79D0EC-14A1-4FE0-8587-BC4FF8D63BDD@gmx.net>
Message-ID: <000001c6f3a8$f0a46a00$15327e82@pyrimidine>

It also seems that the DBSOURCE line isn't parsed correctly: it is stuffed by
default into a GenBank dblink (the dbsource in the test case is EMBL, not
GenBank).

http://bugzilla.open-bio.org/show_bug.cgi?id=2124

It looks like NCBI may be now using:

DBSOURCE    embl accession Z49548.1

instead of the old version:

DBSOURCE    embl locus SCYJR048W, accession Z49548.1

I don't recall NCBI mentioning changes regarding DBSOURCE in any of the
recent release notes.

Chris



From N.Haigh at sheffield.ac.uk  Thu Oct 19 18:06:11 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:06:11 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281171.4537be93b63c9@webmail.shef.ac.uk>


> 
> Well, Analysis_soap.t ran immediately after, so Amap.t tests shouldn't be
> the problem.  The only issue I can think of is that Test::More TODO blocks
> require a newer version of Test::Harness (which most users have anyway).
> Are you using a TODO block?
> 
> You can send me Amap.t and I'll give it a try, but I can't promise I'll get
> to it immediately (busy day).
> 
> Chris
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 
> 

Never mind about this - it's working as expected!

I got confused because a previous run threw errors that weren't included in the final table of failed tests - working now.

Nath 


From N.Haigh at sheffield.ac.uk  Thu Oct 19 18:14:54 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:14:54 +0100
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
Message-ID: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>

I have a few questions about how bioperl-run modules work.

1) How does a module define the name of the executable it uses?
2) Is there a way to test what this is?
3) Does $factory->executable return this name regardless, or only if the executable was successfully found?

Thanks
Nath


From cjfields at uiuc.edu  Thu Oct 19 18:15:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:15:08 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <B9A24BF0-CD5A-40CA-B8AA-1A449CA9D7AA@bioperl.org>
Message-ID: <000001c6f3aa$82845ba0$15327e82@pyrimidine>

Go for it.  I haven't got the time to spare at the moment, sucky protein
assays....

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 



From cjfields at uiuc.edu  Thu Oct 19 18:35:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 13:35:08 -0500
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
Message-ID: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>

I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase
but I'm not sure.  I haven't used them very much myself but plan on making
wrappers at some point soon for some programs I use.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: Nathan Haigh [mailto:N.Haigh at sheffield.ac.uk]
> Sent: Thursday, October 19, 2006 1:15 PM
> To: Chris Fields
> Cc: 'bioperl-l'
> Subject: bioperl-run executable
> 
> I have a few questions about How bioperl-run modules.
> 
> 1) How do modules define what the name of the executable is that it uses?
> 2) Is there a way to test what this is?
> 3) Does $factory->executable return this or does it only return the name
> if it successfully found it?
> 
> Thanks
> Nath


From N.Haigh at sheffield.ac.uk  Thu Oct 19 18:47:01 2006
From: N.Haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 19 Oct 2006 19:47:01 +0100
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
Message-ID: <1161283620.4537c82501c43@webmail.shef.ac.uk>

Quoting Chris Fields <cjfields at uiuc.edu>:

> I think a lot of the bioperl-run modules use Bio::Tools::Run::WrapperBase
> but I'm not sure.  I haven't used them very much myself but plan on making
> wrappers at some point soon for some programs I use.
> 
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
> 

On closer inspection of a couple of other modules (Clustalw.pm and TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME and have a sub
(program_name) that simply returns this value. I'd like to see the program_name become a getter/setter so users can change the default and have the
string stored in the factory object.
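
A rough sketch of what I mean, assuming nothing about how WrapperBase works
internally.  The class name My::Wrapper::Sketch, the -program_name argument
and the _program_name key are all made up for illustration and are not part
of bioperl-run.

```perl
use strict;
use warnings;

package My::Wrapper::Sketch;    # hypothetical class, not part of bioperl-run

# Package-level default, like the hardcoded $PROGRAM_NAME described above.
our $PROGRAM_NAME = 'clustalw';

sub new {
    my ( $class, %args ) = @_;
    my $self = bless {}, $class;
    $self->program_name( $args{-program_name} )
        if defined $args{-program_name};
    return $self;
}

# Getter/setter: stores an override in the object and falls back to
# the package default when no override has been set.
sub program_name {
    my ( $self, $value ) = @_;
    $self->{_program_name} = $value if defined $value;
    return defined $self->{_program_name}
        ? $self->{_program_name}
        : $PROGRAM_NAME;
}

package main;

my $default = My::Wrapper::Sketch->new;
my $custom  = My::Wrapper::Sketch->new( -program_name => 'clustalw2' );

print $default->program_name, "\n";    # prints "clustalw"
print $custom->program_name,  "\n";    # prints "clustalw2"
```

The default stays in the package variable, so existing callers would see no
change unless they pass or set a different name.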

Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core not bioperl-run? I suppose not, since bioperl-core is a prerequisite for bioperl-run, but
wouldn't it make sense for it to go in bioperl-run?

Nath


From cjfields at uiuc.edu  Thu Oct 19 19:07:05 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 14:07:05 -0500
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <B9A24BF0-CD5A-40CA-B8AA-1A449CA9D7AA@bioperl.org>
Message-ID: <000701c6f3b1$c5914230$15327e82@pyrimidine>

Jason, Hilmar, 

How about changing the default parsed dblink in SeqIO::genbank (line 520) to

		if( $dbsource =~ /^(\S*?)\s*accession\s+(\S+)\.(\d+)/ ) {
		    my ($db,$id,$version) = ($1,$2,$3);
		    $annotation->add_Annotation
			('dblink',
			 Bio::Annotation::DBLink->new
			 (-primary_id => $id,
			  -version => $version,
			  -database => $db || 'GenBank',
			  -tagname => 'dblink'));
		} 

It passes tests and catches the optional database ('embl' for the bugzilla
report).  The output sequence still doesn't print the DB if it isn't GenBank
via write_seq(), but that shouldn't be too hard to fix (famous last words).
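
To check the pattern outside of SeqIO, here is a small standalone sketch.
parse_dbsource() and as_string() are names made up for this example;
as_string() copies the concatenation from the overloaded "" operator quoted
in this thread, and plain hashes stand in for Bio::Annotation::DBLink
objects.

```perl
use strict;
use warnings;

# Split a DBSOURCE value into database, accession and version using the
# proposed pattern; the database falls back to 'GenBank' when absent.
sub parse_dbsource {
    my ($dbsource) = @_;
    if ( $dbsource =~ /^(\S*?)\s*accession\s+(\S+)\.(\d+)/ ) {
        my %link = (
            database   => $1 || 'GenBank',
            primary_id => $2,
            version    => $3,
        );
        return \%link;
    }
    return;
}

# Same concatenation as the overloaded "" operator quoted in this thread.
sub as_string {
    my ($link) = @_;
    return ( $link->{database} ? $link->{database} . ':' : '' )
         . ( $link->{primary_id} ? $link->{primary_id} : '' )
         . ( $link->{version} ? '.' . $link->{version} : '' );
}

for my $line ( 'embl accession Z49548.1', 'accession AL591065.17' ) {
    my $link = parse_dbsource($line) or next;
    print as_string($link), "\n";    # embl:Z49548.1, then GenBank:AL591065.17
}

# Appending the version to primary_id, as the parser currently does,
# reproduces the duplication from the original report.
my %old_style = ( database => 'GenBank', primary_id => 'AL591065.17', version => 17 );
print as_string( \%old_style ), "\n";    # GenBank:AL591065.17.17
```

Note that the older "embl locus SCYJR048W, accession Z49548.1" form does not
match the pattern as written, because the match is anchored at the start of
the line, so that case would still need separate handling.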

Okay, okay, back to the assays...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Jason Stajich
> Sent: Thursday, October 19, 2006 12:45 PM
> To: Hilmar Lapp
> Cc: bioperl-l at lists.open-bio.org; Erikjan
> Subject: Re: [Bioperl-l] Annotation-DBLink- version numbers repeating
> 
> Yikes - I was worried that it might have been me.....
> 
> Okay I'll look into fixing it -- ChrisF - check in with me before
> diving in, in case I've gotten it done and I expect your enzyme
> assays might take up the time.
> 
> -jason
> On Oct 19, 2006, at 10:11 AM, Hilmar Lapp wrote:
> 
> > Actually you did that Jason: http://tinyurl.com/ye2edk
> >
> > Apparently the motivation was to "parse swissprot fields in genpept
> > file (dbsource)"?
> >
> > It clearly looks wrong to add the version. You've probably had a
> > reason why you did this at the time but if we (you :) can't recover
> > that I guess it's best to just fix it to do the right thing (in
> > both places obviously).
> >
> > 	-hilmar
> >
> > On Oct 19, 2006, at 11:50 AM, Jason Stajich wrote:
> >
> >> Well there is explicit addition of the version to the primary id
> >> so it isn't so much a parsing error as a deliberate decision to
> >> append it.
> >> see Bio::SeqIO::genbank
> >>
> >> to make the dblink
> >>     $annotation->add_Annotation
> >>         ('dblink',
> >>          Bio::Annotation::DBLink->new
> >>          (-primary_id => $id . "." . $version,
> >>           -version    => $version,
> >>           -database   => $db,
> >>           -tagname    => 'dblink'));
> >>
> >> and the code to print the dblink back out in the writer already
> >> assumes the version number is appended...
> >>
> >>         foreach my $ref ( $seq->annotation->get_Annotations('dblink') ) {
> >>             # if ($ref->comment eq 'DBSOURCE') {
> >>             $self->_print('DBSOURCE    accession ',
> >>                           $ref->primary_id, "\n");
> >>             # }
> >>         }
> >>
> >> On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:
> >>
> >>> Here is the overload code:
> >>>
> >>> use overload '""' => sub {
> >>> 	(($_[0]->database ? $_[0]->database . ':' : '' )
> >>> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
> >>> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
> >>> 	|| '' };
> >>>
> >>> Except that the last '||' is redundant and unnecessary (it either
> >>> does nothing or replaces an empty string with an empty string), I
> >>> don't see the potential for duplicating the version number here -
> >>> unless primary_id() did that, which I don't see it doing.
> >>>
> >>> So, to me this seems to come from a parsing error in the
> >>> beginning, rather than an erroneous mangling of version into
> >>> primary_id later.
> >>>
> >>> Is someone in the position to confirm this?
> >>>
> >>> 	-hilmar
> >>>
> >>> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
> >>>
> >>>> So I'm unsure what we should do here.
> >>>>
> >>>> We can certainly fix the problem which you report which is
> >>>> relying on
> >>>> the "" method -- if you were to do instead:
> >>>> print $_->database, ":", $_->primary_id, "\n";
> >>>>
> >>>> you'll get the right answer.  We can at a minimum just fix the
> >>>> auto-string converting method to do The Right Thing.
> >>>>
> >>>> But I am not sure if we should keep the version out of the
> >>>> primary_id
> >>>> field.  This will require some rejiggering in several modules
> >>>> when it
> >>>> comes to printing DBlinks and I don't want to do this before the
> >>>> release. I also am not sure if there was an explicit reason why
> >>>> someone did put the version information in the primary_id. (I
> >>>> hope it
> >>>> wasn't me because I don't think I'm going to remember why).
> >>>>
> >>>> Does anyone else have a strong feeling?
> >>>>
> >>>> -jason
> >>>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
> >>>>
> >>>>> Hello,
> >>>>>
> >>>>> I noticed a little problem with the Annotation "DBLink" from
> >>>>> GenBank entries
> >>>>>
> >>>>> When I run:
> >>>>>
> >>>>> perl -MBio::DB::GenBank -e 'my $gi =
> >>>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my
> >>>>> $seqio =
> >>>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
> >>>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
> >>>>> ("dblink");
> >>>>> for(@annotations) { print $_, "\n";} print $INC{
> >>>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
> >>>>>
> >>>>> This yields:
> >>>>>
> >>>>>    GenBank:AL591065.17.17
> >>>>>
> >>>>> and the place where the used Bio/Annotation/DBLink.pm resides.
> >>>>>
> >>>>> Can others repeat this?
> >>>>>
> >>>>> I have dug into the source a little and Bio::Annotation::DBLink
> >>>>> seems to
> >>>>> be the place where this happens: it has a concatenation which
> >>>>> leads to
> >>>>> that repeated version number.
> >>>>>
> >>>>> Is this something that I should fix "client-side", so to speak, or
> >>>>> is it
> >>>>> worthwhile to add some logic to that concatenation to prevent
> >>>>> this?
> >>>>>
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Eric
> >>>>>
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> Bioperl-l mailing list
> >>>>> Bioperl-l at lists.open-bio.org
> >>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>>>
> >>>> --
> >>>> Jason Stajich, PhD
> >>>> Miller Research Fellow
> >>>> University of California
> >>>> Dept of Plant and Microbial Biology
> >>>> 321 Koshland Hall #3102
> >>>> Berkeley, CA 94720-3102
> >>>> lab: 510.642.8441
> >>>> http://pmb.berkeley.edu/~taylor/people/js.html
> >>>>
> >>>>
> >>>>
> >>>
> >>> --
> >>> ===========================================================
> >>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> >>> ===========================================================
> >>>
> >>>
> >>>
> >>>
> >>>
> >>
> >> --
> >> Jason Stajich, PhD
> >> Miller Research Fellow
> >> University of California
> >> Dept of Plant and Microbial Biology
> >> 321 Koshland Hall #3102
> >> Berkeley, CA 94720-3102
> >> lab: 510.642.8441
> >> http://pmb.berkeley.edu/~taylor/people/js.html
> >>
> >>
> >
> > --
> > ===========================================================
> > : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> > ===========================================================
> >
> >
> >
> >
> >
> 
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
> 
> 


From jason at bioperl.org  Thu Oct 19 18:48:28 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 11:48:28 -0700
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
	<1161281694.4537c09ebd0f8@webmail.shef.ac.uk>
Message-ID: <67650240-D61B-4842-AE7C-75F15F608F6F@bioperl.org>

program_name()
  Should return the name of the program.

executable()
  Is a function that you don't have to mess with; it tries to find
the executable named by program_name() based on your PATH.


-jason
On Oct 19, 2006, at 11:14 AM, Nathan Haigh wrote:

> I have a few questions about bioperl-run modules.
>
> 1) How do modules define the name of the executable that they
> use?
> 2) Is there a way to test what this is?
> 3) Does $factory->executable return this, or does it only return the
> name if it successfully found it?
>
> Thanks
> Nath

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From jason at bioperl.org  Thu Oct 19 21:06:43 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 14:06:43 -0700
Subject: [Bioperl-l] bioperl-run executable
In-Reply-To: <1161283620.4537c82501c43@webmail.shef.ac.uk>
References: <000301c6f3ad$4ec9af10$15327e82@pyrimidine>
	<1161283620.4537c82501c43@webmail.shef.ac.uk>
Message-ID: <AA1A41EC-C0E1-49C3-818E-64210971E331@bioperl.org>

It can be reset now, but of course this is not a very nice way of doing it:

$Bio::Tools::Run::Alignment::Clustalw::PROGRAM_NAME = 'clustalw_smp';

I am not sure if there are pros and cons to making it a getter- 
setter, but if you want to run with it, please do.
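
For what it's worth, a getter/setter along those lines might look roughly
like the sketch below.  DemoWrapper, its _program_name key and this
executable() are hypothetical and only illustrate the idea; they are not the
actual bioperl-run or WrapperBase code, and the PATH search ignores
platform details such as '.exe' suffixes.

```perl
package DemoWrapper;    # hypothetical stand-in, not a real bioperl-run module
use strict;
use warnings;
use File::Spec;

our $PROGRAM_NAME = 'clustalw';    # package-wide default, as in the modules discussed

sub new { my ($class) = @_; return bless {}, $class }

# Getter/setter: a per-object value overrides the package default.
sub program_name {
    my ( $self, $name ) = @_;
    $self->{_program_name} = $name if defined $name;
    return defined $self->{_program_name} ? $self->{_program_name} : $PROGRAM_NAME;
}

# Look for program_name() in each directory of PATH; undef if not found.
sub executable {
    my ($self) = @_;
    for my $dir ( File::Spec->path ) {
        my $candidate = File::Spec->catfile( $dir, $self->program_name );
        return $candidate if -f $candidate && -x _;
    }
    return;
}

package main;

my $factory = DemoWrapper->new;
print $factory->program_name, "\n";    # package default
$factory->program_name('clustalw_smp');
print $factory->program_name, "\n";    # per-object override
```

Storing the name in the object means two factories can point at different
executables without touching the package variable.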

It has been hard to keep people adhering to a standard across the whole
run system (and the standard has changed a bit), so some auditing is
warranted.

-jason

On Oct 19, 2006, at 11:47 AM, Nathan Haigh wrote:

> Quoting Chris Fields <cjfields at uiuc.edu>:
>
>> I think a lot of the bioperl-run modules use  
>> Bio::Tools::Run::WrapperBase
>> but I'm not sure.  I haven't used them very much myself but plan  
>> on making
>> wrappers at some point soon for some programs I use.
>>
>> Christopher Fields
>> Postdoctoral Researcher - Switzer Lab
>> Dept. of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>
> On closer inspection of a couple of other modules (Clustalw.pm and  
> TCoffee.pm), they seem to hardcode the exe name in $PROGRAM_NAME  
> and have a sub
> (program_name) that simply returns this value. I'd like to see the  
> program_name become a getter/setter so users can change the default  
> and have the
> string stored in the factory object.
>
> Does it matter that Bio::Tools::Run::WrapperBase is in bioperl-core,
> not bioperl-run? I suppose not, since bioperl-core is a prereq for
> bioperl-run, but wouldn't it make sense for it to go in bioperl-run?
>
> Nath

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From torsten.seemann at infotech.monash.edu.au  Thu Oct 19 23:24:03 2006
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Fri, 20 Oct 2006 09:24:03 +1000
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161279505.4537b811e143f@webmail.shef.ac.uk>
References: <002001c6f3a3$c00b9080$15327e82@pyrimidine>
	<1161279505.4537b811e143f@webmail.shef.ac.uk>
Message-ID: <45380913.3070506@infotech.monash.edu.au>

Nathan,

> use strict;
> use Bio::Root::IO;  # cant test for this, might be needed to get Test::More

use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway,
and File::Spec is "guaranteed" to be installed with Perl 5.6+.

>     use lib Bio::Root::IO->catfile('t','lib');

Simpler as:
	use lib 't/lib';
I understand the 'lib.pm' accepts Unix style directories REGARDLESS of native 
platform.
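
A small sketch of both suggestions, assuming only core modules; the 't/lib'
directory does not need to exist for the example to run:

```perl
use strict;
use warnings;
use File::Spec;     # core module, so BioPerl is not needed just to build a path
use lib 't/lib';    # Unix-style path, as suggested above

# File::Spec->catfile joins path parts with the platform's native separator.
my $path = File::Spec->catfile( 't', 'lib' );
print "catfile gives: $path\n";

# 'use lib' has already put the literal string 't/lib' on @INC at compile time.
print "t/lib is on \@INC\n" if grep { $_ eq 't/lib' } @INC;
```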

-- 
Torsten Seemann
Victorian Bioinformatics Consortium, Monash University, Australia


From cjfields at uiuc.edu  Fri Oct 20 00:29:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 19:29:11 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <45380913.3070506@infotech.monash.edu.au>
Message-ID: <000f01c6f3de$c3d91170$15327e82@pyrimidine>

> Nathan,
> 
> > use strict;
> > use Bio::Root::IO;  # cant test for this, might be needed to get
> Test::More
> 
> use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway,
> and File::Spec is "guaranteed" to be installed with Perl 5.6+.
> 
> >     use lib Bio::Root::IO->catfile('t','lib');
> 
> Simpler as:
> 	use lib 't/lib';
> I understand the 'lib.pm' accepts Unix style directories REGARDLESS of
> native
> platform.
> 
> --
> Torsten Seemann
> Victorian Bioinformatics Consortium, Monash University, Australia

That is true, at least for WinXP (not sure about older Windows versions out
there).  I was using 'Root::IO->catfile' but found that "use lib 't/lib'" works.
I may have a few of the 'catfile' versions floating around out there, which
may be where that originated.

Note that if you plan on using Test::More with the bioperl-run test suite,
you should add it to the bioperl-run CVS distribution directory in 't/lib'.
Most people will have it installed, but you never know.

Chris


From cjfields at uiuc.edu  Fri Oct 20 00:33:22 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 19 Oct 2006 19:33:22 -0500
Subject: [Bioperl-l] Prabu Raja sent you this link
In-Reply-To: <20061020001136.86586.qmail@x05.namesdatabase.com>
Message-ID: <001001c6f3df$598a24c0$15327e82@pyrimidine>

That Prabu Raja sure gets around...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 




From keithplayer at hotmail.com  Fri Oct 20 02:13:52 2006
From: keithplayer at hotmail.com (Keith Player)
Date: Fri, 20 Oct 2006 02:13:52 +0000 (UTC)
Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning
Message-ID: <loom.20061020T041338-193@post.gmane.org>

I know that there may be some changes resulting from new GFF3 implementations, 
but thought I would see if the following is useful anyway.

I implemented the R-tree binning schema as used by Bio::DB::GFF::Util::Binning 
and as mentioned in this article:

I tested the following query on a normal table (no binning), but it assumes 
that you know the longest range in the table.  For example, take a table of 
human genes, where the longest gene we know of is around 2.4Mb.

 SELECT COUNT(*) AS count FROM groups g WHERE g.start > max(0,[start-2.4Mb]) AND 
g.start < [end] AND g.end > [start] AND g.chromosome = '1'

so for 100Mb:101Mb

SELECT COUNT(*) AS count FROM groups g WHERE g.start > 97600000 AND g.start < 
101000000 AND g.end > 100000000 AND g.chromosome = '1'


where [start] and [end] define the region of interest.  This query outperforms 
the R-Tree implementation on all tests that I have performed (for lengths of 
200bp to 10Mb across a whole chromosome).  Could this be of some practical use?
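
The reasoning behind the lower bound can be sketched in a few lines of Perl.
Any feature that overlaps the region must end after [start], and since no
feature is longer than the known maximum, it cannot begin more than that
maximum before [start].  The function name count_overlaps, the hash layout
and the sample coordinates are invented for this example; it mimics the
WHERE clause in memory and is not a benchmark of either approach.

```perl
use strict;
use warnings;

# Hypothetical in-memory version of the query.  $features is an array ref of
# { start => ..., end => ... } hashes; $max_len is the longest feature length.
sub count_overlaps {
    my ( $features, $qstart, $qend, $max_len ) = @_;
    my $lower = $qstart - $max_len;
    $lower = 0 if $lower < 0;    # same as max(0, start - max_len)
    my $count = grep {
             $_->{start} > $lower
          && $_->{start} < $qend
          && $_->{end} > $qstart
    } @$features;
    return $count;
}

my @genes = (
    { start => 99_000_000,  end => 100_500_000 },    # spans the region start
    { start => 90_000_000,  end => 91_000_000 },     # far upstream
    { start => 100_200_000, end => 100_300_000 },    # inside the region
    { start => 101_500_000, end => 102_000_000 },    # downstream
);

# Region 100Mb:101Mb with a 2.4Mb maximum length, as in the example above.
print count_overlaps( \@genes, 100_000_000, 101_000_000, 2_400_000 ), "\n";
```

In a database the benefit would presumably come from the bounded range on
the start column, which an ordinary index can scan directly.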


From jason at bioperl.org  Thu Oct 19 15:50:49 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 19 Oct 2006 08:50:49 -0700
Subject: [Bioperl-l] Annotation-DBLink- version numbers repeating
In-Reply-To: <0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
References: <001301c6f215$07a9a070$15327e82@pyrimidine>
	<6817.156.83.0.181.1161111708.squirrel@webmail.xs4all.nl>
	<6686E701-3BF2-462C-BAEE-EC20FE14F5FC@bioperl.org>
	<0562E6F4-D94D-4D3C-8BA7-B18084DBCDF7@gmx.net>
Message-ID: <8F3E3FEA-66AA-47AB-ABC0-74580E870203@bioperl.org>

Well there is explicit addition of the version to the primary id so  
it isn't so much a parsing error as a deliberate decision to append it.
see Bio::SeqIO::genbank

to make the dblink
    $annotation->add_Annotation
        ('dblink',
         Bio::Annotation::DBLink->new
         (-primary_id => $id . "." . $version,
          -version    => $version,
          -database   => $db,
          -tagname    => 'dblink'));

and the code to print the dblink back out in the writer already  
assumes the version number is appended...

         foreach my $ref ( $seq->annotation->get_Annotations('dblink') ) {
             # if ($ref->comment eq 'DBSOURCE') {
             $self->_print('DBSOURCE    accession ',
                           $ref->primary_id, "\n");
             # }
         }
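
To see how the two pieces interact, here is a small standalone sketch.
DemoDBLink is a hypothetical stand-in that uses the same stringification
logic as the overload quoted below; it is not the real
Bio::Annotation::DBLink.

```perl
package DemoDBLink;    # hypothetical stand-in for illustration only
use strict;
use warnings;

# Same shape as the quoted overload: database, primary_id, then ".version".
use overload '""' => sub {
    my $self = shift;
    return ( $self->{database} ? $self->{database} . ':' : '' )
      . ( $self->{primary_id} ? $self->{primary_id} : '' )
      . ( $self->{version} ? '.' . $self->{version} : '' );
};

sub new { my ( $class, %args ) = @_; return bless {%args}, $class }

package main;

# The parser stored "id.version" in primary_id, so the version shows up twice.
my $doubled = DemoDBLink->new(
    database => 'GenBank', primary_id => 'AL591065.17', version => 17 );

# With the bare accession in primary_id the string form is as expected.
my $clean = DemoDBLink->new(
    database => 'GenBank', primary_id => 'AL591065', version => 17 );

print "$doubled\n";    # GenBank:AL591065.17.17
print "$clean\n";      # GenBank:AL591065.17
```

So the overload and the parser are each consistent on their own; the
repetition only appears when both add the version.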

On Oct 19, 2006, at 6:56 AM, Hilmar Lapp wrote:

> Here is the overload code:
>
> use overload '""' => sub {
> 	(($_[0]->database ? $_[0]->database . ':' : '' )
> 	. ($_[0]->primary_id ? $_[0]->primary_id : '')
> 	. ($_[0]->version ? '.' . $_[0]->version : ''))
> 	|| '' };
>
> Except that the last '||' is redundant and unnecessary (it either  
> does nothing or replaces an empty string with an empty string), I  
> don't see the potential for duplicating the version number here -  
> unless primary_id() did that, which I don't see it doing.
>
> So, to me this seems to come from a parsing error in the beginning,  
> rather than an erroneous mangling of version into primary_id later.
>
> Is someone in the position to confirm this?
>
> 	-hilmar
>
> On Oct 19, 2006, at 1:00 AM, Jason Stajich wrote:
>
>> So I'm unsure what we should do here.
>>
>> We can certainly fix the problem which you report which is relying on
>> the "" method -- if you were to do instead:
>> print $_->database, ":", $_->primary_id, "\n";
>>
>> you'll get the right answer.  We can at a minimum just fix the
>> auto-string converting method to do The Right Thing.
>>
>> But I am not sure if we should keep the version out of the primary_id
>> field.  This will require some rejiggering in several modules when it
>> comes to printing DBlinks and I don't want to do this before the
>> release. I also am not sure if there was an explicit reason why
>> someone did put the version information in the primary_id. (I hope it
>> wasn't me because I don't think I'm going to remember why).
>>
>> Does anyone else have a strong feeling?
>>
>> -jason
>> On Oct 17, 2006, at 12:01 PM, Erikjan wrote:
>>
>>> Hello,
>>>
>>> I noticed a little problem with the Annotation "DBLink" from
>>> GenBank entries
>>>
>>> When I run:
>>>
>>> perl -MBio::DB::GenBank -e 'my $gi =
>>> 56205924;$db=Bio::DB::GenBank->new(-format => "genbank"); my  
>>> $seqio =
>>> $db->get_Stream_by_id($gi); my$seq = $seqio->next_seq; my
>>> $ac=$seq->annotation(); my @annotations = $ac->get_Annotations
>>> ("dblink");
>>> for(@annotations) { print $_, "\n";} print $INC{
>>> "Bio/Annotation/DBLink.pm" }, "\n"; '
>>>
>>> This yields:
>>>
>>>    GenBank:AL591065.17.17
>>>
>>> and the place where the used Bio/Annotation/DBLink.pm resides.
>>>
>>> Can others repeat this?
>>>
>>> I have dug into the source a little and Bio::Annotation::DBLink
>>> seems to
>>> be the place where this happens: it has a concatenation which  
>>> leads to
>>> that repeated version number.
>>>
>>> Is this something that I should fix "client-side", so to speak, or
>>> is it
>>> worthwhile to add some logic to that concatenation to prevent this?
>>>
>>>
>>> Thanks,
>>>
>>> Eric
>>>
>>>
>>>
>>
>> --
>> Jason Stajich, PhD
>> Miller Research Fellow
>> University of California
>> Dept of Plant and Microbial Biology
>> 321 Koshland Hall #3102
>> Berkeley, CA 94720-3102
>> lab: 510.642.8441
>> http://pmb.berkeley.edu/~taylor/people/js.html
>>
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From n.haigh at sheffield.ac.uk  Fri Oct 20 08:35:03 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 20 Oct 2006 08:35:03 +0000
Subject: [Bioperl-l] test::more template
In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
Message-ID: <45388A37.7040505@sheffield.ac.uk>

Chris Fields wrote:
>> Nathan,
>>
>>     
>>> use strict;
>>> use Bio::Root::IO;  # cant test for this, might be needed to get
>>>       
>> Test::More
>>
>> use File::Spec->catfile as Bio::Root::IO->catfile just delegates anyway,
>> and File::Spec is "guaranteed" to be installed with Perl 5.6+.
>>
>>     
>>>     use lib Bio::Root::IO->catfile('t','lib');
>>>       
>> Simpler as:
>> 	use lib 't/lib';
>> I understand the 'lib.pm' accepts Unix style directories REGARDLESS of
>> native
>> platform.
>>
>> --
>> Torsten Seemann
>> Victorian Bioinformatics Consortium, Monash University, Australia
>>     
>
> That is true, at least for WinXP (not sure about older Windows versions out
> there).  I was using 'Root::IO->catfile' but found 'use lib 't/lib' works.
> I may have a few of the 'catfile' versions floating around out there, which
> may be where that originated.
>
> Note that if you plan on using Test::More with the bioperl-run test suite,
> you should add it to the bioperl-run CVS distribution directory in 't/lib'.
> Most people will have it installed, but you never know.
>
> Chris
>
>
>   
What is the reason for including Test::More in 't/lib' rather than
having it as a prereq?

-- 
> A: Yes.
>> Q: Are you sure?
>>     
>>> A: Because it reverses the logical flow of conversation.
>>>       
>>>> Q: Why is top posting frowned upon?
>>>>         
Get Thunderbird <http://www.mozilla.org/products/thunderbird/>


From n.haigh at sheffield.ac.uk  Fri Oct 20 09:27:19 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 20 Oct 2006 10:27:19 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
Message-ID: <45389677.1000709@sheffield.ac.uk>

Is it really necessary to specify the number of tests that are to be
conducted in advance? It seems a bit annoying to have to count the
number of tests in the script, or to run the test just to see how many
tests were done. We could just use:
use Test::More 'no_plan';

And then it's up to Test::More to keep track of how many tests it has
run. The only thing then to worry about is how many tests are in a SKIP
block if the skip criteria are met. I'd suggest this unless there is a
good reason to use a plan that I am unaware of.

Thanks
Nath


From bix at sendu.me.uk  Fri Oct 20 10:01:09 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 11:01:09 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45389677.1000709@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45389677.1000709@sheffield.ac.uk>
Message-ID: <45389E65.6080908@sendu.me.uk>

Nathan Haigh wrote:
> Is it really necessary to specify the number of tests that are to be
> conducted in advance? It seems a bit annoying to have to count the
> number of tests in the script or to run the test just to see how many
> tests were done, we could just use:
> use Test::More 'no_plan';

It's very important to have a plan. That way you know all the tests 
actually ran and weren't skipped (either due to an actual SKIP block or 
an if block that returned false due to a bug, or a for/foreach/while 
that didn't loop enough times due to a bug, or any number of other reasons).
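
A minimal sketch of a planned test script with a SKIP block follows.  The
sample ids and the DEMO_REMOTE_TESTS environment variable are made up for
the example and are not Bioperl conventions.

```perl
use strict;
use warnings;
use Test::More tests => 4;    # the plan: exactly 4 tests must run or be skipped

my @ids = ( 'AL591065', 'AJ000012' );    # sample data, invented for the example

# If a bug emptied @ids, this loop would run zero tests without complaint;
# with a plan, the harness reports that fewer tests ran than were planned.
for my $id (@ids) {
    ok( length($id), "id '$id' is non-empty" );
}

SKIP: {
    # DEMO_REMOTE_TESTS is a hypothetical switch for this sketch.
    skip 'remote tests not enabled', 2 unless $ENV{DEMO_REMOTE_TESTS};
    ok( 1, 'placeholder for a remote lookup' );
    ok( 1, 'placeholder for checking the result' );
}
```

The count passed to skip() has to match the number of tests in the block,
which is the bookkeeping cost the plan asks for.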


From bix at sendu.me.uk  Fri Oct 20 10:04:48 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 11:04:48 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45388A37.7040505@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45388A37.7040505@sheffield.ac.uk>
Message-ID: <45389F40.5060601@sendu.me.uk>

Nathan S. Haigh wrote:
> Chris Fields wrote:
>
>> Note that if you plan on using Test::More with the bioperl-run test suite,
>> you should add it to the bioperl-run CVS distribution directory in 't/lib'.
>> Most people will have it installed, but you never know.
>
> What is the reason for including Test::More in 't/lib' rather than
> having it as a prereq?

Because we want to ensure that the test suite runs and tells you real 
problems (if any) about the code (Bioperl) that it is testing, not 
problems about actually running the tests (which are NOT required for 
using Bioperl, so cannot be considered 'pre-requisites').


From n.haigh at sheffield.ac.uk  Fri Oct 20 10:54:30 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 20 Oct 2006 11:54:30 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <45389E65.6080908@sendu.me.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk>
Message-ID: <4538AAE6.5070600@sheffield.ac.uk>

If there are known bugs in a particular version of software, what is the
best approach for dealing with tests that would fail due to this bug?
Simply skip those tests that would be affected by the bug, or fail if
the affected version is detected and report the reason so the user is
informed? Or simply bump the minimum version to one above the affected
versions?

For example, t/Clustalw has a test for at least version 1.8. It then has
some profile alignment tests that are only run if version > 1.82 is
installed. It states that versions 1.81 and 1.82 are affected by a
profile alignment bug, which I assume would make the tests fail.

Cheers
Nath


From bix at sendu.me.uk  Fri Oct 20 11:06:07 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 12:06:07 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <4538AAE6.5070600@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45389677.1000709@sheffield.ac.uk> <45389E65.6080908@sendu.me.uk>
	<4538AAE6.5070600@sheffield.ac.uk>
Message-ID: <4538AD9F.8040003@sendu.me.uk>

Nathan Haigh wrote:
> If there are known bugs in a particular version of software, what is the
> best approach for dealing with tests that would fail due to this bug?
> Simply skip those tests that would be affected by the bug, or to fail if
> the affected version is detected and report the reason so the user is
> informed? Or simply bump the minimum version to one above the affected
> versions?
> 
> For example, t/Clustalw has a test for at least version 1.8. It then has
> some profile alignment tests that are only run if version > 1.82 is
> installed. It states that versions 1.81 and 1.82 are affected by a
> profile alignment bug - which i assume would make the tests fail.

Specific cases like this I'd discuss on the list or with the author of
the module in question. Maybe there is some great need to allow usage
with <1.81?

My view, based purely on what you've said above: bump the pre-requisite
to a version that works.
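
For comparison, the version gating described in the question could be
written down roughly as follows.  The helper name and the returned labels
are hypothetical, and the sketch assumes the version is a simple decimal
such as 1.83; it does not reflect what t/Clustalw actually does.

```perl
use strict;
use warnings;

# Hypothetical helper: decide what to do with the profile alignment tests
# for a detected program version (assumed to be a plain decimal number).
sub profile_test_action {
    my ($version) = @_;
    return 'fail' if $version < 1.8;      # below the stated minimum
    return 'skip' if $version <= 1.82;    # profile tests only run for > 1.82
    return 'run';
}

for my $version ( 1.7, 1.8, 1.81, 1.82, 1.83 ) {
    printf "%-5s %s\n", $version, profile_test_action($version);
}
```

Bumping the minimum version instead would collapse this to a single
comparison.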


From cjfields at uiuc.edu  Fri Oct 20 12:36:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 07:36:37 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <45388A37.7040505@sheffield.ac.uk>
References: <000f01c6f3de$c3d91170$15327e82@pyrimidine>
	<45388A37.7040505@sheffield.ac.uk>
Message-ID: <80A2D210-B0DB-4CD2-9B56-A38097F4F63F@uiuc.edu>


>> ,,,
>>
> What is the reason for including Test::More in 't/lib' rather than
> having it as a prereq?

We could do that.  Many CPAN modules include it in 't/lib' b/c it is  
only needed for testing purposes.

Chris

>
> -- 
>> A: Yes.
>>> Q: Are you sure?
>>>
>>>> A: Because it reverses the logical flow of conversation.
>>>>
>>>>> Q: Why is top posting frowned upon?
>>>>>
> Get Thunderbird <http://www.mozilla.org/products/thunderbird/>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Fri Oct 20 14:44:29 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 15:44:29 +0100
Subject: [Bioperl-l] Updated Makefile.PL
Message-ID: <4538E0CD.1030908@sendu.me.uk>

Hi,
I've just committed an updated Makefile.PL to HEAD for bioperl-live. 
Could some people test it on multiple platforms and confirm it is ok 
(try out the different possible options as well)?

(NB. in the below, 'pre-reqs' are things the makefile considers optional 
dependencies)

Note that some pre-reqs have been removed:
# DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end 
up requiring it but only after the user makes an explicit choice by 
typing 'DBD::mysql' in their own code to supply as an option to Bioperl 
code)
# File::Temp (standard in 5.6.1)


This pre-req was wrong:
# Data::Stag::Writer
and has been replaced with:
Data::Stag::XMLWriter


Also, I note that very many Bioperl modules need IO::String, including 
Bio::SeqIO, so I'm not sure to what extent we can pretend it is an 
optional module. I didn't make any change though.


I don't know if these changes affect the Windows ppm, Nathan, or anything 
else (Bundle?).

The INSTALL docs need updating with these new and improved pre-reqs 
(note that some pre-reqs had wrong/not enough Bioperl modules listed as 
needing them); does someone want to correct the wiki (based on the new 
Makefile.PL) and then Chris can re-create the text version?


From hlapp at gmx.net  Fri Oct 20 15:03:34 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 20 Oct 2006 11:03:34 -0400
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <4538E0CD.1030908@sendu.me.uk>
References: <4538E0CD.1030908@sendu.me.uk>
Message-ID: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>


On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote:

> Also, I note that very many Bioperl modules need IO::String, including
> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
> optional module. I didn't make any change though.

I agree. There really aren't that many terribly useful things you can
do with Bioperl w/o having IO::String installed, which is in stark
contrast to many other dependencies.

I don't have a problem with making it (and a few others used all over  
the place) required, to better contrast them with the dependencies  
that are really optional (and not needed for 90% of users).
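
As an aside, whether a given dependency is present on a machine can be
probed at runtime with a few lines like the following.  The helper name
have_module is made up for this sketch and is not taken from Makefile.PL.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded, else 0.
sub have_module {
    my ($module) = @_;
    die "odd module name: $module\n" unless $module =~ /^\w+(?:::\w+)*$/;
    my $ok = eval "require $module; 1";    # string eval so the name can vary
    return $ok ? 1 : 0;
}

for my $module ( 'IO::String', 'File::Temp' ) {
    printf "%-12s %s\n", $module, have_module($module) ? 'available' : 'missing';
}
```

A required module would simply be loaded with 'use' and allowed to fail,
whereas an optional one is checked like this before the feature that needs
it is offered.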

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Oct 20 15:18:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 10:18:32 -0500
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <4538E0CD.1030908@sendu.me.uk>
Message-ID: <001501c6f45b$019103c0$15327e82@pyrimidine>

> Hi,
> I've just committed an updated Makefile.PL to HEAD for bioperl-live.
> Could some people test it on multiple platforms and confirm it is ok
> (try out the different possible options as well)?
> 
> (NB. in the below, 'pre-reqs' are things the makefile considers optional
> dependencies)
> 
> Note that some pre-reqs have been removed:
> # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end
> up requiring it but only after the user makes an explicit choice by
> typing 'DBD::mysql' in their own code to supply as an option to Bioperl
> code)
> # File::Temp (standard in 5.6.1)

I'll try it out on WinXP and Mac OS X.  BTW, do any of Lincoln's Bio::DB*
modules use DBD::mysql?  Bio::DB::GFF comes to mind.  I don't think it should
be an absolute requirement, though.

If we plan on removing those, then we should also remove them from
Bundle::Bioperl (if they are present).

> This pre-req was wrong:
> # Data::Stag::Writer
> and has been replaced with:
> Data::Stag::XMLWriter
> 
> 
> Also, I note that very many Bioperl modules need IO::String, including
> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
> optional module. I didn't make any change though.

Do they all require IO::String or is it an option?  There are a few
instances (WebDBSeqI-implementing, for instance) where this is presented as
an option for most OS's (along with the default, pipeline, and tempfile).
However, it is currently used by default with Windows due to lack of
pipe/fork support at the time.

BTW, the latter may now work with WinXP ActivePerl.  ActiveState has been
working on WinXP fork() emulation for a while, but I think it is still
somewhat experimental.  

> I don't know if these changes affect the Windows ppm Nathan, or anything
> else (Bundle?)?
> 
> The INSTALL docs need updating with these new and improved pre-reqs
> (note that some pre-reqs had wrong/not enough Bioperl modules listed as
> needing them); does someone want to correct the wiki (based on the new
> Makefile.PL) and then Chris can re-create the text version?

Easier to just modify the text version based on what is changed in the wiki,
at least for the time being.  The text dumping from elinks/lynx isn't
foolproof re: tables and such, which is one reason I think we should move
the prereqs to a separate file as it's easier to maintain long-term (this
seems to be where most changes occur anyway).  

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Fri Oct 20 15:23:38 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 16:23:38 +0100
Subject: [Bioperl-l] test::more template
In-Reply-To: <1161270180.453793a432e4f@webmail.shef.ac.uk>
References: <45375615.1020603@sheffield.ac.uk>	<45379BBB.1040400@sheffield.ac.uk>
	<1161270180.453793a432e4f@webmail.shef.ac.uk>
Message-ID: <4538E9FA.60701@sendu.me.uk>

Nathan Haigh wrote:
> I thought I'd have my first proper try at writing some tests. I was wondering if there is a template test file that I should use/study in order to be
> consistent with other tests.
> 
> Failing that - Is there a good test writing style I should follow in one of the other test files?

I originally based mine on one of Chris's EUtilities tests, but now 
refer to t/ESEfinder.t since it is small and demonstrates all the major 
tricky things you might have to do - skip remote tests if no 
BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests 
under some condition, fall-back to t/lib for Test::More if necessary.

(Though I just spotted an oops in the latter...)
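
For anyone without the file to hand, the skipping pattern described above might look roughly like the sketch below. This is not the contents of t/ESEfinder.t: fetch_remote() is a made-up stand-in that pretends the server is down, and the test count and messages are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical stand-in for a real remote request; returning nothing
# simulates a server that cannot be reached.
sub fetch_remote { return; }

ok(1, 'local test that needs no network');

SKIP: {
    # Remote tests only run when the developer opts in via BIOPERLDEBUG.
    skip('BIOPERLDEBUG not set; skipping remote tests', 2)
        unless $ENV{BIOPERLDEBUG};

    my $result = eval { fetch_remote() };

    # A server problem skips the remaining remote tests rather than failing.
    skip('remote server unavailable', 2) if $@ || !defined $result;

    ok(defined $result, 'got a result from the server');
    is($result, 'expected', 'result has the expected content');
}
```

Either way the plan of 3 is honoured, so a server that is down does not break an install; only a wrong answer from a live server would show up as a failure.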


From cjfields at uiuc.edu  Fri Oct 20 15:38:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 10:38:02 -0500
Subject: [Bioperl-l] test::more template
In-Reply-To: <4538E9FA.60701@sendu.me.uk>
Message-ID: <001601c6f45d$bb824350$15327e82@pyrimidine>

> Nathan Haigh wrote:
> > I thought I'd have my first proper try at writing some tests. I was
> wondering if there is a template test file that I should use/study in
> order to be
> > consistent with other tests.
> >
> > Failing that - Is there a good test writing style I should follow in one
> of the other test files?
> 
> I originally based mine on one of Chris's EUtilities tests, but now
> refer to t/ESEfinder.t since it is small and demonstrates all the major
> tricky things you might have to do - skip remote tests if no
> BIOPERLDEBUG, skip remote tests on remote server failure, skip all tests
> under some condition, fall-back to t/lib for Test::More if necessary.
> 
> (Though I just spotted an oops in the latter...)

I agree.  The EUtilities tests are quite long.  I plan on eventually cutting
out some of them.  Making them somewhat less prone to changes in returned XML
data has also been a pain, as demonstrated by some of the tests from MAIN
now failing... d'oh!

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From bix at sendu.me.uk  Fri Oct 20 15:39:32 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 20 Oct 2006 16:39:32 +0100
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <001501c6f45b$019103c0$15327e82@pyrimidine>
References: <001501c6f45b$019103c0$15327e82@pyrimidine>
Message-ID: <4538EDB4.3030500@sendu.me.uk>

Chris Fields wrote:
> BTW, do any of Lincoln's Bio::DB*
> use DBD::mySQL?  Bio::DB::GFF comes to mind.

No, just a require on a user-passed variable as I described.


>> Also, I note that very many Bioperl modules need IO::String, including
>> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
>> optional module. I didn't make any change though.
> 
> Do they all require IO::String or is it an option?

Oops, I take that back. Bio::SeqIO doesn't use IO::String. That's what 
you get for relying on grep output...
It's still many modules that use it, but I suppose you could do useful 
things without. So actually, let's keep it optional.


From cjfields at uiuc.edu  Fri Oct 20 20:32:32 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 20 Oct 2006 15:32:32 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
Message-ID: <000001c6f486$df508930$15327e82@pyrimidine>


Seth, 

Did you work out the problem here?  There was a recent CVS update to OBDA
tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
apparently left data from tests in the database, which caused problems with
repeated test runs.

Chris

> > -----Original Message-----
> > From: bioperl-l-bounces <at> lists.open-bio.org [mailto:bioperl-l-
> > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ...
>
> > FAILED tests 10-12
> >     Failed 3/12 tests, 75.00% okay
> > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> >
> --------------------------------------------------------------------------
> > -----
> > t\02species.t                 65    2   3.08%  63 65
> > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > t\16obda.t                    12    3  25.00%  10-12
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l <at> lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

-- 
Best Regards,

Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From olenka.m at gmail.com  Fri Oct 20 21:47:15 2006
From: olenka.m at gmail.com (Olena Morozova)
Date: Fri, 20 Oct 2006 14:47:15 -0700
Subject: [Bioperl-l] GO annotations
Message-ID: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com>

Dear all,

Does anyone know an easy way to get GO-BP annotations for ensembl genes?

Thank you very much for your help,
Olena


From sdavis2 at mail.nih.gov  Sat Oct 21 15:05:26 2006
From: sdavis2 at mail.nih.gov (Davis, Sean (NIH/NCI) [E])
Date: Sat, 21 Oct 2006 11:05:26 -0400
Subject: [Bioperl-l] GO annotations
References: <259a224c0610201447i44bd5226x3d66c31dbf3f715d@mail.gmail.com>
Message-ID: <014DBF86B19310419F0DF8910FC56457240CE3@nihcesmlbx10.nih.gov>

You can use the ensembl perl API, or (more simply) use the Ensembl MART interface:

http://www.ensembl.org/Multi/martview

Sean


-----Original Message-----
From: Olena Morozova [mailto:olenka.m at gmail.com]
Sent: Fri 10/20/2006 5:47 PM
To: bioperl-l
Subject: [Bioperl-l] GO annotations
 
Dear all,

Does anyone know an easy way to get GO-BP annotations for ensembl genes?

Thank you very much for your help,
Olena


From n.haigh at sheffield.ac.uk  Sun Oct 22 10:34:51 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 22 Oct 2006 10:34:51 +0000
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
References: <4538E0CD.1030908@sendu.me.uk>
	<3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
Message-ID: <453B494B.7040702@sheffield.ac.uk>

Hilmar Lapp wrote:
> On Oct 20, 2006, at 10:44 AM, Sendu Bala wrote:
>
>   
>> Also, I note that very many Bioperl modules need IO::String, including
>> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an
>> optional module. I didn't make any change though.
>>     
>
> I agree. There's really not that many terribly useful things you can  
> do with Bioperl w/o having IO::String installed, which is in stark  
> contrast to many other dependencies.
>
> I don't have a problem with making it (and a few others used all over  
> the place) required, to better contrast them with the dependencies  
> that are really optional (and not needed for 90% of users).
>
> 	-hilmar
>
>   

Is it possible to  make a distinction in Makefile.PL between those
modules that are an absolute must for Bioperl-core and those which are
optional and should go into Bundle::BioPerl?

Once I'm sure what should be "optional" I'll do the Bundle::BioPerl
package and PPDs.

Cheers
Nath


From vitacolonna at appliedgenomics.org  Sun Oct 22 13:04:48 2006
From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna)
Date: Sun, 22 Oct 2006 15:04:48 +0200
Subject: [Bioperl-l] Submission proposal: ABIF module
Message-ID: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>

Hi everybody,
I would like to submit to CPAN a module for reading and parsing the  
ABIF files (with .ab1 suffix) produced by Applied Biosystems  
sequencers. The need for such a module arose in our lab because the  
existing ABI module we found on CPAN had too limited functionality.  
As an example, our module allows us to easily produce analysis  
reports similar to the ones generated by the Sequencing Analysis  
software.

May I call the module Bio::ABIF? Or should I follow other conventions?

Nicola


From cjfields at uiuc.edu  Sun Oct 22 13:54:51 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 08:54:51 -0500
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
Message-ID: <F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>


On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:

> Hi everybody,
> I would like to submit to CPAN a module for reading and parsing the
> ABIF files (with .ab1 suffix) produced by Applied Biosystems
> sequencers. The need for such a module arose in our lab because the
> existing ABI module we found on CPAN had too limited functionality.
> As an example, our module allows us to easily produce analysis
> reports similar to the ones generated by the Sequencing Analysis
> software.
>
> May I call the module Bio::ABIF? Or should I follow other conventions?
>
> Nicola

It depends.  Does it interact with bioperl in any way?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Oct 22 13:57:18 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 08:57:18 -0500
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <453B494B.7040702@sheffield.ac.uk>
References: <4538E0CD.1030908@sendu.me.uk>
	<3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
	<453B494B.7040702@sheffield.ac.uk>
Message-ID: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>


On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote:

> Is it possible to  make a distinction in Makefile.PL between those
> modules that are an absolute must for Bioperl-core and those which are
> optional and should go into Bundle::BioPerl?
>
> Once I'm sure what should be "optional" I'll do the Bundle::BioPerl
> package and PPDs.
>
> Cheers
> Nath

We probably should steer this way eventually.  Do you plan on placing  
prereqs required for bioperl core in the bioperl PPD and the  
'optional' ones with the bundle?

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From vitacolonna at appliedgenomics.org  Sun Oct 22 14:16:26 2006
From: vitacolonna at appliedgenomics.org (Nicola Vitacolonna)
Date: Sun, 22 Oct 2006 16:16:26 +0200
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
	<F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>
Message-ID: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>


On 22/ott/06, at 15:54, Chris Fields wrote:

>
> On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:
>
>> Hi everybody,
>> I would like to submit to CPAN a module for reading and parsing the
>> ABIF files (with .ab1 suffix) [...]
>> May I call the module Bio::ABIF? Or should I follow other  
>> conventions?
>
> It depends.  Does it interact with bioperl in any way?

No. Can you suggest a suitable pattern for the name?

Nicola


From cjfields at uiuc.edu  Sun Oct 22 14:55:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 09:55:46 -0500
Subject: [Bioperl-l] Submission proposal: ABIF module
In-Reply-To: <8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>
References: <2CE723F7-16CF-42D4-8185-009C42D01870@appliedgenomics.org>
	<F9A7DC53-4470-4A91-A655-6114B1B101C1@uiuc.edu>
	<8619131E-0F3D-4718-B7E3-7B5118B1139E@appliedgenomics.org>
Message-ID: <B4155C40-8E3D-4AA0-88F5-7A1FFBD3A134@uiuc.edu>

On Oct 22, 2006, at 9:16 AM, Nicola Vitacolonna wrote:

> On 22/ott/06, at 15:54, Chris Fields wrote:
>
>>
>> On Oct 22, 2006, at 8:04 AM, Nicola Vitacolonna wrote:
>>
>>> Hi everybody,
>>> I would like to submit to CPAN a module for reading and parsing the
>>> ABIF files (with .ab1 suffix) [...]
>>> May I call the module Bio::ABIF? Or should I follow other
>>> conventions?
>>
>> It depends.  Does it interact with bioperl in any way?
>
> No. Can you suggest a suitable pattern for the name?
>
> Nicola

I don't think it will be a problem to name it Bio::ABIF; there is  
already a Bio::ASN1::EntrezGene, and Rutger Vos's Bio::Phylo modules  
(the latter doesn't require BioPerl either).

Saying that, if you plan on contributing more CPAN modules with  
similar functionality (such as parsing other trace files), you might  
want to consider using a namespace that isn't limiting but doesn't  
conflict with Bioperl core (like Bio::Trace or similar, then name  
your module Bio::Trace::ABIF).  You can use search.cpan.org to check  
namespaces for conflicts.

Just as a note: we have bioperl-ext, which also parses ABI and other  
trace file formats.  It's a bit old now and needs updating, but is  
supposed to be quite fast (it uses the Staden io_lib C library via  
Perl XS).

-c

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From arareko at campus.iztacala.unam.mx  Sun Oct 22 17:26:37 2006
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Sun, 22 Oct 2006 12:26:37 -0500
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <4538E0CD.1030908@sendu.me.uk>
References: <4538E0CD.1030908@sendu.me.uk>
Message-ID: <453BA9CD.4060107@campus.iztacala.unam.mx>

Works fine on FreeBSD.

Mauricio.

Sendu Bala wrote:
> Hi,
> I've just committed an updated Makefile.PL to HEAD for bioperl-live. 
> Could some people test it on multiple platforms and confirm it is ok 
> (try out the different possible options as well)?
> 
> (NB. in the below, 'pre-reqs' are things the makefile considers optional 
> dependencies)
> 
> Note that some pre-reqs have been removed:
> # DBD::mysql (nothing in Bioperl-live uses it; Bioperl modules may end 
> up requiring it but only after the user makes an explicit choice by 
> typing 'DBD::mysql' in their own code to supply as an option to Bioperl 
> code)
> # File::Temp (standard in 5.6.1)
> 
> 
> This pre-req was wrong:
> # Data::Stag::Writer
> and has been replaced with:
> Data::Stag::XMLWriter
> 
> 
> Also, I note that very many Bioperl modules need IO::String, including 
> Bio::SeqIO, so I'm not sure to what extent we can pretend it is an 
> optional module. I didn't make any change though.
> 
> 
> I don't know if these changes affect the Windows ppm Nathan, or anything 
> else (Bundle?)?
> 
> The INSTALL docs need updating with these new and improved pre-reqs 
> (note that some pre-reqs had wrong/not enough Bioperl modules listed as 
> needing them); does someone want to correct the wiki (based on the new 
> Makefile.PL) and then Chris can re-create the text version?

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From n.haigh at sheffield.ac.uk  Sun Oct 22 19:37:07 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 22 Oct 2006 20:37:07 +0100
Subject: [Bioperl-l] Updated Makefile.PL
In-Reply-To: <7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>
References: <4538E0CD.1030908@sendu.me.uk>
	<3D4CC627-7479-48F0-A84B-E4C471C5B600@gmx.net>
	<453B494B.7040702@sheffield.ac.uk>
	<7E563B1E-11D7-44F4-9E3D-3E319CE8DDA4@uiuc.edu>
Message-ID: <453BC863.4090803@sheffield.ac.uk>

Chris Fields wrote:
>
> On Oct 22, 2006, at 5:34 AM, Nathan S. Haigh wrote:
>
>> Is it possible to  make a distinction in Makefile.PL between those
>> modules that are an absolute must for Bioperl-core and those which are
>> optional and should go into Bundle::BioPerl?
>>
>> Once I'm sure what should be "optional" I'll do the Bundle::BioPerl
>> package and PPDs.
>>
>> Cheers
>> Nath
>
> We probably should steer this way eventually.  Do you aim on placing 
> prereqs required for bioperl core in the bioperl PPD and the 
> 'optional' ones with the bundle?
>
That's correct. However, PPM will always try to update packages to the 
latest available. Therefore, if at some point in the future a 
dependency is removed, and thus removed from Bundle::BioPerl, a 
situation may arise where an older version of BioPerl is running with 
a recent version of Bundle::BioPerl and could have missing 
dependencies - not ideal, but it is how things currently stand. The 
process of making the Bundle::BioPerl PPD would be simplified if these 
"optional" dependencies were separated from the "core" dependencies. If 
one of the following solutions is possible (I'm not sure if they are), 
it would be very useful:

1) Maintain 2 hashes in Makefile.PL that contain the "core" and 
"optional" dependencies. I'm unsure of the way dependencies are ordered 
during a "make ppd", but it may be possible to pass hash references of 
both to PREREQ_PM in WriteMakefile() and have the "optional" dependencies 
grouped separately from the "core" dependencies in the ppd file - thus 
making it easy to strip them out into a Bundle::BioPerl ppd.

2) Again, maintain 2 hashes in Makefile.PL that contain the "core" and 
"optional" dependencies. Have some Makefile setup that allows the 
generation of a Bundle::BioPerl ppd separately from the main Bioperl ppd.

Like I said, these are just some thoughts and I'm not sure if they are 
even viable options.
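
To make the two-hash idea concrete, here is a rough sketch. The module names are only placeholders, and which hash each one belongs in is an assumption (the split was still undecided in this thread). PREREQ_PM itself takes a single hash of module name => minimum version, so the two hashes are merged for it here.

```perl
use strict;
use warnings;

# Hypothetical split of dependencies; the assignment to "core" or
# "optional" is for illustration only.
my %core_deps     = ('IO::String' => 0);
my %optional_deps = (
    'Data::Stag::XMLWriter' => 0,
    'HTTP::Request::Common' => 0,
);

# PREREQ_PM expects one flat hash, so merge the two for WriteMakefile().
my %prereq_pm = (%core_deps, %optional_deps);

# The optional hash can be listed on its own, e.g. as the module list
# for a separate Bundle::BioPerl.
my @bundle_contents = sort keys %optional_deps;
print "$_\n" for @bundle_contents;
```

In a real Makefile.PL the merged hash would be passed as PREREQ_PM => \%prereq_pm; whether "make ppd" can keep the two groups apart in its output is the open question from point 1.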

Nath


From chhalling at alumni.ls.berkeley.edu  Sun Oct 22 23:45:33 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Sun, 22 Oct 2006 19:45:33 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
Message-ID: <453C029D.1070708@alumni.ls.berkeley.edu>

I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 
that prevent these modules from being installed:

Data::Stag::Writer (listed as Data::Stag::writer)
HTTP::Request::Common (listed as HTTP::Request::Common-)
Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel)

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From cjfields at uiuc.edu  Mon Oct 23 02:24:07 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 22 Oct 2006 21:24:07 -0500
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
Message-ID: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>

Thanks for letting us know!  Did PPM4 throw errors or just silently  
pass them over?

Chris

On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote:

> I have found three misspellings in Bundle::BioPerl 2.1.6 of 17- 
> Oct-2006
> that prevent these modules from being installed:
>
> Data::Stag::Writer (listed as Data::Stag::writer)
> HTTP::Request::Common (listed as HTTP::Request::Common-)
> Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel)
>
> -- 
> Conrad Halling
> chhalling at alumni.ls.berkeley.edu
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Mon Oct 23 06:45:29 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 06:45:29 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
Message-ID: <453C6509.90005@sheffield.ac.uk>

Chris Fields wrote:
> Thanks for letting us know!  Did PPM4 throw errors or just silently  
> pass them over?
>
> Chris
>
> On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote:
>
>   
I believe he is talking about the bundle on cpan and not the ppd. I will
get this updated as soon as possible.

Sendu/Chris - can you confirm to me which Bioperl modules are essential
to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
reason for not putting *all* dependencies into the bundle?

Nath


From bix at sendu.me.uk  Mon Oct 23 06:43:36 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 07:43:36 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C029D.1070708@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
Message-ID: <453C6498.5@sendu.me.uk>

Conrad Halling wrote:
> I have found three misspellings in Bundle::BioPerl 2.1.6 of 17-Oct-2006 
> that prevent these modules from being installed:
> 
> Data::Stag::Writer (listed as Data::Stag::writer)

This should be Data::Stag::XMLWriter

> HTTP::Request::Common (listed as HTTP::Request::Common-)
> Spreadsheet::ParseExcel (listed as Spreadhseet::ParseExcel)


From bix at sendu.me.uk  Mon Oct 23 06:52:47 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 07:52:47 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C6509.90005@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk>
Message-ID: <453C66BF.1060008@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu/Chris - can you confirm to me which Bioperl modules are essential
> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
> reason for not putting *all* dependencies into the bundle?

AFAIK, there are no essential external dependencies. Everything in 
%packages in Makefile.PL, for example, is optional.

We had the discussion about making all the easy-to-install ones a forced 
requirement anyway (so that most things work out of the box), but 
perhaps we'll hold off on making such a change until after 1.5.2.


From jyotikshah at gmail.com  Mon Oct 23 07:10:43 2006
From: jyotikshah at gmail.com (Jyoti Shah)
Date: Mon, 23 Oct 2006 00:10:43 -0700
Subject: [Bioperl-l] short motif searches
Message-ID: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com>

Hi,

I am interested in searching motifs as small as 6 or 7 nucleotides in
genomic databases. I need exact matches. Is there any bioperl module
available which can help me do this? I tried WU BLAST with word size one,
but I am getting warning messages such as "WARNING: the maximum achievable
score of 7 in context 0 (frame +1) is less than the ungapped cutoff score S2
(=13). Exit code 0...". Any suggestions?

Thanks in advance,
Jyoti


From bix at sendu.me.uk  Mon Oct 23 07:55:40 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 08:55:40 +0100
Subject: [Bioperl-l] short motif searches
In-Reply-To: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com>
References: <769931430610230010t5f68e54eg7fbc1be658c2cb3a@mail.gmail.com>
Message-ID: <453C757C.1010408@sendu.me.uk>

Jyoti Shah wrote:
> Hi,
> 
> I am interested in searching motifs as small as 6 or 7 nucleotides in
> genomic databases. I need exact matches. Is there any bioperl module
> available which can help me do this?

I should point out that a simple exact match with a motif only 6 or 7bp 
long is going to give you very many hits; are you sure this is an 
appropriate thing to do for your purposes?

Assuming yes, you can use Bio::SeqIO, Bio::Index or Bio::DB::<something> 
to get your genomic sequences of interest, then simply use a normal perl 
regexp on the resulting $seq->seq strings.
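
As a minimal sketch of the regexp step, assuming the sequence is already in a plain string (in practice it would come from $seq->seq): find_motif() is a name made up for this example, and it only searches the strand it is given, so the reverse complement would need a second call.

```perl
use strict;
use warnings;

# Return the 1-based start position of every exact occurrence of $motif
# in $sequence, including overlapping ones. Matching ignores case.
sub find_motif {
    my ($sequence, $motif) = @_;
    my @starts;
    # The lookahead matches without consuming characters, so the search
    # resumes one base further on and overlapping hits are not missed.
    while ($sequence =~ /(?=\Q$motif\E)/gi) {
        push @starts, pos($sequence) + 1;
    }
    return @starts;
}

my @hits = find_motif('AAGAATTCGAATTCAA', 'GAATTC');
print "GAATTC found at: @hits\n";
```

For whole genomes the same loop can be run per sequence while reading with Bio::SeqIO, which keeps only one sequence in memory at a time.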

If your motifs are anything like transcription factor binding sites, and 
you have more information than just a single sequence string for the 
motif, investigate Bio::Matrix::PSM.


From bix at sendu.me.uk  Mon Oct 23 08:29:52 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 09:29:52 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C7648.8030004@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk>
	<453C7648.8030004@sheffield.ac.uk>
Message-ID: <453C7D80.80207@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> Sendu/Chris - can you confirm to me which Bioperl modules are essential
>>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
>>> reason for not putting *all* dependencies into the bundle?
>> AFAIK, there are no essential external dependencies. Everything in
>> %packages in Makefile.PL, for example, is optional.
>>
>> We had the discussion about making all the easy-to-install ones a
>> forced requirement anyway (so that most things work out of the box),
>> but perhaps we'll hold off on making such a change until after 1.5.2.
 >
> How are they forced?

They're not. Right now they're optional. I'm suggesting we might change 
that in the future.

If you're asking how we /would/ force them, probably by adding 
PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs 
successfully (or should!) without its optional dependencies given in 
PREREQ_PM because make test succeeds (because tests skip ok when the 
optional dependency isn't there).
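
A rough sketch of what that switch could look like, assuming a made-up dependency list and a placeholder NAME (the real list is %packages in Makefile.PL). makefile_args() is a hypothetical helper; PREREQ_PM and PREREQ_FATAL are the actual MakeMaker attribute names.

```perl
use strict;
use warnings;

# Hypothetical dependency list for illustration.
my %prereqs = (
    'IO::String'            => 0,
    'Data::Stag::XMLWriter' => 0,
);

# Build the argument list for WriteMakefile(). With $force true,
# PREREQ_FATAL makes Makefile.PL die when a listed module is missing;
# without it, a missing module only produces a warning.
sub makefile_args {
    my ($force) = @_;
    my %args = (
        NAME      => 'Bio',          # placeholder distribution name
        PREREQ_PM => { %prereqs },
    );
    $args{PREREQ_FATAL} = 1 if $force;
    return %args;
}

my %optional_args = makefile_args(0);   # current behaviour
my %forced_args   = makefile_args(1);   # possible future behaviour

# In a real Makefile.PL this would be followed by:
#   use ExtUtils::MakeMaker;
#   WriteMakefile(%forced_args);
```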

I don't really know how CPAN discovers dependencies and auto-installs 
them before a dependent module though. Anyone care to explain?


From n.haigh at sheffield.ac.uk  Mon Oct 23 10:09:12 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 10:09:12 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C7D80.80207@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk> <453C66BF.1060008@sendu.me.uk>
	<453C7648.8030004@sheffield.ac.uk> <453C7D80.80207@sendu.me.uk>
Message-ID: <453C94C8.5040900@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Nathan S. Haigh wrote:
>>>> Sendu/Chris - can you confirm to me which Bioperl modules are
>>>> essential
>>>> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
>>>> reason for not putting *all* dependencies into the bundle?
>>> AFAIK, there are no essential external dependencies. Everything in
>>> %packages in Makefile.PL, for example, is optional.
>>>
>>> We had the discussion about making all the easy-to-install ones a
>>> forced requirement anyway (so that most things work out of the box),
>>> but perhaps we'll hold off on making such a change until after 1.5.2.
> >
>> How are they forced?
>
> They're not. Right now they're optional. I'm suggesting we might
> change that in the future.
> If you're asking how we /would/ force them, probably by adding
> PREREQ_FATAL to the WriteMakefile() call. As it is, Bioperl installs
> successfully (or should!) without its optional dependencies given in
> PREREQ_PM because make test succeeds (because tests skip ok when the
> optional dependency isn't there).
>
> I don't really know how CPAN discovers dependencies and auto-installs
> them before a dependent module though. Anyone care to explain?

I thought so! I misunderstood something earlier which confused me. Just
to clarify for my own sanity's sake:

1) Currently all dependencies are optional.
2) All dependencies are in %packages.
3) All of these are passed to PREREQ_PM.

As for how CPAN discovers dependencies, here is a snip from the CPAN FAQ:
--snip--

    I installed a Bundle and had a couple of fails. When I retried,
    everything resolved nicely. Can this be fixed to work on first try?

    The reason for this is that CPAN does not know the dependencies of
    all modules when it starts out. To decide about the additional items
    to install, it just uses data found in the META.yml file or the
    generated Makefile. An undetected missing piece breaks the process.
    But it may well be that your Bundle installs some prerequisite later
    than some depending item and thus your second try is able to resolve
    everything. Please note, CPAN.pm does not know the dependency tree
    in advance and cannot sort the queue of things to install in a
    topologically correct order. It resolves perfectly well IF all
    modules declare the prerequisites correctly with the PREREQ_PM
    attribute to MakeMaker or the |requires| stanza of Module::Build.
    For bundles which fail and you need to install often, it is
    recommended to sort the Bundle definition file manually.

--snip--

Therefore, recent modifications to Makefile.PL should result in a fully
operational Bioperl installation, if installed via CPAN, although only
Bioperl 1.4 is available via CPAN currently. It is possible to upload a
developer release to CPAN which can only be downloaded via CPAN if
specifically asked for. That would be good for 1.5.x:
--snip--

    How do I install a "DEVELOPER RELEASE" of a module?

    By default, CPAN will install the latest non-developer release of a
    module. If you want to install a dev release, you have to specify
    the partial path starting with the author id to the tarball you wish
    to install, like so:

        cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz

    Note that you can use the |ls| command to get this path listed.

--snip--

HTH
Nath


From bix at sendu.me.uk  Mon Oct 23 09:41:52 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 10:41:52 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C94C8.5040900@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
Message-ID: <453C8E60.7000105@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>
>> I don't really know how CPAN discovers dependencies and auto-installs
>> them before a dependent module though. Anyone care to explain?
> 
> I thought so! I misunderstood something earlier which confused me. Just
> to clarify for my own sanity's sake:
> 
> 1) Currently all dependencies are optional.
> 2) All dependencies are in %packages
> 3) all these are passed to PREREQ_PM

All correct.


> As far as CPAN discovering dependencies, here is a snip from the CPAN FAQ's:
> --snip--
> 
>     I installed a Bundle and had a couple of fails. When I retried,
>     everything resolved nicely. Can this be fixed to work on first try?
> 
>     The reason for this is that CPAN does not know the dependencies of
>     all modules when it starts out. To decide about the additional items
>     to install, it just uses data found in the META.yml file or the
>     generated Makefile. An undetected missing piece breaks the process.
>     But it may well be that your Bundle installs some prerequisite later
>     than some depending item and thus your second try is able to resolve
>     everything. Please note, CPAN.pm does not know the dependency tree
>     in advance and cannot sort the queue of things to install in a
>     topologically correct order. It resolves perfectly well IF all
>     modules declare the prerequisites correctly with the PREREQ_PM
>     attribute to MakeMaker or the |requires| stanza of Module::Build.
>     For bundles which fail and you need to install often, it is
>     recommended to sort the Bundle definition file manually.
> 
> --snip--
>
> Therefore, recent modifications to Makefile.PL should result in a fully
> operational Bioperl installation, if installed via CPAN.

Right, thanks for that.


> Although only Bioperl 1.4 is available via CPAN currently. It is possible to upload a
> developer release to CPAN which can only be downloaded via CPAN if
> specifically asked for - would be good for 1.5.x.:
> --snip--
> 
>     How do I install a "DEVELOPER RELEASE" of a module?
> 
>     By default, CPAN will install the latest non-developer release of a
>     module. If you want to install a dev release, you have to specify
>     the partial path starting with the author id to the tarball you wish
>     to install, like so:
> 
>         cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz
> 
>     Note that you can use the |ls| command to get this path listed.
> 
> --snip--

That's the user point of view - how does the developer actually tell 
CPAN that something is a developer release so that normal users don't 
automatically install it?


From bix at sendu.me.uk  Mon Oct 23 09:59:52 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 10:59:52 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C8E60.7000105@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
	<453C8E60.7000105@sendu.me.uk>
Message-ID: <453C9298.9000900@sendu.me.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> As far as CPAN discovering dependencies, here is a snip from the CPAN 
>> FAQ's:
>> --snip--
>>
>>     I installed a Bundle and had a couple of fails. When I retried,
>>     everything resolved nicely. Can this be fixed to work on first try?
>>
>>     The reason for this is that CPAN does not know the dependencies of
>>     all modules when it starts out. To decide about the additional items
>>     to install, it just uses data found in the META.yml file or the
>>     generated Makefile. An undetected missing piece breaks the process.
>>     But it may well be that your Bundle installs some prerequisite later
>>     than some depending item and thus your second try is able to resolve
>>     everything. Please note, CPAN.pm does not know the dependency tree
>>     in advance and cannot sort the queue of things to install in a
>>     topologically correct order. It resolves perfectly well IF all
>>     modules declare the prerequisites correctly with the PREREQ_PM
>>     attribute to MakeMaker or the |requires| stanza of Module::Build.
>>     For bundles which fail and you need to install often, it is
>>     recommended to sort the Bundle definition file manually.
>>
>> --snip--
>>
>> Therefore, recent modifications to Makefile.PL should result in a fully
>> operational Bioperl installation, if installed via CPAN.
> 
> Right, thanks for that.

Oh, so this effectively means that our 'optional' dependencies are 
installed for CPAN users, which matches up to my 'force the optional 
ones anyway' desire, leaving Bundle::BioPerl without any use.

Makefile.PL could be altered again to remove from PREREQ_PM those 
modules the user didn't already have installed, thus CPAN would only 
install Bioperl itself and nothing optional. The user could then install 
Bundle::BioPerl if they wanted a quick way of getting all the optional 
stuff to work.

I'm happy either way; what do other people think?
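
To make the second option concrete, here is a minimal sketch of how
Makefile.PL might filter the list before handing it to WriteMakefile().
The %packages contents below are a made-up stand-in (the second module
name is hypothetical and the values are simplified to bare version
numbers), and installed_prereqs() is a helper name invented for this
example, not something that exists in Bioperl:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the %packages hash in Makefile.PL.
# 'Bio::Hypothetical::Missing' is an invented name that should not exist.
my %packages = (
    'File::Spec'                 => 0,
    'Bio::Hypothetical::Missing' => 0,
);

# Return only those entries whose module can be loaded right now.
sub installed_prereqs {
    my (%wanted) = @_;
    my %found;
    for my $module (sort keys %wanted) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        $found{$module} = $wanted{$module} if eval { require $file; 1 };
    }
    return %found;
}

my %prereq_pm = installed_prereqs(%packages);
print "Would pass to PREREQ_PM: $_\n" for sort keys %prereq_pm;

# In a real Makefile.PL this would end with something like:
#   WriteMakefile( ..., PREREQ_PM => \%prereq_pm );
```

With this approach CPAN would see only modules that are already present,
so it would have nothing extra to fetch; %packages itself stays complete
for anyone editing the Bundle.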


From n.haigh at sheffield.ac.uk  Mon Oct 23 11:22:17 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 11:22:17 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C9298.9000900@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
	<453C8E60.7000105@sendu.me.uk> <453C9298.9000900@sendu.me.uk>
Message-ID: <453CA5E9.1060406@sheffield.ac.uk>

Sendu Bala wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> As far as CPAN discovering dependencies, here is a snip from the
>>> CPAN FAQ's:
>>> --snip--
>>>
>>>     I installed a Bundle and had a couple of fails. When I retried,
>>>     everything resolved nicely. Can this be fixed to work on first try?
>>>
>>>     The reason for this is that CPAN does not know the dependencies of
>>>     all modules when it starts out. To decide about the additional items
>>>     to install, it just uses data found in the META.yml file or the
>>>     generated Makefile. An undetected missing piece breaks the process.
>>>     But it may well be that your Bundle installs some prerequisite later
>>>     than some depending item and thus your second try is able to resolve
>>>     everything. Please note, CPAN.pm does not know the dependency tree
>>>     in advance and cannot sort the queue of things to install in a
>>>     topologically correct order. It resolves perfectly well IF all
>>>     modules declare the prerequisites correctly with the PREREQ_PM
>>>     attribute to MakeMaker or the |requires| stanza of Module::Build.
>>>     For bundles which fail and you need to install often, it is
>>>     recommended to sort the Bundle definition file manually.
>>>
>>> --snip--
>>>
>>> Therefore, recent modifications to Makefile.PL should result in a fully
>>> operational Bioperl installation, if installed via CPAN.
>>
>> Right, thanks for that.
>
> Oh, so this effectively means that our 'optional' dependencies are
> installed for CPAN users, which matches up to my 'force the optional
> ones anyway' desire, leaving Bundle::BioPerl without any use.
>
> Makefile.PL could be altered again to remove from PREREQ_PM those
> modules the user didn't already have installed, thus CPAN would only
> install Bioperl itself and nothing optional. The user could then
> install Bundle::BioPerl if they wanted a quick way of getting all the
> optional stuff to work.
>
> I'm happy either way; what do other people think?

From my point of view, removing them from PREREQ_PM makes building
Bundle::BioPerl a bit of a pain :o(

I prefer the way it is currently set up - most people have fast internet
connections and GBs of hard drive space. Other than the reason "why
install something I won't ever need" I don't see much point in maintaining
Bundle::BioPerl and having "optional" dependencies. If there are
any modules which are not going to be used by the majority of users,
then that could be the rationale for moving them out of
bioperl-core into another package?

Nath


From n.haigh at sheffield.ac.uk  Mon Oct 23 11:38:05 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 11:38:05 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C8E60.7000105@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>
	<453C7D80.80207@sendu.me.uk> <453C94C8.5040900@sheffield.ac.uk>
	<453C8E60.7000105@sendu.me.uk>
Message-ID: <453CA99D.9060009@sheffield.ac.uk>


>> Although only Bioperl 1.4 is available via CPAN currently. It is
>> possible to upload a
>> developer release to CPAN which can only be downloaded via CPAN if
>> specifically asked for - would be good for 1.5.x.:
>> --snip--
>>
>>     How do I install a "DEVELOPER RELEASE" of a module?
>>
>>     By default, CPAN will install the latest non-developer release of a
>>     module. If you want to install a dev release, you have to specify
>>     the partial path starting with the author id to the tarball you wish
>>     to install, like so:
>>
>>         cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz
>>
>>     Note that you can use the |ls| command to get this path listed.
>>
>> --snip--
>
> That's the user point of view - how does the developer actually tell
> CPAN that something is a developer release so that normal users don't
> automatically install it?

I found this:
http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt

It says that $VERSION should simply be changed from a naked number into
a single-quoted number, and that this should be recognized by the CPAN
indexer.

Nath


From bix at sendu.me.uk  Mon Oct 23 10:47:38 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 11:47:38 +0100
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>	<4535EBF9.1090706@sendu.me.uk>
	<4536113D.1080307@sheffield.ac.uk>
	<E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>
Message-ID: <453C9DCA.4020802@sendu.me.uk>

Hilmar Lapp wrote:
> On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote:
> 
>> For example, I have made no effort to setup biosql-schema but I
>> thought that maybe there would be a test that would detect this
> 
> I'm afraid there isn't. Bioperl-db is meaningless without
> biosql-schema.

Can you suggest a way we might detect if biosql-schema has been 
installed prior to running the test suite, so we can give some 
meaningful error message?


From bix at sendu.me.uk  Mon Oct 23 12:43:30 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 13:43:30 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
Message-ID: <453CB8F2.7070703@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>
>> Makefile.PL could be altered again to remove from PREREQ_PM those
>> modules the user didn't already have installed, thus CPAN would only
>> install Bioperl itself and nothing optional. The user could then
>> install Bundle::BioPerl if they wanted a quick way of getting all the
>> optional stuff to work.
>>
>> I'm happy either way; what do other people think?
 >
> From my point of view, removing them from PREREQ_PM means building the
> Bundle::BioPerl a bit of a pain :o(

Can I ask how you're generating Bundle::BioPerl? That is, how did the 
typos get in there? Is there a reliable way to avoid typos in the future?


From n.haigh at sheffield.ac.uk  Mon Oct 23 13:46:17 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 13:46:17 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CB8F2.7070703@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
	<453CB8F2.7070703@sendu.me.uk>
Message-ID: <453CC7A9.6090609@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>
>>> Makefile.PL could be altered again to remove from PREREQ_PM those
>>> modules the user didn't already have installed, thus CPAN would only
>>> install Bioperl itself and nothing optional. The user could then
>>> install Bundle::BioPerl if they wanted a quick way of getting all the
>>> optional stuff to work.
>>>
>>> I'm happy either way; what do other people think?
> >
>> From my point of view, removing them from PREREQ_PM means building the
>> Bundle::BioPerl a bit of a pain :o(
>
> Can I ask how you're generating Bundle::BioPerl? That is, how did the
> typos get in there? Is there a way to certainly avoid typos in the
> future?

I just modified the list by hand a while back :o( - I'm sure there must
be a better way.


From bix at sendu.me.uk  Mon Oct 23 12:58:13 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 13:58:13 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CC7A9.6090609@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
	<453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk>
Message-ID: <453CBC65.2020202@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Nathan S. Haigh wrote:
>>> Sendu Bala wrote:
>>>
>>>> Makefile.PL could be altered again to remove from PREREQ_PM those
>>>> modules the user didn't already have installed, thus CPAN would only
>>>> install Bioperl itself and nothing optional. The user could then
>>>> install Bundle::BioPerl if they wanted a quick way of getting all the
>>>> optional stuff to work.
>>>>
>>>> I'm happy either way; what do other people think?
 >>>
>>> From my point of view, removing them from PREREQ_PM means building the
>>> Bundle::BioPerl a bit of a pain :o(
 >>
>> Can I ask how you're generating Bundle::BioPerl? That is, how did the
>> typos get in there? Is there a way to certainly avoid typos in the
>> future?
> 
> I just modified the list by hand a while back :o( - I'm sure there must
> be a better way.

I'm not sure I understand why removing things from PREREQ_PM would be a 
problem for you then; the %packages hash would remain unchanged (ie. 
have everything) so you have something to refer to when manually editing 
the Bundle.

http://www.cpan.org/misc/cpan-faq.html#How_make_bundle
might be helpful? I didn't really pay too much attention to the advice - 
does it offer a typo-avoiding solution?


From n.haigh at sheffield.ac.uk  Mon Oct 23 14:04:12 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 14:04:12 +0000
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CBC65.2020202@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453C9298.9000900@sendu.me.uk> <453CA5E9.1060406@sheffield.ac.uk>
	<453CB8F2.7070703@sendu.me.uk> <453CC7A9.6090609@sheffield.ac.uk>
	<453CBC65.2020202@sendu.me.uk>
Message-ID: <453CCBDC.6030904@sheffield.ac.uk>


> I'm not sure I understand why removing things from PREREQ_PM would be
> a problem for you then; the %packages hash would remain unchanged (ie.
> have everything) so you have something to refer to when manually
> editing the Bundle.
>
> http://www.cpan.org/misc/cpan-faq.html#How_make_bundle
> might be helpful? I didn't really pay too much attention to the advice
> - does it offer a typo-avoiding solution?

It's helpful in producing the Bundle PPD as all the XML tags are present
in the Bioperl PPD and they simply need to be copied over to a
Bundle-BioPerl PPD file.

Looks like manual editing of the relevant file is required for making a
CPAN bundle. Unfortunately - no typo-avoiding solution. :o(


From dhoworth at mrc-lmb.cam.ac.uk  Mon Oct 23 12:46:29 2006
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Mon, 23 Oct 2006 13:46:29 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CA99D.9060009@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>
	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>
	<453CA99D.9060009@sheffield.ac.uk>
Message-ID: <453CB9A5.2020409@mrc-lmb.cam.ac.uk>

>> That's the user point of view - how does the developer actually tell
>> CPAN that something is a developer release so that normal users don't
>> automatically install it?
> 
> I found this:
> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
> 
> It says that $VERSION should simply be changed from a naked number into
> a single quoted number and this should be recognized by the CPAN indexer.

<http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>

Cheers, Dave


From hlapp at gmx.net  Mon Oct 23 13:40:29 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 23 Oct 2006 09:40:29 -0400
Subject: [Bioperl-l] Bioperl 1.5.2 RC2
In-Reply-To: <453C9DCA.4020802@sendu.me.uk>
References: <4534B156.4090501@sendu.me.uk>	<4534E09C.9030707@genomics.dk>	<4534E207.8030508@sendu.me.uk>	<45350BA6.3040102@genomics.dk>	<4535EBF9.1090706@sendu.me.uk>
	<4536113D.1080307@sheffield.ac.uk>
	<E28AD9EC-3823-46C9-BFB7-887B33A3A4FB@gmx.net>
	<453C9DCA.4020802@sendu.me.uk>
Message-ID: <5C22B9C8-CEF0-457B-8565-793D56389A86@gmx.net>

You would need a lot of information to make that determination (host,  
port, db driver, db name, user, password; i.e., the entire connection  
information, and there is no 'standard').

You might just ask a simple question in Makefile.PL as to whether  
biosql is installed or not, similar to the DB::GFF tests.
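
A rough sketch of what such a question could look like, using the
prompt() function that ExtUtils::MakeMaker exports. The wording of the
question, the default answer and the sub name biosql_is_installed() are
all assumptions made for illustration:

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker qw(prompt);

# Ask the installer whether biosql-schema is set up. prompt() falls back
# to the default ('n' here) when PERL_MM_USE_DEFAULT is set or when there
# is no interactive input, so unattended installs do not hang.
sub biosql_is_installed {
    my $answer = prompt(
        'Have you already installed the biosql-schema in a database? y/n',
        'n',
    );
    return $answer =~ /^y/i ? 1 : 0;
}

# Possible use further down in Makefile.PL:
#   warn "The bioperl-db tests need biosql-schema; expect failures.\n"
#       unless biosql_is_installed();
```

This only records what the user says; it makes no attempt to connect to
a database, for the reasons given above.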

	-hilmar

On Oct 23, 2006, at 6:47 AM, Sendu Bala wrote:

> Hilmar Lapp wrote:
>> On Oct 18, 2006, at 7:34 AM, Nathan Haigh wrote:
>>
>>> For example, I have made no effort to setup biosql-schema but I
>>> thought that maybe there would be a test that would detect this
>>
>> I'm afraid there isn't. Bioperl-db is meaningless without
>> biosql-schema.
>
> Can you suggest a way we might detect if biosql-schema has been
> installed prior to running the test suite, so we can give some
> meaningful error message?

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Mon Oct 23 13:59:23 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 14:59:23 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CB9A5.2020409@mrc-lmb.cam.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>
	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>
Message-ID: <453CCABB.2060308@sendu.me.uk>

Dave Howorth wrote:
>>> That's the user point of view - how does the developer actually tell
>>> CPAN that something is a developer release so that normal users don't
>>> automatically install it?
>> I found this:
>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>
>> It says that $VERSION should simply be changed from a naked number into
>> a single quoted number and this should be recognized by the CPAN indexer.
> 
> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>

Thanks for that.

I guess from that the 1.5.2 version number should be:

$VERSION = 1.05_02

And 1.6 would be

$VERSION = 1.06

But will this cause a problem wrt 1.4? 1.4 has:

$VERSION = 1.4;

Is 1.4 lower than 1.06? Should we keep to a single digit version, so 
1.5_02 and 1.6? Does this really not work with CPAN? Should we call them 
version fifty and version sixty? 1.50_02, 1.60?
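
To answer the first question for myself, a quick sketch of how Perl
compares these as plain numbers. The variable names are only for
illustration; the quoted-string-plus-eval step is the usual idiom for
underscore versions, and the values are the candidates listed above:

```perl
use strict;
use warnings;

my $v14        = 1.4;    # what Bioperl 1.4 declares
my $v16_padded = 1.06;   # two-digit scheme for 1.6
my $v16_single = 1.6;    # single-digit scheme for 1.6

printf "1.4 is %s than 1.06\n", $v14 > $v16_padded ? 'higher' : 'lower';
printf "1.4 is %s than 1.6\n",  $v14 > $v16_single ? 'higher' : 'lower';

# Developer-release form: keep the underscore in a quoted string so it is
# visible in the source, then eval it to get an ordinary number for
# comparisons (underscores are ignored in numeric literals).
my $dev_string = '1.5_02';
my $dev_number = eval $dev_string;

printf "%s sorts %s 1.4 and 1.6\n", $dev_string,
    ($dev_number > $v14 && $dev_number < $v16_single) ? 'between' : 'outside';
```

So as plain numbers 1.4 is higher than 1.06, which means the two-digit
scheme would make 1.6 look older than 1.4, while the single-digit
scheme keeps the ordering.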


From cjfields at uiuc.edu  Mon Oct 23 14:12:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 09:12:16 -0500
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C9298.9000900@sendu.me.uk>
Message-ID: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>

...
> > Right, thanks for that.
> 
> Oh, so this effectively means that our 'optional' dependencies are
> installed for CPAN users, which matches up to my 'force the optional
> ones anyway' desire, leaving Bundle::BioPerl without any use.
> 
> Makefile.PL could be altered again to remove from PREREQ_PM those
> modules the user didn't already have installed, thus CPAN would only
> install Bioperl itself and nothing optional. The user could then install
> Bundle::BioPerl if they wanted a quick way of getting all the optional
> stuff to work.
> 
> I'm happy either way; what do other people think?

I think that we should have it so Bioperl installs as-is (no additional
reqs) and have Bundle::BioPerl used as a convenient way to install all
optional modules for full functionality.  The catch is to make sure that any
optional installations do not crash tests during a CPAN bioperl
installation, otherwise they aren't considered optional by CPAN, and the
install won't work without forcing it.

Frankly, most users will find themselves wanting to install the Bundle
anyway to get full functionality, so we could always 'strongly recommend'
preceding the bioperl installation with a Bundle::BioPerl CPAN installation
to avoid problems, at least for this release. 

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Oct 23 14:23:04 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 09:23:04 -0500
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453CA5E9.1060406@sheffield.ac.uk>
Message-ID: <002101c6f6ae$c14d7860$15327e82@pyrimidine>

...
> >> Right, thanks for that.
> >
> > Oh, so this effectively means that our 'optional' dependencies are
> > installed for CPAN users, which matches up to my 'force the optional
> > ones anyway' desire, leaving Bundle::BioPerl without any use.
> >
> > Makefile.PL could be altered again to remove from PREREQ_PM those
> > modules the user didn't already have installed, thus CPAN would only
> > install Bioperl itself and nothing optional. The user could then
> > install Bundle::BioPerl if they wanted a quick way of getting all the
> > optional stuff to work.
> >
> > I'm happy either way; what do other people think?
> From my point of view, removing them from PREREQ_PM means building the
> Bundle::BioPerl a bit of a pain :o(
> 
> I prefer the way it is currently set up - most people have fast internet
> connections and GB of harddrive space. Other than the reason "why
> install something I won't ever need" I don't see much point maintaining
> Bundle::BioPerl and having "optional" dependencies. I think if there are
> any modules which are not going to be used by the majority of users,
> then this could be used as the rationale for removing them from
> bioperl-core into another package?
> 
> Nath

I think you'll likely find it much easier to maintain a Bundle package
long-term and indicate that it should be installed along with bioperl, than
to have users complain about a particular Bioperl module failing b/c a
particular dependency wasn't installed.  

If we have the Bundle around in CPAN and in PPM for Win32 users, and
indicate in the INSTALL docs and the wiki our preference that it be
installed prior to or along with a Bioperl installation for beginners, we
can mitigate most of those problems.  Nip it in the bud, to quote a Mr.
Barney Fife.

My 2c

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 23 14:29:33 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 09:29:33 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CCABB.2060308@sendu.me.uk>
Message-ID: <002201c6f6af$a91e4200$15327e82@pyrimidine>

> Dave Howorth wrote:
> >>> That's the user point of view - how does the developer actually tell
> >>> CPAN that something is a developer release so that normal users don't
> >>> automatically install it?
> >> I found this:
> >> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
> >>
> >> It says that $VERSION should simply be changed from a naked number into
> >> a single quoted number and this should be recognized by the CPAN indexer.
> >
> > <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
> 
> Thanks for that.
> 
> I guess from that the 1.5.2 version number should be:
> 
> $VERSION = 1.05_02
> 
> And 1.6 would be
> 
> $VERSION = 1.06
> 
> But will this cause a problem wrt 1.4? 1.4 has:
> 
> $VERSION = 1.4;
> 
> Is 1.4 lower than 1.06? Should we keep to a single digit version, so
> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them
> version fifty and version sixty? 1.50_02, 1.60?

Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax?  It would be
much simpler to use that. 

Simon Cozens wrote about this a while back:

http://www.perl.com/pub/a/2000/04/whatsnew.html

...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From bix at sendu.me.uk  Mon Oct 23 14:41:24 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 15:41:24 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <002201c6f6af$a91e4200$15327e82@pyrimidine>
References: <002201c6f6af$a91e4200$15327e82@pyrimidine>
Message-ID: <453CD494.8070905@sendu.me.uk>

Chris Fields wrote:
>> Dave Howorth wrote:
>>>>> That's the user point of view - how does the developer actually tell
>>>>> CPAN that something is a developer release so that normal users don't
>>>>> automatically install it?
>>>> I found this:
>>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>>>
>>>> It says that $VERSION should simply be changed from a naked number into
>>>> a single quoted number and this should be recognized by the CPAN indexer.
>>> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
>>
>> Thanks for that.
>>
>> I guess from that the 1.5.2 version number should be:
>>
>> $VERSION = 1.05_02
>>
>> And 1.6 would be
>>
>> $VERSION = 1.06
>>
>> But will this cause a problem wrt 1.4? 1.4 has:
>>
>> $VERSION = 1.4;
>>
>> Is 1.4 lower than 1.06? Should we keep to a single digit version, so
>> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them
>> version fifty and version sixty? 1.50_02, 1.60?
> 
> Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax?  It would be
> much simpler to use that. 

That does not present us with a way to have 1.5.2 marked as a developer 
release in CPAN.

Also, see the discussion here: 
http://perldoc.perl.org/functions/require.html

Since we require 5.6.1 the backwards-compatibility issues maybe don't apply 
to us, but do these ideas work with modules, or just Perl itself? Is 
CPAN et al. happy with this form of versioning?

/Something/ needs to be done about Bioperl versioning, because the 
current 1.4 or 1.5 is completely inadequate.
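
For what it's worth, here is a small sketch of how an 'x.y.z' module
version would compare against our existing 1.4, assuming the version
module from CPAN is installed. Whether the CPAN indexer applies exactly
the same rules is a separate question this does not answer:

```perl
use strict;
use warnings;
use version 0.77;

my $old     = version->parse('1.4');       # decimal form, as Bioperl 1.4 declares
my $dotted  = version->parse('v1.5.2');    # dotted (x.y.z) form
my $decimal = version->parse('1.005002');  # the same release written as a decimal

print "v1.5.2 as a number: ", $dotted->numify, "\n";
print "1.4 in dotted form: ", $old->normal,    "\n";

printf "v1.5.2 is %s than 1.4\n", $dotted < $old ? 'older' : 'newer';
printf "v1.5.2 and 1.005002 are %s\n",
    $dotted == $decimal ? 'the same version' : 'different versions';
```

Each dotted component counts as three decimal places, so switching to
'x.y.z' after having shipped a decimal 1.4 would make 1.5.2 look older
than 1.4.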


From bix at sendu.me.uk  Mon Oct 23 14:51:25 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 15:51:25 +0100
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>
References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>
Message-ID: <453CD6ED.5050507@sendu.me.uk>

Chris Fields wrote:

[option 1]
>> Oh, so this effectively means that our 'optional' dependencies are 
>> installed for CPAN users, which matches up to my 'force the
>> optional ones anyway' desire, leaving Bundle::BioPerl without any
>> use.

[option 2]
>> Makefile.PL could be altered again to remove from PREREQ_PM those 
>> modules the user didn't already have installed, thus CPAN would
>> only install Bioperl itself and nothing optional. The user could
>> then install Bundle::BioPerl if they wanted a quick way of getting
>> all the optional stuff to work.
>> 
>> I'm happy either way; what do other people think?
> 
> I think that we should have it so Bioperl installs as-is (no
> additional reqs) and have Bundle::BioPerl used as a convenient way to
> install all optional modules for full functionality.

Note we're specifically considering a CPAN install here. If you download
the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::BioPerl is
still needed as a convenience if you want to install the optional
external dependencies.


> The catch is to make sure that any optional installations do not
> crash tests during a CPAN bioperl installation, otherwise they aren't
> considered optional by CPAN, and the install won't work without
> forcing it.

I'm pretty sure this isn't a problem, though it would be nice if someone 
could test it on a clean system: does 'make test' pass all ok with none 
of the optional modules installed?


Anyway, to reiterate the question: Do we care if CPAN users get all the 
optional external dependencies installed for them automatically, or do 
we want to force them to install Bundle?

The current situation is: CPAN users will get all optional external 
dependencies without using Bundle::BioPerl. Manual installers of bioperl 
(from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to 
get full functionality.
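
As an illustration only, [option 2] could look roughly like the sketch
below: start from a list of optional modules and keep only those that
already load on the user's system. The names %optional, installed_only and
Bio::Hypothetical::NotInstalled are made up for this example, and the real
Makefile.PL may be organised quite differently.

```perl
use strict;
use warnings;

# Hypothetical list of optional modules and minimum versions.
# File::Spec is a core module; the second name is a placeholder that
# is assumed not to exist on any system.
my %optional = (
    'File::Spec'                      => 0,
    'Bio::Hypothetical::NotInstalled' => 0,
);

# Return only those entries whose module can already be loaded.
sub installed_only {
    my (%wanted) = @_;
    my %found;
    for my $module (keys %wanted) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        $found{$module} = $wanted{$module} if eval { require $file; 1 };
    }
    return %found;
}

# In a Makefile.PL this hash would be what gets handed over as PREREQ_PM.
my %prereq = installed_only(%optional);
print "PREREQ_PM would contain: @{[ sort keys %prereq ]}\n";
```

With a filter like this, CPAN would have nothing optional left to offer for
installation, and Bundle::BioPerl would remain the quick way of getting
the rest.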


From n.haigh at sheffield.ac.uk  Mon Oct 23 16:30:34 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 16:30:34 +0000
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CCABB.2060308@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>
	<453CCABB.2060308@sendu.me.uk>
Message-ID: <453CEE2A.8000002@sheffield.ac.uk>

Sendu Bala wrote:
> Dave Howorth wrote:
>   
>>>> That's the user point of view - how does the developer actually tell
>>>> CPAN that something is a developer release so that normal users don't
>>>> automatically install it?
>>>>         
>>> I found this:
>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>>
>>> Is says that $VERSION should simply be changed from a naked number into
>>> a single quoted number and this should be recognized by the CPAN indexer.
>>>       
>> <http://search.cpan.org/~nwclark/perl-5.8.8/pod/perlmodstyle.pod#Version_numbering>
>>     
>
> Thanks for that.
>
> I guess from that the 1.5.2 version number should be:
>
> $VERSION = 1.05_02
>
> And 1.6 would be
>
> $VERSION = 1.06
>
> But will this cause a problem wrt 1.4? 1.4 has:
>
> $VERSION = 1.4;
>
> Is 1.4 lower than 1.06? Should we keep to a single digit version, so 
> 1.5_02 and 1.6? Does this really not work with CPAN? Should we call them 
> version fifty and version sixty? 1.50_02, 1.60?
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>   
I believe the link to the documentation above describes a common CPAN
versioning scheme as follows:

1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32

Therefore version 1.5 of Bioperl would be 1.50, and version 1.5.2 would
be better as 1.52. Then, to indicate that the 1.5 series is a developer
release, you append an underscore and at least 2 digits. This results
in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be
1.52_01. The only thing I'm unsure about is when the _01 gets
incremented. I suspect we would probably not increment this number, since
each release would be an increment of the minor release number, e.g.
1.52_01, 1.53_01, 1.54_01 etc.

Although I'm still not sure how this versioning would affect bioperl 1.4
since 1.4 uses a non-standard versioning scheme :o(

As I understand it, the versioning of the Perl releases uses the x.y.z
scheme. But apparently CPAN modules should use the above versioning scheme.
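
A small sketch of the worry about 1.4, assuming versions end up being
compared as plain numbers once the underscore is stripped (how the CPAN
indexer itself compares them is not confirmed here). The helper
numeric_version is hypothetical, written only for this example.

```perl
use strict;
use warnings;

# Version strings that have come up in this thread.
my @candidates = ('1.4', '1.06', '1.5_02', '1.52_01', '1.60');

# Strip underscores and force numeric context.
sub numeric_version {
    my ($string) = @_;
    (my $plain = $string) =~ tr/_//d;
    return $plain + 0;
}

for my $v (@candidates) {
    my $n = numeric_version($v);
    printf "%-8s -> %-7s %s 1.4\n", $v, $n,
        $n > 1.4 ? 'is higher than' : 'is not higher than';
}
```

Under that assumption, 1.06 sorts below the existing 1.4, while the
two-digit forms (1.52_01, 1.60) sort above it.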

Nath


From cjfields at uiuc.edu  Mon Oct 23 15:36:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 10:36:37 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CD6ED.5050507@sendu.me.uk>
Message-ID: <000c01c6f6b9$0781af40$15327e82@pyrimidine>

...
> 
> Note we're specifically considering a CPAN install here. If you download
> the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is
> still needed as a convenience if you want to install the optional
> external dependencies.
> 

Agreed.  I don't think the Bundle is dispensable.  For instance, it's very
easy for us to tell beginners to install Bundle::BioPerl before
installing bioperl itself, as opposed to having them inundate the mailing
list with questions about why some x.pl script didn't work, when the cause
may simply be a missing required module.

> I'm pretty sure this isn't a problem, though it would be nice if someone
> could test it on a clean system: does 'make test' pass all ok with none
> of the optional modules installed?

So far on WinXP everything passes; I ran a clean perl installation a while
ago using nmake and tests passed.

> Anyway, to reiterate the question: Do we care if CPAN users get all the
> optional external dependencies installed for them automatically, or do
> we want to force them to install Bundle?
> 
> The current situation is: CPAN users will get all optional external
> dependencies without using Bundle::BioPerl. Manual installers of bioperl
> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
> get full functionality.

I don't think forcing is necessary, so a CPAN installation shouldn't force
someone to install optional modules.  Graph.pm, for instance, has a few
optional modules, and the tests which use those get skipped and pass, so the
installation proceeds w/o problems.  We could do the same (any tests using
those optional modules display the reason why they are skipped).
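
A minimal sketch of that skip-and-report pattern, assuming Test::More is
used (the Bioperl tests themselves may still be written with Test.pm).
Bio::Hypothetical::Optional is a placeholder name, not a real dependency.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Placeholder for an optional external module; assumed not installed.
my $optional = 'Bio::Hypothetical::Optional';
(my $file = "$optional.pm") =~ s{::}{/}g;
my $have_optional = eval { require $file; 1 };

# Tests that need nothing optional always run.
ok(1, 'core functionality is always tested');

# Tests that need the optional module are skipped, with the reason shown,
# so a missing module does not count as a failure.
SKIP: {
    skip("$optional not installed", 1) unless $have_optional;
    ok($optional->can('new'), 'optional functionality works');
}
```

The harness reports the second test as skipped rather than failed, so the
installation can proceed without forcing.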

I would strongly state in the INSTALL and INSTALL.WIN docs that (new) users
should install Bundle::BioPerl before installing Bioperl core for full
functionality.  If you are an advanced user and know your way around
CPAN/Perl, then you can install the individual optional modules
according to your particular requirements.

Chris


From n.haigh at sheffield.ac.uk  Mon Oct 23 16:38:00 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 16:38:00 +0000
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CD6ED.5050507@sendu.me.uk>
References: <002001c6f6ad$3f68dd90$15327e82@pyrimidine>
	<453CD6ED.5050507@sendu.me.uk>
Message-ID: <453CEFE8.4000704@sheffield.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>
> [option 1]
>   
>>> Oh, so this effectively means that our 'optional' dependencies are 
>>> installed for CPAN users, which matches up to my 'force the
>>> optional ones anyway' desire, leaving Bundle::BioPerl without any
>>> use.
>>>       
>
> [option 2]
>   
>>> Makefile.PL could be altered again to remove from PREREQ_PM those 
>>> modules the user didn't already have installed, thus CPAN would
>>> only install Bioperl itself and nothing optional. The user could
>>> then install Bundle::BioPerl if they wanted a quick way of getting
>>> all the optional stuff to work.
>>>
>>> I'm happy either way; what do other people think?
>>>       
>> I think that we should have it so Bioperl installs as-is (no
>> additional reqs) and have Bundle::BioPerl used as a convenient way to
>> install all optional modules for full functionality.
>>     
>
> Note we're specifically considering a CPAN install here. If you download
> the tar.gz or use cvs, [option 1] doesn't affect you. Bundle::Bioperl is
> still needed as a convenience if you want to install the optional
> external dependencies.
>
>
>   
>> The catch is to make sure that any optional installations do not
>> crash tests during a CPAN bioperl installation, otherwise they aren't
>> considered optional by CPAN, and the install won't work without
>> forcing it.
>>     
>
> I'm pretty sure this isn't a problem, though it would be nice if someone 
> could test it on a clean system: does 'make test' pass all ok with none 
> of the optional modules installed?
>
>   

I could definitely do this on WinXP and *possibly* on a Linux system.

> Anyway, to reiterate the question: Do we care if CPAN users get all the 
> optional external dependencies installed for them automatically, or do 
> we want to force them to install Bundle?
>
>   

I'd prefer any dependencies, whether they are seen as vital to the main
functionality of Bioperl or not, to be actually specified in PREREQ_PM (as
they currently are). A dependency is a dependency - is it not? If a
distinction is to be made based on whether the requiring module is
simply adding additional functionality to Bioperl-core, then shouldn't
it be moved out of core and into another package, as with the run modules,
if we are to have "optional" dependencies?

my 2p
Nath

> The current situation is: CPAN users will get all optional external 
> dependencies without using Bundle::BioPerl. Manual installers of bioperl 
> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to 
> get full functionality.
>   


From cjfields at uiuc.edu  Mon Oct 23 15:39:09 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 10:39:09 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CD494.8070905@sendu.me.uk>
Message-ID: <000d01c6f6b9$62033d80$15327e82@pyrimidine>

...
> That does not present us with a way to have 1.5.2 marked as a developer
> release in CPAN.
> 
> Also, see the discussion here:
> http://perldoc.perl.org/functions/require.html
> 
> Since we require 5.6.1 the backwards-compatible issues maybe don't apply
> to us, but do these ideas work with modules, or just Perl itself? Is
> CPAN et al. happy with this form of versioning?
> 
> /Something/ needs to be done about Bioperl versioning, because the
> current 1.4 or 1.5 is completely inadequate.

I think using 'require Foo x.y.z' is applicable to modules as well.  There
is something in Programming Perl about this, just don't have it on hand...

Not sure about CPAN, so we need to look into it.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Mon Oct 23 15:42:15 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 16:42:15 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CEE2A.8000002@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>
	<453CCABB.2060308@sendu.me.uk> <453CEE2A.8000002@sheffield.ac.uk>
Message-ID: <453CE2D7.5080608@sendu.me.uk>

Nathan S. Haigh wrote:
> I believe the link to the documentation above describes a common CPAN
> versioning scheme as follows:
> 
> 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32
> 
> Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would
> be better as 1.52. Then to indicate that the 1.5 series is a developer
> release, you append the underscore and at least 2 digits. Thus resulting
> in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be
> 1.52_01. The only thing i'm unsure about would be when does the _01 get
> incremented? I suspect we would probably not increment this number since
> each release would be an increment of the minor release number e.g.
> 1.52_01, 1.53_01, 1.54_01 etc.
> 
> Although I'm still not sure how this versioning would affect bioperl 1.4
> since 1.4 uses a non-standard versioning scheme :o(

Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be 
treated higher than 1.4? Anyway, we can cross that bridge when we get 
there, but this seems appropriate now.


Cheers,
Sendu.


From bix at sendu.me.uk  Mon Oct 23 15:59:01 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 23 Oct 2006 16:59:01 +0100
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
References: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
Message-ID: <453CE6C5.6000108@sendu.me.uk>

Chris Fields wrote:
> ...
>> The current situation is: CPAN users will get all optional external
>> dependencies without using Bundle::BioPerl. Manual installers of bioperl
>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
>> get full functionality.
> 
> I don't think forcing is necessary, so a CPAN installation shouldn't force
> someone to install optional modules.  Graph.pm, for instance has a few
> optional modules, and the tests which use those get skipped and pass so the
> installation proceeds w/o problems.  We could do the same (any tests using
> those optional modules display the reason why they are skipped).  

I should clarify and say that that's what happens in Bioperl as well. 
The 'forcing' that I talk about is simply what I assume will happen if 
the user has CPAN set to automatically install dependencies. The user 
could say 'no' to every question regarding the installation of 
dependencies that CPAN discovers and Bioperl would still install fine.

So really the difference between the current situation and, say, the 
situation when 1.5.1 was released, is that the CPAN user doesn't have to 
use Bundle::BioPerl for full functionality anymore, but can still choose 
not to install all the optional external modules.

The difference is the possible default behaviour. Those users that 
auto-install dependencies get all the optional ones, whereas in the past 
they would not have. I have to point out the benefit of this behaviour: 
those people that don't care and just want it to work are more likely to 
get an installation that does just work. People who know what they're 
doing can still do what they want.


Before we decide what to do I guess we need hard confirmation of how 
CPAN will actually behave with the current Makefile.PL. Any ideas how we 
can find out?

It would also be good to have more opinions to break the current tie 
(Nathan is for keeping PREREQ_PM populated, Chris is for having it 
empty, I can go either way)...


From dhoworth at mrc-lmb.cam.ac.uk  Mon Oct 23 15:55:42 2006
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Mon, 23 Oct 2006 16:55:42 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CD494.8070905@sendu.me.uk>
References: <002201c6f6af$a91e4200$15327e82@pyrimidine>
	<453CD494.8070905@sendu.me.uk>
Message-ID: <453CE5FE.9070001@mrc-lmb.cam.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>>> Dave Howorth wrote:
>>>>>> That's the user point of view - how does the developer actually tell
>>>>>> CPAN that something is a developer release so that normal users don't
>>>>>> automatically install it?
>>>>> I found this:
>>>>> http://www.atrixnet.com/PM/So-You-Want-to-Contribute-to-The-CPAN.ppt
>>>>>
>>>>> Is says that $VERSION should simply be changed from a naked number into
>>>>> a single quoted number and this should be recognized by the CPAN
>>> indexer.
>>>> <http://search.cpan.org/~nwclark/perl-
>>> 5.8.8/pod/perlmodstyle.pod#Version_numbering>
>>>
>>> Thanks for that.
>>>
>>> I guess from that the 1.5.2 version number should be:
>>>
>>> $VERSION = 1.05_02

I believe so - the underscore is key. Look at your favourite CPAN
modules and see what they do.

>>> And 1.6 would be
>>>
>>> $VERSION = 1.06
>>>
>>> But will this cause a problem wrt 1.4? 1.4 has:

I think it will cause a problem, yes: 1.4 > 1.06. As a workaround, you
could remove 1.4 from CPAN and require everybody who installs from CPAN
to uninstall it before installing 1.06.

>>> $VERSION = 1.4;
>>>
>>> Is 1.4 lower than 1.06? Should we keep to a single digit version, so
>>> 1.5_02 and 1.6? Does this really not work with CPAN?

I think that would work but see at the end.

>> Should we call them
>>> version fifty and version sixty? 1.50_02, 1.60?

Then you can count 1.50_02, 1.50_03, 1.52, 1.53_01 ... if you wish.

>> Doesn't perl 5.6.1 and up use the 'x.y.z' versioning syntax?  It would be
>> much simpler to use that. 
> 
> That does not present us with a way to have 1.5.2 marked as a developer 
> release in CPAN.
> 
> Also, see the discussion here: 
> http://perldoc.perl.org/functions/require.html
> 
> Since we require 5.6.1 the backwards-compatible issues maybe don't apply 
> to us, but do these ideas work with modules, or just Perl itself? Is 
> CPAN et al. happy with this form of versioning?

I'm not an expert :( It's my understanding that there is an awful lot of
flexibility in Perl module version numbering (as you might expect :)
However, I believe there are some gotchas. So I would recommend (a)
finding an expert and (b) trying an experiment!

> /Something/ needs to be done about Bioperl versioning, because the 
> current 1.4 or 1.5 is completely inadequate.
> 
> 


From n.haigh at sheffield.ac.uk  Mon Oct 23 17:37:13 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 17:37:13 +0000
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CE6C5.6000108@sendu.me.uk>
References: <000c01c6f6b9$0781af40$15327e82@pyrimidine>
	<453CE6C5.6000108@sendu.me.uk>
Message-ID: <453CFDC9.8030107@sheffield.ac.uk>

Sendu Bala wrote:
> Chris Fields wrote:
>   
>> ...
>>     
>>> The current situation is: CPAN users will get all optional external
>>> dependencies without using Bundle::BioPerl. Manual installers of bioperl
>>> (from tar.gz, from cvs etc.) must install Bundle::BioPerl manually to
>>> get full functionality.
>>>       
>> I don't think forcing is necessary, so a CPAN installation shouldn't force
>> someone to install optional modules.  Graph.pm, for instance has a few
>> optional modules, and the tests which use those get skipped and pass so the
>> installation proceeds w/o problems.  We could do the same (any tests using
>> those optional modules display the reason why they are skipped).  
>>     
>
> I should clarify and say that that's what happens in Bioperl as well. 
> The 'forcing' that I talk about is simply what I assume will happen if 
> the user has CPAN set to automatically install dependencies. The user 
> could say 'no' to every question regarding the installation of 
> dependencies that CPAN discovers and Bioperl would still install fine.
>
> So really the difference between the current situation and, say, the 
> situation when 1.5.1 was released, is that the CPAN user doesn't have to 
> use Bundle::BioPerl for full functionality anymore, but can still chose 
> not to install all the optional external modules.
>
>   
--snip--

Obviously, we could maintain a Bundle::BioPerl which includes all
dependencies required for a fully functional Bioperl. I think the whole
idea for a Bundle is to provide a common environment for a particular
package. If, for example, someone chooses not to install the dependencies
through CPAN (in the current setup), they can easily go back and install
Bundle::BioPerl and it would retrieve any missing dependencies for a
fully functional Bioperl-core.

Nath


From n.haigh at sheffield.ac.uk  Mon Oct 23 18:06:16 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 18:06:16 +0000
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CE2D7.5080608@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>
	<453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk>
Message-ID: <453D0498.8050206@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>   
>> I believe the link to the documentation above describes a common CPAN
>> versioning scheme as follows:
>>
>> 1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32
>>
>> Therefore version 1.5 of Bioperl would be: 1.50 and version 1.5.2 would
>> be better as 1.52. Then to indicate that the 1.5 series is a developer
>> release, you append the underscore and at least 2 digits. Thus resulting
>> in the following: Bioperl 1.5 would be 1.50_01 and 1.5.2 would be
>> 1.52_01. The only thing i'm unsure about would be when does the _01 get
>> incremented? I suspect we would probably not increment this number since
>> each release would be an increment of the minor release number e.g.
>> 1.52_01, 1.53_01, 1.54_01 etc.
>>
>> Although I'm still not sure how this versioning would affect bioperl 1.4
>> since 1.4 uses a non-standard versioning scheme :o(
>>     
>
> Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be 
> treated higher than 1.4? Anyway, we can cross that bridge when we get 
> there, but this seems appropriate now.
>
>
> Cheers,
> Sendu.
>   
Just tried the suggested:
perl -MExtUtils::MakeMaker -le 'print MM->parse_version(shift)' bioperl-1-5-2/Bio/Root/Version.pm

To see how it parses the various different version schemes - here are
the results:
1.5       -> 1.5
1.4       -> 1.4
1.60      -> 1.60
1.05_01   -> 1.0501
1.5_01    -> 1.501
1.50_01   -> 1.5001
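
One likely reason the underscores vanish in these results: Perl ignores
underscores inside a bare numeric literal, whereas a quoted string keeps
them. The sketch below only shows that language behaviour; whether the
CPAN indexer needs the quoted form to flag a developer release is what the
documentation linked earlier in the thread suggests, not something
verified here.

```perl
use strict;
use warnings;

my $bare   = 1.50_01;     # numeric literal: the underscore is ignored
my $quoted = '1.50_01';   # string: the underscore is preserved

print "bare:   $bare\n";
print "quoted: $quoted\n";

# Evaluating the quoted form yields a plain number again, which is
# useful for runtime comparisons.
my $numeric = eval $quoted;
print "eval:   $numeric\n";
```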

Nath


From cjfields at uiuc.edu  Mon Oct 23 17:15:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:15:44 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CE6C5.6000108@sendu.me.uk>
Message-ID: <002701c6f6c6$e2622c40$15327e82@pyrimidine>

...
> I should clarify and say that that's what happens in Bioperl as well.
> The 'forcing' that I talk about is simply what I assume will happen if
> the user has CPAN set to automatically install dependencies. The user
> could say 'no' to every question regarding the installation of
> dependencies that CPAN discovers and Bioperl would still install fine.
> 
> So really the difference between the current situation and, say, the
> situation when 1.5.1 was released, is that the CPAN user doesn't have to
> use Bundle::BioPerl for full functionality anymore, but can still chose
> not to install all the optional external modules.
> 
> The difference is the possible default behaviour. Those users that
> auto-install dependencies get all the optional ones, whereas in the past
> they would not have. I have to point out the benefit of this behaviour:
> those people that don't care and just want it to work are more likely to
> get an installation that does just work. People who know what they're
> doing can still do what they want.

OK with me.  Any way we go about it, we have to assume that anyone who set
CPAN to automatically install dependencies would want this behavior.

> Before we decide what to do I guess we need hard confirmation of how
> CPAN will actually behave with the current Makefile.PL. Any ideas how we
> can find out?
> 
> It would also be good to have more options to break the current tie
> (Nathan is for keeping PREREQ_PM populated, Chris is for having it
> empty, I can go either way)...

Frankly I'm for whatever is easiest for the end-user.  I think we should
continue maintaining Bundle::BioPerl b/c of its convenience (easier for us
to say 'install Bundle::BioPerl' as opposed to 'install modules a b c d e f
g...'  ).  I should note that Chris D. maintains Bundle::BioPerl via CPAN
and can easily add/remove modules as needed, so all that would be necessary
prior to a release is to make sure the various modules present in the Bundle
are up-to-date.

The only difficulty would be updating the bundle PPM version for Win32; I
agree with Nathan that it would be nice if it were easier to maintain.  The
PPD file generated using 'nmake ppd' needs modifications, probably b/c it is
still generated as PPM3-compatible rather than PPM4-compatible.

I also think the idea of having the developer releases available via CPAN is
a good one, as long as they are marked as such (which you are taking care of
with versioning changes).  It makes them a little more official, even if
they are interim developer releases.

Chris


From cjfields at uiuc.edu  Mon Oct 23 17:19:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:19:08 -0500
Subject: [Bioperl-l] Bundle::BioPerl and Pre-reqs
In-Reply-To: <453CFDC9.8030107@sheffield.ac.uk>
Message-ID: <002801c6f6c7$5a58ed60$15327e82@pyrimidine>

...
> > So really the difference between the current situation and, say, the
> > situation when 1.5.1 was released, is that the CPAN user doesn't have to
> > use Bundle::BioPerl for full functionality anymore, but can still chose
> > not to install all the optional external modules.
> >
> >
> --snip--
> 
> Obviously, we could maintain a Bundle::BioPerl which includes all
> dependencies required for a fully functional Bioperl. I think the whole
> idea for a Bundle is to provide a common environment for a particular
> package. If for example, someone chooses not to install the dependencies
> through CPAN (in the current setup), that can easily go back and install
> Bundle::BioPerl and it would retrieve any missing dependencies for a
> fully functional Bioperl-core.
> 
> Nath

Succinctly put; I would've spent five paragraphs describing that!  Too much
coffee (from lab meetings...)

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Mon Oct 23 17:26:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 12:26:57 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880610230936t3024173s5d6f21526dac87a4@mail.gmail.com>
Message-ID: <002c01c6f6c8$7163dd20$15327e82@pyrimidine>

Seth, 

Did you try this with a clean, taxonomy-installed database?  There may be
some junk left over from the previous test runs.

I'm looking into it this week; it may not make the developer release but
we'll try to get it in.  BTW, the 03simpleseq.t test failures have to do
with a call to gzip.  I'll look into a workaround for that.  

Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but
introduces others.  One alternative which I found works is cygwin, but
there's a catch: DBD-mysql is hard to install.  If it isn't one thing it's
another...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
  _____  

From: Seth Johnson [mailto:johnson.biotech at gmail.com] 
Sent: Monday, October 23, 2006 11:37 AM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

 
Chris,

There's definite improvement:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed Test     Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/02species.t                 65    2   3.08%  63 65
t/03simpleseq.t    1   256    59  106 179.66%  7-59
t/04swiss.t                   52   14  26.92%  25 27-34 38-42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There's some weirdness going on during the 'swiss.t' test.  It almost seems
to me that expectations of some tests are swapped (27 & 39, 28 & 40, 29 &
41, 31 & 27, 32 & 28, 33 & 29, 39 & 31): 
================================
not ok 25
# Test 25 got: '10097078' (t/04swiss.t at line 79)
#    Expected: '91309150'
ok 26
not ok 27
# Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t
at line 85) 
#    Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
not ok 28
# Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' (t/04swiss.t at line 86)
#    Expected: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' 
not ok 29
# Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
(t/04swiss.t at line 87)
#    Expected: 'Cell 66 (2), 383-394 (1991)'
not ok 30
# Test 30 got: <UNDEF> (t/04swiss.t at line 88) 
#    Expected: '91309150'
not ok 31
# Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t
at line 85 fail #2)
#    Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis, J.E. and Leffers,H.'
not ok 32
# Test 32 got: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' (t/04swiss.t at line 86 fail #2) 
#    Expected: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
not ok 33
# Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail
#2) 
#    Expected: 'Gene 134 (2), 283-287 (1993)'
not ok 34
# Test 34 got: <UNDEF> (t/04swiss.t at line 88 fail #2)
#    Expected: '94085792'
ok 35
ok 36
ok 37
not ok 38
# Test 38 got: <UNDEF> (t/04swiss.t at line 88 fail #3) 
#    Expected: '94253723'
not ok 39
# Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4)
#    Expected: 'Jiang,J., Zhang,Y., Krainer, A.R. and Xu,R.M.'
not ok 40
# Test 40 got: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
(t/04swiss.t at line 86 fail #4)
#    Expected: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' 
not ok 41
# Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail
#4)
#    Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
not ok 42
# Test 42 got: <UNDEF> (t/04swiss.t at line 88 fail #4) 
#    Expected: '99199225'
==============================


On 10/20/06, Chris Fields <cjfields at uiuc.edu> wrote:


Seth,

Did you work out the problem here?  There was a recent CVS update to OBDA
tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
apparently left data from tests in the database, which caused problems with 
repeated test runs.

Chris

> > -----Original Message-----
> > From: bioperl-l-bounces <at> lists.open-bio.org [mailto:bioperl-l-
> > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > Sent: Saturday, September 30, 2006 6:35 PM 
> > To: Hilmar Lapp
> > Cc: Chris Fields; Bioperl List
> > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> >
> > Here're complete test details:
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ...
>
> > FAILED tests 10-12
> >     Failed 3/12 tests, 75.00% okay
> > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> > -------------------------------------------------------------------------------
> > t\02species.t                 65    2   3.08%  63 65
> > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > t\04swiss.t                   52   14  26.92%  25 27-34 38-42 
> > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > t\16obda.t                    12    3  25.00%  10-12
>
>


From johnson.biotech at gmail.com  Mon Oct 23 16:36:36 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 23 Oct 2006 12:36:36 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <000001c6f486$df508930$15327e82@pyrimidine>
References: <000001c6f486$df508930$15327e82@pyrimidine>
Message-ID: <b99962880610230936t3024173s5d6f21526dac87a4@mail.gmail.com>

Chris,

There's definite improvement:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed Test     Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/02species.t                 65    2   3.08%  63 65
t/03simpleseq.t    1   256    59  106 179.66%  7-59
t/04swiss.t                   52   14  26.92%  25 27-34 38-42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There's some weirdness going on during the '04swiss.t' test.  It almost seems
to me that the expectations of some tests are swapped (27 & 39, 28 & 40, 29 &
41, 31 & 27, 32 & 28, 33 & 29, 39 & 31):
================================
not ok 25
# Test 25 got: '10097078' (t/04swiss.t at line 79)
#    Expected: '91309150'
ok 26
not ok 27
# Test 27 got: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.' (t/04swiss.t
at line 85)
#    Expected: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.'
not ok 28
# Test 28 got: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein' (t/04swiss.t at line 86)
#    Expected: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators'
not ok 29
# Test 29 got: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
(t/04swiss.t at line 87)
#    Expected: 'Cell 66 (2), 383-394 (1991)'
not ok 30
# Test 30 got: <UNDEF> (t/04swiss.t at line 88)
#    Expected: '91309150'
not ok 31
# Test 31 got: 'Krainer,A.R., Mayeda,A., Kozak,D. and Binns,G.' (t/04swiss.t
at line 85 fail #2)
#    Expected: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis,J.E. and Leffers,H.'
not ok 32
# Test 32 got: 'Functional expression of cloned human splicing factor SF2:
homology to RNA-binding proteins, U1 70K, and Drosophila splicing
regulators' (t/04swiss.t at line 86 fail #2)
#    Expected: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
not ok 33
# Test 33 got: 'Cell 66 (2), 383-394 (1991)' (t/04swiss.t at line 87 fail
#2)
#    Expected: 'Gene 134 (2), 283-287 (1993)'
not ok 34
# Test 34 got: <UNDEF> (t/04swiss.t at line 88 fail #2)
#    Expected: '94085792'
ok 35
ok 36
ok 37
not ok 38
# Test 38 got: <UNDEF> (t/04swiss.t at line 88 fail #3)
#    Expected: '94253723'
not ok 39
# Test 39 got: 'Honore,B., Madsen,P., Rasmussen,H.H., Vandekerckhove,J.,
Celis,J.E. and Leffers,H.' (t/04swiss.t at line 85 fail #4)
#    Expected: 'Jiang,J., Zhang,Y., Krainer,A.R. and Xu,R.M.'
not ok 40
# Test 40 got: 'Cloning and expression of a cDNA covering the complete
coding region of the P32 subunit of human pre-mRNA splicing factor SF2'
(t/04swiss.t at line 86 fail #4)
#    Expected: 'Crystal structure of human p32, a doughnut-shaped acidic
mitochondrial matrix protein'
not ok 41
# Test 41 got: 'Gene 134 (2), 283-287 (1993)' (t/04swiss.t at line 87 fail
#4)
#    Expected: 'Proc. Natl. Acad. Sci. U.S.A. 96 (7), 3572-3577 (1999)'
not ok 42
# Test 42 got: <UNDEF> (t/04swiss.t at line 88 fail #4)
#    Expected: '99199225'
==============================


On 10/20/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
>
>
> Seth,
>
> Did you work out the problem here?  There was a recent CVS update to OBDA
> tests (16obda.t) that fixed similar problems on Mac OS X.  Old OBDA tests
> apparently left data from tests in the database, which caused problems
> with
> repeated test runs.
>
> Chris
>
> > > -----Original Message-----
> > > From: bioperl-l-bounces <at> lists.open-bio.org [mailto: bioperl-l-
> > > bounces <at> lists.open-bio.org] On Behalf Of Seth Johnson
> > > Sent: Saturday, September 30, 2006 6:35 PM
> > > To: Hilmar Lapp
> > > Cc: Chris Fields; Bioperl List
> > > Subject: Re: [Bioperl-l] Error retrieving sequence from BioSQL
> > >
> > > Here're complete test details:
> > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > ...
> >
> > > FAILED tests 10-12
> > >     Failed 3/12 tests, 75.00% okay
> > > Failed Test     Stat Wstat Total Fail  Failed  List of Failed
> > > -------------------------------------------------------------------------------
> > > t\02species.t                 65    2   3.08%  63 65
> > > t\03simpleseq.t    1   256    59  106 179.66%  7-59
> > > t\04swiss.t                   52   14  26.92%  25 27-34 38-42
> > > t\12ontology.t     2   512   738 1471 199.32%  3-738
> > > t\16obda.t                    12    3  25.00%  10-12
> >
> >
>
>


From n.haigh at sheffield.ac.uk  Mon Oct 23 20:08:00 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 23 Oct 2006 20:08:00 +0000
Subject: [Bioperl-l] CPAN testing Service
Message-ID: <453D2120.9010301@sheffield.ac.uk>

We should also check the CPAN Testing Service (CPANTS) to see how "good"
our package is for CPAN and try to increase its Kwalitee score. For some
reason there appear to be details only for bioperl-1.2.3:
http://cpants.perl.org/dist/bioperl

Nath


From pabloivan at gmail.com  Sun Oct 22 19:54:35 2006
From: pabloivan at gmail.com (Pablo Ivan)
Date: Sun, 22 Oct 2006 16:54:35 -0300
Subject: [Bioperl-l] Bioperl installation under Windows
Message-ID: <a8acf25f0610221254v18c8d172kc3ce6fa4ea34f676@mail.gmail.com>

Hello,

I have been trying to install Bioperl 1.4 on a Windows XP system, but I
didn't get too far. My Perl installation is ActiveState 5.8.8 build 816.
I tried the ppm method of searching the repositories for bioperl and
installing the core package 1.4. It says that the installation was
successful, but the /Bio folder doesn't show up in /lib, and it's as if
nothing new was installed at all. I was wondering if that version of
ActiveState could be causing it, but the uninstall option for it isn't
showing in Add/Remove, and I'm afraid that just deleting the folders and
installing version 5.6 of AS could somehow make things worse. Or should I
just forget about it and try using Cygwin?

Thank you,

Pablo.


From cjfields at uiuc.edu  Mon Oct 23 21:34:47 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 16:34:47 -0500
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <b99962880610231422o24029a0cu229fccc2b5809b85@mail.gmail.com>
Message-ID: <000401c6f6eb$111df040$15327e82@pyrimidine>

Don't know what that particular error is, but it looks ActivePerl-related
(PPM generates HTML from the blib directory).  You may need to run 'nmake
clean' in between test cycles to get rid of the old blib and other files.

 
The carryover issue from old test runs was a definite problem.  Brian fixed
that in the bioperl-db CVS recently.  Also,  I tried Sendu's fixes from CVS
head to Bio::Root::Root and they seem to fix the problems with
Bio::Root::Root.  The issue came down to a use of indirect syntax (a bad
perl practice).  There are other errors popping up related to Bio::Species,
but these seem fixable at least.
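
For anyone unfamiliar with the term, here is a minimal sketch of what
"indirect syntax" means.  The Demo::Thing class is hypothetical and made up
for illustration; it is not the actual Bio::Root::Root code or Sendu's fix.

```perl
use strict;
use warnings;

# Hypothetical class, for illustration only (not taken from BioPerl).
package Demo::Thing;

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name} }, $class;
}

sub name { return $_[0]->{name} }

package main;

# Indirect object syntax would be written:  new Demo::Thing(name => 'x');
# Perl has to guess whether "new" is a function or a method name, and the
# guess can change depending on what else is in scope.

# The arrow form leaves nothing to guess:
my $thing = Demo::Thing->new(name => 'x');
print $thing->name, "\n";
```

The arrow form is the one generally recommended because it parses the same
way regardless of which subroutines happen to be defined nearby.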

 
I committed a few changes to bioperl-db CVS to fix the 03simpleseq.t test
failures caused by the lack of gzip on WinXP (I didn't see them because I had
a copy of GNU gzip in my path).  These should now pass without problems on
WinXP.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
  _____  

From: Seth Johnson [mailto:johnson.biotech at gmail.com] 
Sent: Monday, October 23, 2006 4:22 PM
To: Chris Fields
Cc: bioperl-l
Subject: Re: Error retrieving sequence from BioSQL

 
Chris,

I have not cleaned my test database yet.  I'll purge it and redo the tests. 

This error keeps popping up in unexpected places while running nmake during
installation: 
 "Undefined subroutine &main::UpdateHTML_blib called at -e line 1. 
NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code
'0xff'"

Is there a way around it??

Seth

On 10/23/06, Chris Fields <cjfields at uiuc.edu> wrote:

Seth, 

Did you try this with a clean, taxonomy-installed database?  There may be
some junk left over from the previous test runs.

I'm looking into it this week; it may not make the developer release but
we'll try to get it in.  BTW, the 03simpleseq.t test failures have to do
with a call to gzip.  I'll look into a workaround for that.

Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but
introduces others.  One alternative which I found works is cygwin, but
there's a catch: DBD-mysql is hard to install.  If it isn't one thing it's
another...

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 


From cjfields at uiuc.edu  Mon Oct 23 21:53:27 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 23 Oct 2006 16:53:27 -0500
Subject: [Bioperl-l] Bioperl installation under Windows
In-Reply-To: <a8acf25f0610221254v18c8d172kc3ce6fa4ea34f676@mail.gmail.com>
References: <a8acf25f0610221254v18c8d172kc3ce6fa4ea34f676@mail.gmail.com>
Message-ID: <9994CFF6-FCA1-4C7F-9A33-31765C6AE255@uiuc.edu>

It won't install in Perl\lib, but in Perl\site\lib.  Check there.
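
If it helps, one way to confirm where a module actually ended up is to load
it and look it up in %INC.  This is only a sketch: module_path is a
hypothetical helper name, and Bio::Seq is used simply as an example of a
module that a BioPerl install should provide.

```perl
use strict;
use warnings;

# Hypothetical helper: return the file a module was loaded from,
# or undef if the module cannot be loaded.
sub module_path {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Bio::Seq -> Bio/Seq.pm
    return eval { require $file; 1 } ? $INC{$file} : undef;
}

for my $module (qw(Bio::Seq File::Spec)) {
    my $path = module_path($module);
    printf "%-12s %s\n", $module, defined $path ? $path : 'not found';
}
```

If Bio::Seq is reported as "not found", the PPM install probably did not
put anything into a directory on @INC.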

We are working intently on the next developer release for BioPerl and
plan on having several PPMs available, but we are only supporting
ActivePerl 5.8.8.819.  I would suggest that you upgrade your
ActivePerl installation to that if possible, since PPM has undergone
major changes (they use PPM4 now, which has a GUI by default).  Most
repositories are now moving over to PPM4, so you'll likely be
seeing fewer PPM3-compatible packages being made.

Chris

On Oct 22, 2006, at 2:54 PM, Pablo Ivan wrote:

> Hello,
>
> I have been trying to install Bioperl 1.4 on a Windows XP system,  
> but I
> didn't get too far; my perl installation was made using ActiveState
> 5.8.8build 816. I then tried the ppm method of searching for bioperl
> in the
> repositories and installing the core package 1.4. It says that the
> installation was made successfully, but the /Bio folder doesn't  
> show up in
> /lib, and it's like nothing new was installed at all. I was  
> wondering if
> using that version of ActiveState could be causing it, but the  
> uninstall
> option for it isn't showing in Add/Remove, and I'm afraid just  
> deleting the
> folders and installing version 5.6 of AS could somehow damage and make
> things worse. Or should I just forget about it and try using Cygwin?
>
> Thank you,
>
> Pablo.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnson.biotech at gmail.com  Mon Oct 23 21:22:13 2006
From: johnson.biotech at gmail.com (Seth Johnson)
Date: Mon, 23 Oct 2006 17:22:13 -0400
Subject: [Bioperl-l] Error retrieving sequence from BioSQL
In-Reply-To: <002c01c6f6c8$7163dd20$15327e82@pyrimidine>
References: <b99962880610230936t3024173s5d6f21526dac87a4@mail.gmail.com>
	<002c01c6f6c8$7163dd20$15327e82@pyrimidine>
Message-ID: <b99962880610231422o24029a0cu229fccc2b5809b85@mail.gmail.com>

Chris,

I have not cleaned my test database yet.  I'll purge it and redo the tests.

This error keeps popping up in unexpected places while running nmake during
installation:
 "Undefined subroutine &main::UpdateHTML_blib called at -e line 1.
NMAKE : fatal error U1077: 'C:\WINDOWS\system32\cmd.exe' : return code
'0xff'"

Is there a way around it??

Seth

On 10/23/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
>  Seth,
>
> Did you try this with a clean, taxonomy-installed database?  There may be
> some junk left over from the previous test runs.
>
> I'm looking into it this week; it may not make the developer release but
> we'll try to get it in.  BTW, the 03simpleseq.t test failures have to do
> with a call to gzip.  I'll look into a workaround for that.
>
> Sendu has posted a Bio::Root::Root fix which does get rid of some bugs but
> introduces others.  One alternative which I found works is cygwin, but
> there's a catch: DBD-mysql is hard to install.  If it isn't one thing it's
> another...
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>


-- 
Best Regards,


Seth Johnson
Senior Bioinformatics Associate

Ph: (202) 470-0900
Fx: (775) 251-0358


From chhalling at alumni.ls.berkeley.edu  Tue Oct 24 01:02:24 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Mon, 23 Oct 2006 21:02:24 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453C6509.90005@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk>
Message-ID: <453D6620.5020401@alumni.ls.berkeley.edu>

Sorry, I should know better about giving all the details.

This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a 
fresh compile) with Mac OS X 10.4.8.

-- Conrad

Nathan S. Haigh wrote:
> Chris Fields wrote:
>   
>> Thanks for letting us know!  Did PPM4 throw errors or just silently  
>> pass them over?
>>
>> Chris
>>
>> On Oct 22, 2006, at 6:45 PM, Conrad Halling wrote:
>>
>>   
>>     
> I believe he is talking about the bundle on cpan and not the ppd. I will
> get this updated as soon as possible.
>
> Sendu/Chris - can you confirm to me which Bioperl modules are essential
> to Bioperl and thus should *not* go into Bundle::BioPerl? Is there any
> reason for not putting *all* dependencies into the bundle?
>
> Nath


-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From n.haigh at sheffield.ac.uk  Tue Oct 24 07:05:53 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 24 Oct 2006 08:05:53 +0100
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453D6620.5020401@alumni.ls.berkeley.edu>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>
	<453D6620.5020401@alumni.ls.berkeley.edu>
Message-ID: <453DBB51.6010505@sheffield.ac.uk>

Conrad Halling wrote:
> Sorry, I should know better about giving all the details.
>
> This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 (a 
> fresh compile) with Mac OS X 10.4.8.
>
> -- Conrad
>
>   
My apologies Conrad, this was my mistake! Do you need the corrections
made swiftly, or can you wait until the Bioperl 1.5.2 release, when
I'll ensure the Bundle is updated correctly for that release?

Cheers
Nath


From n.haigh at sheffield.ac.uk  Tue Oct 24 09:57:25 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 10:57:25 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453CE2D7.5080608@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>
	<453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk>
Message-ID: <453DE385.8010700@sheffield.ac.uk>

--snip--
> Ok, I'm going to go ahead and call it 1.52_01 then. Surely 1.60 will be 
> treated higher than 1.4? Anyway, we can cross that bridge when we get 
> there, but this seems appropriate now.
>
>
> Cheers,
> Sendu.
>   
Just been having a think about this versioning. Does this work well, and
is it intuitive, for versioning the official 1.5.2 developer release and
also the 1.6 stable release? I'd like to put forward the following
versioning scheme for consideration (most of it is the same as it is now,
but hopefully with some clarification):
major-version . minor-version sub-version _ developer-release-version
RC-version

The sub-version represents bug fixes and possibly some minor feature
enhancements with no API changes.
The minor-version represents some significant feature enhancements/API
changes/bug fixes.
The major-version represents significant rewrites of Bioperl.

For an RC of a developer release the version would have _0x (where x = the
RC number).
For a non-RC of a developer release the version would have _10.
For an RC of a stable release the version would have _0x (where x = the RC
number).
For a non-RC of a stable release the version would not have the
underscore suffix.

Therefore I would see the following $VERSION being applied:
1.5.2 RC1            = 1.52_01
1.5.2 RC2            = 1.52_02
1.5.2 RC3            = 1.52_03
1.5.2                = 1.52_10
1.6 RC1              = 1.60_01
1.6 RC2              = 1.60_02
1.6                  = 1.60
1.6.1 RC1            = 1.61_01
1.6.1                = 1.61

This should satisfy the requirement of CPAN for having underscores in
versions to indicate a developer release, which here is a Bioperl
release with an odd minor version number or any RC whether it be of a
developer release or a stable release. This should mean that we could
have the RCs on CPAN, but by default, CPAN would only install the
latest "non developer release" (i.e. the last package without an
underscore in the version).
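
To make the proposal concrete, here is a rough sketch of the mapping in
Perl.  The helper names (bioperl_version, is_developer_release) are made up
for illustration, and the sketch assumes single-digit minor and sub-version
numbers, as in the table above.

```perl
use strict;
use warnings;

# Hypothetical helper implementing the proposed scheme.
# $rc is the release-candidate number, or undef for a final release.
sub bioperl_version {
    my ($major, $minor, $sub, $rc) = @_;
    my $version   = sprintf '%d.%d%d', $major, $minor, $sub;
    my $developer = $minor % 2;              # odd minor = developer series
    return sprintf('%s_%02d', $version, $rc) if defined $rc;
    return $developer ? "${version}_10" : $version;
}

# Under this proposal, any version string containing an underscore
# counts as a developer release.
sub is_developer_release { return $_[0] =~ /_/ ? 1 : 0 }

for my $release ([1,5,2,1], [1,5,2,undef], [1,6,0,2], [1,6,0,undef], [1,6,1,undef]) {
    my $v = bioperl_version(@$release);
    printf "%-8s %s\n", $v, is_developer_release($v) ? 'developer' : 'stable';
}
```

Only 1.60 and 1.61 come out without an underscore, so those are the only
ones that would count as stable under the proposal.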

If we are going ahead with the new $VERSION scheme (as it currently is
in HEAD), we should, for the sake of clarity, try to talk about Bioperl
1.52 instead of Bioperl 1.5.2 and make an effort to sync the
documentation accordingly.

Nath


From bix at sendu.me.uk  Tue Oct 24 10:19:05 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 11:19:05 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DE385.8010700@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>
	<453CEE2A.8000002@sheffield.ac.uk> <453CE2D7.5080608@sendu.me.uk>
	<453DE385.8010700@sheffield.ac.uk>
Message-ID: <453DE899.4030603@sendu.me.uk>

Nathan Haigh wrote:
>
> Therefore I would see the following $VERSION being applied:
> 1.5.2 RC1            = 1.52_01
> 1.5.2 RC2            = 1.52_02
> 1.5.2 RC3            = 1.52_03
> 1.5.2                = 1.52_10
> 1.6 RC1              = 1.60_01
> 1.6 RC2              = 1.60_02
> 1.6                  = 1.60
> 1.6.1 RC1            = 1.61_01
> 1.6.1                = 1.61
> 
> This should satisfy the requirement of CPAN for having underscores in
> versions to indicate a developer release, which here is a Bioperl
> release with an odd minor version number or any RC whether it be of a
> developer release or a stable release. This should mean that we could
> have the RC's on CPAN, but by default, CPAN would only install the
> latest "non developer release" (i.e. the last package without an
> underscore in the version).

That all sounds good to me, except I worry about potential confusion if 
people look manually at the things available in CPAN, see 1.60_02 and 
think it is more recent than 1.60 and try to install it manually.

Since
$VERSION = 1.52_10;
is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, 
final release version should be
$VERSION = 1.6010.
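
A small sketch of the evaluation I mean, using only the version numbers
already discussed in this thread (the variable names are just for
illustration):

```perl
use strict;
use warnings;

# In a numeric literal, underscores are only digit separators.
my $dev_final = 1.52_10;     # same number as 1.5210
my $rc2       = 1.60_02;     # same number as 1.6002
my $final     = 1.60;

print "1.52_10 == 1.5210\n"           if $dev_final == 1.5210;
print "1.60_02 > 1.60 as numbers\n"   if $rc2 > $final;
print "1.6010 > 1.60_02 as numbers\n" if 1.6010 > $rc2;

# A quoted string keeps the underscore, which is what anything
# looking at the version as text would see.
my $as_string = '1.60_02';
print "string form still has an underscore\n" if $as_string =~ /_/;
```

So compared purely as numbers, RC2 (1.60_02) sorts after a final release
numbered 1.60, but before one numbered 1.6010.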


> If we are going ahead with the new $VERSION scheme (as it currently is
> in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
> documentation with regards to this.

I might disagree with this though. I think perl people, and perhaps unix 
people in general, should be used to version numbers like '1.5.2', but 
then getting '1.52' from the code since such a number allows simple 
numerical comparisons while the former does not. The former is easier to 
read and understand. This is just how Perl itself behaves.
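
As a quick illustration of how Perl itself does it (a sketch only; the
dotted strings at the end are arbitrary examples, not Bioperl versions):

```perl
use strict;
use warnings;

# $] holds the running perl's version as a plain number,
# e.g. 5.008008 for the release written as 5.8.8.
printf "numeric version: %s\n", $];
print "at least 5.6.0\n" if $] >= 5.006;

# Dotted strings do not compare correctly as text:
# '1.10.0' sorts before '1.9.0'.
print "'1.10.0' lt '1.9.0'\n" if '1.10.0' lt '1.9.0';
```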

Most users who wouldn't expect such a behaviour aren't going to be 
checking the version number programmatically anyway.


BTW, do we have someone with a CPAN account, or should I get one?


From n.haigh at sheffield.ac.uk  Tue Oct 24 11:37:12 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 12:37:12 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DE899.4030603@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk>
Message-ID: <453DFAE8.5050602@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>   
>> Therefore I would see the following $VERSION being applied:
>> 1.5.2 RC1            = 1.52_01
>> 1.5.2 RC2            = 1.52_02
>> 1.5.2 RC3            = 1.52_03
>> 1.5.2                = 1.52_10
>> 1.6 RC1              = 1.60_01
>> 1.6 RC2              = 1.60_02
>> 1.6                  = 1.60
>> 1.6.1 RC1            = 1.61_01
>> 1.6.1                = 1.61
>>
>> This should satisfy the requirement of CPAN for having underscores in
>> versions to indicate a developer release, which here is a Bioperl
>> release with an odd minor version number or any RC whether it be of a
>> developer release or a stable release. This should mean that we could
>> have the RC's on CPAN, but by default, CPAN would only install the
>> latest "non developer release" (i.e. the last package without an
>> underscore in the version).
>>     
>
> That all sounds good to me, except I worry about potential confusion if 
> people look manually at the things available in CPAN, see 1.60_02 and 
> think it is more recent than 1.60 and try to install it manually.
>
>   

I'm not sure this would be a problem. As far as I understand, CPAN
treats packages with underscores in $VERSION as something distinctly
different from the other releases (i.e. as developer releases). If you
look at such a page, it is clearly evident that it is a developer
release. For example, if you search on CPAN for the latest version of
the CPAN module it shows 1.8802. If you go to that page:
http://search.cpan.org/~andk/CPAN-1.8802/
there is also a link to the latest developer release, released 1 day
after 1.8802 with a version of 1.88_57 (which would convert to 1.8857).
This too appears to be later than 1.8802, but since it is dealt with as
a developer release it doesn't seem to matter - CPAN will only install
the stable (non-developer) releases by default, while the developer
releases remain available for anyone who wants them. Although I'm
thinking CPAN uses some hocus pocus with release dates too.

> Since
> $VERSION = 1.52_10;
> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, 
> final release version should be
> $VERSION = 1.6010.
>
>
>   

Because they are dealt with separately, I don't think this is an issue
(see above).

>> If we are going ahead with the new $VERSION scheme (as it currently is
>> in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
>> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
>> documentation with regards to this.
>>     
>
> I might disagree with this though. I think perl people, and perhaps unix 
> people in general, should be used to version numbers like '1.5.2', but 
> then getting '1.52' from the code since such a number allows simple 
> numerical comparisons while the former does not. The former is easier to 
> read and understand. This is just how Perl itself behaves.
>
> Most users who wouldn't expect such a behaviour aren't going to be 
> checking the version number programatically anyway.
>
>
> BTW. do we have someone with a CPAN account, or should I get one?
>   

It says Ewan Birney is the author of Bioperl - I assume it must be
possible to have multiple people have the permissions to update a single
package.

Nath


From chhalling at alumni.ls.berkeley.edu  Tue Oct 24 11:15:12 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Tue, 24 Oct 2006 07:15:12 -0400
Subject: [Bioperl-l] Misspellings in Bundle::BioPerl
In-Reply-To: <453DBB51.6010505@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>
	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>
	<453C6509.90005@sheffield.ac.uk>
	<453D6620.5020401@alumni.ls.berkeley.edu>
	<453DBB51.6010505@sheffield.ac.uk>
Message-ID: <453DF5C0.3040104@alumni.ls.berkeley.edu>

Nathan S. Haigh wrote:
> Conrad Halling wrote:
>> Sorry, I should know better about giving all the details.
>>
>> This was using the Bundle::BioPerl on CPAN, installing for Perl 5.8.8 
>> (a fresh compile) with Mac OS X 10.4.8.
>>
>> -- Conrad  
> My apologies Conrad, this was my bad! Are you in need of the 
> corrections being made swiftly or can you wait until the Bioperl 1.5.2 
> release when I'll ensure the Bundle is updated correctly for that 
> release?
>
> Cheers
> Nath

No, I'm fine. I used the cpan utility to load the three modules manually.

-- Conrad

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From bix at sendu.me.uk  Tue Oct 24 12:16:54 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 13:16:54 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453DFAE8.5050602@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
Message-ID: <453E0436.3050903@sendu.me.uk>

Nathan Haigh wrote:
> Sendu Bala wrote:
>
>> That all sounds good to me, except I worry about potential confusion if 
>> people look manually at the things available in CPAN, see 1.60_02 and 
>> think it is more recent than 1.60 and try to install it manually.
> 
> I not sure if this would be a problem. As far as I understand, CPAN
> treats these packages with underscores in $VERSION as something
> distinctly different to the others releases (i.e. developer releases).
> If you look at such a page, it is clearly evident that it is a
> developers release. For example, if you search on CPAN for the latest
> version of the CPAN module is shows 1.8802. if you go to that page:
> http://search.cpan.org/~andk/CPAN-1.8802/
> There is also a link for the latest developer release, released 1 day
> after 1.8802 with a version of 1.88_57 (which would convert to 1.8857).

[snip]

>> Since
>> $VERSION = 1.52_10;
>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release, 
>> final release version should be
>> $VERSION = 1.6010.
>
> Because they are dealt with separately, I don't think this is an issue
> (see above).

If you don't notice the dates, or are doing numerical version number 
comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may 
not be automatic, but you can still choose to download the developer 
releases. Which means if we say to someone 'use Bioperl 1.6 or better' 
they may choose to get the latest version and think it is 1.6002 when 
in fact 1.60 was the more recent version. 1.6010 solves the problem, is 
consistent with your 1.52_10 suggestion, and doesn't cause any problems 
as far as I can see.
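
To make the ordering concrete, here is a minimal sketch in plain Perl that
compares the numbers discussed above numerically. The variable names are
illustrative only and are not taken from any Bioperl module.

```perl
use strict;
use warnings;

# Version numbers from this thread, written as plain numbers.
my $rc2         = 1.6002;   # 1.60_02, i.e. 1.6 RC2
my $plain_final = 1.60;     # final release numbered without a suffix
my $final_10    = 1.6010;   # final release with '10' appended

# The RC compares as greater than a bare 1.60 final release ...
print "RC2 > 1.60\n"   if $rc2 > $plain_final;

# ... but as less than a final release numbered 1.6010.
print "1.6010 > RC2\n" if $final_10 > $rc2;
```

Both lines are printed, which is the reason for appending 10 to the final
release number.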


>>> If we are going ahead with the new $VERSION scheme (as it currently is
>>> in HEAD), we should, for the sake of clarity,  try to talk about Bioperl
>>> 1.52 instead of Bioperl 1.5.2 and make an effort to sync the
>>> documentation with regards to this.
>>>     
>> I might disagree with this though. I think perl people, and perhaps unix 
>> people in general, should be used to version numbers like '1.5.2', but 
>> then getting '1.52' from the code since such a number allows simple 
>> numerical comparisons while the former does not. The former is easier to 
>> read and understand. This is just how Perl itself behaves.
>>
>> Most users who wouldn't expect such a behaviour aren't going to be 
>> checking the version number programatically anyway.
>>
>>
>> BTW. do we have someone with a CPAN account, or should I get one?
>>   
> 
> It says Ewan Birney is the author of Bioperl - I assume it must be
> possible to have multiple people have the permissions to update a single
> package.

How did you get Bundle::BioPerl updated? Did you just ask Chris 
Dagdigian to do it for you? Or do you have access to his account? I'll 
ask Ewan about it.


From n.haigh at sheffield.ac.uk  Tue Oct 24 12:21:56 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 13:21:56 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0436.3050903@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
	<453E0436.3050903@sendu.me.uk>
Message-ID: <453E0564.9030302@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> Sendu Bala wrote:
>>
>>> That all sounds good to me, except I worry about potential confusion
>>> if people look manually at the things available in CPAN, see 1.60_02
>>> and think it is more recent than 1.60 and try to install it manually.
>>
>> I not sure if this would be a problem. As far as I understand, CPAN
>> treats these packages with underscores in $VERSION as something
>> distinctly different to the others releases (i.e. developer releases).
>> If you look at such a page, it is clearly evident that it is a
>> developers release. For example, if you search on CPAN for the latest
>> version of the CPAN module is shows 1.8802. if you go to that page:
>> http://search.cpan.org/~andk/CPAN-1.8802/
>> There is also a link for the latest developer release, released 1 day
>> after 1.8802 with a version of 1.88_57 (which would convert to 1.8857).
>
> [snip]
>
>>> Since
>>> $VERSION = 1.52_10;
>>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before
>>> release, final release version should be
>>> $VERSION = 1.6010.
>>
>> Because they are dealt with separately, I don't think this is an issue
>> (see above).
>
> If you don't notice the dates, or are doing numerical version number
> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
> not be automatic, but you can still chose to download the developer
> releases. Which means if we say to someone 'use Bioperl 1.6 or better'
> they may choose to get the latest version and think it is 1.6002 when
> infact 1.60 was the more recent version. 1.6010 solves the problem, is
> consistent with your 1.50_10 suggestion, and doesn't cause any
> problems as far as I can see.
>
>

I see - you mean for a non-RC release append 10 to the version number
and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
the version.

--snip--
>
> How did you get Bundle::BioPerl updated? Did you just ask Chris
> Dagdigian to do it for you? Or do you have access to his account? I'll
> ask Ewan about it.
I just asked Chris D. to do it for me :o)

Nath


From bix at sendu.me.uk  Tue Oct 24 13:01:22 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 14:01:22 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0564.9030302@sheffield.ac.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
	<453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk>
Message-ID: <453E0EA2.6050306@sendu.me.uk>

Nathan Haigh wrote:
> I see - you mean for a non-RC release append 10 to the version number
> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
> the version.

Precisely.

1.5.2 RC3 will have in Bio::Root::Version :

$VERSION = 1.52_03;
$VERSION = eval $VERSION; # $VERSION is 1.5203

1.5.2 final release would have:

$VERSION = 1.52_10;
$VERSION = eval $VERSION; # $VERSION is 1.5210

1.6.0 RC1 would have:

$VERSION = 1.60_01;
$VERSION = eval $VERSION; # $VERSION is 1.6001

1.6.0 final release would have:

$VERSION = 1.6010;


Nice thing about putting RCs up on CPAN is that I suppose we'd see the 
test results from cpantesters. The more test results the better :)
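
For reference, a minimal sketch of how the underscore form turns into a
plain number. It assumes the quoted-string variant of the idiom (the
snippets above use an unquoted literal, in which Perl already ignores the
underscore at compile time); the variable names are illustrative only.

```perl
use strict;
use warnings;

# Quoting the version keeps the underscore visible in the source text.
my $declared = '1.52_03';

# eval parses the string as a Perl numeric literal; underscores between
# digits are ignored, so the result is a plain number for comparisons.
my $numeric = eval $declared;

print "declared: $declared\n";
print "numeric:  $numeric\n";
print "RC3 sorts before the final release\n" if $numeric < 1.5210;
```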


From n.haigh at sheffield.ac.uk  Tue Oct 24 13:05:54 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 24 Oct 2006 14:05:54 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0EA2.6050306@sendu.me.uk>
References: <453C029D.1070708@alumni.ls.berkeley.edu>	<4D7454F6-3362-4373-BCDD-C4DF8A73C1CC@uiuc.edu>	<453C6509.90005@sheffield.ac.uk>	<453C66BF.1060008@sendu.me.uk>	<453C7648.8030004@sheffield.ac.uk>	<453C7D80.80207@sendu.me.uk>	<453C94C8.5040900@sheffield.ac.uk>	<453C8E60.7000105@sendu.me.uk>	<453CA99D.9060009@sheffield.ac.uk>	<453CB9A5.2020409@mrc-lmb.cam.ac.uk>	<453CCABB.2060308@sendu.me.uk>	<453CEE2A.8000002@sheffield.ac.uk>
	<453CE2D7.5080608@sendu.me.uk>	<453DE385.8010700@sheffield.ac.uk>
	<453DE899.4030603@sendu.me.uk> <453DFAE8.5050602@sheffield.ac.uk>
	<453E0436.3050903@sendu.me.uk> <453E0564.9030302@sheffield.ac.uk>
	<453E0EA2.6050306@sendu.me.uk>
Message-ID: <453E0FB2.4080002@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> I see - you mean for a non-RC release append 10 to the version number
>> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
>> the version.
>
> Precisely.
>
> 1.5.2 RC3 will have in Bio::Root::Version :
>
> $VERSION = 1.52_03;
> $VERSION = eval $VERSION; # $VERSION is 1.5203
>
> 1.5.2 final release would have:
>
> $VERSION = 1.52_10;
> $VERSION = eval $VERSION; # $VERSION is 1.5210
>
> 1.6.0 RC1 would have:
>
> $VERSION = 1.60_01;
> $VERSION = eval $VERSION; # $VERSION is 1.6001
>
> 1.6.0 final release would have:
>
> $VERSION = 1.6010;
>
>
> Nice thing about putting RCs up on CPAN is that I suppose we'd see the
> test results from cpantesters. The more test results the better :)
Did you see the cpants site I sent earlier:
http://cpants.perl.org/dist/bioperl

But I'm not sure why 1.4 didn't make it in there instead of 1.2.3


From bix at sendu.me.uk  Tue Oct 24 13:14:08 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 14:14:08 +0100
Subject: [Bioperl-l] CPAN testing Service
In-Reply-To: <453D2120.9010301@sheffield.ac.uk>
References: <453D2120.9010301@sheffield.ac.uk>
Message-ID: <453E11A0.20304@sendu.me.uk>

Nathan S. Haigh wrote:
> We should also check the CPAN testing service (CPANTS) to see how "good"
> our package is for CPAN and try to increase the Kwalitee score. There
> only appears to be details for bioperl-1.2.3 for some reason:
> http://cpants.perl.org/dist/bioperl

Yes, but I think it will be a pretty similar score this time round. We'll 
resolve the remaining issues for 1.6.


From cjfields at uiuc.edu  Tue Oct 24 14:24:44 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Oct 2006 09:24:44 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0436.3050903@sendu.me.uk>
Message-ID: <000501c6f778$279cee10$15327e82@pyrimidine>

...
> >> Since
> >> $VERSION = 1.52_10;
> >> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release,
> >> final release version should be
> >> $VERSION = 1.6010.
> >
> > Because they are dealt with separately, I don't think this is an issue
> > (see above).
> 
> If you don't notice the dates, or are doing numerical version number
> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
> not be automatic, but you can still chose to download the developer
> releases. Which means if we say to someone 'use Bioperl 1.6 or better'
> they may choose to get the latest version and think it is 1.6002 when
> infact 1.60 was the more recent version. 1.6010 solves the problem, is
> consistent with your 1.50_10 suggestion, and doesn't cause any problems
> as far as I can see.

CPAN looks like it can handle 'x.y.z', at least for Pugs:

http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/

From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':

our $VERSION = 6.002013;

That's also a very perlish way to do it.  And there are no developer
versions of Pugs, since it is always under active development.  We could try
something like:

our $VERSION = 1.005002_01;

just to tag it as a developer release or release candidate, if that's what
you want; I'm neutral to that point.  I don't think it's necessary to post
every RC to CPAN, though, unless you feel very strongly about it.  It just
seems like more hassle than it's worth, esp. since you've been releasing
about one per week leading up to a final 1.5.2 (due soon).  
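
As an illustration of the Pugs-style numbering, here is a small sketch that
maps 'major.minor.patch' onto a decimal with three digits per component.
The helper name dotted_to_decimal is hypothetical and not part of Bioperl or
Pugs; it only shows the arithmetic.

```perl
use strict;
use warnings;

# Hypothetical helper: turn 'major.minor.patch' into the decimal form
# with three digits per component, e.g. '6.2.13' -> '6.002013'.
sub dotted_to_decimal {
    my ($dotted) = @_;
    my ($major, $minor, $patch) = split /\./, $dotted;
    $patch = 0 unless defined $patch;   # allow plain 'major.minor'
    return sprintf '%d.%03d%03d', $major, $minor, $patch;
}

print dotted_to_decimal('6.2.13'), "\n";   # 6.002013
print dotted_to_decimal('1.5.2'),  "\n";   # 1.005002
```

A release candidate could then be marked by appending an underscore suffix
to the resulting string, as in the 1.005002_01 example above.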

> >> I might disagree with this though. I think perl people, and perhaps
> unix
> >> people in general, should be used to version numbers like '1.5.2', but
> >> then getting '1.52' from the code since such a number allows simple
> >> numerical comparisons while the former does not. The former is easier
> to
> >> read and understand. This is just how Perl itself behaves.
> >>
> >> Most users who wouldn't expect such a behaviour aren't going to be
> >> checking the version number programatically anyway.
> >>
> >>
> >> BTW. do we have someone with a CPAN account, or should I get one?
> >>
> >
> > It says Ewan Birney is the author of Bioperl - I assume it must be
> > possible to have multiple people have the permissions to update a single
> > package.

As a quick response to the above, I would read 'rel. 1.5.2' as the second
patched release of the second revision (here in a developer cycle) of the
first major release.  I would read 'rel 1.52' as the 52nd release of the
major release (just can't quite make it to version 2, I guess).  I don't
think we can use the latter as it is just too confusing, especially since
we've adopted the 'major.minor.patch' versioning quite early on.  

As for CPAN, I believe there is usually a person or group responsible for
maintaining each distribution.  As Ewan seems to be the point man, you'll
have to ask him.  I suppose it is possible to add more if needed.

> How did you get Bundle::BioPerl updated? Did you just ask Chris
> Dagdigian to do it for you? Or do you have access to his account? I'll
> ask Ewan about it.

When I inquired about XML::Simple, I emailed Chris D. via his contact
information from CPAN.  He let me know that adding it would be pretty easy,
so all you need to do is let him know about any errors/additions/deletions.
I think his wiki page also has some contact info.  

Which reminds me, if anyone contacts him, could you make sure that
XML::Simple is added?  I can't remember if it has been.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 24 14:29:11 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Oct 2006 09:29:11 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E0FB2.4080002@sheffield.ac.uk>
Message-ID: <000601c6f778$c639f0e0$15327e82@pyrimidine>

> Sendu Bala wrote:
> > Nathan Haigh wrote:
> >> I see - you mean for a non-RC release append 10 to the version number
> >> and if it's a developer release i.e. 1.5, 1.7 series, you append _10 to
> >> the version.
> >
> > Precisely.
> >
> > 1.5.2 RC3 will have in Bio::Root::Version :
> >
> > $VERSION = 1.52_03;
> > $VERSION = eval $VERSION; # $VERSION is 1.5203
> >
> > 1.5.2 final release would have:
> >
> > $VERSION = 1.52_10;
> > $VERSION = eval $VERSION; # $VERSION is 1.5210
> >
> > 1.6.0 RC1 would have:
> >
> > $VERSION = 1.60_01;
> > $VERSION = eval $VERSION; # $VERSION is 1.6001
> >
> > 1.6.0 final release would have:
> >
> > $VERSION = 1.6010;
> >
> >
> > Nice thing about putting RCs up on CPAN is that I suppose we'd see the
> > test results from cpantesters. The more test results the better :)
> Did you see the cpants site I sent earlier:
> http://cpants.perl.org/dist/bioperl
> 
> But I'm not sure why 1.4 didn't make it in there instead of 1.2.3

Yes, odd.  Another thing to note is that CPAN also lists two bugs related to
bioperl 1.4.  We may need to have some way of either redirecting users from
there to bugzilla, or routinely checking the CPAN site.  Otherwise we'll
miss those. 

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From JK at novozymes.com  Tue Oct 24 14:45:26 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 16:45:26 +0200
Subject: [Bioperl-l] Keeping references around in the objects?
Message-ID: <934F95E71B6C9347A873C42AE3C196191299E011@NZT0004E.dknz.nzcorp.net>

Hi All. 

When getting a Bio::Seq object back from a feature it would be really 
nice to have access to the old objects through the new object, as in:

$featseq->feature()->parent_seq();

Would it be possible to keep the references around so that (as an
example) the global information can be accessed through the particular
feature?

Most of the annotation in the general header of an EMBL/Genbank record
also applies to the specific features. 

Jesper


From JK at novozymes.com  Tue Oct 24 14:28:22 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 16:28:22 +0200
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
Message-ID: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>

Hi. 

We're trying to "extend" bioperl in our own setup. We have some functions
that we'd like to "always" have available on a Bio::Seq object. As an
example, I'd like to have the sequence digest available via ->digest,
which just returns a hex-encoded message digest of the sequence in the
object. This is really convenient when trying to figure out whether we've
got some computations stored in the cache for this particular sequence.

Another example is that we have some fields we want to be mandatory in
the objects, so adding additional checks in the constructor is necessary.

Our approach has been to "subclass" Bio::Seq in a new class (Nz::Seq)
and add the functionality there. This generally works fine (->translate()
calls ->can_call_new() and instantiates the correct subclassed object).

But the logic fails when the ->seq of a feature just instantiates a
Bio::PrimarySeq without trying to get the subclassed object.

So the question basically is:
What is the preferred way of extending/subclassing Bioperl objects
with our own methods?

Jesper


From bix at sendu.me.uk  Tue Oct 24 15:26:19 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 16:26:19 +0100
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <000501c6f778$279cee10$15327e82@pyrimidine>
References: <000501c6f778$279cee10$15327e82@pyrimidine>
Message-ID: <453E309B.9090007@sendu.me.uk>

Chris Fields wrote:
> ...
>>>> Since
>>>> $VERSION = 1.52_10;
>>>> is evaluated to 1.5210, by analogy if 1.60_02 was RC2 before release,
>>>> final release version should be
>>>> $VERSION = 1.6010.
>>> Because they are dealt with separately, I don't think this is an issue
>>> (see above).
>> If you don't notice the dates, or are doing numerical version number
>> comparisons, 1.6002 (an RC) is greater than 1.60 (the release). It may
>> not be automatic, but you can still chose to download the developer
>> releases. Which means if we say to someone 'use Bioperl 1.6 or better'
>> they may choose to get the latest version and think it is 1.6002 when
>> infact 1.60 was the more recent version. 1.6010 solves the problem, is
>> consistent with your 1.50_10 suggestion, and doesn't cause any problems
>> as far as I can see.
> 
> CPAN looks like it can handle 'x.y.z', at least for Pugs:
> 
> http://search.cpan.org/~audreyt/Perl6-Pugs-6.2.13/

'handle'? I think it shows up as '6.2.13' simply because it was uploaded 
with the filename Perl6-Pugs-6.2.13.tar.gz


As you point out, the code has the kind of $VERSION number we've been 
suggesting in this thread:

> From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':
> 
> our $VERSION = 6.002013;
> 
> That's also a very perlish-way to do it.  And there are no developer
> versions of Pugs, since it is always under active development.  We could try
> something like:
> 
> our $VERSION = 1.005002_01;

Yes, this was already like one of my suggestions (1.0502_01), but I 
brought up the concern that 1.05 might be < 1.4.

So then we have a question: do we try and fumble a 1.4 compatible number 
by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if 
it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no 
room for RC numbering, or 1.006000010 (1.6.0.10) - the first final 
release following some 1.006000_001 (1.6.0.01 == rc1) RCs?
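
The compatibility concern can be checked with a few numeric comparisons.
This is only a sketch of the arithmetic; the variable names are made up for
illustration.

```perl
use strict;
use warnings;

my $release_1_4  = 1.4;        # the release already on CPAN
my $three_digit  = 1.005002;   # 1.5.2 written with three digits per part
my $two_digit_10 = 1.5210;     # 1.5.2 final written as 1.52_10

# The three-digit form compares as older than the existing 1.4 release.
print "1.005002 sorts before 1.4\n" if $three_digit  < $release_1_4;

# The two-digit form with the '10' suffix still compares as newer.
print "1.5210 sorts after 1.4\n"    if $two_digit_10 > $release_1_4;
```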


> just to tag it as a developer release or release candidate, if that's what
> you want; I'm neutral to that point.  I don't think it's necessary to post
> every RC to CPAN, though, unless you feel very strongly about it.  It just
> seems like more hassle than it's worth, esp. since you've been releasing
> about one per week leading up to a final 1.5.2 (due soon).  

I don't think it would be a hassle; on the contrary it would be very 
useful to know the CPAN distribution actually works. I'm very happy with 
the idea that a release candidate gets fully tested...


From bix at sendu.me.uk  Tue Oct 24 15:39:16 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 24 Oct 2006 16:39:16 +0100
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
Message-ID: <453E33A4.5060004@sendu.me.uk>

JK (Jesper Agerbo Krogh) wrote:
> Hi. 
> 
> We're trying to "extend" bioperl in our own setup. We have some funtions
> that we'd like to "allways" have available on a Bio::Seq-object.
[snip]
> So the question basically is: 
> What is the preferred way of extending/subclassing Bio-perl -objects
> with our own methods? 

http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit


From hlapp at gmx.net  Tue Oct 24 16:24:09 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 24 Oct 2006 12:24:09 -0400
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
In-Reply-To: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.net>
Message-ID: <F563FB85-D711-481A-A038-EDCA4B941C3C@gmx.net>

I think you've generally taken the right path, but see below.

First off, object factories are used extensively already but not yet  
in each and every place where Bioperl creates an object internally.  
Achieving your goal may entail fixes to Bioperl to use a factory  
instead of a hard-coded module name. Also be on the lookout for  
factory() or seq_factory() methods for classes whose work entails  
creating sequence objects and that already give you control over the  
type to be created.

The problem that hits you here though isn't one of determining the  
type of the object to be created, because the respective method  
doesn't create a sequence object. It only returns the sequence object  
that the feature has a reference to.

The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your  
extension of the latter is that the Perl garbage collector can't deal  
with circular references. The way we've circumvented the problem with  
sequence objects (which hold references to their feature objects) and  
feature objects (which need to hold a reference to their sequence  
object) is to make Bio::Seq a wrapper around Bio::PrimarySeq (i.e.,  
Bio::Seq implements Bio::PrimarySeqI by delegating all the  
Bio::PrimarySeqI methods to an instance of Bio::PrimarySeq, and then  
adds implementations of the Bio::SeqI methods), and then make feature  
objects only hold a reference to the 'base' Bio::PrimarySeq instance.  
This works because Bio::PrimarySeq doesn't hold features, only  
Bio::SeqI objects do.
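
The wrapper arrangement can be sketched with toy classes. The Toy::* names
and their methods below are stand-ins chosen for illustration; they are not
the Bioperl classes and leave out almost everything those classes do.

```perl
use strict;
use warnings;

# Toy stand-ins (hypothetical names, not the Bioperl classes).
package Toy::PrimarySeq;
sub new { my ($class, %args) = @_; return bless { seq => $args{seq} }, $class }
sub seq { return $_[0]{seq} }

package Toy::Feature;
sub new { my ($class) = @_; return bless {}, $class }
# The feature keeps a reference to the base sequence only.
sub attach_seq { my ($self, $primary) = @_; $self->{primary} = $primary }
sub entire_seq { return $_[0]{primary} }

package Toy::Seq;
sub new {
    my ($class, %args) = @_;
    my $primary = Toy::PrimarySeq->new(seq => $args{seq});
    return bless { primary => $primary, features => [] }, $class;
}
# Delegate the basic sequence method to the wrapped object.
sub seq { return $_[0]{primary}->seq }
sub add_feature {
    my ($self, $feat) = @_;
    $feat->attach_seq($self->{primary});   # not $self, so no cycle
    push @{ $self->{features} }, $feat;
}

package main;

my $seq  = Toy::Seq->new(seq => 'ATGC');
my $feat = Toy::Feature->new;
$seq->add_feature($feat);

print ref($feat->entire_seq), "\n";   # Toy::PrimarySeq
print $feat->entire_seq->seq, "\n";   # ATGC
```

The wrapper points at the feature and at the base sequence, while the
feature points only at the base sequence, so nothing points back at the
wrapper.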

Having said all that, note that if all you want to do is define  
computations on Bio::Seq objects, as opposed to storing values for  
additional attributes, the best design approach is not to extend the  
class but to create a class with those computations as static methods  
(which would accept the seq object on which to compute as an argument;  
e.g., print $seqComputations->message_digest($seq)).
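
A minimal sketch of such a class follows. Toy::SeqComputations and Toy::Seq
are hypothetical names, MD5 is an assumed choice of digest, and the only
thing required of the sequence object is a seq() method.

```perl
use strict;
use warnings;

# Hypothetical utility class: computations live here, not in a subclass.
package Toy::SeqComputations;
use Digest::MD5 qw(md5_hex);

# Class method: accepts any object that provides a seq() method and
# returns a hex-encoded digest of its sequence string.
sub message_digest {
    my ($class, $seq_obj) = @_;
    return md5_hex($seq_obj->seq);
}

# Minimal stand-in for a sequence object (not Bio::Seq).
package Toy::Seq;
sub new { my ($class, $s) = @_; return bless { seq => $s }, $class }
sub seq { return $_[0]{seq} }

package main;

my $seqComputations = 'Toy::SeqComputations';
my $seq = Toy::Seq->new('ATGC');
print $seqComputations->message_digest($seq), "\n";
```

Because the method only calls seq() on its argument, it works equally on a
Bio::Seq, a Bio::PrimarySeq, or any subclass of either.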

	-hilmar


On Oct 24, 2006, at 10:28 AM, JK ((Jesper Agerbo Krogh)) wrote:

> Hi.
>
> We're trying to "extend" bioperl in our own setup. We have some  
> funtions
>
> that we'd like to "allways" have available on a Bio::Seq-object. As an
> example,
> I'd like to have the sequence-digest available on ->digest that just
> returns
> A hex-encoded message-digest of the sequence in the object. This is
> really comfortable
> when trying to figure out wether we've got some computations stored in
> the cache
> for this particular sequence.
>
> Another example is that we have some fields we want to be mandatory in
> the objects,
> thus adding additional checks in the constructor is nessesary.
>
> Our approach has been to "subclass" Bio::Seq in a new object:  
> (Nz::Seq)
> and add
> the functionality there. This generally works fine (->translate()  
> calls
> ->can_call_new()
> and instantiates the correct subclassed object.
>
> But the logic fails when the ->seq of a feature just instantiates a
> Bio::PrimarySeq
> without trying to get the subclassed object.
>
> So the question basically is:
> What is the preferred way of extending/subclassing Bio-perl -objects
> with
> our own methods?
>
> Jesper
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct 24 16:45:25 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 24 Oct 2006 11:45:25 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <453E309B.9090007@sendu.me.uk>
Message-ID: <000001c6f78b$d1c65a30$15327e82@pyrimidine>

...
> 
> 'handle'? I think it shows up as '6.2.13' simply because it was uploaded
> with the filename Perl6-Pugs-6.2.13.tar.gz

Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is
'6.002013'.  So maybe we should follow a similar convention.  Seems easier
and less confusing to me, at least.
 
> As you point out, the code has the kind of $VERSION number we've been
> suggesting in this thread:
> 
> > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':
> >
> > our $VERSION = 6.002013;
> >
> > That's also a very perlish-way to do it.  And there are no developer
> > versions of Pugs, since it is always under active development.  We could
> try
> > something like:
> >
> > our $VERSION = 1.005002_01;
> 
> Yes, this was already like one of my suggestions (1.0502_01), but I
> brought up the concern that 1.05 might be < 1.4.
> 
> So then we have a question: do we try and fumble a 1.4 compatible number
> by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if
> it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no
> room for RC numbering, or 1.006000010 (1.6.0.10) - the first final
> release following some 1.006000_001 (1.6.0.01 == rc1) RCs?

I would go for the clean break if it follows perl/CPAN convention.
'1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing.

If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6
RC1, 1.6 RC2 etc then that would be consistent and perl-compatible. 

BTW, the reason I looked at Pugs was to see what some of the Perl6
developers were using.  Who knows; they'll probably change it!

...

> I don't think it would be a hassle; on the contrary it would be very
> useful to know the CPAN distribution actually works. I'm very happy with
> the idea that a release candidate gets fully tested...

So you obviously feel strongly about it!  ;> 

I don't have a problem as long as we stick with doing this from now on (i.e.
have a consistent versioning scheme, release policy, CPAN release policy,
etc).  Would be nice for Jason/Brian/Hilmar to chime in as to the reasoning
behind the older versioning scheme.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From JK at novozymes.com  Tue Oct 24 17:59:10 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 19:59:10 +0200
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n
	et> <F563FB85-D711-481A-A038-EDCA4B941C3C@gmx.net>
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net>

>  
> I think you've generally taken the right path, but see below.
> 
> First off, object factories are used extensively already but not yet  
> in each and every place where Bioperl creates an object internally.  
> Achieving your goal may entail fixes to Bioperl to use a factory  
> instead of a hard-coded module name. Also be on the lookout for  
> factory() or seq_factory() methods for classes whose work entails  
> creating sequence objects and that already give you control over the  
> type to be created.

Can you elaborate/describe this a bit more? 

> The problem that hits you here though isn't one of determining the  
> type of the object to be created, because the respective method  
> doesn't create a sequence object. It only returns the sequence object  
> that the feature has a reference to.

That is what Data::Dumper told me, but another thing I would like to
change is to get a RichSeq object returned every time from Bio::Seq, adding
in the stuff that always seems appropriate. 

> The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your  
> extension of the latter is that the Perl garbage collector can't deal  
> with circular references. 

Doesn't Scalar::Util::weaken solve that? 

> Having said all that, note that if all what you want to do is  
> defining computations on Bio::Seq objects, as opposed to storing  
> values for additional attributes, the best design approach is not to  
> extend the class but to create a class with those computations as  
> static methods (which would accept the seq object on which to compute  
> as an argument; e.g., print $seqComputations->message_digest($seq)).

I could, but there is some functionality that by design I would like to 
have available on every sequence in the system. That way I would end up 
coding the functionality for getting the message_digest every place that
I needed the value (which would be quite often in this application), 
whereas by design it belongs in the Bio::Seq stuff. 

Jesper


From JK at novozymes.com  Tue Oct 24 17:59:19 2006
From: JK at novozymes.com (JK (Jesper Agerbo Krogh))
Date: Tue, 24 Oct 2006 19:59:19 +0200
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n
	et> <453E33A4.5060004@sendu.me.uk>
Message-ID: <934F95E71B6C9347A873C42AE3C196190B84C5FD@NZT0004E.dknz.nzcorp.net>


> JK (Jesper Agerbo Krogh) wrote:
> > Hi. 
> > 
> > We're trying to "extend" bioperl in our own setup. We have some functions
> > that we'd like to "always" have available on a Bio::Seq-object.
> [snip]
> > So the question basically is: 
> > What is the preferred way of extending/subclassing Bio-perl -objects
> > with our own methods? 
> 
> http://www.bioperl.org/wiki/Advanced_BioPerl#Extending_the_toolkit

That is definitely a way of extending Bio-perl, thanks. 

Jesper


From hlapp at gmx.net  Tue Oct 24 18:57:02 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 24 Oct 2006 14:57:02 -0400
Subject: [Bioperl-l] Subclassing Bio::Seq ?  Extending Bio::Perl
In-Reply-To: <934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net>
References: <934F95E71B6C9347A873C42AE3C196191299DFBF@NZT0004E.dknz.nzcorp.n
	et> <F563FB85-D711-481A-A038-EDCA4B941C3C@gmx.net>
	<934F95E71B6C9347A873C42AE3C196190B84C5FC@NZT0004E.dknz.nzcorp.net>
Message-ID: <C8DB5DCD-E5BB-4AA0-9CDA-3C2EC7B88621@gmx.net>


On Oct 24, 2006, at 1:59 PM, JK ((Jesper Agerbo Krogh)) wrote:

>>
>> I think you've generally taken the right path, but see below.
>>
>> First off, object factories are used extensively already but not yet
>> in each and every place where Bioperl creates an object internally.
>> Achieving your goal may entail fixes to Bioperl to use a factory
>> instead of a hard-coded module name. Also be on the lookout for
>> factory() or seq_factory() methods for classes whose work entails
>> creating sequence objects and that already give you control over the
>> type to be created.
>
> Can you elaborate/describe this a bit more?

See for example the POD of Bio::SeqIO (sorry, the method is called  
sequence_factory()).

>
>> The reason that this is a Bio::PrimarySeq and not a Bio::Seq or your
>> extension of the latter is that the Perl garbage collector can't deal
>> with circular references.
>
> Doesn't Scalar::Util::weaken solve that?

You're welcome to test and try. It should be a simple change in  
Bio::Seq::add_SeqFeature(). You will see that it is this method and  
not the feature object that makes sure the wrapped primarySeq gets  
passed as sequence reference. Just change that to creating a new  
reference to the sequence object and make it a weak reference before  
passing it to the feature object.

(The feature object has no requirement (or knowledge) that the  
referenced sequence object is a PrimarySeq.)
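
A minimal sketch of the weak back-reference idea, using plain hashes as hypothetical stand-ins for the sequence and feature objects (this is not the actual Bio::Seq::add_SeqFeature() code, only the Scalar::Util mechanics):

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Plain hashes stand in for the sequence and feature objects (hypothetical,
# not the Bioperl classes).
my $seq     = { id => 'seq1', features => [] };
my $feature = { start => 1, end => 10 };

# The sequence holds a strong reference to the feature ...
push @{ $seq->{features} }, $feature;

# ... and the feature points back at the sequence, which closes the cycle.
$feature->{seq} = $seq;

# Weakening the back-reference means it no longer keeps the sequence alive.
weaken( $feature->{seq} );

my $is_weak = isweak( $feature->{seq} ) ? 1 : 0;
print "back-reference is weak: $is_weak\n";
```

The weak reference still works as long as something else holds the sequence; it becomes undef once the sequence is freed, which any real change would have to allow for.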

>
>> Having said all that, note that if all what you want to do is
>> defining computations on Bio::Seq objects, as opposed to storing
>> values for additional attributes, the best design approach is not to
>> extend the class but to create a class with those computations as
>> static methods (which would accept the seq object on which to compute
>> as an argument; e.g., print $seqComputations->message_digest($seq)).
>
> I could but there are some functionality that I'd by design would like
> to have available on every sequence in the system. This way I would end
> up coding the functionality for getting the message_digest every place
> that I needed to get the value (which would be quite often in this
> application), whereas it by design belongs into the Bio::Seq-stuff.

I don't follow why this would make any difference (it would be
$seq->message_digest() compared to $seqCompute->message_digest($seq)),
unless what you are saying is that you would like to cache the result of
the computation.
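
A minimal sketch of the helper-class approach, assuming an MD5 hex digest is what message_digest should return. My::SeqCompute and My::StubSeq are hypothetical names made up for illustration; the stub only exists so the sketch runs without Bioperl, and any object with a seq() method would do:

```perl
use strict;
use warnings;
use Digest::MD5 ();

# Hypothetical helper class, not part of Bioperl.
package My::SeqCompute;

sub new { my $class = shift; return bless {}, $class }

# Works on any object that has a seq() method returning the residues.
sub message_digest {
    my ( $self, $seq ) = @_;
    return Digest::MD5::md5_hex( $seq->seq );
}

# Minimal stand-in for a sequence object so the sketch runs without Bioperl.
package My::StubSeq;

sub new { my ( $class, $residues ) = @_; return bless { seq => $residues }, $class }
sub seq { return $_[0]{seq} }

package main;

my $seq_compute = My::SeqCompute->new;
my $digest      = $seq_compute->message_digest( My::StubSeq->new('ACGT') );
print "$digest\n";
```

The calling code is one line either way; the difference is only where the method lives.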

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Wed Oct 25 10:36:27 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 25 Oct 2006 11:36:27 +0100
Subject: [Bioperl-l] Lagan environment variable
Message-ID: <453F3E2B.2040309@sendu.me.uk>

Notification to say I'm changing the environmental variable that 
Bio::Tools::Run::Alignment::Lagan expects to define the location of the 
lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the 
default variable that the lagan installation and scripts themselves look 
for.

I hope this isn't too much of a burden, but it seems like the sensible 
approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.


Thank you,
Sendu.


From n.haigh at sheffield.ac.uk  Wed Oct 25 13:07:47 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 25 Oct 2006 13:07:47 +0000
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F3E2B.2040309@sendu.me.uk>
References: <453F3E2B.2040309@sendu.me.uk>
Message-ID: <453F61A3.4090904@sheffield.ac.uk>

Sendu Bala wrote:
> Notification to say I'm changing the environmental variable that 
> Bio::Tools::Run::Alignment::Lagan expects to define the location of the 
> lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the 
> default variable that the lagan installation and scripts themselves look 
> for.
>
> I hope this isn't too much of a burden, but it seems like the sensible 
> approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.
>
>
> Thank you,
> Sendu.
Wouldn't it make more sense to change the test? That is what I've just
done for t/Genscan.t

It seemed to fit in with the ENV variable syntax that other modules in
Bioperl-run used.

Nath

-- 
> A: Yes.
>> Q: Are you sure?
>>     
>>> A: Because it reverses the logical flow of conversation.
>>>       
>>>> Q: Why is top posting frowned upon?
>>>>         
Get Thunderbird <http://www.mozilla.org/products/thunderbird/>


From bix at sendu.me.uk  Wed Oct 25 12:12:00 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 25 Oct 2006 13:12:00 +0100
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F61A3.4090904@sheffield.ac.uk>
References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk>
Message-ID: <453F5490.7060808@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Notification to say I'm changing the environmental variable that 
>> Bio::Tools::Run::Alignment::Lagan expects to define the location of the 
>> lagan executables from LAGANDIR to LAGAN_DIR, since the latter is the 
>> default variable that the lagan installation and scripts themselves look 
>> for.
>>
>> I hope this isn't too much of a burden, but it seems like the sensible 
>> approach to getting Bio::Tools::Run::Alignment::Lagan to actually work.
>
> Wouldn't it make more sense to change the test? That is what I've just
> done for t/Genscan.t

For Genscan.t, the test script looked at the wrong environment variable.

Here I'm talking about lagan itself (the thing you get from 
http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with 
Bioperl) needing the environment variable LAGAN_DIR to be set in order 
to work.

Since you need to set LAGAN_DIR to make lagan work, it makes sense that 
the Bioperl front-end to lagan also use the same variable.
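
A minimal sketch of how a wrapper could look up the directory, preferring LAGAN_DIR. The fallback to the old LAGANDIR name and the lagan_dir() helper are assumptions for illustration only, not what has been committed:

```perl
use strict;
use warnings;

# Prefer LAGAN_DIR (the name lagan itself reads); fall back to the old
# LAGANDIR. The fallback is an assumption for illustration only.
sub lagan_dir {
    my ($env) = @_;
    for my $name (qw(LAGAN_DIR LAGANDIR)) {
        return $env->{$name}
            if defined $env->{$name} && length $env->{$name};
    }
    return;
}

my $dir = lagan_dir( \%ENV );
print defined $dir ? "lagan directory: $dir\n" : "LAGAN_DIR is not set\n";
```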


From n.haigh at sheffield.ac.uk  Wed Oct 25 13:16:16 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Wed, 25 Oct 2006 13:16:16 +0000
Subject: [Bioperl-l] Lagan environment variable
In-Reply-To: <453F5490.7060808@sendu.me.uk>
References: <453F3E2B.2040309@sendu.me.uk> <453F61A3.4090904@sheffield.ac.uk>
	<453F5490.7060808@sendu.me.uk>
Message-ID: <453F63A0.7040609@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Notification to say I'm changing the environmental variable that
>>> Bio::Tools::Run::Alignment::Lagan expects to define the location of
>>> the lagan executables from LAGANDIR to LAGAN_DIR, since the latter
>>> is the default variable that the lagan installation and scripts
>>> themselves look for.
>>>
>>> I hope this isn't too much of a burden, but it seems like the
>>> sensible approach to getting Bio::Tools::Run::Alignment::Lagan to
>>> actually work.
>>
>> Wouldn't it make more sense to change the test? That is what I've just
>> done for t/Genscan.t
>
> For Genscan.t, the test script looked at the wrong environment variable.
>
> Here I'm talking about lagan itself (the thing you get from
> http://lagan.stanford.edu/lagan_web/citing.shtml, nothing to do with
> Bioperl) needing the environment variable LAGAN_DIR to be set in order
> to work.
>
> Since you need to set LAGAN_DIR to make lagan work, it makes sense
> that the Bioperl front-end to lagan also use the same variable.
>
Ah, OK! :-[  That'll teach me to speak up about something I know nothing
about! :-)

FYI, I've been busy this morning installing as much Bioperl-run external
software as I could (those that have tests). Will be posting results shortly.

Nath


From massimo.ubaldi at gmail.com  Wed Oct 25 14:28:52 2006
From: massimo.ubaldi at gmail.com (Massimo Ubaldi)
Date: Wed, 25 Oct 2006 16:28:52 +0200
Subject: [Bioperl-l] blastxml format
Message-ID: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>

Hi
I'm using the script below to parse blastn output for multiple query sequences.
I got the output from the BLAST web interface, asking for XML-formatted
output.
Everything works fine except that I cannot print the name of each input
sequence (see below).
That is, using $result->query_description (see the script below) I get just
the name of the first sequence. In fact this is defined by the
<BlastOutput_query-def> tag.
What I really want is to extract the name that is defined by the
<Iteration_query-def> tag.
I have dug through the bioperl mailing list and other sources but did not
find anything that solves this.
Can somebody help me?
Thanks a lot
Massimo


 This is an example of the output I got

MRDNA_probe
46.1    PREDICTED: Danio rerio similar to mineralocorticoid receptor form B
(LOC562171), mRNA    68354945    XM_685568
81.8    Danio rerio VDR-B mRNA, partial cds    68132043    DQ017633
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA
68420187    XM_684078

This is what I'd like to get
MRDNA_probe
46.1    PREDICTED: Danio rerio similar to mineralocorticoid receptor form B
(LOC562171), mRNA    68354945    XM_685568
VDRacterm_probe
81.8    Danio rerio VDR-B mRNA, partial cds    68132043    DQ017633
ARalpcterm_probe
PREDICTED: Danio rerio similar to Rarab protein (LOC560679), mRNA
68420187    XM_684078

This is the script
#!/usr/bin/perl
use strict;
use Bio::SearchIO;
my $in = new Bio::SearchIO(-format => 'blast',
                            -file   => 'Blastn_danio.bls');
open OUTFILE, ">parsed_blastn_danio.txt"
    or die "Could not open file, stopped";
my $result = $in->next_result;
print OUTFILE $result->algorithm, "\n";
print OUTFILE $result->database_name, "\n";

print OUTFILE "Score", "\t", "Description", "\t", "NCBI gi identifiers",
"\t", "GenBank Accession", "\n";

while($result = $in->next_result ) {
    print OUTFILE $result->query_description, "\n";
      while( my $hit = $result->next_hit ) {
           while( my $hsp = $hit->next_hsp ) {

                my $acc=$hit->name;
                my $description= $hit->description;

                $acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/;

                print OUTFILE

                  $hit->raw_score, "\t", # Score
                  $hit->description, "\t", # Description

                $1, "\t", $2, "\n";
         }
      }
}


From cjfields at uiuc.edu  Wed Oct 25 15:04:14 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Oct 2006 10:04:14 -0500
Subject: [Bioperl-l] blastxml format
In-Reply-To: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>
Message-ID: <000301c6f846$d6227760$15327e82@pyrimidine>

Iterations (which are related to PSIBLAST) aren't currently handled in
blastxml, which is why the tag isn't being parsed.  I'll give it a look but
I don't think it will be properly fixed anytime soon, since we're gearing up
for a developer release and are sorting out various bugs in relation to
that.

In the meantime, you could always try changing the relevant tag in the
%MAPPING hash in your local copy of Bio::SearchIO::blastxml from
'BlastOutput_query-def' to 'Iteration_query-def', which may do the trick for
you.  I'm a bit reluctant to change this in CVS as it would be better to add
this in when iterations are handled properly by blastxml, and I'm not sure
all BLAST XML varieties have the <Iteration_query-def> tag.
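
As a stopgap outside Bioperl, the per-query names could be pulled straight out of the XML text. This is only a sketch: iteration_query_defs() is a hypothetical helper, the XML fragment is made up (with query names from Massimo's message), and a plain regex is used instead of a real XML parser because the tag holds simple text:

```perl
use strict;
use warnings;

# Collect the text of every <Iteration_query-def> element in a string of
# BLAST XML. A plain regex is used here only because the tag holds simple
# text; it is not a general XML parser.
sub iteration_query_defs {
    my ($xml) = @_;
    return $xml =~ m{<Iteration_query-def>(.*?)</Iteration_query-def>}gs;
}

# Made-up fragment for illustration (query names from Massimo's message).
my $xml = <<'XML';
<Iteration><Iteration_query-def>MRDNA_probe</Iteration_query-def></Iteration>
<Iteration><Iteration_query-def>VDRacterm_probe</Iteration_query-def></Iteration>
XML

print "$_\n" for iteration_query_defs($xml);
```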

If you want you can add this to the bioperl bugzilla as an enhancement
request to remind us:

http://bugzilla.open-bio.org/

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 



From massimo.ubaldi at gmail.com  Wed Oct 25 15:20:49 2006
From: massimo.ubaldi at gmail.com (Massimo Ubaldi)
Date: Wed, 25 Oct 2006 17:20:49 +0200
Subject: [Bioperl-l] blastxml format
In-Reply-To: <000301c6f846$d6227760$15327e82@pyrimidine>
References: <4b5350650610250728s1a421199if2493c9c4660474d@mail.gmail.com>
	<000301c6f846$d6227760$15327e82@pyrimidine>
Message-ID: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com>

Thanks for the reply. I've already tried this but I got exactly the same
results as before.
What else can I try?
Massimo

On 10/25/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> Iterations (which are related to PSIBLAST) aren't currently handled in
> blastxml, which is why the tag isn't being parsed.  I'll give it a look
> but I don't think it will be properly fixed anytime soon, since we're
> gearing up for a developer release and are sorting out various bugs in
> relation to that.
>
> In the meantime, you could always try changing the relevant tag in the
> %MAPPING hash in your local copy of Bio::SearchIO::blastxml from
> 'BlastOutput_query-def' to 'Iteration_query-def', which may do the trick
> for you.  I'm a bit reluctant to change this in CVS as it would be better
> to add this in when iterations are handled properly by blastxml, and I'm
> not sure all BLAST XML varieties have the <Iteration_query-def> tag.
>
> If you want you can add this to the bioperl bugzilla as an enhancement
> request to remind us:
>
> http://bugzilla.open-bio.org/
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>


From cjfields at uiuc.edu  Wed Oct 25 16:56:46 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 25 Oct 2006 11:56:46 -0500
Subject: [Bioperl-l] blastxml format
In-Reply-To: <4b5350650610250820w1498b27dnd155896fbf9a2012@mail.gmail.com>
Message-ID: <000001c6f856$8ee44bc0$15327e82@pyrimidine>


> Thanks for the reply. I've already tried this but I got exactly the same
> results as before.
> What other can I try?
> Massimo

If you don't mind me asking, what version of perl and Bioperl are you using,
and what version of BLAST is used?  

I want to point out that there are a number of problems with your script,
now that I have had a chance to look at it.  

1) You have the SearchIO format set to 'blast'.  It should be 'blastxml' if
you are parsing XML format.  

2) Every call to next_result() advances to the next BLAST report, so the
call before your while loop consumes the first report.  In effect, you're
doing something like this:

  my $result = $in->next_result();
   ....# do something here (in first BLAST report)
 
  while ($result = $in->next_result()) { # change to second BLAST report
      # more stuff here (in second BLAST report, if there is one)
  }

I don't know if it's intentional though, but it's something to point out.

3) You also use raw_score(), which doesn't return a value for me (this may
be related to the bioperl version, which is why I asked above).  If you use
$hit->bits() or $hit->significance() you can get the bits or hit evalue,
respectively.

4) Also, I didn't see a difference with the two XML tags
<BlastOutput_query-def> and <Iteration_query-def> using BLAST 2.2.15 output
(WebBLAST at NCBI), which makes sense since they should originate from the
same query sequence anyway.  This could be related to the BLAST version.

Here's my version of your script, using WinXP and bioperl-live (CVS):

use Bio::SearchIO;
my $file = shift @ARGV;

my $in = new Bio::SearchIO(-format => 'blastxml',
                            -file   => $file);

# 'or' rather than '||': '||' binds to the filename string, so die never runs
open OUTFILE, ">parsed_blastn_danio.txt"
    or die "Could not open file, stopped";

while(my $result = $in->next_result ) {
    print OUTFILE $result->algorithm, "\n";
    print OUTFILE $result->database_name, "\n";
    print OUTFILE "Score", "\t",
                  "Description", "\t",
                  "NCBI gi identifiers", "\t",
                  "GenBank Accession", "\n";
    print OUTFILE $result->query_description, "\n";
    while( my $hit = $result->next_hit ) {
        while( my $hsp = $hit->next_hsp ) {
            my $acc=$hit->name;
            my $description= $hit->description;
            if ($acc =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/) {
                print OUTFILE $hit->bits, "\t", # Score
                  $hit->description, "\t", # Description
                  $1, "\t", $2, "\n";
            }
        }
    }
}
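
The ID-splitting regex can be checked on its own, without a BLAST report. A minimal sketch, where split_hit_name() is a hypothetical helper and the 'gi|...|ref|...|' name format is an assumption about what $hit->name returns (the IDs are from the example output in this thread):

```perl
use strict;
use warnings;

# Split an NCBI-style hit name into gi number and accession (version dropped).
# Returns an empty list when the name does not match.
sub split_hit_name {
    my ($name) = @_;
    return $name =~ /gi\|(\d+)\|\w+\|(\w+)\.\d/;
}

# Assumed name format; the IDs come from the example output in this thread.
for my $name ( 'gi|68354945|ref|XM_685568.1|', 'no_gi_here' ) {
    my ( $gi, $acc ) = split_hit_name($name);
    print defined $gi ? "$gi\t$acc\n" : "unparsed: $name\n";
}
```

Testing the match result, as the script above does with its if, means a name that doesn't fit the pattern is skipped rather than printed with unreliable captures.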

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign

...


From n.haigh at sheffield.ac.uk  Thu Oct 26 08:47:27 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 09:47:27 +0100
Subject: [Bioperl-l] More extensive Bioperl-run 1.5.2RC2 tests
Message-ID: <4540761F.6010904@sheffield.ac.uk>

Oops, I posted this to the Biojava list the other day by mistake!

I have recently installed some more software for which there are
bioperl-run tests and run the test suite with several versions of the
software I could find. I've added info to
http://www.bioperl.org/wiki/Release_1.5.2#bioperl-run. If there were any
fails in any of the versions I tested I've noted them together with
versions that were ok (if any).

There may be another 6 or so programs I'm trying to get hold of to run
further tests - I'll update when I get them.
Nath


From n.haigh at sheffield.ac.uk  Thu Oct 26 09:14:07 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 10:14:07 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
Message-ID: <45407C5F.40104@sheffield.ac.uk>

I'm thinking that it's not wise to test for things like
overall_percentage_identity etc in alignments that are generated by
external software like T-Coffee, Clustalw etc. Changes to software
algorithms/efficiency, bug fixes etc may well alter the quality of the
alignment produced in different versions and thus affect the value
returned by such methods. Therefore, I think these methods should only
be tested from alignments loaded directly from t/data.

Nath


From bix at sendu.me.uk  Thu Oct 26 09:48:37 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 26 Oct 2006 10:48:37 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45407C5F.40104@sheffield.ac.uk>
References: <45407C5F.40104@sheffield.ac.uk>
Message-ID: <45408475.30903@sendu.me.uk>

Nathan Haigh wrote:
> I'm thinking that it's not wise to test for things like
> overall_percentage_identity etc in alignments that are generated by
> external software like T-Coffee, Clustalw etc. Changes to software
> algorithms/efficiency, bug fixes etc may well alter the quality of the
> alignment produced in different versions and thus affect the value
> returned by such methods. Therefore, I think these methods should only
> be tested from alignments loaded directly from t/data.

Did you discover some specific problem cases?


From n.haigh at sheffield.ac.uk  Thu Oct 26 10:04:54 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 11:04:54 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408475.30903@sendu.me.uk>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
Message-ID: <45408846.1050001@sheffield.ac.uk>

Sendu Bala wrote:
> Nathan Haigh wrote:
>> I'm thinking that it's not wise to test for things like
>> overall_percentage_identity etc in alignments that are generated by
>> external software like T-Coffee, Clustalw etc. Changes to software
>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>> alignment produced in different versions and thus affect the value
>> returned by such methods. Therefore, I think these methods should only
>> be tested from alignments loaded directly from t/data.
>
> Did you discover some specific problem cases?
My messages seem to be taking a while to come through, but, yes. It may
be due to the software changing default parameters, but it makes testing
the output for specific details pretty difficult and inconsistent. For
example, running T-Coffee, the following command from t/TCoffee.t
results in slightly different alignment:
$aln = $factory->run('-type'    => 'profile',
                     '-profile' => $aln1,
                     '-seq'     => Bio::Root::IO->catfile("t","data","cysprot1b.fa"));

Of particular note are the gaps on the last line of the sequences. In
4.45, the last line of CATH_RAT/1-333 reads 'gk-nm---cg', whereas in
<v4.45 it reads 'gkn----mcg'.
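
A minimal sketch of the kind of check that would survive such differences, using only the two CATH_RAT fragments quoted above. degap() is a hypothetical helper, and comparing residues and aligned length (instead of identity values) is just one possible choice of version-independent test:

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Strip gap characters so that only the residues remain.
sub degap {
    my ($aligned) = @_;
    ( my $residues = $aligned ) =~ tr/-//d;
    return $residues;
}

# The two gap placements quoted above for CATH_RAT.
my $new_version = 'gk-nm---cg';
my $old_version = 'gkn----mcg';

# Stable across versions: same residues, same aligned length.
is( degap($new_version), degap($old_version),
    'same residues once gaps are removed' );
is( length($new_version), length($old_version), 'same aligned length' );
```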

T-Coffee v4.45 returns the following alignment:

>CATH_RAT/1-333
------mwtalpllcagawllsagat----------aeltvnaiek------------fh
ftswmkqhqktyss-reyshrlqvfannwrkiqahn----qrnhtfkmglnqfsdmsfae
ikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqgacgscwtfs
ttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqafeyilynk
gimgedsypyigkngqckfnpekavafvknvv-nitlndeaamveavalynpvsfafevt
-edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivknswgsnwgnn
gyfliergk-nm---cglaacasypipqv
>CATL_HUMAN/1-333
--------------------------------mnptlilaafclgiasatltfdhsleaq
wtkwkamhnrlygmnee-gwrravweknmkmielhnqeyregkhsftmamnafgdmtsee
frqvmngfqnrkpr----kgkvfqeplfyeaprsvdwrekg-yvtpvknqgqcgscwafs
atgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdyafqyvqdng
gldseesypyeateesckynpkysvandtgfv-dip-kqekalmkavatvgpisvaidag
hesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvknswgeewgmg
gyvkmakdrrnh---cgiasaasyptv--
>CATL_RAT/1-334
--------------------------------mtpllllavlclgtalatpkfdqtfnaq
whqwksthrrlygtnee-ewrravweknmrmiqlhngeysngkhgftmemnafgdmtnee
frqivngyrhqkhk----kgrlfqeplmlqipktvdwrekg-cvtpvknqgqcgscwafs
asgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfafqyikeng
gldseesypyeakdgsckyraeyavandtgfv-dip-qqekalmkavatvgpisvamdas
hpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvknswgkewgmd
gyikiakdrnnh---cglataasypivn-
>PAPA_CARPA/1-345
mamipsiskllfvaiclfvymglsfg-------------dfsivgysqndltsterliql
feswmlkhnkiyknidekiyrfeifkdnlkyidetn----kknnsywlglnvfadmsnde
fkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgscgscwafs
avvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsalqlvaqy-
gihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysian-qpvsvvleaa
gkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yiliknswgtgwgen
gyirikrgtgnsygvcglytssfypvkn-
>ALEU_HORVU/1-362
maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtrhalr
farfavrygksyesaaevrrrfrifsesleevrstn----rkglpyrlginrfsdmswee
fqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqahcgscwtfs
ttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqafeyikyng
gidteesypykgvngvchykaenaavqvldsv-nitlnaedelknavglvrpvsvafqvi
-dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywliknswgadwgdn
gyfkmemgk-nm---caiatcasypvvaa
>CATH_HUMAN/1-335
------mwatlpllcagawllg--------vpvcgaaelsvnslek------------fh
fkswmskhrktys-teeyhhrlqtfasnwrkinahn----ngnhtfkmalnqfsdmsfae
ikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqgacgscwtfs
ttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqafeyilynk
gimgedtypyqgkdgyckfqpgkaigfvkdva-nitiydeeamveavalynpvsfafevt
-qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivknswgpqwgmn
gyfliergk-nm---cglaacasypiplv
>CYS1_DICDI/1-343
-----mkvillfvlavftvfvs---------------srgippeeq------------sq
flefqdkfnkkys-heeylerfeifksnlgkieelnliainhkadtkfgvnkfadlssde
fknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgqcgscwsfs
ttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpnaynyiikng
giqtessypytaetgtqcnfnsanigakisnf-tmipknetvmagyivstgplaiaadav
-e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivknswgadwgeq
gyiylrrgk-nt---cgvsnfvstsii--

While T-Coffee <4.45 returned:
>CATH_RAT/1-333
----------mwtalpllcagawllsagat----------aeltvnaiek----------
--fhftswmkqhqktyss-reyshrlqvfannwrkiqahn----q----rnhtfkmglnq
fsdmsfaeikhkylwsepqncsat--ksnylrgtgp--ypssmdwrkkgnvvspvknqga
cgscwtfsttgalesavaiasgkmmtlaeqqlvdcaqnfnnh--------gcqgglpsqa
feyilynkgimgedsypyigkngqckfnpekavafvknvvn-itlndeaamveavalynp
vsfafevt-edfmmyksgvyssnschktpdkvnhavlavgygeqn-----gllywivkns
wgsnwgnngyfliergkn----mcglaacasypipqv
>PAPA_CARPA/1-345
mamipsiskllfvaiclfvymglsfgdfsivgysqndltsterliqlfeswml-------
-------------khnkiyknidekiyrf-----eifkdnlkyidetnkknnsywlglnv
fadmsndefkekytgsiagnytttelsyeevlndgdvnipeyvdwrqkg-avtpvknqgs
cgscwafsavvtiegiikirtgnlneyseqelldc----------drrsygcnggypwsa
lq-lvaqygihyrntypyegvqrycrsrekgpyaaktdgvrqvqpynegallysia-nqp
vsvvleaagkdfqlyrggifvgpcgnk----vdhavaavgygpn---------yilikns
wgtgwgengyirikrgtgnsygvcglytssfypvkn-
>CATL_HUMAN/1-333
-----------------------------------------mnptlilaafclgiasatl
tfdhsleaqwtkwkamhnrlygmneegwrravweknmkmielhnqeyregkhsftmamna
fgdmtseefrqvmngfqnrkprkgkvfqeplf----yeaprsvdwrekg-yvtpvknqgq
cgscwafsatgalegqmfrktgrlislseqnlvdcsgpqgn--------egcngglmdya
fqyvqdnggldseesypyeateesckynpkysvandtgfvd--ipkqekalmkavatvgp
isvaidaghesflfykegiyfepdc--ssedmdhgvlvvgygfestesdnn-kywlvkns
wgeewgmggyvkmakdrrnh---cgiasaasyptv--
>CATL_RAT/1-334
-----------------------------------------mtpllllavlclgtalatp
kfdqtfnaqwhqwksthrrlygtneeewrravweknmrmiqlhngeysngkhgftmemna
fgdmtneefrqivngyrhqkhkkgrlfqeplm----lqipktvdwrekg-cvtpvknqgq
cgscwafsasgclegqmflktgklislseqnlvdcshdqgn--------qgcngglmdfa
fqyikenggldseesypyeakdgsckyraeyavandtgfvd--ipqqekalmkavatvgp
isvamdashpslqfyssgiyyepnc--sskdldhgvlvvgygyegtdsnkd-kywlvkns
wgkewgmdgyikiakdrnnh---cglataasypivn-
>ALEU_HORVU/1-362
----maharvlllalavlataavavassssfadsnpirpvtdraastlesavlgalgrtr
halrfarfavrygksyesaaevrrrfrifsesleevrstn----r----kglpyrlginr
fsdmsweefqatrlg-aaqtcsatlagnhlmrdaaa--lpetkdwredg-ivspvknqah
cgscwtfsttgaleaaytqatgknislseqqlvdcaggfnnf--------gcngglpsqa
feyikynggidteesypykgvngvchykaenaavqvldsvn-itlnaedelknavglvrp
vsvafqvi-dgfrqyksgvytsdhcgttpddvnhavlavgygven-----gvpywlikns
wgadwgdngyfkmemgkn----mcaiatcasypvvaa
>CATH_HUMAN/1-335
----------mwatlpllcagawllg--------vpvcgaaelsvnslek----------
--fhfkswmskhrktys-teeyhhrlqtfasnwrkinahn----n----gnhtfkmalnq
fsdmsfaeikhkylwsepqncsatks--nylrgtgp--yppsvdwrkkgnfvspvknqga
cgscwtfsttgalesaiaiatgkmlslaeqqlvdcaqdfnny--------gcqgglpsqa
feyilynkgimgedtypyqgkdgyckfqpgkaigfvkdvan-itiydeeamveavalynp
vsfafevt-qdfmmyrtgiysstschktpdkvnhavlavgygekn-----gipywivkns
wgpqwgmngyfliergkn----mcglaacasypiplv
>CYS1_DICDI/1-343
---------mkvillfvlavftvfvs---------------srgippeeq----------
--sqflefqdkfnkkys-heeylerfeifksnlgkieelnliain----hkadtkfgvnk
fadlssdefknyylnnkeaiftddlpvadylddefinsiptafdwrtrg-avtpvknqgq
cgscwsfsttgnvegqhfisqnklvslseqnlvdcdhecmeyegeeacdegcngglqpna
ynyiiknggiqtessypytaetgtqcnfnsanigakisnft-mipknetvmagyivstgp
laiaadav-e-wqfyiggvfdipcn---pnsldhgilivgysakntifrknmpywivkns
wgadwgeqgyiylrrgkn----tcgvsnfvstsii--


From sanges at biogem.it  Thu Oct 26 10:26:36 2006
From: sanges at biogem.it (Remo Sanges)
Date: Thu, 26 Oct 2006 11:26:36 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408846.1050001@sheffield.ac.uk>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
	<45408846.1050001@sheffield.ac.uk>
Message-ID: <45408D5C.1000305@biogem.it>

Nathan Haigh wrote:
> Sendu Bala wrote:
>   
>> Nathan Haigh wrote:
>>     
>>> I'm thinking that it's not wise to test for things like
>>> overall_percentage_identity etc in alignments that are generated by
>>> external software like T-Coffee, Clustalw etc. Changes to software
>>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>>> alignment produced in different versions and thus affect the value
>>> returned by such methods. Therefore, I think these methods should only
>>> be tested from alignments loaded directly from t/data.
>>>       
>> Did you discover some specific problem cases?
>>     
> My messages seem to be taking a while to come through, but, yes. It may
> be due to the software changing default parameters, but it makes testing
> the output for specific details pretty difficult and inconsistent. For
> example, running T-Coffee, the following command from t/TCoffee.t
> results in slightly different alignment:
> $aln = $factory->run('-type' => 'profile',
>                      '-profile' => $aln1,
>                      '-seq'  =>
> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>
> Of particular note, is the gaps on the last line of the sequences. In
> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
> <v4.45 this is ('gkn----mcg').
>   
I'm not a T-Coffee user, but you usually come across
these problems when you use different scoring parameters
when aligning sequences.

Could it be that they have simply changed the
default parameters for gap penalties and that kind of
thing? Is it possible to set them?

If so, you can just run the test by defining
the scores in the param hash rather than relying on the defaults.

HTH

Remo


From n.haigh at sheffield.ac.uk  Thu Oct 26 10:33:55 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 11:33:55 +0100
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408D5C.1000305@biogem.it>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
	<45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it>
Message-ID: <45408F13.9020209@sheffield.ac.uk>

Remo Sanges wrote:
> Nathan Haigh wrote:
>> Sendu Bala wrote:
>>  
>>> Nathan Haigh wrote:
>>>    
>>>> I'm thinking that it's not wise to test for things like
>>>> overall_percentage_identity etc in alignments that are generated by
>>>> external software like T-Coffee, Clustalw etc. Changes to software
>>>> algorithms/efficiency, bug fixes etc may well alter the quality of the
>>>> alignment produced in different versions and thus affect the value
>>>> returned by such methods. Therefore, I think these methods should only
>>>> be tested from alignments loaded directly from t/data.
>>>>       
>>> Did you discover some specific problem cases?
>>>     
>> My messages seem to be taking a while to come through, but, yes. It may
>> be due to the software changing default parameters, but it makes testing
>> the output for specific details pretty difficult and inconsistent. For
>> example, running T-Coffee, the following command from t/TCoffee.t
>> results in slightly different alignment:
>> $aln = $factory->run('-type' => 'profile',
>>                      '-profile' => $aln1,
>>                      '-seq'  =>
>> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>>
>> Of particular note, is the gaps on the last line of the sequences. In
>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
>> <v4.45 this is ('gkn----mcg').
>>   
> I'm not a T-coffee user but usually you can come across
> these problems when you use different scoring parameters
> when align sequences.
>
> Could it be possible that they have simply changed the
> default parameters for gap penalties and that kind of
> stuff? It is possible to set them?
>
> If so you can just run the test by defining
> the scores in the param hash without using the default.
>
> HTH
>
> Remo
That is true, but it depends on whether the wrapper is complete
enough to set all the parameters provided by the software.

Nath


From n.haigh at sheffield.ac.uk  Thu Oct 26 16:13:03 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 17:13:03 +0100
Subject: [Bioperl-l] Bio::Restriction::Enzyme
Message-ID: <4540DE8F.7070501@sheffield.ac.uk>

I'm in the middle of writing some code that uses
Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
Bioperl from HEAD.

I seem to find that $enzyme->is_palindromic always seems to return true.
Can anyone verify this? If needs be, I can send some code.

Thanks
Nathan


From bosborne11 at verizon.net  Thu Oct 26 16:37:06 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Thu, 26 Oct 2006 12:37:06 -0400
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk>
Message-ID: <C1665C72.B068%bosborne11@verizon.net>

Nathan,

Perhaps because most restriction sites are palindromes. Anyway, I added
tests for palindromic() and is_palindromic() where the site is not a
palindrome, and these tests pass (t/RestrictionAnalysis.t).

Brian O.


On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:

> I'm in the middle of writing some code that uses
> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
> Bioperl from HEAD.
> 
> I seem to find that $enzyme->is_palindromic always seems to return true.
> Can anyone verify this? If needs be, I can send some code.
> 
> Thanks
> Nathan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From n.haigh at sheffield.ac.uk  Thu Oct 26 16:49:48 2006
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Thu, 26 Oct 2006 17:49:48 +0100
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <C1665C72.B068%bosborne11@verizon.net>
References: <C1665C72.B068%bosborne11@verizon.net>
Message-ID: <4540E72C.5020800@sheffield.ac.uk>

Brian Osborne wrote:
> Nathan,
>
> Perhaps because most restriction sites are palindromes. Anyway, I added
> tests for palindromic() and is_palindromic() where the site is not a
> palindrome, these tests pass (t/RestrictionAnalyis.t).
>
> Brian O.
>
>
> On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:
>
>   
>> I'm in the middle of writing some code that uses
>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>> Bioperl from HEAD.
>>
>> I seem to find that $enzyme->is_palindromic always seems to return true.
>> Can anyone verify this? If needs be, I can send some code.
>>
>> Thanks
>> Nathan
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>     
>
>
>   
Ok, thanks - nice to know :-)


From cjfields at uiuc.edu  Thu Oct 26 16:58:34 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 26 Oct 2006 11:58:34 -0500
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <4540DE8F.7070501@sheffield.ac.uk>
Message-ID: <001301c6f91f$f9611770$15327e82@pyrimidine>

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Nathan Haigh
> Sent: Thursday, October 26, 2006 11:13 AM
> To: Bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Bio::Restriction::Enzyme
> 
> I'm in the middle of writing some code that uses
> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
> Bioperl from HEAD.
> 
> I seem to find that $enzyme->is_palindromic always seems to return true.
> Can anyone verify this? If needs be, I can send some code.
> 
> Thanks
> Nathan

You should file a bug report if you have found a test case where this method
isn't working as it should, especially if Brian's tests pass and you're
still getting the wrong results.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From jason at bioperl.org  Thu Oct 26 16:57:32 2006
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 26 Oct 2006 09:57:32 -0700
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <45408F13.9020209@sheffield.ac.uk>
References: <45407C5F.40104@sheffield.ac.uk> <45408475.30903@sendu.me.uk>
	<45408846.1050001@sheffield.ac.uk> <45408D5C.1000305@biogem.it>
	<45408F13.9020209@sheffield.ac.uk>
Message-ID: <C2AC4DE8-7E99-4744-9FA9-B11C51788BDE@bioperl.org>

Nathan -

I agree - the values tend to change with different versions of the  
applications unfortunately.  It would make sense to just test that  
you get out sequences that are in valid alignment format and perhaps  
have as many ending sequences as you started with.   The more  
restrictive tests probably aren't reliable with mixing and matching  
versions.
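
A minimal sketch of that kind of structural check, in plain Perl with
Test::More and no aligner involved. The helper names (parse_fasta,
alignment_problems) and the toy sequences are made up for illustration;
in a real test the aligned text would come from the wrapper's output.

```perl
use strict;
use warnings;
use Test::More tests => 1;

# Parse FASTA-formatted text into a hash ref of id => sequence (gaps kept).
sub parse_fasta {
    my ($text) = @_;
    my (%seqs, $id);
    for my $line (split /\n/, $text) {
        if ($line =~ /^>(\S+)/) {
            $id = $1;
            $seqs{$id} = '';
        }
        elsif (defined $id) {
            $line =~ s/\s+//g;
            $seqs{$id} .= $line;
        }
    }
    return \%seqs;
}

# Return a list of structural problems; an empty list means none were found.
sub alignment_problems {
    my ($input, $aligned) = @_;
    my @problems;
    my %row_lengths;
    for my $id (sort keys %$input) {
        if (!exists $aligned->{$id}) {
            push @problems, "$id is missing from the alignment";
            next;
        }
        $row_lengths{ length($aligned->{$id}) } = 1;
        # Removing the gap characters should give back the input residues.
        (my $ungapped = $aligned->{$id}) =~ tr/-.//d;
        push @problems, "$id residues changed"
            unless lc($ungapped) eq lc($input->{$id});
    }
    push @problems, 'unexpected extra sequences'
        if keys(%$aligned) > keys(%$input);
    push @problems, 'rows differ in length' if keys(%row_lengths) > 1;
    return @problems;
}

# Toy data: two input sequences and one possible alignment of them.
my $input   = parse_fasta(">seq1\nmktlgk\n>seq2\nmtlk\n");
my $aligned = parse_fasta(">seq1\nmktlgk\n>seq2\nm-tl-k\n");

is_deeply([ alignment_problems($input, $aligned) ], [],
          'alignment is structurally sound');
```

Nothing here depends on where the aligner chose to put the gaps, so the
same test should hold across program versions.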

One thing we do for PAML is condition tests on the version used - but  
of course when a new version comes out we have to add more stuff to  
the tests (or just have some code that skips those tests).
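
Roughly, version-conditioned skipping could look like the sketch below.
It uses only Test::More; version_cmp is a made-up helper, and the
version number and observed string are hard-coded placeholders for
whatever the wrapper would actually report.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Compare two dotted version strings component by component.
# Returns -1, 0 or 1, like <=>. Needs perl 5.10+ for the // operator.
sub version_cmp {
    my ($left, $right) = @_;
    my @l = split /\./, $left;
    my @r = split /\./, $right;
    while (@l || @r) {
        my $cmp = (shift(@l) // 0) <=> (shift(@r) // 0);
        return $cmp if $cmp;
    }
    return 0;
}

# Hypothetical values: in a real test the version would be reported by the
# wrapper or parsed from the program's banner, and the string from its output.
my $version  = '4.45';
my $observed = 'gk-nm---cg';

# This check is safe for any version.
like($observed, qr/^[a-z-]+$/, 'row contains only residues and gaps');

# This one is only meaningful for versions whose output we have recorded.
SKIP: {
    skip 'exact gap placement is only known for versions before 4.45', 1
        unless version_cmp($version, '4.45') < 0;
    is($observed, 'gkn----mcg', 'gap placement matches the older output');
}
```

Note that the comparison treats each dotted component as a whole number,
so 4.45 sorts after 4.5; whether that matches a given program's
numbering scheme is an assumption to check.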

-jason
On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote:

> Remo Sanges wrote:
>> Nathan Haigh wrote:
>>> Sendu Bala wrote:
>>>
>>>> Nathan Haigh wrote:
>>>>
>>>>> I'm thinking that it's not wise to test for things like
>>>>> overall_percentage_identity etc in alignments that are  
>>>>> generated by
>>>>> external software like T-Coffee, Clustalw etc. Changes to software
>>>>> algorithms/efficiency, bug fixes etc may well alter the quality  
>>>>> of the
>>>>> alignment produced in different versions and thus affect the value
>>>>> returned by such methods. Therefore, I think these methods  
>>>>> should only
>>>>> be tested from alignments loaded directly from t/data.
>>>>>
>>>> Did you discover some specific problem cases?
>>>>
>>> My messages seem to be taking a while to come through, but, yes.  
>>> It may
>>> be due to the software changing default parameters, but it makes  
>>> testing
>>> the output for specific details pretty difficult and  
>>> inconsistent. For
>>> example, running T-Coffee, the following command from t/TCoffee.t
>>> results in slightly different alignment:
>>> $aln = $factory->run('-type' => 'profile',
>>>                      '-profile' => $aln1,
>>>                      '-seq'  =>
>>> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
>>>
>>> Of particular note, is the gaps on the last line of the  
>>> sequences. In
>>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
>>> <v4.45 this is ('gkn----mcg').
>>>
>> I'm not a T-coffee user but usually you can come across
>> these problems when you use different scoring parameters
>> when align sequences.
>>
>> Could it be possible that they have simply changed the
>> default parameters for gap penalties and that kind of
>> stuff? It is possible to set them?
>>
>> If so you can just run the test by defining
>> the scores in the param hash without using the default.
>>
>> HTH
>>
>> Remo
> That is true, but it depends on the whether the wrapper is complete
> enough to be able to set all the parameters provided by the software.
>
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From cjfields at uiuc.edu  Thu Oct 26 22:01:08 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 26 Oct 2006 17:01:08 -0500
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <C2AC4DE8-7E99-4744-9FA9-B11C51788BDE@bioperl.org>
Message-ID: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>

I have been running into similar issues with EUtilities tests.  Since the
data on the server is constantly updated I have to try and future-proof the
tests so they don't constantly fail.  

I have been using Test::More and like/unlike or cmp_ok to get around some of
those 'fuzzy data' issues.  If some methods consistently return a particular
type of value, such as an integer, you could use:

like($foo->get_value, qr{^\d+$}, 'value test'); #integer

or similar.
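
A slightly fuller sketch along the same lines, using only Test::More.
The %result hash and its values are invented stand-ins for whatever a
remote query happened to return; the point is that each test checks the
shape or range of a value rather than the value itself.

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical values standing in for what a remote query returned today.
my %result = (
    count     => 1532,
    identity  => 87.4,
    accession => 'NM_000546',
);

like($result{count}, qr/^\d+$/, 'count is an integer');
cmp_ok($result{count}, '>', 0, 'at least one record came back');
cmp_ok($result{identity}, '<=', 100, 'percentage is not above 100');
unlike($result{accession}, qr/\s/, 'accession contains no whitespace');
```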

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> Nathan -
> 
> I agree - the values tend to change with different versions of the
> applications unfortunately.  It would make sense to just test that
> you get out sequences that are in valid alignment format and perhaps
> have as many ending sequences as you started with.   The more
> restrictive tests probably aren't reliable with mixing and matching
> versions.
> 
> One thing we do for PAML is condition tests on the version used - but
> of course when a new version comes out we have to add more stuff to
> the tests (or just have some code that skips those tests).
> 
> -jason
> On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote:
> 
> > Remo Sanges wrote:
> >> Nathan Haigh wrote:
> >>> Sendu Bala wrote:
> >>>
> >>>> Nathan Haigh wrote:
> >>>>
> >>>>> I'm thinking that it's not wise to test for things like
> >>>>> overall_percentage_identity etc in alignments that are
> >>>>> generated by
> >>>>> external software like T-Coffee, Clustalw etc. Changes to software
> >>>>> algorithms/efficiency, bug fixes etc may well alter the quality
> >>>>> of the
> >>>>> alignment produced in different versions and thus affect the value
> >>>>> returned by such methods. Therefore, I think these methods
> >>>>> should only
> >>>>> be tested from alignments loaded directly from t/data.
> >>>>>
> >>>> Did you discover some specific problem cases?
> >>>>
> >>> My messages seem to be taking a while to come through, but, yes.
> >>> It may
> >>> be due to the software changing default parameters, but it makes
> >>> testing
> >>> the output for specific details pretty difficult and
> >>> inconsistent. For
> >>> example, running T-Coffee, the following command from t/TCoffee.t
> >>> results in slightly different alignment:
> >>> $aln = $factory->run('-type' => 'profile',
> >>>                      '-profile' => $aln1,
> >>>                      '-seq'  =>
> >>> Bio::Root::IO->catfile("t","data","cysprot1b.fa"));
> >>>
> >>> Of particular note, is the gaps on the last line of the
> >>> sequences. In
> >>> 4.45, there are two gaps in CATH_RAT/1-133 ('gk-nm---cg') whereas in
> >>> <v4.45 this is ('gkn----mcg').
> >>>
> >> I'm not a T-coffee user but usually you can come across
> >> these problems when you use different scoring parameters
> >> when align sequences.
> >>
> >> Could it be possible that they have simply changed the
> >> default parameters for gap penalties and that kind of
> >> stuff? It is possible to set them?
> >>
> >> If so you can just run the test by defining
> >> the scores in the param hash without using the default.
> >>
> >> HTH
> >>
> >> Remo
> > That is true, but it depends on the whether the wrapper is complete
> > enough to be able to set all the parameters provided by the software.
> >
> > Nath
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> --
> Jason Stajich, PhD
> Miller Research Fellow
> University of California
> Dept of Plant and Microbial Biology
> 321 Koshland Hall #3102
> Berkeley, CA 94720-3102
> lab: 510.642.8441
> http://pmb.berkeley.edu/~taylor/people/js.html
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From gbazykin at Princeton.EDU  Thu Oct 26 22:49:56 2006
From: gbazykin at Princeton.EDU (Georgii A Bazykin)
Date: Thu, 26 Oct 2006 18:49:56 -0400
Subject: [Bioperl-l] about PAML running within bioperl
In-Reply-To: <001901c6dbcf$9af4de50$0915020a@zchou>
References: <001901c6dbcf$9af4de50$0915020a@zchou>
Message-ID: <185431468.20061026184956@princeton.edu>

I just had the exact same problem, which (as in Caleb Davis's case)
was solved by switching from PAML 3.15 to PAML 3.14.


------------------------------
Tuesday, September 19, 2006, 5:40:07 AM, you wrote:

> Hello, every one,

> I use the code in the PAML HOWTO (running PAML from within Bioperl) on
> my Linux OS, and I set the ENV variables as described in the instructions.
> At the beginning, ClustalW seems to run smoothly. However, when the
> program gets to the call to the method "get_MLmatrix", something goes wrong.
> The following output was produced (what is the reason, and how can I solve this problem?):
> ........
> Sequences (2:3) Aligned. Score:  87
> Sequences (2:4) Aligned. Score:  88
> Sequences (2:5) Aligned. Score:  87
> Sequences (2:6) Aligned. Score:  87
> Sequences (2:7) Aligned. Score:  87
> Sequences (2:8) Aligned. Score:  87
> Sequences (3:4) Aligned. Score:  93
> Sequences (3:5) Aligned. Score:  93
> Sequences (3:6) Aligned. Score:  93
> Sequences (3:7) Aligned. Score:  92
> Sequences (3:8) Aligned. Score:  92
> Sequences (4:5) Aligned. Score:  99
> Sequences (4:6) Aligned. Score:  99
> Sequences (4:7) Aligned. Score:  98
> Sequences (4:8) Aligned. Score:  98
> Sequences (5:6) Aligned. Score:  100
> Sequences (5:7) Aligned. Score:  99
> Sequences (5:8) Aligned. Score:  99
> Sequences (6:7) Aligned. Score:  99
> Sequences (6:8) Aligned. Score:  99
> Sequences (7:8) Aligned. Score:  100
> Guide tree        file created:  
> [/home/zchou/TMPDIR/8QEqLivAKY/JU833u8OTP.dnd]
> Start of Multiple Alignment
> There are 7 groups
> Aligning...
> Group 1: Sequences:   2      Score:5875
> Group 2: Sequences:   2      Score:5877
> Group 3: Sequences:   4      Score:5864
> Group 4: Sequences:   5      Score:5537
> Group 5: Sequences:   6      Score:5727
> Group 6: Sequences:   7      Score:5608
> Group 7: Sequences:   8      Score:5607
> Alignment Score 43650
> GCG-Alignment file created     
> [/home/zchou/TMPDIR/8QEqLivAKY/CussPD56rZ]
> aligned aa sequences were: Bio::SimpleAlign=HASH(0x87b93f4)
> Can't call method "get_MLmatrix" on an undefined value at
> originalpaml.pl line 57, <GEN2> line 332.


> Zhuocheng Hou
> Department of Animal Genetics and Breeding
> China Agricultural University


From himanshu.ardawatia at bccs.uib.no  Fri Oct 27 01:54:36 2006
From: himanshu.ardawatia at bccs.uib.no (Himanshu Ardawatia)
Date: Fri, 27 Oct 2006 03:54:36 +0200
Subject: [Bioperl-l] Query on tree bootstrap values
Message-ID: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>

Hi,

2 questions :

1. I have a phylogenetic tree and I wish to set (or modify or query)
bootstrap values for all internal nodes. How do I do that using BioPerl?

2. I tried the general-purpose example script attached below on the
example Newick tree with bootstrap values (also attached below), and it gives
strange results, even for branch length. It shows the parent ID as 0.71, which
is actually the bootstrap value for the last ancestral node of human and
chimp, and it shows the child node ID as 'Human'! Am I missing something in
the tree formatting? Results are also attached below. Also, how do I extract,
modify, or add bootstrap values in this tree?

Thanks
Himanshu

EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
#################################
(
  ('Chimp'  : 0.052,
   'Human'  : 0.042) 0.71 : 0.007,
  'Gorilla'  : 0.060,
  ('Gibbon'  : 0.124,
   'Orangutan'  : 0.0971) 1 : 0.038
);
#################################

EXAMPLE SCRIPT:

#################################
#!/usr/bin/perl -w

use Bio::Seq;
# use Bio::TreeIO;
use Bio::Tree::TreeI;

# get a Tree::NodeI somehow
    # like from a TreeIO
    use Bio::TreeIO;
    # read in a clustalw NJ in phylip/newick format
    my $treeio = Bio::TreeIO->new(-format => 'newick',
                                  -file   => 'example_newick_tree.newick');

    # we'll assume it worked for demo purposes;
    # you might want to test that it was defined
    my $tree = $treeio->next_tree;

    my $rootnode = $tree->get_root_node;

    # process just the next generation
    foreach my $node ( $rootnode->each_Descendent() ) {
        print "branch len is ", $node->branch_length, "\n";
    }

    # process all the children
    my $example_leaf_node;
    foreach my $node ( $rootnode->get_Descendents() ) {
        if( $node->is_Leaf ) {
            print "node is a leaf ... ";
            # for example use below
            $example_leaf_node = $node unless defined $example_leaf_node;
        }
        print "branch len is ", $node->branch_length, "\n";
    }

    # The ancestor() method points to the parent of a node
    # A node can only have one parent

    my $parent = $example_leaf_node->ancestor;

    # parent won't likely have a description because it is an internal node
    # but child will because it is a leaf

    print "Parent id: ", $parent->id," child id: ",
          $example_leaf_node->id, "\n";

##########################################

RESULTS:
branch len is  0.007
branch len is  0.060
branch len is  0.038
node is a leaf ... branch len is  0.042
node is a leaf ... branch len is  0.052
branch len is  0.007
node is a leaf ... branch len is  0.060
node is a leaf ... branch len is  0.0971
node is a leaf ... branch len is  0.124
branch len is  0.038
Parent id: _0.71_ child id: ___'Human'__
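
To make the "Parent id" line easier to interpret: in Newick text, the
label written directly after a closing parenthesis belongs to the
internal node that the parenthesis closes, so a reader that stores
labels as IDs will report 0.71 as the parent's ID. The sketch below is
plain Perl and does not use BioPerl; internal_labels is a made-up helper
and the regex is deliberately naive (it would be confused by quoted
labels that themselves contain parentheses).

```perl
use strict;
use warnings;

# Return the labels that directly follow a closing parenthesis, i.e. the
# labels attached to internal nodes.
sub internal_labels {
    my ($newick) = @_;
    my @labels;
    while ($newick =~ /\)\s*([^\s:,;()]+)/g) {
        push @labels, $1;
    }
    return @labels;
}

# The example tree from this message, with the whitespace removed.
my $newick = "(('Chimp':0.052,'Human':0.042)0.71:0.007,'Gorilla':0.060,"
           . "('Gibbon':0.124,'Orangutan':0.0971)1:0.038);";

for my $label (internal_labels($newick)) {
    my $kind = $label =~ /^\d+(?:\.\d+)?$/
             ? 'numeric, probably a support value'
             : 'looks like a node name';
    print "$label: $kind\n";
}
```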


From n.haigh at sheffield.ac.uk  Fri Oct 27 08:42:23 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 08:42:23 +0000
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <C1665C72.B068%bosborne11@verizon.net>
References: <C1665C72.B068%bosborne11@verizon.net>
Message-ID: <4541C66F.1020404@sheffield.ac.uk>

Hi Brian,

I wonder if I'm using is_prototype() correctly, as I don't seem to get
any enzymes returning true:

my $enz_coll = Bio::Restriction::EnzymeCollection->new();
my $prototype = 0;
foreach my $enz ($enz_coll->each_enzyme) {
    $prototype++ if $enz->is_prototype;
}
print "$prototype have unique recognition sites\n";

prints:
0 have unique recognition sites

Thanks
Nath

Brian Osborne wrote:
> Nathan,
>
> Perhaps because most restriction sites are palindromes. Anyway, I added
> tests for palindromic() and is_palindromic() where the site is not a
> palindrome, these tests pass (t/RestrictionAnalyis.t).
>
> Brian O.
>
>
> On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:
>
>   
>> I'm in the middle of writing some code that uses
>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>> Bioperl from HEAD.
>>
>> I seem to find that $enzyme->is_palindromic always seems to return true.
>> Can anyone verify this? If needs be, I can send some code.
>>
>> Thanks
>> Nathan
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>     
>
>
>   


-- 
> A: Yes.
>> Q: Are you sure?
>>     
>>> A: Because it reverses the logical flow of conversation.
>>>       
>>>> Q: Why is top posting frowned upon?
>>>>         
Get Thunderbird <http://www.mozilla.org/products/thunderbird/>


From n.haigh at sheffield.ac.uk  Fri Oct 27 08:47:21 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 08:47:21 +0000
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <001301c6f91f$f9611770$15327e82@pyrimidine>
References: <001301c6f91f$f9611770$15327e82@pyrimidine>
Message-ID: <4541C799.4090507@sheffield.ac.uk>

Chris Fields wrote:
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
>> bounces at lists.open-bio.org] On Behalf Of Nathan Haigh
>> Sent: Thursday, October 26, 2006 11:13 AM
>> To: Bioperl-l at lists.open-bio.org
>> Subject: [Bioperl-l] Bio::Restriction::Enzyme
>>
>> I'm in the middle of writing some code that uses
>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>> Bioperl from HEAD.
>>
>> I seem to find that $enzyme->is_palindromic always seems to return true.
>> Can anyone verify this? If needs be, I can send some code.
>>
>> Thanks
>> Nathan
>>     
>
> You should file a bug report if you have found a test case where this method
> isn't working as it should, especially if Brian's tests pass and you're
> still getting the wrong results.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>
>   

I was doing some filtering of the default set of enzymes and happened to
remove the 2 that are not palindromic before I used is_palindromic().
Thus, I didn't see any that were not palindromic - if that makes sense!
Since I know very little about restriction enzymes, I'll trust that
these are correct :-)  and that I'm getting the correct results.

Thanks
Nath


From n.haigh at sheffield.ac.uk  Fri Oct 27 09:04:40 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 09:04:40 +0000
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>
References: <000301c6f94a$3e2a3f10$15327e82@pyrimidine>
Message-ID: <4541CBA8.10006@sheffield.ac.uk>

Chris Fields wrote:
> I have been running into similar issues with EUtilities tests.  Since the
> data on the server is constantly updated I have to try an future-proof the
> tests so they don't constantly fail.  
>
> I have been using Test::More and like/unlike or cmp_ok to get around some of
> those 'fuzzy data' issues.  If some methods consistently return a particular
> type of value, such as an integer, you could use:
>
> like($foo->get_value, qr{^\d+$}, 'value test'); #integer
>
> or similar.
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign 
>
>   
>> Nathan -
>>
>> I agree - the values tend to change with different versions of the
>> applications unfortunately.  It would make sense to just test that
>> you get out sequences that are in valid alignment format and perhaps
>> have as many ending sequences as you started with.   The more
>> restrictive tests probably aren't reliable with mixing and matching
>> versions.
>>
>> One thing we do for PAML is condition tests on the version used - but
>> of course when a new version comes out we have to add more stuff to
>> the tests (or just have some code that skips those tests).
>>
>> -jason
>> On Oct 26, 2006, at 3:33 AM, Nathan Haigh wrote:
>>
>>     
I think it makes sense to test that data of the expected type was
returned by the xternal resource but not to test the specifics of what
was retured. If specifics are tested we are then in the realm of testing
whether we believe the data returned by the external resource or not. We
should assume that the domain experts for these resources know what they
are doing - in some cases this might not be true :-)  but I think we
should stick to testing that the objects created hold the expected type
of data.

I like what Chris had to say (above) but wonder whether such checks
would/should be done in the module itself - i.e. testing that a
stored value is an integer and warn/throw if not?
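
A rough sketch of "test the type, not the specifics" with Test::More. The
%result hash is a hypothetical stand-in for values parsed from an external
program's output; in a real test they would come from the wrapper object.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical values, standing in for what a wrapper parsed from the
# external program's output.
my %result = (
    num_sequences => 5,
    score         => '1234.5',
    format        => 'clustalw',
);
my $num_input_seqs = 5;    # how many sequences were fed to the program

# Check the type of each value, not the exact number the program produced.
like( $result{score},  qr/^-?\d+(?:\.\d+)?$/, 'score looks like a number' );
like( $result{format}, qr/^\w+$/,             'format is a single word' );

# Structural invariants are safe to test exactly.
cmp_ok( $result{num_sequences}, '==', $num_input_seqs,
    'as many sequences out as went in' );

done_testing();
```

The exact score can change between program versions without breaking any of
these checks.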

Nath


From bix at sendu.me.uk  Fri Oct 27 09:08:18 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 10:08:18 +0100
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>
Message-ID: <4541CC82.2040705@sendu.me.uk>

Himanshu Ardawatia wrote:
> Hi,
> 
> 2 questions :
> 
> 1. I have a phylogenetic tree and I wish to set (or modify or query)
> bootstrap values for all internal nodes. How do I do that using BioPerl ?

Does bootstrap() not do what you need?


> 2. I tried the example script attached below for general purpose for the
> example newick tree with bootstrap values (also attached below) and It gives
> strange results even for branch length. It shows Parent ID as 0.71 which
> actually is the bootstrap value for the last ancestral node for human and
> chimp and It shows the Child node ID as 'Human' ! Am I missing something in
> the tree formatting ? Results also attached below. Also how to extract /
> modify/ add bootstrap values in this tree ?
[snip]
> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
> #################################
> (
>   ('Chimp'  : 0.052,
>    'Human'  : 0.042) 0.71 : 0.007,
>   'Gorilla'  : 0.060,
>   ('Gibbon'  : 0.124,
>    'Orangutan'  : 0.0971) 1 : 0.038
> );
> #################################

Are you sure this is in the correct format?

For example, with the tree:
( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, 
'Gorilla':0.060, 
('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038);

and your script (with a print "--\n" between the two printing loops for 
clarity) I get...

> ##########################################
> 
> RESULTS:
> branch len is  0.007
> branch len is  0.060
> branch len is  0.038
> node is a leaf ... branch len is  0.042
> node is a leaf ... branch len is  0.052
> branch len is  0.007
> node is a leaf ... branch len is  0.060
> node is a leaf ... branch len is  0.0971
> node is a leaf ... branch len is  0.124
> branch len is  0.038
> Parent id: _0.71_ child id: ___'Human'__

...

branch len is 0.007
branch len is 0.060
branch len is 0.038
--
branch len is 0.007
node is a leaf ... branch len is 0.052
node is a leaf ... branch len is 0.042
node is a leaf ... branch len is 0.060
branch len is 0.038
node is a leaf ... branch len is 0.124
node is a leaf ... branch len is 0.0971
Parent id: 'Human_Chimp_Ancestor' child id: 'Chimp'

This seems reasonable to me. What were you expecting?


From n.haigh at sheffield.ac.uk  Fri Oct 27 11:36:10 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 11:36:10 +0000
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541CC82.2040705@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>
	<4541CC82.2040705@sendu.me.uk>
Message-ID: <4541EF2A.4050600@sheffield.ac.uk>

Sendu Bala wrote:
> Himanshu Ardawatia wrote:
>   
>> Hi,
>>
>> 2 questions :
>>
>> 1. I have a phylogenetic tree and I wish to set (or modify or query)
>> bootstrap values for all internal nodes. How do I do that using BioPerl ?
>>     
>
> Does bootstrap() not do what you need?
>
>
>   
>> 2. I tried the example script attached below for general purpose for the
>> example newick tree with bootstrap values (also attached below) and It gives
>> strange results even for branch length. It shows Parent ID as 0.71 which
>> actually is the bootstrap value for the last ancestral node for human and
>> chimp and It shows the Child node ID as 'Human' ! Am I missing something in
>> the tree formatting ? Results also attached below. Also how to extract /
>> modify/ add bootstrap values in this tree ?
>>     
> [snip]
>   
>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>> #################################
>> (
>>   ('Chimp'  : 0.052,
>>    'Human'  : 0.042) 0.71 : 0.007,
>>   'Gorilla'  : 0.060,
>>   ('Gibbon'  : 0.124,
>>    'Orangutan'  : 0.0971) 1 : 0.038
>> );
>> #################################
>>     
>
> Are you sure this is in the correct format?
>   

He/she may have a tree that already contains bootstrap values output
from another program. If this is so, which program did you use? Without
reminding myself of the formats, you should look up newick format and
whether it is possible to store bootstraps in it. In addition you should
also look up the nhx format.

> For example, with the tree:
> ( ('Chimp':0.052,'Human':0.042)'Human_Chimp_Ancestor':0.007, 
> 'Gorilla':0.060, 
> ('Gibbon':0.124,'Orangutan':0.0971)'Gibbon_Orangutan_Ancestor':0.038);
>
>   

This tree does not contain any bootstrap values - only branch lengths.

Sorry I can't be much more help at the moment - if i get a spare 10 mins
i'll have a closer look.
Nath


From bix at sendu.me.uk  Fri Oct 27 11:16:08 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 12:16:08 +0100
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EF2A.4050600@sheffield.ac.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>	<4541CC82.2040705@sendu.me.uk>
	<4541EF2A.4050600@sheffield.ac.uk>
Message-ID: <4541EA78.3050404@sendu.me.uk>

Nathan S. Haigh wrote:
> Sendu Bala wrote:
>> Himanshu Ardawatia wrote:
>>>
>>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>>> #################################
>>> (
>>>   ('Chimp'  : 0.052,
>>>    'Human'  : 0.042) 0.71 : 0.007,
>>>   'Gorilla'  : 0.060,
>>>   ('Gibbon'  : 0.124,
>>>    'Orangutan'  : 0.0971) 1 : 0.038
>>> );
>>> #################################
>>>     
>> Are you sure this is in the correct format?
>>   
> 
> He/she may have a tree that already contains bootstrap values output
> from another program. If this is so, which program did you use? Without
> reminding myself of the formats, you should look up newick format and
> whether it is possible to store bootstraps in it. In addition you should
> also look up the nhx format.

Ah, well from a brief google it seemed like some software does store 
bootstrap values for internal nodes as the node ids when outputting in 
Newick format. I don't think Bioperl should be able to tell the 
difference between a normal id and a bootstrap value, so you'll have to 
detect that yourself and manually use bootstrap() when you get an id 
that looks like a number.
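
A minimal sketch of the "looks like a number" check in plain Perl. The helper
name and the accepted range (proportions 0..1 or percentages 0..100) are
assumptions for illustration, not anything Bioperl provides; a clade that is
genuinely named "42" would be misclassified.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: is this internal node id plausibly a bootstrap value?
# Assumes supports are proportions (0..1) or percentages (0..100).
sub looks_like_bootstrap {
    my ($id) = @_;
    return 0 unless defined $id && length $id;
    return 0 unless looks_like_number($id);
    return ( $id >= 0 && $id <= 100 ) ? 1 : 0;
}

for my $id ( '0.71', '1', 'Human_Chimp_Ancestor', '', '250' ) {
    printf "%-24s => %s\n", "'$id'",
        looks_like_bootstrap($id) ? 'bootstrap?' : 'id';
}
```

Ids that pass the check could then be handed to bootstrap() on the node.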

Or should Bioperl be making this assumption for you? Is that a safe 
thing to do? Maybe as an option only?


From n.haigh at sheffield.ac.uk  Fri Oct 27 12:24:49 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 12:24:49 +0000
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EA78.3050404@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>	<4541CC82.2040705@sendu.me.uk>
	<4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk>
Message-ID: <4541FA91.3040505@sheffield.ac.uk>

--snip--
>
> Ah, well from a brief google it seemed like some software does store
> bootstrap values for internal nodes as the node ids when outputting in
> Newick format. I don't think Bioperl should be able to tell the
> difference between a normal id and a bootstrap value, so you'll have
> to detect that yourself and manually use bootstrap() when you get an
> id that looks like a number.

If I remember rightly, in programs like Clustal you can specify where
bootstrap values are stored - node or branch. I can't remember which is
the default way, but TreeView can only see bootstraps if they are stored
using the "non-default" setting. This "could" be the same issue here.

>
> Or should Bioperl be making this assumption for you? Is that a safe
> thing to do? Maybe as an option only?
I don't know without a closer look - i'd also need to look at the newick
format definition as to whether this is an "extension" to the format or
if something is just flouting the newick rules.

Nath


From n.haigh at sheffield.ac.uk  Fri Oct 27 12:59:51 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 12:59:51 +0000
Subject: [Bioperl-l] Caching sequences
Message-ID: <454202C7.1040701@sheffield.ac.uk>

I have a script that is capable of downloading sequences from GenBank
based on GI numbers. I retrieve them in fasta format in order to save
bandwidth, but I'd like to take this one step further and cache the
sequences in case the user wants to rerun the script using some of the
GI's they used previously.

Does anyone have any guidance on how best to do this?

Cheers
Nath


From bix at sendu.me.uk  Fri Oct 27 12:35:13 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 27 Oct 2006 13:35:13 +0100
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <454202C7.1040701@sheffield.ac.uk>
References: <454202C7.1040701@sheffield.ac.uk>
Message-ID: <4541FD01.6090803@sendu.me.uk>

Nathan S. Haigh wrote:
> I have a script that is capable of downloading sequences from GenBank
> based on GI numbers. I retrieve them in fasta format in order to save
> bandwidth, but I'd like to take this one step further and cache the
> sequences in case the user wants to rerun the script using some of the
> GI's they used previously.
> 
> Does anyone have any guidance on how best to do this?

You'd probably write the sequences out in some suitable format and 
access them via Bio::Index

Or, I'm sure bioperl-db excels at this kind of thing, but is a little 
more involved if this is only a simple situation.
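
For the simple single-user case, "write the sequences out" could be as little
as one FASTA file per GI. The sketch below uses only core Perl; cached_fasta()
and the fetcher code ref are hypothetical names, and the fetcher stands in for
whatever actually talks to GenBank. It does not use Bio::Index or bioperl-db.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical cache: one FASTA file per GI in $dir. $fetch is a code ref
# that returns FASTA text for a GI; it is only called on a cache miss.
sub cached_fasta {
    my ( $dir, $gi, $fetch ) = @_;
    die "GI must be numeric: $gi\n" unless $gi =~ /^\d+$/;
    my $path = File::Spec->catfile( $dir, "$gi.fasta" );

    if ( -s $path ) {    # cache hit: file exists and is not empty
        open my $in, '<', $path or die "Cannot read $path: $!";
        local $/;        # slurp the whole file
        return scalar <$in>;
    }

    my $fasta = $fetch->($gi);    # cache miss: go to the network
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} $fasta;
    close $out or die "Cannot close $path: $!";
    return $fasta;
}

# Demonstration with a fake fetcher that counts how often it is called.
my $dir   = tempdir( CLEANUP => 1 );
my $calls = 0;
my $fake  = sub { $calls++; return ">gi|$_[0]\nACGT\n" };

cached_fasta( $dir, 7525012, $fake ) for 1 .. 3;
print "fetcher called $calls time(s)\n";
```

Only the first of the three requests reaches the fetcher; the other two are
served from the file.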


From bosborne11 at verizon.net  Fri Oct 27 13:09:30 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Fri, 27 Oct 2006 09:09:30 -0400
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <4541C66F.1020404@sheffield.ac.uk>
Message-ID: <C1677D4A.B0AF%bosborne11@verizon.net>

Nathan,

I don't know how this is supposed to work, there would be different ways to
make is_prototype true. One way would be to make the enzyme with the first
occurrence of a given restriction site the prototype (and the next enzymes
with the same site are isoschizomers). Or, one could wait until one site had
appeared twice, with 2 different enzymes, then make the first the prototype,
etc. I would have done it the first way myself but I took a quick look at
IO/withrefm.pm and it looks like it's doing it the second way. That means
one can read an enzyme file and end up with no duplicated restriction sites,
or prototypes and isoschizomers.
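
The two strategies can be sketched in plain Perl on a toy list. This is only
an illustration of the logic described above, not the code in IO/withrefm.pm;
"IsoX" is a made-up enzyme name sharing EcoRI's site.

```perl
use strict;
use warnings;

# Toy input in file order: [ enzyme name, recognition site ].
# "IsoX" is hypothetical.
my @enzymes = (
    [ EcoRI => 'GAATTC' ],
    [ BamHI => 'GGATCC' ],
    [ IsoX  => 'GAATTC' ],
);

# Strategy 1: the first enzyme seen for a site is always the prototype.
sub prototypes_first_seen {
    my %seen;
    return map { $_->[0] } grep { !$seen{ $_->[1] }++ } @_;
}

# Strategy 2: a prototype is only declared once a second enzyme with the
# same site turns up; sites seen once yield no prototype at all.
sub prototypes_on_duplicate {
    my ( %first, %count );
    for my $e (@_) {
        my ( $name, $site ) = @$e;
        $first{$site} = $name unless exists $first{$site};
        $count{$site}++;
    }
    return map { $_->[0] }
        grep { $first{ $_->[1] } eq $_->[0] && $count{ $_->[1] } > 1 } @_;
}

print "first seen:   @{[ prototypes_first_seen(@enzymes) ]}\n";
print "on duplicate: @{[ prototypes_on_duplicate(@enzymes) ]}\n";
```

With a file in which every site occurs once, the second strategy returns an
empty list, which would match a count of 0 prototypes.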

Brian O.


On 10/27/06 4:42 AM, "Nathan S. Haigh" <n.haigh at sheffield.ac.uk> wrote:

> Hi Brian,
> 
> I wonder if i'm using is_prototype() correctly as I don't seem to get
> any returning true:
> 
> my $enz_coll = Bio::Restriction::EnzymeCollection->new();
> my $prototype = 0;
> foreach my $enz ($enz_coll->each_enzyme) {
>     $prototype++ if $enz->is_prototype;
> }
> print "$prototype have unique recognition sites\n";
> 
> prints:
> 0 have unique recognition sites
> 
> Thanks
> Nath
> 
> Brian Osborne wrote:
>> Nathan,
>> 
>> Perhaps because most restriction sites are palindromes. Anyway, I added
>> tests for palindromic() and is_palindromic() where the site is not a
>> palindrome, these tests pass (t/RestrictionAnalyis.t).
>> 
>> Brian O.
>> 
>> 
>> On 10/26/06 12:13 PM, "Nathan Haigh" <n.haigh at sheffield.ac.uk> wrote:
>> 
>>   
>>> I'm in the middle of writing some code that uses
>>> Bio::Restriction::Analysis and Bio::Restriction::Enzyme. I'm using
>>> Bioperl from HEAD.
>>> 
>>> I seem to find that $enzyme->is_palindromic always seems to return true.
>>> Can anyone verify this? If needs be, I can send some code.
>>> 
>>> Thanks
>>> Nathan
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>     
>> 
>> 
>>   
> 


From n.haigh at sheffield.ac.uk  Fri Oct 27 14:19:02 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 14:19:02 +0000
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <C1677D4A.B0AF%bosborne11@verizon.net>
References: <C1677D4A.B0AF%bosborne11@verizon.net>
Message-ID: <45421556.9060300@sheffield.ac.uk>

Brian Osborne wrote:
> Nathan,
>
> I don't know how this is supposed to work, there would be different ways to
> make is_prototype true. One way would be to make the enzyme with the first
> occurrence of a given restriction site the prototype (and the next enzymes
> with the same site are isoschizomers). Or, one could wait until one site had
> appeared twice, with 2 different enzymes, then make the first the prototype,
> etc. I would have done it the first way myself but I took a quick look at
> IO/withrefm.pm and it looks like it's doing it the second way. That means
> one can read an enzyme file and end up with no duplicated restriction sites,
> or prototypes and isoschizomers.
>
> Brian O.
>
>   
Hmm, I'd have done it the first way also. Doing it the second way would
mean you only ended up with something as a prototype if there were
multiple enzymes with the same restriction site - is that correct
biologically?

Nath


From n.haigh at sheffield.ac.uk  Fri Oct 27 14:23:20 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 14:23:20 +0000
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
Message-ID: <45421658.5000103@sheffield.ac.uk>

As you may be aware by now, i'm working with Bio::Restriction::Analysis
and friends.

I'm doing restriction analysis on large sequences - chromosomes. I need
to identify an appropriate enzyme based on the total length of fragments
that are of a certain size (e.g. 100 - 500 bp). However, the amount of
memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I
have the following code (bottom) which downloads 2 thaliana chromosomes
(mito and chloro - so pretty small) and runs an analysis and then loops
through the fragments for all enzymes in the default collection.

My memory usage just keeps on climbing and none seems to get freed up
even when a $ra goes out of scope (start dealing with the next
sequence). Is this a memory leak of some sort, is there a way to free up
memory as I go? I'd appreciate any help/advice on how to reduce the
amount of memory being consumed as I'd like to use all the thaliana
chromosomes (not just mito and chloro), which at the moment probably
won't work.

Cheers
Nath

use strict;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012,  26556996 );

my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
  print "Getting GI: $gi\n";
  push @seq_objs, $db->get_Seq_by_id($gi)
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
  print "Processing ", $seq->primary_id,"\n";
  my $ra = Bio::Restriction::Analysis->new(
                                         -seq=>$seq,
                                         -enzymes=>$enz_Coll,
  );
 
  my @all_enzymes = $ra->cutters->each_enzyme;
  print "  Calc total length of fragments in range: ",
        "$min_fragment_size - $max_fragment_size\n";
  foreach my $enzyme ( @all_enzymes ) {
    my $tot_size = 0;  # reset for each enzyme so totals are per enzyme
    # fragments() is a real memory hog
    foreach my $frag ($ra->fragments($enzyme)) {
      next if $min_fragment_size && (length $frag < $min_fragment_size);
      next if $max_fragment_size && (length $frag > $max_fragment_size);
      $tot_size += length $frag;
    }
    # do something based on value of $tot_size
    #print "    ", $enzyme->name, " total = $tot_size\n";
  }
  print "DONE\n";
}
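
One possible way around the memory use, sketched under assumptions: if only
fragment lengths are needed, they can be computed from cut positions without
building any fragment strings. total_in_range() is a hypothetical helper in
plain Perl; whether Bio::Restriction::Analysis can hand over cut positions
cheaply (for example via a positions() method) should be checked in its POD,
and this will not help if the module builds all fragments internally anyway.

```perl
use strict;
use warnings;

# Given the sequence length and one enzyme's cut positions on a linear
# sequence (a cut at position p falls after base p), return the total length
# of the fragments whose size lies in [$min, $max].
sub total_in_range {
    my ( $seq_len, $min, $max, @cuts ) = @_;
    my ( $total, $prev ) = ( 0, 0 );
    for my $pos ( ( sort { $a <=> $b } @cuts ), $seq_len ) {
        my $len = $pos - $prev;
        $total += $len if $len >= $min && $len <= $max;
        $prev = $pos;
    }
    return $total;
}

# Toy example: a 1000 bp sequence cut after bases 150, 400 and 950.
# The fragments are 150, 250, 550 and 50 bp long.
print total_in_range( 1000, 100, 500, 950, 150, 400 ), "\n";
```

For the toy input only the 150 and 250 bp fragments fall in the 100 - 500
range, so the total is 400.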


From avilella at gmail.com  Fri Oct 27 13:39:41 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:39:41 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
In-Reply-To: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
References: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>
Message-ID: <358f4d650610270639q14870a6erae2e3c4e9063105d@mail.gmail.com>

Replying to myself: I think I found the way:

my $tree = $treeio->next_tree;
my $total_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless (defined($branch_length));  # e.g. the root has none
    $total_branch_length += $branch_length;
}
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    next unless (defined($branch_length));
    $node->branch_length($branch_length/$total_branch_length);
}

# sanity check: the scaled lengths should now sum to (approximately) 1
my $new_branch_length = 0;
foreach my $node ($tree->get_nodes) {
    my $branch_length = $node->branch_length;
    $new_branch_length += $branch_length if defined($branch_length);
}
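
A note on "exactly 1": the scaled lengths are floating point, so the check on
the new sum is safer with a small tolerance than with ==. The sketch below
shows this on a plain list of numbers (the branch lengths from the example
tree elsewhere in this digest); scale_to_unit_sum() is a hypothetical helper,
not a Bio::Tree method.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Scale a list of branch lengths so they sum to 1. Undefined entries (for
# example a root with no branch) are passed through untouched.
sub scale_to_unit_sum {
    my @lengths = @_;
    my $total = sum( 0, grep { defined $_ } @lengths );
    die "total branch length is zero\n" unless $total > 0;
    return map { defined $_ ? $_ / $total : undef } @lengths;
}

my @scaled = scale_to_unit_sum( 0.052, 0.042, 0.007, 0.060, 0.124, 0.0971,
    0.038, undef );
my $sum = sum( 0, grep { defined $_ } @scaled );

# Compare with a tolerance rather than ==, since the sum is floating point.
if ( abs( $sum - 1 ) < 1e-9 ) {
    print "sums to 1 within tolerance\n";
}
else {
    print "off by ", $sum - 1, "\n";
}
```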

On 10/27/06, Albert Vilella <avilella at gmail.com> wrote:
> Hi all,
>
> I am in need of a method that would scale the different branch lengths
> of a tree so that after the scaling they all sum up to exactly 1.
>
> Any pointers? Has anyone done that before?
>
> Thanks in advance,
>
>     Albert.
>


From cjfields at uiuc.edu  Fri Oct 27 14:35:35 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 09:35:35 -0500
Subject: [Bioperl-l] Bioperl-run: Testing alignments generated externally
In-Reply-To: <4541CBA8.10006@sheffield.ac.uk>
Message-ID: <001501c6f9d5$2e33e120$15327e82@pyrimidine>

...
> I think it makes sense to test that data of the expected type was
> returned by the external resource but not to test the specifics of what
> was returned. If specifics are tested we are then in the realm of testing
> whether we believe the data returned by the external resource or not. We
> should assume that the domain experts for these resources know what they
> are doing - in some cases this might not be true :-)  but I think we
> should stick to testing that the objects created hold the expected type
> of data.
> 
> I like what Chris had to say (above) but wonder whether tests
> would/should be tested for in the module itself - i.e. testing that a
> stored value is an integer and warn/throw if not?
> 
> Nath

Yeah, sorry about the top post (stupid Outlook always sticks the sig at the
top of the page!).  

Testing in the module would be best but can be tricky for the very same
reasons that make writing tests tricky, even more so.  For instance, for NCBI
esummary data, I parse the data in a very generic way in order to have
access to as much data as possible.  

For tests, I have to assume that NCBI will always return a particular type
of value (string, integer, date).  I can test for each of those with a regex
in the module fairly simply and throw/warn, as you indicate.  However, if
they decide to add new data with a data tag other than the ones I test for
in the module (i.e. String, Integer, Date), I suddenly have warns/throws
showing up and cluttering/clobbering the code for perfectly valid data.  

However, if these are caught in tests and the tests fail, no big loss.  The
actual module still works, even if the tests are failing based on a new
unknown value being returned.  

For me, failed tests are sort of a warning light to let me know that
something has changed, but it doesn't necessarily mean a module doesn't
work.  I generally use throw/warn for something truly catastrophic, like no
response from the server or an error in the XML, which affects downstream
methods.  
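
One way to picture the trade-off: a per-type check that reports an unknown
type tag instead of throwing. Everything below is a hypothetical sketch in
plain Perl; in particular the Date pattern (YYYY/MM/DD) is an assumption, not
a statement about what NCBI returns.

```perl
use strict;
use warnings;

# Hypothetical validators keyed by the type tag that accompanies each value.
# The Date pattern is an assumed format, for illustration only.
my %check = (
    Integer => qr/^-?\d+$/,
    String  => qr/^.*$/s,
    Date    => qr!^\d{4}/\d{2}/\d{2}!,
);

# Returns 'ok', 'bad' or 'unknown'. An unknown tag is reported rather than
# thrown, so a new server-side type does not break working code.
sub check_value {
    my ( $type, $value ) = @_;
    my $re = $check{$type} or return 'unknown';
    return $value =~ $re ? 'ok' : 'bad';
}

for my $pair ( [ Integer => '42' ], [ Integer => '4.2' ], [ Float => '4.2' ] ) {
    my ( $type, $value ) = @$pair;
    print "$type '$value': ", check_value( $type, $value ), "\n";
}
```

A test script can then fail on 'unknown' as a warning light, while the module
itself carries on.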

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Oct 27 15:09:36 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 10:09:36 -0500
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <454202C7.1040701@sheffield.ac.uk>
Message-ID: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>

> I have a script that is capable of downloading sequences from GenBank
> based on GI numbers. I retrieve them in fasta format in order to save
> bandwidth, but I'd like to take this one step further and cache the
> sequences in case the user wants to rerun the script using some of the
> GI's they used previously.
> 
> Does anyone have any guidance on how best to do this?
> 
> Cheers
> Nath

There is Bio::DB::InMemoryCache, which is really an interface but appears to
have several methods defined; you could look for modules which implement it.
Sendu's suggestion of the Bio::Index modules and bioperl-db are also good
starting points.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From cjfields at uiuc.edu  Fri Oct 27 15:21:49 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 10:21:49 -0500
Subject: [Bioperl-l] Bio::Restriction::Enzyme
In-Reply-To: <45421556.9060300@sheffield.ac.uk>
Message-ID: <001701c6f9db$9f90d160$15327e82@pyrimidine>

> Brian Osborne wrote:
> > Nathan,
> >
> > I don't know how this is supposed to work, there would be different ways
> to
> > make is_prototype true. One way would be to make the enzyme with the
> first
> > occurrence of a given restriction site the prototype (and the next
> enzymes
> > with the same site are isoschizomers). Or, one could wait until one site
> had
> > appeared twice, with 2 different enzymes, then make the first the
> prototype,
> > etc. I would have done it the first way myself but I took a quick look
> at
> > IO/withrefm.pm and it looks like it's doing it the second way. That
> means
> > one can read an enzyme file and end up with no duplicated restriction
> sites,
> > or prototypes and isoschizomers.
> >
> > Brian O.
> >
> >
> Hmm, I'd have done it the first way also. Doing it the second way would
> mean you only ended up with something as a prototype if there were
> multiple enzymes with the same restriction site - is that correct
> biologically?
> 
> Nath

I had a look at all the Restriction::IO modules a while back; most need
serious updating!  It just hasn't been a top priority unfortunately.

I think the prototype issue may depend on the IO format and whether or not
one is defined explicitly in the file being parsed or is just chosen based
on what Brian said (order in the file, similar cutting site).

By the strictest definition (and cheating by looking at the Fermentas web
site), the prototype is supposed to be the first enzyme discovered which
cleaves a unique sequence, so it may not be the first enzyme found in the
file.  Isoschizomers are those discovered to cleave the same sequence
subsequent to the prototype.  Neoschizomers cleave the same sequence as a
prototype but at a different site.

So this calls into question whether the prototype should be defined at all
unless it is specifically indicated in the file.
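
The definitions above can be written down as a small comparison, assuming the
prototype is already known from the file. This is a plain-Perl sketch and not
part of Bio::Restriction; the hash layout and function name are made up, and
treating KpnI as the prototype for GGTACC is an assumption of the example.

```perl
use strict;
use warnings;

# Classify an enzyme relative to a known prototype. Each enzyme is a hash
# with its recognition site and the number of bases before the cut.
sub classify_against_prototype {
    my ( $proto, $enz ) = @_;
    return 'unrelated' unless uc( $proto->{site} ) eq uc( $enz->{site} );
    return $proto->{cut} == $enz->{cut} ? 'isoschizomer' : 'neoschizomer';
}

my $kpni   = { name => 'KpnI',   site => 'GGTACC', cut => 5 };    # GGTAC^C
my $acc65i = { name => 'Acc65I', site => 'GGTACC', cut => 1 };    # G^GTACC
my $ecori  = { name => 'EcoRI',  site => 'GAATTC', cut => 1 };    # G^AATTC

print "$_->{name}: ", classify_against_prototype( $kpni, $_ ), "\n"
    for $acc65i, $ecori;
```

Note that nothing here decides which enzyme is the prototype; that still has
to come from the data file.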

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Fri Oct 27 16:47:53 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 27 Oct 2006 16:47:53 +0000
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com>
References: <454202C7.1040701@sheffield.ac.uk>	
	<001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>
	<8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com>
Message-ID: <45423839.9040503@sheffield.ac.uk>

Jason Stajich wrote:
> Bio::DB::FileCache does one better and lets you cache the data in a
> persistent file.  Not sure this index is shareable among users though
> - bioperl-db is a better soln when that is desired.
Thanks I'll have a look into it. No need for being sharable among users
- not unless the script becomes heavily used.

Thanks
Nath


From cjfields at uiuc.edu  Fri Oct 27 16:15:00 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 11:15:00 -0500
Subject: [Bioperl-l] StandAloneFasta.t bioperl-run tests
Message-ID: <000101c6f9e3$0e5e95d0$15327e82@pyrimidine>

Nathan,

The test fails you posted on the wiki seem to indicate that using the
wrapper works but the order of the returned hits is off.  Does the order of
the returned hits match the actual FASTA report order?  If it does then the
tests need to be fixed in a way to make it more flexible, to account for
some data 'fuzziness' due to variations in output based on different
versions.  
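
If hit order is the only thing that varies, the tests could compare the hits
as a set, or only require the expected top hit to be near the top. A sketch
with Test::More; the hit names are hypothetical and not taken from
StandAloneFasta.t.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical hit names: what the test expects and what the wrapper returned.
my @expected = qw(hitA hitB hitC);
my @got      = qw(hitB hitA hitC);

# Same set of hits, regardless of order.
is_deeply( [ sort @got ], [ sort @expected ], 'same hits, any order' );

# If rank matters only loosely, check that the expected top hit is
# somewhere among the first $n returned.
my $n     = 2;
my $found = grep { $_ eq $expected[0] } @got[ 0 .. $n - 1 ];
ok( $found, "expected top hit is within the first $n" );

done_testing();
```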

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From jason at bioperl.org  Fri Oct 27 16:50:54 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 27 Oct 2006 09:50:54 -0700
Subject: [Bioperl-l] Query on tree bootstrap values
In-Reply-To: <4541EA78.3050404@sendu.me.uk>
References: <62d36e2b0610261854n2ecee1c5rcb1815feab202afb@mail.gmail.com>	<4541CC82.2040705@sendu.me.uk>
	<4541EF2A.4050600@sheffield.ac.uk> <4541EA78.3050404@sendu.me.uk>
Message-ID: <1230E110-01AB-4D4E-842F-20B939555299@bioperl.org>

I've answered to this effect multiple times in the past on the  
mailing list.  Newick format does not distinguish between internal  
ids and bootstrap values (or whatever else you want to attach  
there).  Different programs have different conventions.  When both  
values are present and the bootstrap is encoded so that we can parse  
it out, like this: [BOOTSTRAP], the parser grabs it.  If you  
know all the internal ids are bootstraps you can just copy the values  
over manually very simply:

# get all the internal nodes
for my $node ( grep { ! $_->is_Leaf } $tree->get_nodes ) {
  # copy id to bootstrap
  $node->bootstrap($node->id) if defined $node->id && length($node->id);
  $node->id(''); # set internal id to empty
}

If someone can make this clearer on a wiki page that would be great.

On Oct 27, 2006, at 4:16 AM, Sendu Bala wrote:

> Nathan S. Haigh wrote:
>> Sendu Bala wrote:
>>> Himanshu Ardawatia wrote:
>>>>
>>>> EXAMPLE TREE (Newick with bootstrap values and branch lengths) :
>>>> #################################
>>>> (
>>>>   ('Chimp'  : 0.052,
>>>>    'Human'  : 0.042) 0.71 : 0.007,
>>>>   'Gorilla'  : 0.060,
>>>>   ('Gibbon'  : 0.124,
>>>>    'Orangutan'  : 0.0971) 1 : 0.038
>>>> );
>>>> #################################
>>>>
>>> Are you sure this is in the correct format?
>>>
>>
>> He/she may have a tree that already contains bootstrap values output
>> from another program. If this is so, which program did you use?  
>> Without
>> reminding myself of the formats, you should look up newick format and
>> whether it is possible to store bootstraps in it. In addition you  
>> should
>> also look up the nhx format.
>
> Ah, well from a brief google it seemed like some software does store
> bootstrap values for internal nodes as the node ids when outputting in
> Newick format. I don't think Bioperl should be able to tell the
> difference between a normal id and a bootstrap value, so you'll  
> have to
> detect that yourself and manually use bootstrap() when you get an id
> that looks like a number.
>
> Or should Bioperl be making this assumption for you? Is that a safe
> thing to do? Maybe as an option only?

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From avilella at gmail.com  Fri Oct 27 13:23:07 2006
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 27 Oct 2006 14:23:07 +0100
Subject: [Bioperl-l] scale branch lengths of a tree to sum 1
Message-ID: <358f4d650610270623t13875c8bv50278af5ee5b3c44@mail.gmail.com>

Hi all,

I am in need of a method that would scale the different branch lengths
of a tree so that after the scaling they all sum up to exactly 1.

Any pointers? Has anyone done that before?

Thanks in advance,

    Albert.


From cjfields at uiuc.edu  Fri Oct 27 18:34:57 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 13:34:57 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
Message-ID: <000001c6f9f6$9ab12710$15327e82@pyrimidine>

I am working on refactoring the AlignIO::stockholm parser to get it reading
and writing Pfam/Rfam alignments, and noticed that many alignments have
EMBL-like annotations attached, which pertain to the entire alignment:

# STOCKHOLM 1.0
#=GF ID    ykkC-yxkD
#=GF AC    RF00442
#=GF DE    ykkC-yxkD element
#=GF AU    Moxon SJ
#=GF GA    20.0
#=GF NC    0.1
#=GF TC    59.4
#=GF SE    Barrick JE, Breaker RR
#=GF SS    Predicted; Barrick JE, Breaker RR
#=GF TP    Cis-reg; riboswitch;
#=GF BM    cmbuild CM SEED
#=GF BM    cmsearch -W 175 CM SEQDB
#=GF RN    [1]
#=GF RM    15096624
#=GF RT    New RNA motifs suggest an expanded scope for riboswitches in
#=GF RT    bacterial genetic control.
#=GF RA    Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee
#=GF RA    M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR;
#=GF RL    Proc Natl Acad Sci U S A 2004;101:6421-6426.
#=GF CC    This family represents the bacterial ykkC/yxkD element. The function of
#=GF CC    this family is unclear although it has been suggested that it may function
#=GF CC    to switch on efflux pumps and detoxification systems in response to harmful
#=GF CC    environmental molecules [1]. The Thermoanaerobacter tengcongensis sequence
#=GF CC    EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that the two
#=GF CC    riboswitches may work in conjunction to regulate the the upstream gene
#=GF CC    which codes for Swiss:Q8RC62, a member of Pfam:PF00860 (Personal obs. Moxon
#=GF CC    SJ).
#=GF SQ    16

SimpleAlign, as implemented, seemingly doesn't have a way to store this
information.

I'll work on getting the core alignment IO working, but would there be any
interest in having a way to store annotations in Bio::SimpleAlign?  I'm
guessing the methods would be similar to the various Bio::Seq Annotation
methods.
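
To make the shape of the data concrete, here is a plain-Perl sketch that
collects the #=GF lines into a hash of lists keyed by tag. It is neither the
AlignIO::stockholm parser nor a proposed SimpleAlign API; parse_gf() is a
hypothetical name, and the input is a few lines from the header above.

```perl
use strict;
use warnings;

# Collect "#=GF <tag> <text>" lines into a hash of array refs, one entry per
# line, so repeated tags (BM, CC, RA ...) keep their order.
sub parse_gf {
    my (@lines) = @_;
    my %annot;
    for my $line (@lines) {
        next unless $line =~ /^#=GF\s+(\S+)\s+(.*?)\s*$/;
        push @{ $annot{$1} }, $2;
    }
    return \%annot;
}

my @header = (
    '# STOCKHOLM 1.0',
    '#=GF ID    ykkC-yxkD',
    '#=GF AC    RF00442',
    '#=GF BM    cmbuild CM SEED',
    '#=GF BM    cmsearch -W 175 CM SEQDB',
    '#=GF SQ    16',
);
my $annot = parse_gf(@header);
print "AC: $annot->{AC}[0]\n";
print "BM: ", join( ' | ', @{ $annot->{BM} } ), "\n";
```

Whatever annotation storage the alignment object ends up with would need to
cope with repeated tags such as BM and CC in the same way.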

Chris

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 


From hlapp at gmx.net  Fri Oct 27 20:23:46 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 27 Oct 2006 16:23:46 -0400
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
Message-ID: <C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>

You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose  
this is what you meant by the 'various Bio::Seq Annotation methods'  
too.)

Just to make sure I'm not misunderstanding, I suppose the annotation  
pertains to the entire alignment?

	-hilmar

On Oct 27, 2006, at 2:34 PM, Chris Fields wrote:

> I am working on refactoring the AlignIO::stockholm parser to get it  
> reading
> and writing Pfam/Rfam alignments, and noticed that many alignments  
> have
> EMBL-like annotations attached, which pertain to the entire alignment:
>
> # STOCKHOLM 1.0
> #=GF ID    ykkC-yxkD
> #=GF AC    RF00442
> #=GF DE    ykkC-yxkD element
> #=GF AU    Moxon SJ
> #=GF GA    20.0
> #=GF NC    0.1
> #=GF TC    59.4
> #=GF SE    Barrick JE, Breaker RR
> #=GF SS    Predicted; Barrick JE, Breaker RR
> #=GF TP    Cis-reg; riboswitch;
> #=GF BM    cmbuild CM SEED
> #=GF BM    cmsearch -W 175 CM SEQDB
> #=GF RN    [1]
> #=GF RM    15096624
> #=GF RT    New RNA motifs suggest an expanded scope for riboswitches in
> #=GF RT    bacterial genetic control.
> #=GF RA    Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, Collins J, Lee
> #=GF RA    M, Roth A, Sudarsan N, Jona I, Wickiser JK, Breaker RR;
> #=GF RL    Proc Natl Acad Sci U S A 2004;101:6421-6426.
> #=GF CC    This family represents the bacterial ykkC/yxkD element. The function of
> #=GF CC    this family is unclear although it has been suggested that it may function
> #=GF CC    to switch on efflux pumps and detoxification systems in response to harmful
> #=GF CC    environmental molecules [1]. The Thermoanaerobacter tengcongensis sequence
> #=GF CC    EMBL:AE013027 overlaps with Rfam:RF00167 suggesting that the two
> #=GF CC    riboswitches may work in conjunction to regulate the the upstream gene
> #=GF CC    which codes for Swiss:Q8RC62, a member of Pfam:PF00860 (Personal obs. Moxon
> #=GF CC    SJ).
> #=GF SQ    16
>
> SimpleAlign, as implemented, seemingly doesn't have a way to store this
> information.
>
> I'll work on getting the core alignment IO working, but would there be any
> interest in having a way to store annotations in Bio::SimpleAlign?  I'm
> guessing the methods would be similar to the various Bio::Seq Annotation
> methods.
>
> Chris
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Fri Oct 27 20:38:17 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 15:38:17 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
Message-ID: <000001c6fa07$d8659990$15327e82@pyrimidine>

Hilmar Lapp wrote:
> You could make SimpleAlign be a Bio::AnnotationHolderI. (I
> suppose this is what you meant by the 'various Bio::Seq Annotation
> methods' too.)
> 
> Just to make sure I'm not misunderstanding, I suppose the
> annotation pertains to the entire alignment?
> 
> 	-hilmar
...

Yes, that's correct.  I would probably use Bio::Seq::Meta for the
sequence-specific markup lines.  I would have to add another new method to
deal with non-sequence-based consensus data (like sec. structure) for now.
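
As a rough illustration of the alignment-level part only, the #=GF lines
could be collected into a hash keyed by tag before being handed to whatever
annotation container is chosen.  This is a sketch in plain Perl, not the
AlignIO::stockholm code; parse_gf_lines is a hypothetical helper name and
the sample lines are taken from the RF00442 header quoted earlier in the
thread.

```perl
use strict;
use warnings;

# Collect alignment-level "#=GF <tag> <text>" lines into a hash of array refs.
# Repeated tags (RT, RA, CC, BM ...) keep their lines in file order.
# parse_gf_lines is a hypothetical name used for illustration only.
sub parse_gf_lines {
    my (@lines) = @_;
    my %annot;
    for my $line (@lines) {
        next unless $line =~ /^#=GF\s+(\S+)\s+(.*?)\s*$/;
        push @{ $annot{$1} }, $2;
    }
    return \%annot;
}

my @example = (
    '# STOCKHOLM 1.0',
    '#=GF ID    ykkC-yxkD',
    '#=GF AC    RF00442',
    '#=GF BM    cmbuild CM SEED',
    '#=GF BM    cmsearch -W 175 CM SEQDB',
);

my $gf = parse_gf_lines(@example);
print "$_: ", join(' | ', @{ $gf->{$_} }), "\n" for sort keys %$gf;
```

Keeping an array per tag means multi-line fields such as CC or RT can be
joined later without losing the original line breaks.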

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Fri Oct 27 15:38:05 2006
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 27 Oct 2006 08:38:05 -0700
Subject: [Bioperl-l] Caching sequences
In-Reply-To: <001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>
References: <454202C7.1040701@sheffield.ac.uk>
	<001601c6f9d9$ebd8c7f0$15327e82@pyrimidine>
Message-ID: <8273f6c20610270838y76bf0dd1haed77615f46a82fc@mail.gmail.com>

Bio::DB::FileCache does one better and lets you cache the data in a
persistent file.  Not sure this index is shareable among users though -
bioperl-db is a better soln when that is desired.
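
To make the "persistent file" idea concrete without depending on any BioPerl
class, here is a minimal sketch of a fetch-or-cache wrapper using the core
Storable module.  It is not how Bio::DB::FileCache is implemented;
cached_fetch, the cache file name and the fake fetch routine are all
assumptions made for the example.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Return the cached value for $id, or call $fetch->($id), store the result
# in the on-disk cache, and return it.  (Hypothetical helper, not BioPerl.)
sub cached_fetch {
    my ($cache_file, $id, $fetch) = @_;
    my $cache = -e $cache_file ? retrieve($cache_file) : {};
    unless (exists $cache->{$id}) {
        $cache->{$id} = $fetch->($id);
        store($cache, $cache_file);
    }
    return $cache->{$id};
}

# Stand-in for the real GenBank lookup; counts how often it is called.
my $remote_calls = 0;
my $fake_fetch = sub {
    my ($gi) = @_;
    $remote_calls++;
    return ">gi|$gi\nACGT\n";
};

my $cache_file = tempdir(CLEANUP => 1) . '/seq_cache.sto';
my $first  = cached_fetch($cache_file, 7525012, $fake_fetch);
my $second = cached_fetch($cache_file, 7525012, $fake_fetch);
print "remote calls: $remote_calls\n";
```

Rewriting the whole cache file on every miss is fine for a handful of
sequences but would not scale to many; an indexed store is the better fit
there.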

-jason

On 10/27/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> > I have a script that is capable of downloading sequences from GenBank
> > based on GI numbers. I retrieve them in fasta format in order to save
> > bandwidth, but I'd like to take this one step further and cache the
> > sequences in case the user wants to rerun the script using some of the
> > GIs they used previously.
> >
> > Does anyone have any guidance on how best to do this?
> >
> > Cheers
> > Nath
>
> There is Bio::DB::InMemoryCache, which is really an interface but appears
> to have several methods defined; you could look for modules which
> implement it.  Sendu's suggestion of the Bio::Index modules and bioperl-db
> are also good starting points.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>


-- 
Jason Stajich
jason at bioperl.org
http://www.duke.edu/~jes12/


From cjfields at uiuc.edu  Sat Oct 28 01:57:58 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 27 Oct 2006 20:57:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
Message-ID: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>


On Oct 27, 2006, at 3:23 PM, Hilmar Lapp wrote:

> You could make SimpleAlign be a Bio::AnnotationHolderI. (I suppose
> this is what you meant by the 'various Bio::Seq Annotation methods'
> too.)
>
> Just to make sure I'm not misunderstanding, I suppose the annotation
> pertains to the entire alignment?
>
> 	-hilmar

BTW, was that supposed to be Bio::AnnotatableI, or  
Bio::AnnotationHolderI?  The latter isn't present in CVS HEAD.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From eric.ross at neuro.utah.edu  Sat Oct 28 21:24:30 2006
From: eric.ross at neuro.utah.edu (Eric Ross)
Date: Sat, 28 Oct 2006 15:24:30 -0600
Subject: [Bioperl-l] PAML
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>

I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object.

I am able to extract other data from the report, but there seems to be a conflict in the documentation.  One doc implies that there should be a get_posteriors method. (It's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object. 


I have been trying various methods, in the event I'm just "confused", but I've had no luck, thus far.  Anyone have suggestions?


code:

----begin code-------
#!/usr/bin/perl -w

use strict;


use Bio::Tools::Phylo::PAML;
my $parser = new Bio::Tools::Phylo::PAML
             (-file => "mlc");
my $result = $parser->next_result;
my @posteriors = $result->get_posteriors();

print "@posteriors";

exit(0);

---------end code-------------


---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab


From avilella at gmail.com  Sun Oct 29 10:52:04 2006
From: avilella at gmail.com (Albert Vilella)
Date: Sun, 29 Oct 2006 10:52:04 +0000
Subject: [Bioperl-l] PAML
In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>

I don't know if this method is implemented. I can't grep-find it.
Maybe it's simply not there yet, but was planned when the
documentation was written.

On 10/28/06, Eric Ross <eric.ross at neuro.utah.edu> wrote:
> I am trying to extract the "Naive Empirical Bayes (NEB) probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>
> I am able to extract other data from the report, but there seems to be a conflict in the documentation.  One doc implies that there should be a get_posteriors method. (It's used as an example in the Bio::Tools::Phylo::PAML doc), but the method does not appear to exist in the Bio::Tools::Phylo::PAML::Result object.
>
>
> I have been trying various methods, in the event I'm just "confused", but I've had no luck, thus far.  Anyone have suggestions?
>
>
> code:
>
> ----begin code-------
> #!/usr/bin/perl -w
>
> use strict;
>
>
> use Bio::Tools::Phylo::PAML;
> my $parser = new Bio::Tools::Phylo::PAML
>              (-file => "mlc");
> my $result = $parser->next_result;
> my @posteriors = $result->get_posteriors();
>
> print "@posteriors";
>
> exit(0);
>
> ---------end code-------------
>
>
>
> ---------------
> Eric Ross
> Computer Analyst II
> ejr at neuro.utah.edu
> Howard Hughes Medical Institute
> University of Utah
> Sánchez Lab
>
>
>
>


From cjfields at uiuc.edu  Sun Oct 29 14:23:45 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Oct 2006 08:23:45 -0600
Subject: [Bioperl-l] PAML
In-Reply-To: <358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
	<358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
Message-ID: <9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>

Does the data show up in the object using Data::Dumper?
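
For anyone who has not used it this way, a minimal sketch of dumping an
object follows.  The blessed hash is a made-up stand-in; the keys inside a
real Bio::Tools::Phylo::PAML::Result will be different, so treat the names
here as assumptions.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for a parsed result object; the real
# Bio::Tools::Phylo::PAML::Result will have different keys.
my $result = bless { _example_key => [ 0.95, 0.99 ] }, 'My::Fake::Result';

local $Data::Dumper::Sortkeys = 1;   # stable key order in the output
local $Data::Dumper::Indent   = 1;   # compact indentation

my $dump = Dumper($result);
print $dump;
```

In the real script, passing $result from next_result to Dumper in the same
way would show whether the posterior data was parsed at all.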

This should be filed as a bug since the docs imply the method
exists.  This could be written up fairly quickly if one had test data
and a script to work with (hint hint...)

Chris

On Oct 29, 2006, at 4:52 AM, Albert Vilella wrote:

> I don't know if this method is implemented. I can't grep-find it.
> Maybe it's simply not there yet, but was planned when the
> documentation was written.
>
> On 10/28/06, Eric Ross <eric.ross at neuro.utah.edu> wrote:
>> I am trying to extract the "Naive Empirical Bayes (NEB)  
>> probabilities" from a Bio::Tools::Phylo::PAML::Result object.
>>
>> I am able to extract other data from the report, but there seems  
>> to be a conflict in the documentation.  One doc implies that there  
>> should be a get_posteriors method. (It's used as an example in the  
>> Bio::Tools::Phylo::PAML doc), but the method does not appear to  
>> exist in the Bio::Tools::Phylo::PAML::Result object.
>>
>>
>> I have been trying various methods, in the event I'm just  
>> "confused", but I've had no luck, thus far.  Anyone have suggestions?
>>
>>
>> code:
>>
>> ----begin code-------
>> #!/usr/bin/perl -w
>>
>> use strict;
>>
>>
>> use Bio::Tools::Phylo::PAML;
>> my $parser = new Bio::Tools::Phylo::PAML
>>              (-file => "mlc");
>> my $result = $parser->next_result;
>> my @posteriors = $result->get_posteriors();
>>
>> print "@posteriors";
>>
>> exit(0);
>>
>> ---------end code-------------
>>
>>
>>
>> ---------------
>> Eric Ross
>> Computer Analyst II
>> ejr at neuro.utah.edu
>> Howard Hughes Medical Institute
>> University of Utah
>> Sánchez Lab
>>
>>
>>
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From eric.ross at neuro.utah.edu  Sun Oct 29 17:06:54 2006
From: eric.ross at neuro.utah.edu (Eric Ross)
Date: Sun, 29 Oct 2006 10:06:54 -0700
Subject: [Bioperl-l] PAML
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
	<358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
	<9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>
Message-ID: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu>

Thanks for all the help.

I've been looking at the code for the PAML rst parser.  It's a bit tricky. 

We have written a parser specific for our needs, but it looks to be a pretty complicated matter to make it generic.  

The output of PAML can vary a lot depending upon your options and this section can be repeated multiple times.  I'm sure someone with a good grasp of the potential output of PAML could come up with something, but I'll admit to being at a loss. 


---------------
Eric Ross
Computer Analyst II
ejr at neuro.utah.edu
Howard Hughes Medical Institute
University of Utah
Sánchez Lab




From n.haigh at sheffield.ac.uk  Sun Oct 29 17:43:20 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 29 Oct 2006 17:43:20 +0000
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
In-Reply-To: <45421658.5000103@sheffield.ac.uk>
References: <45421658.5000103@sheffield.ac.uk>
Message-ID: <4544E838.7090400@sheffield.ac.uk>

Sorry for the repeat post but I haven't had a response. Just wondered if 
anyone had any idea about this?

Thanks
Nath

Nathan S. Haigh wrote:
> As you may be aware by now, I'm working with Bio::Restriction::Analysis
> and friends.
>
> I'm doing restriction analysis on large sequences - chromosomes. I need
> to identify an appropriate enzyme based on the total length of fragments
> that are of a certain size (e.g. 100 - 500 bp). However, the amount of
> memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I
> have the following code (bottom) which downloads 2 thaliana chromosomes
> (mito and chloro - so pretty small) and runs an analysis and then loops
> through the fragments for all enzymes in the default collection.
>
> My memory usage just keeps on climbing and none seems to get freed up
> even when a $ra goes out of scope (start dealing with the next
> sequence). Is this a memory leak of some sort, is there a way to free up
> memory as I go? I'd appreciate any help/advice on how to reduce the
> amount of memory being consumed as I'd like to use all the thaliana
> chromosomes (not just mito and chloro), which at the moment probably
> won't work.
>
> Cheers
> Nath
>
> use strict;
> use Bio::DB::GenBank;
> use Bio::Restriction::Analysis;
> use Bio::Restriction::EnzymeCollection;
>
> my @seq_objs;
> my @gis = ( 7525012,  26556996 );
>
> my $db = Bio::DB::GenBank->new(-format => "fasta");
> foreach my $gi (@gis) {
>   print "Getting GI: $gi\n";
>   push @seq_objs, $db->get_Seq_by_id($gi)
> }
>
> my $min_fragment_size = 100;
> my $max_fragment_size = 500;
> my $enz_Coll = Bio::Restriction::EnzymeCollection->new();
>
> foreach my $seq (@seq_objs) {
>   my $tot_size = 0;
>   print "Processing ", $seq->primary_id,"\n";
>   my $ra = Bio::Restriction::Analysis->new(
>                                          -seq=>$seq,
>                                          -enzymes=>$enz_Coll,
>   );
>  
>   my @all_enzymes = $ra->cutters->each_enzyme;
>   print "  Calc total length of fragments in range: $min_fragment_size - $max_fragment_size\n";
>   foreach my $enzyme ( @all_enzymes ) {
>     # fragments() is a real memory hog
>     foreach my $frag ($ra->fragments($enzyme)) {
>       next if $min_fragment_size && (length $frag < $min_fragment_size);
>       next if $max_fragment_size && (length $frag > $max_fragment_size);
>       $tot_size += length $frag;
>     }
>     # do something based on value of $tot_size
>     #print "    ", $enzyme->name, " total = $tot_size\n";
>   }
>   print "DONE\n";
> }
>


From cjfields at uiuc.edu  Sun Oct 29 18:09:54 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Oct 2006 12:09:54 -0600
Subject: [Bioperl-l] PAML
In-Reply-To: <2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu>
References: <2DCC13A920E6C640AD6257DEFA01F78408A3EF@CAMPUSV1.xds.umail.utah.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F0@CAMPUSV1.xds.umail.utah.edu>
	<358f4d650610290252j5e6318aase8cede2b0f04590a@mail.gmail.com>
	<9E776AD6-76E1-403A-8003-C27958F21D90@uiuc.edu>
	<2DCC13A920E6C640AD6257DEFA01F78408A3F5@CAMPUSV1.xds.umail.utah.edu>
Message-ID: <C775A898-5D18-48F6-874F-3B359C1A10C5@uiuc.edu>

On Oct 29, 2006, at 11:06 AM, Eric Ross wrote:

> Thanks for all the help.
>
> I've been looking at the code for the PAML rst parser.  It's a bit  
> tricky.
>
> We have written a parser specific for our needs, but it looks to be  
> a pretty complicated matter to make it generic.
>
> The output of PAML can vary a lot depending upon your options and  
> this section can be repeated multiple times.  I'm sure someone with  
> a good grasp of the potential output of PAML could come up with  
> something, but I'll admit to being at a loss.

Eric,

I planned on looking at ways to integrate the protein-based PAML  
programs but I'm working on a different area at the moment.  I agree  
it may be hard to adequately genericize parsing/methods to accomplish  
this, but if you have any ideas feel free to post them.  Again, I  
would suggest adding any proposed enhancements or bugs to Bugzilla:

http://bugzilla.open-bio.org/

Suggestions or bug reports on the list sometimes get lost in the  
shuffle, esp. since we're planning on a new developer release soon.

Chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sun Oct 29 18:16:37 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 29 Oct 2006 12:16:37 -0600
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
In-Reply-To: <4544E838.7090400@sheffield.ac.uk>
References: <45421658.5000103@sheffield.ac.uk>
	<4544E838.7090400@sheffield.ac.uk>
Message-ID: <6D9EAA04-199C-4BDD-AA60-4833BC1CE250@uiuc.edu>


On Oct 29, 2006, at 11:43 AM, Nathan S. Haigh wrote:

> Sorry for the repeat post but I haven't had a response. Just  
> wondered if
> anyone had any idea about this?
>
> Thanks
> Nath

...

I think Warnock's Dilemma applies here.  Likely no one is really sure, hence
they aren't answering.  It probably bears investigating by submitting  
and tracking as a bug.  My guess is something isn't garbage-collected  
properly (i.e. there are circular references present), leading to a  
memory leak.
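
To show what I mean by circular references in general terms (this is not
taken from Bio::Restriction::Analysis, and the Node class is invented for
the example), here is a small sketch.  Two objects that point at each other
are never freed by Perl's reference counting, while weakening one of the
links with Scalar::Util lets both go.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;

# Node is a made-up class; DESTROY just counts how many objects were freed.
package Node;
sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub DESTROY { $destroyed++ }

package main;

# Strong cycle: parent and child hold each other, so neither refcount
# reaches zero when the lexicals go out of scope.
{
    my $parent = Node->new(name => 'parent');
    my $child  = Node->new(name => 'child', parent => $parent);
    $parent->{child} = $child;
}
my $after_strong = $destroyed;

# Same structure, but the back-reference is weakened, so both are freed.
{
    my $parent = Node->new(name => 'parent');
    my $child  = Node->new(name => 'child', parent => $parent);
    $parent->{child} = $child;
    weaken($child->{parent});
}
my $after_weak = $destroyed;

print "freed with strong cycle: $after_strong\n";
print "freed with weakened back-reference: ", $after_weak - $after_strong, "\n";
```

If something like the first block is happening inside the module, memory
would keep climbing exactly as described, but that is a guess until someone
checks the code.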

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From chhalling at alumni.ls.berkeley.edu  Sun Oct 29 19:16:36 2006
From: chhalling at alumni.ls.berkeley.edu (Conrad Halling)
Date: Sun, 29 Oct 2006 14:16:36 -0500
Subject: [Bioperl-l] Bio::Restriction::Analysis::fragments() memory usage
In-Reply-To: <4544E838.7090400@sheffield.ac.uk>
References: <45421658.5000103@sheffield.ac.uk>
	<4544E838.7090400@sheffield.ac.uk>
Message-ID: <4544FE14.7030701@alumni.ls.berkeley.edu>

Nathan S. Haigh wrote:
> Sorry for the repeat post but I haven't had a response. Just wondered if 
> anyone had any idea about this?
>
> Thanks
> Nath
>
> Nathan S. Haigh wrote:
>   
>> As you may be aware by now, I'm working with Bio::Restriction::Analysis
>> and friends.
>>
>> I'm doing restriction analysis on large sequences - chromosomes. I need
>> to identify an appropriate enzyme based on the total length of fragments
>> that are of a certain size (e.g. 100 - 500 bp). However, the amount of
>> memory used by Bio::Restriction::Analysis::fragments() is prohibitive. I
>> have the following code (bottom) which downloads 2 thaliana chromosomes
>> (mito and chloro - so pretty small) and runs an analysis and then loops
>> through the fragments for all enzymes in the default collection.
>>
>> My memory usage just keeps on climbing and none seems to get freed up
>> even when a $ra goes out of scope (start dealing with the next
>> sequence). Is this a memory leak of some sort, is there a way to free up
>> memory as I go? I'd appreciate any help/advice on how to reduce the
>> amount of memory being consumed as I'd like to use all the thaliana
>> chromosomes (not just mito and chloro), which at the moment probably
>> won't work.
>>
>> Cheers
>> Nath
>>
>> use strict;
>> use Bio::DB::GenBank;
>> use Bio::Restriction::Analysis;
>> use Bio::Restriction::EnzymeCollection;
>>
>> my @seq_objs;
>> my @gis = ( 7525012,  26556996 );
>>
>> my $db = Bio::DB::GenBank->new(-format => "fasta");
>> foreach my $gi (@gis) {
>>   print "Getting GI: $gi\n";
>>   push @seq_objs, $db->get_Seq_by_id($gi)
>> }
>>
>> my $min_fragment_size = 100;
>> my $max_fragment_size = 500;
>> my $enz_Coll = Bio::Restriction::EnzymeCollection->new();
>>
>> foreach my $seq (@seq_objs) {
>>   my $tot_size = 0;
>>   print "Processing ", $seq->primary_id,"\n";
>>   my $ra = Bio::Restriction::Analysis->new(
>>                                          -seq=>$seq,
>>                                          -enzymes=>$enz_Coll,
>>   );
>>  
>>   my @all_enzymes = $ra->cutters->each_enzyme;
>>   print "  Calc total length of fragments in range: $min_fragment_size - $max_fragment_size\n";
>>   foreach my $enzyme ( @all_enzymes ) {
>>     # fragments() is a real memory hog
>>     foreach my $frag ($ra->fragments($enzyme)) {
>>       next if $min_fragment_size && (length $frag < $min_fragment_size);
>>       next if $max_fragment_size && (length $frag > $max_fragment_size);
>>       $tot_size += length $frag;
>>     }
>>     # do something based on value of $tot_size
>>     #print "    ", $enzyme->name, " total = $tot_size\n";
>>   }
>>   print "DONE\n";
>> }
>>
>>     
Try this code, which creates a new Bio::Restriction::Analysis object for 
each digest. On my PowerBook, this doesn't use more than 13 MB of memory.

Reading the code for Bio::Restriction::Analysis reveals that the 
fragments() method calls the cut() method. The documentation for the cut 
method states:

Note: cut doesn't now re-initialize everything before figuring out
cuts. This is so that you can do multiple digests, or add more data or
whatever. You'll have to use new to reset everything.

This means there is no memory leak; it's just that the 
Bio::Restriction::Analysis object is retaining cut information for each 
enzyme, which takes a lot of memory.

use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;

my @seq_objs;
my @gis = ( 7525012,  26556996 );

my $db = Bio::DB::GenBank->new(-format => "fasta");
foreach my $gi (@gis) {
  print "Getting GI: $gi\n";
  push @seq_objs, $db->get_Seq_by_id($gi)
}

my $min_fragment_size = 100;
my $max_fragment_size = 500;
my $enz_Coll = Bio::Restriction::EnzymeCollection->new();

foreach my $seq (@seq_objs) {
  print "Processing ", $seq->primary_id, "\n";
  foreach my $enzyme ( $enz_Coll->each_enzyme() ) {
    my $ra = Bio::Restriction::Analysis->new(
      -seq => $seq,
      -enzymes => $enzyme );
    my $tot_size = 0;
 
    print "  Calc total length of fragments in range: $min_fragment_size -" .
      " $max_fragment_size\n";

    foreach my $frag ($ra->fragments($enzyme)) {
      next if $min_fragment_size && (length $frag < $min_fragment_size);
      next if $max_fragment_size && (length $frag > $max_fragment_size);
      $tot_size += length $frag;
    }
    # do something based on value of $tot_size
    print "    ", $enzyme->name, " total = $tot_size\n";
  }
  print "DONE\n";
}
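
The size filter in the inner loop could also be pulled out into a small
helper, which makes it easy to check on made-up fragments without running a
digest.  A sketch only: total_length_in_range is a hypothetical name, and
the strings below stand in for the fragment strings the loop above takes
the length of.

```perl
use strict;
use warnings;

# Sum the lengths of the fragments whose length lies in [$min, $max].
# An undefined or zero bound is treated as "no limit", as in the loop above.
sub total_length_in_range {
    my ($frags, $min, $max) = @_;
    my $total = 0;
    for my $frag (@$frags) {
        my $len = length $frag;
        next if $min && $len < $min;
        next if $max && $len > $max;
        $total += $len;
    }
    return $total;
}

# Made-up fragments of 50, 100, 300 and 501 bp.
my @frags = ('A' x 50, 'C' x 100, 'G' x 300, 'T' x 501);
my $total = total_length_in_range(\@frags, 100, 500);
print "total = $total\n";
```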

-- 
Conrad Halling
chhalling at alumni.ls.berkeley.edu


From n.haigh at sheffield.ac.uk  Mon Oct 30 08:51:49 2006
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Mon, 30 Oct 2006 08:51:49 +0000
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
Message-ID: <4545BD25.3030107@sheffield.ac.uk>

In my script I retrieve sequences from GenBank in FASTA format by GI
numbers and optionally store the sequence in a cache using
Bio::DB::Fasta. On subsequent runs of the script, the cache is first
checked for the GI and returns the sequence if it is found or the
sequence is obtained from GenBank as above.

I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have
returned a Bio::Seq object but rather it returns a Bio::PrimarySeq
object which is defined within the Bio::DB::Fasta file. This is
annoying, since $seq_obj in my script would be either a Bio::Seq if it
was obtained from GenBank or a Bio::PrimarySeq if obtained from the
cache and calling primary_id() on it doesn't do the expected thing with
Bio::PrimarySeq:
ID:          Bio::PrimarySeq::Fasta=HASH(0x89b4508)

Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object?

Nath


From yuhki at ncifcrf.gov  Mon Oct 30 13:57:35 2006
From: yuhki at ncifcrf.gov (Naoya Yuhki)
Date: Mon, 30 Oct 2006 08:57:35 -0500
Subject: [Bioperl-l] bptutorial.pl 0
Message-ID: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov>

Hello,
I ran

perl bptutorial.pl 0

and got the following error:

-------------------- WARNING ---------------------
MSG: id (ROA1_HUMAN) does not exist
---------------------------------------------------
Can't call method "display_id" on an undefined value at bptutorial.pl line 3945.

All the other tests worked.

I would appreciate any suggestions.

NAOYA YUHKI.


From cjfields at uiuc.edu  Mon Oct 30 17:42:21 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 30 Oct 2006 11:42:21 -0600
Subject: [Bioperl-l] bptutorial.pl 0
In-Reply-To: <46170A95-13C8-4839-BD70-EF086ED1E0DA@ncifcrf.gov>
Message-ID: <000601c6fc4a$c3e43450$15327e82@pyrimidine>

> Hello,
> I run
> 
> perl bptutorial.pl 0
> 
> and I got the following error.
> 
> -------------------- WARNING ---------------------
> MSG: id (ROA1_HUMAN) does not exist
> ---------------------------------------------------
> Can't call method "display_id" on an undefined value at bptutorial.pl line 3945.
> 
> other tests all worked.
> 
> I thank any suggestions from you.
> 
> NAOYA YUHKI.

What version of Bioperl are you running?  

As a warning, the bptutorial.pl script has been removed from CVS and will
not be included in future versions of Bioperl.  It can be found on the
bioperl wiki instead:

http://www.bioperl.org/wiki/Bptutorial

chris


Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Mon Oct 30 18:08:15 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 30 Oct 2006 10:08:15 -0800
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <29F47393-D134-4093-8751-E948BF521843@bioperl.org>

Bio::PrimarySeq makes sense because Fasta databases only provide  
sequences without features.  But you are actually getting a  
Bio::PrimarySeq::Fasta object which is a proxy object since the  
module won't pull a whole sequence into memory unless seq() is  
requested.

The problem is really why you are getting something useless set for  
primary_id.

What do you want it to be - the GI number?  You'll need to explicitly
set it because DB::Fasta has no concept of GI numbers encoded in the
header line.
AFAIK you also cannot set the primary_id to a value of your liking
because this is a proxy object.  The best bet is to create a Bio::Seq
object out of one of these and set the primary_id and display_id to
values that you can compute from the display_id.

At least that has been my strategy when using this - maybe someone
wants to code something new into the object itself.
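
The "compute from the display_id" step might look something like the sketch
below.  It assumes the cached FASTA headers use the NCBI-style
"gi|number|..." layout; gi_from_display_id is a hypothetical helper and the
example ID strings are only illustrative.  The value it returns is what you
would then pass as the primary_id when building the new Bio::Seq object.

```perl
use strict;
use warnings;

# Pull the GI number out of an NCBI-style identifier such as
# "gi|7525012|ref|NC_000932.1|".  Returns undef if there is no gi| field.
# (Hypothetical helper; assumes the header layout described above.)
sub gi_from_display_id {
    my ($display_id) = @_;
    return undef unless defined $display_id;
    return $display_id =~ /(?:^|\|)gi\|(\d+)/ ? $1 : undef;
}

for my $id ('gi|7525012|ref|NC_000932.1|', 'NC_000932.1') {
    my $gi = gi_from_display_id($id);
    print "$id => ", (defined $gi ? $gi : 'no GI found'), "\n";
}
```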

-jason
On Oct 30, 2006, at 12:51 AM, Nathan S. Haigh wrote:

> In my script I retrieve sequences from GenBank in FASTA format by GI
> numbers and optionally store the sequence in a cache using
> Bio::DB::Fasta. On subsequent runs of the script, the cache is first
> checked for the GI and returns the sequence if it is found or the
> sequence is obtained from GenBank as above.
>
> I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have
> returned a Bio::Seq object but rather it returns a Bio::PrimarySeq
> object which is defined within the Bio::DB::Fasta file. This is
> annoying, since $seq_obj in my script would be either a Bio::Seq if it
> was obtained from GenBank or a Bio::PrimarySeq if obtained from the
> cache and calling primary_id() on it doesn't do the expected thing  
> with
> Bio::PrimarySeq:
> ID:          Bio::PrimarySeq::Fasta=HASH(0x89b4508)
>
> Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object?
>
> Nath

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From golharam at umdnj.edu  Mon Oct 30 20:11:51 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 15:11:51 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using IO:String?
Message-ID: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>

I'm trying to parse some blast output w/o actually creating the output
file.  Instead, I'm capturing the output in a variable and would like to
use IO::String to represent the file:

	$_ = `megablast -d somedatabase -i somesequence -D 2`;
	my $blast_file = new IO::String($_);
	my $searchio = new Bio::SearchIO(-format => 'blast',
	                                 -fh     => $blast_file);
	my $results = $searchio->next_result;
	my $hit = $results->next_hit;
	if (! defined($hit)) {
		warn "No BLAST hit for $accession on chr $chr for " .
		     "Seq/$orth_id/$organism\n\n";
		return;
	}

Now, instead of reading the output line by line, Bio::SearchIO reads the
entire output as 1 line.

If I provide the output in a file and use:

	my $searchio = new Bio::SearchIO(-format => 'blast',
	                                 -file   => '/tmp/somefile.blast');

This works...so is it possible to use IO::String to provide
Bio::SearchIO with BLAST output?  
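
A minimal way to test the line-by-line behaviour outside of BioPerl is
sketched below.  It uses a plain in-memory filehandle from core Perl rather
than IO::String, and a made-up three-line string in place of the megablast
output, so it only shows what I would expect a string-backed handle to do.
Whether $/ being changed somewhere in my script explains the one-line read
is a guess on my part, not something I have confirmed.

```perl
use strict;
use warnings;

# Stand-in for text captured from a command with backticks.
my $report = "line one\nline two\nline three\n";

# Open a read-only filehandle on the scalar (core Perl, 5.8 or later).
open my $fh, '<', \$report or die "Cannot open in-memory handle: $!";

my @lines;
{
    # Make sure the input record separator is a newline while reading.
    local $/ = "\n";
    @lines = <$fh>;
}
close $fh;

print scalar(@lines), " lines read\n";
```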

Ryan


From golharam at umdnj.edu  Mon Oct 30 20:54:29 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 15:54:29 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com>
Message-ID: <00e801c6fc65$9849aee0$e6028a0a@GOLHARMOBILE1>

Thanks.  How are you getting the output?  system()?  BTW- I'm using
v1.5.1...


> -----Original Message-----
> From: Bernd Web [mailto:bernd.web at gmail.com] 
> Sent: Monday, October 30, 2006 3:45 PM
> To: golharam at umdnj.edu
> Cc: bioperl-l
> Subject: Re: [Bioperl-l] Is it possible to parse BLAST output 
> using IO:String?
> 
> 
> Hi Ryan,
> 
> I parse blastn output using IO::String w/o problems:
> 
>  my $stringfh = new IO::String($input);
>  my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh);
> 
> However, this input does not come via backticks.
> 
> 
> bernd
> 
> On 10/30/06, Ryan Golhar <golharam at umdnj.edu> wrote:
> > I'm trying to parse some blast output w/o actually creating 
> the output 
> > file.  Instead, I'm capturing the output in a variable and 
> would like 
> > to use IO::String to represent the file:
> >
> >         $_ = `megablast -d somedatabase -i somesequence -D 2`;
> >         my $blast_file = new IO::String($_);
> >         my $searchio = new Bio::SearchIO(-format => 'blast', -fh => 
> > $blast_file);
> >         my $results = $searchio->next_result;
> >         my $hit = $results->next_hit;
> >         if (! defined($hit)) {
> >                 warn "No BLAST hit for $accession on chr $chr for 
> > Seq/$orth_id/$organism\n\n";
> >                 return;
> >         }
> >
> > Now, when Bio::SearchIO tries to read the output line by 
> line, instead 
> > it reads the entire output as 1 line.
> >
> > If I provide the output in a file and use:
> >
> >         my $searchio = new Bio::SearchIO(-format => 
> 'blast', -file => 
> > '/tmp/somefile.blast');
> >
> > This works...so is it possible to use IO::String to provide 
> > Bio::SearchIO with BLAST output?
> >
> > Ryan
> >
> 


From bix at sendu.me.uk  Mon Oct 30 21:27:58 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 30 Oct 2006 21:27:58 +0000
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
Message-ID: <45466E5E.9000504@sendu.me.uk>

Ryan Golhar wrote:
> I'm trying to parse some blast output w/o actually creating the output
> file.  Instead, I'm capturing the output in a variable and would like to
> use IO::String to represent the file:
> 
> 	$_ = `megablast -d somedatabase -i somesequence -D 2`;
> 	my $blast_file = new IO::String($_);
> 	my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
> $blast_file);
> 	my $results = $searchio->next_result;
> 	my $hit = $results->next_hit;
> 	if (! defined($hit)) {
> 		warn "No BLAST hit for $accession on chr $chr for
> Seq/$orth_id/$organism\n\n";
> 		return;
> 	}
> 
> Now, when Bio::SearchIO tries to read the output line by line, instead
> it reads the entire output as 1 line.
> 
> If I provide the output in a file and use:
> 
> 	my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
> '/tmp/somefile.blast');
> 
> This works...so is it possible to use IO::String to provide
> Bio::SearchIO with BLAST output?

Why must it be IO::String? Why not just open() your megablast and 
provide $searchio the real filehandle? It would be faster that way as well.

Read the docs for `. Your usage above is inappropriate.


From golharam at umdnj.edu  Mon Oct 30 21:54:45 2006
From: golharam at umdnj.edu (Ryan Golhar)
Date: Mon, 30 Oct 2006 16:54:45 -0500
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <C3209DC5-433B-4BAD-A184-AC9D2A2B4A90@bioperl.org>
Message-ID: <00f901c6fc6e$03916460$e6028a0a@GOLHARMOBILE1>

Hmmm.  Yes, I suppose I could.  
 
I did it with the backtick because I based my code off of the "To and
From a String" from the SeqIO HOWTO...
 

-----Original Message-----
From: Jason Stajich [mailto:jason.stajich at gmail.com] On Behalf Of Jason
Stajich
Sent: Monday, October 30, 2006 4:44 PM
To: Sendu Bala
Cc: golharam at umdnj.edu; 'bioperl-l'
Subject: Re: [Bioperl-l] Is it possible to parse BLAST output using
IO:String?


right - can't you just do: 

my $fh;
open($fh, "megablast -d ... | ") || die $!;
my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh);

On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote:


Ryan Golhar wrote:

I'm trying to parse some blast output w/o actually creating the output
file.  Instead, I'm capturing the output in a variable and would like to
use IO::String to represent the file:

$_ = `megablast -d somedatabase -i somesequence -D 2`;
my $blast_file = new IO::String($_);
my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
$blast_file);
my $results = $searchio->next_result;
my $hit = $results->next_hit;
if (! defined($hit)) {
warn "No BLAST hit for $accession on chr $chr for
Seq/$orth_id/$organism\n\n";
return;
}

Now, when Bio::SearchIO tries to read the output line by line, instead
it reads the entire output as 1 line.

If I provide the output in a file and use:

my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
'/tmp/somefile.blast');

This works...so is it possible to use IO::String to provide
Bio::SearchIO with BLAST output?


Why must it be IO::String? Why not just open() your megablast and 
provide $searchio the real filehandle? It would be faster that way as
well.

Read the docs for `. Your usage above is inappropriate.




--
Jason Stajich, PhD 
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From bernd.web at gmail.com  Mon Oct 30 20:44:31 2006
From: bernd.web at gmail.com (Bernd Web)
Date: Mon, 30 Oct 2006 21:44:31 +0100
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
Message-ID: <716af09c0610301244q1c6d4cd6y23e72371dda4fa4d@mail.gmail.com>

Hi Ryan,

I parse blastn output using IO::String w/o problems:

 my $stringfh = new IO::String($input);
 my $in = new Bio::SearchIO(-format => 'blast', -fh => $stringfh);

However, this input does not come via backticks.
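
For comparison, core Perl (5.8 and later) can also open a filehandle directly on a string, without IO::String. The following is only a minimal sketch: the text in $output is a made-up stand-in for captured program output, and I am assuming the resulting handle can be passed via -fh in the same way as the IO::String handle above.

```perl
use strict;
use warnings;

# Stand-in for captured program output (in real use: my $output = `...`;).
my $output = "first line\nsecond line\nthird line\n";

# Open a read-only, in-memory filehandle on a reference to the string.
open(my $fh, '<', \$output) or die "Cannot open in-memory handle: $!";

# Read it back one line at a time, as a parser would.
my @lines;
while (my $line = <$fh>) {
    chomp $line;
    push @lines, $line;
}
close $fh;

print scalar(@lines), " lines\n";

# Assumed usage with BioPerl (not run here):
#   my $in = Bio::SearchIO->new(-format => 'blast', -fh => $fh);
```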


bernd

On 10/30/06, Ryan Golhar <golharam at umdnj.edu> wrote:
> I'm trying to parse some blast output w/o actually creating the output
> file.  Instead, I'm capturing the output in a variable and would like to
> use IO::String to represent the file:
>
>         $_ = `megablast -d somedatabase -i somesequence -D 2`;
>         my $blast_file = new IO::String($_);
>         my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
> $blast_file);
>         my $results = $searchio->next_result;
>         my $hit = $results->next_hit;
>         if (! defined($hit)) {
>                 warn "No BLAST hit for $accession on chr $chr for
> Seq/$orth_id/$organism\n\n";
>                 return;
>         }
>
> Now, when Bio::SearchIO tries to read the output line by line, instead
> it reads the entire output as 1 line.
>
> If I provide the output in a file and use:
>
>         my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
> '/tmp/somefile.blast');
>
> This works...so is it possible to use IO::String to provide
> Bio::SearchIO with BLAST output?
>
> Ryan
>
>


From jason at bioperl.org  Mon Oct 30 21:44:18 2006
From: jason at bioperl.org (Jason Stajich)
Date: Mon, 30 Oct 2006 13:44:18 -0800
Subject: [Bioperl-l] Is it possible to parse BLAST output using
	IO:String?
In-Reply-To: <45466E5E.9000504@sendu.me.uk>
References: <00e301c6fc5f$a36de9e0$e6028a0a@GOLHARMOBILE1>
	<45466E5E.9000504@sendu.me.uk>
Message-ID: <C3209DC5-433B-4BAD-A184-AC9D2A2B4A90@bioperl.org>

right - can't you just do:

my $fh;
open($fh, "megablast -d ... | ") || die $!;
my $searchIO = Bio::SearchIO->new(-format => 'blast', -fh => $fh);
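
A slightly fuller sketch of the same idea, using the list form of a piped open so that the command never goes through the shell. The helper name open_command_fh is made up for illustration, and the running perl interpreter stands in for megablast so the snippet can be tried without BLAST installed:

```perl
use strict;
use warnings;

# Hypothetical helper: open a read pipe from an external command and
# return the filehandle. The list form of open() bypasses the shell.
sub open_command_fh {
    my (@cmd) = @_;
    open(my $fh, '-|', @cmd) or die "Cannot run $cmd[0]: $!";
    return $fh;
}

# Stand-in command: the running perl prints three lines. With BLAST
# installed, this would be the megablast command and its arguments.
my $fh = open_command_fh($^X, '-e', 'print "line $_\n" for 1 .. 3');

my $count = 0;
while (my $line = <$fh>) {
    $count++;    # each read returns a single line
}
close($fh) or warn "command exited with status $?";
print "read $count lines\n";

# Assumed usage with BioPerl (not run here):
#   my $searchio = Bio::SearchIO->new(-format => 'blast', -fh => $fh);
```

Checking the return value of close() on a pipe is worthwhile, since that is where a failing exit status of the command shows up.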

On Oct 30, 2006, at 1:27 PM, Sendu Bala wrote:

> Ryan Golhar wrote:
>> I'm trying to parse some blast output w/o actually creating the  
>> output
>> file.  Instead, I'm capturing the output in a variable and would  
>> like to
>> use IO::String to represent the file:
>>
>> 	$_ = `megablast -d somedatabase -i somesequence -D 2`;
>> 	my $blast_file = new IO::String($_);
>> 	my $searchio = new Bio::SearchIO(-format => 'blast', -fh =>
>> $blast_file);
>> 	my $results = $searchio->next_result;
>> 	my $hit = $results->next_hit;
>> 	if (! defined($hit)) {
>> 		warn "No BLAST hit for $accession on chr $chr for
>> Seq/$orth_id/$organism\n\n";
>> 		return;
>> 	}
>>
>> Now, when Bio::SearchIO tries to read the output line by line,  
>> instead
>> it reads the entire output as 1 line.
>>
>> If I provide the output in a file and use:
>>
>> 	my $searchio = new Bio::SearchIO(-format => 'blast', -file =>
>> '/tmp/somefile.blast');
>>
>> This works...so is it possible to use IO::String to provide
>> Bio::SearchIO with BLAST output?
>
> Why must it be IO::String? Why not just open() your megablast and
> provide $searchio the real filehandle? It would be faster that way  
> as well.
>
> Read the docs for `. Your usage above is inappropriate.
>
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From lstein at cshl.edu  Mon Oct 30 18:59:29 2006
From: lstein at cshl.edu (Lincoln Stein)
Date: Mon, 30 Oct 2006 13:59:29 -0500
Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase
Message-ID: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>

Hi All,

I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not
to validate. I have committed a new version to live and to the release
candidate branch. I hope it isn't too late to get this into the release.

Lincoln

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From huangyi1 at hkusua.hku.hk  Tue Oct 31 05:46:20 2006
From: huangyi1 at hkusua.hku.hk (Huang Yi)
Date: Tue, 31 Oct 2006 13:46:20 +0800
Subject: [Bioperl-l] bioperl1.5 and GD2.35
Message-ID: <200610310546.k9V5kQGT010481@hkusua.hku.hk>

Hi,

 
I just installed bioperl 1.4 from CPAN on my Gentoo Linux computer, but the
installation failed and I had to force the install.

However, the GD module couldn't be installed for some unknown reason.

I therefore used Gentoo's "emerge" tool to get bioperl and GD again. Both
installed fine; bioperl was upgraded to 1.5 and GD is 2.35.

However, when I tested it using the program from the HOWTO wiki page
(http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me:

Can't locate object method "png" via package "GD::Image" at
/usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line 799, <> line 9.

On my other computer, bioperl 1.4 and GD 2.34 work fine. I therefore want to
remove the CPAN bioperl from the system and re-install it, but that seems to
be impossible.

Could you please give me some advice on how to get GD and bioperl working?

 
Thanks!

 
Huang Yi

 
From bix at sendu.me.uk  Tue Oct 31 08:20:21 2006
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 31 Oct 2006 08:20:21 +0000
Subject: [Bioperl-l] Last minute changes to Bio::Graphics::FeatureBase
In-Reply-To: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>
References: <6dce9a0b0610301059t6bfae183n8d359f4331985c62@mail.gmail.com>
Message-ID: <45470745.1050605@sendu.me.uk>

Lincoln Stein wrote:
> Hi All,
> 
> I found a bug in Bio::Graphics::FeatureBase that was causing GFF3 output not
> to validate. I have committed a new version to live and to the release
> candidate branch. I hope it isn't too late to get this into the release.

It isn't too late, thank you.


From avilella at gmail.com  Tue Oct 31 13:54:39 2006
From: avilella at gmail.com (Albert Vilella)
Date: Tue, 31 Oct 2006 13:54:39 +0000
Subject: [Bioperl-l] catfile and catdir
Message-ID: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>

Hi,

I was testing the bioperl-run/t/PAML.t and stumbled upon this
catdir/catfile error:

Can't locate object method "catdir" via package "Bio::Root::IO" at
/home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
113.
BEGIN failed--compilation aborted at
/home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
143.
Compilation failed in require at t/PAML.t line 64.
BEGIN failed--compilation aborted at t/PAML.t line 64.

Should we be using File::Spec for catdir and catfile instead of Root::IO?

Cheers,

    Albert.


From Kevin.M.Brown at asu.edu  Tue Oct 31 15:34:34 2006
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Tue, 31 Oct 2006 08:34:34 -0700
Subject: [Bioperl-l] bioperl1.5 and GD2.35
Message-ID: <1A4207F8295607498283FE9E93B775B4023B5F3C@EX02.asurite.ad.asu.edu>

Not really a Bioperl issue per se, but it sounds like GD was built without
libpng when you had Gentoo emerge it, so the parts needed to create PNG
graphics were never built.
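
One quick way to check this from Perl is to ask GD::Image whether it has the method at all. This is only a sketch: gd_supports is a made-up helper name, and it assumes (as the error message suggests) that a GD built without PNG support simply lacks the png method.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if GD::Image provides the named output
# method (e.g. 'png'), 0 if it does not, and undef if GD cannot be loaded.
sub gd_supports {
    my ($format) = @_;
    return unless eval { require GD; 1 };
    return GD::Image->can($format) ? 1 : 0;
}

for my $format (qw(png gif jpeg)) {
    my $ok = gd_supports($format);
    printf "%-4s %s\n", $format,
        !defined $ok ? 'GD not installed' : $ok ? 'available' : 'missing';
}
```

If png shows up as missing, rebuilding GD against libpng would be the thing to try.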

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Huang Yi
> Sent: Monday, October 30, 2006 10:46 PM
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] bioperl1.5 and GD2.35
> 
> Hi,
> 
>  
> 
> I just installed bioperl 1.4 from CPAN to my Gentoo linux 
> computer. But the
> installation was failed. I had to install by force.
> 
>  
> 
> However, the GD module couldn't be installed for some unknown reasons.
> 
>  
> 
> I therefore use "emerge" tool of Gentoo to get bioperl and GD 
> again. They
> are fine. The version of bioperl became upgrade to1.5 and GD was 2.35.
> 
>  
> 
> However, when I tested it by using the program in HOWTO wiki page
> (http://www.bioperl.org/wiki/HOWTO:Graphics), it always told me:
> 
>  
> 
> Can't locate object method "png" via package "GD::Image" at
> /usr/lib/perl5/site_perl/5.8.8/Bio/Graphics/Panel.pm line 
> 799, <> line 9.
> 
>  
> 
> In my other computer, bioperl1.4 and GD2.34 work fine. I 
> therefore want to
> remove the CPAN bioperl from the system and re-install it, 
> but it seems to
> be impossible.
> 
>  
> 
> Would you please give me some advices on how to let my GD and 
> bioperl work. 
> 
>  
> 
> Thanks!
> 
>  
> 
> Huang Yi
> 
>  
> 
> 


From hlapp at gmx.net  Tue Oct 31 16:21:40 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 11:21:40 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
	<24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
Message-ID: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>


On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:

> BTW, was that supposed to be Bio::AnnotatableI, or  
> Bio::AnnotationHolderI?

Sorry, the former. I guess I got confused with FeatureHolders. Too  
bad Featureable isn't an English word.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Tue Oct 31 17:01:44 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:01:44 -0500
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <4545BD25.3030107@sheffield.ac.uk>
References: <4545BD25.3030107@sheffield.ac.uk>
Message-ID: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net>

The only thing I would add to Jason's reply is that it is easy to do

	if (! $seq->isa("Bio::SeqI")) {
		my $bioseq = Bio::Seq->new();
		$bioseq->primary_seq($seq);
		$seq = $bioseq;
	}

and from that point on all your objects are Bio::SeqI compliant  
regardless of whether they were obtained that way or not.

Aside from that I wonder why there isn't a -primary_seq option in  
Bio::Seq::new - this would shorten the above into a (more perl'ish)  
single line:

	$seq = Bio::Seq->new(-primary_seq=>$seq) unless $seq->isa("Bio::SeqI");

Any takers to add that capability?

-hilmar

On Oct 30, 2006, at 3:51 AM, Nathan S. Haigh wrote:

> In my script I retrieve sequences from GenBank in FASTA format by GI
> numbers and optionally store the sequence in a cache using
> Bio::DB::Fasta. On subsequent runs of the script, the cache is first
> checked for the GI and returns the sequence if it is found or the
> sequence is obtained from GenBank as above.
>
> I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have
> returned a Bio::Seq object but rather it returns a Bio::PrimarySeq
> object which is defined within the Bio::DB::Fasta file. This is
> annoying, since $seq_obj in my script would be either a Bio::Seq if it
> was obtained from GenBank or a Bio::PrimarySeq if obtained from the
> cache and calling primary_id() on it doesn't do the expected thing  
> with
> Bio::PrimarySeq:
> ID:          Bio::PrimarySeq::Fasta=HASH(0x89b4508)
>
> Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object?
>
> Nath
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct 31 17:08:56 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 11:08:56 -0600
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
Message-ID: <001401c6fd0f$4239aa50$15327e82@pyrimidine>

>> BTW, was that supposed to be Bio::AnnotatableI, or
>> Bio::AnnotationHolderI?
> 
> Sorry, the former. I guess I got confused with
> FeatureHolders. Too bad Featureable isn't an English word.
> 
> 	-hilmar

Having SimpleAlign be AnnotatableI shouldn't be too much of a burden, since
the only additional implemented method is annotation().  So, I think all the
various Stockholm tags can be placed somewhere.

A bit OT: were we planning on getting rid of the various *_tag_* methods in
AnnotatableI at some point?  I'm a bit confused as to why they were added.

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Tue Oct 31 17:09:26 2006
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 31 Oct 2006 09:09:26 -0800
Subject: [Bioperl-l] catfile and catdir
In-Reply-To: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>
References: <358f4d650610310554v5eb57db1x137ac307a73fa64@mail.gmail.com>
Message-ID: <1AD4DB38-E08D-4E47-8A59-6539068474CB@bioperl.org>

Yep.  Unless we want this to also exist in Root::IO and delegate to  
File::Spec.
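
For reference, a minimal sketch of both options: calling File::Spec directly, and a thin wrapper class that delegates to it. The package name My::IO and the directory and file names are made up for illustration; this is not the actual Bio::Root::IO code.

```perl
use strict;
use warnings;
use File::Spec;

# Option 1: call File::Spec directly (hypothetical path components).
my $dir  = File::Spec->catdir('data', 'paml', 'run1');
my $file = File::Spec->catfile($dir, 'yn00.ctl');
print "$dir\n$file\n";

# Option 2: a wrapper class whose methods delegate to File::Spec.
package My::IO;

sub catdir {
    my ($class, @parts) = @_;
    return File::Spec->catdir(@parts);
}

sub catfile {
    my ($class, @parts) = @_;
    return File::Spec->catfile(@parts);
}

package main;

print My::IO->catfile('data', 'paml', 'run1', 'yn00.ctl'), "\n";
```

Either way, the path separator is chosen by File::Spec for the current platform, so callers never hard-code it.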

-jason
On Oct 31, 2006, at 5:54 AM, Albert Vilella wrote:

> Hi,
>
> I was testing the bioperl-run/t/PAML.t and stumbled upon this
> catdir/catfile error:
>
> Can't locate object method "catdir" via package "Bio::Root::IO" at
> /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
> 113.
> BEGIN failed--compilation aborted at
> /home/avilella/src/bioperl-run/Bio/Tools/Run/Phylo/PAML/Yn00.pm line
> 143.
> Compilation failed in require at t/PAML.t line 64.
> BEGIN failed--compilation aborted at t/PAML.t line 64.
>
> Should we be using File::Spec for catdir and catfile instead of
> Root::IO?
>
> Cheers,
>
>     Albert.

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From jason at bioperl.org  Tue Oct 31 17:10:51 2006
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 31 Oct 2006 09:10:51 -0800
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
References: <000001c6f9f6$9ab12710$15327e82@pyrimidine>
	<C94A14C7-A254-486A-8F67-8D5F7BC14F8F@gmx.net>
	<24CFA984-9D5C-4E47-9430-6DDC52119164@uiuc.edu>
	<8163A5C6-7881-426C-ABFF-AB973615ACBC@gmx.net>
Message-ID: <65F92B54-33FD-4D8F-90B7-49E2697CDBA2@bioperl.org>

It just needs to have an annotation collection - so it would be  
Bio::AnnotatableI

On Oct 31, 2006, at 8:21 AM, Hilmar Lapp wrote:

>
> On Oct 27, 2006, at 9:57 PM, Chris Fields wrote:
>
>> BTW, was that supposed to be Bio::AnnotatableI, or
>> Bio::AnnotationHolderI?
>
> Sorry, the former. I guess I got confused with FeatureHolders. Too
> bad Featureable isn't an English word.
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html


From hlapp at gmx.net  Tue Oct 31 17:44:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 12:44:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <C16CF3EE.B1A9%bosborne11@verizon.net>
References: <C16CF3EE.B1A9%bosborne11@verizon.net>
Message-ID: <ACF19E78-7FC3-42BE-8F41-86C45C710F4B@gmx.net>

Well isn't this a result of conflating some of the SeqFeatureI  
methods into the annotation collection?

If I'm not mistaken on this then those methods were introduced in  
1.5.0 and hence can go away without deprecation.

	-hilmar

On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote:

> Chris,
>
> I don't think the intent was to remove the methods, rather we'd  
> just call
> deprecated(). Example from AnnotatableI:
>
> sub remove_tag {
>   my ($self, @args) = @_;
>
>   #uncomment in 1.6
>   #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');
>
>   return $self->annotation->remove_Annotations(@args);
> }
>
> With regard to "why", I can't reconstruct the entire rationale myself but I
> can say that the newer names make more sense. Take that example above - its
> function is to remove entire Annotations not just to remove tags, so
> remove_Annotations is a better name.
>
> Brian O.
>
>
> On 10/31/06 1:08 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:
>
>> A bit OT: were we planning on getting rid of the various *_tag_*  
>> methods in
>> AnnotatableI at some point?  I'm a bit confused as to why they  
>> were added.
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bosborne11 at verizon.net  Tue Oct 31 16:37:01 2006
From: bosborne11 at verizon.net (Brian Osborne)
Date: Tue, 31 Oct 2006 12:37:01 -0400
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <001401c6fd0f$4239aa50$15327e82@pyrimidine>
Message-ID: <C16CF3EE.B1A9%bosborne11@verizon.net>

Chris,

I don't think the intent was to remove the methods, rather we'd just call
deprecated(). Example from AnnotatableI:

sub remove_tag {
  my ($self, @args) = @_;

  #uncomment in 1.6
  #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');

  return $self->annotation->remove_Annotations(@args);
}

With regard to "why", I can't reconstruct the entire rationale myself but I
can say that the newer names make more sense. Take that example above - its
function is to remove entire Annotations not just to remove tags, so
remove_Annotations is a better name.
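
The general pattern (old name kept as a thin wrapper that warns and then delegates to the new name) can be sketched in plain Perl as follows. My::Annotated is a made-up stand-in class, not the real AnnotatableI, and Carp's carp() is used here in place of the deprecated() call shown above:

```perl
use strict;
use warnings;

# Hypothetical stand-in class; not the real Bio::AnnotatableI.
package My::Annotated;

use Carp qw(carp);

sub new {
    my ($class, %tags) = @_;
    return bless { annotations => {%tags} }, $class;
}

# Preferred name: removes the annotations stored under the given keys
# and returns the removed values.
sub remove_Annotations {
    my ($self, @keys) = @_;
    return map { delete $self->{annotations}{$_} } @keys;
}

# Old name kept as a thin wrapper: warn, then delegate.
sub remove_tag {
    my ($self, @args) = @_;
    carp 'remove_tag() is deprecated, use remove_Annotations()';
    return $self->remove_Annotations(@args);
}

package main;

my $obj = My::Annotated->new(note => 'example', source => 'test');
my @removed = $obj->remove_tag('note');    # warns, then removes 'note'
print "removed: @removed\n";
```

Existing callers keep working while the warning points them at the new name.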

Brian O.


On 10/31/06 1:08 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:

> A bit OT: were we planning on getting rid of the various *_tag_* methods in
> AnnotatableI at some point?  I'm a bit confused as to why they were added.


From cjfields at uiuc.edu  Tue Oct 31 18:44:02 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 12:44:02 -0600
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <ACF19E78-7FC3-42BE-8F41-86C45C710F4B@gmx.net>
Message-ID: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>

Hilmar Lapp wrote:
> Well isn't this a result of conflating some of the
> SeqFeatureI methods into the annotation collection?
> 
> If I'm not mistaken on this then those methods were
> introduced in 1.5.0 and hence can go away without deprecation.
> 
> 	-hilmar
> 
> On Oct 31, 2006, at 11:37 AM, Brian Osborne wrote:
> 
>> Chris,
>> 
>> I don't think the intent was to remove the methods, rather we'd just
>> call deprecated(). Example from AnnotatableI:
>> 
>> sub remove_tag {
>>   my ($self, @args) = @_;
>> 
>>   #uncomment in 1.6
>>   #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');
>> 
>>   return $self->annotation->remove_Annotations(@args);
>> }
>> 
>> With regard to "why", I can't reconstruct the entire rationale
>> myself but I can say that the newer names make more sense. Take that
>> example above - its function is to remove entire Annotations not
>> just to remove tags, so remove_Annotations is a better name.
>> 
>> Brian O.
>> 
>> 
>> On 10/31/06 1:08 PM, "Chris Fields" <cjfields at uiuc.edu> wrote:
>> 
>>> A bit OT: were we planning on getting rid of the various *_tag_*
>>> methods in AnnotatableI at some point?  I'm a bit confused as to why
>>> they were added.

Sorry Brian, what I meant was, based on CVS history, the various *tag*
methods in AnnotatableI were added all at once, with deprecations already
present in the commit.  So the methods weren't there to begin with, then
added only to be deprecated later?  Hence the confusion...

I think Hilmar's right; the CVS history indicates these were added just
prior to rel. 1.5 by Allen and seem to be related to SeqFeatureI.  I'm sure
the intent was good, but they contradict methods in the Feature/Annotation
HOWTO on retrieving Annotation objects via the Annotation::Collection
object.  I think that agrees with your point about the various Annotation*
method names being the more appropriate ones.  

Does everybody agree we should just remove them?

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Oct 31 18:53:16 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 12:53:16 -0600
Subject: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
In-Reply-To: <375EA8B5-EE6E-47BA-A335-AAC44CC454C3@gmx.net>
Message-ID: <000001c6fd1d$d4359c80$15327e82@pyrimidine>


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org 
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hilmar Lapp
> Sent: Tuesday, October 31, 2006 11:02 AM
> To: n.haigh at sheffield.ac.uk
> Cc: Bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()
> 
> The only thing I would add to Jason's reply is that it is easy to do
> 
> 	if (! $seq->isa("Bio::SeqI")) {
> 		my $bioseq = Bio::Seq->new();
> 		$bioseq->primary_seq($seq);
> 		$seq = $bioseq;
> 	}
> 
> and from that point on all your objects are Bio::SeqI 
> compliant regardless of whether they were obtained that way or not.
> 
> Aside from that I wonder why there isn't a -primary_seq 
> option in Bio::Seq::new - this would shorten the above into a 
> (more perl'ish) single line:
> 
> 	$seq = Bio::Seq->new(-primary_seq=>$seq) unless 
> $seq->isa("Bio::SeqI");
> 
> Any takers to add that capability?
> 
> -hilmar

Sounds good to me!

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
From nhansen at nhgri.nih.gov  Tue Oct 31 19:51:23 2006
From: nhansen at nhgri.nih.gov (Nancy Hansen)
Date: Tue, 31 Oct 2006 14:51:23 -0500 (EST)
Subject: [Bioperl-l] Bio::SeqIO::scf header/comments handling
Message-ID: <Pine.GSO.4.58.0610311438470.17750@stout.nhgri.nih.gov>


Hello,

	As sequencing centers begin to deposit trace data from "Medical
Sequencing" projects into the public archives, there is now the need to
"anonymize" sequence trace files by removing embedded information which
might be used to identify the individual who was the original source of
the DNA being sequenced.

	I was hoping I might be able to use Bio::SeqIO to manipulate the
comments contained in an SCF-formatted trace file, but I'm finding that
Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information.
Since SCF is a widely-accepted standard for trace files, would it be
reasonable to include fields like "scf_comments" and "scf_header" in a
Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them?
Likewise, it would be great if write_seq could pull these values right
from a SequenceTrace object rather than requiring them as arguments.

	I'd be happy to help in this effort if necessary.

	Thanks,
	--Nancy

*************************************
Nancy F. Hansen, PhD	nhansen at nhgri.nih.gov
Bioinformatics Group
NIH Intramural Sequencing Center (NISC)
5625 Fishers Lane
Rockville, MD 20852
Phone: (301) 435-1560	Fax: (301) 435-6170


From lincoln.stein at gmail.com  Tue Oct 31 20:24:17 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 31 Oct 2006 15:24:17 -0500
Subject: [Bioperl-l] Bioperl versioning
In-Reply-To: <000001c6f78b$d1c65a30$15327e82@pyrimidine>
References: <453E309B.9090007@sendu.me.uk>
	<000001c6f78b$d1c65a30$15327e82@pyrimidine>
Message-ID: <6dce9a0b0610311224x79256b29sf102eb5c35865caf@mail.gmail.com>

Are you going to go ahead with 1.52_XX ? If so, I will code GBrowse to look
for 1.52 or higher.
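
A check of that kind could look roughly like the sketch below. The helper name version_at_least is made up, and this is not GBrowse's actual code; it simply drops the underscore used for developer or release-candidate versions before comparing numerically, which is the effect of the common "$VERSION = eval $VERSION" idiom.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $have is at least $want. Underscores
# (as in 1.52_01) are removed before the numeric comparison.
sub version_at_least {
    my ($have, $want) = @_;
    tr/_//d for $have, $want;
    return $have >= $want ? 1 : 0;
}

print version_at_least('1.52_01', '1.52') ? "ok\n" : "too old\n";
print version_at_least('1.4',     '1.52') ? "ok\n" : "too old\n";
```

Note that a plain numeric comparison like this treats 1.0502_01 as lower than 1.4, which is the concern raised in the quoted discussion below.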

Lincoln

On 10/24/06, Chris Fields <cjfields at uiuc.edu> wrote:
>
> ..
> >
> > 'handle'? I think it shows up as '6.2.13' simply because it was uploaded
> > with the filename Perl6-Pugs-6.2.13.tar.gz
>
> Sorry, my point was that when Audrey T. uses '6.2.13', her $VERSION is
> '6.002013'.  So maybe we should follow a similar convention.  Seems easier
> and less confusing to me, at least.
>
> > As you point out, the code has the kind of $VERSION number we've been
> > suggesting in this thread:
> >
> > > From the Perl6::Pugs source, $VERSION for rel 6.2.13 is '6.002013':
> > >
> > > our $VERSION = 6.002013;
> > >
> > > That's also a very perlish-way to do it.  And there are no developer
> > > versions of Pugs, since it is always under active development.  We
> could
> > try
> > > something like:
> > >
> > > our $VERSION = 1.005002_01;
> >
> > Yes, this was already like one of my suggestions (1.0502_01), but I
> > brought up the concern that 1.05 might be < 1.4.
> >
> > So then we have a question: do we try and fumble a 1.4 compatible number
> > by using 1.60_10, or do we have a clean break, remove 1.4 from CPAN if
> > it causes problems, and go for the 'proper' 1.006000 (1.6.0) with no
> > room for RC numbering, or 1.006000010 (1.6.0.10) - the first final
> > release following some 1.006000_001 (1.6.0.01 == rc1) RCs?
>
> I would go for the clean break if it follows perl/CPAN convention.
> '1.60_10' looks like '1.60.10', not 1.6 or 1.6 RC1, so it's too confusing.
>
> If we can use '1.00600x' for 1.6, 1.6.1, etc, and '1.006000_00x' for 1.6
> RC1, 1.6 RC2 etc then that would be consistent and perl-compatible.
>
> BTW, the reason I looked at Pugs was to see what some of the Perl6
> developers were using.  Who knows; they'll probably change it!
>
> ..
>
> > I don't think it would be a hassle; on the contrary it would be very
> > useful to know the CPAN distribution actually works. I'm very happy with
> > the idea that a release candidate gets fully tested...
>
> So you obviously feel strongly about it!  ;>
>
> I don't have a problem as long as we stick with doing this from now on
> (i.e. have a consistent versioning scheme, release policy, CPAN release
> policy, etc).  Would be nice for Jason/Brian/Hilmar to chime in as to the
> reasoning behind the older versioning scheme.
>
> Christopher Fields
> Postdoctoral Researcher - Switzer Lab
> Dept. of Biochemistry
> University of Illinois Urbana-Champaign
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From hlapp at gmx.net  Tue Oct 31 21:53:58 2006
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 31 Oct 2006 16:53:58 -0500
Subject: [Bioperl-l] Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>
References: <001e01c6fd1c$891fc4b0$15327e82@pyrimidine>
Message-ID: <F244DEC6-0ADE-437E-9AED-1F864A54F7AD@gmx.net>


On Oct 31, 2006, at 1:44 PM, Chris Fields wrote:

> Does everybody agree we should just remove them?

I wish you could but I'm afraid that would break stuff? Otherwise why  
were they added in the first place? I thought  
Bio::SeqFeature::Annotated needs them maybe?

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Oct 31 22:41:17 2006
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Oct 2006 16:41:17 -0600
Subject: [Bioperl-l] AnnotatableI tag methods,
	was  Rfam/Pfam annotations and SimpleAlign
In-Reply-To: <F244DEC6-0ADE-437E-9AED-1F864A54F7AD@gmx.net>
Message-ID: <000001c6fd3d$ae37c240$15327e82@pyrimidine>


> On Oct 31, 2006, at 1:44 PM, Chris Fields wrote:
> 
> > Does everybody agree we should just remove them?
> 
> I wish you could but I'm afraid that would break stuff? 
> Otherwise why were they added in the first place? I thought 
> Bio::SeqFeature::Annotated needs them maybe?
> 
> 	-hilmar
> 
> --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Yep, removing them clobbers a ton of tests, including anything that requires
SeqIO::FTHelper.  Looks like SeqFeature::Generic and a few others use them.


I could understand if these were meant to be permanent methods, but why add
these in if they were to be deprecated in 1.6?  Something that was meant to
be a transition but wasn't finished?  That seems to be indicated in the
commented out lines for all the *tag* methods:

  #uncomment in 1.6
  #$self->deprecated('remove_tag() is deprecated, use remove_Annotations()');

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

 
From lincoln.stein at gmail.com  Tue Oct 31 23:18:07 2006
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Tue, 31 Oct 2006 18:18:07 -0500
Subject: [Bioperl-l] Bio::DB::GFF::Util::Binning
In-Reply-To: <loom.20061020T041338-193@post.gmane.org>
References: <loom.20061020T041338-193@post.gmane.org>
Message-ID: <6dce9a0b0610311518l3bec852q5d04a9b488621377@mail.gmail.com>

Hi Keith,

The current Bio/DB/GFF/Util/Binning.pm file just contains the hierarchical
binning system that I implemented some time ago. Where is the R-tree system
that you describe? How much of an improvement did the R-tree scheme give
over the hierarchical scheme?

FYI the GFF3 implementation uses a different binning scheme in which there
is a fixed-size bin. Every time a feature overlaps a bin, it creates a new
row in a table. So big features will have multiple rows and little features
that fit inside a bin will have only one row. The query for this is simpler
and seems to give the same relative speedup as the hierarchical binning
system. I'd really like to get these queries to go as fast as possible and
would love to work with you on this if you're interested.
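
A rough sketch of the fixed-size bin idea, using an in-memory hash in place of
the database table. This is not the actual GFF3 schema or query: the bin width,
the layout of a row, and the names bins_for(), add_feature() and overlapping()
are all assumptions made for illustration.

```perl
use strict;
use warnings;

my $BIN_SIZE = 1000;   # assumed bin width in bp; the real value is not given here
my %rows;              # bin index => [id, start, end] rows; stands in for the table

# Bins overlapped by [start, end]; bin k is assumed to cover
# positions k*BIN_SIZE .. (k+1)*BIN_SIZE - 1.
sub bins_for {
    my ($start, $end) = @_;
    return (int($start / $BIN_SIZE) .. int($end / $BIN_SIZE));
}

# Store one row per overlapped bin and return how many rows were written.
sub add_feature {
    my ($id, $start, $end) = @_;
    my @bins = bins_for($start, $end);
    push @{ $rows{$_} }, [$id, $start, $end] for @bins;
    return scalar @bins;
}

# IDs of features overlapping [qstart, qend]: look only at the query's bins,
# apply the exact overlap test, and de-duplicate multi-row features.
sub overlapping {
    my ($qstart, $qend) = @_;
    my %seen;
    for my $bin (bins_for($qstart, $qend)) {
        for my $row (@{ $rows{$bin} || [] }) {
            my ($id, $s, $e) = @$row;
            $seen{$id} = 1 if $s <= $qend && $e >= $qstart;
        }
    }
    my @ids = sort keys %seen;
    return @ids;
}

my $big_rows   = add_feature('big',   500,  2500);   # spans bins 0, 1 and 2
my $small_rows = add_feature('small', 1200, 1300);   # fits inside bin 1
print "big: $big_rows rows, small: $small_rows rows\n";
print "hits: ", join(',', overlapping(1250, 1260)), "\n";
```

The lookup only touches the bins the query region falls into, which is what
keeps the query simple compared with the hierarchical scheme.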

Lincoln

On 10/19/06, Keith Player <keithplayer at hotmail.com> wrote:
>
> I know that there may be some changes resulting from new GFF3
> implementations, but thought I would see if the following is useful anyway.
>
> I implemented the R-tree binning schema as used by
> Bio::DB::GFF::Util::Binning and as mentioned in this article:
>
> I tested the following query on a normal table (no binning), but it assumes
> that you know the longest range in the table.  So for example with a table
> of human genes, where the longest gene we know of is around 2.4Mb.
>
> SELECT COUNT(*) as count FROM groups g WHERE g.start > max(0,[start-2.4Mb])
> AND g.start < [end] AND g.end > [start] AND g.chromosome = '1'
>
> so for 100Mb:101Mb
>
> SELECT COUNT(*) as count FROM groups g WHERE g.start > 97600000 AND
> g.start < 101000000 AND g.end > 100000000 AND g.chromosome = '1'
>
>
> where [start] and [end] define the region of interest.  This query
> outperforms the R-Tree implementation on all tests that I have performed
> (for lengths of 200bp to 10Mb across a whole chromosome).  Could this be of
> some practical use?
>
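
The bounded-start query quoted above can be sketched in Perl over an
in-memory list, assuming the features are [start, end] pairs already sorted
by start (as an index on start would provide). count_overlaps() and its
arguments are hypothetical names; the three conditions mirror the WHERE
clause, with the chromosome filter left out.

```perl
use strict;
use warnings;

# Count features overlapping [qstart, qend], given the longest feature
# length in the table ($max_len). $features must be sorted by start.
sub count_overlaps {
    my ($features, $qstart, $qend, $max_len) = @_;
    my $lower = $qstart - $max_len;
    $lower = 0 if $lower < 0;              # max(0, qstart - max_len)
    my $count = 0;
    for my $f (@$features) {
        my ($s, $e) = @$f;
        next if $s <= $lower;              # start > lower bound
        last if $s >= $qend;               # start < qend; sorted, so stop here
        $count++ if $e > $qstart;          # end > qstart
    }
    return $count;
}

my @genes = (
    [ 97_000_000,  97_500_000],   # starts before the lower bound
    [ 98_000_000, 100_500_000],   # long gene reaching into the region
    [100_200_000, 100_300_000],   # gene inside the region
    [101_000_000, 101_100_000],   # starts at the region end
);
my $n = count_overlaps(\@genes, 100_000_000, 101_000_000, 2_400_000);
print "overlapping genes: $n\n";
```

Knowing the longest feature is what lets the scan skip everything that starts
too far upstream to reach the region.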


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu